Introduction

Gene expression in eukaryotic organisms is regulated by enhancer or silencer elements present in the promoter region. In animals, an additional element called an insulator is also present between these enhancer and silencer regions and is involved in activating or repressing genes (Bushey et al. 2008). Approximately two decades ago, the GAGA factor (GAF) was first identified in Drosophila melanogaster. GAF is encoded by the gene Trithorax-like (Trl) and recognizes and binds to the GAGA motif of insulator regions. Ten years later, another protein, Pipsqueak (Psq), was identified and predicted to bind to the GAGA motif through a different DNA-binding domain structure (Lehmann 2004). Based on the information available in Drosophila, proteins that bind the GAGA motif were also identified in humans and many plant species (Lehmann 2004). The GAGA-binding factors (GAFs) have been reported to be involved in growth and development. Recent research has demonstrated that GAGA-binding proteins assemble into higher-order complexes that locally replace nucleosomes to establish a specific chromatin environment. However, the roles of proteins that bind to the GAGA motif are more varied and have been linked to the epigenetic regulation of homeotic genes, for example by attracting silencing factors to specific sites and by affecting the promoter-proximal pausing of RNA Polymerase II (Pol II) (Lehmann 2004).

Among plant species, a GAGA-motif-binding transcription factor was first identified in soybean through its binding to the dinucleotide repeat (GA/TC)n in the enhancer element of the promoter region of the heme and chlorophyll synthesis gene (GSA1). However, the first functionally characterized GAGA-motif-binding transcription factor, Barley B Recombinant (BBR), was reported in barley. It acts as an essential regulator of the homeobox gene Barley Knotted3 (BKn3), an ortholog of the Zea mays homeobox gene Knotted 1 (Kn1) (Santi et al. 2003). Due to the constitutive and ectopic expression of the BKn3 gene, homeotic transformation of the floral organs has been observed in the dominant hooded phenotype in barley (Santi et al. 2003). The 305-bp mutation observed in the hooded phenotype was mapped to an intragenic duplication in the fourth intron of the BKn3 gene, which contains a (GA/TC)8 dinucleotide repeat motif (Santi et al. 2003). The BARLEY B-RECOMBINANT/BASIC PENTACYSTEINE (BBR/BPC) protein family binds to the GAGA motif (Santi et al. 2003; Meister et al. 2004). As mentioned earlier, the two protein families identified in animals are reported to control gene expression by DNA looping or through interaction with histone-modifying complexes at polycomb repressive DNA elements (PREs) (Lehmann 2004). Even though different sets of protein families bind to the GAGA motif, they still share the same molecular function at the PREs or with the polycomb repressive complex (PRC) (Santi et al. 2003). These PRCs regulate chromatin compaction and gene expression through Histone 3 Lys27 trimethylation (H3K27me3), which is conserved in all eukaryotic organisms. A DNA-binding domain with a highly conserved basic pentacysteine motif at the C-terminus is characteristic of the BBR/BPC family (Santi et al. 2003; Meister et al. 2004).

Earlier studies proposed that the pentacysteine motif at the C-terminus forms a zinc-finger structure involved in DNA binding (Santi et al. 2003; Meister et al. 2004). Later, detailed in vitro studies in Arabidopsis reported that, under oxidizing conditions, neither a deficiency nor an excess of zinc ions affected the binding of the BPC1 protein to a DNA probe containing GAGA motifs (Theune et al. 2017). Based on this study, it was reported that the five cysteines play an essential role in protein dimerization and in maintaining the stability of the protein through disulfide bonds (Theune et al. 2017). Covalent inter- and intramolecular S–S bonds have been consistently observed to be responsible for the strong interactions between BBR/BPC family members, as seen in previous experiments such as gel shift assays (Meister et al. 2004; Theune et al. 2017). Although the tetranucleotide GAGA/TCTC has been suggested as the minimum required for proper binding, BBR/BPC proteins can bind even GA/TC dinucleotide repeats in in vivo experiments (Shanks et al. 2018). Genome-wide motif prediction and distribution analysis revealed an orientation-dependent enrichment of extended GAG-motifs close to the transcription start site and in introns in both Arabidopsis and rice (Santi et al. 2003).

In Arabidopsis, Basic Pentacysteine (BPC) was shown to be crucial for INNER NO OUTER (INO) activation by binding to the dinucleotide repeat (GA/TC)n in the promoter region (Meister et al. 2004). Seven BPC genes were identified in Arabidopsis and, based on sequence similarity, categorized into three classes: BPC1, BPC2, and BPC3 belong to class I; BPC4, BPC5, and BPC6 belong to class II; and BPC7 belongs to class III (Meister et al. 2004). Although this gene family has been identified in many plant species, such as soybean, barley, rice, and Arabidopsis, its genome-wide functional characterization is yet to be carried out. Recently, genome-wide analyses of the BPC gene family were conducted in cucumber and wheat (Li et al. 2019a, b; Ahad et al. 2021).

Tomato (Solanum lycopersicum) is one of the major vegetable crops cultivated across the globe and serves as a model crop for developmental biology (Liu et al. 2022a, b). However, its growth and production are affected by various stresses, such as temperature, salt, and drought stress and pathogen attack. Therefore, it is necessary to identify and understand the molecular functions of stress-related genes to elucidate the defense mechanisms of tomato against different stresses.

In the present study, we identified and characterized the BPC gene family in tomato. We examined various parameters, including the number of exons and introns, chromosomal location, phylogenetic relationships, gene duplication, cis-elements, miRNA target sites, protein 3D structure, gene co-expression, subcellular localization, and expression profiles in different organs. Additionally, we investigated the response of BPC genes to five abiotic stresses in tomato. Our study will serve as a foundation for understanding the structure and molecular function of BPC transcription factors involved in the stress tolerance of tomato.

Materials and methods

Identification of BPC genes in the tomato genome

All BPC genes from Arabidopsis thaliana were retrieved from the TAIR database (https://www.Arabidopsis.org) using the keyword search “BPC”. These AtBPC protein sequences were then used as BLAST queries against the tomato database (http://www.solgenomics.net/tools/blast) with default parameters (Mueller et al. 2005). In total, six BPC genes were identified in the Sol Genomics database, and their genomic and protein sequences were downloaded. Based on their physical positions and chromosome details, these genes were mapped to their respective chromosomes using the online server MapGene2Chrom web v2 (http://mg2c.iask.in/mg2c_v2.0/). To visualize the gene structure, including UTRs, exons, and introns, the genomic and coding sequences were uploaded to the Gene Structure Display Server 2.0 (GSDS 2.0) web server (http://gsds.cbi.pku.edu.cn/) (Guo et al. 2007). Furthermore, the BPC protein sequences from nine species were used for domain prediction using the online SMART tool (http://smart.emblheidelberg.de/) and the NCBI CDD web tool (https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi). To characterize the BPC proteins, their molecular weight, GRAVY values, amino acid length, and isoelectric point were obtained using the online web server Expasy (http://cn.expasy.org/tools/protparam.html) (Gasteiger et al. 2005). Additionally, the open-reading frames (ORFs) of the six BPC genes were identified using the NCBI ORF finder (https://www.ncbi.nlm.nih.gov/orffinder/). The protein sequences of the six SlBPC genes were aligned using the online tools Clustal Omega and ESPript (Robert and Gouet 2014). Moreover, conserved motifs common to tomato, rice, and Arabidopsis were analyzed using the online web server Multiple EM for Motif Elicitation (MEME) with parameters set to 10 motifs with a length of 6–50 amino acids (http://meme-suite.org/) (Bailey et al. 2009). In addition, the homology among SlBPC proteins was assessed using the online web server of the “Immunomedicine Group” (http://imed.med.ucm.es/Tools/sias.html). Subcellular localization prediction for the six SlBPC proteins was performed using the online web server “WoLF-PSORT” (https://wolfpsort.hgc.jp/) (Horton et al. 2007).
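
The GRAVY value reported by ProtParam is the mean Kyte-Doolittle hydropathy over all residues of a protein. A minimal Python sketch of that calculation is given here for illustration only; the function name and the example peptide are hypothetical (not an SlBPC sequence), and the values in this study were obtained from the Expasy server rather than from this code.

```python
# Kyte-Doolittle hydropathy scale, one value per standard amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathicity of a protein sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    # Sum the hydropathy of every residue and divide by the length.
    return sum(KYTE_DOOLITTLE[res] for res in seq) / len(seq)


# Hypothetical peptide used only to exercise the function.
example_peptide = "RKGRKMSCAW"
print(round(gravy(example_peptide), 3))
```

A negative GRAVY value indicates an overall hydrophilic protein, and a positive value an overall hydrophobic one.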

Phylogenetic analysis of BPC proteins

The protein sequences of BPC genes from nine plant species (Amborella trichopoda, Arabidopsis thaliana, Cucumis sativus, Oryza sativa, Sphagnum fallax, Solanum lycopersicum, Solanum tuberosum, Triticum aestivum, and Sorghum bicolor) were downloaded from their respective databases. These sequences were aligned using the ClustalW web server. The alignment was used as input for phylogenetic analysis in MEGA version 6.0 using the neighbor-joining (NJ) method with 1000 bootstrap replications (Tamura et al. 2021). Details of the protein sequences from all nine species, including gene names and IDs, are given in Table S11.

Microsynteny analysis

To identify gene duplication among the six SlBPC genes, we used the one-step MCScanX function of the TBtools software (Chen et al. 2020). BLASTP was run with an E-value threshold of < 10^−10. The ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions in duplicated gene pairs was calculated using the Nei and Gojobori (NG) method in TBtools (Koch et al. 2000). The mode of selection was determined based on the Ka/Ks ratio (Chen et al. 2020). The divergence time (T) of these duplicated gene pairs was calculated using the formula T = Ks/(2r) × 10^−6 million years ago (MYA), where Ks represents the number of synonymous substitutions per synonymous site and r is the rate of 1.5 × 10^−8 substitutions per site per year for dicotyledonous plants (Koch et al. 2000). Similarly, to identify duplicated BPC genes among the three species tomato, Arabidopsis, and rice, synteny analysis was performed using the reciprocal BLAST search program of TBtools. The duplicated BPC genes across the genomes were visualized using the Circos program of TBtools (Chen et al. 2020).
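
The divergence-time formula and the Ka/Ks interpretation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the Ka and Ks inputs shown are made-up numbers rather than values from Table 3, and the conventional reading of Ka/Ks (below 1 purifying, equal to 1 neutral, above 1 positive selection) is assumed.

```python
# Substitution rate for dicotyledonous plants quoted in the text
# (substitutions per site per year).
RATE_DICOT = 1.5e-8


def divergence_time_mya(ks, rate=RATE_DICOT):
    """Divergence time in million years: T = Ks / (2 * r) * 1e-6."""
    return ks / (2 * rate) * 1e-6


def selection_mode(ka, ks):
    """Classify a duplicated pair by its Ka/Ks ratio (conventional reading)."""
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# Made-up values for illustration; they are not taken from the study.
print(divergence_time_mya(3.0))
print(selection_mode(ka=0.4, ks=3.0))
```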

Analysis of cis-acting elements, miRNA target sites, phosphorylation sites, and N-glycosylation sites

miRNA target sites in the exonic regions of the six BPC genes were predicted using the online web server psRNATarget (http://plantgrn.noble.org/psRNATarget/analysis). Similarly, cis-acting elements in the promoter region (2000 bp) of the SlBPC genes were detected using the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) (Lescot et al. 2002). In addition, phosphorylation sites (Ser/Thr/Tyr) and N-glycosylation sites (of the Asn-X-Ser/Thr type) were predicted in the SlBPC proteins using the online tools NetPhos 3.1 (Blom et al. 1999) and the NetNGlyc 1.0 server (http://www.cbs.dtu.dk/services/NetNGlyc/).
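
The Asn-X-Ser/Thr pattern mentioned above can be located in a protein sequence with a simple pattern scan. The Python sketch below only lists candidate sequons and does not reproduce the NetNGlyc predictor, which scores the sequence context of each sequon. The function name and the example sequence are hypothetical, and the common convention that X may be any residue except proline is an assumption not stated in the text.

```python
import re

# Asn, then any residue except Pro (assumed convention), then Ser or Thr.
# The lookahead lets overlapping sequons be reported as separate hits.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of the Asn in each candidate N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical sequence used only to exercise the function.
print(find_sequons("ANASNPTNGT"))  # -> [2, 8]
```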

Homology modeling of SlBPC proteins

The function and 3D structure of the six SlBPC proteins were analyzed using the online web server I-TASSER. The 3D models of the six SlBPC proteins were generated using the LOMETS and TASSER assembly programs of the I-TASSER software (Yang et al. 2015). Using the template sequence as input, the analogs with the maximum score and homology match were identified in the PDB database. The best 3D structures were then selected and refined using the ModRefiner program (Xu and Zhang 2011). The 3D protein structures and their ligand-binding sites were visualized using Discovery Studio v.21.1 software.

Sample collection and stress treatments

Tomato seeds of the Ailsa Craig genotype were obtained from the Giovannoni laboratory at the Boyce Thompson Institute. The seeds were germinated in nursery trays and maintained for 28 days in a growth chamber under optimal conditions at 25 °C day/20 °C night temperatures, with a 16-h light/8-h dark photoperiod, 55–70% humidity, and a light intensity of 300 μmol m−2 s−1. Samples for organ-specific expression studies (roots, stems, and leaves) were collected from the 28-day-old seedlings. The remaining plants were maintained in the greenhouse under controlled conditions at 25/20 °C day/night temperatures. Flower and fruit samples were collected from the plants when they reached the reproductive stage. Fruit samples were collected at different developmental stages, namely 1 cm fruits, immature (IM) fruits, mature green (MG) fruits, breaker (B) fruits, and fruits 5 days after the breaker stage (B5), as described in Wai et al. (2022).

The 28-day-old tomato seedlings were subjected to different abiotic stress treatments: salt (NaCl), cold, drought, abscisic acid (ABA), and heat. Leaf samples were collected at five time points (0, 1, 3, 9, and 24 h) after exposure to heat stress at 40 °C, cold stress at 4 °C, salt stress (200 mM NaCl), or 100 μM ABA sprayed on the leaves (Wai et al. 2022). For the drought treatment, water was withheld from the tomato plants, and leaf samples were collected at 0, 24, 48, 60, and 72 h. For the control samples (0 h), the tomato seedlings were grown in soil under normal conditions (25 °C) without exposure to heat, cold, drought, or ABA stress (Wai et al. 2022). All samples were collected in three biological replicates, frozen in liquid nitrogen, and stored at −80 °C until further use.

Expression profiling of SlBPC genes by qRT-PCR

Total RNA was isolated from all samples using an RNeasy Mini kit (Qiagen, Germany), followed by purification with an RNase-free DNase I kit (Qiagen, Germany) as per the manufacturer’s instructions. The quality and quantity of the RNA were checked by gel electrophoresis and with a NanoDrop® 1000 spectrophotometer (Wilmington, USA). cDNA was synthesized for all samples with the Superscript® III First-Strand kit (Invitrogen, USA) as per the manufacturer’s instructions. Gene-specific primers for the SlBPC genes were designed using Primer3 software (http://frodo.wi.mit.edu/primer3/input.html) (Table S13), and each primer pair was validated for specificity using melting curve analysis. The tomato 18S rRNA gene (F: AAAAGGTCGACGCGGGCT, R: CGACAGAAGGGACGAGAC) was used as the normalization gene (Balestrini et al. 2007). qRT-PCR was performed in a total reaction volume of 10 μL containing 1 μL (50 ng) cDNA, 2 μL of primers (5 pmol), 5 μL of iTaq SYBR Green (Qiagen, Germany), and 2 μL of double-distilled water. The PCR was performed using a QuantStudio 5® 96 (Applied Biosystems, USA) with the following conditions: pre-denaturation at 94 °C for 5 min, followed by 40 cycles of denaturation at 94 °C for 15 s, annealing at 60 °C for 20 s, and extension at 72 °C for 30 s. The relative expression of each gene was calculated using the 2^−ΔΔCt method (Schmittgen and Livak 2008).
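
The 2^−ΔΔCt calculation can be written out as a short Python sketch. Here the reference gene corresponds to 18S rRNA and the calibrator to the control sample; the function name and the Ct values in the example are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target gene) - Ct(reference gene), computed per sample
    ddCt = dCt(treated or test sample) - dCt(control sample)
    """
    dct_sample = ct_target_sample - ct_ref_sample
    dct_control = ct_target_control - ct_ref_control
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target gene amplifies two cycles earlier in the
# treated sample while the reference gene is unchanged, giving a fourfold
# increase relative to the control.
print(relative_expression(24.0, 10.0, 26.0, 10.0))  # -> 4.0
```

A value above 1 indicates upregulation relative to the control and a value below 1 indicates downregulation.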

Co-expression network analysis of SlBPC genes

The RNA-seq data were downloaded from the NCBI database (https://www.ncbi.nlm.nih.gov/sra) for the co-expression analysis. The following accessions were used for analysis: SRR12026415 to SRR12026426, SRR7652562 to SRR7652573, SRR15607667 to SRR15607684, SRR12443369 to SRR12443371, SRR12443402 to SRR12443404, SRR12443419, SRR12443430, SRR12443435 to SRR12443437, and SRR12443441 (Table 1). The quality of the raw reads was assessed using FastQC (Andrews 2010), followed by the removal of low-quality sequences and trimming of adaptor sequences, if any. The good-quality reads were then mapped to the tomato reference genome (ITAG4.0) using HISAT software, and a feature annotation file in gff format was created as described by Kim et al. (2019). Unique reads that mapped to the non-redundant gff annotation file were used for further expression analysis. Expression levels were calculated as fragments per kilobase of transcript per million mapped reads (FPKM) using the cuffdiff software. Further, to analyze the co-expression of genes with SlBPC genes, weighted gene co-expression network analysis (WGCNA) was performed with the genes having FPKM values greater than 1 (Langfelder and Horvath 2008). The co-expression network was visualized using Cytoscape software (https://cytoscape.org/). The GO and KEGG annotations of co-expressed genes were conducted using PANTHER and the Kyoto Encyclopedia of Genes and Genomes (KEGG) server (http://www.genome.jp/kegg/kaas/), respectively.
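
Two ideas in this workflow can be illustrated with a short Python sketch: the textbook FPKM formula with the FPKM > 1 filter, and the soft-thresholded correlation that underlies an unsigned weighted co-expression network. This is not the cuffdiff or WGCNA implementation (WGCNA is an R package); all function names are hypothetical, the soft-threshold power beta = 6 is an assumed value, and applying the filter to a single FPKM value per gene is an assumption about how the threshold was used.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def filter_expressed(fpkm_by_gene, threshold=1.0):
    """Keep genes whose FPKM is strictly greater than the threshold."""
    return {gene: value for gene, value in fpkm_by_gene.items()
            if value > threshold}


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("constant profile: correlation is undefined")
    return cov / math.sqrt(ss_x * ss_y)


def unsigned_adjacency(x, y, beta=6):
    """Soft-thresholded adjacency |cor(x, y)|^beta (beta is assumed here)."""
    return abs(pearson(x, y)) ** beta


# Made-up numbers for illustration only.
print(fpkm(fragments=1000, gene_length_bp=2000, total_mapped_fragments=10_000_000))
print(filter_expressed({"geneA": 0.5, "geneB": 1.0, "geneC": 3.2}))
print(round(unsigned_adjacency([1.0, 2.0, 3.0, 5.0], [2.0, 4.1, 5.9, 10.2]), 3))
```

Raising the absolute correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is what makes the network weighted rather than thresholded at a hard cutoff.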

Table 1 Data pertaining to samples in RNA sequencing analysis

Subcellular localization

The coding regions of SlBPC genes were amplified using gene-specific primers (Table S12) and then cloned into the sGFP-tagged vector pGA3452 driven by the maize ubiquitin1 promoter to generate transiently expressed SlBPC–sGFP fusion proteins (Kim et al. 2009a, b). A vector expressing NLS-mRFP and OsAsp1–mRFP fusion proteins served as the nuclear marker (Chang et al. 2020). The SlBPC–sGFP fusion construct and the NLS-mRFP and OsAsp1–mRFP construct were introduced into rice protoplasts using PEG-mediated transformation (Page et al. 2019). After incubation at 28 °C in the dark for about 12–16 h (overnight), the transformants were observed for fluorescent signals under a confocal fluorescence microscope (BX61; Olympus, Japan) using bright-field illumination and the GFP and RFP channels.

Statistical analysis

All the data were analyzed with three replications using Student’s t tests in SigmaPlot 14 (SYSTAT and MYSTAT Products, United States and Canada). The symbols (*, **, and ***) were used to represent statistical significance based on the p values of < 0.05, < 0.01, and < 0.001, respectively.
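
The mapping from p values to the significance symbols described above can be expressed as a small helper. This Python sketch covers only the labeling step; the function name is hypothetical, and the p values themselves would come from the Student’s t tests run in SigmaPlot.

```python
def significance_symbol(p_value):
    """Return the asterisk label for a p value using the thresholds above."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not significant at the 0.05 level


# Illustrative p values only.
for p in (0.2, 0.03, 0.004, 0.0002):
    print(p, significance_symbol(p))
```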

Results

In silico identification and characterization of BPC proteins in tomato

We identified a total of six non-redundant BPC genes in the tomato genome and named them SlBPC1–SlBPC6 according to their chromosomal distribution. The open-reading frames of the tomato BPC genes range from 660 bp (SlBPC6) to 972 bp (SlBPC4), with a mean of 874 bp, and the corresponding proteins range from 219 to 323 amino acids in length, with a mean of 290. The molecular weights (MW) of all SlBPC proteins are similar (31–36 kDa), except for SlBPC6, which has a molecular weight of 24 kDa. The predicted isoelectric points (pI) ranged from 8.69 to 9.69, and the grand average of hydropathicity (GRAVY) values ranged from −0.502 to −0.886, indicating that the proteins are basic and hydrophilic (Table 2). The protein homology among the SlBPC proteins ranged from 26 to 55% similarity, and the proteins with the highest homology clustered in the same group in the phylogenetic analysis (Table S1).
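
The reported ORF and protein lengths are consistent with each other if the ORF length is assumed to include the stop codon: the number of codons minus one gives the number of amino acids. A minimal Python check of that arithmetic follows; the function name is hypothetical.

```python
def protein_length_from_orf(orf_length_bp):
    """Amino acid count for an ORF whose length includes the stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1  # subtract the stop codon


print(protein_length_from_orf(660))  # SlBPC6 -> 219
print(protein_length_from_orf(972))  # SlBPC4 -> 323
```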

Table 2 In silico characterization of the BPC genes and corresponding proteins in tomato

Phylogenetic analysis of BPC proteins

To understand the evolution of the BPC family proteins in tomato relative to monocot and other dicot species (Amborella trichopoda, Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Solanum tuberosum, Sphagnum fallax, Triticum aestivum, Cucumis sativus, and Sorghum bicolor), sequences were downloaded from their respective databases. A total of 45 BPC genes collected from the nine plant species were used for phylogenetic analysis using MEGA software (Fig. 1). These 45 proteins were classified into two major clades and further categorized into four groups. The six tomato BPC proteins were distributed across three groups: three proteins (SlBPC1, SlBPC2, and SlBPC4) fell in group II, two proteins (SlBPC3 and SlBPC5) in group I, and the remaining protein (SlBPC6) in group III. No tomato proteins were detected in group IV. The SlBPC and StBPC proteins clustered on the same branches, indicating a close relationship with potato. BPC genes from the eight species other than Sphagnum fallax were clustered into groups I, II, and III.

Fig. 1

Phylogenetic analysis of BPC proteins from tomato and different plant species. The neighbor-joining tree was constructed with the full-length BPC proteins using ClustalW and MEGA11 with 1000 bootstrap replicates. A species abbreviation is provided before each BPC protein name: Amtr, Amborella trichopoda; At, Arabidopsis thaliana; Cs, Cucumis sativus; Sf, Sphagnum fallax; St, Solanum tuberosum; Ta, Triticum aestivum; Sb, Sorghum bicolor; Sl, Solanum lycopersicum; and Os, Oryza sativa

Gene structure and conserved motif and domain analysis of tomato BPC genes

Gene structure analysis revealed that the exon and intron distribution differs among the SlBPC family genes (Fig. S1). Two genes, SlBPC2 and SlBPC6, have only one exon, and they were placed in group II and group III, respectively. The remaining genes (SlBPC1, SlBPC3, SlBPC4, and SlBPC5) have two exons and one intron, and these four genes were distributed in groups I and II in the phylogenetic analysis. Motif analysis revealed that three motifs (motif 1, motif 2, and motif 3) are present in all six BPC genes (Fig. S2). Likewise, motif 9 was present in four genes (SlBPC1, SlBPC3, SlBPC4, and SlBPC5), and motif 4 was present in SlBPC1, SlBPC2, and SlBPC4. However, motif 6 was present only in SlBPC5. We also included Arabidopsis and rice along with tomato in the motif analysis, which showed that motif 1, motif 2, and motif 3 are present in both monocots and dicots, indicating that they are evolutionarily conserved (Figs. S2 and S3). In contrast, three motifs (motif 5, motif 8, and motif 10) were present specifically in rice, suggesting that they have a specific role in monocots. Domain analysis of the BPC genes showed that most of them have a single GAGA-binding domain (Fig. 2). In all the BPC genes, the domain architecture also comprised additional domains, such as BPC, Pentacysteine, GRKMS, and WAR/KHGTN domains (Fig. 2). We performed a multiple sequence alignment of the BPC proteins from Arabidopsis, rice, and tomato. In the tomato BPC proteins, a region of about 100 amino acids was highly conserved and started with a conserved Asn residue (except in SlBPC1, SlBPC2, and SlBPC6, where it started with Asp, Lys, and Gly residues, respectively). The BPC proteins also contain five cysteine residues and the RxxGRKMS and WAKHGTN motifs, which are highlighted in the multiple sequence alignment (Fig. 3).

Fig. 2

Schematic depiction of the domain organization of SlBPC proteins. GAGA-binding domain, BPC domain, Penta cysteine, GRKMS, and WAR/KHGTN identified in SlBPC proteins are shown

Fig. 3

Alignment of the GAGA domains of BPC proteins from Arabidopsis thaliana, Solanum lycopersicum, and Oryza sativa. The secondary structural elements determined by the ESPript 3.0 web tool are indicated above the alignment

Chromosomal position, gene duplication, and microsynteny analysis of SlBPC genes

The six SlBPC genes were unevenly distributed on five chromosomes (Chr 1, 2, 4, 6, and 8), with most of them located near the distal portions (Fig. S4). Four of these chromosomes carried a single gene each, while Chr 4 contained two BPC genes. Among the six tomato BPC genes, only one segmental duplication (SlBPC2/SlBPC4) was predicted; these two genes belong to the same phylogenetic group II but are located on different chromosomes (Fig. S5). No tandem duplication was noted, as no two genes were mapped within a 100-kb region on the same chromosome. The Ka/Ks ratio of the duplicated pair was less than one, indicating that it evolved under strong purifying selection (Table 3). This duplicated gene pair was estimated to have diverged 113.86 million years ago. We also conducted a comparative microsynteny analysis among Arabidopsis, rice, and tomato to identify BPC orthologous gene pairs across the three genomes (Fig. 4). Three orthologous gene pairs were observed between Arabidopsis and tomato. In contrast, no such gene pairs were predicted between Arabidopsis and rice or between tomato and rice.
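
The tandem-duplication criterion used above (two genes on the same chromosome within 100 kb of each other) can be sketched as a simple pairwise check in Python. The function name, the gene names, and the coordinates below are hypothetical placeholders, not the actual SlBPC positions, and measuring the distance between gene start positions is an assumption.

```python
from itertools import combinations


def tandem_candidates(gene_positions, max_distance_bp=100_000):
    """Return gene pairs on the same chromosome within max_distance_bp.

    gene_positions maps a gene name to (chromosome, start position in bp).
    """
    pairs = []
    for (name_a, (chrom_a, start_a)), (name_b, (chrom_b, start_b)) in \
            combinations(gene_positions.items(), 2):
        if chrom_a == chrom_b and abs(start_a - start_b) <= max_distance_bp:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical coordinates for illustration only.
positions = {
    "geneA": ("chr4", 1_000_000),
    "geneB": ("chr4", 1_050_000),
    "geneC": ("chr4", 5_000_000),
    "geneD": ("chr1", 1_020_000),
}
print(tandem_candidates(positions))  # -> [('geneA', 'geneB')]
```

An empty result from such a check corresponds to the outcome reported here, where no tandem duplication was detected.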

Table 3 Predicted Ka/Ks ratio of the duplicated gene pairs along with its divergence time

Fig. 4

Microsyntenic relationship of BPC genes across Arabidopsis, tomato, and rice. The chromosomes of the three species are depicted by different colors: Arabidopsis, green; tomato, red; and rice, blue. All chromosomes are shown with the scale in megabase pairs (Mbp). The duplicated SlBPC genes in the tomato genome are indicated by red lines

Prediction of cis-regulatory elements, microRNA (miRNA) target sites, and phosphorylation sites

Gene expression in response to various stresses is regulated by different hormones. In the present study, we identified many cis-regulatory elements related to abiotic stress and phytohormones in the promoter regions of the BPC genes (Fig. S6). The CGTCA and TGACG motifs (JA response) as well as the ABA-responsive cis-elements ARE and/or ABRE were present in all six BPC genes. The cis-elements P-box, TATC, and GARE, found in SlBPC1 and SlBPC3, have been reported to be associated with gibberellic acid. The TCA-element, reported to be involved in SA response, was found in the promoter region of SlBPC5. The auxin-response-related cis-element TGA was present in SlBPC6. The MBS cis-regulatory element, present in SlBPC1 and SlBPC4, may be associated with drought tolerance (Fig. S6 and Table S2). Similarly, miRNA target sites were predicted in the BPC genes to understand their regulation under abiotic stress (López-Galiano et al. 2019). We identified around 28 miRNA target sites in four tomato BPC genes; none were found in SlBPC4 or SlBPC5 (Table S3). We also predicted phosphorylation sites and N-glycosylation sites, which are reported to be involved in the post-translational regulation of stress-related proteins. Numerous phosphorylation sites, including CKII sites, and many N-glycosylation sites were detected in all six tomato BPC proteins (Table S4).

Comparative modeling of tomato SlBPC proteins

The 3D protein models of the BPC proteins show a GAGA-binding domain (100 aa) composed of α-helices, β-sheets, and coils (Fig. 5). The tomato BPC proteins have 3–10 α-helices, 2–5 β-sheets, and 7–13 coils as secondary structural elements (Table S5). The C-scores, RMSD values, TM-scores, number of decoys, and cluster density of the six BPC proteins were calculated to estimate the validity of the predicted models (Tables S6, S7) (Yang et al. 2015). We also predicted the ligand-binding sites that can interact with diverse molecules in the 3D protein models (Table S8). In addition, the gene ontology (GO) terms for the biological process, cellular component, and molecular function of the SlBPC proteins, such as transporter activity, binding to a variety of ligands, and transferase activity, were determined using the I-TASSER server (Table S9).

Fig. 5

Predicted three‐dimensional homology structure of tomato BPC proteins. The final 3D structures of SlBPC proteins were built by Discovery Studio v.21.1. The secondary structural components: α‐helices (red), β‐sheets (cyan), coils (green), and loops (gray) as well as the top four putative-binding sites: site 1 (yellow sphere), site 2 (green sphere), site 3 (red sphere), and site 4 (blue sphere) are indicated in the predicted 3D models of A SlBPC1; B SlBPC2; C SlBPC3; D SlBPC4; E SlBPC5; and F SlBPC6

Gene co-expression network analysis

The co-expression analysis of SlBPC genes was performed using RNA sequencing data with weighted gene co-expression network analysis. The results showed that around 21,180 genes were co-expressed with five SlBPC genes (Fig. 6). The hub genes (SlBPC1, SlBPC3, SlBPC4, SlBPC5, and SlBPC6) were co-expressed with 5128, 5128, 5462, and 5462 genes, respectively. GO and KEGG pathway enrichment analysis revealed that some genes present in the co-expression network were not annotated with any biological process. Moreover, many genes were involved in different biological pathways, such as metabolic pathways, biosynthesis of secondary metabolites, pyrimidine metabolism, and fatty acid metabolism (Fig. 6, Fig. S7, Table S10). The genes co-expressed with SlBPC1 were related to HSP (Solyc12g006940), WD domain (Solyc07g044850), zinc finger (Solyc09g031650), ATP inhibitor (Solyc11g068510), and GTP-binding protein (Solyc11g071910). Furthermore, genes such as zinc finger C3HC4 (Solyc11g069410), RING finger protein (Solyc04g007500), 60S ribosomal protein L35a (Solyc05g052800), Calreticulin 2 calcium-binding protein (Solyc05g056230), and ribosomal protein S27 (Solyc01g008090) were co-expressed with the SlBPC3 gene. The genes calcium-dependent protein kinase3 (Solyc01g112250), 40S ribosomal protein S12 (Solyc03g083290), Peptide methionine sulfoxide reductase msrb (Solyc10g047930), and Shikimate kinase like protein (Solyc08g076410) were co-expressed with the SlBPC4 gene. Likewise, genes related to Glutathione-disulfide reductase (Solyc09g091840), Threonyl-tRNA synthase (Solyc08g074550), Vesicle-associated membrane protein 7B (Solyc03g083070), Zinc-transporter like protein (Solyc06g076440), and serine/threonine-protein phosphate (Solyc09g074320) were co-expressed with the SlBPC5 gene. The SlBPC6 gene was co-expressed with serine hydroxymethyltransferase (Solyc01g104000), Fructose-bisphosphate aldolase (Solyc10g083570), RNA helicase DEAD13 (Solyc03g114370), and BHLH transcription factor 054 (Solyc07g053290).

Fig. 6

Weighted gene co‐expression network analysis (WGCNA) of SlBPC genes. The co‐expressed genes in the network of SlBPC1, SlBPC3, SlBPC4, SlBPC5, and SlBPC6. The SlBPC genes are marked in red

Subcellular localization of SlBPC proteins

The subcellular localization of all six tomato BPC proteins was predicted using an online server. The results showed that the BPC proteins were predicted to localize to various parts of the cell, including the nucleus, mitochondria, and cytoplasm (Table S12). To confirm the prediction, six BPC proteins were fused with GFP and expressed in rice protoplasts. SlBPC1, SlBPC2, SlBPC3, and SlBPC6 were found to localize predominantly in the nucleus (Fig. 7).

Fig. 7

Subcellular localization of SlBPC proteins. SlBPC–sGFP fusion constructs were used to analyze the localization of SlBPC1, SlBPC2, SlBPC3, and SlBPC6, and the fluorescence signals were visualized with a confocal microscope. The NLS-mRFP construct was used as a nuclear marker. Scale bars = 10 μm

Expression profiling of tomato BPC genes in different organs

The expression profiles of SlBPC genes revealed differential expression patterns across all tested tomato organs (Fig. 8). We tested the expression level of these BPC genes in nine different organs (leaf, root, stem, flower, and 1 cm, IM, MG, B, and B5 fruits), with the leaf sample considered as the control. Among the six genes tested, only SlBPC2 and SlBPC6 showed expression levels above the threshold of ≥ twofold compared to the control. SlBPC1 showed higher expression in B5 (1.2-fold), but its expression was lower than the control in the remaining samples. SlBPC2 was most highly expressed in B5 (2.8-fold), followed by IM (1.36-fold), B (1.17-fold), and MG (1.02-fold), while the remaining samples (stem, root, flower, and 1 cm fruit) showed lower expression than the control. SlBPC3 was most highly expressed in B5 (1.29-fold), followed by flower and stem (1.17- and 1.05-fold, respectively), compared to the control (Fig. 8). The expression level of SlBPC4 was lower than in the control leaf sample. For SlBPC5, only two samples, stem and flower (1.40- and 1.07-fold, respectively), showed higher expression than the control. Similarly, SlBPC6 showed a more than twofold increase in B5 (2.26-fold), followed by stem, B, IM, and root (1.60-, 1.21-, 1.15-, and 1.13-fold, respectively), compared to the control sample.

Fig. 8

Expression profiles of SlBPC genes in various organs: leaves, stems, roots, flowers, 1 cm fruits, immature fruits (IM), mature green fruits (MG), breaker fruits (B), and fruits 5 days after the breaker stage (B5). The standard deviations of the means of three independent biological replicates are represented by the error bars. The different asterisks above the bars indicate the significant variations between the control samples (leaves) and the other organs by Student’s t test with p values less than 0.05 for *, 0.01 for **, and 0.001 for ***, respectively
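The twofold threshold applied to the organ data can be illustrated with a short script. This is only a sketch: the dictionary holds the highest fold change reported in the text for each gene (SlBPC4 is omitted because no value is given for it), and the names `PEAK_FOLD_CHANGE` and `genes_above_threshold` are hypothetical helpers introduced for illustration, not part of the study's analysis pipeline.

```python
# Highest fold change relative to the leaf control, as reported in the text.
# SlBPC4 is left out: the text only states that it stayed below the control.
PEAK_FOLD_CHANGE = {
    "SlBPC1": ("B5", 1.20),
    "SlBPC2": ("B5", 2.80),
    "SlBPC3": ("B5", 1.29),
    "SlBPC5": ("stem", 1.40),
    "SlBPC6": ("B5", 2.26),
}

THRESHOLD = 2.0  # the >= twofold cut-off used in the text


def genes_above_threshold(peaks, threshold=THRESHOLD):
    """Return the sorted names of genes whose peak fold change meets the threshold."""
    return sorted(gene for gene, (_, fold) in peaks.items() if fold >= threshold)


if __name__ == "__main__":
    for gene in genes_above_threshold(PEAK_FOLD_CHANGE):
        organ, fold = PEAK_FOLD_CHANGE[gene]
        print(f"{gene}: {fold:.2f}-fold in {organ}")
```

With the reported values, only SlBPC2 and SlBPC6 pass the cut-off, both in B5 fruit.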

Expression profiling of SlBPC genes under abiotic stresses and phytohormone treatment

The relative expression of the six BPC genes was analyzed at five different time points under various abiotic stresses (heat, cold, salt, and drought) and ABA phytohormone treatment (Fig. 9). After exposure to salt treatment, the expression level ranged from a 1.02- to 5.32-fold change at one or more time points (Fig. 9A). The expression level of the SlBPC1 gene gradually increased from 1.18 at 1 h to 3.12 at 9 h and to a 5.32-fold change at 24 h after exposure to salt stress. Likewise, SlBPC4 was gradually induced and showed significant upregulation of 2.14-fold at 24 h. Similarly, the SlBPC6 gene showed significant upregulation of 2.11, 2.84, and 4.02 at 3 h, 9 h, and 24 h under salt stress. SlBPC2 expression reached 1.74-fold at 9 h and decreased at 24 h. SlBPC3 showed lower expression than the control sample until 9 h and increased to a 1.34-fold change at 24 h. In contrast, SlBPC5 remained unchanged compared to the control (0 h).

Fig. 9

Expression profiling of SlBPC genes under different abiotic stresses: A salt, B heat, C drought, D cold, and E ABA treatment. Error bars represent standard deviations of the means of three independent biological replicates. Asterisks (* for p value < 0.05, ** for p value < 0.01, and *** for p value < 0.001) indicate statistically significant differences compared to the respective control using Student's t test
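The asterisk convention stated in the captions of Figs. 8 and 9 can be written as a small helper. This is an illustrative sketch only; the function name `significance_stars` is hypothetical, and the p value is assumed to have been computed beforehand (for example from a Student's t test).

```python
def significance_stars(p_value: float) -> str:
    """Map a p value to the asterisk label used in the figure captions.

    *** for p < 0.001, ** for p < 0.01, * for p < 0.05, empty string otherwise.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p value must lie between 0 and 1")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


if __name__ == "__main__":
    for p in (0.2, 0.03, 0.004, 0.0002):
        print(p, repr(significance_stars(p)))
```

The thresholds are checked from the strictest to the loosest so that each p value receives the label for the smallest cut-off it passes.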

The transcript level of SlBPC1 was downregulated at 1 h after heat stress, but increased by 1.64- and 1.87-fold at 3 and 9 h, respectively (Fig. 9B). The expression of SlBPC2 was upregulated by a 2.55-fold change at 3 h and was maintained until 24 h after heat exposure, whereas SlBPC3, SlBPC4, and SlBPC6 showed low expression compared to the control at all time points. The transcript of SlBPC5 was initially downregulated and then increased by 1.52-fold at 3 h when compared to the control sample.

Under drought stress, most of the BPC genes were downregulated. The expression level of the SlBPC1 gene was initially downregulated but increased to a 1.20-fold change at 72 h (Fig. 9C). SlBPC2 expression was unstable across time points and reached a 1.15-fold change at 72 h compared to the control. Three genes (SlBPC3, SlBPC4, and SlBPC5) were significantly downregulated at all time points. SlBPC6 showed initial downregulation but increased by 1.33- and 1.48-fold at 48 h and 60 h after drought treatment.

The cold treatment triggered the expression of BPC genes in tomato (Fig. 9D). The expression level of SlBPC1 was induced by a 1.89-fold change at 3 h, but decreased at later time points. SlBPC2 was induced by 2.64-fold at 3 h after the cold stress and gradually decreased at subsequent time points. A similar trend was observed for the SlBPC3, SlBPC4, and SlBPC5 genes, with fold changes of 1.64, 1.90, and 1.55 at 3 h followed by a decrease at later time points. In contrast, SlBPC6 was significantly downregulated at all time points.

After ABA treatment, most of the BPC genes were highly expressed in the tomato samples (Fig. 9E). The expression level of the SlBPC1 gene increased gradually to a 2.94-fold change at 24 h. Similarly, SlBPC2 was highly induced, by 2.25 and 2.43 at 9 h and 24 h, respectively. The expression level of the SlBPC3 gene was unstable at different time points. The transcript of SlBPC4 increased gradually to a 1.97-fold change at 24 h after ABA treatment. Likewise, SlBPC6 showed a gradual increase at each time point and reached a 3.72-fold change at 24 h when compared to the control. In contrast, the SlBPC5 gene was downregulated at all time points.
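The stress results above can be condensed by listing, for each treatment, the genes whose highest reported fold change reaches twofold. The sketch below uses only values quoted in the text; genes described without a number (for example, as downregulated or unchanged) are omitted, and the names `PEAK_BY_STRESS` and `upregulated_by_stress` are hypothetical, introduced for illustration rather than taken from the study.

```python
# Highest fold change quoted in the text for each gene under each treatment.
# Genes reported only qualitatively (downregulated, unchanged, unstable) are omitted.
PEAK_BY_STRESS = {
    "salt": {"SlBPC1": 5.32, "SlBPC2": 1.74, "SlBPC3": 1.34, "SlBPC4": 2.14, "SlBPC6": 4.02},
    "heat": {"SlBPC1": 1.87, "SlBPC2": 2.55, "SlBPC5": 1.52},
    "drought": {"SlBPC1": 1.20, "SlBPC2": 1.15, "SlBPC6": 1.48},
    "cold": {"SlBPC1": 1.89, "SlBPC2": 2.64, "SlBPC3": 1.64, "SlBPC4": 1.90, "SlBPC5": 1.55},
    "ABA": {"SlBPC1": 2.94, "SlBPC2": 2.43, "SlBPC4": 1.97, "SlBPC6": 3.72},
}


def upregulated_by_stress(peaks, threshold=2.0):
    """For each treatment, return the sorted genes whose peak fold change meets the threshold."""
    return {
        stress: sorted(gene for gene, fold in genes.items() if fold >= threshold)
        for stress, genes in peaks.items()
    }


if __name__ == "__main__":
    for stress, genes in upregulated_by_stress(PEAK_BY_STRESS).items():
        print(f"{stress}: {', '.join(genes) if genes else 'none'}")
```

Under these values, no gene reaches twofold under drought, SlBPC2 alone passes under heat and cold, and SlBPC1 and SlBPC6 pass under both salt and ABA treatment.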

Discussion

The Barley B recombinant (BBR) transcription factor domain is highly conserved and shows a high percentage of sequence similarity. Members of the BBR/BPC protein family are reported to be involved in plant developmental regulation. According to recent reports, BBR proteins play an important role in hormone signaling, specifically that of auxin, cytokinin, and ethylene (Marius et al. 2017). Based on the diversity of their N-terminus, BPC proteins are classified into three classes: class I (BPC1–BPC3), class II (BPC4–BPC6), and class III (BPC7) (Theune et al. 2019).

This study is the first attempt to characterize the BPC genes in a solanaceous species. A total of 6 BPC genes were identified to understand their function at a genome level, as well as their expression in different tissues and against five abiotic stresses. We performed a phylogenetic analysis with 6 BPC tomato genes, comparing them with four genes from Sphagnum fallax, four from Amborella trichopoda, six from potato, seven from Arabidopsis, four from cucumber, four from rice, five from wheat, and five from Sorghum bicolor (Fig. 1).

The phylogenetic results suggest that group I, which clustered both monocots and dicots, originated from a common ancestor before their separation (Wolfe et al. 1989). Groups II and III were clustered with most dicot species, indicating that these proteins evolved after the monocot and dicot separation. In contrast, group IV contained only the four proteins of Sphagnum fallax, a moss that may have evolved from the common ancestor of chlorophytes and streptophytes (Merchant et al. 2007). Similarly, the phylogenetic analysis of cucumber BPC genes clustered both monocot and dicot species in group I, and group II with many dicot species (Li et al. 2019a, b). Plant BPC and animal GAF TFs share similar molecular functions, recruiting the polycomb group (PcG) proteins to the (GA)n regions of target genes to silence them (Xiao et al. 2017). Animal GAF TFs have three domains: a BTB/POZ domain at the N-terminus, a central C2H2-type zinc-finger region, and a glutamine-rich Q domain toward the C-terminus (Sahu et al. 2023). Plant BPC TFs contain a conserved C-terminal region with a pentacysteine motif, a central zinc finger-like DNA-binding domain, and a coiled-coil structure containing an alanine zipper domain, which shares similarities with the leucine zipper, at the N-terminal region. Interestingly, the coiled-coil domain of plants and the BTB/POZ domain at the N-terminal region were reported to have a similar functional role, both being involved in dimerization, but are structurally different (Theune et al. 2017; Sahu et al. 2023). Based on protein sequence alignment and phylogenetic analysis, no structural or evolutionary similarity is observed between plant BPC and animal GAF TFs (Sahu et al. 2023). Adaptation is necessary for plants and animals to survive in their ecosystem. To survive, plants and animals can change their shape, physiology, and behavior. The GAF/BPC protein family has likewise evolved with many changes in its sequences.
Structural changes were observed in the GAF proteins of Drosophila, Danio, and honey bee, which contain one, four, and two zinc-finger domains, respectively (Matharu et al. 2010). Similarly, the X position of the fourth zinc finger is lysine in mammals but arginine in fishes. Likewise, plants have a unique set of five cysteine residues at the C-terminus as an adaptation (Sahu et al. 2023). Domain analysis was performed for all 17 BPC proteins from three species. Each BPC protein contained five domains in addition to the GAGA-binding domain. These results suggested that the additional domains may differ from the GAGA-binding domain, potentially with changes in the number of repeats to enhance diversification (Fig. 2). Analysis of gene structure (introns and exons) and conserved motifs revealed that genes within the same cluster group shared similar numbers of exons and introns as well as similar motif compositions. This indicates that genes with similar functions have a closer evolutionary relationship (Fig. S1). The different gene structures of BPC genes in different phylogenetic groups suggest potential functional differences that have emerged over the course of evolution. Additionally, multiple alignment of BPC proteins revealed three motifs present in the protein sequences: five cysteine residues, RxxGRKMS, and WAKHGTN. These motifs were also observed in the BPC genes of wheat (Ahad et al. 2021).

A single duplication pair (SlBPC2/SlBPC4) was identified in tomato (Fig. S5; Table 3). The two genes are located on different chromosomes, suggesting that the duplication was segmental and was retained under purifying selection, increasing the number of BPC genes in tomato for adaptation to abiotic stress. Gene duplication of BPC family genes was not reported in cucumber or wheat (Li et al. 2019a, b; Ahad et al. 2021). Comparative microsynteny analysis of the BPC genes was conducted between tomato and Arabidopsis (Fig. 4). Three orthologous gene pairs were identified (AtBPC2/SlBPC3, AtBPC4/SlBPC1, and AtBPC7/SlBPC6), while none were identified between rice and either Arabidopsis or tomato. These results correlate well with the closer evolutionary relationship of tomato with the dicot species Arabidopsis than with the monocot species rice. It is also possible that these genes in tomato could have originated from Arabidopsis during species separation.

Cis-acting elements can switch transcription factor genes between active and silent states under different abiotic stresses (Biłas et al. 2016). We identified several cis-acting elements related to different hormones and abiotic stress responses in the promoter regions of the BPC genes (Fig. S6; Table S2). The CGTCA-motif (JA response) or TGACG-motif cis-elements and the ABA-responsive cis-elements ARE and ABRE were present in all six BPC genes (Fig. S6). The cis-elements P-box, TATC, and GARE, which were reported to be associated with gibberellic acid, were present in SlBPC1 and SlBPC3. TCA-elements, reported to be involved in the SA response, were present in the SlBPC5 gene promoter region. The auxin response-related cis-element TGA was present in the SlBPC6 gene. MBS cis-regulatory elements, which may be associated with drought tolerance, were present in SlBPC1 and SlBPC4. Similar MBS cis-elements and their role in abiotic stress have been reported for the PLATZ transcription factor in tomato (Wai et al. 2022).

miRNAs play an important role in regulating gene expression by cleaving target transcripts or suppressing translation. We identified 11 miRNAs (sly-miR3124, sly-miR3128, sly-miR10531, sly-miR396a, sly-miR396b, sly-miR396c, sly-miR10532, sly-miR7981, sly-miR4423, and sly-miR3164) that cleave four SlBPC genes, with the remaining miRNAs (sly-miR5302, sly-miR4423, sly-miR395a, sly-miR395b, sly-miR395d, sly-miR395h, sly-miR395f, sly-miR395i, sly-miR395k, sly-miR395l, sly-miR395m, sly-miR395n, sly-miR395o, sly-miR395p, sly-miR10532, and sly-miR7981) involved in translation (Table S3). It has been reported that miR395 is involved in the response of cucumber to salt stress (Lu et al. 2022). Likewise, miR396 has been reported to play a role in combating abiotic stress in pitaya (Li et al. 2019a, b). miR395 has also been shown to be differentially induced in Arabidopsis shoots under sulfur dioxide (SO2) stress. In addition, two miRNAs, sly-miR10532 and sly-miR7981, have been reported to suppress the mRNAs of galacturonosyltransferase-10, the main enzyme of pectin biosynthesis, in immature tomato fruit under drought stress (Asakura et al. 2023). Interestingly, the SlBPC3 gene was predicted to have many miRNA target sites. These results suggest that miR395, miR396, miR7981, and miR10532 may regulate the expression of SlBPC genes.

Phosphorylation is a post-translational modification carried out by kinases and phosphatases in signaling pathways and stress responses (Ardito et al. 2017). Around nine casein kinases (CKI and CKII) were identified in these six BPC genes (Table S4). In many plant species, such as Arabidopsis, tomato, wheat, and soybean, protein kinases have been reported to play an important role under cold and heat exposure (Vu et al. 2021). In another report, around 1098 phosphorylation sites were reported in grass species under heat and/or drought stress (Zhang et al. 2020). Based on these reports, it is suggested that the phosphorylation sites (S154 for SlBPC1; S79, T102, T113, and S158 for SlBPC2; S7 for SlBPC3; T182 for SlBPC4; S32 for SlBPC5; and S80 for SlBPC6) may be involved in protein modification. However, additional experiments should be conducted to understand the exact role of these sites.

To understand the interactions of the proteins with ligand-binding sites and other molecules, 3D structures of the six SlBPCs were predicted (Fig. 5). The number of coils was higher than that of α‐helices and β‐strands in all SlBPCs (Table S5), suggesting that coils could help the proteins to rotate and bind to ligands, viz., zinc, nucleic acid, magnesium, and calcium (Table S8). In addition to ligand-binding sites, SlBPC proteins can interact with other molecules such as receptors, ions, peptides, and intercellular messengers to alter cell function. Additionally, we predicted GO terms, which classified the SlBPCs into biological processes (associated with signaling, regulation of cellular processes, and regulation of transcription and translation), cellular components (associated with membranes and obsolete cytoplasmic parts), and molecular functions (associated with molecular transducer activity and structural constituents of ribosomes) (Table S9). These results suggest that SlBPC proteins may play an important role in regulating gene expression in tomato.

In general, most transcription factors are localized in the nucleus. In the present study, SlBPC1, SlBPC2, SlBPC3, and SlBPC6 were shown to be localized in the nucleus (Fig. 7). Similar results were observed for TaBPC genes (Ahad et al. 2021). Another transcription factor, PLATZ, was also localized in the nucleus in tomato (Wai et al. 2022). These findings are consistent with those for BPC homologs from wheat, suggesting that BPCs from different species may have a conserved function under abiotic stresses.

To understand the function of SlBPC genes in tomato, expression analysis was conducted in different organs. Out of the six genes tested, two genes, SlBPC2 and SlBPC6, showed upregulation in reproductive tissues (B5), suggesting that these genes may be involved in fruit development and ripening in tomato (Fig. 8). However, all genes showed low expression in vegetative tissues, such as roots, stems, and flowers. Similarly, four BPC genes of cucumber were tested in 12 different tissue samples, which revealed that all the genes were highly induced in mature fruit seed tissue (FS) and showed low expression in stem and tendril tissues (Li et al. 2019a, b). This finding suggests that BPC transcription factors are important for the growth, development, and ripening of tomato fruit.

Under various environmental stresses, the plant's immune system is regulated by a network of genes, such as defense genes, transcription factors, and downstream genes activated under ABA-dependent or ABA-independent pathways (Hernández and Sanan-Mishra 2017). We studied six BPC genes under five abiotic stresses at various time points (Fig. 9). Two genes, SlBPC1 and SlBPC6, showed high expression at later stages under salt stress (Fig. 9A). A similar trend was observed in two of the four genes tested in cucumber, which showed higher expression in leaves at later stages compared to roots (Li et al. 2019a, b). Heat stress triggered an increase in the expression level of the SlBPC2 gene after 3 h of exposure, which was maintained until 24 h (Fig. 9B). In contrast, the four cucumber genes were highly expressed in roots at all the time points tested (Li et al. 2019a, b). None of the genes showed an expression level of more than twofold under drought stress (Fig. 9C), whereas all four genes were expressed more than twofold in both leaves and roots in cucumber (Li et al. 2019a, b). On exposure to cold stress, only the SlBPC2 gene was upregulated by more than twofold at 3 h and then declined at later stages (Fig. 9D). Similarly, CsBPC genes were expressed in roots and leaves at one time point and decreased at the remaining time points (Li et al. 2019a, b). Under ABA treatment, three genes, SlBPC1, SlBPC2, and SlBPC6, were upregulated at one or two time points at later stages (Fig. 9E). It has been reported that the ABA phytohormone regulates many stress-related genes under various environmental stresses (Suzuki et al. 2016). In contrast, all the genes were highly expressed (> 30-fold) in roots at all the time points in cucumber, whereas the expression level in leaves was low for all genes (Li et al. 2019a, b).

To better understand the role of SlBPC genes, co-expression analysis was performed based on RNA sequencing data (Fig. 6; Table S10). Except for the SlBPC2 gene, the remaining five genes were co-expressed with hub genes. Defense-related genes, such as those encoding hsp, WD domain, and zinc finger proteins, were co-expressed with the SlBPC1 gene (Yan et al. 2023). Some genes reported to act against drought, cold, heat, and salt stress were co-expressed with the SlBPC3 gene (Song et al. 2016). Defense-related genes and drought tolerance genes were co-expressed with the SlBPC4 gene (Chico et al. 2002; Cui et al. 2022). Genes reported to be involved in heat stress were co-expressed with the SlBPC5 gene (Olivieri et al. 2021). Genes reported to be involved in the growth and development of tomato and resistance genes against yellow curl virus were co-expressed with the SlBPC6 gene. No reports of co-expression analysis of BPC family genes were found for cucumber or wheat (Li et al. 2019a, b; Ahad et al. 2021).

Conclusion

In the current study, we comprehensively characterized the six SlBPC genes in the tomato genome. We further categorized them into four phylogenetic groups based on gene structure, chromosome location, domains, and motif analysis. One segmental gene duplication pair was identified, which might be responsible for the expansion of BPC genes in tomato during evolution. Several cis-elements and miRNAs were associated with regulating the expression of SlBPC genes. The predicted phosphorylation sites might be associated with protein structure and its interaction with ligands. RT-PCR results revealed that SlBPC2 and SlBPC6 were induced in B5 tissue, suggesting a role in ripening. Three SlBPC genes (SlBPC1, SlBPC2, and SlBPC6) showed significant upregulation, indicating their importance in the response to abiotic stress. The co-expression analysis revealed that SlBPC genes were co-expressed with several genes related to stress adaptation, supporting their putative function in the stress tolerance of tomato. Overall, our results might be useful for further exploration of the function of SlBPC genes for the genetic improvement of tomato.