Open Access
Method
Mining the Arabidopsis thaliana genome for highly-divergent seven transmembrane receptors
Etsuko N Moriyama*, Pooja K Strope*, Stephen O Opiyo, Zhongying Chen
and Alan M Jones
Addresses: *School of Biological Sciences and Plant Science Initiative, University of Nebraska-Lincoln, Lincoln, NE 68588-0660, USA.
Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583-0915, USA. Departments of Biology and
Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Correspondence: Etsuko N Moriyama. Email: abpy2j@r.postjobfree.com
Published: 25 October 2006
Genome Biology 2006, 7:R96 (doi:10.1186/gb-2006-7-10-r96)
Received: 28 June 2006
Revised: 24 August 2006
Accepted: 25 October 2006
The electronic version of this article is the complete one and can be
found online at http://genomebiology.com/2006/7/10/R96
© 2006 Moriyama et al.; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
A combination of multiple protein classification methods is described and used to identify a minimum set of 54 candidate seven transmembrane receptors in Arabidopsis thaliana.
Abstract
To identify divergent seven-transmembrane receptor (7TMR) candidates from the Arabidopsis
thaliana genome, multiple protein classification methods were combined, including both alignment-
based and alignment-free classifiers. This resolved problems in optimally training individual
classifiers using limited and divergent samples, and increased stringency for candidate proteins. We
identified 394 proteins as 7TMR candidates and highlighted 54 with corresponding expression
patterns for further investigation.
Background
Seven-transmembrane (7TM)-region containing proteins constitute the largest receptor superfamily in vertebrates and other metazoans. These cell-surface receptors are activated by a diverse array of ligands, and are involved in various signaling processes, such as cell proliferation, neurotransmission, metabolism, smell, taste, and vision. They are the central players in eukaryotic signal transduction. They are commonly referred to as G protein-coupled receptors (GPCRs) because most transduce extracellular signals into cellular physiological responses through the activation of heterotrimeric guanine nucleotide binding proteins (G proteins) [1]. However, an increasing number of alternative 'G protein-independent' signaling mechanisms have been associated with groups of these 7TM proteins [2-5]. Thus, for precision and clarity, we refer to these proteins simply as 7TM receptors (7TMRs), and candidate proteins in organisms greatly divergent to humans are designated here as 7TM putative receptors (7TMpRs).

The human genome encodes approximately 800 or more 7TMRs, both with and without known cognate ligands (the latter are so-called orphan GPCRs); they thus constitute >1% of the gene complement [6,7]. More than 1,000 genes or 5% of the Caenorhabditis elegans genome are predicted to encode 7TMRs; the majority of them appear to be chemoreceptors [8]. Approximately 300 7TMR-encoding genes (about 1% to 2% of the genome) have been recognized in the Drosophila melanogaster genome [6,7]. Compared to such large numbers of 7TMRs found in animal genomes, very few 7TMpRs have been reported in plants and fungi. Only 22 Arabidopsis 7TMpRs have been described so far. Fifteen of them constitute the 'mildew resistance locus O' (MLO) family, whose direct interaction with the G-protein α subunit (Gα) has not been shown [9,10]. While another 7TMpR, GCR1 [11], directly interacts with the plant Gα subunit GPA1 [12], it has been shown that GCR1 can act independently of the heterotrimeric G-protein complex as well [2]. Hsieh and Goodman [13] recently reported five expressed proteins predicted to
have 7TM regions (heptahelical transmembrane proteins (HHPs) 1 to 5) but these, like the other 16, do not have candidate ligands. Finally, an unusual Regulator of G-protein Signaling (RGS) protein (designated AtRGS1) has been predicted to have 7TM regions [14]. RGS proteins function as GTPase activating proteins (GAPs) to de-sensitize signaling by de-activating the Gα subunits of the heterotrimeric complex. Because Arabidopsis seedlings lacking AtRGS1 have reduced sensitivity to D-glucose [2,14,15], the possibility exists that AtRGS1 is a novel D-glucose receptor having an agonist-regulated GAP function. Although we designate them 7TMpRs here, it should be noted that neither a ligand nor a full signaling cascade has been demonstrated yet for any of these plant proteins, and only for a barley MLO protein has the 7TM topology been experimentally confirmed [9].

None of the reported Arabidopsis 7TMpR proteins share substantial sequence similarity with known metazoan GPCRs constituting six different subfamilies. It appears that plant 7TMpRs dramatically diverged from known metazoan GPCRs over the 1.6 billion years since the plant and metazoan lineages bifurcated. It should be noted that Arabidopsis GCR1 shares weak but significant similarity with the cyclic AMP receptor, CAR1, found in the slime mold [2,11,16]. There is also very weak similarity to the Class B Secretin family GPCRs. However, other than GCR1, currently used search methods have not robustly identified plant 7TMpR proteins as candidate GPCRs. This great sequence divergence highlights the need for new approaches to identify divergent 7TMR candidates in non-metazoan genomes.

The human genome contains 16 Gα, 5 Gβ, and 12 Gγ genes. In stark contrast, both fungi and plants have much simpler G-protein coupled signaling systems. For example, the Arabidopsis genome contains one canonical Gα, one Gβ, and two Gγ genes [17]. Similarly, a small number of G-proteins are found in fungi; there are two Gα, one Gβ, and one Gγ in Saccharomyces cerevisiae [18-20] while Neurospora crassa and some fungi have more genes encoding each subunit [21-23]. Therefore, it may be reasonable to assume that plants and fungi have fewer GPCRs than humans, and while approximately 200 Arabidopsis proteins were predicted to have 7TM regions, sequence divergence precludes unequivocal assignment of any as an orphan GPCR [24,25]. However, at least 61 7TMpRs have been recently predicted from the plant pathogenic fungus Magnaporthe grisea genome [26], raising the possibility that more divergent groups of 7TMpR proteins remain undiscovered in non-metazoan taxa.

In this report, we describe our comprehensive computational strategy for identifying 7TMpR candidates from the entire protein sequence set predicted from the A. thaliana genome, and compile their tissue-specific expression and co-expression patterns with G-proteins. To take advantage of different approaches, we combined multiple protein classification methods, including more specific (conservative) alignment-based classifiers and more sensitive alignment-free classifiers, to predict candidate 7TMpRs in divergent genomes more effectively.

Results and discussion
Identifying 7TMpR candidates using various protein classification methods
Among the many protein classification methods commonly used, the current state-of-the-art and most used is the profile hidden Markov model (profile HMM) [27]. It is used to construct protein family databases such as Pfam [28,29], SMART [30,31], and Superfamily [32]. However, profile HMMs and other currently used classification methods such as PROSITE [33,34] and PRINTS [35,36] share an important weakness. These methods rely on multiple alignments for generating their models (patterns, profile HMMs, and so on). Generating robust multiple alignments is difficult or impossible when extremely diverged sequences are included in the analysis; 7TMRs are one such protein family whose sequence similarities between subgroups can be lower than 25%. Furthermore, alignments are generated only from known related proteins (positive samples), and, therefore, no information from negative samples (unrelated protein sequences) is directly incorporated in the model building process. Identifiable 'hits' are, therefore, constrained by initial sampling bias, which becomes reinforced when models are iteratively rebuilt from accumulated sequences. Consequently, the predictive power, especially the sensitivity, of these classifiers decreases when they are applied against extremely diverged protein families.

To overcome this disadvantage and to increase sensitivities against such non-alignable similarities, several 'alignment-free' methods have been proposed recently. These methods quantify various properties of amino acid sequences and convert them into a descriptor array. Once multiple sequences with different lengths are transformed into a uniform matrix, various multivariate analysis methods can be applied. Kim et al. [37] and Moriyama and Kim [38] used parametric and non-parametric discriminant function analysis methods. Karchin et al. [39] incorporated profile HMMs with support vector machines (SVMs) using the Fisher kernel (SVM-Fisher) so that negative sample information can be taken into account when training the classifier. SVMs can be applied with completely 'alignment-free' sequence descriptors, for example, amino acid and dipeptide compositions. Such alignment-free classifiers are shown to outperform profile HMMs as well as Karchin et al.'s SVM-Fisher [40,41] (PK Strope and EN Moriyama, submitted). Another multivariate method, partial least squares (PLS) regression, was used by Lapinsh et al. [42] with physico-chemical properties of amino acids. We recently re-evaluated the descriptors used with PLS and optimized them to discriminate 7TMRs from other proteins [43].

We applied these methods against the entire predicted protein sequence set derived from the A. thaliana genome. As
shown in Table 1, among the 28,952 protein sequences, the Sequence Alignment and Modeling system (SAM), a profile HMM method, predicted only 16 (excluding one alternatively spliced gene sequence) as 7TMpR candidates. Fifteen of them are identified as MLO or similar to MLO and one as GCR1 in The Arabidopsis Information Resource (TAIR) [44,45]. This clearly shows that SAM is highly specific (discriminating) with no false positives, assuming that current annotations are correct. SAM failed to identify only one known MLO (MLO4: At1g11000). This protein, as well as AtRGS1 and five recently predicted 7TM proteins (HHP1-5), were among the 16 previously predicted Arabidopsis 7TMpRs not included in the randomly sampled 500 7TMR training sequences (see Materials and methods). Thus, we concluded that the predictive power of SAM alone is insufficient to identify highly diverged and potentially novel 7TMpR sequences.

The results obtained by SAM were compared with those obtained by alignment-free methods. As shown in Table 1, alignment-free methods (LDA, QDA, LOG, KNN, SVM with amino acid composition (SVM-AA), SVM with dipeptide composition (SVM-di), and PLS with amino acid properties (PLS-ACC)) predicted 2,000 to 3,400 proteins as 7TMpR candidates, which is about 10% of the entire predicted Arabidopsis proteome and about 30% to 50% of all possible transmembrane proteins (6,475 proteins) [24,25]. These alignment-free methods clearly call many false positives, and need further optimization to improve their discrimination power.

One advantage of alignment-free methods to be noted is their sensitivity against short or partial sequences [37,38]. Many of the 28,952 protein sequences used in this study are based only on ab initio gene prediction results, and hence are likely to contain various types of errors. If only a part of a 7TMR protein is predicted correctly, alignment-free methods could have a better chance to identify it.

Table 1 lists Arabidopsis proteins that were predicted to have five to ten transmembrane regions and bins them by the number of transmembrane regions. HMMTOP 2.0 [46,47] predicted 201 proteins as having 7TM regions. This number is close to a previous prediction (184 proteins) [24,25]. We should note, however, that no single method predicts 7TM regions from all known 7TMRs exactly (see Materials and methods). As mentioned above, it is also possible that some deduced Arabidopsis proteins we analyzed do not contain the entire correct coding region. There were 952 Arabidopsis proteins predicted to have five to nine TM regions. Based on the distribution of predicted TM numbers obtained from the entire GPCRDB entries, this range (5 to 9 TM regions) could cover almost all of the 7TMR candidates (99.1%; see Figure 1 and Materials and methods). The 22 previously predicted Arabidopsis 7TMpRs were predicted to have seven to ten TM regions (Figure 1). If we extend the range to 5 to 10 TM regions, the number of Arabidopsis 7TMpR candidates becomes 1,179 proteins.

Table 1
Numbers of 7TMpR candidates identified by various methods from the A. thaliana genome

Methods           Number of 7TMpR candidates*
HMMTOP
  7 TMs           236 (201)
  6-8 TMs         633 (545)
  5-9 TMs         1,091 (957)
  5-10 TMs        1,343 (1,179)
SAM               16 (15)
LDA               3,211 (2,935)
QDA               2,006 (1,820)
LOG               2,626 (2,394)
KNN (K = 5)       3,125 (2,839)
KNN (K = 10)      3,202 (2,906)
KNN (K = 15)      3,298 (3,004)
KNN (K = 20)      3,347 (3,043)
SVM-AA            2,263 (2,043)
SVM-di            2,004 (1,807)
PLS-ACC           2,671 (2,466)

*The numbers in parentheses show 7TMpR candidates after removing proteins derived from alternative splicing. The numbers of TM regions were predicted by HMMTOP.

Choosing 7TMpR candidates by combining prediction results
Among the ten alignment-free classifiers, LOG misclassified seven previously predicted Arabidopsis 7TMpRs. KNN with K set at 5, 10, and 15 missed one, while KNN with K set at 20 classified them all correctly (see Materials and methods on KNN). To reduce the number of false positives (non-7TMRs predicted as 7TMRs) as well as false negatives (7TMRs predicted as non-7TMRs) and to obtain a set of 7TMpR candidates with higher confidence, we examined combinations of the prediction results by the remaining six alignment-free methods (LDA, QDA, KNN with K = 20, SVM-AA, SVM-di, and PLS-ACC). There were 652 proteins predicted as 7TMpR candidates by all six methods (by choosing the strict intersection). Restricting the number of predicted TM regions to 5 to 10, 394 (342 after removing duplicated entries due to alternative splicing) proteins were identified as 7TMR candidates. These Arabidopsis proteins are listed in Additional data file 1. Of the 22 previously predicted 7TMpRs, 20 were found in this list. Although HHP4 and HHP5 were not included in this list, both were identified by two of the alignment-free methods: KNN and SVM-AA. Note that RGS1 and five HHP (as well as nine MLO and GCR1) sequences were excluded from the training set, and these six were not identified as candidate 7TMpRs by SAM.
[Figure 1 plot: bar chart; x-axis, number of TMs (0 to 10); y-axis, counts; black bars, HMMTOP; gray bars, TMHMM. Proportions shown at the top: 97.6 (97.1) and 99.8 (99.1).]

Figure 1
Distribution of transmembrane numbers predicted by HMMTOP (black bars) and TMHMM (gray bars) from the 500 7TMR sample sequences. Proportions of the proteins predicted to have six to eight and five to nine TM regions by HMMTOP are shown at the top. The percentages shown in parentheses were obtained from the entire 7,674 7TMR dataset in GPCRDB. The numbers shown on the top of black bars are the number of previously predicted 22 Arabidopsis 7TMpR proteins.

A further restriction to protein topology of exactly 7TM regions and an amino-terminus located extracellularly reduced the candidate number to 64 (54 excluding duplications due to alternative splicing). This set included nine of the 22 previously predicted 7TMpRs. These 54 7TMpR candidates are the first targets for our further analysis and are summarized in Table 2 (also listed in Additional data file 2). Eighteen are described as simply 'expressed proteins' in the TAIR database (except for AT3G26090, which encodes RGS1). Interestingly, one of them (AT5G27210) is known to have weak similarity to a mouse orphan 7TMR. While others are known to belong to certain protein families (for example, MtN3 family), in many cases, their molecular functions have not been identified, and further investigation on these 7TMpR candidates is warranted.

The 54 proteins were grouped into families based on similarities to known protein sequences. Eight of the 54 7TMpR candidates, including GCR1 and RGS1, are encoded by single copy genes. In addition to the seven MLO proteins identified, there are eight MtN3 family members, two proteins of an unnamed family consisting of six expressed proteins, as well as multiple (two to three) members from smaller gene families (five or fewer members). All members of the TOM3 family and the Perl1-like family, as well as the majority of the GNS1/SUR4 family and an unnamed family consisting of five expressed proteins (expressed protein family 2) were included in the list. The identification of multiple members from these gene families using our alignment-free methods supported the consistency of this approach. However, for most of these families, not all members were found. Additionally, eight single representatives of small protein families consisting of two to five members and four single representatives of large protein families were found in the list. Some of these proteins, especially those from large protein families, may represent false positives as 7TMpR candidates. This 7TMR mining method can be refined, for example, by re-training models as well as using more flexible hierarchical classification.

The five predicted heptahelical proteins (HHP1-5) reported by Hsieh and Goodman [13] were identified by sequence similarity to human adiponectin receptors (AdipoRs) and membrane progestin receptors (mPRs) that share little sequence similarity to known GPCRs. HHP1-3 were identified in our initial list of 394 but were culled from the final list of 54 Arabidopsis 7TMpR candidates. This is because HMMTOP predicted HHP1, HHP2, HHP4, and HHP5 to have seven TM regions and intracellular amino termini, in contrast to known GPCRs. This unusual structural topology was also found in AdipoRs [13,48]. HHP3 had eight predicted TM regions. Of the 15 MLO proteins, 8 were also predicted to have 8 to 10 TM regions by HMMTOP (Figure 1). Recently, Benton et al. [49] experimentally showed that Drosophila odorant receptors, another extremely diverged 7TMR family, have intracellular amino termini. In our list of 394 candidates, 23 proteins were predicted to have seven TM regions and intracellular amino termini (Additional data file 1). Therefore, we consider these 54 as a minimum working set of 7TMpR candidates, and many of the other proteins included in the list of 394 should be examined in the second stage.

Expression patterns of genes encoding the 7TMpR candidates and G-protein subunits
We utilized the Meta-Analyzer server of the Genevestigator web site to study spatial expression patterns of Arabidopsis genes encoding the 7TMpR candidates and G-protein subunits. Note that the expression of MLO genes was not included in this analysis since we reported it recently [50]. As is shown in Figure 2, expression patterns of analyzed 7TMpR candidates can be divided into two major groups; about half of them show distinct tissue specificity, whereas the other half either exhibit less distinct expression patterns or display ubiquitous expression. All genes encoding G-protein subunits fall into the latter major group. Ubiquitous expression of genes encoding G-protein subunits allows overlap with genes in both groups, and makes, in principle, co-functioning of G-proteins with these 7TMpR candidates spatially and temporally possible. All eight genes encoding the MtN3 family proteins appear to have distinct tissue-specific expression. Among them, At3g48740 and At4g25010 have the highest sequence similarities to At5g23660 and At5g50800, respectively. Both pairs of genes share similar or overlapping expression patterns, suggesting relatedness/
Table 2
Summary of the 54 7TMpR candidates identified in this study

Groups*                                        TAIR locus IDs
Multiple members from gene families
  Nodulin MtN3 family proteins (8/17)          At1g21460, At3g16690, At3g28007, At3g48740, At4g25010, At5g13170, At5g23660, At5g50800
  MLO proteins (7/15)                          At1g11000 (MLO4), At1g26700 (MLO14), At1g42560 (MLO9), At2g33670 (MLO5), At2g44110 (MLO15), At4g24250 (MLO13), At5g53760 (MLO11)
  Expressed protein family 1 (2/6)             At1g77220, At4g21570
  GNS1/SUR4 membrane family proteins (3/4)     At1g75000, At3g06470, At4g36830
  Perl1-like family protein (2/2)              At1g16560, At5g62130
  TOM3 family proteins (3/3)                   At1g14530, At2g02180, At4g21790
  Expressed protein family 2 (3/5)             At1g10660, At2g47115, At5g62960
  Expressed protein family 3 (2/4)             At3g09570, At5g42090
  Expressed protein family 4 (2/5)             At1g49470, At5g19870
  Expressed protein family 5 (2/5)             At3g63310, At4g02690
Single copy genes (8)                          At1g48270 (GCR1), At1g57680, At2g41610, At2g31440, At3g04970, At3g26090 (RGS1), At3g59090, At4g20310
Single member from small gene families (8)     At2g01070, At3g19260, At2g35710, At2g16970, At1g15620, At1g63110, At4g36850, At5g27210
Single member from big gene families (4)       At1g71960, At3g01550, At5g23990, At5g37310

*The number of candidates identified in this study belonging to each group is shown in parentheses (the number of all proteins in each group is given after the slash). More detailed information is given in Additional data file 2.
similarity of their functions. Confirming the actual functions of the 7TMpR candidates as GPCRs requires further extensive testing. A possible involvement of these candidate proteins in 'G protein-independent' signaling mechanisms also needs to be explored.

Conclusion
We show that the profile HMM protein classification method, currently one of the most used, is overly specific (conservative) when applied to extremely diverged 7TMpR proteins. Our premise is that there are more 7TMpRs yet to be identified in the A. thaliana and other genomes divergent to humans. The limitations were that the lack of available samples limits the effectiveness of profile HMM methods, and that, while alignment-free methods are more sensitive, they have high false positive rates. The candidate 7TMpR proteins provided in this study, for example, can be included to expand the training set, and re-iteration using refined training sets can be done to reduce false positive rates. However, this is possible only after these new candidates are confirmed as true positives experimentally.

The strategy we described here overcomes the 'chicken-or-egg' problem; predictions by multiple protein classification methods and the number of predicted transmembrane regions were used to identify a more likely reduced set of 7TMR candidates. By setting up various methods as hierarchical multiple filters, one can prioritize target protein sets for further experimental confirmation of their functions.

Materials and methods
Arabidopsis protein data
We downloaded 28,952 protein sequences from TIGR (Arabidopsis thaliana database release 5, dated 10 June 2004) [51]. Among the 28,952 proteins, 2,760 are derived from alternative splicing.

Training data preparation for protein classification
Positive training samples (known 7TMR sequences) were obtained from GPCRDB (Information System for G Protein-Coupled Receptors, Release 9.0, last updated on 28 June 2005) [6,7]. In the GPCRDB, 2,030 7TMRs (originally collected from the Swiss-Prot protein database) were grouped into six major classes (classes A to E plus the Frizzled/Smoothened family) and six putative families (ocular albinism proteins, insect odorant receptors, plant MLO receptors, nematode chemoreceptors, vomeronasal receptors, and taste receptors). Five hundred 7TMR sequences were randomly sampled and used as the positive samples. Note that 'putative/unclassified' (orphan) 7TMRs and bacteriorhodopsins were not included in this dataset. These 500 7TMRs included six of the 15 known Arabidopsis MLO proteins. Among the 22 currently known Arabidopsis 7TMpRs, in addition to the nine MLO proteins, GCR1 as well as six recently identified Arabidopsis 7TMpRs (AtRGS1 and HHP1-5; GPCRDB does not list these proteins) were not included in the random 500 7TMR samples. Note that the 15 Arabidopsis 7TMpRs not included in the training set can be used to assess the classifier performance as test cases.

For negative samples, 500 non-7TMR sequences longer than 100 amino acids were randomly sampled from the Swiss-Prot
[Figure 2 plot (expression patterns): heat map with tissue and developmental-stage columns and gene-locus rows (the first two legible rows are At1g14530 and At1g10660). Color scale in 5% steps from 0% to >= 95%. Column labels are not recoverable from the extracted text.]