
A Preliminary Study on the Origin and Evolution of Chalcone Synthase (CHS) Gene in Angiosperms


By using the Thermal Asymmetric Interlaced PCR (TAIL-PCR) method, a DNA fragment of about 1 000 bp was amplified and cloned from a liverwort species (Lunularia cruciata (L.) Dum. ex Lindb). The nucleotide sequence of this fragment and its deduced amino acid sequence shared about 56% and 60% identity, respectively, with those of exon 2 of CHS genes from vascular plants. The four characteristic catalytic sites of CHS were conserved in the deduced amino acid sequence of the fragment when compared with other CHS sequences. This is the first report of cloning a CHS-like gene from liverworts, suggesting that the origin of CHS genes may predate liverworts. Using the CHS-like sequence from L. cruciata and CHS sequences from two fern-ally species, Psilotum nudum (L.) Griseb. and Equisetum arvense L., as outgroups, phylogenetic trees of about 250 CHSs from 29 families of angiosperm plants were constructed by using the neighbour-joining (NJ), maximum parsimony (MP) and quartet puzzle (QP) methods. The results showed that the CHSs from most plant families were separated into two or more clades, while sequences from the families Brassicaceae, Fabaceae and Poaceae were each grouped into an independent monophyletic clade. The relative base substitution rates were estimated for CHS genes in three plant families, Solanaceae, Convolvulaceae, and Asteraceae, and rate heterogeneity was detected both within and among the families. The results indicated that CHS genes in angiosperm plants are highly diverse in terms of copy number, base substitution rate, and duplication/deletion events, which might be correlated with the diversity of life history, habitat, floral characters, and defense systems of angiosperm plants.


Received 2 Jul. 2003 Accepted 15 Sep. 2003
Supported by the National Natural Science Foundation of China (39830020).
* Author for correspondence. Tel: +86 (0)10 62751847; E-mail: .
http://www.chineseplantscience.com
HUANG Jin-Xia, QU Li-Jia, YANG Ji, YIN Hao, GU Hong-Ya*
(College of Life Sciences, Peking University, Beijing 100871, China)
Key words: Lunularia cruciata ; chalcone synthase; phylogeny; substitution rate
Acta Botanica Sinica 2004, 46 (1): 10-19

Chalcone synthase (CHS), a key enzyme in the biosynthetic pathway of flavonoids, is found only in plants. It catalyzes the stepwise condensation of three acetate residues from malonyl-CoA with p-coumaroyl-CoA to yield the intermediate naringenin chalcone. In plants, flavonoids play important roles in many physiological processes, such as flower pigmentation, protection against UV damage and pathogens, and formation of root nodules in leguminous plants (Koes et al., 1994). Since the first CHS cDNA was cloned in 1983 (Reimold et al., 1983), the CHS gene has become an attractive model for studying the regulation of gene expression and the evolution of gene families (Ursula et al., 1987; Mo et al., 1992; Dong et al., 2001; Koch et al., 2001; Lukacin et al., 2001; Jez et al., 2002; Yang et al., 2002).

The CHS genes studied so far contain one intron and two exons, with the only exception being one gene from Antirrhinum majus L. that contains two introns (Sommer and Saedler, 1986). The intron splits a cysteine codon whose position is conserved in all the CHS genes analyzed. The first exon (exon 1) encodes about sixty amino acid residues, whereas the second exon (exon 2) encodes about 340 amino acid residues. Exon 2 is more conserved than exon 1 in terms of length and nucleotide sequence. The four residues acting as the chemically active sites are also located in exon 2 and are conserved in all known CHS enzymes (CHS2A from Medicago sativa L. as reference sequence, Ferrer et al., 1999). The high sequence similarity and conserved gene structure suggest that CHS genes may originate from a common ancestor.

Since flavonoids have been found in mosses and vascular plants, it is speculated that the gene(s) coding for CHS or CHS-like enzyme(s) should be present in the genomes of mosses and higher plants (Swain, 1986; Stafford, 1991). Up to now many CHS genes have been cloned from gymnosperm and angiosperm plants. However, no such gene has been cloned from mosses. The most primitive plant from which CHS genes have been reported is a fern-ally species, Psilotum nudum (Yamazaki et al., 2001).

At least two genes coding for CHS are found in most angiosperm species, whereas in some species of the families Solanaceae and Fabaceae more than eight CHS genes are detected. The expression pattern of CHS genes has been studied extensively (Ryder et al., 1987; Koes et al., 1989; Howles et al., 1995; Clegg et al., 1997; Ito et al., 1997). Different CHS genes have been found to have different expression profiles: some are expressed in roots, some in leaves, some in flowers or even in different parts of flowers, and some are wound- or UV-inducible, implying that different CHS genes may have diverged functionally. However, the evolution of CHS genes in different angiosperm families has not yet been studied extensively.

In order to trace the “history” of the CHS gene, it is necessary to examine whether a CHS or CHS-like gene exists in non-vascular plants and to study the evolutionary trends of this gene in angiosperms. In this study, a liverwort species, Lunularia cruciata, was selected for detecting the CHS gene in its genome. The relationships among the CHS genes of angiosperms were studied based on the phylogenetic tree. The evolutionary pattern of CHS genes in angiosperm families is also discussed in terms of base substitution rates.
1 Materials and Methods
1.1 Materials
1.1.1 Plant materials Fresh plants of Lunularia cruciata (L.) Dum. ex Lindb. were collected from the greenhouse of Peking University.
1.1.2 Sources of CHS sequences from fern-ally and angiosperm plants The CHS sequences used in this study were collected from the EMBL database and from Wang et al. (2000). Only CHS genes were selected. The genes coding for stilbene synthase (STS), 2-pyrone synthase, acridone synthase, valerophenone synthase, and other members of the CHS superfamily were excluded from this study.
1.2 Methods
1.2.1 DNA isolation and gene cloning from L. cruciata Total DNA was isolated from fresh plants of L. cruciata by using the modified CTAB method (Gu et al., 1995). A pair of degenerate primers was designed based on the conserved region of CHSs: 5′-AT(T/C) AC(T/C) CA(C/T) (G/C)TN (G/A/C)T(A/C/T) TTC TGC AC(A/T/C) AC-3′ and 5′-AG(G/A) ATN GC(A/C/G) GGN CC(A/T) CCN GG(G/A) TG-3′. The PCR was performed in a 50 μL reaction mixture containing 100 ng of total DNA of L. cruciata as template, 25 pmol of each primer, 2.5 units of Taq DNA polymerase, and 0.25 mmol/L each of dATP, dCTP, dGTP and dTTP. The template DNA was denatured at 94 ℃ for 5 min prior to amplification. PCR was performed in a Peltier Thermal Gradient Cycler programmed for 35 cycles of 94 ℃ for 50 s, 51-60 ℃ for 1 min, and 72 ℃ for 1 min, followed by 72 ℃ for 10 min. A 417-bp fragment was amplified, cloned and sequenced. The deduced amino acid sequence of this fragment shared about 70% similarity with those of other CHSs. Based on this sequence, three specific primers for TAIL-PCR were designed at each of its 5′- and 3′-ends: 5′-end (P5-1: 5′-CG AAG CAT CCT TGC TGG TAG AG-3′, P5-2: 5′-GGT AGA GCA TGG TGC GGT TCA CA-3′ and P5-3: 5′-TTC ACT CCA CTG GTG GTG CAG AA-3′); 3′-end (P3-1: 5′-CAT CTT CGG TGA TGG AGC CTC AGT C-3′, P3-2: 5′-TGA TGG AGC CTC AGT CCT CGT CAT T-3′ and P3-3: 5′-GCT ATC GAA GGA CGC CTG ACT GAA G-3′). The arbitrary degenerate (AD) primers were AD1-1 (5′-NTC GA(G/C) T(A/T)T (G/C)G(A/T) GTT-3′) and AD1-2 (5′-GT CGA (G/C)(A/T)G ANA (A/T)GA A-3′). TAIL-PCR amplification was performed according to the protocol described by Liu and Huang (1998), and its strategy is illustrated in Fig.1. After DNA fragments upstream of the 5′-end and downstream of the 3′-end of the 417-bp fragment had been obtained and sequenced, a pair of primers equivalent to those used previously (Wang et al., 2000), DQ5 (5′-CCC TCC CTT GAC GTT CGA CAG GAC-3′) and DQ3 (5′-CTA TTC GTT CTC GAT CAG ATG CGG-3′), was designed to ensure cloning of exon 2 of the CHS or CHS-like gene. PCR amplifications were carried out with the same reaction parameters used to amplify the 417-bp fragment.

Fig.1. The strategy of TAIL-PCR amplification of the CHS-like sequences from Lunularia cruciata. TAIL-PCR comprised three steps for cloning the fragments flanking the 5′-end and 3′-end of the 417-bp fragment. At the 5′-end, the first amplification reaction was carried out with the genomic DNA of L. cruciata as the template and P5-1 and AD1-1 as primers; the second with the product of the first reaction as the template and P5-2 and AD1-1 as primers; the third with the product of the second reaction as the template and P5-3 and AD1-1 as primers. At the 3′-end, the reactions were the same as those at the 5′-end, except that primers P3-1 and AD1-2 were used in the first reaction, P3-2 and AD1-2 in the second, and P3-3 and AD1-2 in the third. The specific PCR products of the second and third reactions were cloned and sequenced.
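The degenerate primers in section 1.2.1 are written with parenthesised alternatives such as (T/C) and with N for any base. As an illustration only, the following Python sketch parses that notation and counts how many distinct oligonucleotides a degenerate primer represents. The function names are hypothetical and are not part of the original protocol.

```python
from itertools import product
import re

ANY_BASE = "ACGT"  # assumption: N stands for any of the four bases


def parse_degenerate(primer):
    """Return one string of allowed bases per primer position."""
    primer = primer.replace(" ", "")
    positions = []
    # A token is either a parenthesised list such as (T/C) or a single letter.
    for token in re.findall(r"\([A-Z/]+\)|[A-Z]", primer):
        if token.startswith("("):
            positions.append("".join(token[1:-1].split("/")))
        elif token == "N":
            positions.append(ANY_BASE)
        else:
            positions.append(token)
    return positions


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for bases in parse_degenerate(primer):
        count *= len(bases)
    return count


def expand(primer):
    """List every concrete sequence (only sensible for small degeneracy)."""
    return ["".join(combo) for combo in product(*parse_degenerate(primer))]


# Forward degenerate primer quoted in section 1.2.1 (5' to 3').
forward = "AT(T/C) AC(T/C) CA(C/T) (G/C)TN (G/A/C)T(A/C/T) TTC TGC AC(A/T/C) AC"
print(len(parse_degenerate(forward)), degeneracy(forward))
```

For the forward primer this reports a length of 26 positions and a degeneracy of 1728, which is simply the product of the number of alternatives at each position.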
PCR products were purified from the low-melting-point
agarose gel and cloned into a pGEM T-Easy vector
(Promega, Wisconsin). Plasmid DNA was purified using
Wizard Plus SV Minipreps DNA Purification System
(Promega, Wisconsin) and sequenced on an ABI 377 auto-
mated DNA sequencer using the Dye Terminator Cycle
Sequencing kit (PE Applied Biosystems, USA).
1.2.2 Data analysis Sequences were aligned with CLUSTAL W (Thompson et al., 1994) and then adjusted manually. To test for possible differences in the relative base substitution rate (abbreviated as “rate” in the following text), the programs RRTree (Robinson et al., 1998) and K2Wuli (Jermiin, 1996) were used to compare the rates within and between plant families. Because genes of one plant family that clustered in different lineages of the tree constructed in this study may have significantly different rates, rate differences were tested both between lineages of the same family and between different families. The neighbor-joining (NJ) method (Saitou and Nei, 1987), as implemented in MEGA2.0 (Kumar et al., 2001), was used for phylogenetic analysis with the Kimura 2-parameter model. The robustness of the tree topology was assessed by bootstrap analysis with 1 000 resampling replicates. The maximum parsimony (MP) and Quartet Puzzle (QP) methods in PAUP 4.0b1 (Swofford, 1998) were also used for phylogenetic analysis with the default settings. A heuristic search with three options (MULPARS, 100 replications of random addition, and TBR branch swapping) was performed to search for the most parsimonious trees. To obtain a support estimate for each node, a bootstrap analysis (1 000 replications, heuristic search, TBR branch swapping, and simple addition of sequences) was also performed.
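The NJ analysis relies on pairwise distances under the Kimura 2-parameter model, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion. The Python sketch below is a minimal illustration of that formula for two aligned sequences. It is not the MEGA2.0 implementation, and skipping gapped or ambiguous sites is an assumption made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

With one transition in ten sites, for example, P = 0.1 and Q = 0, so the distance is -1/2 ln(0.8), slightly larger than the uncorrected proportion of 0.1.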
2 Results
2.1 Cloning the exon 2 of a CHS-like gene from Lunularia
cruciata
Exon 2 of CHS or CHS-like genes has been amplified from many plant species ranging from ferns to angiosperms with the primers reported by Wang et al. (2000). However, no DNA fragment was amplified from L. cruciata genomic DNA with the same pair of primers. Thus, a new pair of primers was designed in this study based on a more conserved region of CHS genes. A 417-bp fragment was obtained, and its deduced amino acid sequence shared at least 72% identity with those of CHSs in angiosperms.
Fragments of 401 bp and 396 bp were obtained by TAIL-PCR from the regions 5′-upstream and 3′-downstream of the 417-bp fragment, respectively. They overlapped the 5′- and 3′-ends of the 417-bp fragment by 179 bp and 56 bp, respectively, giving a total length of 962 bp. A stop codon was found at the 3′-end of this sequence, while the 5′-end was still in exon 2 of the CHS gene but upstream of the position defined by the 5′ primer previously reported (Wang et al., 2000). A single DNA fragment of about 800 bp was amplified with primers DQ5 and DQ3 using the total DNA of L. cruciata as template. Sequencing confirmed that it coded for exon 2 of a CHS-like gene, designated LCCHS-like. Sequence comparison showed more than 56% nucleotide sequence identity and more than 60% deduced amino acid sequence identity between LCCHS-like and exon 2 of CHSs of other plants. Notably, all catalytic sites in LCCHS-like were the same as in other CHSs (MCHS2A as reference sequence) (Fig.2).
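Percent identity values such as those quoted above depend on how the aligned positions are counted. As a hedged illustration, the Python sketch below computes identity over the positions where neither sequence has a gap. This denominator is an assumption, since the text does not state which convention was used, and the toy sequences are invented for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences): 4 matches over 5 ungapped columns.
print(percent_identity("ACGT-A", "ACCTTA"))
```

The same function works for aligned amino acid sequences, because it only compares characters column by column.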
2.2 Phylogenetic analysis of CHSs in angiosperms
The phylogenetic trees of LCCHS-like, two CHSs of fern-ally species and about 253 CHSs of angiosperms were constructed by using the NJ, MP and QP methods. The NJ tree is shown in Fig.3, modified so that the CHSs from the same family are represented by a single branch if they clustered together in the original tree as a monophyletic clade; e.g. the branch of the Brassicaceae represents 54 CHS sequences that were grouped into a monophyletic clade in the original tree. The first basal group of all the trees was L. cruciata, and the second and third basal groups were the fern-ally species. Generally, the topology of the NJ tree was more similar to that of the QP tree than to that of the MP tree. At certain positions where the NJ and QP trees disagreed, the QP tree usually had a pattern similar to that of the MP tree. For example, the basal groups in angiosperms were the Nymphaeaceae and Fabaceae in the QP and MP trees, while in the NJ tree the basal group was the Caryophyllaceae, which clustered with the monocot families in the QP and MP trees.
In the NJ tree, CHSs from angiosperm plants were clustered into two major clades (Fig.3). Clade Ⅰ contained all of the angiosperm families in this study except the Caryophyllaceae, while clade Ⅱ contained only six families, i.e. Asteraceae, Cannabidaceae, Convolvulaceae, Nymphaeaceae, Orchidaceae, and Solanaceae. The two clades did not correspond to the traditional groups of monocots and dicots. The monocots did not form a monophyletic group, and some CHSs from the Alliaceae and Orchidaceae clustered with those from dicots.

Fig.2. Sequence of exon 2 of the CHS-like gene from Lunularia cruciata and its deduced amino acid sequence. The four catalytic sites are underlined.

Fig.3. Phylogenetic tree constructed by using the NJ method. The number following a family name identifies a lineage within that family, and the number in parentheses indicates the number of sequences in a particular branch. The Roman numerals indicate the clades. Each color represents one of the four patterns of grouping.

Four patterns of grouping for CHSs were found in the 29 plant families of angiosperms. The first and simplest one was found in the families Brassicaceae, Fabaceae, and Poaceae. At least 16 CHS genes were available in each of these families, and all of them were grouped into a monophyletic clade on a family basis. These three families were in clade Ⅰ. The second one was found in nine families: Apiaceae, Juglandaceae, Lamiaceae, Liliaceae, Magnoliaceae, Rosaceae, Rutaceae, Theaceae, and Vitaceae. Each of these families had about three to six CHS genes available, but these genes were not grouped on a family basis, and each family appeared at least twice in the tree. They also belonged to clade Ⅰ. The third one was found in six families: Asteraceae, Cannabidaceae, Convolvulaceae, Nymphaeaceae, Orchidaceae, and Solanaceae. The number of CHS genes available in these families varied from two (the Nymphaeaceae) to 34 (the Convolvulaceae). The common feature of these families was that CHS genes were found in both clades Ⅰ and Ⅱ, and the lineages of some families had long branches, such as Asteraceae, Convolvulaceae, Nymphaeaceae, Orchidaceae, and Solanaceae. The fourth one was found in the remaining 11 families, most of which had only one sequence available. These CHS genes were dispersed among different branches of clade Ⅰ.
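The grouping patterns above turn on whether all CHS sequences of a family form a single monophyletic clade. The Python sketch below shows one way such a check could be expressed for a rooted tree written as nested tuples. The tree, the leaf labels of the form Family_number, and the function names are all hypothetical and do not reproduce the trees of this study.

```python
def leaves(tree):
    """Return the leaf labels of a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree:
        result.extend(leaves(child))
    return result


def family_of(label):
    """Assumption: leaf labels look like 'Family_number'."""
    return label.split("_")[0]


def is_monophyletic(tree, family):
    """True if one clade contains every leaf of `family` and no other leaf."""
    target = {leaf for leaf in leaves(tree) if family_of(leaf) == family}
    if not target:
        raise ValueError("family not present in tree")

    def search(node):
        if set(leaves(node)) == target:
            return True
        if isinstance(node, str):
            return False
        # Only descend into children that still contain the whole family.
        return any(search(child) for child in node
                   if target <= set(leaves(child)))

    return search(tree)


# Hypothetical rooted toy tree, with the outgroup as the first branch.
toy = ("Lunularia_1",
       (("Brassicaceae_1", "Brassicaceae_2"),
        (("Solanaceae_1", "Asteraceae_1"),
         ("Solanaceae_2", "Convolvulaceae_1"))))
print(is_monophyletic(toy, "Brassicaceae"), is_monophyletic(toy, "Solanaceae"))
```

In this toy tree the Brassicaceae sequences form one clade, whereas the Solanaceae sequences are split between two lineages, which mirrors the contrast between the first pattern and the others.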
2.3 Rate tests of CHS genes
In clade Ⅱ, CHSs from some families such as Asteraceae, Solanaceae and Convolvulaceae were divided into two to three lineages, and one of the lineages had a long branch in the NJ tree. This indicated that CHS genes in different lineages might have evolved at different rates.
First, the rates among the lineages within the same family were calculated (Table 1). The two lineages in the Asteraceae, Asteraceae 1 and Asteraceae 2, had significantly different rates (2 faster than 1). The rates between Solanaceae 1 and Solanaceae 2 or between Solanaceae 3 and Solanaceae 4 were not significantly different, but the rates between Solanaceae 1/2 and Solanaceae 3/4 were significantly different (3/4 faster than 1/2). Genes in Convolvulaceae 2 evolved faster than those in Convolvulaceae 1 and Convolvulaceae 3.
Second, with reference to the NJ tree, the rates among those three families in clade Ⅰ and clade Ⅱ were calculated. Clade Ⅰ contained two lineages from the Solanaceae, one from the Convolvulaceae and one from the Asteraceae. The rates among these four lineages are shown in Table 1. The rate differences among these lineages were not significant, except for the difference between Convolvulaceae 1 and Asteraceae 1. In clade Ⅱ, Convolvulaceae 2 evolved significantly faster than the other lineages in the Solanaceae and Asteraceae. The results clearly show that the rates of CHS genes are heterogeneous both within and among certain families.

Table 1 Relative base substitution rate tests of CHS genes
Lineage 1/Lineage 2                    dKa ± SD                  P
Within family
Asteraceae 1/Asteraceae 2              -0.065 225 ± 0.019 8      0.001 012**
Solanaceae 1/Solanaceae 2              0.004 333 ± 0.014 206     0.760 389
Solanaceae 3/Solanaceae 4              -0.041 60 ± 0.028 252     0.140 961
Solanaceae 1/Solanaceae 3              -0.074 38 ± 0.015 485     0.000 002**
Solanaceae 1/Solanaceae 4              -0.115 97 ± 0.026 326     0.000 012**
Solanaceae 2/Solanaceae 3              -0.078 71 ± 0.020 048     0.000 088**
Solanaceae 2/Solanaceae 4              -0.120 31 ± 0.027 667     0.000 015**
Convolvulaceae 1/Convolvulaceae 3      0.043 357 ± 0.034 030     0.202 648
Convolvulaceae 1/Convolvulaceae 2      -0.152 045 ± 0.047 37     0.001 333**
Convolvulaceae 2/Convolvulaceae 3      0.195 402 3 ± 0.047 73    0.000 044**
In clade Ⅰ
Convolvulaceae 1/Solanaceae 1          0.014 091 ± 0.033 139     0.670 683
Convolvulaceae 1/Solanaceae 2          0.033 191 ± 0.036 961     0.369 192
Convolvulaceae 1/Asteraceae 1          0.049 232 ± 0.022 365     0.027 731*
Solanaceae 1/Asteraceae 1              0.028 961 ± 0.021 186     0.171 590
Solanaceae 2/Asteraceae 1              0.027 272 ± 0.023 725     0.250 304
In clade Ⅱ
Convolvulaceae 2/Solanaceae 3          0.112 872 ± 0.031 382     0.000 326**
Convolvulaceae 2/Solanaceae 4          0.270 373 ± 0.022 591     0.000 184**
Convolvulaceae 2/Asteraceae 2          0.101 151 ± 0.029 912     0.000 725**
Convolvulaceae 3/Solanaceae 3          -0.035 34 ± 0.031 802     0.266 463
Convolvulaceae 3/Solanaceae 4          0.016 291 ± 0.021 671     0.452 142
Convolvulaceae 3/Asteraceae 2          -0.027 69 ± 0.036 091     0.325 756
Solanaceae 3/Asteraceae 2              0.022 760 ± 0.036 662     0.534 708
Solanaceae 4/Asteraceae 2              -0.030 78 ± 0.028 029     0.272 175
Reference sequence: AB030004 from Equisetum arvense; dKa, difference between the two groups compared in the number of nonsynonymous substitutions per nonsynonymous site; SD, standard deviation; +, the group on the left side of the pairwise comparison has the faster rate; -, the group on the right side of the pairwise comparison has the faster rate of substitution; P, exact probability; *, P<0.05; **, P<0.01.

3 Discussion
3.1 The origin of CHS genes
It is postulated that the structural genes encoding the enzymes of secondary metabolism have been derived from genes encoding enzymes of primary metabolism (Koes et al., 1994). The condensation of p-coumaroyl-CoA with malonyl-CoA, which is catalyzed by CHS, is similar to the condensation reactions in fatty acid biosynthesis. Therefore, the condensing enzyme of fatty acid biosynthesis (Fab) from Escherichia coli and CHS are thought to originate from a common ancestor (Verwoert et al., 1992). Recently, the RppA gene was reported and is thought to code for a CHS-related synthase. The RppA gene from the Gram-positive, soil-living filamentous bacterium Streptomyces griseus encodes a 372-aa protein that shows functional similarity to CHS, i.e. RppA selects malonyl-CoA as the starter, carries out four successive extensions and releases the resulting pentaketide, which cyclizes to 1,3,6,8-tetrahydroxynaphthalene (THN) (Funa et al., 1999; 2002).
Although Fab and RppA have been postulated to have the same ancestor as CHS, they share less than 30% amino acid sequence similarity with CHSs, suggesting that they have diverged greatly from CHSs. It is important to obtain CHS or CHS-like genes from primitive plants in order to find the “missing link” in the evolutionary history of CHS. Based on the deduced amino acid sequence, LCCHS-like is most likely to catalyze a reaction that is the same as or similar to that catalyzed by CHS, because all four characteristic catalytic sites found in CHSs are well conserved in LCCHS-like. Although no CHS gene has yet been found in mosses, it is reasonable to predict that, because a CHS-like gene has been found in a liverwort, CHS or CHS-like genes should exist in mosses. Furthermore, the fact that LCCHS-like has relatively high sequence similarity to the CHSs of vascular plants suggests that there might be more “primitive” CHS or CHS-like genes in algae. Further work on cloning the complete CHS-like genes from L. cruciata and from mosses is needed to compare them with those of vascular plants and to gain insight into the origin of CHS genes.
3.2 Gene tree versus plant family tree
Because the sequences used to construct the phylogenetic trees are only about 876 bp long but come from a wide range of angiosperm species, the distance-based method may be more suitable than the maximum parsimony method, which relies only on informative sites. Since it is difficult to distinguish orthologous genes from paralogous genes in this data set, and also difficult to obtain all members of the CHS gene family from every plant family, the phylogenetic tree is only a rough estimate of the relationships among the available CHSs and cannot be used as a phylogenetic tree for the plant species or families.
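To make the remark about informative sites concrete: under the usual definition, an alignment column is parsimony-informative only if at least two character states each occur in at least two sequences. The Python sketch below applies that definition to a toy alignment. The function names and the treatment of gaps and ambiguity codes are assumptions made for illustration, not the settings used in PAUP.

```python
from collections import Counter


def is_parsimony_informative(column, ignore="-N?"):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(c for c in column.upper() if c not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_sites(alignment):
    """Return the 0-based indices of parsimony-informative columns."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    return [i for i in range(length)
            if is_parsimony_informative("".join(seq[i] for seq in alignment))]


# Hypothetical toy alignment of four sequences.
toy_alignment = ["ACGTA",
                 "ACGAA",
                 "ATGAC",
                 "ATGTC"]
print(informative_sites(toy_alignment))
```

Constant columns and columns in which a variant appears in only one sequence are not counted, which is why a short alignment across distant taxa can leave parsimony with relatively few usable sites.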
3.3 Evolution of CHS gene family
Although the evolution of the individual genes used in phylogeny reconstruction is generally not well understood, which might have a negative impact on phylogenetic analysis, the phylogenetic results can provide a framework for insight into the evolution of genes or gene families. In previous studies on the evolution or phylogeny of CHS, sampling was limited to certain families or genera (Koes et al., 1989; Durbin et al., 1995; Clegg et al., 1997; Koch et al., 2000; Yang et al., 2002). In this study, all the available angiosperm sequences were used in order to draw an overall picture of CHS evolution.
In general, gene duplication is considered to be a major mechanism for evolutionary innovation and functional divergence (Ohta, 1993; Force et al., 1999). Duplication of CHS genes has been detected in most plant species studied so far, including a primitive vascular plant, Psilotum nudum. The fact that the two CHS genes from P. nudum clustered together in the phylogenetic trees indicates that the duplication event giving rise to these two genes occurred after the divergence of the ancestor of angiosperms from ferns. More data from fern species are needed to draw a general conclusion on the evolution of CHS genes in this primitive vascular plant group.
In angiosperms, the number of CHS genes varies greatly among families, and the phylogenetic tree reveals a complicated evolutionary history of CHS genes. The four grouping patterns may represent two evolutionary trends of CHS genes. The first trend is that the CHS genes in a family, such as Fabaceae, Poaceae or Brassicaceae, maintain a close relationship with a homogeneous base substitution rate, which is reflected by a monophyletic clade in the NJ tree for each plant family. In the families Fabaceae and Poaceae, it appears that all the CHS genes are descendants of duplications that occurred after the divergence of the families, while in the family Brassicaceae the duplication of CHS genes appears somehow to be “inhibited”, so that most of the species in this family have only one CHS gene. The second trend is that the CHS genes in each family are clustered into more than one lineage together with other families, i.e. they do not form a monophyletic clade. One explanation for the second trend is that the duplication events of these genes occurred before the differentiation of those families, as was also found for the phytochrome (PHY) gene of angiosperms (Donoghue and Mathews, 1998; Mathews and Donoghue, 1999). Taking into consideration the heterogeneous substitution rates within some plant families, it could be predicted that duplication might also have occurred after the family divergence and that some of the gene members evolved faster than the others. Therefore, some lineages in the phylogenetic tree may not reflect the true phylogeny but may be a result of parallel or convergent evolution. In this study, the CHS genes in 15 families follow the second trend. These results clearly indicate that the current phylogenetic trees of the CHS gene do not reflect the phylogeny of the angiosperm families, but can be used to analyze the evolutionary pattern of the CHS gene family.
The analysis of the substitution rate of gene sequences provides a tool to measure the degree of differentiation following gene duplication. If one of the duplicated genes is to evolve new functions, it must diverge fast enough to escape from the homogenizing effects of gene conversion or recombination. In this study, the CHS genes within or among some families were divergent in terms of the relative base substitution rate and some genes evolved faster than the others, which was consistent with their functional divergence. For example, in the family Asteraceae the two CHS genes in lineage Asteraceae 2 had faster rates than other genes in the family, and it was reported that the nonsynonymous-synonymous substitution rate ratio for the gene ancestral to those two genes was higher than for the other lineages (Yang et al., 2002). The two genes showed functional divergence similar to that of the gene GCHS2, a CHS-like gene in the Asteraceae with different substrate specificity and a truncated catalytic profile (Helariutta et al., 1996). Similarly, CHS genes in the Convolvulaceae, especially in the lineage Convolvulaceae 2, appeared to be among the most rapidly evolving CHS genes; and in the Solanaceae, Solanaceae 3 also evolved rapidly and had diverged greatly from the rest of the CHS sequences from the Solanaceae. Therefore, the CHS genes in the Solanaceae and Convolvulaceae may also have diverged functionally. It is possible that the different evolutionary rates of CHS genes are correlated with the differentiated functions of the genes. For example, Convolvulaceae 2 contained exclusively the CHS-A and CHS-B genes of Ipomoea purpurea (L.) Roth, which were postulated to have diverged in function from the CHS-C and CHS-D genes, which encode enzymes with typical CHS activities (Durbin et al., 1995; 2000; 2001; Clegg et al., 1997). It is most likely that the CHS genes in the lineages Asteraceae 2, Convolvulaceae 2 and Solanaceae 3 of clade Ⅱ have diverged from the rest of the CHS gene lineages in both sequence and function.
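The rate comparisons discussed here are summarized in Table 1 as a difference dKa with its standard deviation and a P value. The Python sketch below is not the RRTree program. It only illustrates, under the assumption that P is a two-sided normal tail probability for z = dKa/SD, how such a value can be obtained. That assumption closely reproduces several rows of Table 1 (for example Solanaceae 1/Solanaceae 2), but it is not confirmed by the text and does not match every row.

```python
import math


def two_sided_p(d_ka, sd):
    """Two-sided normal tail probability for z = d_ka / sd (assumed test)."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = d_ka / sd
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def faster_group(d_ka):
    """Sign convention of Table 1: + means left group faster, - means right."""
    if d_ka > 0:
        return "left"
    if d_ka < 0:
        return "right"
    return "neither"


# Values taken from Table 1 (Solanaceae 1/Solanaceae 2).
print(round(two_sided_p(0.004333, 0.014206), 3), faster_group(0.004333))
```

A small dKa relative to its standard deviation, as in this row, gives a large P value and therefore no evidence for a rate difference between the two lineages.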
Although only a few plant families had a sufficient number of CHS genes available for statistical analyses in this study, almost every family has a unique evolutionary pattern. The overall picture of CHS evolution in angiosperms, if there is one, is that the gene number varies greatly among plant families, duplication/deletion of the gene occurs repeatedly, and some genes in certain families, such as Asteraceae, Solanaceae and Convolvulaceae, may have evolved new functions independently.
Angiosperms are the most diverse group of vascular plants in terms of habitat, life history, floral structure and coloring, defense system, interaction with microorganisms, and so on. Gene duplication may provide new genetic material or more sophisticated regulation of gene expression that allows individuals to adapt to the environment. More data on the functions of different CHS genes are needed to elucidate the significance of the diversity of this gene in angiosperms.
Acknowledgements: We gratefully acknowledge WANG
Mei-Zhi (Institute of Botany, The Chinese Academy of
Sciences) for identifying the liverwort plant and Dr. REN
Bo for providing the AD primers.
References:
Clegg M T, Cumming M P, Durbin M L. 1997. The evolution of
plant nuclear genes. Proc Natl Acad Sci USA, 94:7791-7798.
Dong X, Braun E L, Grotewold E. 2001. Functional conservation
of plant secondary metabolic enzymes revealed by comple-
mentation of Arabidopsis flavonoid mutants with maize genes.
Plant Physiol, 127:46-57.
Donoghue M J, Mathews S. 1998. Duplicate genes and the root
of angiosperms, with an example using phytochrome
sequences. Mol Phylogenet Evol, 9:489-500.
Durbin M L, Learn G H, Huttley G A, Clegg M T. 1995. Evolu-
tion of the chalcone synthase gene family in the genus Ipomoea.
Proc Natl Acad Sci USA, 92:3338-3342.
Durbin M L, McCaig B, Clegg M T. 2000. Molecular evolution
of the chalcone synthase multigene family in the morning glory
genome. Plant Mol Biol, 42:79-92.
Durbin M L, Denton A L, Clegg M T. 2001. Dynamics of mobile
element activity in chalcone synthase loci in the common
morning glory (Ipomoea purpurea). Proc Natl Acad Sci USA,
98:5084-5089.
Ferrer J L, Jez J M, Bowman M E, Dixon R A, Noel J P. 1999.
Structure of chalcone synthase and the molecular basis of
plant polyketide biosynthesis. Nat Struct Biol, 6:775-784.
Force A, Lynch M, Pickett F B, Amores A, Yan Y L, Postlethwait
J. 1999. Preservation of duplicate genes by complementary,
degenerative mutations. Genetics, 151:1531-1545.
Funa N, Ohnishi Y, Fujii I, Shibuya M, Ebizuka Y, Horinouchi S.
1999. A new pathway for polyketide synthesis in
microorganisms. Nature, 400:897-899.
Funa N, Ohnishi Y, Ebizuka Y, Horinouchi S. 2002. Properties
and substrate specificity of RppA, a chalcone synthase-re-
lated polyketide synthase in Streptomyces griseus. J Biol
Chem, 277:4628-4635.
Gu H-Y, Qu L-J, Ming X-T, Pan N-S, Chen Z-L. 1995. Plant
Genes and Molecular Manipulations. Beijing: Peking Univer-
sity Press. (in Chinese)
Helariutta Y, Kotilainen M, Elomaa P, Kalkkinen N, Bremer K,
Teeri T H, Albert V A. 1996. Duplication and functional di-
vergence in the chalcone synthase gene family of Asteraceae:
evolution with substrate change and catalytic simplification.
Proc Natl Acad Sci USA, 93:9033-9038.
Howles P A, Arioli T, Weinman J J. 1995. Nucleotide sequence
of additional members of the gene family encoding chalcone
synthase in Trifolium subterraneum. Plant Physiol, 107:1035-
1036.
Ito M, Ichinose Y, Kato H, Shiraishi T, Yamada T. 1997. Molecu-
lar evolution and functional relevance of the chalcone syn-
thase genes of pea. Mol Gen Genet, 255:28-37.
Jermiin L S. 1996. K2Wuli Version 1.0. Australia: Australian Na-
tional University Press.
Jez J M, Bowman M E, Noel J P. 2002. Expanding the biosyn-
thetic repertoire of plant type Ⅲ polyketide synthases by
altering starter molecule specificity. Proc Natl Acad Sci USA,
99:5319-5324.
Koch M A, Haubold B, Mitchell-Olds T. 2000. Comparative
evolutionary analysis of chalcone synthase and alcohol dehy-
drogenase loci in Arabidopsis, Arabis and related genera
(Brassicaceae). Mol Biol Evol, 17:1483-1498.
Koch M, Haubold B, Mitchell-Olds T. 2001. Molecular system-
atics of the Brassicaceae: evidence from coding plastidic matK
and nuclear Chs sequences. Am J Bot, 88:534-544.
Koes R E, Spelt C E, van den Elzen P J, Mol J N. 1989. Cloning
and molecular characterization of the chalcone synthase
multigene family of Petunia hybrida. Gene, 81:245-257.
Koes R E, Quattrocchio F, Mol J N. 1994. The flavonoid biosyn-
thetic pathway in plants: function and evolution. BioEssays,
16:123-132.
Kumar S, Tamura K, Jakobsen I B, Nei M. 2001. MEGA: molecular
evolutionary genetics analysis software. Version 2.1.
Arizona State University, Tempe, Arizona, USA.
Liu Y G, Huang N. 1998. Efficient amplification of insert end
sequences from bacterial artificial chromosome clones by ther-
mal asymmetric interlaced PCR. Plant Mol Biol Rep, 16:175-
181.
Lukacin R, Schreiner S, Matern U. 2001. Transformation of
acridone synthase to chalcone synthase. FEBS Lett, 508:413-
417.
Mathews S, Donoghue M J. 1999. The root of angiosperm phy-
logeny inferred from duplicate phytochrome genes. Science,
286:947-949.
Mo Y, Nagel C, Taylor L P. 1992. Biochemical complementation
of chalcone synthase mutants defines a role for flavonols in
functional pollen. Proc Natl Acad Sci USA, 89:7213-7217.
Ohta T. 1993. Pattern of nucleotide substitution in growth hor-
mone-prolactin gene family: a paradigm for evolution by gene
duplication. Genetics, 134:1271-1276.
Reimold U, Kroeger M, Kreuzaler F, Hahlbrock K. 1983. Coding
and 3′ non-coding nucleotide sequence of chalcone synthase
mRNA and assignment of amino acid sequence of the enzyme.
EMBO J, 2:1801-1805.
Robinson M, Gouy M, Gautier C, Mouchiroud D. 1998. Sensi-
tivity of the relative-rate test to taxonomic sampling. Mol Biol
Evol, 15:1091-1098.
Ryder T B, Hedrick S A, Bell J N, Liang X W, Clouse S D, Lamb
C J. 1987. Organization and differential activation of a gene
family encoding the plant defense enzyme chalcone synthase
in Phaseolus vulgaris. Mol Gen Genet, 210:219-233.
Saitou N, Nei M. 1987. The neighbor-joining method: a new
method for reconstructing phylogenetic trees. Mol Biol Evol,
4:406-425.
Sommer H, Saedler H. 1986. Structure of the chalcone synthase
gene of Antirrhinum majus. Mol Gen Genet, 202:429-434.
Stafford H A. 1991. Flavonoid evolution: an enzymic approach.
Plant Physiol, 96:680-685.
Swain T. 1986. The evolution of flavonoids. In: Cody V,
Middleton E Jr, Harborne J B, eds. Plant Flavonoids in Biology
and Medicine: Biochemical, Pharmacological, and Structure-Activity
Relationships. New York: Alan R. Liss, Inc. 1-14.
Swofford D L. 1998. PAUP 4.0 Beta Version: Phylogenetic Analysis
Using Parsimony. Sinauer Associates, Sunderland, MA,
USA.
Thompson J D, Higgins D G, Gibson T J. 1994. CLUSTAL W:
improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Res, 22:
4673-4680.
Ursula N K, Barzen E, Bernhardt J, Rohde W, Schwarz-Sommer
Z, Reif H J, Wienand U, Saedler H. 1987. Chalcone synthase
genes in plants: a tool to study evolutionary relationship. J
Mol Evol, 26:213-225.
Verwoert I I, Verbree E C, van der Linden K H, Nijkamp H J J,
Stuitje A R. 1992. Cloning, nucleotide sequence, and expression of the
Escherichia coli fabD gene, encoding malonyl coenzyme A-
acyl carrier protein transacylase. J Bacteriol, 174:2851-2857.
Wang J L, Qu L J, Chen J, Gu H, Chen Z L. 2000. Molecular
evolution of the exon 2 of CHS genes and the possibility of its
application to plant phylogenetic analysis. Chin Sci Bull, 45:
1735-1742.
Yamazaki Y, Suh D Y, Sitthithaworn W, Ishiguro K, Kobayashi
Y, Shibuya M, Ebizuka Y, Sankawa U. 2001. Diverse chal-
cone synthase superfamily enzymes from the most primitive
vascular plant, Psilotum nudum. Planta, 214:75-84.
Yang J, Huang J X, Gu H, Zhong Y, Yang Z H. 2002. Duplication
and adaptive evolution of the chalcone synthase genes of
Dendranthema (Asteraceae). Mol Biol Evol, 19:1752-1759.
(Managing editor: ZHAO Li-Hui)