Full text: Acta Genetica Sinica (遗传学报), May 2005, 32(5): 519~527. ISSN 0379-4172
Received: 2005-03-10; revised: 2005-04-04
Funding: Supported by the National Natural Science Foundation of China (No. 30225021), and by the "Hundred Talents Program" and a major innovation project of the Chinese Academy of Sciences
* These authors contributed equally as first authors
About the author: TIAN Chao-Guang (born 1973), male, PhD candidate
① Corresponding author. E-mail: mschen@genetics.ac.cn; Tel: 010-64837087
Evidence for an Ancient Whole-Genome Duplication Event in Rice and Other Cereals
TIAN Chao-Guang1,2,*, XIONG Yu-Qing1,2,*, LIU Tie-Yan1,2, SUN Shou-Hong1, CHEN Liang-Biao1, CHEN Ming-Sheng1,①
(1. Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China;
2. Graduate School, Chinese Academy of Sciences, Beijing 100039, China)
Abstract: Gene duplication has been proposed as an accelerator of evolution. Ancient genome duplication events have been identified in diverse organisms, such as yeast, vertebrates, and Arabidopsis. Here, we have identified a whole-genome duplication (WGD) event in the rice genome, which took place prior to the divergence of grasses about 70 million years ago (mya). A total of 117 duplicated blocks were detected, which are distributed on all 12 chromosomes and cover about 60% of the rice genome. About 20% of the genes on these duplicated segments are retained as duplicate pairs. In contrast, 60% of the transcription factor genes are retained as duplicates. The identification of a WGD in the ancestral grass genome will impact the study of grass genome evolution, and suggests that polyploidization and subsequent gene losses and chromosomal rearrangements have played an important role in the diversification of grasses.
Key words: genome duplication; genome evolution; polyploidization
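Evidence for a Polyploid Origin of the Genomes of Rice and Other Grasses
田朝光1,2,*, 熊煜青1,2,*, 刘铁燕1,2, 孙守红1, 陈良标1, 陈明生1,①
(1. Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101;
2. Graduate School of the Chinese Academy of Sciences, Beijing 100039)
Abstract: Gene duplication is regarded as an accelerator of evolution. Ancient genome duplication events have been established in many species, including yeast, vertebrates, and Arabidopsis. This study finds that the rice genome likewise underwent a whole-genome duplication event, which probably occurred before the divergence of the cereal crops, about 70 million years ago. A total of 117 duplicated blocks were found in the rice genome; they are distributed over all 12 rice chromosomes and cover about 60% of the genome. Within the duplicated blocks, about 20% of the genes have retained their sister copies as duplicated pairs. In sharp contrast, 60% of the transcription factor genes in the duplicated blocks have retained their sister genes. Establishing a whole-genome duplication event in the grasses has important implications for the study of grass genome evolution, and suggests that polyploidization followed by gene loss and chromosomal rearrangement played an important role in the divergence of cereal species.
Key words: genome duplication; genome evolution; polyploidization
CLC number: Q751    Document code: A    Article ID: 0379-4172(2005)05-0519-09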
Gene duplication has been proposed as the major force of evolution, as duplicated genes can supply the genetic raw material for the creation of novel functions through mutation and natural selection[1]. Whole-genome duplication (WGD) is particularly intriguing, as it is usually followed by diploidization with extensive loss of duplicated genes and genomic rearrangements[2~4]. There is evidence for segmental/genome-wide duplications in the vertebrates[5], including human[6,7], mouse[8] and fish[9]. The yeast (Saccharomyces cerevisiae) was also shown to be an ancient tetraploid[10~14]. In plants, polyploidy is ubiquitous[15], and recent studies based on complete genome sequence information have also hypothesized that the small crucifer Arabidopsis (Arabidopsis thaliana) is an ancient polyploid[3,16~20].
The grass family Poaceae is one of the largest families of angiosperms, including approximately 10 000 species that diverged from a common ancestor 50 to 70 mya[21]. It has been proposed that most grass species are of polyploid origin[22,23]. For example, bread wheat (Triticum aestivum) is an allohexaploid and has conserved homoeologous chromosomes[24,25]. Maize (Zea mays), however, represents an older tetraploidization event, and lost its homoeologous chromosomes by reassembling homoeologous segments on newly formed chromosomes[26]. In contrast to maize and wheat, rice (Oryza sativa) and sorghum (Sorghum bicolor) are regarded as diploids, although tetraploid versions exist in which homoeologous chromosomes are preserved as in wheat. Homoeologous chromosomal regions derived from ancestral chromosomes of the progenitors of the grass family have been demonstrated using genetically mapped markers. In such a comparison, orthologous markers have shown that while maize falls into two subgenomes, rice and sorghum are represented by a single subgenome[27]. Segmental duplications in the rice genome were reported previously[28,29], but analysis of whole-genome sequence data has provided unprecedented opportunities for identifying ancient genome duplication events, similar to those in yeast and Arabidopsis[3]. After the draft sequencing of the rice genome, genome-wide duplications were also proposed for the rice genome, but the nature and origin of these genomic duplications remained controversial[30~33]. Over 2 000 duplicated cDNAs were plotted by chromosomal region, and the extent of genome duplication was defined[30]. However, most of the duplicated segments identified were small, except for a large segment shared by chromosomes 11 and 12. Dating on the basis of the amino acid substitution rate suggested that a WGD occurred in rice some 40 to 50 mya[30]. However, another study claims that rice is an ancient aneuploid that experienced the duplication of one chromosome, or a large part of one, about 70 mya, predating the divergence of most cereals[31]. By using transcription factor genes as anchor points, we have identified genome-wide duplications in the rice genome (Xiong et al., submitted). In this manuscript, we have used the map-based complete sequence of the rice genome to provide evidence that a WGD took place in the ancestor of rice about 70 mya, predating the divergence of grasses, suggesting that gene duplications, insertions, and deletions have occurred over a long period of time in the evolution of the grasses.
1 Material and Methods
1.1 Dataset
A total of 56 056 predicted rice genes (both nucleotide and amino acid sequence data) were retrieved from the TIGR database (http://www.tigr.org). Retrotransposon-like sequences were removed by searching with BLASTP[34] against known retrotransposon sequences. The remaining dataset includes 42 688 gene models. The Arabidopsis data were downloaded from the TAIR database (http://www.arabidopsis.org). The moss EST data were downloaded from the NIBB PHYSCObase (http://moss.nibb.ac.jp/)[35]. The data for maize, wheat, barley, and sorghum were selected from the SWISS-PROT protein database (http://www.ebi.ac.uk/swissprot/) and the National Center for Biotechnology Information (NCBI) UniGene collection (http://www.ncbi.nlm.nih.gov/Genbank/index.html).
1.2 Detection of duplicated blocks in the rice genome
The detection of duplicated blocks was initially performed using transcription factor genes of rice (Xiong et al., submitted). We have expanded those duplicated regions in three steps. First, we identified all gene pairs in each pair of sister regions. We used the BLASTP program to search all the predicted genes on each pair of duplicated chromosomes or chromosome segments against each other. An e-value less than 1e-10, an overlapping region of more than 150 amino acids, and an identity of no less than 30% were used to identify the reciprocal best hits[31]. Second, the anchor points were identified based on the order and distance of each pair of genes. Small inversions were allowed in our analysis. The intervening region between two neighboring anchors contains no more than 30 genes on both strands[17]. Third, duplicated blocks were identified based on the number and order of anchors. The blocks identified contain at least three anchors in an appropriate order and orientation.
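The second and third steps can be pictured with a small sketch. This is only an illustration under stated assumptions, not the procedure actually used: each anchor is encoded as a pair of gene indices (position in gene order on the two sister regions), the names chain_anchors, MAX_GAP and MIN_ANCHORS are hypothetical, and orientation checks are left out. Taking the absolute difference on the second region is a crude way of tolerating locally inverted order.

```python
from typing import List, Tuple

MAX_GAP = 30      # at most 30 intervening genes between neighbouring anchors (from the text)
MIN_ANCHORS = 3   # a block must contain at least three anchors (from the text)

# Hypothetical encoding: (gene index on region A, gene index on region B)
Anchor = Tuple[int, int]


def chain_anchors(anchors: List[Anchor]) -> List[List[Anchor]]:
    """Greedily group anchors into candidate duplicated blocks."""
    blocks: List[List[Anchor]] = []
    current: List[Anchor] = []
    for anchor in sorted(anchors):  # walk along region A
        if current:
            prev = current[-1]
            gap_a = anchor[0] - prev[0] - 1
            gap_b = abs(anchor[1] - prev[1]) - 1  # abs() tolerates reversed order
            if gap_a > MAX_GAP or gap_b > MAX_GAP:
                # The chain is broken: keep it only if it has enough anchors.
                if len(current) >= MIN_ANCHORS:
                    blocks.append(current)
                current = []
        current.append(anchor)
    if len(current) >= MIN_ANCHORS:
        blocks.append(current)
    return blocks


example = [(1, 5), (4, 9), (10, 20), (100, 200), (101, 203)]
blocks = chain_anchors(example)
```

In this toy input the first three anchors form one block, while the last two are too far from the others and too few to form a block of their own.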
1.3 Phylogenetic analysis of duplicated genes
The homologs in maize, sorghum, barley, wheat, Arabidopsis or moss were detected using the BLASTP program. An e-value less than 1e-10, an overlapping region of more than 150 amino acids, and an identity of no less than 30% were used to identify the reciprocal best hits. The phylogenetic tree was constructed with the neighbor-joining method using the ClustalW program for alignment[36]. Each group contains four proteins: the two members of a rice duplicate pair, the best homolog from the organism under comparison, and the best homolog from the outgroup. Rooted trees with less than 70% bootstrap support were excluded from further analysis.
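The reciprocal-best-hit filter described above can be sketched as follows. This is a minimal sketch, assuming that BLASTP results have already been parsed into records; the Hit class and its field names are hypothetical, and ranking the "best" hit by bit score is an assumption, since the text does not say how hits were ranked. Only the three thresholds come from the text.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class Hit:
    """One parsed BLASTP hit (hypothetical record layout)."""
    query: str
    subject: str
    evalue: float
    identity: float   # percent identity
    aln_len: int      # length of the overlapping region, in amino acids
    bitscore: float


def passes(hit: Hit) -> bool:
    # Thresholds from the text: e-value < 1e-10, overlap > 150 aa, identity >= 30%.
    return hit.evalue < 1e-10 and hit.aln_len > 150 and hit.identity >= 30.0


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its best-scoring subject among hits that pass the filter."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.query == hit.subject or not passes(hit):
            continue
        # Assumption: "best" means highest bit score.
        if hit.query not in best or hit.bitscore > best[hit.query].bitscore:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit], reverse: Iterable[Hit]) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return {(q, s) for q, s in fwd.items() if rev.get(s) == q}


rice_vs_maize = [
    Hit("r1", "m1", 1e-50, 60.0, 300, 500.0),
    Hit("r1", "m2", 1e-20, 40.0, 200, 150.0),
    Hit("r2", "m2", 1e-5, 80.0, 400, 90.0),   # fails the e-value cutoff
]
maize_vs_rice = [
    Hit("m1", "r1", 1e-48, 60.0, 300, 495.0),
    Hit("m2", "r1", 1e-20, 40.0, 200, 150.0),
]
pairs = reciprocal_best_hits(rice_vs_maize, maize_vs_rice)
```

In the toy data only r1 and m1 are each other's best hit, so they form the single reciprocal pair.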
1.4 Asymmetric divergence of duplicate genes
A total of 608 pairs of rice duplicate genes, each with a maize homolog, were examined for asymmetric divergence. The phylogenetic tree was constructed with the neighbor-joining method using the ClustalW program for alignment[36]. The evolutionary rate of duplicate genes was derived from the tree branch lengths[11]. If one duplicate evolved 50% faster than the other, asymmetric evolution was inferred for the pair[11].
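The 50% criterion can be written down as a simple comparison of two branch lengths. A minimal sketch, assuming the branch lengths of the two rice duplicates have already been read off the tree; the function name is_asymmetric is hypothetical, and reading "50% faster" as "at least 1.5 times the slower branch" is an assumption, as is the handling of zero-length branches.

```python
def is_asymmetric(branch_a: float, branch_b: float, threshold: float = 0.5) -> bool:
    """Return True if the longer branch exceeds the shorter by at least `threshold` (50%)."""
    if branch_a < 0 or branch_b < 0:
        raise ValueError("branch lengths must be non-negative")
    slow, fast = sorted((branch_a, branch_b))
    if slow == 0:
        # Assumption: any change against a zero-length branch counts as asymmetric.
        return fast > 0
    return fast >= (1.0 + threshold) * slow
```

For example, branch lengths of 0.30 and 0.10 would be called asymmetric under this reading, whereas 0.12 and 0.10 would not.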
1.5 Age estimation of duplicated blocks
The values of dS (synonymous substitution rate) were calculated with the PAML software[37]. The synonymous substitution rate was taken to be 6.03 × 10⁻⁹ synonymous base substitutions per site per year[38].
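For orientation, a dS value can be converted into an age with the usual molecular clock relation T = dS / (2r), where r is the rate per site per year and the factor 2 accounts for both copies accumulating substitutions since the duplication. The text does not spell out the formula that was used, so this relation is an assumption here, and the function name divergence_time_mya is hypothetical.

```python
SYN_RATE = 6.03e-9  # synonymous substitutions per site per year (value given in the text)


def divergence_time_mya(ds: float, rate: float = SYN_RATE) -> float:
    """Age in million years, assuming T = dS / (2 * rate)."""
    if ds < 0 or rate <= 0:
        raise ValueError("dS must be non-negative and the rate must be positive")
    return ds / (2.0 * rate) / 1e6


age = divergence_time_mya(0.88)
```

Under this assumption a dS of 0.88 corresponds to roughly 73 million years. Ages derived from block means will differ from estimates that rest on another summary of the dS distribution.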
2 Results and Discussion
2.1 Detection of duplicated segments
Previously, we detected 12 pairs of large intragenomic duplicated segments in the rice genome by phylogenetic analysis of transcription factor genes (Xiong et al., submitted). We have extended those studies with more stringent criteria (see Material and Methods), and identified 117 duplicated blocks, which contain 1 934 anchors and about 20 000 gene models, with at least three anchor points within each block (Fig. 1, Table 1). Except for genes on chromosomes 11-12, the average identity of duplicated genes in the rice genome is about 62%, which is similar to the 63% found in yeast[39]. However, the average identity of duplicated genes on chromosomes 11-12 is about 76%, suggesting that the segmental duplication on chromosomes 11-12 is a more recent event.
Contiguous duplicated blocks on a chromosome likely represent one large duplicated region, although they do not satisfy our criteria for a single duplicated block (see Material and Methods) (Fig. 1). Our 117 blocks could compose 12 duplicated segments, including chr1-5 (short arm and long arm), 2-4, 2-6 (short arm and long arm), 3-7 (short arm and long arm), 3-10, 3-12, 4-8, 8-9, and 11-12. The duplicated regions cover 26 000 gene models and 61.1% of the rice genome, which is similar to the 61.9% reported earlier by Paterson et al.[40].
Fig. 1 Duplicated blocks in the rice genome
Sister duplicated segments are connected with a single line. The duplicated segments that have the same orientation are marked with the same color; the duplicated segments that have different orientations are labeled with different colors. The centromeres of chromosomes are marked as black dots.
Table 1 Summary of rice duplicated blocks
              Small blocks(a)      Large blocks(a)                  Total
Chromosomes   Blocks   Anchors     Blocks   Anchors   Mean dS(b)    Blocks   Anchors
Chr1-5        7        29          14       441       0.82          21       470
Chr2-4        3        13          8        250       0.79          11       263
Chr2-6        4        17          12       268       0.84          16       285
Chr3-7        5        17          14       207       0.86          19       224
Chr3-10       8        36          10       119       0.84          18       155
Chr8-9        8        30          6        153       0.79          14       183
Chr11-12      7        21          1        245       0.25          8        266
Chr3-12       1        4           3        42        0.86          4        46
Chr4-8        3        16          3        26        0.92          6        42
Total         46       183         71       1751      0.75          117      1934
a: Small blocks refer to those blocks with 3-6 anchors, and large blocks to those with no less than seven anchors. b: The mean of dS was calculated based on large blocks only.
Seventy-one duplicated blocks contain at least 7 anchors each. In total, these duplicated blocks contain 1 751 anchors and 18 000 gene models, and cover about 42% of the entire genome. The following analyses were based mainly on these 71 duplicated blocks. In these 71 duplicated blocks, 20% of the genes are retained as duplicate pairs, compared to only 12% in yeast[11] and 28% in Arabidopsis for the recent duplication[17]. In contrast, about 60% of the transcription factor genes are retained as duplicate pairs on these blocks (Xiong et al., submitted), which could be due to the potential of regulatory genes to evolve new functions and hence be retained in the genome[41,42]. In Arabidopsis, genes involved in signal transduction and transcription were also preferentially retained after WGD and subsequent gene losses[43].
Because over 80% of the genes have been lost since the WGD event, extensive gene deletion and chromosome rearrangement must have taken place. In the yeast genome, gene loss in duplicated blocks occurred by many small deletions (the average size of a lost segment is two genes), and was typically balanced between the two sister regions (average balance of 57% to 43%)[11]. In the rice genome, for the 71 larger duplicated blocks, gene loss was also generally balanced between the two sister regions (average balance of 61% to 39%). The average size of a lost segment is five genes.
The largest duplicated block lies on chromosomes 11-12; it contains 245 anchors and spans about 4 Mb. The average size of duplicated blocks is more than 1 Mb, containing 25 anchors and about 130 genes. The distribution of duplicated blocks on the 12 chromosomes is not entirely random. For example, except for the large duplicated segments on the short arms of chromosomes 11 and 12, which have a very recent origin, very few additional duplicated segments were identified on these two chromosomes (Fig. 1). No duplicated block was identified in the centromeric regions, which is similar to the situation in Arabidopsis[17].
2.2 Dating of duplication events
Phylogenetic analysis provides an opportunity to assess the temporal order of duplication events and can be used to infer whether the duplication occurred before speciation[11,14,18,40,44]. However, a major handicap for the analysis of the rice genome is the absence of data from a descendant of a progenitor that did not undergo a WGD, as in the comparison of Kluyveromyces waltii and Saccharomyces cerevisiae[11] or Sorghum bicolor and Zea mays[26]. Therefore, we cannot establish orthology (two genes derived from the same ancestral chromosome) for all genes, and have to base our phylogenetic analysis for each pair of duplicated genes on homologs from Arabidopsis, wheat, barley, sorghum and maize. On the other hand, if a homolog from another species matches more gene copies in rice than the two aligned pairs, it is assumed that the one that has formed after speciation is a paralogous sequence (Fig. 2). The ratio of internal trees for all cereal species tested is between 52% and 57% (Table 2). However, the ratio of internal trees for Arabidopsis is 5%. We estimated the duplication time by calculating the dS values assuming a molecular clock[17,20,30,40,45]. Based on these calculations, a major duplication event occurred 73 mya, before cereal species diverged from a common ancestor but after the divergence of monocots and eudicots. The duplication on the short arms of chromosomes 11 and 12 is of very recent origin, and was estimated to have occurred 8 mya.
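The distinction between internal and external trees can be illustrated for a rooted four-protein tree. The sketch below rests on one reading of the terms: a tree is "external" when the two rice duplicates are each other's closest relatives (duplication after species divergence) and "internal" when the homolog from the other species groups with one of the rice duplicates (duplication before species divergence). The function names and the way the tree is summarized, by its sister pair among the three ingroup proteins, are hypothetical.

```python
from typing import Iterable


def classify_tree(sister_pair: Iterable[str], rice_a: str, rice_b: str) -> str:
    """Classify a rooted four-protein tree from the sister pair of its ingroup."""
    pair = frozenset(sister_pair)
    rice = frozenset((rice_a, rice_b))
    if len(pair) != 2 or not (pair & rice):
        raise ValueError("sister_pair must hold two leaves, at least one a rice duplicate")
    # Rice duplicates as sisters: the homolog lies outside the pair.
    return "external" if pair == rice else "internal"


def internal_fraction(labels: Iterable[str]) -> float:
    """Share of trees labelled 'internal'."""
    labels = list(labels)
    if not labels:
        raise ValueError("no trees given")
    return sum(1 for label in labels if label == "internal") / len(labels)


label = classify_tree(("riceA", "maizeH"), "riceA", "riceB")
```

Counting the labels over all trees built for one comparison species then gives the kind of ratio of internal trees discussed above.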
Fig. 2 Phylogenetic analysis of duplicated genes
A: Internal tree, the topology of which means that the duplication occurred before species divergence. B: External tree, the topology of which means that the duplication occurred after species divergence. C: Phylogenetic tree of cereals. A whole-genome duplication event took place right before the divergence of cereals, as indicated by the open circle. D: A pseudo-external tree of a pair of rice duplicate genes with their maize homolog. 8352.m03786 and 8351.m03346 are two rice duplicates. S11527681 is the maize homolog. At3g59350 is the outgroup from Arabidopsis. The number at the node (93) is the bootstrap value. The number above the scale line represents 0.1 substitutions per site.
Table 2 Phylogenetic dating of the genomic duplication in rice
Species                      Arabidopsis   Maize   Sorghum   Barley   Wheat
Total number of trees        359           608     252       401      516
Number of internal trees     17            319     130       212      292
Percent of internal trees    5%            53%     52%       53%      57%
Note: The Arabidopsis sequences were subjected to phylogenetic analysis using moss as the outgroup. The maize, sorghum, barley and wheat sequences were subjected to phylogenetic analysis using the Arabidopsis sequences as the outgroup.
The age estimation of duplicated blocks suggested that the whole genome duplicated before the divergence of grasses, either as a single event or as a series of massive duplications. If independent duplication events had taken place prior to the divergence of grasses, one might expect to find examples of triplications in addition to duplications. One could argue that, based on the Poisson distribution, 71 successive duplications would be expected to result in about 10 triplicated regions (duplicates of duplicates)[10]. However, we observed only one triplicated region in these 71 blocks. The expected number of triplicated regions for 117 duplicated blocks is 16; however, we observed only two. As described above, we calculated the dS values for all duplicated pairs. The distribution of these values indicates that most duplicated genes were duplicated at a similar time (Fig. 3). These results support the conclusion that a WGD occurred about 70 mya, predating the divergence of grasses.
The phylogenetic analysis of the duplicated genes could suffer from long-branch attraction[10,14,39]. Further, comparison of paralogs between two species could also shift results significantly. For instance, when we examined our data using the maize homologs as examples, we found that out of 608 maize-rice trees, 133 (22%) show asymmetric evolution for the two rice duplicates. This ratio is similar to that in yeast (21%~27%)[46] and Arabidopsis (21%)[43]. Twenty-three of these 133 trees show erroneous topology, mainly because one duplicate evolved faster than the other.
Fig. 3 Distribution of dS values for duplicated rice genes
The horizontal axis represents the values of dS. The vertical axis represents the number of duplicated pairs. The left shoulder represents the duplicated segment on the short arms of chromosomes 11 and 12.
For example, 8351.m03346, 8352.m03786 and the maize homolog form a pseudo-external tree (Fig. 2, D).
2.3 Asymmetric evolution of duplicate genes
WGD (polyploidization) has been proposed as an accelerator of evolution[22,47]. In yeast, genome duplication played a direct role in the adaptation of the S. cerevisiae lineage towards fermentation[10,14,39]. Ohnologs (paralogs derived from whole-genome duplication)[3,39] have evolved novel functions and show different spatial and temporal expression patterns or respond differently to environmental cues. About 20% of the ohnologs show asymmetric evolution, as one ohnolog evolved faster than the other[11,46]. In Arabidopsis, 57% of the recent duplicated pairs and 73% of the ancient duplicated pairs have diverged in their expression patterns[43], and 21% of the recent duplicates show asymmetric evolution[43].
In order to find out which genes have played important roles in rice genome evolution, we analyzed the 133 asymmetrically evolved duplicates derived from the phylogenetic analysis of the rice duplicates with their maize homologs. We classified these 133 duplicate pairs into different functional categories based on Gene Ontology[48] (Table 3). Genes that encode enzymes involved in metabolism form the largest group, suggesting that the modification of metabolism might be important for species divergence and adaptation to the environment. For example, the ohnologs for hexokinases that are involved in the glucose metabolic pathway show asymmetric divergence in yeast[39,49]. In rice, one duplicate pair that encodes a putative hexokinase also shows asymmetric evolution. Another group of genes that show asymmetric divergence are genes involved in regulatory networks (mainly transcription factor genes). Thirteen transcription factor genes show asymmetric evolution, and are distributed in 8 gene families, including MADS-box, MYB, bZIP, WRKY, RAV2, AP2/ERF, bHLH, and E2F/DP.
Table 3 Classification of rice duplicate genes with asymmetric evolution
Molecular function               Number   Biological process                 Number
Chaperone                        2        Cell communication                 8
Catalytic activity               82       Cell growth and/or maintenance     21
Enzyme regulator                 2        Cell cycle                         5
Binding                          64       Cell motility                      0
Nucleic acid binding             33       Metabolism                         87
Structural molecule              4        Response to stress                 5
Motor                            2        Transport                          15
Transporter                      11       Death                              1
Transcriptional regulator        7        Development                        4
Signal transducer                7        Physiological processes            101
Hypothetical/unknown protein     4        Biological process unknown         2
Unclassified                     14       Unclassified                       11
Note: The genes were annotated and classified by the GoPipe program (Chen L et al., unpublished data). The same gene can be classified into different groups, so the sum can exceed the number of duplicates (133).
To conclude, we detected a WGD predating the divergence of cereal species but postdating the monocot-eudicot divergence. The duplicated regions cover about 50%~60% of the rice genome. What about the remaining 40% of the rice genome? From the yeast study, we learned that some ohnologs have diverged beyond detection by BLAST search[39], so they cannot be easily recognized by our current search algorithm. Discovery of many of the duplicated regions will await the complete sequencing of a reference genome that diverged from the rice genome before the WGD[11,12], or of other grasses such as maize and sorghum[4,26].
References:
[1] Ohno S. Evolution by Gene Duplication. London, 1970.
[2] Seoighe C, Wolfe K H. Extent of genomic rearrangement after genome duplication in yeast. Proc Natl Acad Sci USA, 1998, 95: 4447~4452.
[3] Wolfe K H. Yesterday's polyploids and the mystery of diploidization. Nat Rev Genet, 2001, 2: 333~341.
[4] Lai J, Ma J, Swigonova Z, Ramakrishna W, Linton E, Llaca V, Tanyolac B, Park Y J, Jeong O Y, Bennetzen J L. Gene loss and movement in the maize genome. Genome Res, 2004, 14: 1924~1931.
[5] Panopoulou G, Hennig S, Groth D, Krause A, Poustka A J, Herwig R, Vingron M, Lehrach H. New evidence for genome-wide duplications at the origin of vertebrates using an amphioxus gene set and completed animal genomes. Genome Res, 2003, 13: 1056~1066.
[6] Bailey J A, Gu Z, Clark R A, Reinert K, Samonte R V, Schwartz S, Adams M D, Myers E W, Li P W, Eichler E E. Recent segmental duplications in the human genome. Science, 2002, 297: 1003~1007.
[7] McLysaght A, Hokamp K, Wolfe K H. Extensive genomic duplication during early chordate evolution. Nat Genet, 2002, 31: 200~204.
[8] Cheung J, Wilson M D, Zhang J, Khaja R, MacDonald J R, Heng H H, Koop B F, Scherer S W. Recent segmental and gene duplications in the mouse genome. Genome Biol, 2003, 4: R47.
[9] Taylor J S, Braasch I, Frickey T, Meyer A, van de Peer Y. Genome duplication, a trait shared by 22000 species of ray-finned fish. Genome Res, 2003, 13: 382~390.
[10] Wolfe K H, Shields D C. Molecular evidence for an ancient duplication of the entire yeast genome. Nature, 1997, 387: 708~713.
[11] Kellis M, Birren B W, Lander E S. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature, 2004, 428: 617~624.
[12] Dietrich F S, Voegeli S, Brachat S, Lerch A, Gates K, Steiner S, Mohr C, Pohlmann R, Luedi P, Choi S, Wing R A, Flavier A, Gaffney T D, Philippsen P. The Ashbya gossypii genome as a tool for mapping the ancient Saccharomyces cerevisiae genome. Science, 2004, 304: 304~307.
[13] Dujon B, Sherman D, Fischer G, Durrens P, Casaregola S, Lafontaine I, De Montigny J, Marck C, Neuveglise C, Talla E, Goffard N, Frangeul L, Aigle M, Anthouard V, Babour A, Barbe V, Barnay S, Blanchin S, Beckerich J M, Beyne E, Bleykasten C, Boisrame A, Boyer J, Cattolico L, Confanioleri F, De Daruvar A, Despons L, Fabre E, Fairhead C, Ferry-Dumazet H, Groppi A, Hantraye F, Hennequin C, Jauniaux N, Joyet P, Kachouri R, Kerrest A, Koszul R, Lemaire M, Lesur I, Ma L, Muller H, Nicaud J M, Nikolski M, Oztas S, Ozier-Kalogeropoulos O, Pellenz S, Potier S, Richard G F, Straub M L, Suleau A, Swennen D, Tekaia F, Wesolowski-Louvel M, Westhof E, Wirth B, Zeniou-Meyer M, Zivanovic I, Bolotin-Fukuhara M, Thierry A, Bouchier C, Caudron B, Scarpelli C, Gaillardin C, Weissenbach J, Wincker P, Souciet J L. Genome evolution in yeasts. Nature, 2004, 430: 35~44.
[14] Langkjaer R B, Cliften P F, Johnston M, Piskur J. Yeast genome duplication was followed by asynchronous differentiation of duplicated genes. Nature, 2003, 421: 848~852.
[15] Paterson A H, Bowers J E, Burow M D, Draye X, Elsik C G, Jiang C X, Katsar C S, Lan T H, Lin Y R, Ming R, Wright R J. Comparative genomics of plant chromosomes. Plant Cell, 2000, 12: 1523~1540.
[16] Blanc G, Barakat A, Guyot R, Cooke R, Delseny M. Extensive duplication and reshuffling in the Arabidopsis genome. Plant Cell, 2000, 12: 1093~1101.
[17] Blanc G, Hokamp K, Wolfe K H. A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Res, 2003, 13: 137~144.
[18] Bowers J E, Chapman B A, Rong J, Paterson A H. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature, 2003, 422: 433~438.
[19] Lynch M, Conery J S. The evolutionary fate and consequences of duplicate genes. Science, 2000, 290: 1151~1155.
[20] Vision T J, Brown D G, Tanksley S D. The origins of genomic duplications in Arabidopsis. Science, 2000, 290: 2114~2117.
[21] Kellogg E A. Evolutionary history of the grasses. Plant Physiol, 2001, 125: 1198~1205.
[22] Levy A A, Feldman M. The impact of polyploidy on grass genome evolution. Plant Physiol, 2002, 130: 1587~1593.
[23] Stebbins G. Chromosomal Evolution in Higher Plants. London: Edward Arnold Ltd, 1971.
[24] McFadden E S, Sears E R. The origin of Triticum spelta and its free-threshing hexaploid relatives. Journal of Heredity, 1946, 37: 107~116.
[25] Moore G. Cereal chromosome structure, evolution, and pairing. Annu Rev Plant Physiol Plant Mol Biol, 2000, 51: 195~222.
[26] Swigonova Z, Lai J, Ma J, Ramakrishna W, Llaca V, Bennetzen J L, Messing J. Close split of sorghum and maize genome progenitors. Genome Res, 2004, 14: 1916~1923.
[27] Gale M D, Devos K M. Plant comparative genetics after 10 years. Science, 1998, 282: 656~659.
[28] Kishimoto N, et al. Identification of the duplicated segments in rice chromosomes 1 and 5 by linkage analysis of cDNA markers of known functions. Theor Appl Genet, 1994, 88: 722~726.
[29] Nagamura Y, et al. Conservation of duplicated segments between rice chromosome 11 and chromosome 12. Breeding Sci, 1995, 45: 373~376.
[30] Goff S A, Ricke D, Lan T H, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, Hadley D, Hutchison D, Martin C, Katagiri F, Lange B M, Moughamer T, Xia Y, Budworth P, Zhong J, Miguel T, Paszkowski U, Zhang S, Colbert M, Sun W L, Chen L, Cooper B, Park S, Wood T C, Mao L, Quail P, Wing R, Dean R, Yu Y, Zharkikh A, Shen R, Sahasrabudhe S, Thomas A, Cannings R, Gutin A, Pruss D, Reid J, Tavtigian S, Mitchell J, Eldredge G, Scholl T, Miller R M, Bhatnagar S, Adey N, Rubano T, Tusneem N, Robinson R, Feldhaus J, Macalma T, Oliphant A, Briggs S. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science, 2002, 296: 92~100.
[31] Vandepoele K, Simillion C, van de Peer Y. Evidence that rice and other cereals are ancient aneuploids. Plant Cell, 2003, 15: 2192~2202.
[32] Paterson A H, Bowers J E, Chapman B A, Peterson D G, Rong J, Wicker T M. Comparative genome analysis of monocots and dicots, toward characterization of angiosperm diversity. Curr Opin Biotechnol, 2004, 15: 120~125.
[33] Paterson A H, Bowers J E, Peterson D G, Estill J C, Chapman B A. Structure and evolution of cereal genomes. Curr Opin Genet Dev, 2003, 13: 644~650.
[34] Altschul S F, Madden T L, Schaffer A A, Zhang J, Zhang Z, Miller W, Lipman D J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res, 1997, 25: 3389~3402.
[35] Nishiyama T, Fujita T, Shin I T, Seki M, Nishide H, Uchiyama I, Kamiya A, Carninci P, Hayashizaki Y, Shinozaki K, Kohara Y, Hasebe M. Comparative genomics of Physcomitrella patens gametophytic transcriptome and Arabidopsis thaliana: Implication for land plant evolution. Proc Natl Acad Sci USA, 2003, 100: 8007~8012.
[36] Thompson J D, Gibson T J, Plewniak F, Jeanmougin F, Higgins D G. The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res, 1997, 25: 4876~4882.
[37] Yang Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci, 1997, 13: 555~556.
[38] Muse S V. Examining rates and patterns of nucleotide substitution in plants. Plant Mol Biol, 2000, 42: 25~43.
[39] Wolfe K. Evolutionary genomics: yeasts accelerate beyond BLAST. Curr Biol, 2004, 14: R392~R394.
[40] Paterson A H, Bowers J E, Chapman B A. Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc Natl Acad Sci USA, 2004.
[41] Doebley J, Lukens L. Transcriptional regulators and the evolution of plant form. Plant Cell, 1998, 10: 1075~1082.
[42] Birchler J A, Newton K J. Discovering the seeds of diversity in plant genomes. Genome Biol, 2004, 5: 323.
[43] Blanc G, Wolfe K H. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell, 2004, 16: 1679~1691.
[44] Chapman B A, Bowers J E, Schulze S R, Paterson A H. A comparative phylogenetic approach for dating whole genome duplication events. Bioinformatics, 2004, 20: 180~185.
[45] Simillion C, Vandepoele K, Saeys Y, van de Peer Y. Building genomic profiles for uncovering segmental homology in the twilight zone. Genome Res, 2004, 14: 1095~1106.
[46] Conant G C, Wagner A. Asymmetric sequence divergence of duplicate genes. Genome Res, 2003, 13: 2052~2058.
[47] Piskur J, Langkjaer R B. Yeast genome sequencing: the power of comparative genomics. Mol Microbiol, 2004, 53: 381~389.
[48] Ashburner M, Ball C A, Blake J A, Botstein D, Butler H, Cherry J M, Davis A P, Dolinski K, Dwight S S, Eppig J T, Harris M A, Hill D P, Issel-Tarver L, Kasarskis A, Lewis S, Matese J C, Richardson J E, Ringwald M, Rubin G M, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 2000, 25: 25~29.
[49] Gelade R, van de Velde S, van Dijck P, Thevelein J M. Multi-level response of the yeast genome to glucose. Genome Biol, 2003, 4: 233.