
Analysis of EST-SSR Loci in Tobacco (Nicotiana tabacum L.)




Journal of Wuhan Botanical Research 2007, 25(5): 427-431
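ZHANG Jun-E, LI Fen, SUN Meng-Xiang*
(1. Key Laboratory of Ministry of Education for Plant Development Biology, College of Life Sciences, Wuhan University, Wuhan 430072, China; 2. College of Life Science, Henan Normal University, Xinxiang, Henan 453007, China)
Abstract (translated from the Chinese): Simple sequence repeats (SSRs) in the public tobacco EST database were analyzed with the MISA software. In 133 523 EST sequences, a total of 81 757 SSRs were found, with SSRs spaced about 0.92 kb apart. Hexanucleotide repeats were the most abundant, accounting for 60.3%, while mono-, tri-, tetra-, di-, and pentanucleotide repeats accounted for 20.0%, 11.0%, 4.2%, 2.8%, and 1.7%, respectively. Among mono-, di-, tri-, and tetranucleotide motifs, the most abundant were A/T, AG, AAG, and AAAT, respectively, whereas CG was rare in coding regions. Redundancy analysis with the CAP3 software showed no significant differences between the redundant and non-redundant tobacco ESTs for any of the six motif types. Ten of the SSR sequences obtained were chosen at random for primer design and PCR amplification in seven tobacco cultivars. All 10 primer pairs yielded PCR products, and 8 of them amplified fragments of the expected size. Denaturing PAGE of the products of these 8 primer pairs showed that 4 primer pairs (EB4, EB5, EB6, and EB8) amplified polymorphic bands.
CLC number: Q75; S572  Document code: A  Article ID: 1000-470X(2007)05-0427-05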
Abstract: Simple sequence repeats (SSRs) in tobacco expressed sequence tags (ESTs) from a public database were investigated using the computer program MISA (MIcroSAtellite). A total of 81 757 SSRs were found in 133 523 sequences, and the average distance between SSRs was approximately 0.92 kb. Among them, hexanucleotide repeats (60.3%) were the most abundant, while the monomeric, trimeric, tetrameric, dimeric, and pentameric repeats were represented in decreasing proportions of 20.0%, 11.0%, 4.2%, 2.8%, and 1.7%, respectively. The most abundant motifs were A/T, AG, AAG, and AAAT among monomeric, dimeric, trimeric, and tetrameric repeats, respectively, whereas CG-rich repeats were rarely found in the coding regions. The redundancy analysis indicated no significant differences between the redundant and non-redundant sets of tobacco ESTs for individual SSR motifs. Ten pairs of primers flanking EST-SSR loci were designed to detect polymorphism in seven tobacco cultivars. Analysis on denatured PAGE with silver staining confirmed polymorphism for four pairs of primers, EB4, EB5, EB6, and EB8, among the seven tobacco cultivars. The SSR loci reported in this study are the first molecular DNA-based genetic markers developed from the public EST database in tobacco. These EST-SSR markers will be a useful information resource for germplasm characterization and crop improvement through genetic mapping and marker-assisted selection in tobacco.
Key words: EST; Microsatellites; SSR; Polymorphism; Tobacco
Received: 2007-03-12; revised: 2007-05-08.
Funding: The work was financially supported by the National Natural Science Foundation of China (30370743, 90408002) and the China Postdoctoral Science Foundation (20060390831).
First author: ZHANG Jun-E (born 1972), female, from Yushe, Shanxi; Ph.D.; research field: plant developmental biology (E-mail: zhangjune2001@yahoo.com.cn).
* Author for correspondence (E-mail: mxsun@whu.edu.cn).

Microsatellites or simple sequence repeats (SSRs) are genetic loci where short sequences of nucleotides (1-6 units in length) are tandemly repeated. SSRs are multiallelic, abundantly polymorphic, highly informative, codominantly inherited, simple to manipulate, and easy to detect by PCR. SSR markers have therefore been widely used in genetic breeding and DNA diversity studies[1-4]. Traditionally, SSR development has been based on the construction of genomic libraries. Although the
method is time-consuming and labor-intensive, SSR markers have been developed in this way for a large number of species, such as maize[5] and rice[6]. Recently, an alternative method of identifying SSRs has emerged, because EST sequences of many species are accumulating rapidly in genomics databases. The method relies on computer programs and the information available from public EST databases, so the cost of obtaining SSR markers can be significantly reduced. So far, numerous SSR markers have been identified from public EST databases for use in economic plants such as potato[7], grape[8], citrus[9,10], almond[11], cucumber[12], pepper[13], wheat[14,15], and foxtail millet[16]. However, to our knowledge, no SSR analysis of tobacco ESTs has been reported.
Tobacco is an important economic crop. Up to now, SSR markers have not been widely developed for it because of the limited genetic background among its cultivars, although DNA marker-assisted selection is indispensable for analyzing genetic relationships among cultivars and for evaluating the effects of parent-origin genes on tobacco development. In this study, we used the software MISA (MIcroSAtellite identification tool) to investigate the occurrence of SSR motifs in tobacco sequences from public databases, and ten pairs of primers flanking EST-SSR loci were designed to detect polymorphism among seven tobacco cultivars, in order to provide a useful information resource for germplasm characterization and crop improvement through genetic mapping and marker-assisted selection.
1 Materials and Methods
1.1 Plant materials
Seven tobacco (Nicotiana tabacum L.) cultivars, 'Baile No.21', 'Taiyan No.7', 'KY14', 'Samsun', 'Yunyan No.87', 'Nanluodexiya', and 'SR1', were used in the present work. The plants were grown in a greenhouse at 25°C with a 16 h light period.
1.2 Search for SSR loci
The EST sequence data of tobacco, comprising 133 523 ESTs, were obtained in FASTA format from http://www.ncbi.nlm.nih.gov/ on April 27, 2007. EST sequences less than 100 bp in length were not included in our analysis. A Perl5 script (misa.pl)[17] from http://pgrc.ipk-gatersleben.de/misa/, which is able to identify perfect SSRs and remove the poly(A) tail of eukaryotic mRNA, was used to identify and localize SSR loci. In the search, SSRs were considered to contain motifs between 1 and 6 nucleotides in size. The distribution of perfect repeats of length ≥12 bp was analyzed; thus the shortest SSR counted comprises 12 monomers, six dimers, four trimers, three tetramers, three pentamers (15 bp, the shortest pentamer repeat of at least 12 bp), or two hexamers.
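The search criteria above can be illustrated with a minimal Python sketch. It is not the MISA script itself: the function names and the regular-expression approach are assumptions made for illustration, poly(A)-tail removal is not included, and only the minimum repeat numbers (12, 6, 4, 3, 3, 2 for unit sizes 1 to 6) are taken from the text.

```python
import re

# Minimum number of tandem repeats per motif size, as stated in the text.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 2}


def is_primitive(motif):
    """True if the motif is not itself a tandem repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, repeats) for perfect SSRs, 1-based inclusive."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # One motif of `size` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. "AGAG" so an (AG)n run is reported once, as a dimer.
            if is_primitive(motif):
                repeats = (m.end() - m.start()) // size
                hits.append((m.start() + 1, m.end(), motif, repeats))
    return sorted(hits)


if __name__ == "__main__":
    example = "CCC" + "AG" * 6 + "TTT" + "AAG" * 4 + "C"
    for hit in find_perfect_ssrs(example):
        print(hit)
```

The primitive-motif check is one simple way to keep a run such as (AG)6 from also being counted as a tetramer or hexamer repeat; other tools may resolve such overlaps differently.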
1.3 Redundancy analysis
The CAP3 program[18] was used to remove redundant sequences.
1.4 Primer design and polymorphism analysis
Ten sequences containing SSR loci were selected randomly for primer design. Primer pairs flanking the SSR loci were designed with the software Primer3 at http://redb.ncpgr.cn/modules/redbtools/primer3.php and synthesized for detecting polymorphism among tobacco cultivars. Genomic DNA was extracted from young leaves of the seven tobacco cultivars using a modified CTAB method[19]. PCR conditions were as follows: 20-30 ng template DNA, 0.2 μmol/L of each primer, 0.2 mmol/L of dNTPs, 10 mmol/L Tris-HCl (pH 8.3), 50 mmol/L KCl, 3 mmol/L MgCl2, and 1 unit of Taq DNA polymerase in a volume of 20 μL. The mixture was first denatured at 94°C for 5 min, then cycled 32 times at 94°C for 1 min, 54°C (60°C for BP6 and ERS) for 30 s, and 74°C for 1 min, and finally extended at 72°C for 4 min. The PCR products were detected on ethidium bromide-stained 1.5% agarose gels and then on denatured PAGE with silver staining.
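As a small worked example, the cycling program described above can be written down as data and its total hold time computed. This is only an illustrative sketch: the function names are hypothetical, the temperatures and times are copied from the text as printed, and ramping time between steps is ignored.

```python
def pcr_program(annealing_c=54.0, cycles=32):
    """Cycling program as (step, temperature in deg C, seconds, repetitions)."""
    return [
        ("initial denaturation", 94.0, 5 * 60, 1),
        ("denaturation", 94.0, 60, cycles),
        ("annealing", annealing_c, 30, cycles),
        ("extension", 74.0, 60, cycles),
        ("final extension", 72.0, 4 * 60, 1),
    ]


def total_hold_minutes(program):
    """Sum of all hold times in minutes (ramp times are not modeled)."""
    return sum(seconds * reps for _, _, seconds, reps in program) / 60.0


if __name__ == "__main__":
    # Default 54 deg C annealing; pass 60.0 for the primer pairs noted in the text.
    print(total_hold_minutes(pcr_program()))
```

With the default settings the hold times add up to 5 + 32 x 2.5 + 4 = 89 minutes; changing the annealing temperature does not change the duration.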
2 Results and Discussion
2.1 Occurrence and density of SSRs
A large set of EST data representing 75.4 Mb of tobacco sequence was obtained from the public database; the average EST length was 565 bp. The search revealed 81 757 SSRs in 55 559 sequences, of which 18 207 sequences contained more than one SSR. The average distance between SSRs was approximately 0.92 kb. Among the six types of SSR repeats, hexanucleotide repeats (653 repeat loci/Mb) were the most abundant class in the sequences, while the monomeric, trimeric, tetrameric, dimeric, and pentameric repeats were represented in decreasing order at 216, 119, 46, 31, and 18 repeat loci/Mb, respectively (Fig.1).
Fig.1 Number of SSR loci per million base pairs of sequence, from monomer to hexamer, in tobacco ESTs (x-axis: unit size). Mono-, Di-, Tri-, Tetra-, Penta-, and Hexa- denote monomeric, dimeric, trimeric, tetrameric, pentameric, and hexameric repeats, respectively
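The spacing and density figures follow from simple arithmetic on the totals reported in the text. The sketch below, with hypothetical function names, recomputes them from the rounded values (75.4 Mb, 81 757 SSRs, and the percentage share of each class); because the inputs are rounded, the recomputed densities agree with the published ones only to within about one locus per Mb.

```python
TOTAL_MB = 75.4        # total EST sequence analyzed, from the text
TOTAL_SSRS = 81757     # SSRs found, from the text

# Percentage share of each repeat class, from the text.
SHARE_PERCENT = {
    "hexa": 60.3, "mono": 20.0, "tri": 11.0,
    "tetra": 4.2, "di": 2.8, "penta": 1.7,
}


def mean_spacing_kb(total_mb, n_ssrs):
    """Average sequence length per SSR, in kb."""
    return total_mb * 1000.0 / n_ssrs


def density_per_mb(share_percent, n_ssrs=TOTAL_SSRS, total_mb=TOTAL_MB):
    """Approximate loci per Mb for a class holding share_percent of all SSRs."""
    return share_percent / 100.0 * n_ssrs / total_mb


if __name__ == "__main__":
    print(round(mean_spacing_kb(TOTAL_MB, TOTAL_SSRS), 2))
    for name, share in SHARE_PERCENT.items():
        print(name, round(density_per_mb(share), 1))
```

The same spacing function applied to the non-redundant set (40.6 Mb, 40 894 SSRs) gives about 0.99 kb.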
In terms of the proportions of the various SSRs, the hexameric repeats (60.3%) are the most abundant class of microsatellites in the whole dataset. The monomeric, trimeric, tetrameric, dimeric, and pentameric repeats are represented in decreasing proportions of 20.0%, 11.0%, 4.2%, 2.8%, and 1.7%, respectively. These findings are consistent with previous observations about differences in abundance among SSR unit size classes[20]. It was concluded that hexameric and trimeric SSRs are highly abundant in the coding-region sequences of the tobacco public database. Since trimeric repeats usually result in fewer band artifacts, their relatively high density suggests that trimeric repeats are a better source of markers than other types of repeats.
We further analyzed the distribution of every class of SSRs by repeat length. The results indicated that the frequency of every class of microsatellites decreases as the repeat length increases (data not shown), which is consistent with a previous observation[ ]. When all the different types of microsatellites were classified into three categories by length (12 bp, 12-18 bp, and ≥18 bp), 83.8% of SSRs fell into the 12 bp category. Within every class of microsatellites, the 12 bp repeat length accounts for a high percentage: 97.5%, 92.8%, 71.1%, and 42.7% of the hexameric, tetrameric, trimeric, and dimeric repeats, respectively.
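The length classification can be sketched as follows. The text gives the three categories as 12 bp, 12-18 bp, and ≥18 bp without stating how the boundaries are handled, so the bins used here (exactly 12 bp, 13-17 bp, and 18 bp or more) are an assumption, as are the function names.

```python
from collections import Counter


def length_category(length_bp):
    """Assign an SSR length to one of three bins (bin edges are an assumption)."""
    if length_bp < 12:
        raise ValueError("SSRs shorter than 12 bp were not analyzed")
    if length_bp == 12:
        return "12 bp"
    if length_bp < 18:
        return "13-17 bp"
    return ">=18 bp"


def category_shares(lengths_bp):
    """Percentage of SSRs falling into each length bin."""
    counts = Counter(length_category(n) for n in lengths_bp)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


if __name__ == "__main__":
    # Toy lengths for illustration only, not data from the study.
    print(category_shares([12, 12, 12, 15, 18, 24, 12, 12]))
```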
2.2 Distribution of SSR classes
Among mononucleotide repeats, poly A/poly T repeats were predominant, while poly C/poly G repeats were rare in tobacco (Fig.2). These findings are consistent with previous observations about differences in the abundance of monomer repeats[ ]. Among all dimeric repeat combinations, AG repeats are the most frequent, followed by AC and AT repeats. GC dimeric repeats are extremely rare in the whole genome, which might be due to the transition of methylated C residues to T[22]. Among the trimeric repeats, the AAG motif is the most common, followed by AAC, AAT, ACT, ACC, and AGC repeats, while AGT repeats are rare (Fig.2).
Fig.2 Number of SSR loci for each monomeric, dimeric and trimeric repeat type across the entire set of tobacco ESTs (x-axis: repeat types)
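Motif classes such as A/T, AG, or AAG are usually understood to pool a motif with its circular shifts and its reverse complement (for example AG, GA, CT, and TC). The text does not spell out the grouping rule, so the sketch below is an assumption about how such classes could be formed; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA motif."""
    return motif.upper().translate(_COMPLEMENT)[::-1]


def rotations(motif):
    """All circular shifts of a motif."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical_motif(motif):
    """Alphabetically smallest form among all shifts on both strands."""
    motif = motif.upper()
    return min(rotations(motif) + rotations(reverse_complement(motif)))


if __name__ == "__main__":
    for m in ("CT", "GA", "TTC", "ATTT"):
        print(m, canonical_motif(m))
```

Under this rule CT and GA both map to AG, TTC maps to AAG, and ATTT maps to AAAT, which matches the class names used in the text.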
Analysis of the density of each tetrameric repeat type revealed that AAAT, AAAG, ACCT and AAAC were the predominant types across the whole genome, followed in overall density by tetrameric repeats such as [...], ACAT, AATC, AAGG, and AATG (Fig.3). However, 92.8% of tetrameric repeats fall into the category of 12 bp repeat units. Pentameric repeats accounted for only 1.0% in the whole genome. Among hexameric repeats, 97.5% fall into the category of 12 bp repeat units.
2.3 Redundancy analysis
The observations above were based on a redundant set of tobacco sequences. To avoid overestimating specific SSR types, a redundancy analysis was performed with the CAP3 program. A set of 64 732 assembled ESTs representing 40.6 Mb was identified. From it, 40 894 non-redundant SSRs were obtained, and the average distance between non-redundant SSRs was approximately 0.99 kb. When the relative frequencies of individual SSR motifs were compared between the non-redundant and the redundant SSRs, no significant differences were found for any individual SSR motif (Fig.4). This suggests that our results can provide useful information for the development of SSR markers from the public EST database.

Fig.3 Number of SSR loci in different tetramer repeats of EST sequences in tobacco

Fig.4 Comparative analysis of redundant and non-redundant SSRs in tobacco ESTs (x-axis: type). 1: Monomers; 2: Dimers; 3: Trimers; 4: Tetramers; 5: Pentamers; 6: Hexamers
2.4 Polymorphism among tobacco cultivars
To test the efficiency of developing SSR markers directly from the public database, ten pairs of primers were designed using Primer3 (Table 1). All ten primer pairs were used to detect polymorphism in seven tobacco cultivars. The PCR results showed that all ten pairs of primers amplified PCR bands from the genomic DNA of the seven tobacco cultivars, and eight PCR products were very close to the expected size. The other two pairs of primers, BP6 (BP7dA.d27) and DV1 (DV159032), produced larger bands than expected from the EST sequences, which may be due to the existence of introns in the corresponding genomic DNA sequence between the two primer sequences. Interestingly, the PCR products of the BP6 primer pair displayed obvious polymorphism among the seven tobacco cultivars. If the expected PCR products can be produced using cDNA as template, the BP6 primer pair will be a very useful marker for identifying polymorphism among different tobacco cultivars.
The polymorphic loci were further examined on denatured PAGE with silver staining. The results showed that four pairs of primers, EB4, EB5, EB6, and EB8, revealed polymorphism (Fig.5). Functions of the ESTs associated with the polymorphic SSRs were determined by similarity search, and the EB5 and EB8 loci showed significant similarities to known genes (AAG43553 and AAG49896). These robust, informative EST-SSR markers will be particularly valuable for genetic relationship assessments and marker-assisted selection, as well as for comparative genomic studies.

Table 1 SSR primer pairs designed from tobacco EST-SSR sequences
Notes: 1. "F" stands for forward primer; "R" stands for reverse primer. 2. Accession No. of putative homology were obtained by searching the NCBI non-redundant database with BLASTX with an expected value <10^-7.
Fig.5 PCR products of EB8 on denatured PAGE by silver staining. Lanes 1-7: 1: Baile No.21; 2: Taiyan No.7; 3: KY14; 4: Samsun; 5: Yunyan No.87; 6: Nanluodexiya; 7: SR1
The SSRs from tobacco ESTs are a viable source of polymorphism. Although only ten have been tested here, their utility has been illustrated, and there is the potential to find more usable tobacco SSRs from our EST project. The SSR loci presented here are the first molecular DNA-based genetic markers developed from the public EST database in tobacco. These markers will be useful for conducting population genetic and mapping studies. The identification of larger numbers of annotated EST-SSR molecular markers from gene discovery projects will open the way for the application of these markers in a variety of molecular genetic studies in tobacco.
References:
[1] Morgante M, Olivieri A M. PCR-amplified microsatellites as markers in plant genetics[J]. Plant J, 1993, 3: 175-182.
[2] Powell W, Machray G C, Provan J. Polymorphism revealed by simple sequence repeats[J]. Trends Plant Sci, 1996, 1: 215-222.
[3] Smulders M J M, Bredemeijer G, Rus-Kortekaas W, Arens P, Vosman B. Use of short microsatellites from database sequences to generate polymorphisms among Lycopersicon esculentum cultivars and accessions of other Lycopersicon species[J]. Theor Appl Genet, 1997, 94: 264-272.
[4] He C, Poysa V, Yu K. Development and characterization of simple sequence repeat (SSR) markers and their use in determining relationships among Lycopersicon esculentum cultivars[J]. Theor Appl Genet, 2003, 106: 363-373.
[5] Yu J, Lu H, Bernardo R. Inconsistency between SSR groupings and genetic backgrounds of white corn inbreds[J]. Maydica, 2001, 46: 133-139.
[6] Temnykh S, DeClerck G, Lukashova A, Lipovich L, Cartinhour S, McCouch S. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential[J]. Genome Res, 2001, 11: 1441-1452.
[7] Milbourne D, Meyer R C, Collins A J, Ramsay L D, Gebhardt C, Waugh R. Isolation, characterisation and mapping of simple sequence repeat loci in potato[J]. Mol Gen Genet, 1998, 259: 233-245.
[8] Scott K D, Eggler P, Seaton G, Rossetto M, Ablett E M, Lee L S, Henry R J. Analysis of SSRs derived from grape ESTs[J]. Theor Appl Genet, 2000, 100: 723-726.
[9] Jiang D, Zhang G Y, Hong Q B. Analysis of microsatellites in Citrus unigenes[J]. Acta Genet Sinica, 2006, 33(4): 345-353.
[10] Chen C X, Zhou P, Choi Y A, Huang S, Gmitter F G Jr. Mining and characterizing microsatellites from Citrus ESTs[J]. Theor Appl Genet, 2006, 112(7): 1248-1257.
[11] Xie H, Sui Y, Chang F Q, Xu Y, Ma R C. SSR allelic variation in almond (Prunus dulcis Mill.)[J]. Theor Appl Genet, 2006, 112(2): 366-372.
[12] Kong Q, Xiang C, Yu Z. Development of EST-SSRs in Cucumis sativus from sequence database[J]. Mol Ecol Notes, 2006, 6(4): 1234-1236.
[13] Yi G B, Lee J M, Lee S, Choi D, Kim B D. Exploitation of pepper EST-SSRs and an SSR-based linkage map[J]. Theor Appl Genet, 2006, 114(1): 113-130.
[14] Wang H Y, Wei Y M, Yan Z H, Zheng Y L. EST-SSR DNA polymorphism in durum wheat (Triticum durum L.) collections[J]. J Appl Genet, 2007, 48(1): 35-42.
[15] Gadaleta A, Mangini G, Mulè G, Blanco A. Characterization of dinucleotide and trinucleotide EST-derived microsatellites in the wheat genome[J]. Euphytica, 2007, 153(1-2): 73-85.
[16] Jia X P, Shi Y S, Song Y C, Wang G Y, Wang T Y, Li Y. Development of EST-SSR in foxtail millet (Setaria italica)[J]. Genet Resour Crop Ev, 2007, 54(2): 233-236.
[17] Thiel T, Michalek W, Varshney R K, Graner A. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.)[J]. Theor Appl Genet, 2003, 106: 411-422.
[18] Huang X, Madan A. CAP3: a DNA sequence assembly program[J]. Genome Res, 1999, 9: 868-877.
[19] Yang Y C, Zhou Q M, Yin H Q. Making template DNA in the analysis of tobacco germplasm based on AFLP fingerprint[J]. Subtropical Plant Sci, 2005, 34(20): 1-4 (in Chinese).
[20] Temnykh S, DeClerck G, Lukashova A, Lipovich L, Cartinhour S, McCouch S. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential[J]. Genome Res, 2001, 11: 1441-1452.
[21] Katti M V, Ranjekar P K, Gupta V S. Differential distribution of simple sequence repeats in eukaryotic genome sequences[J]. Mol Biol Evol, 2001, 18(7): 1161-1167.
[22] Schorderet D, Gartler S. Analysis of CpG suppression in methylated and non-methylated species[J]. Proc Natl Acad Sci, 1992, 89: 957-961.