
Wheat cSNP Mining Based on Full-Length cDNA Sequences



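Using sequenced gene sequences from the different wheat genomes as source sequences, a set of cSNPs was detected in the wheat EST database in GenBank with the AutoSNP software, opening a new route to mining candidate genome-specific cSNPs in wheat. From 2089 source sequences, 1 296 cSNPs were detected, of which 397 came from the A genome, 322 from the S genome and 420 from the D genome. In addition, 154 SNPs were shared by the A and D genomes, whereas only 1 SNP each was shared by A and S, by S and D, and by A, S and D. This result also indicates that, among the three genome donor species of wheat, the A and D genomes are relatively closely related, while both are more distantly related to the S genome. Statistical analysis showed that the frequency of SNPs in wheat is about 0.914‰.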

Single nucleotide polymorphisms (SNPs), the third generation of molecular markers, are a research focus because they are abundant, highly stable, widely distributed, easy to detect at high throughput, and widely applicable. Most SNP mining methods, except RFLP, are based on amplification of target regions. Few applications of high-throughput SNP mining have been reported for common wheat (Triticum aestivum) because of its hexaploid nature, its huge genome and the high homology among the A, B and D genomes. The wheat dbEST in GenBank is an important resource for mining SNPs with bioinformatic methods, and full-length cDNA libraries constructed from seedlings of wheat's three genome donor species provide an opportunity to mine candidate genome-specific cSNPs in wheat.
More than 5 000 clones from the full-length cDNA libraries of the A, S and D genomes were randomly sequenced, yielding 2 467 singlets: 990 from the A genome, 737 from the S genome and 740 from the D genome. Using these singlets as source templates, we "fished out" 1 296 SNPs from the wheat dbEST in GenBank with the AutoSNP program, thereby developing an effective method for mining wheat cSNPs. Of these SNPs, 397 were from the A genome, 322 from the S genome and 420 from the D genome. A further 154 SNPs were common to the A and D genomes, whereas only 1 SNP each was shared by A and S, by S and D, and by A, S and D. Under the principle of this SNP mining approach, this suggests that the A and D genomes are the most closely related of wheat's three genomes.
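As a quick consistency check on the counts reported above, the following minimal sketch tallies the SNPs by the set of genomes in which they were found. It assumes the seven reported categories are mutually exclusive (an interpretation, not something the abstract states outright); the dictionary layout and the helper name `shared_between` are illustrative only.

```python
# SNP counts as reported in the abstract, keyed by the set of genomes involved.
# Assumption: the categories are disjoint (e.g. the 154 A+D SNPs exclude
# the single SNP shared by all three genomes).
snp_counts = {
    frozenset("A"): 397,
    frozenset("S"): 322,
    frozenset("D"): 420,
    frozenset("AD"): 154,
    frozenset("AS"): 1,
    frozenset("SD"): 1,
    frozenset("ASD"): 1,
}

# Under the disjointness assumption, the categories should add up to
# the reported total of 1 296 SNPs.
total = sum(snp_counts.values())


def shared_between(g1, g2):
    """Count SNPs found in both g1 and g2, including those shared by all three."""
    return sum(n for genomes, n in snp_counts.items() if {g1, g2} <= genomes)


print("total SNPs:", total)
for pair in ("AD", "AS", "SD"):
    print(pair, "shared:", shared_between(pair[0], pair[1]))
```

Under this reading the seven categories sum to exactly 1 296, and the A/D pair shares far more SNPs than either pair involving S, which is the pattern the abstract uses to argue that A and D are the more closely related genomes.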
Of the 1 296 SNPs, 82 were InDels, accounting for 6.25% of all sequence variations. Most were single-base InDels (64.20%), a smaller share were 2–5 base InDels (30.86%), and only a few (4.94%) were longer than 5 bases. SNP frequency differed among the three genomes: 0.98‰ in the A genome, 0.68‰ in the S genome and 1.54‰ in the D genome. The average frequency across the three genomes was 0.914‰, close to that of soybean (0.97‰) but much lower than that of maize (6.3‰). The transition/transversion (Ts/Tv) ratio also differed among the three genomes: it was highest in the D genome (1.404), intermediate in the A genome (1.196) and lowest in the S genome (1.033). The average ratio across the A, S and D genomes was 1.306, close to those of maize (1.53) and soybean (1.08) but far from those of human, mouse and fruit fly.
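For readers unfamiliar with the Ts/Tv statistic, the sketch below shows how such a ratio is conventionally computed from a list of base substitutions: a transition swaps a purine for a purine (A/G) or a pyrimidine for a pyrimidine (C/T), and every other substitution is a transversion. This is a generic illustration, not the AutoSNP implementation or the authors' pipeline; the function names and the example substitutions are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """Return True for a purine<->purine or pyrimidine<->pyrimidine change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(substitutions):
    """Ts/Tv ratio for an iterable of (ref, alt) base pairs; InDels are excluded."""
    subs = list(substitutions)
    ts = sum(1 for ref, alt in subs if is_transition(ref, alt))
    tv = len(subs) - ts
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv


# Hypothetical substitutions, made up for illustration (not data from the study):
# three transitions (A>G, C>T, G>A) and two transversions (A>C, G>T).
example = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(example))
```

With twice as many possible transversions as transitions, a ratio above 0.5 already indicates that transitions are over-represented relative to random substitution, which puts the reported per-genome values (1.033 to 1.404) in context.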

