
Acta Agronomica Sinica (作物学报) 2012, 38(1): 50–54 http://www.chinacrops.org/zwxb/
ISSN 0496-3490; CODEN TSHPA9 E-mail: xbzw@chinajournal.net.cn

This research was supported by the State Key Basic Research and Development Plan of China (973 Program) (2004CB117203) and the International Science and Technology Cooperation and Exchanges Projects (2008DFA30550).
* Corresponding author: QIU Li-Juan, E-mail: qiu_lijuan@263.net
Received: 2011-03-10; Accepted: 2011-06-13; Published online: 2011-11-07.
URL: http://www.cnki.net/kcms/detail/11.1809.S.20111107.1049.014.html
DOI: 10.3724/SP.J.1006.2012.00050
Factors Affected BC1F1 Size for Development of Genome-wide Introgression
Lines
YAN Long1,2, LIU Hui-Yong1,2, LI Ying-Hui1, ZHANG Meng-Chen2, and QIU Li-Juan1,*
1 National Key Facility of Crop Gene Resource and Genetic Improvement / Institute of Crop Science, Chinese Academy of Agricultural Sciences,
Beijing 100081, China; 2 Institute of Cereal and Oil Crops, Hebei Academy of Agricultural and Forestry Sciences / Shijiazhuang Branch Center of
National Center for Soybean Improvement / Key Laboratory of Crop Genetics and Breeding, Shijiazhuang 050031, China
Abstract: Introgression lines are important genetic materials for genetic studies and breeding. Their development involves crossing and backcrossing between recipient and donor parents. The BC1F1 population size is a critical parameter for fully covering the donor genome and successfully obtaining the desired introgression lines. However, the minimum sufficient number of BC1F1 plants is unknown for each species and cannot readily be determined experimentally. We developed a computer program that simulates the recombination process during meiosis to define the ideal BC1F1 population size. The reliability of the program was confirmed by mathematical calculation and experimental data. Three factors, the number of linkage groups, linkage group length, and gene density, were analyzed, and all of them were positively related to the BC1F1 population size. The population size increased from 6.06 to 9.49 when the number of linkage groups increased from 5 to 40. The population size was 7.14 when the linkage group length was 80 cM and 8.64 when the length was 200 cM. The population size was 7.65 at a density of 20 cM per gene and 8.22 at 5 cM per gene. The BC1F1 population sizes of rice, wheat, maize, and soybean were predicted by the program to be 12, 13, 14–15, and 13, respectively, at the 95% confidence level.
Keywords: Introgression lines; Population size; Simulation
Analysis of Factors Affecting the BC1F1 Population Size Required for Constructing Genome-wide Introgression Lines
闫 龙 1,2 刘慧勇 1,2 李英慧 1 张孟臣 2 邱丽娟 1,*
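1 Institute of Crop Science, Chinese Academy of Agricultural Sciences / National Key Facility of Crop Gene Resource and Genetic Improvement, Beijing 100081; 2 Institute of Cereal and Oil Crops, Hebei Academy of Agricultural and Forestry Sciences / Shijiazhuang Branch Center of National Center for Soybean Improvement / Hebei Key Laboratory of Crop Genetics and Breeding, Shijiazhuang 050031, Hebei
Abstract: Genome-wide introgression lines are important materials for genetic and breeding research. Introgression lines are constructed by successive crossing and backcrossing between a recipient parent and a donor parent, and the BC1F1 population size is a key parameter for obtaining an ideal introgression line population. However, the minimum population required for each species is unknown and is difficult to determine experimentally. In this study, a program was written to simulate recombination during meiosis in order to study the appropriate population size, and its reliability was verified by mathematical calculation and experiment. The simulation agreed with the mathematical calculation and the experimental results. The BC1F1 population size was positively correlated with the number of linkage groups, linkage group length, and gene density. When the simulated number of linkage groups increased from 5 to 40, the required population size increased from 6.06 to 9.49; when the simulated linkage group length increased from 80 cM to 200 cM, the required population size increased from 7.14 to 8.64; when the simulated gene density changed from 20 cM per gene to 5 cM per gene, the population size increased from 7.65 to 8.22. To test the range of application of the program, BC1F1 population sizes were simulated for the major crops rice, wheat, maize, and soybean. For a 95% probability of covering the whole genome, rice required the fewest individuals (12), wheat and soybean each required 13, and maize required the most (14–15).
Keywords: Introgression lines; Population size; Simulation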
Introgression lines (ILs), also termed chromosome segment substitution lines (CSSLs), carry particular chromosome segments from a donor line in the genetic background of a recipient line. These lines play important roles in fine mapping of quantitative trait loci (QTL) and single genes, as well as in map-based gene cloning[1-2]. ILs have been developed and widely used in genetic studies and breeding practice for several species, including rice[3-7], wheat[8-9], barley[10-12], tomato[13], and lettuce[14].
The development of ILs for genetic improvement has been clearly described in rice[15]. Crosses were made between the recurrent parent and all donors, and the F1 progenies were backcrossed to their respective recurrent parents to produce BC1F1 populations. Twenty-five randomly selected plants from each BC1F1 population were backcrossed again to produce BC2F1 lines. The BC2F1 lines of each cross were planted to form a single bulk BC2F2 population, and the bulks were screened for progeny lines with excellent performance in the major agronomic traits. A similar approach has been used in other crops. In wheat, Liu et al.[8] backcrossed 20 BC1F1 plants to the recurrent parent to produce 20 BC2F1 lines, whereas Röder et al.[9] used 42 BC1F1 plants for backcrossing. In lettuce, 82 BC4 individuals, which originated from 11 BC1 plants, covered about 65% of the donor genome[14].
Since molecular markers are widely used in genetics and breeding, they should be considered together with the number of backcrosses and the progeny size for more efficient development of introgression lines.
Comprehensive genetic improvement of some traits is likely to require large-scale backcrossing to cover the whole genome of the donor parent, but the available resources may be limited in some species, such as bean and peanut, in which crosses for hybrid seed are very difficult to make. For instance, the average success rate of crosses in soybean is around 20% in our breeding practice, so obtaining 25 BC1F1 individuals requires about 125 crosses. Hence, an economical population size needs to be determined on a theoretical basis for the development of ILs. We are not aware of any report aimed at optimizing the BC1F1 population size in IL development; in practice, researchers use their best guess to decide the number of BC1F1 plants. An optimized BC1F1 population size based on accurate theoretical calculation would provide a scientific basis for the construction of ILs. It is generally accepted that the donor genome accounts for 25% of the genome on average in the BC1F1 generation, so in theory at least four individuals are necessary to cover the whole donor genome. It is common practice to select 20 individuals to ensure coverage of the whole donor genome, but the use of that many plants lacks scientific support. In plant breeding and population genetics, computer simulations have become increasingly popular for investigating problems that lack analytical solutions[16]. For example, Frisch et al.[17-18] and Ribaut et al.[19] used computer simulations to compare selection strategies for the introgression of one, two, or several target genes from a donor line into a recipient line. Those simulations focused on how efficiently a limited number of target genes could be introgressed while recovering the genetic background of the recurrent parent. Computer simulation may also be used to optimize the BC1F1 population size.
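As a worked check of the crossing arithmetic in the paragraph above, the short Python sketch below computes how many crosses are needed, on average, to obtain a target number of BC1F1 plants at a given success rate. The function name `crosses_needed` and the use of a whole-number percentage are illustrative choices, not part of the original study; the 20% rate and the target of 25 plants come from the text.

```python
def crosses_needed(target_plants: int, success_percent: int) -> int:
    """Smallest number of crosses whose expected yield reaches target_plants.

    Integer ceiling division is used to avoid floating-point rounding.
    """
    if target_plants < 0 or not 0 < success_percent <= 100:
        raise ValueError("need target_plants >= 0 and 0 < success_percent <= 100")
    return (target_plants * 100 + success_percent - 1) // success_percent


if __name__ == "__main__":
    # 20% is the average soybean crossing success rate quoted in the text.
    for plants in (25, 13):
        print(plants, "BC1F1 plants ->", crosses_needed(plants, 20), "crosses")
```

At the same 20% rate, a target of 13 plants corresponds to 65 crosses on average, roughly half the effort needed for 25 plants.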
Because the BC1F1 is the first segregating generation, an adequate BC1F1 population size is a prerequisite for effective introgression[20]. Here we report the development of a computer program for optimizing the BC1F1 population size and its application to rice, wheat, maize, and soybean. Three factors, the number of linkage groups, linkage group length, and gene density, were considered in the program.
1 Materials and Methods
1.1 Genetic model
The program, written in C++, simulates the recombination process during meiosis in the F1 generation. The BC1F1 was generated by crossing F1 plants with the recurrent parent. Since the gametes from the recurrent parent were all identical, the BC1F1 genotypes depended only on the gametes from the F1 plant. Each gamete from a diploid F1 individual was modeled as a string of 1's (donor alleles) and 0's (recipient alleles) over a series of markers (representing genes), with genetic distances measured in centimorgans (cM). Both the recipient and donor parents were assumed to be homozygous and to carry different alleles at all tested loci. The strings of the F1 gametes were simulated by “random combinations” of the selected markers. Recombination probabilities were calculated under the assumption that at most one exchange could occur between neighboring loci. The simulation was complete when all the targeted donor loci were present in the BC1F1 population (Fig. 1). The number of individual plants needed to obtain all the targeted loci was recorded as the optimized BC1F1 population size.

Fig. 1 Process of generating BC1F1 from the F1 gamete
When all the targeted loci were present in the BC1F1 population, the number of individuals was recorded as the optimized size.
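The genetic model described in Section 1.1 can be sketched in a few lines of Python. This is not the authors' C++ program, only a minimal illustration under stated assumptions: markers are evenly spaced starting at 0 cM (so a linkage group carries length // spacing + 1 loci), the recombination fraction between neighboring loci equals their map distance divided by 100, and exchanges in different intervals are independent. All function names are hypothetical.

```python
import random


def loci_per_group(group_length_cm, cm_per_gene):
    """Markers on one linkage group, assuming even spacing starting at 0 cM."""
    return int(group_length_cm // cm_per_gene) + 1


def simulate_gamete(n_groups, group_length_cm, cm_per_gene, rng):
    """Return one F1 gamete as a list of alleles (1 = donor, 0 = recipient)."""
    if not 0 < cm_per_gene <= 50:
        raise ValueError("cm_per_gene must be in (0, 50]")
    # Assumption: with at most one exchange per interval, the recombination
    # fraction between neighboring loci equals the map distance in Morgans.
    r = cm_per_gene / 100.0
    gamete = []
    for _ in range(n_groups):
        allele = rng.randint(0, 1)  # linkage groups assort independently
        gamete.append(allele)
        for _ in range(loci_per_group(group_length_cm, cm_per_gene) - 1):
            if rng.random() < r:  # an exchange switches the parental strand
                allele = 1 - allele
            gamete.append(allele)
    return gamete


def bc1f1_size_to_cover(n_groups, group_length_cm, cm_per_gene, rng):
    """Count BC1F1 plants drawn until every locus has shown a donor allele."""
    n_loci = n_groups * loci_per_group(group_length_cm, cm_per_gene)
    covered = [0] * n_loci
    n_plants = 0
    while not all(covered):
        gamete = simulate_gamete(n_groups, group_length_cm, cm_per_gene, rng)
        covered = [c | g for c, g in zip(covered, gamete)]
        n_plants += 1
    return n_plants


def mean_bc1f1_size(n_groups, group_length_cm, cm_per_gene, repeats=1000, seed=0):
    """Average the stopping size over independent repeats."""
    rng = random.Random(seed)
    sizes = [bc1f1_size_to_cover(n_groups, group_length_cm, cm_per_gene, rng)
             for _ in range(repeats)]
    return sum(sizes) / repeats


if __name__ == "__main__":
    # 5 linkage groups of 80 cM at 20 cM per gene, one of the Table 1 settings.
    print(mean_bc1f1_size(5, 80, 20))
```

Because the marker layout and mapping function are assumptions, the averages this sketch produces need not match the published values exactly.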

1.2 Validation
Results generated by the program were validated both mathematically, in a simple situation, and against an actual experiment. For the mathematical validation, two unlinked loci were assumed, and the population size was calculated and compared with the one from the program. For the experimental validation, BC1F1 individual genotypes were deduced from a real soybean F2 population of 85 individuals with 10 markers on 2 linkage groups. The genotype of the gamete that produced each BC1F1 individual was taken to be the same as the F2 genotype at homozygous loci, and at heterozygous loci the allele was taken at random. The BC1F1 population size was calculated and compared in two ways. In the first, we randomly selected deduced BC1F1 individuals and counted how many were needed for the sample to contain all the donor markers; this process was repeated 10 000 times and the average was reported. In the second, the program was used to simulate the BC1F1 population size with 1 000 repeats, with parameters based on the real F2 linkage map.
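The text does not spell out the mathematical calculation for two unlinked loci. One plausible version, offered here only as an assumption, treats each BC1F1 plant as carrying the donor allele at each locus independently with probability 1/2 and asks for the expected number of plants until every locus has been seen with a donor allele. Under that assumption the expectation for two loci is 8/3, about 2.67, close to the 2.68 reported in Section 2.1. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def expected_plants_unlinked(n_loci: int) -> Fraction:
    """Expected BC1F1 plants until each of n_loci unlinked loci shows a donor allele.

    Assumes each plant carries the donor allele at each locus independently
    with probability 1/2. Uses E[N] = sum over n >= 0 of P(N > n), where
    P(N <= n) = (1 - 2**-n)**n_loci, expanded with the binomial theorem and
    summed as geometric series.
    """
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    total = Fraction(0)
    for j in range(1, n_loci + 1):
        total += (-1) ** (j + 1) * comb(n_loci, j) / (1 - Fraction(1, 2 ** j))
    return total


if __name__ == "__main__":
    value = expected_plants_unlinked(2)
    print(value, float(value))
```

Exact fractions are used so that the closed-form value can be compared directly with a simulated average without rounding noise.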
1.3 Factors affecting the population size
Analyses were performed to evaluate the effects of three factors on the BC1F1 population size: the number of linkage groups, linkage group length, and gene density (Table 1). The number of linkage groups was set to 5, 10, 15, 20, 25, and 40; linkage group length was set to 80, 120, 160, and 200 cM; and gene density was set to 5, 10, and 20 cM per gene. All simulation results in this study are averages of 1 000 repeats for each analysis.
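The factorial design in Section 1.3 can be written down as a parameter grid. The sketch below only enumerates the settings listed in the text; the constant and key names are illustrative, and no simulation is run here.

```python
from itertools import product

# Factor levels as listed in Section 1.3.
LINKAGE_GROUP_NUMBERS = (5, 10, 15, 20, 25, 40)
LINKAGE_GROUP_LENGTHS_CM = (80, 120, 160, 200)
GENE_DENSITIES_CM_PER_GENE = (5, 10, 20)
REPEATS = 1000  # simulation repeats averaged per setting


def parameter_grid():
    """Return every combination of the three factors as a list of dicts."""
    return [
        {"group_length_cm": length, "cm_per_gene": density, "n_groups": n_groups}
        for length, density, n_groups in product(
            LINKAGE_GROUP_LENGTHS_CM,
            GENE_DENSITIES_CM_PER_GENE,
            LINKAGE_GROUP_NUMBERS,
        )
    ]


if __name__ == "__main__":
    grid = parameter_grid()
    print(len(grid), "settings,", len(grid) * REPEATS, "simulated populations")
```

The grid has 6 × 4 × 3 = 72 settings, which matches the 72 cells of Table 1.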
1.4 BC1F1 population size for rice, wheat, maize, and soybean
Crop genome parameters were taken from the literature. Rice has 12 linkage groups, ranging in length from 114 cM to 231 cM[21]. Wheat has 21 linkage groups, ranging from 17 cM to 243 cM[22]. Maize has only 10 linkage groups, but individual linkage groups range from 578 cM to 1 143 cM (http://www.maizegdb.org/map.php). Soybean has 20 linkage groups, ranging from 71 cM to 166 cM[23]. Gene density was set at three levels: 10, 5, and 2 cM per gene. For maize, only two gene densities, 10 and 5 cM per gene, were used because of limited computing capacity.
2 Results
2.1 The program could predict the population size
In the mathematical validation, the population size calculated mathematically and the one from the program were both 2.68. In the experimental validation, the program simulation gave a population size of 3.73 for covering the whole genome, while the experimental data gave 3.8.
2.2 Population size was affected by three factors
The program simulations showed that the number of linkage groups, linkage group length, and gene density were all positively correlated with population size (Table 1). The average population size increased from 6.06 to 9.49 when the number of linkage groups increased from 5 to 40, averaged over all linkage group lengths and gene densities. Population size also depended on linkage group length: the average population size was 7.14 when the linkage groups were 80 cM long and 8.64 when they were 200 cM long. Gene density was the third factor positively correlated with the BC1F1 population size: the average population size was 7.65 with 20 cM per gene, 7.99 with 10 cM per gene, and 8.22 with 5 cM per gene.

Table 1 Population sizes of BC1F1 under different simulation conditions
Linkage group   Gene density     Number of linkage groups
length (cM)     (cM per gene)    5       10      15      20      25      40
80              5                5.59    6.46    7.13    7.76    8.21    8.99
80              10               5.33    6.43    7.07    7.58    7.95    8.76
80              20               5.07    6.21    6.74    7.26    7.67    8.29
120             5                6.06    7.36    7.98    8.53    8.91    9.54
120             10               5.81    7.06    7.72    8.21    8.60    9.49
120             20               5.69    6.66    7.35    7.85    8.28    8.97
160             5                6.57    7.71    8.25    8.96    9.34    10.16
160             10               6.25    7.48    8.24    8.75    9.02    9.97
160             20               6.17    7.30    7.99    8.35    8.76    9.39
200             5                6.91    8.18    8.86    9.39    9.88    10.59
200             10               6.82    7.99    8.65    9.10    9.43    10.05
200             20               6.49    7.57    8.24    8.67    8.95    9.66
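The averages quoted in Section 2.2 are marginal means of Table 1, taken over the other two factors. The Python sketch below reproduces them from the table values; the data are copied from Table 1, while the variable and function names are illustrative.

```python
from statistics import mean

GROUP_NUMBERS = (5, 10, 15, 20, 25, 40)

# (linkage group length in cM, cM per gene) -> sizes for each group number
TABLE1 = {
    (80, 5): (5.59, 6.46, 7.13, 7.76, 8.21, 8.99),
    (80, 10): (5.33, 6.43, 7.07, 7.58, 7.95, 8.76),
    (80, 20): (5.07, 6.21, 6.74, 7.26, 7.67, 8.29),
    (120, 5): (6.06, 7.36, 7.98, 8.53, 8.91, 9.54),
    (120, 10): (5.81, 7.06, 7.72, 8.21, 8.60, 9.49),
    (120, 20): (5.69, 6.66, 7.35, 7.85, 8.28, 8.97),
    (160, 5): (6.57, 7.71, 8.25, 8.96, 9.34, 10.16),
    (160, 10): (6.25, 7.48, 8.24, 8.75, 9.02, 9.97),
    (160, 20): (6.17, 7.30, 7.99, 8.35, 8.76, 9.39),
    (200, 5): (6.91, 8.18, 8.86, 9.39, 9.88, 10.59),
    (200, 10): (6.82, 7.99, 8.65, 9.10, 9.43, 10.05),
    (200, 20): (6.49, 7.57, 8.24, 8.67, 8.95, 9.66),
}


def mean_by_group_number(n_groups):
    """Average one column of Table 1 over all lengths and densities."""
    col = GROUP_NUMBERS.index(n_groups)
    return mean(row[col] for row in TABLE1.values())


def mean_by_length(length_cm):
    """Average all cells with the given linkage group length."""
    return mean(v for (length, _), row in TABLE1.items()
                if length == length_cm for v in row)


def mean_by_density(cm_per_gene):
    """Average all cells with the given gene density."""
    return mean(v for (_, density), row in TABLE1.items()
                if density == cm_per_gene for v in row)


if __name__ == "__main__":
    for n in (5, 40):
        print("groups", n, format(mean_by_group_number(n), ".3f"))
    for length in (80, 200):
        print("length", length, format(mean_by_length(length), ".3f"))
    for density in (20, 10, 5):
        print("density", density, format(mean_by_density(density), ".3f"))
```

Each marginal mean agrees with the value in the text to within rounding, which also confirms the row and column layout of the table.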

2.3 Optimized BC1F1 population size for different crops
To test the applicability of the simulation program, the BC1F1 population size was optimized for four crops. The BC1F1 population size giving more than 95% probability of covering the whole genome was 12 for rice and 13 for wheat and soybean at gene densities of 10, 5, and 2 cM per gene (Fig. 2-a, b, and d). For maize, the BC1F1 population size giving more than 95% probability was 14 at 10 cM per gene and 15 at 5 cM per gene (Fig. 2-c).
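How the coverage probability was computed from the simulation repeats is not described. One straightforward approach, sketched here as an assumption, is to record the stopping size of each repeat and take the smallest population size at which the required fraction of repeats had already covered the whole donor genome. The function names and the example data are hypothetical.

```python
def coverage_probability(sizes, n_plants):
    """Fraction of repeats in which n_plants or fewer covered the donor genome."""
    return sum(1 for s in sizes if s <= n_plants) / len(sizes)


def smallest_size_for(sizes, target=0.95):
    """Smallest population size whose empirical coverage reaches the target."""
    if not sizes or not 0 < target <= 1:
        raise ValueError("need at least one repeat and 0 < target <= 1")
    for n_plants in range(1, max(sizes) + 1):
        if coverage_probability(sizes, n_plants) >= target:
            return n_plants
    return max(sizes)  # coverage is 1.0 here, so this line is a safety net


if __name__ == "__main__":
    # Hypothetical stopping sizes from 100 repeats, for illustration only.
    example = [3] * 90 + [5] * 6 + [9] * 4
    print(smallest_size_for(example, 0.95))
```

The average stopping size and the 95% size answer different questions, which is why the crop values of 12 to 15 plants are larger than the averages in Table 1.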
3 Discussion
This work was based on a genetic model that follows the rules of Mendelian segregation, and the program was tested both mathematically and experimentally. The close agreement between the simulated and experimental results provides strong evidence that the program can correctly predict the BC1F1 population size observed in the real soybean experiment. Frisch et al.[17] reported a similar result when comparing an original linkage map, based on experimental data, with a linkage map constructed from simulated data during the development of their simulation software. We need to point out that the program assumes neither chiasma interference nor chromatid interference. In reality, chiasma interference is unlikely within small genetic distances of 10 cM or less between markers.

Fig. 2 Probability of covering the whole genome for different population sizes in four crops
PCWDG: probability of covering the whole donor genome. The different lines in each chart represent different gene densities. In rice, wheat, and soybean, the gene densities were 10, 5, and 2 cM per gene; in maize, they were 10 and 5 cM per gene.

These results indicate the population sizes needed to cover the whole donor genome of the four major crops with 99% probability in most cases. Among the four crops, maize needs the largest BC1F1 population (14–15 individuals) and rice the smallest (12 individuals) to cover the whole donor genome. As for previous studies of BC1F1 population size, Cox[24] showed by simulation that increasing the number of BC1F1 plants beyond 12 individuals produced little change in the additive genetic variance of BC1F1-derived lines. Korff et al.[10] used 12 BC1F1 plants in the second backcross with two cultivars, covering 98.1% and 93.0% of the donor genome in barley. Our results are similar to these previous reports. The importance of this study is that it optimizes the BC1F1 population size for the important crops rice, wheat, maize, and soybean, and provides a scientific basis for the optimization.
4 Conclusions
In this work, a reliable program was developed. Three factors, the number of linkage groups, linkage group length, and gene density, were positively related to the BC1F1 population size. The BC1F1 population sizes of rice, wheat, maize, and soybean were predicted by the program at the 95% confidence level. This report should help guide the choice of BC1F1 population size for economical and reliable introgression practice.
Acknowledgments
The authors would like to thank Dr. G S Hu from USDA-ARS and Dr. Marinus J M Smulders from Plant Research International, Wageningen UR, for their suggestions on revising the manuscript.
References
[1] Song X J, Huang W, Shi M, Zhu M Z, Lin H X. A QTL for rice grain width and weight encodes a previously unknown RING-type E3 ubiquitin ligase. Nat Genet, 2007, 39: 623–630
[2] Phadnis N, Orr H A. A single gene causes both male sterility and segregation distortion in Drosophila hybrids. Science, 2009, 323: 376–378
[3] Wan X Y, Wan J M, Su C C, Wang C M, Shen W B, Li J M, Wang H L, Jiang L, Liu S J, Chen L M, Yashi H, Yoshimara A. QTL detection for eating quality of cooked rice in a population of chromosome segment substitution lines. Theor Appl Genet, 2004, 110: 71–79
[4] Tian F, Li D J, Fu Q, Zhu Z F, Fu Y C, Wang X K, Sun C Q. Construction of introgression lines carrying wild rice (Oryza rufipogon Griff.) segments in cultivated rice (Oryza sativa L.) background and characterization of introgressed segments associated with yield-related traits. Theor Appl Genet, 2006, 112: 570–580
[5] Xi Z Y, He F H, Zeng R Z, Zhang Z M, Ding X H, Li W T, Zhang G Q. Development of a wide population of chromosome single-segment substitution lines in the genetic background of an elite cultivar of rice (Oryza sativa L.). Genome, 2006, 49: 476–484
[6] Takai T, Nonoue Y, Yamamoto S, Yamanouchi U, Matsubara K, Liang Z W, Lin H, Ono N, Uga Y, Yano M. Development of chromosome segment substitution lines derived from backcross between indica donor rice cultivar ‘Nona Bokra’ and japonica recipient cultivar ‘Koshihikari’. Breed Sci, 2007, 57: 257–261
[7] Cheema K K, Bains N S, Mangat G S, Das A, Vikal Y, Brar D S, Khush G S, Singh K. Development of high yielding IR64 × Oryza rufipogon (Griff.) introgression lines and identification of introgressed alien chromosome segments using SSR markers. Euphytica, 2008, 160: 401–409
[8] Liu S B, Zhou R H, Dong Y C, Li P, Jia J Z. Development, utilization of introgression lines using a synthetic wheat as donor. Theor Appl Genet, 2006, 112: 1360–1373
[9] Röder M S, Huang X Q, Börner A. Fine mapping of the region on wheat chromosome 7D controlling grain weight. Funct Integr Genom, 2008, 8: 79–86
[10] Korff M, Wang H, Léon J, Pillen K. Development of candidate introgression lines using an exotic barley accession (Hordeum vulgare ssp. spontaneum) as donor. Theor Appl Genet, 2004, 109: 1736–1745
[11] Schmalenbach I, Léon J, Pillen K. Identification and verification of QTLs for agronomic traits using wild barley introgression lines. Theor Appl Genet, 2009, 118: 483–497
[12] Schmalenbach I, Körber N, Pillen K. Selecting a set of wild barley introgression lines and verification of QTL effects for resistance to powdery mildew and leaf rust. Theor Appl Genet, 2008, 117: 1093–1106
[13] Eshed Y, Zamir D. A genomic library of Lycopersicon pennellii in L. esculentum: a tool for fine mapping of genes. Euphytica, 1994, 79: 175–179
[14] Jeuken M J W, Lindhout P. The development of lettuce backcross inbred lines (BILs) for exploitation of the Lactuca saligna (wild lettuce) germplasm. Theor Appl Genet, 2004, 109: 394–401
[15] Li Z K, Fu B Y, Gao Y M, Xu J L, Ali J, Lafitte H R, Jiang Y Z, Pey J D, Vijayakumar C H M, Maghirang R. Genome-wide introgression lines and their use in genetic and molecular dissection of complex phenotypes in rice (Oryza sativa L.). Plant Mol Biol, 2005, 59: 33–52
[16] Frisch M, Bohn M, Melchinger A E. PLABSIM: software for simulation of marker-assisted backcrossing. J Hered, 2000, 91: 86–87
[17] Frisch M, Bohn M, Melchinger A E. Comparison of selection strategies for marker-assisted backcrossing of a gene. Crop Sci, 1999, 39: 1295–1301
[18] Frisch M, Melchinger A E. Marker-assisted backcrossing for simultaneous introgression of two genes. Crop Sci, 2001, 41: 1716–1725
[19] Ribaut J M, Jiang C, Hoisington D. Simulation experiments on efficiencies of gene introgression by backcrossing. Crop Sci, 2002, 42: 557–565
[20] Susic Z. Experimental and Simulation Studies on Introgressing genomic Segments from Exotic into Elite Germplasm of Rye (Secale cereale L.) by Marker-assisted Backcrossing. PhD Dissertation of University of Hohenheim, 2005
[21] McCouch S R, Teytelman L, Xu Y B, Lobos K B, Clare K, Walton M, Fu B, Maghirang R, Li Z, Xing Y, Zhang Q, Kono I, Yano M, Fjellstrom R, Declerck G, Schneider D, Cartinhour S, Ware D, Stein L. Development and mapping of 2240 new SSR markers for rice (Oryza sativa L.). DNA Res, 2002, 9: 199–207
[22] Paillard S, Schnurbusch T, Winzeler M, Messmer M, Sourdille P, Abderhalden O, Keller B, Schachermayr G. An integrative genetic linkage map of winter wheat (Triticum aestivum L.). Theor Appl Genet, 2003, 107: 1235–1242
[23] Song Q J, Marek L F, Shoemaker R C, Lark K G, Concibido V C, Delannay X, Specht J E, Cregan P B. A new integrated genetic linkage map of the soybean. Theor Appl Genet, 2004, 109: 122–128
[24] Cox T S. Expectations of means and genetic variances in backcross populations. Theor Appl Genet, 1984, 68: 35–41