[Literature] Current strategies for sequencing and assembling polyploid plant genomes

Paper: Current Strategies of Polyploid Plant Genome Sequence Assembly

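Genome polyploidization occurs mainly in angiosperms. Many polyploid plants are of great agricultural value, for example wheat (Triticum aestivum), peanut (Arachis hypogaea), the Brassicaceae, potato (Solanum tuberosum), oat (Avena sativa), banana (Musa sp.), strawberry (Fragaria × ananassa), and coffee (Coffea arabica).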

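Polyploids fall into two types: autopolyploids, which arise from whole-genome duplication, and allopolyploids, which arise from chromosome doubling after hybridization between (or within) species. Autopolyploids often have fertility problems, whereas allopolyploids may show heterosis (hybrid vigor). The relationship between genotype and phenotype is also more complex in polyploids; for example, they need fairly elaborate regulation to keep the expression of their homologous gene copies coordinated.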

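For genome assembly, autopolyploids are harder than allopolyploids. This is because a whole-genome duplication event is usually followed by genome rearrangement, atypical recombination, transposable element activation, meiotic/mitotic defects, intron expansion, and DNA loss. A major challenge in assembly is therefore to avoid mis-assembling similar segments that belong to different subgenomes.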

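The authors searched NCBI and summarized the polyploid species published up to 2018. I have updated the list with strawberry (Fragaria × ananassa), banana (Musa balbisiana), and sugarcane (Saccharum spontaneum L.).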

| ID | Organism name | Genome size (Mb) | Current status | First release date in NCBI | Ploidy level | References/center |
|---|---|---|---|---|---|---|
| 1 | Arabidopsis lyrata subsp lyrata | 206.823 | Scaffold | 2009/11/30 | Tetraploid | Hu et al., 2011 |
| 2 | Glycine max | 978.972 | Chromosome | 2010/1/5 | Allotetraploid | Schmutz et al., 2010 |
| 3 | Triticum aestivum | 15344.7 | Chromosome 3B | 2010/7/15 | Allohexaploid | Choulet et al., 2010 |
| 4 | Solanum tuberosum | 705.934 | Scaffold | 2011/5/24 | Autotetraploid | Potato Genome Sequencing Consortium, 2011 |
| 5 | Actinidia chinensis | 604.217 | Contig | 2013/9/16 | Tetraploid | Huang et al., 2013 |
| 6 | Fragaria orientalis | 214.356 | Scaffold | 2013/11/27 | Tetraploid | Hirakawa et al., 2014 |
| 7 | Fragaria x ananassa | 805.488 | Chromosome | 2019/2/25 | Allooctaploid | Edger et al., 2019 |
| 8 | Beta vulgaris | 566.55 | Chromosome | 2013/12/18 | 2n, 4n (Beyaz et al., 2013) | Dohm et al., 2014 |
| 9 | Oryza minuta | 45.1659 | Chromosome | 2014/4/16 | Tetraploid | Oryza Chr3 Short Arm Comparative Sequencing Project |
| 10 | Camelina sativa | 641.356 | Chromosome | 2014/4/17 | Hexaploid | Kagale et al., 2014 |
| 11 | Brassica napus | 976.191 | Chromosome | 2014/5/5 | Allotetraploid | Chalhoub et al., 2014 |
| 12 | Brassica oleracea var. oleracea | 488.954 | Chromosome | 2014/5/22 | Hexaploid | NCBI |
| 13 | Nicotiana tabacum | 3643.47 | Scaffold | 2014/5/29 | Allotetraploid | Sierro et al., 2014 |
| 14 | Eragrostis tef | 607.318 | Scaffold | 2015/4/8 | Allotetraploid | Cannarozzi et al., 2014 |
| 15 | Gossypium hirsutum | 2189.14 | Chromosome | 2015/4/29 | Allotetraploid | Li et al., 2015 |
| 16 | Zoysia japonica | 334.384 | Scaffold | 2016/3/15 | Tetraploid | Tanaka et al., 2016 |
| 17 | Zoysia matrella | 563.439 | Scaffold | 2016/3/15 | Allotetraploid | Tanaka et al., 2016 |
| 18 | Zoysia pacifica | 397.01 | Scaffold | 2016/3/15 | Allotetraploid | Tanaka et al., 2016 |
| 19 | Musa itinerans | 455.349 | Scaffold | 2016/5/21 | 2n, 3n hybrids (Wu et al., 2016) | South China Botanic Garden, CAS |
| 20 | Rosa x damascena | 711.72 | Scaffold | 2016/6/13 | Tetraploid | BIO-FD & C CO., LTD |
| 21 | Chenopodium quinoa | 1333.55 | Scaffold | 2016/7/11 | Tetraploid | Jarvis et al., 2017 |
| 22 | Brassica juncea var. tumida | 954.861 | Chromosome | 2016/7/19 | Allotetraploid | Zhejiang University |
| 23 | Hibiscus syriacus | 1748.25 | Scaffold | 2016/7/29 | 2n, 3n, 4n (Van Huylenbroeck et al., 2000) | Korea Research Institute of Science and Biotechnology (Kim et al., 2017) |
| 24 | Gossypium barbadense | 2566.74 | Scaffold | 2016/10/28 | Tetraploid | Huazhong Agricultural University |
| 25 | Momordica charantia | 285.614 | Scaffold | 2016/12/27 | 2n to 6n (Kausar et al., 2015) | Urasaki et al., 2016 |
| 26 | Drosera capensis | 263.788 | Scaffold | 2016/12/30 | Tetraploid (Rothfels and Heimburger, 1968) | Butts et al., 2016 |
| 27 | Capsella bursa-pastoris | 268.431 | Scaffold | 2017/1/29 | Tetraploid | Lomonosov Moscow State University |
| 28 | Saccharum hybrid cultivar | 1169.95 | Contig | 2017/3/3 | It varies (D’Hont, 2005) | Riaño-Pachón and Mattiello, 2017 |
| 29 | Xerophyta viscosa | 295.462 | Scaffold | 2017/3/31 | Hexaploid | Costa et al., 2017 |
| 30 | Triticum dicoccoides | 10495 | Chromosome | 2017/5/18 | Tetraploid | WEWseq consortium |
| 31 | Utricularia gibba | 100.689 | Chromosome | 2017/5/31 | 16-ploid | Lan et al., 2017 |
| 32 | Eleusine coracana | 1195.99 | Scaffold | 2017/6/8 | Allotetraploid | Hittalmani et al., 2017 |
| 33 | Dioscorea rotundata | 456.675 | Chromosome | 2017/7/28 | Tetraploid | Iwate Biotechnology Research Center |
| 34 | Ipomoea batatas | 837.013 | Contig | 2017/8/26 | Autohexaploid | Yang et al., 2017 |
| 35 | Echinochloa crus-galli | 1486.61 | Scaffold | 2017/10/23 | Hexaploid | Zhejiang University |
| 36 | Pachycereus pringlei | 629.656 | Scaffold | 2017/10/31 | Autotetraploid | Zhou et al., 2017 |
| 37 | Olea europaea | 1141.15 | Chromosome | 2017/11/1 | 2n, 4n, 6n (Besnard et al., 2007) | Unver et al., 2017 |
| 38 | Monotropa hypopitys | 2197.49 | Contig | 2018/1/3 | Hexaploid | Institute of Bioengineering, RAS |
| 39 | Dactylis glomerata | 839.915 | Scaffold | 2018/1/19 | Autotetraploid | Sichuan Agricultural University |
| 40 | Panicum miliaceum | 848.309 | Scaffold | 2018/1/23 | Allotetraploid | China Agricultural University |
| 41 | Euphorbia esula | 1124.89 | Scaffold | 2018/2/6 | Hexaploid | USDA-ARS |
| 42 | Santalum album | 220.961 | Scaffold | 2018/2/12 | 2n, 4n etc (Xin-Hua et al., 2010) | Center for Cellular and Molecular Platforms |
| 43 | Avena sativa | 67.3266 | Contig | 2018/2/26 | Hexaploid | The Sainsbury Laboratory |
| 44 | Panicum miliaceum | 850.677 | Chromosome | 2018/4/9 | Tetraploid | Shanghai Center for Plant Stress Biology |
| 45 | Arachis monticola | 2618.65 | Chromosome | 2018/4/23 | Tetraploid | Henan Agricultural University |
| 46 | Arachis hypogaea | 2538.28 | Chromosome | 2018/5/2 | Allotetraploid | International Peanut Genome Initiative |
| 47 | Artemisia annua | 1792.86 | Scaffold | 2018/5/8 | Tetraploid | Shen et al., 2018 |
| 48 | Saccharum spontaneum L. | 2.9 Gb | Chromosome | 2018/09/10 | Octoploid | Zhang et al., 2018 |
| 49 | Musa balbisiana | 430 | Chromosome | 2019/7/15 | Tetraploid | Wang et al., 2019 |

For ploidy estimation, two approaches can be used.
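
The two approaches are not spelled out here. As a rough illustration of how sequence-based ploidy estimation can work in general (my own assumption, not necessarily one of the methods the authors mean), the sketch below relies on the fact that allele fractions at heterozygous biallelic sites cluster around k/p in a p-ploid genome: about 1/2 for a diploid, 1/3 and 2/3 for a triploid, and 1/4, 1/2 and 3/4 for a tetraploid. The function names, the equal-weight Gaussian mixture, the candidate list and the `sigma` value are all hypothetical choices for this toy example.

```python
import math


def log_likelihood(fractions, ploidy, sigma=0.05):
    """Log-likelihood of allele fractions under a toy mixture model.

    The model is an equal-weight Gaussian mixture centred on k/ploidy
    for k = 1 .. ploidy-1 (an illustrative assumption, not a published method).
    """
    centres = [k / ploidy for k in range(1, ploidy)]
    weight = 1.0 / len(centres)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for x in fractions:
        density = sum(
            weight * norm * math.exp(-0.5 * ((x - c) / sigma) ** 2)
            for c in centres
        )
        total += math.log(density)
    return total


def guess_ploidy(fractions, candidates=(2, 3, 4, 6), sigma=0.05):
    """Return the candidate ploidy whose mixture fits the fractions best."""
    return max(candidates, key=lambda p: log_likelihood(fractions, p, sigma))


# Hand-made allele fractions (not real data), one list per scenario.
diploid_like = [0.48, 0.50, 0.52, 0.47, 0.53, 0.50]
triploid_like = [0.32, 0.34, 0.66, 0.68, 0.33, 0.67]
tetraploid_like = [0.24, 0.26, 0.49, 0.51, 0.74, 0.76]

for name, data in [("diploid_like", diploid_like),
                   ("triploid_like", triploid_like),
                   ("tetraploid_like", tetraploid_like)]:
    print(name, guess_ploidy(data))
```

Because every mixture component gets the same weight, a higher ploidy is penalized when some of its expected peaks are empty, which keeps the toy model from always preferring the largest candidate. Real data are noisier and the peaks are rarely equally populated, so this should be read only as an intuition aid.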

For haplotype assembly, the authors list the tools below. The most recent of them, HapCUT2, is probably the most reliable choice.

  • HapCompass (Aguiar and Istrail, 2012)
  • HaploSim (Bastiaansen et al., 2012)
  • HapCut (Bansal and Bafna, 2008)
  • HapCUT2 (Edge et al., 2017)
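
To make the idea of haplotype assembly concrete, here is a minimal toy sketch of the diploid case: reads that cover several heterozygous sites are used to find the pair of complementary haplotypes that disagrees with the fewest read alleles (the minimum error correction, or MEC, criterion). This is a brute-force illustration that only works for a handful of sites; it is not the algorithm used by any of the tools listed above, and the function names and the read encoding (a dict from site index to allele 0/1) are my own assumptions.

```python
from itertools import product


def mec_score(hap, reads):
    """Number of read alleles that must be flipped so that every read
    matches either `hap` or its complement."""
    comp = tuple(1 - a for a in hap)
    total = 0
    for read in reads:
        mismatches_hap = sum(1 for site, allele in read.items() if hap[site] != allele)
        mismatches_comp = sum(1 for site, allele in read.items() if comp[site] != allele)
        total += min(mismatches_hap, mismatches_comp)
    return total


def phase_brute_force(reads, n_sites):
    """Try every haplotype (first allele fixed to 0 to break the
    hap/complement symmetry) and return (best_haplotype, best_score)."""
    best = None
    for hap in product((0, 1), repeat=n_sites):
        if hap[0] != 0:
            continue
        score = mec_score(hap, reads)
        if best is None or score < best[1]:
            best = (hap, score)
    return best


# Toy reads over 4 heterozygous sites; the last read contains one error.
reads = [
    {0: 0, 1: 1},
    {1: 1, 2: 1},
    {2: 1, 3: 0},
    {0: 1, 1: 0},
    {1: 0, 2: 0, 3: 1},
    {2: 0, 3: 0},
]

haplotype, errors = phase_brute_force(reads, 4)
print(haplotype, errors)
```

The search space doubles with every additional site, which is why practical tools rely on heuristics or graph-based optimization instead of enumeration. Polyploids make the problem harder still, since there are more than two haplotypes to separate.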

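To tackle the polyploid problem, the authors propose two strategies:

  • Genome material: choose haploid material where possible, or sequence the diploid progenitors first
  • Analysis workflow: third-generation (long-read) sequencing, BioNano, Hi-C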

Finally, the authors summarize the online resources currently available for plants:

| DB name | Resources | Plants | URL |
|---|---|---|---|
| GenBank | Genomic | Various plant species | https://www.ncbi.nlm.nih.gov/genbank/ |
| EMBL | Genomic | Various plant species | https://www.ebi.ac.uk/ |
| DDBJ | Genomic | Various plant species | http://www.ddbj.nig.ac.jp/ |
| UniProt | Protein and functional | Various plant species | http://www.uniprot.org/ |
| NCBI | Genomic | Various plant species | https://www.ncbi.nlm.nih.gov/ |
| GOLD | Genomic, metagenomics, transcriptomic | Various plant species | https://gold.jgi.doe.gov/cgi-bin/GOLD/bin/gold.cgi |
| Phytozome | Genomic | 92 assembled and annotated plant species | https://phytozome.jgi.doe.gov/pz/portal.html |
| PlantGDB | Genomic, transcriptomic | 27 assembled and annotated plant species | http://www.plantgdb.org/ |
| Sol Genomics Network | Genomic | 11 Solanaceae species | https://solgenomics.net/ |
| Gramene | Genomic, genetic markers, QTLs | 53 plant species | http://www.gramene.org/ |
| MaizeGDB | Genomic, annotations, tool host | Zea mays | https://www.maizegdb.org/ |
| TAIR | Genetic and molecular biology data | Arabidopsis thaliana | https://www.arabidopsis.org/ |
| CottonGen | Genomic, genetic and breeding resources | 49 Gossypium species | https://www.cottongen.org/ |
| PLEXdb | Gene expression | 14 plant species | http://www.plexdb.org/ |
| RiceXPro | Gene expression | Oryza sativa | http://ricexpro.dna.affrc.go.jp/ |
| CerealsDB | Genetic markers | Triticum aestivum | http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/indexNEW.php |
| PeanutBase | Genome, MAS, QTLs, germplasm | Arachis hypogaea | https://peanutbase.org/ |
| SoyKB | Genetic markers, genomic resources | Glycine max | http://soykb.org/ |
| SoyBase | Genetic markers, QTLs, genomic resources | G. max | https://soybase.org/ |
| PGDBj | Genetic markers, QTLs, genomic resources | 80 plant species | http://pgdbj.jp/ |
| SNP-Seek | Genotype, phenotype and variety information | O. sativa | http://snp-seek.irri.org/ |
| GrainGenes | Genome, genetic markers, QTLs, genomic resources | T. aestivum, Hordeum vulgare, Secale cereale, Avena sativa etc. | https://wheat.pw.usda.gov/GG3/ |
| ASRP | Small RNA | A. thaliana | http://asrp.danforthcenter.org/ |
| CSRDB | Small RNA | Z. mays | http://sundarlab.ucdavis.edu/smrnas/ |
| Brassica.info | Genomic | 7 Brassica species | http://brassica.info/ |
| BRAD | Genomics, genetic markers and maps | Brassica | http://brassicadb.org/brad/ |
| Ensembl Plants | Genomic | 45 plant species | http://plants.ensembl.org/index.html |
| Ipomoea Genome Hub | Genomic, EST | Ipomoea batatas | https://ipomoea-genome.org/ |
| PGSC | Genomic, annotation | S. tuberosum, S. chacoense | http://solanaceae.plantbiology.msu.edu/pgsc_download.shtml |
| GDR | Genomics, genetics, breeding | Rosaceae | https://www.rosaceae.org/analysis/266 |
| HWG | Genomics, transcriptomics, genetic markers | Forest trees and woody plants | https://www.hardwoodgenomics.org/ |
