Scholar Data

Type 2 Diabetes gut metagenome (microbiome) data from 368 Chinese samples and updated metagenome gene catalog

We provide data from the sequenced and analyzed gut metagenomes of 368 Chinese individuals, comprising patients with Type 2 Diabetes (T2D) and healthy controls, used in a newly developed two-stage Metagenome-Wide Association Study (MWAS) aimed at identifying associations between gut microbiota and T2D. The data include an updated metagenome gene catalog, metagenome assemblies, genetic and functional markers associated with T2D, and a novel form of marker, the Metagenomic Linkage Group (MLG), which allows species-level taxonomic analyses without the need to identify, isolate, and culture novel bacterial species.

Bacterial DNA was extracted from faecal samples and subjected to unbiased whole-genome shotgun (WGS) sequencing using the Illumina GAIIx and HiSeq2000. In summary, 145 samples were sequenced in stage I, generating 378.4 Gb of paired-end (PE) sequence data; 200 samples were sequenced in stage II, generating 830.8 Gb of PE sequence data; and 23 additional samples were sequenced for a validation study on gut-microbiota-based T2D classification. By combining MetaHIT data with our stage I sequence data, we constructed an updated human gut microbial gene catalogue containing 4.3 million genes, suitable for studies on Chinese individuals. In this updated gene catalogue, 21.3%, 47.1% and 60.9% of genes could be assigned to known genera, the KEGG database and the eggNOG database, respectively. Taking this gene catalogue as a reference, we identified gene, KO and OG markers that appeared to be strongly associated with T2D in our two-stage case-control study, as well as 47 MLGs. These MLGs were further assembled into different contig sets, which could provide useful genetic, functional, and taxonomic information for the relevant bacteria.
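As a quick consistency check on the figures in this description, the short Python sketch below sums the per-stage sample counts and sequence volumes. The dictionary layout and variable names are illustrative assumptions, not part of the dataset; the description does not report a sequence volume for the 23 validation samples, so that value is left unset.

```python
# Per-stage figures copied from the dataset description.
# The structure and names here are illustrative only.
stages = {
    "stage I": {"samples": 145, "gb": 378.4},
    "stage II": {"samples": 200, "gb": 830.8},
    "validation": {"samples": 23, "gb": None},  # volume not reported
}

# Total number of sequenced samples across all three groups.
total_samples = sum(s["samples"] for s in stages.values())

# Total paired-end data for the stages that report a volume.
reported_gb = round(
    sum(s["gb"] for s in stages.values() if s["gb"] is not None), 1
)

# Mean sequence volume per sample, for stages with a reported volume.
mean_gb_per_sample = {
    name: round(s["gb"] / s["samples"], 2)
    for name, s in stages.items()
    if s["gb"] is not None
}

print(total_samples, reported_gb, mean_gb_per_sample)
```

The sample counts add up to the 368 individuals named in the title, and the stage II samples were sequenced more deeply on average than the stage I samples.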

Authors

Li, Shenghui ;
Guan, Yuanlin ;
Zhang, Wenwei ;
Zhang, Fan ;
Cai, Zhiming ;
Wu, Wenxian ;
Zhang, Dongya ;
Jie, Zhuye ;
Liang, Suisha ;
Shen, Dongqian ;
Qin, Youwen ;
Xu, Ran ;
Wang, Mingbang ;
Gong, Meihua ;
Yu, Jing ;
Zhang, Yanyan ;
Han, Lingchuan ;
Lu, Donghui ;
Wu, Peixian ;
Dai, Yali ;
Sun, Xiaojuan ;
Li, Zesong ;
Tang, Aifa ;
Zhong, Shilong ;
Li, Xiaoping ;
Chen, Weineng ;
Zhang, Ming ;
Zhang, Zhaoxi ;
Chen, Hua ;
Qin, Junjie ;
Li, Yingrui ;
Wang, Jun

3 Citations | 0 Mentions | 31% FAIR | 2.0 Dataset Index

10.5524/100036 | January 2012

Updated genome assembly of YH: the first diploid genome sequence of a Han Chinese individual (version 2, 07/2012)

Updated genomic data for YH (Homo sapiens), the first diploid genome sequenced from a Han Chinese individual, a representative of the Asian population. The genomic DNA used in this study came from an anonymous male Han Chinese individual who has no known genetic diseases.

The original version of the YH genome was assembled from 3.3 billion reads generated on the Illumina Genome Analyzer (see dataset doi:10.5524/100015). This latest (as of 07/2012) and improved version of the YH genome was assembled from 2.1 billion reads generated on the Illumina HiSeq2000. A total of 202 G nucleotides of data was generated using 100 bp paired-end reads with insert sizes ranging from 180 bp to 40 kbp, and the genome was sequenced to 67.5-fold average coverage. The latest version of SOAPdenovo2 was used to reassemble, improve and update the previously assembled genome (tools and pipelines available here: doi:10.5524/100044). By aligning the short reads with SOAP, 177 G nucleotides were mapped onto the NCBI reference genome and 99.99% of the genome was covered. The raw sequences, assemblies and relevant tools are released for public use under a CC0 license.

More information about the YH genome can be viewed at: http://yh.genomics.org.cn/
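Average fold coverage is the total number of sequenced bases divided by the genome size. The description does not state the genome size used for the 67.5-fold figure, so the Python sketch below simply back-calculates the size implied by the reported numbers and the fraction of bases that mapped. It assumes "G nucleotides" means 10^9 bases; the variable names are illustrative.

```python
# Figures copied from the dataset description; "G" is assumed to mean 1e9 bases.
total_bases = 202e9      # total sequenced nucleotides
mapped_bases = 177e9     # nucleotides mapped onto the NCBI reference
fold_coverage = 67.5     # reported average coverage

# coverage = total_bases / genome_size, so the implied genome size is:
implied_genome_size = total_bases / fold_coverage

# Share of the sequenced bases that aligned to the reference.
mapped_fraction = mapped_bases / total_bases

print(f"implied genome size: {implied_genome_size / 1e9:.2f} Gb")
print(f"mapped fraction: {mapped_fraction:.1%}")
```

The implied size of roughly 3 Gb is in line with the usual size of a human genome, which suggests the reported yield and coverage are mutually consistent.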

Authors

Wang, Jun ;
Li, Yingrui ;
Luo, R ;
Liu, B ;
Xie, Y ;
Li, Zhuo ;
Fang, Xiaodong ;
Zheng, Hancheng ;
Qin, Junjie ;
Yang, Bin ;
Yu, C ;
Ni, Peixiang ;
Li, Ning ;
Guo, Guangwu ;
Ye, Jia ;
Fang, Lin ;
Su, Yeyang ;
Asan ;
Zheng, Hongkun ;
Kristiansen, Karsten ;
Wong, Gane Ka-Shu ;
Nielsen, Rasmus ;
Durbin, Richard ;
Bolund, Lars ;
Zhang, Xiuqing ;
Li, Songgang ;
Yang, Huanming ;
Wang, Jian

6 Citations | 0 Mentions | 31% FAIR | 3.3 Dataset Index

10.5524/100038 | January 2012

Genomic data from Escherichia coli O104:H4 isolate TY-2482

The May 2011 outbreak of an E. coli infection in Europe resulted in serious concerns about the potential appearance of a new deadly strain of bacteria, Escherichia coli O104:H4 TY-2482. In response to this situation, and immediately after the reports of deaths, the University Medical Centre Hamburg-Eppendorf and BGI-Shenzhen worked together to sequence the bacterium and assess its human health risk.

The bacterium's genome was first sequenced using Life Technologies' Ion Torrent sequencing platform. According to the results of the draft assembly, the estimated genome size of this new E. coli strain is about 5.2 Mb. Sequence analysis indicated that this bacterium is an EHEC serotype O104 E. coli strain. Comparative analysis showed that it has 93% sequence similarity with the EAEC 55989 E. coli strain, which was isolated in the Central African Republic and is known to cause serious diarrhea. This strain of E. coli, however, has also acquired specific sequences that appear to be similar to those involved in the pathogenicity of hemorrhagic colitis and hemolytic-uremic syndrome. The acquisition of these genes may have occurred through horizontal gene transfer.

To maximize its utility to the research community and aid those fighting the epidemic, this genomic data was released into the public domain under a CC0 license.

To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to genomic data from the 2011 E. coli outbreak. This work is published from China.

Authors

Li, Dongfang ;
Xi, Feng ;
Zhao, Meiru ;
Chen, Wentong ;
Cao, S ;
Xu, R ;
Wang, G ;
Wang, J ;
Zhang, Zhaoxi ;
Li, Yin ;
Cui, C ;
Chang, C ;
Cui, C ;
Luo, Y ;
Qin, Junjie ;
Li, Shenghui ;
Li, Junhua ;
Peng, Yangqing ;
Pu, Fei ;
Sun, Y ;
Chen, Y ;
Zong, Y ;
Ma, X ;
Yang, Xianwei ;
Cen, Zhong ;
Song, Yajun ;
Zhao, Xiangna ;
Chen, F ;
Yin, X ;
Rohde, Holger ;
Liang, Y ;
Li, Yingrui ;
The Escherichia coli O104:H4 TY-2482 Isolate Genome Sequencing Consortium

13 Citations | 0 Mentions | 31% FAIR | 6.7 Dataset Index

10.5524/100001 | January 2011

Genome sequence of YH: the first diploid genome sequence of a Han Chinese individual.

Genomic data for YH (Homo sapiens), the first diploid genome sequence of a Han Chinese individual, a representative of the Asian population. The genomic DNA used in this study came from an anonymous male Han Chinese individual who has no known genetic diseases.

The YH genome was assembled from 3.3 billion reads generated on the Illumina Genome Analyzer. We obtained 117.7 G nucleotides of data, and the genome was sequenced to 36-fold average coverage. By aligning the short reads with SOAP, 102.9 G nucleotides were mapped onto the NCBI reference genome and 99.97% of the genome was covered. The raw sequences, alignments, consensus genome, variants and relevant tools are released for public use under a CC0 license.
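The read count and total yield reported here also give a rough mean read length. The Python sketch below is only a back-of-the-envelope calculation: it assumes "G nucleotides" means 10^9 bases and that the reported total comes entirely from the 3.3 billion reads; the variable names are illustrative.

```python
# Figures copied from the dataset description; "G" is assumed to mean 1e9 bases.
total_reads = 3.3e9     # number of reads
total_bases = 117.7e9   # total sequenced nucleotides

# Mean read length in base pairs, assuming all bases come from these reads.
mean_read_length = total_bases / total_reads

print(f"mean read length: {mean_read_length:.1f} bp")
```

A mean of roughly 36 bp per read is typical of the short reads produced by early Illumina Genome Analyzer runs.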

Authors

Wang, Jun ;
Wang, Wei ;
Li, Ruiqiang ;
Li, Yingrui ;
Tian, Geng ;
Goodman, Laurie ;
Fan, Wei ;
Zhang, Junqing ;
Li, Jun ;
Zhang, Juanbin ;
Guo, Yiran ;
Feng, Binxiao ;
Li, Heng ;
Lu, Yao ;
Fang, Xiaodong ;
Liang, Huiqing ;
Du, Zhenglin ;
Li, Dong ;
Zhao, Yiqing ;
Hu, Yujie ;
Yang, Zhenzhen ;
Zheng, Hancheng ;
Hellmann, Ines ;
Inouye, Michael ;
Pool, John ;
Yi, Xin ;
Zhao, Jing ;
Duan, Jinjie ;
Zhou, Yan ;
Qin, Junjie ;
Ma, Lijia ;
Li, Guoqing ;
Yang, Zhentao ;
Zhang, Guojie ;
Yang, Bin ;
Yu, Chang ;
Liang, Fang ;
Li, Wenjie ;
Li, Shaochuan ;
Li, Dawei ;
Ni, Peixiang ;
Ruan, Jue ;
Li, Qibin ;
Zhu, Hongmei ;
Liu, Dongyuan ;
Lu, Zhike ;
Li, Ning ;
Guo, Guangwu ;
Zhang, Jianguo ;
Ye, Jia ;
Fang, Lin ;
Hao, Qin ;
Chen, Quan ;
Liang, Yu ;
Su, Yeyang ;
Asan ;
Ping, Cuo ;
Yang, Shuang ;
Chen, Fang ;
Li, Li ;
Zhou, Ke ;
Zheng, Hongkun ;
Ren, Yuanyuan ;
Yang, Ling ;
Gao, Yang ;
Yang, Guohua ;
Li, Zhuo ;
Feng, Xiaoli ;
Kristiansen, Karsten ;
Wong, Gane Ka-Shu ;
Nielsen, Rasmus ;
Durbin, Richard ;
Bolund, Lars ;
Zhang, Xiuqing ;
Li, Songgang ;
Yang, Huanming ;
Wang, Jian

2 Citations | 0 Mentions | 31% FAIR | 1.5 Dataset Index

10.5524/100015 | January 2011

Automated Author Profile
Qin, Junjie

