Automated Author Profile

Wang, Jun

0000-0002-1422-3331

Current S-Index

106.7

Sum of Dataset Indices for all datasets

Average Dataset Index per Dataset

1.9

Average Dataset Index per dataset

Total Datasets

57

Total datasets for this author
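
The summary figures above appear to be related by simple arithmetic. The following is a minimal sketch, assuming that the "Average Dataset Index per Dataset" is the S-Index (described as the sum of Dataset Indices) divided by the total number of datasets. The variable names are illustrative and are not taken from the profile itself.

```python
# Values copied from the profile summary above.
s_index = 106.7       # "Sum of Dataset Indices for all datasets"
total_datasets = 57   # "Total datasets for this author"

# Assumption: the average is the plain arithmetic mean of the
# per-dataset indices, i.e. their sum divided by the dataset count.
average_dataset_index = s_index / total_datasets

print(round(average_dataset_index, 1))  # prints 1.9
```

Under that assumption the result rounds to the 1.9 reported in the profile.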

Average FAIR Score

31.4%

Average FAIR Score per dataset

Total Citations

169

Total citations to the author's datasets

Total Mentions

2

Total mentions of the author's datasets

S-Index Interpretation

S-Index Over Time

Cumulative Citations Over Time

Cumulative Mentions Over Time

Datasets

Single-cell sequencing data using DOP-PCR, MDA and MALBAC whole genome amplification methods.

Single-cell sequencing (SCS) provides many biomedical advances but currently relies on whole-genome amplification (WGA). Three methods are commonly used for WGA: multiple displacement amplification (MDA), degenerate oligonucleotide-primed PCR (DOP-PCR) and multiple annealing and looping-based amplification cycles (MALBAC). Here we systematically compare the advantages, disadvantages and performance of each method. To evaluate the SCS performance of commonly used WGA methods, we performed single-cell WGA using six commercial kits based on DOP-PCR and MDA and then performed whole genome sequencing (WGS) of the successfully amplified DNA. A total of 36 single cells were collected in our study, 19 from a lymphoblastoid cell line (YH cell line) and the rest from a widely known gastric cancer cell line, BGC823. Corresponding pooled DNA was extracted as a non-amplified control. The BGC823 cell line was provided by Professor Youyong Lv at Beijing Cancer Hospital. All samples and experimental protocols were approved by the Institutional Review Board of BGI-Shenzhen.

Authors

  • Hou, Y ;
  • Wu, K ;
  • Shi, X ;
  • Li, F ;
  • Song, L ;
  • Wu, H ;
  • Dean, M ;
  • Li, G ;
  • Tsang, S ;
  • Jiang, R ;
  • Zhang, X ;
  • Li, B ;
  • Liu, G ;
  • Bedekar, N ;
  • Lu, N ;
  • Xie, G ;
  • Liang, H ;
  • Wang, T ;
  • Chen, J ;
  • Li, Y ;
  • Zhang, X ;
  • Yang, H ;
  • Xu, X ;
  • Wang, L ;
  • Wang, Jun
2 Citations | 0 Mentions | 31% FAIR | 1.3 Dataset Index
10.5524/100115 (2015)

Supporting data for the dynamics and stabilization of the Human gut microbiome during the first year of life.

Here we performed metagenomic shotgun sequencing on fecal samples from 98 full-term Swedish infants (newborn, 4 months and 12 months old) and their mothers, assembled gut microbial genomes and constructed reference gene catalogs from the cohort. We generated 1.52 Tb of high-quality paired-end reads (average 3.99 Gb per sample). A gene catalog was constructed for each time point based on de novo assembly and metagenomic gene prediction, and functionally annotated using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. We also assembled a total of 4,356 microbial genomes (>0.9 MB) de novo by binning assembled contigs according to abundance variations across samples. These de novo assembled genomes were complemented by 1,147 genomes from the National Center for Biotechnology Information (NCBI) Bacteria/Archaea genome database. All genomes were subsequently clustered into 690 unique metagenomic operational taxonomic units (MetaOTUs) that were equivalent to species-level classifications. Of these, 373 were annotated to species; the remaining 317 represent novel species related to known species. We constructed the MetaOTU profile by mapping reads to our MetaOTU sequences.

Authors

  • Backhed, Fredrik ;
  • Roswall, Josefine ;
  • Peng, Yangqing ;
  • Feng, Qiang ;
  • Jia, Huijue ;
  • Kovatcheva-Datchary, Petia ;
  • Li, Yin ;
  • Xia, Yan ;
  • Xie, Hailiang ;
  • Zhong, Huanzi ;
  • Khan, Muhammad, Tanweer ;
  • Zhang, Jianfeng ;
  • Li, Junhua ;
  • Xiao, Liang ;
  • Al-Aama, Jumana, Yousuf ;
  • Zhang, Dongya ;
  • Lee, Ying, Shiuan ;
  • Kotowska, Dorota ;
  • Colding, Camilla ;
  • Tremaroli, Valentina ;
  • Yin, Ye ;
  • Bergman, Stefan ;
  • Madsen, Lise ;
  • Kristiansen, Karsten ;
  • Dahlgren, Jovanna ;
  • Wang, Jun
1 Citation | 0 Mentions | 31% FAIR | 1.0 Dataset Index
10.5524/100145 (2015)

Nontargeted metabolomics and lipidomics HPLC-MS data from maternal plasma of 180 healthy pregnant women.

Metabolic variations occur during normal pregnancy to provide the growing fetus with a supply of nutrients required for its development and to ensure the health of the woman during gestation. Mass spectrometry-based metabolomics was employed to study the metabolic phenotype variations in the maternal plasma that are induced by pregnancy in each of its three trimesters.
Here we provide the LC-MS data from 180 healthy pregnant women. Each individual was followed up to term to confirm a normal term pregnancy and a healthy baby. All volunteers gave written consent and filled out an individual questionnaire at the time of sample collection.
The samples were divided into six sub-groups according to the gestational week at the time of sampling: T1 (n=30, 9-12 weeks), T2 (n=30, 13-16 weeks), T3 (n=30, 17-20 weeks), T4 (n=30, 21-24 weeks), T5 (n=30, 25-28 weeks), and T6 (n=30, 29-40 weeks). Body mass index (BMI), age, and gestational week were recorded for each individual.
The repository contains data in 3 modalities: positive and negative ion 'global' non-targeted LC-MS and shotgun lipidomics (including carnitine profiling) LC-MS.

Authors

  • Luan, Hemi ;
  • Meng, Nan ;
  • Liu, Ping ;
  • Feng, Qiang ;
  • Lin, Shuhai ;
  • Fu, Jin ;
  • Chen, Xiaomin ;
  • Weiqiao, Rao ;
  • Chen, Fang ;
  • Jiang, Hui ;
  • Xu, Xun ;
  • Cai, Zongwei ;
  • Wang, Jun
6 Citations | 0 Mentions | 31% FAIR | 2.8 Dataset Index
10.5524/100108 (2015)

Genomic data from the Tibetan Plateau frog (Nanorana parkeri).

Nanorana parkeri (also known as the high Himalaya frog, Xizang Plateau frog, Parker's slow frog, or mountain slow frog) is a common frog living across the Tibetan Plateau. It occurs at elevations ranging from 2,850 to 5,000 m. Because this species lives at such high elevations, it provides an excellent additional biological model for studying the adaptations of frogs to extreme conditions.
A female frog was collected from the Qinghai-Tibetan Plateau at an elevation of 4,900 m, and genomic DNA was extracted from muscle tissue. Paired-end DNA libraries with different insert sizes (170 bp to 20 kb) were sequenced on the Illumina HiSeq 2000 platform. After filtering steps to remove artificial duplication, adapter contamination and low-quality reads, 190 Gbp of high-quality data (83× genome coverage) was obtained. This was assembled using SOAPdenovo and SSPACE, producing a final draft assembly of 2.0 Gb with a scaffold N50 of 1.05 Mb. More than 20,000 genes were predicted. The Nanorana parkeri genome should help offer new insights into amphibian evolution and Tibetan high-altitude adaptation.

Authors

  • Liu, Shiping ;
  • Xiong, Zijun ;
  • Zhang, Xueyan ;
  • Zhang, Guojie ;
  • Sun, Yanbo ;
  • Che, Jing ;
  • Zhang, Yaping ;
  • Wang, Jun
3 Citations | 0 Mentions | 31% FAIR | 1.8 Dataset Index
10.5524/100132 (2015)

A Catalogue of the Mouse Gut Metagenome

To increase the value of mouse model studies, we have used HiSeq 2000-based whole genome sequencing to establish a catalogue of 2.6 million non-redundant microbial genes derived from 1,130 gigabases of microbial sequences from faecal samples of 184 mice of different strains and from different providers and housing laboratories. More than 99% of the genes are bacterial, indicating that the mouse gut microbiota comprises at least 800-900 prevalent bacterial species. This reference gene catalogue was annotated against the Non-redundant protein sequences (NR), Kyoto Encyclopedia of Genes and Genomes (KEGG), and evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) databases.

Authors

  • Xiao, Liang ;
  • Feng, Qiang ;
  • Liang, Suisha ;
  • Sonne, Si Brask ;
  • Xia, Zhongkui ;
  • Qiu, Xinmin ;
  • Li, Xiaoping ;
  • Long, Hua ;
  • Zhang, Jianfeng ;
  • Zhang, Dongya ;
  • Liu, Chuan ;
  • Fang, Zhiwei ;
  • Chou, Joyce ;
  • Glanville, Jacob ;
  • Hao, Qin ;
  • Kotowska, Dorota ;
  • Colding, Camilla ;
  • Licht, Tine, Rask ;
  • Wu, Donghai ;
  • Yu, Jun ;
  • Sung, Joseph, Jao Yiu ;
  • Liang, Qiaoyi ;
  • Li, Junhua ;
  • Jia, Huijue ;
  • Lan, Zhou ;
  • Tremaroli, Valentina ;
  • Backhed, Fredrik ;
  • Doré, Joel ;
  • Le Chatelier, Emmanuelle ;
  • Ehrlich, S.Dusko ;
  • Lin, John, C ;
  • Arumugam, Manimozhiyan ;
  • Wang, Jun ;
  • Madsen, Lise ;
  • Kristiansen, Karsten
4 Citations | 0 Mentions | 31% FAIR | 2.4 Dataset Index
10.5524/100114 (2015)

Genome sequence of a Mongolian individual

We present the genome sequence of a Mongolian male individual. The genome is assembled using short reads produced by massively parallel sequencing, resulting in 130.8-fold genome coverage. We identify high-confidence variation sets validated by chip genotyping and PCR-Sanger sequencing, including 3.7 million single nucleotide polymorphisms and 756,234 short insertions and deletions. We assign the paternal inheritance of the individual to the lineage D3a through Y haplogroup analysis and infer that Genghis Khan had a common paternal ancestor with Tibeto-Burman populations. We investigate the gene flow between Mongolians and other ethnic groups and demonstrate that the genetic influences on them most likely resulted from the expansion of the Mongol Empire. The Mongolian genome lays a foundation for further understanding human evolution and exploring population-specific genetic causes of diseases and traits in Mongolians and closely related groups.

Authors

  • Bai, Haihua ;
  • Guo, Xiaosen ;
  • Yang, Zili ;
  • Narisu, Narisu ;
  • Bu, Junjie ;
  • Jirimutu, Jirimutu ;
  • Liang, Fan ;
  • Zhao, Xiang ;
  • Xing, Yanping ;
  • Wang, Dingzhu ;
  • Li, Tongda ;
  • Zhang, Yanru ;
  • Guan, Baozhu ;
  • Yang, Xukui ;
  • Zhang, Dong ;
  • Shuangshan, Shuangshan ;
  • Su, Zhe ;
  • Wu, Huiguang ;
  • Li, Wenjing ;
  • Chen, Ming ;
  • Zhu, Shilin ;
  • Bayinnamula, Bayinnamula ;
  • Chang, Yuqi ;
  • Gao, Ying ;
  • Lan, Tianming ;
  • Suyalatu, Suyalatu ;
  • Li, Wenqi ;
  • Yang, Xu ;
  • Chen, Yujie ;
  • Feng, Qiang ;
  • Wang, Jian ;
  • Yang, Huanming ;
  • Wang, Jun ;
  • Wu, Qizhu ;
  • Yin, Ye ;
  • Zhou, Huanmin
1 Citation | 0 Mentions | 31% FAIR | 1.1 Dataset Index
10.5524/100104 (2014)

Genomic data of the domestic goat (Capra hircus).

The domestic goat is one of the most important livestock species in the world, especially in China, India and other developing countries. Goats serve as an important source of meat, milk, fiber and pelts, have fulfilled agricultural, economic, cultural and even religious roles from very early times in human civilization, and are now also used as animal models for biomedical research and for transgenic production of protein medicines. Here we share all of the goat genome data. We hope the goat genome sequence can provide a new resource for biological research and for the breeding of goats and other small ruminants.
We sequenced the 2.92 Gb genome to a depth of approximately 65.6× with short reads from a series of libraries with various insert sizes (170 bp, 350 bp, 800 bp, 2 kb, 5 kb, 10 kb and 20 kb) on a HiSeq 2000 sequencer.
The assembled scaffolds of high-quality sequences total 191.5 Gb, with contig and scaffold N50 values of 18.7 kb and 2.21 Mb respectively. We identified 22,175 protein-coding genes. In addition, we provide the restriction-enzyme fragment maps derived from the whole genome mapping (WGM) technology developed by the Argus System (method described in this paper).
Scaffolds derived from de novo assembly of next-generation sequencing data are converted into restriction maps by in silico restriction enzyme digestion. The distances between restriction enzyme sites in the sequencing-derived scaffolds are then matched to the lengths of the optical fragments in the single-molecule WGM restriction maps. Matches allow the scaffolds to be extended and linked into super-scaffolds.

Authors

  • Dong, Yang ;
  • Xie, Min ;
  • Jiang, Yu ;
  • Xiao, Nianqing ;
  • Du, Xiaoyong ;
  • Zhang, Wenguang ;
  • Tosser-Klopp, Gwenola ;
  • Wang, Jinhuan ;
  • Yang, Shuang ;
  • Liang, Jie ;
  • Chen, Wenbin ;
  • Chen, Jing ;
  • Zeng, Peng ;
  • Hou, Yong ;
  • Bian, Chao ;
  • Pan, Shengkai ;
  • Li, Yuxiang ;
  • Liu, Xin ;
  • Wang, Wenliang ;
  • Servin, Bertrand ;
  • Sayre, Brian ;
  • Zhu, Bin ;
  • Sweeney, Deacon ;
  • Moore, Rich ;
  • Nie, Wenhui ;
  • Shen, Yongyi ;
  • Zhao, Ruoping ;
  • Zhang, Guojie ;
  • Li, Jinquan ;
  • Faraut, Thomas ;
  • Womack, James ;
  • Zhang, Yaping ;
  • Kijas, James ;
  • Cockett, Noelle, E ;
  • Xu, Xun ;
  • Zhao, Shuhong ;
  • Wang, Jun ;
  • Wang, Wen
4 Citations | 0 Mentions | 31% FAIR | 2.4 Dataset Index
10.5524/100082 (2014)

Genomic data of the soft shell turtle (Pelodiscus sinensis).

The soft shell turtle can reach a carapace length of 1 ft (0.30 m). It has webbed feet for swimming. It is called "softshell" because its carapace lacks horny scutes (scales). The carapace is leathery and pliable, particularly at the sides. The species is commercially farmed in vast numbers for the food trade.
DNA from the soft shell turtle was collected in Japan. We sequenced the 2.21 Gb genome to a depth of approximately 105.6× with short reads from a series of libraries with various insert sizes (170 bp, 500 bp, 800 bp, 2 kb, 5 kb, 10 kb, 20 kb and 40 kb) on a HiSeq 2000 sequencer.
The assembled scaffolds of high-quality sequences total 221.7 Gb, with contig and scaffold N50 values of 21.9 kb and 3.33 Mb respectively. We identified 19,327 protein-coding genes with a mean length of ~1,500 bp. Experimental procedures and animal care were conducted in strict accordance with guidelines approved by the RIKEN Animal Experiments Committee (Approval IDs H14-23 and H16-10).

Authors

  • Wang, Zhuo ;
  • Pascual-Anaya, Juan ;
  • Zadissa, Amonida ;
  • Li, Wenqi ;
  • Niimura, Yoshihito ;
  • Huang, Zhiyong ;
  • Li, Chunyi ;
  • White, Simon ;
  • Xiong, Zhiqiang ;
  • Fang, Dongming ;
  • Wang, Bo ;
  • Ming, Yao ;
  • Chen, Yan ;
  • Zheng, Yuan ;
  • Kuraku, Shigehiro ;
  • Pignatelli, Miguel ;
  • Herrero, Javier ;
  • Beal, Kathryn ;
  • Nozawa, Masafumi ;
  • Li, Qiye ;
  • Wang, Juan ;
  • Zhang, Hongyan ;
  • Yu, Lili ;
  • Shigenobu, Shuji ;
  • Wang, Junyi ;
  • Liu, Jiannan ;
  • Flicek, Paul ;
  • Searle, Steve ;
  • Wang, Jun ;
  • Kuratani, Shigeru ;
  • Yin, Ye ;
  • Aken, Bronwen ;
  • Zhang, Guojie ;
  • Irie, Naoki
1 Citation | 0 Mentions | 31% FAIR | 1.1 Dataset Index
10.5524/100086 (2014)

Genomic data of the green sea turtle (Chelonia mydas).

Green turtles are long-lived and may take up to 59 years to reach sexual maturity. Undertaking tremendous feats of navigation, adults return to the same beach to breed each season.
DNA from the green sea turtle was collected in Hong Kong. We sequenced the 2.24 Gb genome to a depth of approximately 82.3× with short reads from a series of libraries with various insert sizes (170 bp, 500 bp, 800 bp, 2 kb, 5 kb, 10 kb, 20 kb and 40 kb) on a HiSeq 2000 sequencer.
The assembled scaffolds of high-quality sequences total 180.94 Gb, with contig and scaffold N50 values of 20.4 kb and 3.78 Mb respectively. We identified 19,633 protein-coding genes with a mean length of 1,456 bp. Experimental procedures and animal care were conducted in strict accordance with guidelines approved by the RIKEN Animal Experiments Committee (Approval IDs H14-23 and H16-10).

Authors

  • Wang, Zhuo ;
  • Pascual-Anaya, Juan ;
  • Zadissa, Amonida ;
  • Li, Wenqi ;
  • Niimura, Yoshihito ;
  • Huang, Zhiyong ;
  • Li, Chunyi ;
  • White, Simon ;
  • Xiong, Zhiqiang ;
  • Fang, Dongming ;
  • Wang, Bo ;
  • Ming, Yao ;
  • Chen, Yan ;
  • Zheng, Yuan ;
  • Kuraku, Shigehiro ;
  • Pignatelli, Miguel ;
  • Herrero, Javier ;
  • Beal, Kathryn ;
  • Nozawa, Masafumi ;
  • Li, Qiye ;
  • Wang, Juan ;
  • Zhang, Hongyan ;
  • Yu, Lili ;
  • Shigenobu, Shuji ;
  • Wang, Junyi ;
  • Liu, Jiannan ;
  • Flicek, Paul ;
  • Searle, Steve ;
  • Wang, Jun ;
  • Kuratani, Shigeru ;
  • Yin, Ye ;
  • Aken, Bronwen ;
  • Zhang, Guojie ;
  • Irie, Naoki
1 Citation | 0 Mentions | 31% FAIR | 1.1 Dataset Index
10.5524/100085 (2014)

Supporting data for the paper: "An integrated catalog of reference genes in the human gut microbiome".

Here we sequenced 249 fecal samples from European adults, leading to a total of 760 samples in the Metagenome of the Human Intestinal Tract (MetaHIT) project. All 6.4TB whole-genome shotgun sequencing data from 1267 fecal samples in MetaHIT, the Human Microbiome Project (HMP) and our diabetes study on Chinese adults were processed with the MOCAT pipeline. The resulting gene catalogs were merged using CD-HIT and complemented with genes from 511 sequenced human gut-related prokaryotic genomes that were present in our gut metagenomes. The final high-quality integrated reference catalog of the human gut microbiome contains 9,879,896 non-redundant genes. The genes were phylogenetically annotated according to 3449 bacterial and archaeal genomes and draft genomes from NCBI, and functionally annotated using orthologous groups from the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) databases. In addition, 11 samples from the Chinese cohort were re-extracted using the MetaHIT DNA extraction protocol and shotgun-sequenced to compare with the original data generated by a slightly different DNA extraction protocol.

Authors

  • Li, Junhua ;
  • Jia, Huijue ;
  • Cai, Xianghang ;
  • Zhong, Huanzi ;
  • Feng, Qiang ;
  • Sunagawa, Shinichi ;
  • Arumugam, Manimozhiyan ;
  • Kultima, Jens, Roat ;
  • Prifti, Edi ;
  • Nielsen, Trine ;
  • Juncker, Agnieszka, Sierakowska ;
  • Manichanh, Chaysavanh ;
  • Chen, Bing ;
  • Zhang, Wenwei ;
  • Levenez, Florence ;
  • Wang, Juan ;
  • Xu, Xun ;
  • Xiao, Liang ;
  • Liang, Suisha ;
  • Zhang, Dongya ;
  • Zhang, Zhaoxi ;
  • Chen, Weineng ;
  • Zhao, Hailong ;
  • Al-Aama, Jumana, Yousuf ;
  • Edris, Sherif ;
  • Yang, Huanming ;
  • Wang, Jian ;
  • Hansen, Torben ;
  • Nielsen, Henrik, Bjorn ;
  • Brunak, Soren ;
  • Kristiansen, Karsten ;
  • Guarner, Francisco ;
  • Pedersen, Oluf ;
  • Doré, Joel ;
  • Ehrlich, S.Dusko ;
  • MetaHIT Consortium ;
  • Bork, Peer ;
  • Wang, Jun
8 Citations | 0 Mentions | 31% FAIR | 4.4 Dataset Index
10.5524/100064 (2014)