Published on 18 December 2020

Version 0.1.0

Additional information for manuscript entitled "Host-parasitoid associations in marine planktonic time series: can metabarcoding help reveal them?" (PONE-D-20-17825R1)

Käse, Laura; Neuhaus, Stefan

Description

This repository contains material to reproduce metabarcoding analyses based on the q-zip pipeline (https://github.com/PyoneerO/qzip). Raw fastq files can be downloaded from https://www.ebi.ac.uk/ena/browser/view/PRJEB37135. The reference file used can be downloaded from https://github.com/pr2database/pr2database/releases/tag/4.11.1; please select the files created for the classifier implemented in mothur. The dockerfile in this repository can be used to set up the environment, which includes installation of the required versions of the required tools.

Twelve different analyses were conducted. For each analysis, one zip file was created, containing the following files:

- q-zip_commands.sh: the shell script to launch the pipeline
- q-zip_parameters.txt: pipeline parameter file used as input to the shell script
- q-zip_workflow.log: log file containing stdout and stderr
- q-zip_seq_of_coms.txt: file containing each command executed during the pipeline run (minimal set of commands to reproduce the results)
- seq_number_stats.txt: file containing the sequence numbers at each filtering step
- OTU tables in tsv and biom format (sequences and taxonomic annotation included)
- Metadata map (here only including the raw file names)
- swarm sequences in fasta format

The following analyses were conducted. In each case, the named settings apply to the preceding sequence filtering and the subsequent taxonomic annotation:

- default settings: OTU formation at swarm distances 1, 2, 3, 5 and 10 (five analyses)
- relaxed settings: OTU formation at swarm distances 1, 2 and 3 (three analyses)
- strict settings: OTU formation at swarm distances 1, 2 and 3 (three analyses)
- very strict settings: OTU formation at swarm distance 1 (one analysis)

Settings in more detail:

relaxed settings:
- trimmomatic filtering: sliding window length of 3 bp; average quality threshold within the window of 5
- vsearch paired-end merging: minimum overlap length of 25 bp; 5 mismatches allowed
- cutadapt primer removal: primer-to-sequence overlap of 75%; 20% mismatches allowed
- vsearch eeMax filtering: maximum of 1 expected error per sequence
- minimum sequence length of 300 bp and maximum sequence length of 550 bp
- mothur classification cutoff (confidence threshold of the naive Bayesian classifier, NBC) of 0.6

default settings (used for the manuscript):
- trimmomatic filtering: sliding window length of 3 bp; average quality threshold within the window of 8
- vsearch paired-end merging: minimum overlap length of 50 bp; 5 mismatches allowed
- cutadapt primer removal: primer-to-sequence overlap of 90%; 10% mismatches allowed
- vsearch eeMax filtering: maximum of 0.25 expected errors per sequence
- minimum sequence length of 300 bp and maximum sequence length of 550 bp
- mothur classification cutoff (confidence threshold of the NBC) of 0.8

strict settings:
- trimmomatic filtering: sliding window length of 1 bp; average quality threshold within the window of 15
- vsearch paired-end merging: minimum overlap length of 50 bp; 0 mismatches allowed
- cutadapt primer removal: primer-to-sequence overlap of 90%; 10% mismatches allowed
- vsearch eeMax filtering: maximum of 0.1 expected errors per sequence
- minimum sequence length of 300 bp and maximum sequence length of 550 bp
- mothur classification cutoff (confidence threshold of the NBC) of 0.9

very strict settings:
- trimmomatic filtering: sliding window length of 1 bp; average quality threshold within the window of 15
- vsearch paired-end merging: minimum overlap length of 50 bp; 0 mismatches allowed
- cutadapt primer removal: primer-to-sequence overlap of 100%; 0% mismatches allowed
- vsearch eeMax filtering: maximum of 0.1 expected errors per sequence
- minimum sequence length of 300 bp and maximum sequence length of 550 bp
- mothur classification cutoff (confidence threshold of the NBC) of 0.9
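
To make the relationship between the four settings groups and the twelve analyses easier to check, the description above can be transcribed into a small Python sketch. The dictionary keys and function names are hypothetical labels chosen for illustration; they are not q-zip parameter names, and the actual parameter files are the q-zip_parameters.txt files in each zip archive. The expected-error helper uses the usual definition (sum of per-base error probabilities derived from Phred scores) and assumes that a read is kept when its expected errors do not exceed the threshold; the order in which the pipeline applies length and expected-error filters is not stated in the description.

```python
# Settings groups transcribed from the description; key names are hypothetical.
PRESETS = {
    "relaxed": {
        "trim_window_bp": 3, "trim_min_avg_quality": 5,
        "merge_min_overlap_bp": 25, "merge_max_mismatches": 5,
        "primer_min_overlap_pct": 75, "primer_max_mismatch_pct": 20,
        "max_expected_errors": 1.0,
        "min_length_bp": 300, "max_length_bp": 550,
        "classification_cutoff": 0.6,
    },
    "default": {
        "trim_window_bp": 3, "trim_min_avg_quality": 8,
        "merge_min_overlap_bp": 50, "merge_max_mismatches": 5,
        "primer_min_overlap_pct": 90, "primer_max_mismatch_pct": 10,
        "max_expected_errors": 0.25,
        "min_length_bp": 300, "max_length_bp": 550,
        "classification_cutoff": 0.8,
    },
    "strict": {
        "trim_window_bp": 1, "trim_min_avg_quality": 15,
        "merge_min_overlap_bp": 50, "merge_max_mismatches": 0,
        "primer_min_overlap_pct": 90, "primer_max_mismatch_pct": 10,
        "max_expected_errors": 0.1,
        "min_length_bp": 300, "max_length_bp": 550,
        "classification_cutoff": 0.9,
    },
    "very_strict": {
        "trim_window_bp": 1, "trim_min_avg_quality": 15,
        "merge_min_overlap_bp": 50, "merge_max_mismatches": 0,
        "primer_min_overlap_pct": 100, "primer_max_mismatch_pct": 0,
        "max_expected_errors": 0.1,
        "min_length_bp": 300, "max_length_bp": 550,
        "classification_cutoff": 0.9,
    },
}

# Swarm distances that were run with each settings group.
SWARM_DISTANCES = {
    "default": [1, 2, 3, 5, 10],
    "relaxed": [1, 2, 3],
    "strict": [1, 2, 3],
    "very_strict": [1],
}

# One (settings group, swarm distance) pair per analysis.
ANALYSES = [
    (preset, distance)
    for preset, distances in SWARM_DISTANCES.items()
    for distance in distances
]


def expected_errors(phred_scores):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_length_and_ee(phred_scores, preset):
    """Illustrative check of the length and expected-error limits only."""
    p = PRESETS[preset]
    length_ok = p["min_length_bp"] <= len(phred_scores) <= p["max_length_bp"]
    return length_ok and expected_errors(phred_scores) <= p["max_expected_errors"]
```

For example, a 400 bp merged read with a uniform Phred score of 30 has about 0.4 expected errors, so under these assumptions it would pass the relaxed limit but not the default, strict, or very strict limits.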

Metrics

Dataset Index: 0.7
FAIR Score: 69%
Citations: 0
Mentions: 0

Publication Details

DOI

Publisher: Zenodo

Assigned Domain

Subfield: Ecology
Field: Environmental Science
Domain: Physical Sciences
Confidence Score: 100%
Source: OpenAlex

Normalization Factors

FT: 30.77
CTw: 1.00
MTw: 1.00