Published on 01 January 2022

Bacillus Carbohydrate Metabolism Protvec model

Grigson, Susie; McKerral, Jody C.; Mitchell, James G.; Edwards, Robert

Description

Protvec model trained on 8,743 sequences from the Genome Taxonomy Database (GTDB). Sequences containing 'X', sequences shorter than 30 amino acids, and sequences longer than 1024 amino acids were removed. Training used a vector size of 100 and a context size of 25, producing a dictionary object that maps each 3-mer present in the training data to a 100-dimensional vector.
The model is stored as a .pkl file, which can be loaded with the Python pickle module.
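
The description above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes the pickled object is a plain dictionary keyed by 3-mer strings with a 100-element numeric vector as each value, that k-mers are taken as overlapping windows, and that a sequence is embedded by summing its 3-mer vectors. The function names and the file name used afterwards are hypothetical.

```python
import pickle
from typing import Dict, List, Sequence

MIN_LEN = 30    # sequences shorter than this were removed (per the description)
MAX_LEN = 1024  # sequences longer than this were removed (per the description)
K = 3           # k-mer size stated in the description


def passes_filter(seq: str) -> bool:
    """Return True if a sequence satisfies the filtering rules in the description."""
    return "X" not in seq and MIN_LEN <= len(seq) <= MAX_LEN


def kmers(seq: str, k: int = K) -> List[str]:
    """Return all overlapping k-mers of a sequence (overlap is an assumption)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def load_model(path: str) -> Dict[str, Sequence[float]]:
    """Load the pickled dictionary. Only unpickle files from a trusted source."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def embed(seq: str, model: Dict[str, Sequence[float]], dim: int = 100) -> List[float]:
    """Sum the vectors of every 3-mer found in the model (assumed pooling scheme)."""
    total = [0.0] * dim
    for kmer in kmers(seq):
        vec = model.get(kmer)
        if vec is None:
            continue  # skip 3-mers that were absent from the training data
        for i, value in enumerate(vec):
            total[i] += float(value)
    return total
```

With the downloaded file saved under a hypothetical name such as `protvec_model.pkl`, a call like `embed(seq, load_model("protvec_model.pkl"))` would return a 100-element list for any sequence that passes `passes_filter`. If the stored values turn out to be NumPy arrays, unpickling also requires NumPy to be installed.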

Metrics

Dataset Index: 1.8
FAIR Score: 13%
Citations: 1
Mentions: 3

Publication Details

DOI

Publisher: Flinders University

Assigned Domain

Subfield: Molecular Biology
Field: Biochemistry, Genetics and Molecular Biology
Domain: Life Sciences
Confidence Score: 46%
Source: Scholar Data Model

Keywords

60102 Bioinformatics; FOS: Computer and information sciences

Normalization Factors

FT: 30.77
CTw: 1.00
MTw: 1.00