DNA
Database-Based
k-mer
Kraken
2014
minikraken_20171019_4GB
minikraken_20171019_8GB
Timing data
Accuracy data
software and document
DNA
Database-Based
k-mer
Kraken 2
2020
Standard (RefSeq archaea, bacteria, viral, plasmid, human, UniVec_Core)
PlusPFP (Standard plus RefSeq protozoa, fungi, and plant)
nt Database (very large collection, inclusive of GenBank, RefSeq, TPA, and PDB)
software and document
DNA
Database-Based
k-mer
Centrifuge
2016
RefSeq: bacteria, archaea, viral, human (8 GB)
NCBI: nucleotide non-redundant sequences (64 GB)
software and document
DNA
Database-Based
k-mer
CLARK
2015
Downloaded using the software
software and document
DNA
Database-Based
k-mer
CLARK-S
2016
Downloaded using the software
software and document
DNA
Database-Based
k-mer
KrakenUniq
2018
Standard_377GB_kdb (archaea, bacteria, viral, human, UniVec_Core)
Standard_377GB_tar_gz (download both)
MicrobialDB_384GB_kdb (archaea, bacteria, viral, human, UniVec_Core, eukaryotic pathogen genomes with contaminants removed)
MicrobialDB_384GB_tar_gz (download both)
source code
DNA
Database-Based
k-mer
k-SLAM
2017
Downloaded using the software
source code
software online
DNA
Database-Based
k-mer
MegaBlast
2008
Index and software package (FTP)
software (FTP)
You can also install it with: sudo apt-get install ncbi-blast+ (this package provides blastn and blastp)
DNA
Database-Based
k-mer
MetaOthello
2018
The database index download link provided by the author has expired; please build the index yourself.
source code
source code github
DNA
Database-Based
k-mer
PathSeq
2018
Use Git LFS to download the pre-built index from the software's GitHub page.
source code
DNA
Database-Based
k-mer
taxMaps
2018
Index(ftp)
source code
Protein
Database-Based
k-mer
DIAMOND
2015
Either use existing BLAST databases directly or build a database yourself by following the provided tutorial.
source code
Direct installation: conda install -c bioconda -c conda-forge diamond
Protein
Database-Based
k-mer
Kaiju
2016
nr_177g (subset of the NCBI BLAST nr database containing archaea, bacteria, and viruses)
nr_euk_204g (like nr, but additionally including fungi and microbial eukaryotes)
refseq_ref_116g (protein sequences from archaea and bacteria in NCBI RefSeq representative assemblies, plus viral protein sequences from NCBI RefSeq)
software and document
Web server
source code
Protein
Database-Based
k-mer
MMseqs2
2017
Data Resources from the MMseqs2 Family
source code
DNA
Database-Based
Marker
MetaPhlAn2
2015
Run the 'metaphlan --install' command to download the database.
source code
DNA
Database-Based
Marker
MetaPhlAn3
2021
Run the 'metaphlan --install' command to download the database.
source code
DNA
Database-Based
Marker
MetaPhlAn 4
2023
Recommended software download
source code
DNA
Database-Based
Marker
mOTUs3
2022
Data Sets
source code
Virus
Machine Learning-Based
Traditional
vConTACT
2019
vConTACT ships with its own reference database, available once the software is downloaded.
software and document
Virus
Machine Learning-Based
Traditional
VirusTaxo
2022
test data
source code
Virus
Machine Learning-Based
Deep Learning
Virtifier
2022
RefSeq genomes
The CAMI Challenge Dataset
real human gut metagenomes
source code
Virus
Machine Learning-Based
Traditional
VirFinder
2017
trained model for predicting both prokaryotic and eukaryotic viruses
source code
DNA
Machine Learning-Based
Traditional
IDTAXA
2018
Training sets for Classification
software and document
Virus
Machine Learning-Based
Deep Learning
ViBE
2022
pre-trained
BPDR150
BPDR250
DNA150
DNA250
RNA150
RNA250
source code
RNA
Machine Learning-Based
Traditional
RdRpBin
2022
reference dataset and taxonomy files
source code
DNA
Machine Learning-Based
Deep Learning
BERTax
2022
With Git LFS installed, download the model by cloning the repository: git clone https://github.com/f-kretschmer/bertax.git
source code
DNA
Machine Learning-Based
Deep Learning
DeepMicrobes
2020
DeepMicrobes-data
source code
Virus
Machine Learning-Based
Deep Learning
CHEER
2021
pre-trained model
source code
DNA
Machine Learning-Based
Traditional
ML-DSP
2019
Database
source code
DNA
Database-Based
k-mer
CDKAM
2020
A sample of Nanopore MinION sequencing data
Zymo mock dataset
source code
DNA
Machine Learning-Based
Traditional
QIIME 2
2019
Silva 138 99% OTUs full-length sequences
Silva 138 99% OTUs from 515F/806R region of sequences
Greengenes2 2022.10 full-length sequences
Greengenes2 2022.10 from 515F/806R region of sequences
Weighted Silva 138 99% OTUs full-length sequences
Weighted Greengenes 13_8 99% OTUs full-length sequences
Weighted Greengenes 13_8 99% OTUs from 515F/806R region of sequences
software and document