Returns the taxonomic rank information for common reference database formats. For prefix-based formats (unite, sintax, greengenes2), returns a named character vector of prefixes. For positional formats (pr2), returns a named integer vector of rank positions.
Use the result with list_ranks_db() and summarize_db() via their
tax_format parameter.
Note: dada2::assignTaxonomy() is a classifier, not a taxonomy format.
It accepts any semicolon-separated taxonomy with any number of levels,
whether or not rank prefixes are present. Use the taxLevels
argument of dada2::assignTaxonomy() to specify the rank names.
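As a minimal sketch of the note above, rank names for a nine-level PR2-style reference can be supplied through taxLevels. The wrapper name assign_pr2_taxonomy and its two arguments are hypothetical and not part of this package; the rank names are copied from the "pr2" format described on this page. dada2 is only required when the wrapper is actually called.

```r
# Rank names for a nine-level PR2-style reference, copied from the
# "pr2" format documented on this page.
pr2_levels <- c("Domain", "Supergroup", "Division", "Subdivision",
                "Class", "Order", "Family", "Genus", "Species")

# Hypothetical wrapper (not part of this package or of dada2):
#   seqs      - character vector of query sequences
#   ref_fasta - path to a reference FASTA prepared for dada2 (assumed to exist)
assign_pr2_taxonomy <- function(seqs, ref_fasta) {
  dada2::assignTaxonomy(seqs, ref_fasta, taxLevels = pr2_levels)
}
```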
Usage
tax_prefixes(tax_format = c("unite", "sintax", "greengenes2", "pr2"))
Arguments
- tax_format
(Character) One of:
"unite": k__/p__/... format used by UNITE general FASTA releases.
"sintax": d:/k:/p:/... format used by VSEARCH SINTAX and USEARCH UTAX databases (UNITE SINTAX, PR2 UTAX). Note that UNITE SINTAX files use k: (kingdom) as their first rank and do not include d: (domain). When calling summarize_db() on a UNITE SINTAX file, the d: row will show 0 sequences; this is expected.
"greengenes2": d__/p__/... format used by Greengenes2 (starts with domain d__ instead of kingdom k__).
"pr2": positional format with 9 levels specific to protist taxonomy: Domain, Supergroup, Division, Subdivision, Class, Order, Family, Genus, Species.
Value
For prefix-based formats: a named character vector of rank prefixes. For positional formats: a named integer vector of rank positions.
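The two return types lend themselves to different lookups. The sketch below is an illustration only, not this package's internal parsing: the vectors are written out by hand to mirror the documented return values, and both lineage strings are toy inputs.

```r
# Prefix-based: mirrors the documented value of tax_prefixes("unite").
unite_prefixes <- c(k = "k__", p = "p__", c = "c__", o = "o__",
                    f = "f__", g = "g__", s = "s__")
lineage <- "k__Fungi;p__Ascomycota;c__Sordariomycetes"  # toy lineage
fields <- strsplit(lineage, ";", fixed = TRUE)[[1]]

# For each prefix, take the field that starts with it and strip the prefix;
# ranks missing from the lineage become NA.
by_prefix <- vapply(unite_prefixes, function(prefix) {
  hit <- fields[startsWith(fields, prefix)]
  if (length(hit) == 0) NA_character_ else substring(hit[1], nchar(prefix) + 1)
}, character(1))

# Positional: mirrors the documented value of tax_prefixes("pr2").
pr2_positions <- c(Domain = 1L, Supergroup = 2L, Division = 3L,
                   Subdivision = 4L, Class = 5L, Order = 6L,
                   Family = 7L, Genus = 8L, Species = 9L)
pr2_lineage <- "Eukaryota;TSAR;Alveolata;Dinoflagellata;Dinophyceae"  # toy lineage
pr2_fields <- strsplit(pr2_lineage, ";", fixed = TRUE)[[1]]

# Index the split fields by position; positions past the end yield NA.
by_position <- setNames(pr2_fields[pr2_positions], names(pr2_positions))
```

In this sketch, by_prefix and by_position are both named character vectors keyed by rank, so downstream code can treat the two kinds of format uniformly.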
Examples
tax_prefixes("unite")
#>     k     p     c     o     f     g     s
#> "k__" "p__" "c__" "o__" "f__" "g__" "s__"
tax_prefixes("sintax")
#>    d    k    p    c    o    f    g    s
#> "d:" "k:" "p:" "c:" "o:" "f:" "g:" "s:"
tax_prefixes("greengenes2")
#>     d     p     c     o     f     g     s
#> "d__" "p__" "c__" "o__" "f__" "g__" "s__"
tax_prefixes("pr2")
#>      Domain  Supergroup    Division Subdivision       Class       Order
#>           1           2           3           4           5           6
#>      Family       Genus     Species
#>           7           8           9
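The zero count for d: on UNITE SINTAX files, mentioned under the tax_format argument, can be reproduced with a small base-R sketch. This is not how summarize_db() is implemented; the headers are made-up toy data in SINTAX style, and the prefix vector is written out by hand to mirror tax_prefixes("sintax").

```r
# Toy SINTAX-style headers (made up), UNITE-like: first rank is k:, no d:.
headers <- c(
  ">seq1;tax=k:Fungi,p:Ascomycota,c:Sordariomycetes,o:Hypocreales;",
  ">seq2;tax=k:Fungi,p:Basidiomycota,c:Agaricomycetes;"
)

# Mirrors the documented value of tax_prefixes("sintax").
sintax_prefixes <- c(d = "d:", k = "k:", p = "p:", c = "c:",
                     o = "o:", f = "f:", g = "g:", s = "s:")

# Count the headers in which each prefix opens a rank field,
# i.e. directly follows "tax=" or a comma.
counts <- vapply(sintax_prefixes, function(prefix) {
  sum(grepl(paste0("(tax=|,)", prefix), headers))
}, integer(1))

counts[["d"]]  # 0: no header carries a domain rank
counts[["k"]]  # 2: every header starts at kingdom
```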