Converts taxonomy headers to the format expected by
dada2::assignTaxonomy(): unprefixed semicolon-delimited taxonomy
(>Kingdom;Phylum;Class;Order;Family;Genus;). Wrapper around
format_fasta_db().
Usage
format2dada2(
  fasta_db = NULL,
  taxnames = NULL,
  input_format = "auto",
  output_path = NULL,
  pattern_to_remove = NULL
)
Arguments
- fasta_db
(Character) Path to a FASTA file. Mutually exclusive with taxnames.
- taxnames
(Character vector) Taxonomy header strings (without leading >). Mutually exclusive with fasta_db.
- input_format
(Character, default "auto") Input taxonomy format. One of "auto", "sintax", "unite", "greengenes2".
- output_path
(Character) If provided and fasta_db is used, write the reformatted FASTA to this path. The DNAStringSet is returned invisibly.
- pattern_to_remove
(Character) Optional regex pattern to remove from the reformatted names (applied after conversion).
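To illustrate what "applied after conversion" means for pattern_to_remove, here is a rough base-R sketch. It assumes every match of the pattern is deleted from the already converted name, as gsub() would do; the function may apply the pattern differently. The input string and the pattern are made up for illustration.

```r
# Illustration only: the name below is assumed to be already converted
# to the dada2 layout; the pattern is a hypothetical example.
converted <- "Fungi;Ascomycota;unidentified;"
pattern_to_remove <- "unidentified;"

# Assumption: all matches of the regex are removed (gsub semantics).
cleaned <- gsub(pattern_to_remove, "", converted)
cleaned
#> [1] "Fungi;Ascomycota;"
```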
Value
If taxnames is used, a character vector. If fasta_db is
used, a DNAStringSet with reformatted names. When output_path is
provided, the DNAStringSet is returned invisibly.
Examples
# SINTAX format → dada2
format2dada2(
  taxnames = "AB123;tax=k:Fungi,p:Ascomycota,c:Sordariomycetes"
)
#> [1] "Fungi;Ascomycota;Sordariomycetes;"

# UNITE format → dada2
format2dada2(
  taxnames = "AB123;k__Fungi;p__Ascomycota;c__Sordariomycetes",
  input_format = "unite"
)
#> [1] "Fungi;Ascomycota;Sordariomycetes;"
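For readers who want to see the header transformation itself, the SINTAX case from the examples above can be approximated in base R. This is a minimal sketch, not the package's implementation: sintax_to_dada2() is a hypothetical helper, and it assumes headers contain a "tax=" field with comma-separated, single-letter rank prefixes such as "k:" or "p:".

```r
# Hypothetical helper, not part of the package: a base-R approximation
# of the SINTAX -> dada2 header conversion.
sintax_to_dada2 <- function(x) {
  # Keep only the taxonomy string that follows "tax="
  tax <- sub("^.*tax=", "", x)
  # Drop a trailing semicolon if the header has one
  tax <- sub(";$", "", tax)
  # Split into ranks on commas
  ranks <- strsplit(tax, ",", fixed = TRUE)
  # Strip the single-letter rank prefix, join with ";" and add the final ";"
  vapply(ranks, function(r) {
    paste0(paste(sub("^[a-z]:", "", r), collapse = ";"), ";")
  }, character(1))
}

sintax_to_dada2("AB123;tax=k:Fungi,p:Ascomycota,c:Sordariomycetes")
#> [1] "Fungi;Ascomycota;Sordariomycetes;"
```

The sketch handles only well-formed SINTAX headers; format2dada2() additionally supports the other input formats listed under input_format.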