Find taxa whose reference sequences match primer sequences
Source: R/find_primers_pq.R
Searches every sequence in @refseq for occurrences of the supplied
primers (forward and reverse complement) using IUPAC-aware matching via
Biostrings::vcountPattern(). Returns a data frame of taxa that match at
least one primer; its taxon column can then be used with
tidypq::filter_taxa_pq() to prune those taxa from the phyloseq object.
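The matching idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the function's actual implementation: the two reference sequences and the names `refs`, `fwd_hits`, `rc_hits`, and `matched` are made up for the example. Only `Biostrings::vcountPattern()` with `fixed = FALSE` (which makes IUPAC ambiguity codes act as wildcards) and `Biostrings::reverseComplement()` are real library calls.

```r
library(Biostrings)

# Toy reference sequences (hypothetical, for illustration only).
# taxon_a embeds one concrete instance of the degenerate primer below.
refs <- DNAStringSet(c(
  taxon_a = "AAAGGTGGTGTAGGATTCACACAATATTT",
  taxon_b = "ACGTACGTACGTACGTACGTACGTACGT"
))

# Degenerate primer containing IUPAC ambiguity codes (M, D, R).
primer <- DNAString("GGTGGTGTMGGDTTCACMCARTA")

# fixed = FALSE treats ambiguity codes as wildcards instead of literals.
fwd_hits <- vcountPattern(primer, refs, fixed = FALSE)
rc_hits <- vcountPattern(reverseComplement(primer), refs, fixed = FALSE)

# Keep the names of sequences hit in either orientation.
matched <- names(refs)[(fwd_hits + rc_hits) > 0]
matched
```

Searching both the primer and its reverse complement matters because a primer carried over into a reference sequence may appear in either orientation.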
Arguments
- physeq
(required) A phyloseq::phyloseq object with a populated @refseq slot.
- primers
(required) A named character vector of primer sequences. IUPAC ambiguity codes (M, R, Y, S, W, K, B, D, H, V, N) are supported.
- verbose
(logical, default TRUE) If TRUE, print a summary message.
Value
A data.frame (or NULL if no matches) with columns:
- taxon
Character. Taxa name as in taxa_names(physeq).
- matched_primers
Character. Comma-separated names of matching primers.
- n_reads
Numeric. Total read count across all samples (phyloseq::taxa_sums(physeq)).
Examples
primers <- c(
  mcrA_fwd = "GGTGGTGTMGGDTTCACMCARTA",
  mcrA_rev = "CGTTCATBGCGTAGTTVGGRTAGT"
)
bad <- find_primers_pq(data_fungi_mini, primers)
#> 0 taxa matched at least one primer out of 45.
bad
#> NULL
# Prune contaminated taxa (requires tidypq):
# if (!is.null(bad)) {
#   tidypq::filter_taxa_pq(
#     data_fungi_mini,
#     !taxa_names(data_fungi_mini) %in% bad$taxon,
#     clean_phyloseq_object = TRUE
#   )
# }