This function complements phyloseq::subset_taxa() by providing an easier way to subset taxa using a named boolean vector. Names must match taxa_names.
Usage
subset_taxa_pq(
physeq,
condition,
verbose = TRUE,
clean_pq = TRUE,
taxa_names_from_physeq = FALSE
)
Arguments
- physeq
(required): a phyloseq-class object obtained using the phyloseq package.
- condition
A named boolean vector to subset taxa. Its length must equal the number of taxa and its names must match taxa_names. Can also be a condition using a column of the tax_table slot (see examples). If the order of condition is the same as taxa_names(physeq), you can use the parameter taxa_names_from_physeq = TRUE.
- verbose
(logical) If TRUE, information is printed.
- clean_pq
(logical) If set to TRUE, empty samples are discarded after subsetting ASV.
- taxa_names_from_physeq
(logical) If set to TRUE, rename the condition vector using taxa_names(physeq). Carefully check the result of this function if you use this parameter. No effect if the condition is of class tax_table.
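The difference between matching the condition by name and matching it by position can be illustrated with a minimal base R sketch. It does not reproduce the internals of subset_taxa_pq(); the vector taxa is a hypothetical stand-in for taxa_names(physeq), and the ASV names are invented for illustration.

```r
# Hypothetical stand-in for taxa_names(physeq).
taxa <- c("ASV1", "ASV2", "ASV3", "ASV4")

# A named logical vector whose order differs from `taxa`.
cond <- c(ASV3 = TRUE, ASV1 = FALSE, ASV4 = TRUE, ASV2 = FALSE)

# Matching by name: reorder the condition so that it follows `taxa`.
kept_by_name <- taxa[cond[taxa]]

# Matching by position: overwrite the names with `taxa`, which is what
# renaming the condition vector amounts to when its order is trusted.
cond_renamed <- setNames(unname(cond), taxa)
kept_by_position <- taxa[cond_renamed]

print(kept_by_name)
print(kept_by_position)
```

The two results differ here because the condition is not in the same order as the taxa names, which is why the result should be checked carefully whenever names are assigned by position.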
Examples
subset_taxa_pq(data_fungi, data_fungi@tax_table[, "Phylum"] == "Ascomycota")
#> Cleaning suppress 0 taxa ( ) and 0 sample(s) ( ).
#> Number of non-matching ASV 0
#> Number of matching ASV 1420
#> Number of filtered-out ASV 354
#> Number of kept ASV 1066
#> Number of kept samples 185
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 1066 taxa and 185 samples ]
#> sample_data() Sample Data: [ 185 samples by 7 sample variables ]
#> tax_table() Taxonomy Table: [ 1066 taxa by 12 taxonomic ranks ]
#> refseq() DNAStringSet: [ 1066 reference sequences ]
cond_taxa <- grepl("Endophyte", data_fungi@tax_table[, "Guild"])
names(cond_taxa) <- taxa_names(data_fungi)
subset_taxa_pq(data_fungi, cond_taxa)
#> Cleaning suppress 0 taxa ( ) and 9 sample(s) ( A10-005-B_S188_MERGED.fastq.gz / A10-005-M_S190_MERGED.fastq.gz / BE9-006-H_S28_MERGED.fastq.gz / C21-NV1-M_S64_MERGED.fastq.gz / CA12-024_S66_MERGED.fastq.gz / H10-018-M_S110_MERGED.fastq.gz / L23-002-H_S123_MERGED.fastq.gz / O24-003-H_S146_MERGED.fastq.gz / O26-004-M_S150_MERGED.fastq.gz ).
#> Number of non-matching ASV 0
#> Number of matching ASV 1420
#> Number of filtered-out ASV 1292
#> Number of kept ASV 128
#> Number of kept samples 176
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 128 taxa and 176 samples ]
#> sample_data() Sample Data: [ 176 samples by 7 sample variables ]
#> tax_table() Taxonomy Table: [ 128 taxa by 12 taxonomic ranks ]
#> refseq() DNAStringSet: [ 128 reference sequences ]
subset_taxa_pq(data_fungi, grepl("mycor", data_fungi@tax_table[, "Guild"]),
taxa_names_from_physeq = TRUE
)
#> Cleaning suppress 0 taxa ( ) and 47 sample(s) ( A10-005-H_S189_MERGED.fastq.gz / A10-005-M_S190_MERGED.fastq.gz / A15-004_S3_MERGED.fastq.gz / A8-005_S4_MERGED.fastq.gz / BA16-036bis_S20_MERGED.fastq.gz / BB6-019-M_S25_MERGED.fastq.gz / BE9-006-B_S27_MERGED.fastq.gz / BE9-006-H_S28_MERGED.fastq.gz / BP11-001-H_S44_MERGED.fastq.gz / BP11-001-M_S45_MERGED.fastq.gz / BQ3-019_S48_MERGED.fastq.gz / BQ4-018-H_S50_MERGED.fastq.gz / C21-NV1-B_S62_MERGED.fastq.gz / C21-NV1-M_S64_MERGED.fastq.gz / C9-005_S65_MERGED.fastq.gz / CA9-027_S67_MERGED.fastq.gz / CB8-019-B_S69_MERGED.fastq.gz / CB8-019-M_S71_MERGED.fastq.gz / CC8-003_S74_MERGED.fastq.gz / D17-011_S77_MERGED.fastq.gz / D18-003-B_S78_MERGED.fastq.gz / D18-003-M_S80_MERGED.fastq.gz / DS1-ABM002-H_S92_MERGED.fastq.gz / DZ6-ABM-001_S99_MERGED.fastq.gz / F6-ABM-001_S105_MERGED.fastq.gz / F7-015-M_S106_MERGED.fastq.gz / H10-018-M_S110_MERGED.fastq.gz / N22-001-B_S129_MERGED.fastq.gz / N23-002-M_S132_MERGED.fastq.gz / N25-ABMX_S133_MERGED.fastq.gz / NVABM-0058_S134_MERGED.fastq.gz / NVABM0244-M_S137_MERGED.fastq.gz / O21-007-H_S143_MERGED.fastq.gz / O21-007-M_S144_MERGED.fastq.gz / O24-003-B_S145_MERGED.fastq.gz / O24-003-H_S146_MERGED.fastq.gz / O24-003-M_S147_MERGED.fastq.gz / O26-004-M_S150_MERGED.fastq.gz / P27-ABM001_S155_MERGED.fastq.gz / W26-001-M_S167_MERGED.fastq.gz / X24-009-B_S170_MERGED.fastq.gz / X24-009-H_S171_MERGED.fastq.gz / X24-009-M_S172_MERGED.fastq.gz / X24-010_S173_MERGED.fastq.gz / Y28-002-H_S179_MERGED.fastq.gz / Y28-002-M_S180_MERGED.fastq.gz / Z30-002_S186_MERGED.fastq.gz ).
#> Number of non-matching ASV 0
#> Number of matching ASV 1420
#> Number of filtered-out ASV 1376
#> Number of kept ASV 44
#> Number of kept samples 138
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 44 taxa and 138 samples ]
#> sample_data() Sample Data: [ 138 samples by 7 sample variables ]
#> tax_table() Taxonomy Table: [ 44 taxa by 12 taxonomic ranks ]
#> refseq() DNAStringSet: [ 44 reference sequences ]
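The sample counts in the outputs above drop because, with clean_pq = TRUE, samples left empty after subsetting are discarded. The base R sketch below mimics that bookkeeping on a toy abundance matrix. It is an illustration under stated assumptions, not the package's implementation: the matrix otu, the sample names, and the ASV names are all hypothetical.

```r
# Toy abundance matrix: rows are samples, columns are taxa (hypothetical data).
otu <- matrix(
  c(5, 0, 0,
    0, 3, 0,
    2, 0, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3"), c("ASV1", "ASV2", "ASV3"))
)

# Named logical condition, one value per taxon.
cond <- c(ASV1 = TRUE, ASV2 = FALSE, ASV3 = TRUE)

# Keep only the taxa flagged TRUE, matching the condition by name.
otu_sub <- otu[, cond[colnames(otu)], drop = FALSE]

# Identify and drop samples that no longer contain any reads.
empty <- rowSums(otu_sub) == 0
otu_clean <- otu_sub[!empty, , drop = FALSE]

# Simple counts analogous to those printed in verbose mode.
n_filtered <- sum(!cond)
n_kept <- sum(cond)
n_samples_removed <- sum(empty)

print(dim(otu_clean))
print(names(empty)[empty])
```

In this toy case sample S2 only contained reads from the filtered-out taxon, so it is the one removed during cleaning.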