lifecycle-experimental

This function builds three phylogenetic trees (UPGMA, NJ and ML) and, if nb_bootstrap is set to a positive number, also builds the three corresponding bootstrapped trees.

Default parameters are based on doi:10.12688/f1000research.8986.2 and on the phangorn vignette "Estimating phylogenetic trees with phangorn". You should understand your data, especially the markers, before using this function.

Note that phylogenetic reconstruction with markers used for metabarcoding is not robust. You must verify the robustness of your phylogenetic tree using taxonomic classification (see the vignette Tree visualization) and bootstrap or multi-tree visualization.

Usage

build_phytree_pq(
  physeq,
  nb_bootstrap = 0,
  model = "GTR",
  optInv = TRUE,
  optGamma = TRUE,
  rearrangement = "NNI",
  control = phangorn::pml.control(trace = 0),
  optNni = TRUE,
  multicore = FALSE,
  ...
)

Arguments

physeq

(required): a phyloseq-class object obtained using the phyloseq package.

nb_bootstrap

(default 0): If a positive number is set, the function also builds 3 bootstrapped trees using nb_bootstrap bootstrap samples.

model

Name of the amino acid or nucleotide substitution model to use. See phangorn::optim.pml() for more details

optInv

Logical value indicating whether the proportion of invariant sites gets optimized. See phangorn::optim.pml() for more details

optGamma

Logical value indicating whether the gamma rate parameter gets optimized. See phangorn::optim.pml() for more details

rearrangement

Type of tree rearrangements to perform, one of "NNI", "stochastic" or "ratchet". See phangorn::optim.pml() for more details

control

A list of parameters for controlling the fitting process. See phangorn::optim.pml() for more details

optNni

Logical value indicating whether topology gets optimized (NNI). See phangorn::optim.pml() for more details

multicore

(logical) Whether models should be estimated in parallel. See phangorn::bootstrap.pml() for more details

...

Other parameters to be passed on to the phangorn::optim.pml() function
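
As an illustration of how the arguments above can be overridden, here is a minimal sketch that fits a simpler Jukes-Cantor model without optimizing the invariant-site proportion or the gamma rate. It assumes that build_phytree_pq(), subset_taxa_pq() and data_fungi_mini come from the MiscMetabar package (the package name is an assumption, it is not stated on this page) and that "JC" is among the model names accepted by phangorn::optim.pml().

```r
# Assumption: MiscMetabar provides build_phytree_pq(), subset_taxa_pq()
# and the data_fungi_mini dataset used in the Examples section.
if (requireNamespace("MiscMetabar", quietly = TRUE) &&
  requireNamespace("phangorn", quietly = TRUE)) {
  library(MiscMetabar)

  # Keep only the most abundant taxa so that the run stays short
  df <- subset_taxa_pq(data_fungi_mini, taxa_sums(data_fungi_mini) > 9000)

  # Jukes-Cantor model, no optimization of invariant sites or gamma rate,
  # no bootstrap; trace = 1 asks phangorn to print its progress
  tree_jc <- build_phytree_pq(
    df,
    nb_bootstrap = 0,
    model = "JC",
    optInv = FALSE,
    optGamma = FALSE,
    control = phangorn::pml.control(trace = 1)
  )
}
```

Simpler models run faster and can serve as a quick sanity check before fitting the default GTR model.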

Value

A list of phylogenetic trees (elements UPGMA, NJ and ML, plus UPGMA_bs, NJ_bs and ML_bs when nb_bootstrap is positive)

Details

This function is mainly a wrapper of the work of others. Please cite the phangorn package if you use this function.
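
To make the wrapped workflow more concrete, the following sketch runs the distance-based and maximum-likelihood steps directly with phangorn, using the default argument values listed in Usage. It is an assumption that build_phytree_pq() follows roughly these steps; this is not its source code. The Laurasiatherian alignment shipped with phangorn stands in for the aligned sequences of a phyloseq object, and the starting values k = 4 and inv = 0.2 are taken from the phangorn vignette.

```r
# Sketch only: assumes the wrapper follows the phangorn vignette workflow.
if (requireNamespace("phangorn", quietly = TRUE)) {
  # Example alignment shipped with phangorn, reduced to 8 taxa for speed
  data(Laurasiatherian, package = "phangorn")
  aln <- subset(Laurasiatherian, subset = 1:8)

  # Distance-based trees
  dm <- phangorn::dist.ml(aln)
  tree_upgma <- phangorn::upgma(dm)
  tree_nj <- phangorn::NJ(dm)

  # Maximum-likelihood fit, starting from the NJ topology
  fit <- phangorn::pml(tree_nj, data = aln, k = 4, inv = 0.2)
  fit_opt <- phangorn::optim.pml(
    fit,
    model = "GTR",
    optInv = TRUE,
    optGamma = TRUE,
    optNni = TRUE,
    rearrangement = "NNI",
    control = phangorn::pml.control(trace = 0)
  )

  # Bootstrap of the ML tree (the role played by nb_bootstrap > 0)
  set.seed(1)
  bs_ml <- phangorn::bootstrap.pml(
    fit_opt,
    bs = 5,
    optNni = TRUE,
    multicore = FALSE,
    control = phangorn::pml.control(trace = 0)
  )

  # Bootstrap of a distance-based tree
  bs_upgma <- phangorn::bootstrap.phyDat(
    aln,
    function(x) phangorn::upgma(phangorn::dist.ml(x)),
    bs = 5
  )
}
```

Five bootstrap replicates keep the sketch fast; a real analysis would typically use many more.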

Author

Adrien Taudière

Examples

# \donttest{
if (requireNamespace("phangorn")) {
  set.seed(22)
  df <- subset_taxa_pq(data_fungi_mini, taxa_sums(data_fungi_mini) > 9000)
  df_tree <- build_phytree_pq(df, nb_bootstrap = 2)
  plot(df_tree$UPGMA)
  phangorn::plotBS(df_tree$UPGMA, df_tree$UPGMA_bs, main = "UPGMA")
  plot(df_tree$NJ, "unrooted")
  plot(df_tree$ML)

  phangorn::plotBS(df_tree$ML$tree, df_tree$ML_bs, p = 20, frame = "circle")
  phangorn::plotBS(
    df_tree$ML$tree,
    df_tree$ML_bs,
    p = 20,
    frame = "circle",
    method = "TBE"
  )
  plot(phangorn::consensusNet(df_tree$ML_bs))
  plot(phangorn::consensusNet(df_tree$NJ_bs))
  ps_tree <- merge_phyloseq(df, df_tree$ML$tree)
}
#> Cleaning suppress 0 taxa (  ) and 6 sample(s) ( AD26-005-H_S10_MERGED.fastq.gz / CB8-019-H_S70_MERGED.fastq.gz / DY5-004-H_S97_MERGED.fastq.gz / N23-002-B_S130_MERGED.fastq.gz / NVABM0244-M_S137_MERGED.fastq.gz / T28-ABM602-B_S162_MERGED.fastq.gz ).
#> Number of non-matching ASV 0
#> Number of matching ASV 45
#> Number of filtered-out ASV 23
#> Number of kept ASV 22
#> Number of kept samples 131
#> Determining distance matrix based on shared 8-mers:
#> ================================================================================
#> 
#> Time difference of 0.01 secs
#> 
#> Clustering into groups by similarity:
#> ================================================================================
#> 
#> Time difference of 0 secs
#> 
#> Aligning Sequences:
#> ================================================================================
#> 
#> Time difference of 0.15 secs
#> 
#> Iteration 1 of 2:
#> 
#> Determining distance matrix based on alignment:
#> ================================================================================
#> 
#> Time difference of 0 secs
#> 
#> Reclustering into groups by similarity:
#> ================================================================================
#> 
#> Time difference of 0 secs
#> 
#> Realigning Sequences:
#> ================================================================================
#> 
#> Time difference of 0.1 secs
#> 
#> Iteration 2 of 2:
#> 
#> Determining distance matrix based on alignment:
#> ================================================================================
#> 
#> Time difference of 0 secs
#> 
#> Reclustering into groups by similarity:
#> ================================================================================
#> 
#> Time difference of 0 secs
#> 
#> Realigning Sequences:
#> ================================================================================
#> 
#> Time difference of 0.08 secs
#> 
#> Refining the alignment:
#> ================================================================================
#> 
#> Time difference of 0.03 secs
#> 
#> optimize edge weights:  -4404.542 --> -4270.276 
#> optimize edge weights:  -4270.276 --> -4270.272 
#> optimize topology:  -4270.272 --> -4252.159  NNI moves:  4 
#> optimize edge weights:  -4252.159 --> -4252.158 
#> optimize topology:  -4252.158 --> -4252.158  NNI moves:  0 
#> optimize edge weights:  -4252.158 --> -4252.158 
#> optimize edge weights:  -4394.939 --> -4255.858 
#> optimize edge weights:  -4255.858 --> -4255.85 
#> optimize topology:  -4255.85 --> -4252.235  NNI moves:  1 
#> optimize edge weights:  -4252.235 --> -4252.233 
#> optimize topology:  -4252.233 --> -4252.233  NNI moves:  0 
#> optimize edge weights:  -4252.233 --> -4252.233 

#> Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
#> Also defined by ‘RNeXML’ ‘tidytree’
#> Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
#> Also defined by ‘RNeXML’ ‘tidytree’
#> Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
#> Also defined by ‘RNeXML’ ‘tidytree’
#> Found more than one class "phylo" in cache; using the first, from namespace 'phyloseq'
#> Also defined by ‘RNeXML’ ‘tidytree’
# }