[Experimental]

Usage

chimera_detection_vs(
  seq2search,
  nb_seq,
  vsearchpath = "vsearch",
  abskew = 2,
  min_seq_length = 100,
  vsearch_args = "--fasta_width 0",
  keep_temporary_files = FALSE
)

Arguments

seq2search

(required) a list of DNA sequences coercible by function Biostrings::DNAStringSet()

nb_seq

(required) a numeric vector giving the number of sequences (abundance) for each DNA sequence in seq2search

vsearchpath

(default: "vsearch") Path to the vsearch executable

abskew

(int, default 2) The abundance skew is used to distinguish, in a three-way alignment, which sequence is the chimera and which are the parents. The assumption is that chimeras appear later in the PCR amplification process and are therefore less abundant than their parents. The default value is 2.0, which means that the parents should be at least 2 times more abundant than their chimera. Any value greater than or equal to 1.0 can be used.

min_seq_length

(int, default 100) Minimum length of sequences to be part of the analysis

vsearch_args

(default "--fasta_width 0") A string of additional arguments passed to the vsearch command

keep_temporary_files

(logical, default: FALSE) Whether to keep the temporary files:

  • non_chimeras.fasta

  • chimeras.fasta

  • borderline.fasta
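The two numeric thresholds, min_seq_length and abskew, can be illustrated with a minimal base R sketch. The sequences, names and counts below are made-up toy data, and the code only mimics the length filter and the abundance criterion described above; it is not the vsearch algorithm, which also relies on a three-way alignment.

```r
# Toy data: hypothetical sequences and their read counts (not from data_fungi).
seqs <- c(
  parent_a  = strrep("ACGT", 30),  # 120 bp
  parent_b  = strrep("TTGCA", 25), # 125 bp
  candidate = strrep("ACGTT", 22), # 110 bp
  too_short = strrep("AC", 20)     # 40 bp
)
nb_seq <- c(parent_a = 500, parent_b = 90, candidate = 50, too_short = 10)

min_seq_length <- 100
abskew <- 2

# Length filter: keep sequences with at least min_seq_length bases.
keep <- nchar(seqs) >= min_seq_length
seqs_kept <- seqs[keep]
nb_kept <- nb_seq[keep]

# Abundance skew: a sequence is eligible as a parent of `candidate` only if
# it is at least `abskew` times more abundant than the candidate.
skew <- nb_kept / nb_kept[["candidate"]]
possible_parents <- setdiff(names(skew)[skew >= abskew], "candidate")

names(seqs_kept)
possible_parents
```

Here too_short is dropped by the length filter, and parent_b (90 reads against 50, a skew of 1.8) is not abundant enough to count as a parent when abskew is 2.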

Value

A list of 3 elements: non-chimera taxa ($non_chimera), chimera taxa ($chimera) and borderline taxa ($borderline)

Details

This function is mainly a wrapper of the work of others. Please cite vsearch if you use this function.
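As a rough idea of what such a wrapper may hand to vsearch, the sketch below assembles a de novo chimera detection command. The flags (--uchime_denovo, --abskew, --nonchimeras, --chimeras, --borderline) are vsearch options, but the exact call made by chimera_detection_vs() is an assumption here, and the input file name is hypothetical.

```r
# Hypothetical locations; the wrapper may write its files elsewhere.
tmp <- tempdir()
input_fasta <- file.path(tmp, "input_with_sizes.fasta") # hypothetical name

abskew <- 2
vsearch_args <- "--fasta_width 0"

# Arguments for vsearch de novo chimera detection.
args <- c(
  "--uchime_denovo", input_fasta,
  "--abskew", abskew,
  "--nonchimeras", file.path(tmp, "non_chimeras.fasta"),
  "--chimeras", file.path(tmp, "chimeras.fasta"),
  "--borderline", file.path(tmp, "borderline.fasta"),
  vsearch_args
)

# Print the command instead of running it, so the sketch works without vsearch.
cat("vsearch", args, "\n")

# To actually run it (requires vsearch on the PATH and an existing input file):
# system2("vsearch", args)
```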

Author

Adrien Taudière

Examples

# \donttest{
chimera_detection_vs(
  seq2search = data_fungi@refseq,
  nb_seq = taxa_sums(data_fungi)
)
#> Filtering for sequences under 100 bp remove a total of 0 ( 0 %) unique sequences for a total of 0 sequences removed ( 0 %)
#> $non_chimera
#> AAStringSet object of length 1051:
#>        width seq                                            names               
#>    [1]   312 AAATGCGATAAGTAATGTGAAT...TAGGAATACCCGCTGAACTTA ASV1;size=92884
#>    [2]   301 AAATGCGATAAGTAATGTGAAT...TAGGAATACCCGCTGAACTTA ASV2;size=53538
#>    [3]   349 AAATGCGATAAGTAATGTGAAT...TGGGACTACCCGCTGAACTTA ASV3;size=47410
#>    [4]   357 AAATGCGATAAGTAATGTGAAT...TGGGACTACCCGCTGAACTTA ASV4;size=46857
#>    [5]   300 AAATGCGATAAGTAATGTGAAT...TAGGAATACCCGCTGAACTTA ASV5;size=41082
#>    ...   ... ...
#> [1047]   260 AAACGCGAAAAGTGTTATGATG...AAGATCACCCGCTGAACTTAA ASV1420;size=2
#> [1048]   365 AAATGCGATAAGTAATGTGAAT...TAGGACTACCCGCTGAACTTA ASV602;size=2
#> [1049]   344 AAATGCGATAAGTAATGTGAAT...TAGGAATACGCGCTGAACTTA ASV1142;size=1
#> [1050]   290 GAAATGCGATAAGTAATGTGAA...TAGGGATACCCGCTGAACTTA ASV246;size=1
#> [1051]   318 GAAATGCGATACGTAATGTGAA...AGGGATACCCGCTGAACTTAA ASV412;size=1
#> 
#> $chimera
#> AAStringSet object of length 242:
#>       width seq                                             names               
#>   [1]   341 GAAATGCGATAAGTAATGTGAA...GTAGGATTACCCGCTGAACTTA ASV136;size=2743
#>   [2]   307 AAATGCGATAAGTAATGTGAAT...GTAGGGATACCCGCTGAACTTA ASV163;size=2028
#>   [3]   339 AAATGCGATAAGTAATGTGAAT...GTAGGATTACCCGCTGAACTTA ASV206;size=1471
#>   [4]   312 AAATGCGATAAGTAATGTGAAT...GTAGGAATACCCGCTGAACTTA ASV286;size=875
#>   [5]   306 AAATGCGAAAAGTAGTGTGAAT...GTAGGGATACCCGCTGAACTTA ASV294;size=832
#>   ...   ... ...
#> [238]   303 GAAATGCGATACTTGGTGTGAA...GTGGGACTACCCGCTGAACTTA ASV1390;size=29
#> [239]   293 GAACTACGATAAGTAATGTGAA...TAGGGATACCCGCTGAACTTAA ASV1395;size=25
#> [240]   301 AAATGCGAAAAGTAGTGTGAAT...TAGGGATACCCGCTGAACTTAA ASV1407;size=18
#> [241]   382 AAATGCGATAAATAATATGAAT...GCAAGATTACCCGCTGAACTTA ASV1411;size=12
#> [242]   287 GAAATGCGATAAGTAATGTGAA...TAGGGCTACCCGCTGAACTTAA ASV789;size=2
#> 
#> $borderline
#> AAStringSet object of length 127:
#>       width seq                                             names               
#>   [1]   344 AAATGCGATAAGTAATGTGAAT...GTAGGACTACCCGCTGAACTTA ASV74;size=5699
#>   [2]   298 AAATGCGATAAGTAATGTGAAT...GTAGGGATACCCGCTGAACTTA ASV123;size=3164
#>   [3]   300 AAATGCGATAAGTAATGTGAAT...GTAGGGATACCCGCTGAACTTA ASV139;size=2712
#>   [4]   304 AAATGCGATAAGTAATGTGAAT...GTAGGACTACCCGCTGAACTTA ASV141;size=2611
#>   [5]   295 AAATGCGATAAGTAATGTGAAT...GTAGGGATACCCGCTGAACTTA ASV153;size=2294
#>   ...   ... ...
#> [123]   330 AAACGCGATAGGTAATGTGAAT...GTAGGACTACCCGCTGAACTTA ASV1292;size=62
#> [124]   332 AAATGCGATAAGTAATGTGAAT...GTAGGAATACCCGCTGAACTTA ASV1353;size=48
#> [125]   300 GAAATGCGATAAGTAATGCGAA...GTAGGGATACCCGCTGAACTTA ASV1384;size=31
#> [126]   302 GAAATGCGATACGTAATGTGAA...TAGGAATACCCGCTGAACTTAA ASV1388;size=29
#> [127]   316 GAAATGCGATAAATAATATGAA...ACAAGATTACCCGCTGAACTTA ASV1399;size=22
#> 
# }
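The sequence names in the output above carry the abundance as a ";size=" annotation. A minimal sketch for splitting such names into a taxon identifier and a count is shown below; it uses three names copied from the output, whereas with a real result the names would presumably come from something like names(res$chimera), where res is a hypothetical object holding the returned list.

```r
# Names copied from the example output above.
seq_names <- c("ASV1;size=92884", "ASV136;size=2743", "ASV74;size=5699")

# Split each name into the taxon identifier and its abundance.
taxa <- sub(";size=.*$", "", seq_names)
sizes <- as.integer(sub("^.*;size=", "", seq_names))

abundance <- data.frame(taxa = taxa, size = sizes)
abundance
```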