Downloads the PR2 protist ribosomal reference database from GitHub releases. PR2 provides 18S rRNA gene sequences for protists and other eukaryotes.
For more advanced access to PR2 data (e.g., full taxonomy tables, metadata, or custom queries), see the pr2database R package.
Arguments
- dest_dir
(Character, default
".") Directory to save the downloaded file.- version
(Character) PR2 version number (e.g.,
"5.0.0"). IfNULL(default), the latest release is fetched from GitHub.- format
(Character, default
"dada2") One of"dada2","mothur","UTAX", or"sintax"(alias for"UTAX"). See Taxonomic ranks below: the"dada2"file keeps PR2's 9 ranks, whereas the"UTAX"/"sintax"file collapses them to 8 (Division and Subdivision are merged).- marker
(Character, default
"SSU") One of"SSU"or"plastid".- verbose
(Logical, default
TRUE) Print progress messages.
Details
PR2 releases are hosted on GitHub at https://github.com/pr2database/pr2database/releases.
Taxonomic ranks
PR2 uses 9 taxonomic ranks. The "dada2" file keeps all nine as a
positional, semicolon-delimited lineage (no rank prefixes); pass them to
dada2::assignTaxonomy() (via MiscMetabar::add_new_taxonomy_pq() with
method = "dada2") through taxLevels:
c("Domain", "Supergroup", "Division", "Subdivision", "Class", "Order", "Family", "Genus", "Species")
The "UTAX"/"sintax" file targets VSEARCH/USEARCH SINTAX and uses the
8 standard single-letter rank prefixes (k, d, p, c, o, f, g, s). To
fit PR2's nine ranks onto them, PR2 merges Division and Subdivision
into the p: rank (joined by -). The 9 → 8 mapping is:
| PR2 rank (dada2) | SINTAX prefix (UTAX) |
| Domain | k: |
| Supergroup | d: |
| Division + Subdivision | p: (e.g. Alveolata-Dinoflagellata) |
| Class | c: |
| Order | o: |
| Family | f: |
| Genus | g: |
| Species | s: |
Mind the per-method argument when calling
MiscMetabar::add_new_taxonomy_pq():
method = "dada2"(the"dada2"download) keeps all 9 ranks — pass the 9 names above astaxLevels(forwarded todada2::assignTaxonomy()).method = "sintax"(the"sintax"/"UTAX"download) has 8 ranks — pass 8 names astaxa_ranks, e.g.c("Domain", "Supergroup", "Division_Subdivision", "Class", "Order", "Family", "Genus", "Species"). The dada2taxLevelsargument is ignored by the SINTAX path, so the default 7 ranks would be used and parsing the 8-rank output fails.
Please cite: Guillou L et al. (2013) The Protist Ribosomal Reference database (PR2). Nucleic Acids Research 41:D1108-D1113. doi:10.1093/nar/gks1160
See also
format2dada2(), format2sintax(),
pr2database R package