
Standardize taxonomic names in a data frame
Process a data frame with taxonomic names and add standardized matches. This is a batch wrapper around `match_taxonomic_names()` that returns the original data with added columns for the best match.
Fuzzy matching runs on the database side (in SQL), which keeps matching fast over slow connections.
Usage
standardize_taxonomic_batch(
  data,
  name_column,
  method = c("auto", "exact", "genus_constrained", "fuzzy"),
  min_similarity = 0.3,
  include_synonyms = TRUE,
  include_authors = FALSE,
  con = NULL,
  verbose = TRUE,
  keep_all_matches = FALSE
)
Arguments
- data
A data frame or tibble containing taxonomic names
- name_column
Name of column containing taxonomic names (quoted or unquoted)
- method
Matching method: "auto" (default), "exact", "genus_constrained", "fuzzy"
- min_similarity
Minimum similarity score (0-1, default: 0.3)
- include_synonyms
Include synonym information (default: TRUE)
- include_authors
Try matching with author names (default: FALSE)
- con
Database connection (if NULL, `call.mydb.taxa()` is called)
- verbose
Show progress messages (default: TRUE)
- keep_all_matches
Keep all matches (default: FALSE, only keeps best match)
Value
The input data frame with added columns:
- matched_name: Best matching name from backbone (or NA if no match)
- idtax_n: Taxa ID for matched name
- idtax_good_n: Accepted taxa ID (for synonyms)
- match_method: How the match was found
- match_score: Similarity score
- match_genus: Matched genus
- match_species: Matched species epithet
- match_family: Matched family
- is_synonym: Whether match is a synonym
- accepted_name: Accepted name (if synonym)

If keep_all_matches = TRUE, returns one row per match, with an additional match_rank column.
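When keep_all_matches = TRUE, the result can hold several rows per input row, so it may be useful to collapse it again after manual review. The sketch below is illustrative only: it uses a hand-built stand-in data frame with placeholder values rather than real output, and it assumes that match_rank == 1 marks the best candidate and that unmatched rows carry NA in match_rank. Neither assumption is confirmed by this page.

```r
# Hypothetical stand-in for output with keep_all_matches = TRUE.
# All values are placeholders, not real backbone results.
all_matches <- data.frame(
  tree_id      = c("A01", "A02", "A02", "B01"),
  matched_name = c("Candidate a", "Candidate b", "Candidate c", NA),
  match_score  = c(1.0, 0.7, 0.5, NA),
  match_rank   = c(1, 1, 2, NA),
  stringsAsFactors = FALSE
)

# Assumption: match_rank == 1 is the best candidate for each input row.
# Rows without any match (NA rank, also an assumption) are kept so that
# no input row is lost.
best_only <- all_matches[
  is.na(all_matches$match_rank) | all_matches$match_rank == 1,
]

# Number of candidate matches returned per input row
candidates_per_tree <- table(
  all_matches$tree_id[!is.na(all_matches$matched_name)]
)
```

Base R subsetting is used here so the sketch has no package dependencies; the same filtering could be written with dplyr if the surrounding workflow already uses it.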
Examples
if (FALSE) { # \dontrun{
# Standardize names in a data frame
library(tibble)

data <- tibble(
  plot_id = c(1, 1, 2),
  tree_id = c("A01", "A02", "B01"),
  species = c("Pericopsis elata", "Garcinea kola", "Brachystegia laurentii")
)

# Add best match for each name
data_matched <- standardize_taxonomic_batch(data, name_column = "species")

# Keep all matches (for manual review)
data_all_matches <- standardize_taxonomic_batch(
  data,
  name_column = "species",
  keep_all_matches = TRUE
)
} # }
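After matching, the added columns can be used to triage the result. The following is a minimal sketch, not part of the package: it builds a stand-in data frame with placeholder values in place of real output, assumes is_synonym is a logical column, and uses an arbitrary review threshold of 0.8 on match_score.

```r
# Hypothetical stand-in for the output of standardize_taxonomic_batch().
# Values are placeholders, not real backbone results.
data_matched <- data.frame(
  species      = c("Input name 1", "Input name 2", "Input name 3"),
  matched_name = c("Backbone name 1", "Backbone name 2", NA),
  match_score  = c(1.0, 0.6, NA),
  is_synonym   = c(FALSE, TRUE, NA),  # assumed to be logical
  stringsAsFactors = FALSE
)

# Names with no match in the backbone (matched_name is NA)
unmatched <- data_matched[is.na(data_matched$matched_name), ]

# Matches scoring below a review threshold (0.8 is an arbitrary choice)
review_threshold <- 0.8
needs_review <- data_matched[
  !is.na(data_matched$match_score) &
    data_matched$match_score < review_threshold,
]

# Matches that resolved to a synonym; %in% TRUE drops NA rows safely
synonyms <- data_matched[data_matched$is_synonym %in% TRUE, ]
```

Low-scoring and unmatched names are the natural candidates for manual checking; for synonyms, accepted_name and idtax_good_n point to the accepted taxon.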