
Using the Taxonomic Name Standardization App
Introduction
The launch_taxonomic_match_app() function provides an
interactive Shiny application for standardizing taxonomic names against
the Central African plant taxonomic backbone database. This visual
interface is ideal for:
- Exploring and cleaning taxonomic data interactively
- Understanding match quality through visual feedback
- Manually reviewing uncertain matches
- Enriching data with species-level traits
Prerequisites
Before launching the app, ensure you have:
- Database credentials configured (see setup_db_credentials())
- Data to standardize in one of these formats:
  - Excel file (.xlsx, .xls)
  - CSV file (.csv)
- A column containing taxonomic names (e.g., genus + species) or separate columns for genus, species, and family
Quick Start
Launch the app with a single call to launch_taxonomic_match_app(), with no arguments.
Alternatively, pre-load your data:
# With R data.frame
my_data <- read.csv("tree_inventory.csv")
launch_taxonomic_match_app(data = my_data, name_column = "species_name")
# Adjust fuzzy matching sensitivity (default is 0.7)
launch_taxonomic_match_app(min_similarity = 0.5) # More permissive matching
Step-by-Step Walkthrough
Phase 1: Initial View
When you first launch the app, you’ll see the main interface with a sidebar for configuration and tabs for different workflow phases:

The app uses a tabbed workflow that guides you through each phase sequentially:
- Auto Match - Automatic matching
- Review - Manual review of unmatched names
- Export - Download results
- Traits Enrichment - Add species traits
Phase 2: Upload Your Data
The first step is to provide your data. The app offers two input methods:
File Upload (Default)
- Upload an Excel file using the file browser (supports .xlsx, .xls)
- Upload a CSV file
- Use pre-loaded R data (if you passed the data parameter)

For Excel files with multiple sheets, you can select which sheet to use. The app will display a preview of your uploaded data so you can verify it was read correctly.
Text Input (Copy-Paste) - NEW
For quick standardization of a few names, or when you have a list copied from another source, use the Text input method:

- Select “Text input (paste/type)” from the input method radio buttons
- Paste or type your taxonomic names in the text area
- Click “Load names” to process the input
Accepted separators:
- One name per line (recommended)
- Comma-separated: Lophira alata, Terminalia superba, Aucoumea klaineana
- Semicolon-separated: Lophira alata; Terminalia superba; Aucoumea klaineana
- Tab-separated (useful when pasting from Excel)
The app automatically:
- Removes empty lines and extra whitespace
- Removes duplicate names (preserving order)
- Creates a single column named taxon_name for matching
This method is ideal for:
- Quick checks of a few species names
- Pasting lists from emails or documents
- Testing the app without preparing a file
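The cleaning steps listed above can be sketched in base R. This is only an illustration of the described behavior, not the app's actual code: parse_pasted_names() is a hypothetical helper name, and the exact splitting rules the app applies may differ.

```r
# Hypothetical helper (not part of the package): turn pasted text into a
# one-column data frame of unique names.
parse_pasted_names <- function(text) {
  # Split on newlines, commas, semicolons, or tabs
  parts <- unlist(strsplit(text, "[\r\n,;\t]+"))
  # Trim surrounding whitespace
  parts <- trimws(parts)
  # Drop empty entries, then remove duplicates while keeping first-seen order
  parts <- parts[nzchar(parts)]
  data.frame(taxon_name = unique(parts), stringsAsFactors = FALSE)
}

pasted <- "Lophira alata, Terminalia superba;Aucoumea klaineana\n\nLophira alata\n"
names_df <- parse_pasted_names(pasted)
print(names_df)
```

The resulting data frame has the same shape as described for the text input method: a single taxon_name column that could also be passed through the data and name_column arguments.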
Phase 3: Select Name Column(s)
Once data is loaded, you have two options for selecting taxonomic names:
Single Column Mode (Default)
Select one column containing the full taxonomic name:

The dropdown menu shows all available columns from your dataset. Choose the one containing species names (typically formatted as “Genus species” or “Genus species Author”).
Multiple Column Mode (NEW)
If your data has separate columns for genus, species, and family, enable “Use multiple columns”:

The app will automatically combine these columns into a single taxonomic name for matching, using a hierarchical approach:
- If genus and species are available: “Genus species”
- If only genus: “Genus”
- If only family: “Family”
You can also optionally include an author column.
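A minimal sketch of the hierarchical rule, assuming missing values are either NA or blank strings. The helper name combine_name_columns() and the toy data are illustrative only; the app's own implementation is not shown in this document.

```r
# Hypothetical helper (not part of the package): build one name per row,
# falling back from "Genus species" to genus to family.
combine_name_columns <- function(genus, species, family) {
  # Treat NA and blank strings as missing
  has <- function(x) !is.na(x) & nzchar(trimws(x))
  ifelse(
    has(genus) & has(species), paste(trimws(genus), trimws(species)),
    ifelse(has(genus), trimws(genus),
           ifelse(has(family), trimws(family), NA_character_))
  )
}

taxa <- data.frame(
  genus   = c("Lophira", "Terminalia", NA),
  species = c("alata", NA, NA),
  family  = c("Ochnaceae", "Combretaceae", "Burseraceae"),
  stringsAsFactors = FALSE
)
taxa$taxon_name <- combine_name_columns(taxa$genus, taxa$species, taxa$family)
print(taxa)
```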
Phase 4: Automatic Matching
Click the “Start Matching” button to begin the automatic matching process. The app uses a five-tier matching strategy:
- Exact match on species: Direct lookup of full name (genus + species)
- Exact match on genus: Match at genus level
- Exact match on family: Match at family level
- Exact match on class: Match at higher taxonomic level
- Fuzzy matching: Approximate string matching for remaining names
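To make the tier order concrete, here is a toy sketch in base R. It rests on several assumptions: the backbone is a small hand-made data frame, exact tiers are plain string lookups, and similarity is derived from Levenshtein distance via adist(). The app's real lookup logic and similarity metric are not documented here and may differ.

```r
# Toy backbone; names and levels are for illustration only
backbone <- data.frame(
  name  = c("Lophira alata", "Lophira", "Ochnaceae", "Terminalia superba"),
  level = c("species", "genus", "family", "species"),
  stringsAsFactors = FALSE
)

# Similarity in [0, 1] from Levenshtein distance (an assumed metric)
similarity <- function(a, b) {
  1 - as.vector(adist(a, b)) / pmax(nchar(a), nchar(b))
}

match_one <- function(name, backbone, min_similarity = 0.7) {
  # Tiers 1-4: exact lookups, tried from the most specific level upwards
  for (lvl in c("species", "genus", "family", "class")) {
    hit <- backbone$name[backbone$level == lvl & backbone$name == name]
    if (length(hit) > 0) {
      return(data.frame(input = name, matched_name = hit[1],
                        match_method = paste0("exact_", lvl), match_score = 1,
                        stringsAsFactors = FALSE))
    }
  }
  # Tier 5: fuzzy match against every backbone name
  scores <- similarity(name, backbone$name)
  best <- which.max(scores)
  if (scores[best] >= min_similarity) {
    return(data.frame(input = name, matched_name = backbone$name[best],
                      match_method = "fuzzy", match_score = scores[best],
                      stringsAsFactors = FALSE))
  }
  # Nothing passed the threshold
  data.frame(input = name, matched_name = NA_character_,
             match_method = "unmatched", match_score = NA_real_,
             stringsAsFactors = FALSE)
}

inputs <- c("Lophira alata", "Ochnaceae", "Terminalia suprba", "Quercus robur")
results <- do.call(rbind, lapply(inputs, match_one, backbone = backbone))
print(results)
```

Trying exact tiers first keeps the expensive fuzzy comparison for the names that actually need it, which is why only the misspelled and the absent name reach the last tier in this example.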

The progress bar shows real-time status. For large datasets, this may take a few minutes. The sidebar displays live statistics:
- Number of exact matches
- Number of genus-level matches
- Number of fuzzy matches
- Number of unmatched names
Phase 5: Review Match Results
After matching completes, the Auto Match tab shows a summary table with all names and their match status:

The results table includes:
- Original name: Your input name
- matched_name: Name found in backbone
- match_method: How it was matched (exact_species, exact_genus, exact_family, fuzzy, manual, unresolved)
- match_score: Similarity score (0-1, higher is better)
- idtax_n: Taxon ID in database
- is_synonym: Whether matched name is a synonym
- accepted_name: Current accepted name (if synonym)
Match quality indicators:
- Exact match (1.0): Perfect match, no review needed
- High similarity (>0.8): Very likely correct, quick review recommended
- Medium similarity (0.5-0.8): Possible match, review suggested
- Low similarity (<0.5): Uncertain, manual review required
- No match: Requires manual selection
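If you work with exported results in R, the bands above can be reproduced with a small helper. This is a sketch: classify_match() is a hypothetical name, missing scores are assumed to mean "no match", and a score of exactly 0.8 is placed in the medium band because the list gives that band as 0.5-0.8.

```r
# Hypothetical helper: label scores with the review categories listed above
classify_match <- function(score) {
  out <- rep("No match", length(score))
  ok <- !is.na(score)
  out[ok & score < 0.5] <- "Low"
  out[ok & score >= 0.5 & score <= 0.8] <- "Medium"
  out[ok & score > 0.8 & score < 1] <- "High"
  out[ok & score >= 1] <- "Exact"
  out
}

scores <- c(1, 0.93, 0.8, 0.62, 0.41, NA)
quality <- classify_match(scores)
print(data.frame(match_score = scores, quality = quality))
```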
Phase 6: Manual Review
For unmatched or uncertain names, switch to the “Review” tab to manually review and select matches:

The review interface provides two ways to find matches:
Fuzzy Suggestions Panel
Shows automatic suggestions ranked by similarity with advanced filtering options:

Filtering options:
- Number of suggestions: Slider to show 5-30 suggestions
- Minimum similarity: Adjust threshold (0.3-1.0)
- Taxonomic level filter: Filter by All, Species, Genus, Family, Order, Class, or Infraspecific
- Sort by: Similarity score or alphabetical order
Each suggestion card displays:
- Name with color-coded similarity badge (green = high, blue = medium, yellow = low)
- Taxonomic level and family
- Synonym information if applicable
- Select button for one-click acceptance
Manual Search Panel
For names without good suggestions, use the manual search:

- Type any search term to query the taxonomic backbone
- Filter results by taxonomic level
- View detailed information for each match
- Select the correct match or mark as “unresolved”
Navigation:
- Use Previous/Skip/Next buttons to browse unmatched names
- Progress counter shows reviewed vs. remaining names
- The app remembers your selections and automatically updates the results
Phase 7: Enrich Data with Traits
Switch to the “Traits Enrichment” tab to add species-level traits to your matched data:

Options:
- Categorical aggregation mode:
  - “mode” - Use most frequent value per taxon
  - “concat” - Concatenate all unique values
- Select columns to include:
  - Original input names
  - Corrected names
  - Taxonomic IDs
  - Match metadata
Available traits include:
- Growth form
- Wood density
- Leaf traits
- Ecological characteristics
The enriched data combines your matched taxa with selected traits:

Note: The enriched export creates one row per unique taxon, not per input row. Input names are concatenated with pipe separators.
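The two aggregation modes and the one-row-per-taxon layout can be illustrated with toy data. Everything here is an assumption made for the example: the IDs and trait values are invented, the helper names are hypothetical, ties in "mode" go to the alphabetically first value, and the ", " separator for "concat" is a guess (the document only specifies the pipe separator for input names).

```r
# Toy matched data: several input rows can point at the same taxon
matched <- data.frame(
  input_name = c("Lophira alata", "Lophira alata Banks", "Terminalia superba"),
  idtax_n    = c(101, 101, 202),  # made-up IDs
  stringsAsFactors = FALSE
)
# Toy trait records: several observations per taxon
traits <- data.frame(
  idtax_n     = c(101, 101, 101, 202),
  growth_form = c("tree", "tree", "shrub", "tree"),
  stringsAsFactors = FALSE
)

# "mode": most frequent value; "concat": all unique values joined
aggregate_categorical <- function(x, how = c("mode", "concat")) {
  how <- match.arg(how)
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_character_)
  if (how == "mode") {
    names(which.max(table(x)))
  } else {
    paste(unique(x), collapse = ", ")
  }
}

# One row per unique taxon, with input names joined by a pipe
enrich <- function(matched, traits, how = "mode") {
  ids <- unique(matched$idtax_n)
  data.frame(
    idtax_n = ids,
    input_names = vapply(ids, function(i) {
      paste(unique(matched$input_name[matched$idtax_n == i]), collapse = "|")
    }, character(1)),
    growth_form = vapply(ids, function(i) {
      aggregate_categorical(traits$growth_form[traits$idtax_n == i], how)
    }, character(1)),
    stringsAsFactors = FALSE
  )
}

by_mode <- enrich(matched, traits, how = "mode")
by_concat <- enrich(matched, traits, how = "concat")
print(by_mode)
print(by_concat)
```

Because rows are collapsed per taxon, the enriched table cannot be bound column-wise to the original input; join it back on the taxon ID instead.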
Phase 8: Export Results
Switch to the “Export” tab to download your standardized dataset:

Available formats:
- Excel (.xlsx): Best for sharing with collaborators
- CSV (.csv): Universal tabular format
- RDS (.rds): R-native format preserving data types
Selectable columns:
- Original data (all your input columns)
- Matched IDs (idtax_n, idtax_good_n)
- Corrected names (corrected_name, matched_name)
- Match metadata (match_method, match_score, is_synonym, accepted_name)
A preview table shows the data before export with pagination controls.
Understanding Output Columns
The app adds these columns to your data:
| Column | Description |
|---|---|
| idtax_n | Matched taxon ID in backbone database |
| idtax_good_n | Accepted taxon ID (for synonyms) |
| matched_name | Name found in backbone |
| corrected_name | Final standardized name |
| match_method | Matching strategy used (exact_species, exact_genus, exact_family, fuzzy, manual, unresolved) |
| match_score | Similarity score (0-1) |
| is_synonym | TRUE if matched name is a synonym |
| accepted_name | Current accepted name (if synonym) |
| family | Taxonomic family |
| genus | Taxonomic genus |
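Once exported, these columns make it easy to summarize a run in R. The sketch below uses a hand-made data frame with a subset of the documented column names; all values are invented, and the assumption that unresolved names carry NA in matched_name and match_score is mine, not the package's.

```r
# Toy export using documented column names; all values are made up
results <- data.frame(
  species_name = c("Lophira alata", "Terminalia suprba", "Unknown sp."),
  matched_name = c("Lophira alata", "Terminalia superba", NA),
  match_method = c("exact_species", "fuzzy", "unresolved"),
  match_score  = c(1, 0.94, NA),
  is_synonym   = c(FALSE, FALSE, NA),
  stringsAsFactors = FALSE
)

# How many names were resolved by each strategy?
method_counts <- table(results$match_method)
print(method_counts)

# Rows worth a second look: unresolved names or fuzzy matches
needs_review <- results[results$match_method %in% c("fuzzy", "unresolved"), ]
print(needs_review[, c("species_name", "matched_name", "match_score")])
```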
Advanced Options
Language Selection
The app now supports bilingual operation with French and English interfaces. French is the default language.
In the App Interface:
A language toggle is located in the top-right corner of the app:
- Click “FR” for French interface
- Click “EN” for English interface
The language switch is instant and affects all UI elements including:
- Tab labels
- Button text
- Instructions and help text
- Column headers
- Error messages and notifications
Setting Initial Language Programmatically:
# Launch app in English
launch_taxonomic_match_app(language = "en")
# Launch app in French (default)
launch_taxonomic_match_app(language = "fr")
# or simply:
launch_taxonomic_match_app()
The language setting is interactive - users can switch languages at any time during their session without losing work progress or data.
Adjusting Fuzzy Matching
Control matching sensitivity with the min_similarity parameter:
# Very strict - only high-quality matches
launch_taxonomic_match_app(min_similarity = 0.8)
# Default setting
launch_taxonomic_match_app(min_similarity = 0.7)
# More permissive - allows lower-quality matches
launch_taxonomic_match_app(min_similarity = 0.5)
Lower values cast a wider net but may include false positives. Higher values are more conservative but may miss valid matches.
Increasing Suggestions
Show more fuzzy match suggestions per name:
# Show top 20 suggestions instead of default 10
launch_taxonomic_match_app(max_suggestions = 20)
Useful when initial suggestions don’t include the correct match. You can also adjust this interactively in the Review tab using the slider.
Function Parameters
launch_taxonomic_match_app(
  data = NULL,            # Optional: pre-load data.frame
  name_column = NULL,     # Optional: pre-select column
  min_similarity = 0.7,   # Fuzzy match threshold (0-1)
  max_suggestions = 10,   # Max suggestions per unmatched name
  language = "fr"         # Interface language: "fr" (default) or "en"
)
Troubleshooting
Connection Issues
Problem: “Failed to connect to database”
Solution:
# Check connection
db_diagnostic()
# Reset credentials if needed
remove_db_credentials()
setup_db_credentials()
No Fuzzy Matches Found
Problem: No suggestions appear for unmatched names
Possible causes:
- min_similarity threshold too high
- Taxonomic names contain typos or non-standard formatting
- Names not present in the taxonomic backbone (e.g., non-African taxa)
Solutions:
- Lower min_similarity: launch_taxonomic_match_app(min_similarity = 0.5)
- Use the taxonomic level filter to search at genus or family level
- Clean input names (remove extra spaces, fix obvious typos)
- Verify names are African taxa
Slow Matching Performance
Problem: Matching takes very long for large datasets
Solutions:
- Use batch processing instead: match_taxonomic_names() for programmatic workflow
- Process data in chunks (split large datasets)
- The app downloads the entire backbone once for efficiency, so initial load may be slow
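Chunked processing can be sketched generically in base R. The helper process_in_chunks() and the stand-in dummy_matcher() are hypothetical; in practice you would pass your own wrapper around the real matching call, whose return shape is assumed here to be one data frame per chunk.

```r
# Hypothetical helper: apply a matching function to a vector in chunks
process_in_chunks <- function(x, chunk_size, fn) {
  # Assign each element to a chunk: 1, 1, ..., 2, 2, ...
  groups <- ceiling(seq_along(x) / chunk_size)
  chunks <- split(x, groups)
  # Run fn on each chunk and stack the per-chunk data frames in order
  out <- do.call(rbind, lapply(chunks, fn))
  rownames(out) <- NULL
  out
}

# Stand-in for a real matcher; it only echoes the input so the sketch runs
# without a database connection
dummy_matcher <- function(names) {
  data.frame(input = names, n_chars = nchar(names), stringsAsFactors = FALSE)
}

all_names <- c("Lophira alata", "Terminalia superba", "Aucoumea klaineana",
               "Milicia excelsa", "Khaya ivorensis")
out <- process_in_chunks(all_names, chunk_size = 2, fn = dummy_matcher)
print(out)
```

Matching only the unique names and joining the results back afterwards is another way to reduce work when the same species appears on many rows.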
When to Use the App vs. Programmatic Approach
Use the Shiny App when:
- Exploring data interactively
- You prefer visual interfaces
- Dataset is small to medium size (<5,000 rows)
- Need to manually review uncertain matches
- Learning the matching process
Use match_taxonomic_names() when:
- Processing large datasets (>5,000 rows)
- Automating workflows in scripts
- Integrating with data pipelines
- Reproducibility is critical (always keep the column that contains the original name)
- Batch processing multiple files
Example programmatic approach:
# Load data
my_data <- read.csv("tree_inventory.csv")
# Match names
matched <- match_taxonomic_names(
  names = my_data$species_name,
  min_similarity = 0.7
)
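# Merge back with original data
# (cbind() only lines up correctly if the result has one row per input
# name, in the same order as my_data)
result <- cbind(my_data, matched)
# Export
write.csv(result, "standardized_inventory.csv", row.names = FALSE)
See Also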
- match_taxonomic_names(): Underlying matching function for programmatic use
- query_taxa(): Query taxonomic backbone directly
- match_tax(): Simple taxonomic lookup function
- vignette("using-query-plots"): Guide to querying plot data
Tips for Best Results
- Clean your data first: Remove obvious typos, extra whitespace, and special characters
- Understand your data: Know which taxonomic groups are in your dataset
- Use multi-column mode: If you have separate genus/species/family columns, combine them for better matching
- Filter by taxonomic level: Use the level filter in Review tab to find genus or family matches
- Review match scores: Don’t blindly accept low-similarity matches (<0.6)
- Save incrementally: Export intermediate results to avoid losing manual review work
- Document parameters: Note which min_similarity value you used for reproducibility
Example Workflow
Here’s a complete workflow from start to finish:
# 1. Load your data
trees <- read.csv("forest_inventory.csv")
# Columns: plot_id, tree_number, species_name, dbh, height
# 2. Launch app with data
launch_taxonomic_match_app(
  data = trees,
  name_column = "species_name",
  min_similarity = 0.7
)
# 3. In the app:
# - Review automatic matches in Auto Match tab
# - Use Review tab to resolve unmatched names
# - Apply taxonomic level filters if needed
# - Optionally enrich with traits in Traits Enrichment tab
# - Export as "forest_inventory_standardized.xlsx"
# 4. Continue analysis with standardized data
standardized <- readxl::read_excel("forest_inventory_standardized.xlsx")
# Now you have clean taxonomic IDs for further analysis!
This workflow ensures your taxonomic data is standardized and ready for downstream analyses like diversity metrics, trait-based analyses, or database integration.
Suggested Screenshots
To complete this documentation, the following screenshots should be captured:
- app-initial-view.png - Full app interface after launch with all tabs visible
- app-upload-data.png - Data upload panel with file browser and sheet selection
- app-text-input.gif - Text input interface with text area and “Load names” button (NEW)
- app-column-select.png - Single column selection dropdown
- app-column-select-multi.png - Multiple column mode with genus/species/family selectors
- app-matching-progress.png - Matching in progress with progress bar and live statistics
- app-matching-results.png - Results table in Auto Match tab showing matched names
- app-review-interface.png - Review tab overview with unmatched name display
- app-review-suggestions.png - Fuzzy suggestions panel with filtering options (level filter, sort, slider)
- app-review-manual-search.png - Manual search interface with search box and results
- app-enrich-data-interface.png - Traits enrichment tab with aggregation mode and column selection
- app-enrich-data-results.png - Preview of enriched data with traits
- app-export-options.png - Export tab with format selection and column checkboxes