Taxonomy
pyinaturalist_convert.taxonomy
Helper utilities for navigating tabular taxonomy data as a tree and adding derived information to it.
- Extra dependencies:
polars, sqlalchemy
Example:
>>> from pyinaturalist_convert import load_dwca_tables, aggregate_taxon_db
>>> load_dwca_tables()
>>> aggregate_taxon_db()
Main functions:
aggregate_taxon_db() | Add aggregate and hierarchical values to the taxon database
get_observation_taxon_counts() | Get taxon counts based on GBIF export (exact rank counts only, no aggregate counts)
- class pyinaturalist_convert.taxonomy.LoggerProgress
Bases: object
Base class for progress display. Just logs messages to a logger, with placeholders for progress bars.
- advance(name, amount=1)
- log(message)
- start(total)
- start_task(name, total, description='')
- stop()
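The method list above suggests a simple lifecycle: start the display, register a task, advance it, optionally log messages, then stop. The following is a minimal sketch of that call order only. `SketchProgress`, its `totals` and `completed` attributes, and the task name `"ancestors"` are hypothetical stand-ins that mirror the listed signatures; this is not the library's implementation.

```python
import logging

logger = logging.getLogger("progress_sketch")


class SketchProgress:
    """Hypothetical stand-in that mirrors the method signatures listed above."""

    def __init__(self):
        self.totals = {}     # task name -> expected number of steps
        self.completed = {}  # task name -> steps completed so far

    def start(self, total):
        logger.info("Starting %d task(s)", total)

    def start_task(self, name, total, description=""):
        self.totals[name] = total
        self.completed[name] = 0
        logger.info("Task %s: %s", name, description)

    def advance(self, name, amount=1):
        self.completed[name] += amount

    def log(self, message):
        logger.info(message)

    def stop(self):
        logger.info("Done")


# Assumed call order: start, start_task, advance (repeatedly), log, stop
progress = SketchProgress()
progress.start(total=1)
progress.start_task("ancestors", total=3, description="Finding ancestors")
for _ in range(3):
    progress.advance("ancestors")
progress.log("Finished ancestors")
progress.stop()
```

Because the base class only logs, a caller written against these five methods should work unchanged with a subclass that renders progress bars.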
- class pyinaturalist_convert.taxonomy.RichProgress
Bases: LoggerProgress
Container for multiprocessing queues used for progress reporting.
- advance(name, amount=1)
Advance progress for a task.
- log(message)
Send a log message to the progress display.
- start(total=1)
Start the progress display process.
- start_task(name, total, description='')
Register a new task with the progress display.
- stop()
Stop the progress display process.
- pyinaturalist_convert.taxonomy.aggregate_taxon_db(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), backup_path=PosixPath('/home/docs/.local/share/pyinaturalist/taxon_aggregates.parquet'), common_names_path=PosixPath('/home/docs/.local/share/pyinaturalist/inaturalist-taxonomy.dwca/VernacularNames-english.csv'), max_workers=None, progress_bars=True)
Add aggregate and hierarchical values to the taxon database:
Ancestor IDs
Child IDs
Iconic taxon ID
Aggregated observation taxon counts
Aggregated leaf taxon counts
Common names
Requires GBIF datasets to be downloaded and processed first.
- Parameters:
backup_path (Path|str) – Path to save a minimal copy of aggregate values
common_names_path (Path|str) – Path to a CSV file containing taxon common names. See the DwC-A taxonomy dataset for available languages.
max_workers (Optional[int]) – Max worker processes for parallel aggregation (None = cpu_count)
progress_bars (bool) – Show detailed progress bars in addition to log output
- Return type:
DataFrame
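To make the aggregate values listed above more concrete, here is a minimal pure-Python sketch of ancestor IDs, child IDs, and aggregated counts on a toy taxonomy. The data and the helper names (`PARENTS`, `EXACT_COUNTS`, `ancestor_ids`, `child_ids`, `aggregate`) are hypothetical and only illustrate the idea; the library's own implementation, and its exact definition of a leaf taxon count, may differ. The sketch assumes a leaf taxon counts as one leaf for itself.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: taxon ID -> parent ID (None for the root)
PARENTS = {1: None, 2: 1, 3: 2, 4: 2, 5: 1}
# Hypothetical exact-rank observation counts per taxon ID
EXACT_COUNTS = {1: 0, 2: 1, 3: 10, 4: 5, 5: 2}


def ancestor_ids(taxon_id, parents):
    """Return ancestor IDs ordered from the root down to the direct parent."""
    ancestors = []
    parent = parents[taxon_id]
    while parent is not None:
        ancestors.append(parent)
        parent = parents[parent]
    return ancestors[::-1]


def child_ids(parents):
    """Map each taxon ID to a sorted list of its direct child IDs."""
    children = defaultdict(list)
    for taxon_id, parent in parents.items():
        if parent is not None:
            children[parent].append(taxon_id)
    return {taxon_id: sorted(children[taxon_id]) for taxon_id in parents}


def aggregate(parents, exact_counts):
    """Return (observation totals, leaf totals) per taxon, including descendants."""
    children = child_ids(parents)
    obs_totals, leaf_totals = {}, {}

    def visit(taxon_id):
        kids = children[taxon_id]
        for kid in kids:
            visit(kid)
        own = exact_counts.get(taxon_id, 0)
        obs_totals[taxon_id] = own + sum(obs_totals[k] for k in kids)
        # Assumption: a leaf counts as 1; inner nodes sum their children's leaves
        leaf_totals[taxon_id] = sum(leaf_totals[k] for k in kids) if kids else 1

    for root in (t for t, p in parents.items() if p is None):
        visit(root)
    return obs_totals, leaf_totals


obs, leaves = aggregate(PARENTS, EXACT_COUNTS)
print(ancestor_ids(3, PARENTS))  # [1, 2]
print(obs[2], leaves[2])         # 16 2
print(obs[1], leaves[1])         # 18 3
```

The post-order traversal visits children before their parent, so each taxon's totals are computed exactly once from totals that are already final.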
- pyinaturalist_convert.taxonomy.get_observation_taxon_counts(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))
Get taxon counts based on GBIF export (exact rank counts only, no aggregate counts)
- pyinaturalist_convert.taxonomy.update_taxon_agg(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), agg_path=PosixPath('/home/docs/.local/share/pyinaturalist/taxon_aggregates.parquet'))
Update an existing taxon database with new aggregate values
- Return type:
DataFrame