Taxonomy

pyinaturalist_convert.taxonomy

Helper utilities for navigating tabular taxonomy data as a tree and adding derived information to it.

Extra dependencies:

pandas
sqlalchemy

Example:

>>> from pyinaturalist_convert import load_dwca_tables, aggregate_taxon_db
>>> load_dwca_tables()
>>> aggregate_taxon_db()

Main functions:

`aggregate_taxon_db`	Add aggregate and hierarchical values to the taxon database
`get_observation_taxon_counts`	Get taxon counts based on GBIF export (exact rank counts only, no aggregate counts)

pyinaturalist_convert.taxonomy.aggregate_taxon_db(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), counts_path=PosixPath('/home/docs/.local/share/pyinaturalist/taxon_counts.parquet'), common_names_path=PosixPath('/home/docs/.local/share/pyinaturalist/inaturalist-taxonomy.dwca/VernacularNames-english.csv'), progress_bars=True)

Add aggregate and hierarchical values to the taxon database:

Ancestor IDs
Child IDs
Iconic taxon ID
Aggregated observation taxon counts
Aggregated leaf taxon counts
Common names

Requires GBIF datasets to be downloaded and processed first.
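To make the derived values listed above more concrete, here is a minimal pure-Python sketch of how ancestor IDs, child IDs, aggregated observation counts, and aggregated leaf counts could be computed from a parent/child table. It is not the library's implementation: the toy data, the function names, and the choice to count a leaf taxon as one leaf of itself are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: taxon ID -> parent ID (None for the root)
PARENTS = {1: None, 2: 1, 3: 1, 4: 2, 5: 2}
# Hypothetical exact-rank observation counts per taxon ID
EXACT_COUNTS = {1: 0, 2: 1, 3: 7, 4: 10, 5: 3}


def get_ancestor_ids(taxon_id, parents):
    """Return ancestor IDs ordered from the root down to the direct parent."""
    ancestors = []
    parent = parents[taxon_id]
    while parent is not None:
        ancestors.append(parent)
        parent = parents[parent]
    return ancestors[::-1]


def get_child_ids(parents):
    """Invert the parent mapping into taxon ID -> list of direct child IDs."""
    children = defaultdict(list)
    for taxon_id, parent in parents.items():
        if parent is not None:
            children[parent].append(taxon_id)
    return dict(children)


def aggregate_counts(parents, exact_counts):
    """Add each taxon's exact count to itself and to all of its ancestors."""
    totals = defaultdict(int)
    for taxon_id, count in exact_counts.items():
        totals[taxon_id] += count
        for ancestor_id in get_ancestor_ids(taxon_id, parents):
            totals[ancestor_id] += count
    return dict(totals)


def aggregate_leaf_counts(parents):
    """Count the leaf taxa (taxa with no children) at or below each taxon."""
    children = get_child_ids(parents)
    leaf_counts = defaultdict(int)
    for taxon_id in parents:
        if taxon_id not in children:
            # Assumption: a leaf counts as one leaf of itself
            leaf_counts[taxon_id] += 1
            for ancestor_id in get_ancestor_ids(taxon_id, parents):
                leaf_counts[ancestor_id] += 1
    return dict(leaf_counts)
```

With this toy data, `aggregate_counts(PARENTS, EXACT_COUNTS)` gives the root taxon a total of 21 (the sum of all exact counts), while taxon 2 gets 14 (its own count plus those of taxa 4 and 5).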

Parameters:

db_path (Union[Path, str]) – Path to SQLite database
counts_path (Union[Path, str]) – Path to optionally save a copy of observation taxon counts
common_names_path (Union[Path, str]) – Path to a CSV file containing taxon common names. See the DwC-A taxonomy dataset for available languages.
progress_bars (bool) – Show detailed progress bars in addition to log output

Return type:

DataFrame

pyinaturalist_convert.taxonomy.get_observation_taxon_counts(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Get taxon counts based on GBIF export (exact rank counts only, no aggregate counts)

Return type:

Dict[int, int]
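As a rough illustration of what "exact rank counts only" means, the sketch below counts observations per taxon ID in a SQLite database without rolling any counts up to ancestor taxa. It uses only the standard library; the table name `observation`, the column name `taxon_id`, and the helper `count_taxa_exact` are assumptions for this example and may not match the library's actual schema or code.

```python
import sqlite3
from typing import Dict


def count_taxa_exact(conn: sqlite3.Connection) -> Dict[int, int]:
    """Count observations per taxon ID, without rolling counts up to ancestors."""
    # Assumed schema: a table named `observation` with a `taxon_id` column
    rows = conn.execute(
        "SELECT taxon_id, COUNT(*) FROM observation "
        "WHERE taxon_id IS NOT NULL GROUP BY taxon_id"
    )
    return {int(taxon_id): int(count) for taxon_id, count in rows}


# Build a small in-memory database with made-up rows to exercise the query
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observation (id INTEGER PRIMARY KEY, taxon_id INTEGER)")
conn.executemany(
    "INSERT INTO observation (taxon_id) VALUES (?)",
    [(4,), (4,), (5,), (3,), (None,)],
)
counts = count_taxa_exact(conn)
```

Here `counts` maps each taxon ID to the number of observations identified at exactly that taxon; observations with no taxon are skipped.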