DB

pyinaturalist_convert.db

Data models and utilities for storing observations and related data in a relational database, including SQLite, PostgreSQL, and any other database supported by SQLAlchemy.

These models contain a relevant subset of columns common to most iNat data sources, suitable for combining data from API results, CSV export, DwC-A, and/or inaturalist-open-data.

Some helper functions are included for the most common cases of saving and loading taxon and observation data. Requirements for a relational database are highly variable, so this won’t suit all use cases, but at least provides a starting point.

Extra dependencies: sqlalchemy

Example:

>>> from pyinaturalist import iNatClient
>>> from pyinaturalist_convert import create_tables, read_observations, save_observations

>>> # Fetch all of your own observations
>>> client = iNatClient()
>>> observations = client.observations.search(user_id='my_username').all()

>>> # Save to a SQLite database
>>> create_tables('observations.db')
>>> save_observations(observations, 'observations.db')

>>> # Read them back from the database
>>> for observation in get_db_observations('observations.db'):
...    print(observation)

Main functions:

create_tables

Create all tables defined in the registry in a SQLite database, if they don't already exist; and optionally create secondary indexes, if they don't already exist.

migrate

Apply all pending database migrations to upgrade the schema to the latest version.

get_db_observations

Load observation records and associated taxa from SQLite

get_db_taxa

Load taxon records from SQLite

save_observations

Save Observation objects (and associated taxa and photos) to SQLite

save_taxa

Save Taxon objects (plus ancestors and children, if available) to SQLite

Models:

DbObservation

Intermediate data model for persisting Observation data to a relational database

DbPhoto

Intermediate data model for persisting Photo data to a relational database

DbTaxon

Intermediate data model for persisting Taxon data to a relational database.

DbUser

Intermediate data model for persisting User data to a relational database

pyinaturalist_convert.db.create_table(model, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), indexes=True)

Create a single table for the specified model, if it doesn’t already exist; and optionally create secondary indexes, if they don’t already exist.

Parameters:
  • model – SQLAlchemy model class

  • db_path (Path | str) – Path to SQLite database file

  • indexes (bool) – Whether to create secondary indexes (in addition to the primary key index)

pyinaturalist_convert.db.create_tables(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), indexes=True)

Create all tables defined in the registry in a SQLite database, if they don’t already exist; and optionally create secondary indexes, if they don’t already exist.

Also adds the current alembic head revision, so future migrations can be applied.

Parameters:
  • db_path (Path | str) – Path to SQLite database file

  • indexes (bool) – Whether to create secondary indexes (in addition to the primary key index)

pyinaturalist_convert.db.get_alembic_config(db_path)

Get alembic config for the specified SQLite database

pyinaturalist_convert.db.get_db_observations(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), ids=None, username=None, limit=None, page=None, order_by_created=False, order_by_observed=False)

Load observation records and associated taxa from SQLite

Return type:

Iterator[Observation]

pyinaturalist_convert.db.get_db_taxa(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), ids=None, accept_partial=True, limit=200)

Load taxon records from SQLite

Return type:

Iterator[Taxon]

pyinaturalist_convert.db.get_session(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Get a SQLAlchemy session for a SQLite database

Return type:

Session

pyinaturalist_convert.db.migrate(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Apply all pending database migrations to upgrade the schema to the latest version.

This is an alternative to :py create_tables() that handles incremental schema changes. It also handles previously created tables with no alembic revision state.

pyinaturalist_convert.db.save_observations(observations, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Save Observation objects (and associated taxa and photos) to SQLite

pyinaturalist_convert.db.save_taxa(taxa, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Save Taxon objects (plus ancestors and children, if available) to SQLite

class pyinaturalist_convert._models.DbObservation(id, captive=None, created_at=None, description=None, identifications_count=0, geoprivacy=None, latitude=None, longitude=None, license_code=None, observed_on=None, place_guess=None, place_ids=None, positional_accuracy=None, quality_grade=None, taxon_id=None, updated_at=None, user_id=None, user_login=None, uuid=None, annotations=None, comments=None, identifications=None, ofvs=None, tags=None)

Bases: object

Intermediate data model for persisting Observation data to a relational database

Notes

  • Datetimes are stored as strings, since SQLAlchemy DateTime doesn’t handle timezone offsets from SQLite.

  • Nested collections (annotations, comments, IDs, OFVs, tags) are stored as denormalized JSON fields rather than in separate tables, since current use cases for this don’t require a full relational structure.

  • Some data sources may provide a count of identifications, but not full identification details. For that reason, a separate identifications_count column is added.

annotations
captive
comments
created_at
description
classmethod from_model(obs, skip_taxon=False)
Return type:

DbObservation

geoprivacy
id
identifications
identifications_count
latitude
license_code
longitude
observed_on
ofvs
photos
place_guess
place_ids
positional_accuracy
quality_grade
property sorted_photos

Get photos sorted by original position in the observation

tags
taxon
taxon_id
to_model()
Return type:

Observation

updated_at
user
user_id
user_login
uuid
class pyinaturalist_convert._models.DbPhoto(id, extension=None, file_path=None, height=None, license=None, observation_id=None, observation_uuid=None, original_filename=None, position=None, url=None, user_id=None, width=None, uuid=None)

Bases: object

Intermediate data model for persisting Photo data to a relational database

extension
file_path
classmethod from_model(photo, **kwargs)
Return type:

DbPhoto

height
id
license
observation
observation_id
observation_uuid
original_filename
position
to_model()
Return type:

Photo

url
user_id
uuid
width
class pyinaturalist_convert._models.DbTaxon(id, ancestor_ids=None, child_ids=None, conservation_status=None, establishment_means=None, iconic_taxon_id=0, is_active=None, leaf_taxa_count=0, observations_count=0, observations_count_rg=0, name=None, parent_id=None, partial=False, photo_urls=None, preferred_common_name=None, rank=None, reference_url=None, wikipedia_summary=None, wikipedia_url=None)

Bases: object

Intermediate data model for persisting Taxon data to a relational database.

Since different data sources provide different levels of detail, a partial field is added that indicates that some fields are missing, and can be fetched from the API if needed. As an example, this helps distinguish between taxa with no children and taxa with unlisted children.

ancestor_ids
child_ids
conservation_status
establishment_means
classmethod from_model(taxon)
Return type:

DbTaxon

iconic_taxon_id
id
is_active
leaf_taxa_count
name
observations_count
observations_count_rg
parent_id
partial
photo_urls
preferred_common_name
rank
reference_url
to_model()
Return type:

Taxon

update(taxon)

Merge new values into an existing record

wikipedia_summary
wikipedia_url
class pyinaturalist_convert._models.DbUser(id, login=None, name=None)

Bases: object

Intermediate data model for persisting User data to a relational database

classmethod from_model(user, **kwargs)
Return type:

DbUser

id
login
name
to_model()
Return type:

User

pyinaturalist_convert._models.sa_field(col_type, index=False, primary_key=False, **kwargs)

Get a dataclass field with SQLAlchemy column metadata