DB

pyinaturalist_convert.db

Data models and utilities for storing observations and related data in a relational database, including SQLite, PostgreSQL, and any other database supported by SQLAlchemy.

These models contain a relevant subset of columns common to most iNat data sources, suitable for combining data from API results, CSV export, DwC-A, and/or inaturalist-open-data.

Some helper functions are included for the most common cases of saving and loading taxon and observation data. Requirements for a relational database are highly variable, so this won’t suit all use cases, but at least provides a starting point.

Extra dependencies: sqlalchemy

Example:

>>> from pyinaturalist import iNatClient
>>> from pyinaturalist_convert import create_tables, read_observations, save_observations

>>> # Fetch all of your own observations
>>> client = iNatClient()
>>> observations = client.observations.search(user_id='my_username').all()

>>> # Save to a SQLite database
>>> create_tables('observations.db')
>>> save_observations(observations, 'observations.db')

>>> # Read them back from the database
>>> for observation in get_db_observations('observations.db'):
...    print(observation)

Main functions:

create_tables

Create all tables in a SQLite database

get_db_observations

Load observation records and associated taxa from SQLite

get_db_taxa

Load taxon records from SQLite

save_observations

Save Observation objects (and associated taxa and photos) to SQLite

save_taxa

Save Taxon objects (plus ancestors and children, if available) to SQLite

Models:

DbObservation

Intermediate data model for persisting Observation data to a relational database

DbPhoto

Intermediate data model for persisting Photo metadata to a relational database

DbTaxon

Intermediate data model for persisting Taxon data to a relational database.

DbUser

Intermediate data model for persisting User data to a relational database

pyinaturalist_convert.db.create_table(model, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Create a single table for the specified model, if it doesn’t already exist

pyinaturalist_convert.db.create_tables(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Create all tables in a SQLite database

pyinaturalist_convert.db.get_db_observations(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), ids=None, username=None, limit=None, page=None, order_by_created=False, order_by_observed=False)

Load observation records and associated taxa from SQLite

Return type:

Iterator[Observation]

pyinaturalist_convert.db.get_db_taxa(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), ids=None, accept_partial=True, limit=200)

Load taxon records from SQLite

Return type:

Iterator[Taxon]

pyinaturalist_convert.db.get_session(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Get a SQLAlchemy session for a SQLite database

Return type:

Session

pyinaturalist_convert.db.save_observations(observations, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Save Observation objects (and associated taxa and photos) to SQLite

pyinaturalist_convert.db.save_taxa(taxa, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))

Save Taxon objects (plus ancestors and children, if available) to SQLite

class pyinaturalist_convert._models.DbObservation(id, captive=None, created_at=None, description=None, identifications_count=0, geoprivacy=None, latitude=None, longitude=None, license_code=None, observed_on=None, place_guess=None, place_ids=None, positional_accuracy=None, quality_grade=None, taxon_id=None, updated_at=None, user_id=None, user_login=None, uuid=None, annotations=None, comments=None, identifications=None, ofvs=None, tags=None)

Bases: object

Intermediate data model for persisting Observation data to a relational database

Notes

  • Datetimes are stored as strings, since SQLAlchemy DateTime doesn’t handle timezone offsets from SQLite.

  • Nested collections (annotations, comments, IDs, OFVs, tags) are stored as denormalized JSON fields rather than in separate tables, since current use cases for this don’t require a full relational structure.

  • Some data sources may provide a count of identifications, but not full identification details. For that reason, a separate identifications_count column is added.

annotations
captive
comments
created_at
description
classmethod from_model(obs, skip_taxon=False)
Return type:

DbObservation

geoprivacy
id
identifications
identifications_count
latitude
license_code
longitude
observed_on
ofvs
photos
place_guess
place_ids
positional_accuracy
quality_grade
property sorted_photos

Get photos sorted by original position in the observation

tags
taxon
taxon_id
to_model()
Return type:

Observation

updated_at
user
user_id
user_login
uuid
class pyinaturalist_convert._models.DbPhoto(id, extension=None, height=None, license=None, observation_id=None, observation_uuid=None, position=None, url=None, user_id=None, width=None, uuid=None)

Bases: object

Intermediate data model for persisting Photo metadata to a relational database

extension
classmethod from_model(photo, **kwargs)
Return type:

DbPhoto

height
id
license
observation
observation_id
observation_uuid
position
to_model()
Return type:

Photo

url
user_id
uuid
width
class pyinaturalist_convert._models.DbTaxon(id, ancestor_ids=None, child_ids=None, conservation_status=None, establishment_means=None, iconic_taxon_id=0, is_active=None, leaf_taxa_count=0, observations_count=0, observations_count_rg=0, name=None, parent_id=None, partial=False, photo_urls=None, preferred_common_name=None, rank=None, reference_url=None, wikipedia_summary=None, wikipedia_url=None)

Bases: object

Intermediate data model for persisting Taxon data to a relational database.

Since different data sources provide different levels of detail, a partial field is added that indicates that some fields are missing, and can be fetched from the API if needed. As an example, this helps distinguish between taxa with no children and taxa with unlisted children.

ancestor_ids
child_ids
conservation_status
establishment_means
classmethod from_model(taxon)
Return type:

DbTaxon

iconic_taxon_id
id
is_active
leaf_taxa_count
name
observations_count
observations_count_rg
parent_id
partial
photo_urls
preferred_common_name
rank
reference_url
to_model()
Return type:

Taxon

update(taxon)

Merge new values into an existing record

wikipedia_summary
wikipedia_url
class pyinaturalist_convert._models.DbUser(id, login=None, name=None)

Bases: object

Intermediate data model for persisting User data to a relational database

classmethod from_model(user, **kwargs)
Return type:

DbUser

id
login
name
to_model()
Return type:

User

pyinaturalist_convert._models.sa_field(col_type, index=False, primary_key=False, **kwargs)

Get a dataclass field with SQLAlchemy column metadata