DB¶
pyinaturalist_convert.db
Data models and utilities for storing observations and related data in a relational database, including SQLite, PostgreSQL, and any other database supported by SQLAlchemy.
These models contain a relevant subset of columns common to most iNat data sources, suitable for combining data from API results, CSV export, DwC-A, and/or inaturalist-open-data.
Some helper functions are included for the most common cases of saving and loading taxon and observation data. Requirements for a relational database are highly variable, so this won’t suit all use cases, but at least provides a starting point.
Extra dependencies: sqlalchemy
Example:
>>> from pyinaturalist import iNatClient
>>> from pyinaturalist_convert import create_tables, read_observations, save_observations
>>> # Fetch all of your own observations
>>> client = iNatClient()
>>> observations = client.observations.search(user_id='my_username').all()
>>> # Save to a SQLite database
>>> create_tables('observations.db')
>>> save_observations(observations, 'observations.db')
>>> # Read them back from the database
>>> for observation in get_db_observations('observations.db'):
... print(observation)
Main functions:
Create all tables defined in the registry in a SQLite database, if they don't already exist; and optionally create secondary indexes, if they don't already exist. |
|
Apply all pending database migrations to upgrade the schema to the latest version. |
|
Load observation records and associated taxa from SQLite |
|
Load taxon records from SQLite |
|
Save Observation objects (and associated taxa and photos) to SQLite |
|
Save Taxon objects (plus ancestors and children, if available) to SQLite |
Models:
Intermediate data model for persisting Observation data to a relational database |
|
Intermediate data model for persisting Photo data to a relational database |
|
Intermediate data model for persisting Taxon data to a relational database. |
|
Intermediate data model for persisting User data to a relational database |
- pyinaturalist_convert.db.create_table(model, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), indexes=True)¶
Create a single table for the specified model, if it doesn’t already exist; and optionally create secondary indexes, if they don’t already exist.
- pyinaturalist_convert.db.create_tables(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), indexes=True)¶
Create all tables defined in the registry in a SQLite database, if they don’t already exist; and optionally create secondary indexes, if they don’t already exist.
Also adds the current alembic head revision, so future migrations can be applied.
- pyinaturalist_convert.db.get_alembic_config(db_path)¶
Get alembic config for the specified SQLite database
- pyinaturalist_convert.db.get_db_observations(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), ids=None, username=None, limit=None, page=None, order_by_created=False, order_by_observed=False)¶
Load observation records and associated taxa from SQLite
- Return type:
- pyinaturalist_convert.db.get_db_taxa(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), ids=None, accept_partial=True, limit=200)¶
Load taxon records from SQLite
- pyinaturalist_convert.db.get_session(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))¶
Get a SQLAlchemy session for a SQLite database
- Return type:
Session
- pyinaturalist_convert.db.migrate(db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))¶
Apply all pending database migrations to upgrade the schema to the latest version.
This is an alternative to :py
create_tables()that handles incremental schema changes. It also handles previously created tables with no alembic revision state.
- pyinaturalist_convert.db.save_observations(observations, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))¶
Save Observation objects (and associated taxa and photos) to SQLite
- pyinaturalist_convert.db.save_taxa(taxa, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'))¶
Save Taxon objects (plus ancestors and children, if available) to SQLite
- class pyinaturalist_convert._models.DbObservation(id, captive=None, created_at=None, description=None, identifications_count=0, geoprivacy=None, latitude=None, longitude=None, license_code=None, observed_on=None, place_guess=None, place_ids=None, positional_accuracy=None, quality_grade=None, taxon_id=None, updated_at=None, user_id=None, user_login=None, uuid=None, annotations=None, comments=None, identifications=None, ofvs=None, tags=None)¶
Bases:
objectIntermediate data model for persisting Observation data to a relational database
Notes
Datetimes are stored as strings, since SQLAlchemy DateTime doesn’t handle timezone offsets from SQLite.
Nested collections (annotations, comments, IDs, OFVs, tags) are stored as denormalized JSON fields rather than in separate tables, since current use cases for this don’t require a full relational structure.
Some data sources may provide a count of identifications, but not full identification details. For that reason, a separate
identifications_countcolumn is added.
- annotations¶
- captive¶
- comments¶
- created_at¶
- description¶
- classmethod from_model(obs, skip_taxon=False)¶
- Return type:
- geoprivacy¶
- id¶
- identifications¶
- identifications_count¶
- latitude¶
- license_code¶
- longitude¶
- observed_on¶
- ofvs¶
- photos¶
- place_guess¶
- place_ids¶
- positional_accuracy¶
- quality_grade¶
- property sorted_photos¶
Get photos sorted by original position in the observation
- tags¶
- taxon¶
- taxon_id¶
- to_model()¶
- Return type:
- updated_at¶
- user¶
- user_id¶
- user_login¶
- uuid¶
- class pyinaturalist_convert._models.DbPhoto(id, extension=None, file_path=None, height=None, license=None, observation_id=None, observation_uuid=None, original_filename=None, position=None, url=None, user_id=None, width=None, uuid=None)¶
Bases:
objectIntermediate data model for persisting Photo data to a relational database
- extension¶
- file_path¶
- height¶
- id¶
- license¶
- observation¶
- observation_id¶
- observation_uuid¶
- original_filename¶
- position¶
- url¶
- user_id¶
- uuid¶
- width¶
- class pyinaturalist_convert._models.DbTaxon(id, ancestor_ids=None, child_ids=None, conservation_status=None, establishment_means=None, iconic_taxon_id=0, is_active=None, leaf_taxa_count=0, observations_count=0, observations_count_rg=0, name=None, parent_id=None, partial=False, photo_urls=None, preferred_common_name=None, rank=None, reference_url=None, wikipedia_summary=None, wikipedia_url=None)¶
Bases:
objectIntermediate data model for persisting Taxon data to a relational database.
Since different data sources provide different levels of detail, a
partialfield is added that indicates that some fields are missing, and can be fetched from the API if needed. As an example, this helps distinguish between taxa with no children and taxa with unlisted children.- ancestor_ids¶
- child_ids¶
- conservation_status¶
- establishment_means¶
- iconic_taxon_id¶
- id¶
- is_active¶
- leaf_taxa_count¶
- name¶
- observations_count¶
- observations_count_rg¶
- parent_id¶
- partial¶
- photo_urls¶
- preferred_common_name¶
- rank¶
- reference_url¶
- update(taxon)¶
Merge new values into an existing record
- wikipedia_summary¶
- wikipedia_url¶
- class pyinaturalist_convert._models.DbUser(id, login=None, name=None)¶
Bases:
objectIntermediate data model for persisting User data to a relational database
- id¶
- login¶
- name¶
- pyinaturalist_convert._models.sa_field(col_type, index=False, primary_key=False, **kwargs)¶
Get a dataclass field with SQLAlchemy column metadata