SQLite

pyinaturalist_convert.sqlite

Helper utilities to load data directly from CSV into a SQLite database

class pyinaturalist_convert.sqlite.ChunkReader(f, chunk_size=2000, fields=None, transform=None, **kwargs)

Bases: object

A CSV reader that yields chunks of rows, with optional per-row transforms.

Parameters:
  • chunk_size (int) – Number of rows to yield at a time

  • fields (Optional[list[str]]) – List of fields to include in each chunk

  • transform (Optional[Callable]) – Optional callback (row: list, field_index: dict[str, int]) -> list that modifies each row in place. field_index maps CSV column names to list positions. For extra fields listed in fields but absent from the CSV header, the transform should append values in the order they appear.
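The chunking and transform contract described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's implementation: `iter_chunks` and `strip_name` are hypothetical names, and the sample CSV is made up. The transform receives each row plus a `field_index` mapping column names to list positions, and appends the value for an extra field that is absent from the CSV header.

```python
import csv
import io
from itertools import islice


def iter_chunks(f, chunk_size=2000, transform=None):
    """Hypothetical stand-in: yield lists of up to chunk_size rows from a CSV file object."""
    reader = csv.reader(f)
    header = next(reader)
    # Map each CSV column name to its position in the row list
    field_index = {name: i for i, name in enumerate(header)}
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        if transform:
            chunk = [transform(row, field_index) for row in chunk]
        yield chunk


def strip_name(row, field_index):
    """Hypothetical transform: trim the 'name' column and append one extra value."""
    pos = field_index["name"]
    row[pos] = row[pos].strip()
    # 'name_length' is not in the CSV header, so its value is appended to the row
    row.append(len(row[pos]))
    return row


sample = io.StringIO("id,name\n1, Animalia \n2,Plantae\n3,Fungi\n")
chunks = list(iter_chunks(sample, chunk_size=2, transform=strip_name))
```

With `chunk_size=2`, the three data rows arrive as one chunk of two rows followed by one chunk of one row.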

pyinaturalist_convert.sqlite.get_fields(csv_path, delimiter=',')
Return type:

list[str]
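The signature suggests that this function returns the column names from a CSV header, although the page does not say so explicitly. Assuming that reading is right, a rough standard-library equivalent might look like the sketch below; `read_header` is a hypothetical name and the sample file is made up.

```python
import csv
import tempfile
from pathlib import Path


def read_header(csv_path, delimiter=","):
    """Hypothetical stand-in: return the first row of a CSV file as a list of column names."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f, delimiter=delimiter))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "taxon.tsv"
    path.write_text("id\tname\trank\n1\tAnimalia\tkingdom\n", encoding="utf-8")
    fields = read_header(path, delimiter="\t")
```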

pyinaturalist_convert.sqlite.load_table(csv_path, db_path, table_name=None, column_map=None, pk='id', progress=None, delimiter=',', transform=None, clear=False)

Load a CSV file into a sqlite3 table. This is less efficient than the sqlite3 shell .import command, but easier to use.

Example

# Minimal example to load data into a 'taxon' table in 'my_database.db'
>>> from pyinaturalist_convert import load_table
>>> load_table('taxon.csv', 'my_database.db')

Parameters:
  • csv_path (Path | str) – Path to CSV file

  • db_path (Path | str) – Path to SQLite database

  • table_name (Optional[str]) – Name of table to load into (defaults to csv_path basename)

  • column_map (Optional[dict]) – Dictionary mapping CSV column names to SQLite column names. Any columns not listed will be ignored.

  • pk (str) – Primary key column name

  • progress (Optional[MultiProgress]) – Progress bar, if tracking loading from multiple files

  • transform (Optional[Callable]) – Callback to transform a row before inserting into the database

  • clear (bool) – Whether to clear existing data from the table before loading
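To make the roles of `column_map`, `pk`, and `clear` concrete, here is a minimal standard-library sketch of the same general technique. It is an assumption-laden illustration, not how `load_table` is implemented: `load_csv_sketch`, the CSV column names, and the sample data are all hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_csv_sketch(csv_path, db_path, table_name, column_map, pk="id", clear=False):
    """Hypothetical stand-in for illustration; not the library's implementation."""
    columns = list(column_map.values())
    column_defs = ", ".join(f"{c} PRIMARY KEY" if c == pk else c for c in columns)
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)

    with open(csv_path, newline="", encoding="utf-8") as f:
        # Keep only the mapped CSV columns; unlisted columns are ignored
        rows = [[rec[src] for src in column_map] for rec in csv.DictReader(f)]

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(f"CREATE TABLE IF NOT EXISTS {table_name} ({column_defs})")
            if clear:
                conn.execute(f"DELETE FROM {table_name}")
            conn.executemany(
                f"INSERT OR REPLACE INTO {table_name} ({column_list}) VALUES ({placeholders})",
                rows,
            )
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "taxon.csv"
    db_path = Path(tmp) / "my_database.db"
    csv_path.write_text(
        "taxonID,scientificName,extra\n1,Animalia,x\n2,Plantae,y\n", encoding="utf-8"
    )
    load_csv_sketch(csv_path, db_path, "taxon", {"taxonID": "id", "scientificName": "name"})

    conn = sqlite3.connect(db_path)
    loaded = conn.execute("SELECT id, name FROM taxon ORDER BY id").fetchall()
    conn.close()
```

The `extra` column is absent from the mapping, so it never reaches the table. Values are stored as text here because the sketch declares no column types.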

pyinaturalist_convert.sqlite.vacuum_analyze(table_names, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), show_spinner=False)

Vacuum a SQLite database and analyze one or more tables. If loading multiple tables, this should be done once after loading all of them.
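For readers unfamiliar with the underlying SQLite commands, the sketch below shows one way to run `VACUUM` followed by `ANALYZE` on named tables using only the standard `sqlite3` module. It is an illustration, not the library's code: `vacuum_analyze_sketch`, the table, and the index are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def vacuum_analyze_sketch(table_names, db_path):
    """Hypothetical stand-in: VACUUM the database, then ANALYZE each named table."""
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")
        for name in table_names:
            conn.execute(f"ANALYZE {name}")
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "observations.db"
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE taxon (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE INDEX idx_taxon_name ON taxon (name)")
    conn.executemany("INSERT INTO taxon VALUES (?, ?)", [(1, "Animalia"), (2, "Plantae")])
    conn.commit()
    conn.close()

    # Run once, after all tables have been loaded
    vacuum_analyze_sketch(["taxon"], db_path)

    conn = sqlite3.connect(db_path)
    analyzed = [r[0] for r in conn.execute("SELECT DISTINCT tbl FROM sqlite_stat1")]
    conn.close()
```

`VACUUM` rebuilds the whole database file, which is why running it once after all loads is cheaper than running it per table. `ANALYZE` records index statistics in `sqlite_stat1` for the query planner.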