SQLite
pyinaturalist_convert.sqlite
Helper utilities to load data directly from CSV into a SQLite database
- class pyinaturalist_convert.sqlite.ChunkReader(f, chunk_size=2000, fields=None, transform=None, **kwargs)
Bases: object
A CSV reader that yields chunks of rows, with optional per-row transforms.
- Parameters:
chunk_size (int) – Number of rows to yield at a time
fields (Optional[list[str]]) – List of fields to include in each chunk
transform (Optional[Callable]) – Optional callback(row: list, field_index: dict[str, int]) -> list that modifies each row in place. field_index maps CSV column names to list positions. For extra fields listed in fields but absent from the CSV header, the transform should append values in the order they appear in fields.
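The chunking and transform behavior described above can be sketched with the standard library alone. This is not ChunkReader's implementation: read_chunks and uppercase_rank are hypothetical names, and the sketch assumes the first CSV row is a header. It only illustrates how a transform receives a row plus a field_index mapping column names to positions.

```python
import csv
import io
from typing import Callable, Iterator, Optional


def read_chunks(
    f,
    chunk_size: int = 2000,
    transform: Optional[Callable] = None,
) -> Iterator[list]:
    """Hypothetical stand-in for ChunkReader: yield lists of up to chunk_size rows."""
    reader = csv.reader(f)
    header = next(reader)  # assumption: the first row holds column names
    # Map each CSV column name to its position in a row
    field_index = {name: i for i, name in enumerate(header)}
    chunk = []
    for row in reader:
        if transform:
            row = transform(row, field_index)
        chunk.append(row)
        if len(chunk) >= chunk_size:
            yield chunk
            chunk = []
    # Yield any remaining rows as a final, shorter chunk
    if chunk:
        yield chunk


def uppercase_rank(row: list, field_index: dict) -> list:
    """Example transform: modify one column in place, looked up by name."""
    i = field_index['rank']
    row[i] = row[i].upper()
    return row


csv_text = "id,name,rank\n1,Animalia,kingdom\n2,Chordata,phylum\n3,Aves,class\n"
chunks = list(read_chunks(io.StringIO(csv_text), chunk_size=2, transform=uppercase_rank))
```

With chunk_size=2 and three data rows, the sketch yields one chunk of two rows followed by one chunk holding the remaining row.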
- pyinaturalist_convert.sqlite.load_table(csv_path, db_path, table_name=None, column_map=None, pk='id', progress=None, delimiter=',', transform=None, clear=False)
Load a CSV file into a sqlite3 table. This is less efficient than the sqlite3 shell .import command, but easier to use.
Example
# Minimal example to load data into a 'taxon' table in 'my_database.db'
>>> from pyinaturalist_convert import load_table
>>> load_table('taxon.csv', 'my_database.db')
- Parameters:
table_name (Optional[str]) – Name of table to load into (defaults to csv_path basename)
column_map (Optional[dict]) – Dictionary mapping CSV column names to SQLite column names. Any columns not listed will be ignored.
pk (str) – Primary key column name
progress (Optional[MultiProgress]) – Progress bar, if tracking loading from multiple files
transform (Optional[Callable]) – Callback to transform a row before inserting into the database
clear (bool) – Whether to clear existing data from the table before loading
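To make the column_map and pk semantics concrete, here is a minimal standard-library sketch of loading selected CSV columns into a SQLite table. It does not call load_table and is not how the library implements it: load_csv_sketch, the sample CSV columns, and the untyped table schema are all assumptions made for illustration.

```python
import csv
import os
import sqlite3
import tempfile


def load_csv_sketch(csv_path, db_path, table_name, column_map, pk='id'):
    """Hypothetical helper: insert only the mapped CSV columns into a SQLite table."""
    csv_cols = list(column_map.keys())
    db_cols = list(column_map.values())
    col_defs = ', '.join(
        f'"{c}" PRIMARY KEY' if c == pk else f'"{c}"' for c in db_cols
    )
    col_names = ', '.join(f'"{c}"' for c in db_cols)
    placeholders = ', '.join('?' for _ in db_cols)
    conn = sqlite3.connect(db_path)
    try:
        # "with conn" commits on success and rolls back on error
        with conn, open(csv_path, newline='') as f:
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({col_defs})')
            # CSV columns missing from column_map are never read, so they are ignored
            rows = ([row[c] for c in csv_cols] for row in csv.DictReader(f))
            conn.executemany(
                f'INSERT OR REPLACE INTO "{table_name}" ({col_names}) '
                f'VALUES ({placeholders})',
                rows,
            )
    finally:
        conn.close()


# Demo with a temporary CSV file and database
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, 'taxon.csv')
db_path = os.path.join(tmp_dir, 'my_database.db')
with open(csv_path, 'w', newline='') as f:
    f.write('id,scientificName,taxonRank,extra\n1,Animalia,kingdom,x\n2,Chordata,phylum,y\n')

load_csv_sketch(csv_path, db_path, 'taxon', {'id': 'id', 'scientificName': 'name'})

conn = sqlite3.connect(db_path)
loaded = conn.execute('SELECT id, name FROM taxon ORDER BY id').fetchall()
columns = [r[1] for r in conn.execute('PRAGMA table_info(taxon)')]
conn.close()
```

In this sketch, the taxonRank and extra columns are dropped because they are not keys in the mapping, and scientificName is stored under the SQLite column name.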
- pyinaturalist_convert.sqlite.vacuum_analyze(table_names, db_path=PosixPath('/home/docs/.local/share/pyinaturalist/observations.db'), show_spinner=False)
Vacuum a SQLite database and analyze one or more tables. If loading multiple tables, this should be done once after loading all of them.
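For reference, the underlying SQLite operations can be sketched with the standard sqlite3 module. This is an illustration under stated assumptions, not the library's code: vacuum_analyze_sketch is a hypothetical name, and the demo table and index exist only for the example. VACUUM rebuilds the whole database file, which is why running it once after all tables are loaded is cheaper than running it after each one.

```python
import os
import sqlite3
import tempfile


def vacuum_analyze_sketch(table_names, db_path):
    """Hypothetical helper: VACUUM the database once, then ANALYZE each table."""
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute('VACUUM')
        for name in table_names:
            # ANALYZE gathers statistics that the query planner can use
            conn.execute(f'ANALYZE "{name}"')
    finally:
        conn.close()


# Demo: build a small indexed table, then vacuum and analyze it
db_path = os.path.join(tempfile.mkdtemp(), 'observations.db')
conn = sqlite3.connect(db_path)
conn.execute('CREATE TABLE taxon (id INTEGER PRIMARY KEY, name TEXT)')
conn.execute('CREATE INDEX taxon_name ON taxon (name)')
conn.executemany('INSERT INTO taxon VALUES (?, ?)', [(1, 'Animalia'), (2, 'Chordata')])
conn.commit()
conn.close()

vacuum_analyze_sketch(['taxon'], db_path)

# ANALYZE records its statistics in the sqlite_stat1 table
conn = sqlite3.connect(db_path)
analyzed = [r[0] for r in conn.execute('SELECT DISTINCT tbl FROM sqlite_stat1')]
conn.close()
```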