Converters

pyinaturalist_convert.converters

Base utilities for converting observation data to common formats.

Extra dependencies by format:
  • Excel: pandas, openpyxl

  • Feather, Parquet: pandas, pyarrow

  • HDF5: pandas, tables

Examples:

Get some observations:

>>> from pyinaturalist import iNatClient
>>> client = iNatClient()
>>> observations = client.observations.search(user_id='my_username').all()

Convert to multiple formats:

>>> from pyinaturalist_convert import *
>>>
>>> to_csv(observations, 'my_observations.csv')
>>> to_excel(observations, 'my_observations.xlsx')
>>> to_feather(observations, 'my_observations.feather')
>>> to_hdf(observations, 'my_observations.hdf')
>>> to_json(observations, 'my_observations.json')
>>> to_parquet(observations, 'my_observations.parquet')

Load back into Observation objects:

>>> observations = read('my_observations.csv')
>>> observations = read('my_observations.xlsx')
>>> observations = read('my_observations.feather')
>>> observations = read('my_observations.hdf')
>>> observations = read('my_observations.json')
>>> observations = read('my_observations.parquet')

Export functions:

to_csv

Convert observations to CSV

to_excel

Convert observations to an Excel spreadsheet (xlsx)

to_feather

Convert observations into a Feather file

to_hdf

Convert observations into an HDF5 file

to_json

Convert observations into a JSON file

to_parquet

Convert observations into a Parquet file

Import and helper functions:

read

Load observations from any supported file format (JSON, CSV, Feather, HDF5, Parquet, or Excel)

to_dataframe

Convert observations into a pandas DataFrame

to_dataset

Convert observations to a generic tabular dataset.

to_dicts

Convert any supported input type into observation (or other record type) dicts

to_observations

Convert any supported input type into Observation objects.

to_taxa

Convert any supported input type into Taxon objects

pyinaturalist_convert.converters.flatten_observations(observations, tabular=False, semitabular=False)

Flatten nested dict attributes, for example {"taxon": {"id": 1}} -> {"taxon.id": 1}

Parameters:
  • semitabular (bool) – Accept one level of nested collections, for formats that can handle them (such as Parquet)

  • tabular (bool) – Drop all collections that can’t be flattened (for CSV)
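
The flattening described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `flatten_dict` is a hypothetical helper, and its `tabular` flag only approximates the documented behavior by dropping list-like values that cannot fit in a single table cell.

```python
from typing import Any, Dict


def flatten_dict(
    record: Dict[str, Any], parent_key: str = "", tabular: bool = False
) -> Dict[str, Any]:
    """Flatten nested dicts into one level with dot-separated keys.

    Hypothetical helper for illustration only; the library's own
    flatten_observations() may behave differently.
    """
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        full_key = f"{parent_key}.{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts, extending the key prefix
            flat.update(flatten_dict(value, full_key, tabular))
        elif tabular and isinstance(value, (list, tuple, set)):
            # Assumed tabular behavior: drop collections that can't be flattened
            continue
        else:
            flat[full_key] = value
    return flat


# Made-up sample record, shaped loosely like an observation dict
observation = {"id": 7, "taxon": {"id": 1, "name": "Animalia"}, "photos": [{"id": 10}]}
flat = flatten_dict(observation)
flat_tabular = flatten_dict(observation, tabular=True)
```

Here `flat` keeps the `photos` list as-is, while `flat_tabular` omits it, which is the kind of trade-off a CSV export has to make.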

pyinaturalist_convert.converters.read(filename)

Load observations from any of the following file formats:

  • JSON

  • CSV (exported from pyinaturalist-convert)

  • CSV (exported from iNaturalist export tool)

  • Feather

  • HDF5

  • Parquet

  • Excel

Return type:

List[Observation]

pyinaturalist_convert.converters.to_csv(observations, filename=None)

Convert observations to CSV

pyinaturalist_convert.converters.to_dataframe(observations)

Convert observations into a pandas DataFrame

pyinaturalist_convert.converters.to_dataset(observations)

Convert observations to a generic tabular dataset. This can be converted to any of the formats supported by tablib.

Return type:

Dataset

pyinaturalist_convert.converters.to_dicts(value)

Convert any supported input type into observation (or other record type) dicts

Return type:

Iterable[Dict]

pyinaturalist_convert.converters.to_excel(observations, filename)

Convert observations to an Excel spreadsheet (xlsx)

pyinaturalist_convert.converters.to_feather(observations, filename)

Convert observations into a Feather file

pyinaturalist_convert.converters.to_hdf(observations, filename)

Convert observations into an HDF5 file

pyinaturalist_convert.converters.to_json(observations, filename)

Convert observations into a JSON file

pyinaturalist_convert.converters.to_observations(value)

Convert any supported input type into Observation objects.

Return type:

Iterable[Observation]

pyinaturalist_convert.converters.to_parquet(observations, filename)

Convert observations into a Parquet file

pyinaturalist_convert.converters.to_taxa(value)

Convert any supported input type into Taxon objects

Return type:

Iterable[Taxon]

pyinaturalist_convert.converters.write(content, filename, mode='w')

Write converted observation data to a file, creating parent directories first
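
The "create parent directories, then write" pattern can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the library's code: `write_text` is a hypothetical stand-in, and the path and CSV content are made up.

```python
import tempfile
from pathlib import Path


def write_text(content: str, filename: str, mode: str = "w") -> Path:
    """Write content to a file, creating any missing parent directories.

    Hypothetical stand-in for illustration; not the library's write().
    """
    path = Path(filename).expanduser()
    # parents=True creates intermediate directories; exist_ok=True makes it idempotent
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, mode) as f:
        f.write(content)
    return path


# Write into a nested directory that does not exist yet
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "exports" / "2024" / "my_observations.csv"
    written = write_text("id,taxon.id\n7,1\n", str(target))
    parent_created = written.parent.is_dir()
    round_trip = written.read_text()
```

Creating the parent directories up front means callers can pass a path into a new output folder without preparing it first.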