API

pyncei.bot

Tools to access data from NOAA’s Climate Data Online Web Services v2 API

class pyncei.bot.NCEIBot(token, wait=0.2, cache_name=None, **cache_kwargs)[source]

Bases: object

Contains functions to request data from the NCEI web services

wait

time in seconds between requests. NCEI allows a maximum of five queries per second.

Type

float

validate_params

whether to validate query parameters before making a GET request. Defaults to False.

Type

bool

max_retries

number of times to retry requests that fail because of temporary connectivity or server lapses. Retries use an exponential backoff. Defaults to 12.

Type

int
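The relationship between wait, the rate limit, and retry backoff can be pictured with a short sketch. The helper name backoff_delays, the 1-second base delay, and the doubling factor are assumptions made for illustration; the documentation above states only that retries use an exponential backoff and default to 12.

```python
def backoff_delays(max_retries=12, base=1.0, factor=2.0):
    """Return the delay in seconds before each retry.

    Hypothetical schedule: base and factor are assumed values, not
    taken from pyncei itself.
    """
    return [base * factor ** attempt for attempt in range(max_retries)]


# The default wait of 0.2 s matches the documented limit of five queries per second
requests_per_second = 1 / 0.2

# Delays before the first four retries under the assumed schedule
delays = backoff_delays(max_retries=4)
```

Under these assumed values each retry waits twice as long as the previous one, so a short server lapse costs little while a longer outage is not hammered with requests.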

The get functions described below use a common set of keyword arguments. The sortorder, limit, offset, and max arguments can be used in any get function; other keywords vary by endpoint. Most values appear to be case-sensitive. Query validation, if enabled, should capture most but not all case errors.

Parameters
  • datasetid (str or list) – the id or name of an NCEI dataset. Multiple values allowed for most functions. Examples: GHCND; PRECIP_HLY; Weather Radar (Level III).

  • datacategoryid (str or list) – the id or name of an NCEI data category. Data categories are broader than data types. Multiple values allowed. Examples: TEMP; WXTYPE; Degree Days.

  • datatypeid (str or list) – the id or name of a data type. Multiple values allowed. Examples: TMIN; SNOW; Long-term averages of fall growing degree days with base 70F.

  • locationid (str or list) – the id or name of a location. Multiple values allowed. If a name is given, the script will try to map it to an id. Examples: Maryland; FIPS:24; ZIP:20003; London, UK.

  • stationid (str or list) – the id or name of a station in the NCEI database. Multiple values allowed. Example: COOP:010957.

  • startdate (str or datetime) – the earliest date available

  • enddate (str or datetime) – the latest date available

  • sortfield (str) – field by which to sort the query results. Available sort fields vary by endpoint.

  • sortorder (str) – specifies whether sort is ascending or descending. Must be ‘asc’ or ‘desc’.

  • limit (int) – number of records to return per query

  • offset (int) – index of the first record to return

  • max (int) – maximum number of records to return. Not part of the API.
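How limit, offset, and max interact may be easier to see in code. The sketch below is not pyncei's implementation: page_requests is a hypothetical helper, the starting offset of 0 and the page size of 1000 are assumptions, and max is renamed max_records to avoid shadowing the Python builtin.

```python
def page_requests(total, limit=1000, offset=0, max_records=None):
    """Yield (offset, limit) pairs needed to collect up to max_records records.

    Hypothetical helper. `total` is the number of records the server
    reports as matching the query.
    """
    available = max(total - offset, 0)
    remaining = available if max_records is None else min(max_records, available)
    while remaining > 0:
        size = min(limit, remaining)  # the last page may be shorter
        yield offset, size
        offset += size
        remaining -= size


# 2,500 matching records, pages of 1,000, stop after 2,200 records
pages = list(page_requests(total=2500, limit=1000, max_records=2200))
```

Here limit and offset describe a single request, while the client-side max caps the total collected across all requests.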

__init__(token, wait=0.2, cache_name=None, **cache_kwargs)[source]

Initializes NCEIBot object

Parameters
  • token (str) – NCEI token

  • wait (float or int) – time in seconds to wait between requests

  • cache_name (str) – path to cache

  • cache_kwargs – any keyword argument accepted by requests_cache.CachedSession

get_data(**kwargs)[source]

Retrieves historical climate data matching the given parameters

See NCEIBot for more details about each keyword argument.

Parameters
  • datasetid (str) – Required. Only one value allowed.

  • startdate (str or datetime) – Required. Returned records will be for the specified dataset/type from on or after this date.

  • enddate (str or datetime) – Required. Returned records will be for the specified dataset/type from on or before this date.

  • datatypeid (str or list) – Optional

  • locationid (str or list) – Optional

  • stationid (str or list) – Optional

  • units (str) – Optional. One of ‘standard’ or ‘metric’.

  • sortfield (str) – Optional. If provided, must be one of ‘datatype’, ‘date’, or ‘station’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns

List of dicts containing historical weather data
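A call to get_data might be assembled as in the sketch below. The class, method, and keyword names come from this page; the wrapper fetch_daily_records, the token placeholder, and the particular dates and ids are illustrative assumptions, and the request itself needs pyncei installed, a valid NCEI token, and network access.

```python
# Keyword arguments for get_data; datasetid, startdate, and enddate are required
query = {
    "datasetid": "GHCND",             # only one dataset allowed here
    "stationid": ["COOP:010957"],     # illustrative id taken from the examples above
    "datatypeid": ["TMIN", "SNOW"],
    "startdate": "2015-12-01",
    "enddate": "2015-12-02",
    "units": "metric",
    "sortfield": "date",
    "sortorder": "asc",
}


def fetch_daily_records(token, **kwargs):
    """Request records matching kwargs (hypothetical wrapper around get_data)."""
    # Imported lazily so the sketch can be loaded without pyncei installed
    from pyncei.bot import NCEIBot

    ncei = NCEIBot(token, wait=0.2)
    return ncei.get_data(**kwargs)


# Usage (not executed here): response = fetch_daily_records("YOUR_TOKEN", **query)
```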

get_datasets(datasetid=None, **kwargs)[source]

Returns data from the NCEI dataset endpoint

See NCEIBot for more details about each keyword argument.

Parameters
  • datasetid (str) – a single dataset to return information about. Optional. The kwargs are ignored if this is provided.

  • datatypeid (str or list) – Optional

  • locationid (str or list) – Optional

  • stationid (str or list) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns

List of dicts containing metadata for all matching datasets

get_data_categories(datacategoryid=None, **kwargs)[source]

Returns codes and labels for NCEI data categories

See NCEIBot for more details about each keyword argument.

Parameters
  • datacategoryid (str) – a single data category to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • locationid (str or list) – Optional

  • stationid (str or list) – Optional

  • startdate (str or datetime) – Optional

  • enddate (str or datetime) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns

List of dicts containing metadata for all matching data categories

get_data_types(datatypeid=None, **kwargs)[source]

Returns information about NCEI data types

See NCEIBot for more details about each keyword argument.

Parameters
  • datatypeid (str) – a single data type to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • locationid (str or list) – Optional

  • stationid (str or list) – Optional

  • datacategoryid (str or list) – Optional

  • startdate (str or datetime) – Optional

  • enddate (str or datetime) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns

List of dicts containing metadata for all matching data types

get_location_categories(locationcategoryid=None, **kwargs)[source]

Returns information about NCEI location categories

See NCEIBot for more details about each keyword argument.

Parameters
  • locationcategoryid (str) – a single location category to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’ or ‘name’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns

List of dicts containing metadata about location categories

get_locations(locationid=None, **kwargs)[source]

Returns metadata for locations matching the given parameters

See NCEIBot for more details about each keyword argument.

Parameters
  • locationid (str) – a single location to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • locationcategoryid (str or list) – Optional

  • datacategoryid (str or list) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns

List of dicts containing metadata for all matching locations

get_stations(stationid=None, **kwargs)[source]

Returns metadata for stations matching the given parameters

See NCEIBot for more details about each keyword argument.

Parameters
  • stationid (str) – a single station to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • locationid (str or list) – Optional

  • datacategoryid (str or list) – Optional

  • datatypeid (str or list) – Optional

  • extent (str or iterable) – comma-delimited bounding box of form ‘min_lat, min_lng, max_lat, max_lng’ or equivalent iterable. Optional.

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns

List of dicts containing metadata for all matching stations
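The extent argument accepts either a comma-delimited string or an equivalent iterable. A minimal sketch of normalizing both forms into one string follows; format_extent is a hypothetical helper written for this example, not a pyncei function, and the validation rules are assumptions.

```python
def format_extent(extent):
    """Return 'min_lat,min_lng,max_lat,max_lng' from a string or 4-item iterable."""
    if isinstance(extent, str):
        parts = [float(part) for part in extent.split(",")]
    else:
        parts = [float(part) for part in extent]
    if len(parts) != 4:
        raise ValueError("extent needs exactly four values")
    min_lat, min_lng, max_lat, max_lng = parts
    # Assumed sanity check: the minimum corner must not exceed the maximum corner
    if min_lat > max_lat or min_lng > max_lng:
        raise ValueError("minimum values must not exceed maximum values")
    return ",".join(str(part) for part in parts)


# A bounding box given as a tuple and the same box given as a string
box = format_extent((38.8, -77.1, 39.0, -76.9))
same_box = format_extent("38.8, -77.1, 39.0, -76.9")
```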

find_ids(term=None, endpoints=None)[source]

Find key terms that match the search string for the given endpoints

Parameters
  • term (str) – the term to search for. If None, returns a list of all available terms for the specified endpoint(s).

  • endpoints (str or list) – name of one or more NCEI endpoints

Returns

List of (endpoint, id, name) for matching key terms from the specified endpoint(s)

refresh_lookups(keys=None)[source]

Updates the CSV files used to populate the endpoint lookups

Parameters

keys (list) – list of endpoints to populate. If empty, everything but stations will be populated.

Returns

None

class pyncei.bot.NCEIResponse(iterable=(), /)[source]

Bases: list

Wraps results of one or more calls to the NCEI API

Extends list. Each response is stored as an entry in the list.

key_order = ['id', 'uid', 'name', 'station', 'latitude', 'longitude', 'elevation', 'elevationUnit', 'datacoverage', 'date', 'mindate', 'maxdate', 'datatype', 'attributes', 'value', 'url', 'retrieved']

list used to order the keys in the NCEI data
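One way a list like key_order can drive column ordering is sketched below. The helper order_keys is hypothetical, and placing unlisted keys last in alphabetical order is an assumption rather than documented behavior.

```python
KEY_ORDER = [
    "id", "uid", "name", "station", "latitude", "longitude", "elevation",
    "elevationUnit", "datacoverage", "date", "mindate", "maxdate",
    "datatype", "attributes", "value", "url", "retrieved",
]


def order_keys(record, key_order=KEY_ORDER):
    """Return a dict whose keys follow key_order; unlisted keys go last, alphabetically."""
    rank = {key: index for index, key in enumerate(key_order)}
    items = sorted(record.items(), key=lambda kv: (rank.get(kv[0], len(rank)), kv[0]))
    return dict(items)


record = {"value": 12, "station": "COOP:010957", "date": "2015-12-01T00:00:00", "datatype": "TMIN"}
ordered = list(order_keys(record))
```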

date_formats = {'date': '%Y-%m-%dT%H:%M:%S', 'maxdate': '%Y-%m-%d', 'mindate': '%Y-%m-%d', 'retrieved': '%Y-%m-%dT%H:%M:%S'}

dict mapping NCEI fields to date formats
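The formats in date_formats are standard strptime patterns, so they can be applied with the datetime module. The sketch below copies the mapping from above; parse_dates is a hypothetical helper and the sample record is made up for illustration.

```python
from datetime import datetime

DATE_FORMATS = {
    "date": "%Y-%m-%dT%H:%M:%S",
    "maxdate": "%Y-%m-%d",
    "mindate": "%Y-%m-%d",
    "retrieved": "%Y-%m-%dT%H:%M:%S",
}


def parse_dates(record, date_formats=DATE_FORMATS):
    """Return a copy of record with known date fields parsed into datetime objects."""
    parsed = dict(record)
    for key, fmt in date_formats.items():
        if isinstance(parsed.get(key), str):
            parsed[key] = datetime.strptime(parsed[key], fmt)
    return parsed


row = parse_dates({"date": "2015-12-01T00:00:00", "mindate": "1950-01-01", "value": 12})
```

Fields without an entry in the mapping, such as value here, pass through unchanged.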

values()[source]

Gets the results from all responses

Returns

generator of dicts

first()[source]

Gets the first result from the compiled responses

Returns

dict

count()[source]

Counts the number of results that have been returned

Returns

number of records returned as int

total()[source]

Counts the total number of results available for all URLs

Returns

total number of records matching the responses as int

to_csv(path)[source]

Writes data to a CSV

Parameters

path (str) – path to the CSV file
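For readers who want to post-process records before export, a comparable CSV can be produced with the standard library. This is a sketch under stated assumptions, not the to_csv implementation: write_records is a hypothetical helper, the header is taken from the keys in first-seen order, and the sample records are invented.

```python
import csv
import io


def write_records(records, handle):
    """Write a sequence of dicts as CSV, using the union of their keys as the header."""
    records = list(records)
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    # Missing fields are written as empty strings
    writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)


buffer = io.StringIO()
write_records(
    [
        {"station": "COOP:010957", "datatype": "TMIN", "value": 12},
        {"station": "COOP:010957", "datatype": "SNOW", "value": 0, "attributes": "T"},
    ],
    buffer,
)
lines = buffer.getvalue().splitlines()
```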

to_dataframe()[source]

Writes data to a dataframe

Returns

pandas.DataFrame, or geopandas.GeoDataFrame if geopandas is installed and the responses include coordinates