API

pyncei.bot

Tools to access data from NOAA’s Climate Data Online Web Services v2 API

class pyncei.bot.NCEIBot(token, wait=0.2, cache_name=None, **cache_kwargs)[source]

Bases: object

Contains functions to request data from the NCEI web services

wait

time in seconds between requests. NCEI allows a maximum of five queries per second.

Type:

float

validate_params

whether to validate query parameters before making a GET request. Defaults to False.

Type:

bool

max_retries

number of times to retry requests that fail because of temporary connectivity or server lapses. Retries use an exponential backoff. Defaults to 12.

Type:

int
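
The retry behavior described for max_retries can be pictured with a short sketch. The base delay of 1 second and the doubling factor are assumptions made for illustration, since the values pyncei uses are not stated here. backoff_delays is a hypothetical helper, not part of the library.

```python
def backoff_delays(max_retries=12, base=1.0, factor=2.0):
    """Return the wait in seconds before each retry, growing exponentially.

    base and factor are illustrative assumptions, not pyncei's actual values.
    """
    return [base * factor ** attempt for attempt in range(max_retries)]


delays = backoff_delays()
```

Under these assumed values each wait is twice the previous one, so lowering max_retries shortens the time a script can spend waiting on a failing request.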

The get functions described below use a common set of keyword arguments. The sortorder, limit, offset, and max arguments can be used in any get function; other keywords vary by endpoint. Most values appear to be case-sensitive. Query validation, if enabled, should capture most but not all case errors.

Parameters:
  • datasetid (str or list) – the id or name of an NCEI dataset. Multiple values allowed for most functions. Examples: GHCND; PRECIP_HLY; Weather Radar (Level III).

  • datacategoryid (str or list) – the id or name of an NCEI data category. Data categories are broader than data types. Multiple values allowed. Examples: TEMP; WXTYPE; Degree Days.

  • datatypeid (str or list) – the id or name of a data type. Multiple values allowed. Examples: TMIN; SNOW; Long-term averages of fall growing degree days with base 70F.

  • locationid (str or list) – the id or name of a location. Multiple values allowed. If a name is given, the script will try to map it to an id. Examples: Maryland; FIPS:24; ZIP:20003; London, UK.

  • stationid (str or list) – the id or name of a station in the NCEI database. Multiple values allowed. Examples: COOP:010957.

  • startdate (str or datetime) – the earliest date available

  • enddate (str or datetime) – the latest date available

  • sortfield (str) – field by which to sort the query results. Available sort fields vary by endpoint.

  • sortorder (str) – specifies whether sort is ascending or descending. Must be ‘asc’ or ‘desc’.

  • limit (int) – number of records to return per query

  • offset (int) – index of the first record to return

  • max (int) – maximum number of records to return. Not part of the API.
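
The relationship between limit, offset, and max can be sketched as a paging loop. This is an illustration under assumptions, not pyncei's implementation: fetch_page is a hypothetical stand-in for a single API request, the records live in a local list, offset is treated as 0-based, and max is spelled max_records to avoid shadowing the Python builtin.

```python
def fetch_page(records, limit, offset):
    """Hypothetical stand-in for one request: returns one page of records."""
    return records[offset:offset + limit]


def fetch_all(records, limit=25, offset=0, max_records=None):
    """Request pages of `limit` records until max_records is reached or data runs out."""
    results = []
    while max_records is None or len(results) < max_records:
        # Shrink the final page so no more than max_records are collected
        if max_records is None:
            page_size = limit
        else:
            page_size = min(limit, max_records - len(results))
        page = fetch_page(records, page_size, offset)
        if not page:
            break
        results.extend(page)
        offset += len(page)
    return results


sample = list(range(25))
first_twelve = fetch_all(sample, limit=5, max_records=12)
```

Here limit controls the size of each request, while max_records caps the total across all requests, which is why max is described as not being part of the API.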

__init__(token, wait=0.2, cache_name=None, **cache_kwargs)[source]

Initializes NCEIBot object

Parameters:
  • token (str) – NCEI token

  • wait (float or int) – time in seconds to wait between requests

  • cache_name (str) – path to cache

  • cache_kwargs – any keyword argument accepted by requests_cache.CachedSession

get_data(**kwargs)[source]

Retrieves historical climate data matching the given parameters

See NCEIBot for more details about each keyword argument.

Parameters:
  • datasetid (str) – Required. Only one value allowed.

  • startdate (str or datetime) – Required. Returned records will be from on or after this date.

  • enddate (str or datetime) – Required. Returned records will be from on or before this date.

  • datatypeid (str or list) – Optional

  • locationid (str or list) – Optional

  • stationid (str or list) – Optional

  • units (str) – Optional. One of ‘standard’ or ‘metric’.

  • sortfield (str) – Optional. If provided, must be one of ‘datatype’, ‘date’, or ‘station’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns:

List of dicts containing historical weather data
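
A sketch of assembling and checking get_data keyword arguments before making a request, based on the requirements listed above. build_data_query and fetch_data are hypothetical helpers written for this example, the allowed values are copied from the parameter list, and the token is a placeholder you would supply.

```python
from datetime import datetime

# Allowed values copied from the get_data parameter list
SORTFIELDS = ("datatype", "date", "station")
UNITS = ("standard", "metric")
SORTORDERS = ("asc", "desc")


def build_data_query(datasetid, startdate, enddate, **optional):
    """Collect get_data keyword arguments and check the documented constraints."""
    if not isinstance(datasetid, str):
        raise TypeError("get_data accepts only one datasetid")
    checks = {"sortfield": SORTFIELDS, "units": UNITS, "sortorder": SORTORDERS}
    for key, allowed in checks.items():
        if key in optional and optional[key] not in allowed:
            raise ValueError(f"{key} must be one of {allowed}")
    query = {"datasetid": datasetid, "startdate": startdate, "enddate": enddate}
    query.update(optional)
    return query


def fetch_data(token, query):
    """Send the query with pyncei (imported lazily so the sketch loads without it)."""
    from pyncei.bot import NCEIBot

    bot = NCEIBot(token)
    return bot.get_data(**query)


query = build_data_query(
    "GHCND",
    datetime(2015, 1, 1),
    datetime(2015, 1, 31),
    locationid="FIPS:24",
    datatypeid=["TMIN", "SNOW"],
    units="metric",
    sortfield="date",
)
```

Calling fetch_data(token, query) with a valid NCEI token would then pass these arguments to get_data.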

get_datasets(datasetid=None, **kwargs)[source]

Returns data from the NCEI dataset endpoint

See NCEIBot for more details about each keyword argument.

Parameters:
  • datasetid (str) – a single dataset to return information about. Optional. The kwargs are ignored if this is provided.

  • datatypeid (str or list) – Optional

  • locationid (str or list) – Optional

  • stationid (str or list) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns:

List of dicts containing metadata for all matching datasets

get_data_categories(datacategoryid=None, **kwargs)[source]

Returns codes and labels for NCEI data categories

See NCEIBot for more details about each keyword argument.

Parameters:
  • datacategoryid (str) – a single data category to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • locationid (str or list) – Optional

  • stationid (str or list) – Optional

  • startdate (str or datetime) – Optional

  • enddate (str or datetime) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns:

List of dicts containing metadata for all matching data categories

get_data_types(datatypeid=None, **kwargs)[source]

Returns information about NCEI data types

See NCEIBot for more details about each keyword argument.

Parameters:
  • datatypeid (str) – a single data type to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • locationid (str or list) – Optional

  • stationid (str or list) – Optional

  • datacategoryid (str or list) – Optional

  • startdate (str or datetime) – Optional

  • enddate (str or datetime) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns:

List of dicts containing metadata for all matching data types

get_location_categories(locationcategoryid=None, **kwargs)[source]

Returns information about NCEI location categories

See NCEIBot for more details about each keyword argument.

Parameters:
  • locationcategoryid (str) – a single location category to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’ or ‘name’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns:

List of dicts containing metadata about location categories

get_locations(locationid=None, **kwargs)[source]

Returns metadata for locations matching the given parameters

See NCEIBot for more details about each keyword argument.

Parameters:
  • locationid (str) – a single location to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • locationcategoryid (str or list) – Optional

  • datacategoryid (str or list) – Optional

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns:

List of dicts containing metadata for all matching locations

get_stations(stationid=None, **kwargs)[source]

Returns metadata for stations matching the given parameters

See NCEIBot for more details about each keyword argument.

Parameters:
  • stationid (str) – a single station to return information about. Optional. The kwargs are ignored if this is provided.

  • datasetid (str or list) – Optional

  • locationid (str or list) – Optional

  • datacategoryid (str or list) – Optional

  • datatypeid (str or list) – Optional

  • extent (str or iterable) – comma-delimited bounding box of form ‘min_lat, min_lng, max_lat, max_lng’ or equivalent iterable. Optional.

  • sortfield (str) – Optional. If provided, must be one of ‘id’, ‘name’, ‘mindate’, ‘maxdate’, or ‘datacoverage’.

  • sortorder (str) – Optional

  • limit (int) – Optional

  • offset (int) – Optional

  • max (int) – Optional

Returns:

List of dicts containing metadata for all matching stations
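
The extent argument to get_stations takes the corners of a bounding box in a fixed order. A minimal sketch of building the comma-delimited form from four numbers follows. format_extent is a hypothetical helper, not part of pyncei, and joining without spaces is an assumption about what the service accepts.

```python
def format_extent(min_lat, min_lng, max_lat, max_lng):
    """Build a comma-delimited bounding box string in the documented order."""
    if not -90 <= min_lat <= max_lat <= 90:
        raise ValueError("latitudes must satisfy -90 <= min_lat <= max_lat <= 90")
    if not (-180 <= min_lng <= 180 and -180 <= max_lng <= 180):
        raise ValueError("longitudes must be between -180 and 180")
    return ",".join(str(value) for value in (min_lat, min_lng, max_lat, max_lng))


extent = format_extent(47.5204, -122.2047, 47.6139, -122.1065)
```

Because get_stations also accepts an equivalent iterable, passing the four numbers as a tuple in the same order should work as well.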

find_ids(term=None, endpoints=None)[source]

Find key terms that match the search string for the given endpoints

Parameters:
  • term (str) – the term to search for. If None, returns a list of all available terms for the specified endpoint(s).

  • endpoints (str or list) – name of one or more NCEI endpoints

Returns:

List of (endpoint, id, name) for matching key terms from the specified endpoint
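
Results in the (endpoint, id, name) shape returned by find_ids can be regrouped for lookup. The sketch below uses a hand-written list in place of a real call, and the endpoint labels "datasets" and "locations" are assumptions about how endpoints are named in the results. group_by_endpoint is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative matches in the documented (endpoint, id, name) shape
matches = [
    ("datasets", "GHCND", "Daily Summaries"),
    ("locations", "FIPS:24", "Maryland"),
]


def group_by_endpoint(matches):
    """Map each endpoint to a dict of {id: name} for its matching terms."""
    grouped = defaultdict(dict)
    for endpoint, id_, name in matches:
        grouped[endpoint][id_] = name
    return dict(grouped)


ids_by_endpoint = group_by_endpoint(matches)
```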

refresh_lookups(keys=None)[source]

Update the csv files used to populate the endpoint lookups

Parameters:

keys (list) – list of endpoints to populate. If empty, everything but stations will be populated.

Returns:

None

class pyncei.bot.NCEIResponse(iterable=(), /)[source]

Bases: list

Wraps results of one or more calls to the NCEI API

Extends list. Each response is stored as an entry in the list.

key_order = ['id', 'uid', 'name', 'station', 'latitude', 'longitude', 'elevation', 'elevationUnit', 'datacoverage', 'date', 'mindate', 'maxdate', 'datatype', 'attributes', 'value', 'url', 'retrieved']

list used to order the keys in the NCEI data
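
One way a list like key_order could be applied to a record is sketched below. This is not the library's code: order_keys is a hypothetical helper, the record values are made up, and placing unlisted keys last in alphabetical order is an assumption.

```python
KEY_ORDER = [
    "id", "uid", "name", "station", "latitude", "longitude", "elevation",
    "elevationUnit", "datacoverage", "date", "mindate", "maxdate",
    "datatype", "attributes", "value", "url", "retrieved",
]


def order_keys(record, key_order=KEY_ORDER):
    """Return a copy of record with known keys first, in key_order, then the rest."""
    known = [key for key in key_order if key in record]
    extra = sorted(key for key in record if key not in key_order)
    return {key: record[key] for key in known + extra}


# Made-up record for illustration only
record = {
    "value": 12,
    "date": "2015-01-01T00:00:00",
    "station": "COOP:010957",
    "datatype": "TMIN",
}
ordered = order_keys(record)
```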

date_formats = {'date': '%Y-%m-%dT%H:%M:%S', 'maxdate': '%Y-%m-%d', 'mindate': '%Y-%m-%d', 'retrieved': '%Y-%m-%dT%H:%M:%S'}

dict mapping NCEI fields to date formats
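
The formats in date_formats are standard strptime patterns, so date fields can be parsed with the datetime module. A minimal sketch follows; parse_dates is a hypothetical helper and the record values are made up.

```python
from datetime import datetime

DATE_FORMATS = {
    "date": "%Y-%m-%dT%H:%M:%S",
    "maxdate": "%Y-%m-%d",
    "mindate": "%Y-%m-%d",
    "retrieved": "%Y-%m-%dT%H:%M:%S",
}


def parse_dates(record, date_formats=DATE_FORMATS):
    """Return a copy of record with the listed date fields parsed to datetime."""
    parsed = dict(record)
    for key, fmt in date_formats.items():
        if isinstance(parsed.get(key), str):
            parsed[key] = datetime.strptime(parsed[key], fmt)
    return parsed


# Made-up record for illustration only
record = {"mindate": "1950-01-01", "maxdate": "2015-01-31", "date": "2015-01-15T00:00:00"}
parsed = parse_dates(record)
```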

values()[source]

Gets the results from all responses

Returns:

generator of dicts

first()[source]

Gets the first result from the compiled responses

Returns:

dict

count()[source]

Counts the number of results that have been returned

Returns:

number of records returned as int

total()[source]

Counts the total number of results available for all URLs

Returns:

total number of records matching the responses as int

to_csv(path)[source]

Writes data to a CSV

Parameters:

path (str) – path to csv

to_dataframe()[source]

Converts data to a dataframe

Returns:

pandas.DataFrame, or geopandas.GeoDataFrame if geopandas is installed and the responses include coordinates