monkey_wrench.query._api module
- class monkey_wrench.query._api.EumetsatQuery(collection: EumetsatCollection = EumetsatCollection.seviri, log_context: str = 'EUMETSAT Query')
Bases: Query
- __init__(collection: EumetsatCollection = EumetsatCollection.seviri, log_context: str = 'EUMETSAT Query') -> None
Initialize an instance of the class with API credentials read from the environment variables.
This constructor sets up a private eumdac datastore by obtaining an authentication token using the API login and password, which are read from the environment variables.
- Parameters:
collection – The collection to query. Defaults to seviri (SEVIRI).
log_context – A string used in log messages to identify the context. Defaults to 'EUMETSAT Query'.
- query(datetime_period: DateTimePeriodStrict, polygon: Polygon | None = None) -> SearchResults
Query product IDs in a single batch.
This method wraps the eumdac.Collection().search() method to search for product IDs within a specified time range and polygon.
Note
For a given SEVIRI collection, an example product ID is "MSG3-SEVI-MSG15-0100-NA-20150731221240.036000000Z-NA".
Note
start_time and end_time are treated as inclusive and exclusive, respectively, when querying the IDs. For example, to obtain all the data up to and including 2022/12/31, we must set end_time=datetime(2023, 1, 1).
- Parameters:
datetime_period – The datetime period to query for.
polygon – An object of type Polygon. Defaults to None.
- Returns:
The results of the search, containing the product IDs found within the specified period and the polygon.
- Raises:
ValueError – Refer to assert_start_time_is_before_end_time().
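The inclusive/exclusive convention described in the note above can be illustrated with the standard library alone. This is a minimal sketch, not part of monkey_wrench: exclusive_end_for and in_period are hypothetical helper names introduced here, and timezone-aware UTC datetimes are assumed.

```python
from datetime import date, datetime, timedelta, timezone


def exclusive_end_for(last_day: date) -> datetime:
    """Hypothetical helper: exclusive end bound that still covers all of `last_day`."""
    midnight = datetime(last_day.year, last_day.month, last_day.day, tzinfo=timezone.utc)
    return midnight + timedelta(days=1)


def in_period(moment: datetime, start: datetime, end: datetime) -> bool:
    """Half-open membership test: start is inclusive, end is exclusive."""
    return start <= moment < end


start_time = datetime(2022, 1, 1, tzinfo=timezone.utc)
end_time = exclusive_end_for(date(2022, 12, 31))

# The last 15-minute slot of 2022 falls inside the period; the end bound itself does not.
last_slot = datetime(2022, 12, 31, 23, 45, tzinfo=timezone.utc)
print(end_time.isoformat())                        # 2023-01-01T00:00:00+00:00
print(in_period(last_slot, start_time, end_time))  # True
print(in_period(end_time, start_time, end_time))   # False
```

Computing the end bound from the last wanted day avoids the off-by-one mistake of passing 2022/12/31 directly, which would drop that day's data.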
- query_in_batches(datetime_range_in_batches: DateTimeRangeInBatches) -> Generator[tuple[SearchResults, int], None, None]
Retrieve all the product IDs, given a time range and a batch interval, fetching one batch at a time.
- Parameters:
datetime_range_in_batches – The datetime range to query for.
Note
As an example, for SEVIRI, we expect to have one file (product ID) per 15 minutes, i.e. 4 files per hour or 96 files per day. If our re-analysis period is 2022/01/01 (inclusive) to 2023/01/01 (exclusive), i.e. 365 days, this results in a maximum of 35040 files.
If we split our datetime range into intervals of 30 days and fetch product IDs in batches, there is a maximum of 2880 = 96 x 30 IDs in each batch retrieved by a single request. One might need to adapt this value to avoid sending too many requests to the server.
- Yields:
A 2-tuple for each batch. The first element is the collection of products retrieved in that batch; the second is the number of products retrieved for that batch. The search results can in turn be iterated over to retrieve individual products.
Example
>>> from datetime import datetime, timedelta, UTC
>>>
>>> range_in_batches = DateTimeRangeInBatches(
...     start_datetime=datetime(2022, 1, 1, tzinfo=UTC),
...     end_datetime=datetime(2022, 1, 3, tzinfo=UTC),
...     batch_interval=timedelta(days=1)
... )
>>>
>>> try:
...     api = EumetsatQuery()
...     for batch, retrieved_count in api.query_in_batches(range_in_batches):
...         assert retrieved_count == batch.total_results
...         for product in batch:
...             pass
... except KeyError as e:  # If the API credentials are not set!
...     assert "environment variable" in str(e)
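The batch arithmetic in the note above (35040 files per year, at most 2880 per 30-day batch) can be checked with a short standard-library sketch. It does not use monkey_wrench or eumdac; split_into_batches and max_files are hypothetical names, and the forward, half-open splitting shown here is an assumption rather than a description of how DateTimeRangeInBatches iterates internally.

```python
from datetime import datetime, timedelta, timezone

CADENCE = timedelta(minutes=15)  # one SEVIRI file (product ID) per 15 minutes


def split_into_batches(start: datetime, end: datetime, interval: timedelta):
    """Yield half-open (batch_start, batch_end) pairs that cover [start, end)."""
    batch_start = start
    while batch_start < end:
        batch_end = min(batch_start + interval, end)
        yield batch_start, batch_end
        batch_start = batch_end


def max_files(start: datetime, end: datetime) -> int:
    """Upper bound on the number of files in [start, end) at the 15-minute cadence."""
    return (end - start) // CADENCE


start = datetime(2022, 1, 1, tzinfo=timezone.utc)
end = datetime(2023, 1, 1, tzinfo=timezone.utc)

batches = list(split_into_batches(start, end, timedelta(days=30)))
per_batch = [max_files(b_start, b_end) for b_start, b_end in batches]

print(max_files(start, end))  # 35040
print(len(batches))           # 13
print(max(per_batch))         # 2880
print(per_batch[-1])          # 480
```

Under these assumptions, 365 days split into 30-day intervals gives 12 full batches plus one 5-day remainder, so the last batch holds at most 480 IDs. A shorter interval lowers the per-request size at the cost of more requests.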
- fetch_products(search_results: SearchResults, output_directory: Path, bounding_box: BoundingBox | None = None, output_file_format: str = 'netcdf4', sleep_time: Annotated[int, Gt(gt=0)] = 10) -> list[Path | None]
Fetch all products from search results and write product files to disk.
- Parameters:
search_results – Search results for which the files will be fetched.
output_directory – The directory to save the files in.
bounding_box – Bounding box, i.e. (north, south, west, east) limits. Defaults to None, which means BoundingBox(90., -90, -180., 180) will be used.
output_file_format – Desired format of the output file(s). Defaults to netcdf4.
sleep_time – Sleep time, in seconds, between requests. Defaults to 10 seconds.
- Returns:
A list of paths for the fetched files.
- fetch_product(product: Product, chain: Chain, output_directory: Path, sleep_time: Annotated[int, Gt(gt=0)]) -> Path | None
Fetch the file for a single product and write the product file to disk.
- Parameters:
product – The Product whose corresponding file will be fetched.
chain – Chain to apply for customization of the output file.
output_directory – The directory to save the file in.
sleep_time – Sleep time, in seconds, between requests.
- Returns:
The path of the saved file on disk, or None in case of a failure.