monkey_wrench.query._list module

class monkey_wrench.query._list.List(items: list, datetime_parser: Annotated[Callable[[...], ReturnType], BeforeValidator(func=validate_function_path, json_schema_input_type=PydanticUndefined)] | Callable[[...], ReturnType] | type[DateTimeParserBase] | None = None, log_context: str = 'List')

Bases: Query

A class that provides generic functionality for querying lists.

Note

This class is meant to behave as an immutable list.

Note

This class utilizes numpy.ndarray objects under the hood.

__init__(items: list, datetime_parser: Annotated[Callable[[...], ReturnType], BeforeValidator(func=validate_function_path, json_schema_input_type=PydanticUndefined)] | Callable[[...], ReturnType] | type[DateTimeParserBase] | None = None, log_context: str = 'List') → None

Make an instance of the class.

Parameters:

items – The complete list of items to query.
datetime_parser – A function such as parse() or a class of type DateTimeParser to parse items into datetime objects. Defaults to None, which means the list items are assumed to be datetime objects already.
log_context – A string that will be used in log messages to determine the context. Defaults to 'List'.

property parsed_items: ndarray

static len(item) → int

Return the number of items in the List object.

to_python_list() → list

Convert the List object into a Python built-in list object.

query(datetime_period: DateTimePeriod) → Self

Query items from the List object, given a start datetime and an end datetime.

Parameters:

datetime_period – The datetime period to query the items from.

Returns:

A new List object including items that match the given query.

Raises:

ValueError – Refer to assert_start_time_is_before_end_time().

query_indices(datetime_period: DateTimePeriod) → list[int]

Similar to query(), but returns the indices of items as a Python built-in list.

__get_indices(datetime_period: DateTimePeriod) → array

Similar to query_indices(), but returns the numpy indices instead.
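
The relationship between the datetime parser, query() and query_indices() can be illustrated with a minimal pure-Python sketch. It does not use monkey_wrench itself: parse_item, sketch_query_indices and sketch_query are hypothetical helpers, the file name pattern is invented for illustration, and the closed interval [start, end] is an assumption rather than documented behaviour.

```python
from datetime import datetime


def parse_item(item: str) -> datetime:
    """Hypothetical parser: read the date from names such as 'obs_20240102.nc'."""
    return datetime.strptime(item, "obs_%Y%m%d.nc")


def sketch_query_indices(items, start, end, parser=None):
    """Return the indices of items whose datetime lies in [start, end].

    Assumption: both bounds are inclusive; the real class may differ.
    """
    if start > end:
        raise ValueError("The start datetime must not be after the end datetime.")
    # Without a parser, items are assumed to be datetime objects already.
    parsed = [parser(item) if parser else item for item in items]
    return [idx for idx, dt in enumerate(parsed) if start <= dt <= end]


def sketch_query(items, start, end, parser=None):
    """Return the matching items themselves instead of their indices."""
    return [items[idx] for idx in sketch_query_indices(items, start, end, parser)]


items = ["obs_20240101.nc", "obs_20240102.nc", "obs_20240103.nc", "obs_20240104.nc"]
start, end = datetime(2024, 1, 2), datetime(2024, 1, 3)

indices = sketch_query_indices(items, start, end, parse_item)
selected = sketch_query(items, start, end, parse_item)
print(indices)
print(selected)
```

The indices refer to positions in the original list, so indexing the original items with them reproduces the result of the item query.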

normalize_index(index: int) → int

Convert a negative index into its positive equivalent, or return the original index if it is non-negative.

Raises:

IndexError – If the index, or its positive equivalent, exceeds the size of the List object.
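
The normalization rule can be sketched as follows. This is a standalone illustration, not the library's code: sketch_normalize_index is a hypothetical name, it takes the size explicitly, and rejecting negative indices that reach before the first item is an assumption.

```python
def sketch_normalize_index(index: int, size: int) -> int:
    """Map a possibly negative index onto the range [0, size)."""
    # A negative index counts from the end, as in Python's own list indexing.
    normalized = index + size if index < 0 else index
    # Assumption: anything outside [0, size) is an error.
    if not 0 <= normalized < size:
        raise IndexError(f"Index {index} is out of range for size {size}.")
    return normalized


print(sketch_normalize_index(-1, 5))
print(sketch_normalize_index(2, 5))
```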

generate_k_sized_batches_by_index(k: Annotated[int, Gt(gt=0)], index_start: int = 0, index_end: int = -1, batches_as_python_lists: bool = True, strict: bool = True) → Generator

Generate batches (sub-lists) of size k and move forward by 1 index each time.

A batch consists of the item at the current index, as well as k-1 preceding items. In other words, a batch includes k adjacent items, with the item at the current index being the last item of the batch. The next batch is retrieved by incrementing the current index by 1. As a result, two consecutive batches have k-1 items in common.

Note

Both index_start and index_end are considered as inclusive. They can be negative as well.

Note

The indices are zero-based. If index_start is less than or equal to k-1, the first batch includes items from index 0 to index k-1 (inclusive). The next batch includes indices [1, k].

Parameters:

k – The size of the batches. Each batch includes the current item as well as k-1 preceding items.
index_start – The zero-based index of the first item to start generating the batches from. Defaults to 0 and can be negative as well.
index_end – The zero-based index of the last item (inclusive) up to which the batches are generated. Defaults to -1, meaning that the last item of the list is the last item of the final batch.
batches_as_python_lists – A boolean determining whether to return each batch as a Python built-in list or as List objects. Defaults to True.
strict – Whether to raise an exception if the number of list items is less than the requested batch size. Defaults to True.

Yields:

Batches of size k. Adjacent batches overlap by k-1 items.

Raises:

ValueError – If index_start is greater than index_end.
ValueError – If k exceeds the size of the list and strict=True.
IndexError – If normalized indices exceed the size of the List object. Refer to normalize_index().
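
The sliding-window behaviour described above can be sketched in plain Python. sketch_sliding_batches is a hypothetical stand-in that works on a built-in list. It always behaves as if strict=True, and it handles negative indices by simple wrapping without the range check of normalize_index().

```python
def sketch_sliding_batches(items, k, index_start=0, index_end=-1):
    """Yield batches of k adjacent items, moving forward by one index each time."""
    size = len(items)
    if k > size:
        raise ValueError("k exceeds the size of the list.")
    # Both bounds are inclusive and may be negative.
    start = index_start + size if index_start < 0 else index_start
    end = index_end + size if index_end < 0 else index_end
    if start > end:
        raise ValueError("index_start is greater than index_end.")
    # The current index is the LAST item of a batch, so the earliest
    # possible current index is k-1 (the batch covering indices 0..k-1).
    first_current = max(start, k - 1)
    for current in range(first_current, end + 1):
        yield items[current - k + 1 : current + 1]


items = [10, 20, 30, 40, 50]
batches = list(sketch_sliding_batches(items, k=3))
for batch in batches:
    print(batch)
```

With k=3 and the default indices, the first batch covers indices 0 to 2 and every later batch drops the oldest item and appends the next one.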

partition_in_k_sized_batches_by_index(k: Annotated[int, Gt(gt=0)], index_start: int = 0, index_end: int = -1, batches_as_python_lists: bool = True) → Generator

Partition the list, where the batches are of size k or less.

Note

The partition is given for all items that are in [index_start, index_end] (both inclusive).

Note

This is similar to generate_k_sized_batches_by_index(), with two differences. First, this method generates partitions, i.e. the sub-lists do not have any common items. Second, one sub-list may be smaller than k. This happens when the number of items left to partition is less than k.
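
A minimal sketch of the partitioning idea, again on a built-in list rather than a List object. sketch_partition is a hypothetical name, and placing the shorter sub-list at the end is an assumption, since the text above does not say which sub-list is the shorter one.

```python
def sketch_partition(items, k, index_start=0, index_end=-1):
    """Yield non-overlapping sub-lists of at most k items from [index_start, index_end]."""
    size = len(items)
    # Both bounds are inclusive and may be negative.
    start = index_start + size if index_start < 0 else index_start
    end = index_end + size if index_end < 0 else index_end
    for low in range(start, end + 1, k):
        # The final slice is cut at index_end, so it may hold fewer than k items.
        yield items[low : min(low + k, end + 1)]


items = [1, 2, 3, 4, 5, 6, 7]
parts = list(sketch_partition(items, k=3))
for part in parts:
    print(part)
```

Seven items with k=3 give two full sub-lists and one sub-list holding the single remaining item.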