bi_etl.lookups.disk_lookup module
Created on May 15, 2015
@author: Derek Wood
- class bi_etl.lookups.disk_lookup.DiskLookup(lookup_name: str, lookup_keys: list, parent_component: ETLComponent, config: BI_ETL_Config_Base = None, use_value_cache: bool = True, path=None, init_parent: bool = True, **kwargs)
Bases: Lookup
- COLLECTION_INDEX = datetime.datetime(1900, 1, 1, 0, 0)
- DB_LOOKUP_WARNING = 1000
- DEFAULT_PATH = None
- VERSION_COLLECTION_TYPE
alias of OOBTree
- __init__(lookup_name: str, lookup_keys: list, parent_component: ETLComponent, config: BI_ETL_Config_Base = None, use_value_cache: bool = True, path=None, init_parent: bool = True, **kwargs)
The optional parameter path sets where the lookup files are persisted on disk.
- cache_row(row: Row, allow_update: bool = True, allow_insert: bool = True)
Adds the given row to the cache for this lookup.
- Parameters:
- Raises:
ValueError – If allow_update is False and an already existing row (lookup key) is passed in.
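The effect of the allow_update and allow_insert flags can be pictured with a plain dictionary standing in for the cache. This is only a sketch of the behaviour described above, not bi_etl's implementation: the SketchCache class and its _rows dictionary are hypothetical, and silently skipping a new key when allow_insert is False is an assumption the documentation does not confirm.

```python
class SketchCache:
    """Hypothetical stand-in for a lookup cache keyed by a tuple of key values."""

    def __init__(self, lookup_keys):
        self.lookup_keys = list(lookup_keys)
        self._rows = {}  # maps lookup-key tuple -> cached row

    def _key(self, row):
        # Build the hashable key from the lookup key columns, in order.
        return tuple(row[k] for k in self.lookup_keys)

    def cache_row(self, row, allow_update=True, allow_insert=True):
        key = self._key(row)
        if key in self._rows:
            if not allow_update:
                raise ValueError(
                    f"Key {key!r} is already cached and allow_update is False"
                )
            self._rows[key] = row  # replace the existing row
        elif allow_insert:
            self._rows[key] = row  # first time this key is seen
        # Assumption: a new key with allow_insert=False is skipped silently.


cache = SketchCache(["customer_id"])
cache.cache_row({"customer_id": 1, "name": "Ann"})
cache.cache_row({"customer_id": 1, "name": "Anne"})  # replaces the cached row

try:
    cache.cache_row({"customer_id": 1, "name": "Bob"}, allow_update=False)
    raised = False
except ValueError:
    raised = True  # the duplicate key is rejected when updates are disallowed
```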
- cache_set(lk_tuple: tuple, version_collection: OOBTree[datetime, Row], allow_update: bool = True)
Adds the given set of rows to the cache for this lookup.
- Parameters:
- Raises:
ValueError – If allow_update is False and an already existing row (lookup key) is passed in.
- commit()
Placeholder for other implementations that might need it
- estimated_row_size()
- find(row: ROW_TYPES, fallback_to_db: bool = True, maintain_cache: bool = True, stats: Statistics = None, **kwargs) Row
- find_in_cache(row: ROW_TYPES, **kwargs) Row
Find a matching row in the lookup based on the lookup index (keys)
- find_in_remote_table(row: ROW_TYPES, **kwargs) Row
Find a matching row in the lookup based on the lookup index (keys)
Only works if parent_component is based on bi_etl.components.readonlytable
- Parameters:
row – The row containing the key values to search for
- Return type:
A row
- find_versions_list(row: ROW_TYPES, fallback_to_db: bool = True, maintain_cache: bool = True, stats: Statistics = None) list
- find_where(key_names: Sequence, key_values_dict: Mapping, limit: int = None)
Scans all cached rows (expensive) to find the list of rows that match the given criteria.
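A full scan of this kind can be sketched as a standalone function. The name find_where_sketch and the list-of-dicts cache are hypothetical, and treating limit as a cap on the number of rows returned is an assumption based only on the parameter name.

```python
from typing import Iterable, List, Mapping, Optional, Sequence


def find_where_sketch(
    cached_rows: Iterable[Mapping],
    key_names: Sequence[str],
    key_values_dict: Mapping,
    limit: Optional[int] = None,
) -> List[Mapping]:
    """Linear scan: compare every cached row against the requested key values."""
    matches = []
    for row in cached_rows:
        if all(row[name] == key_values_dict[name] for name in key_names):
            matches.append(row)
            # Assumption: stop scanning once `limit` matches have been collected.
            if limit is not None and len(matches) >= limit:
                break
    return matches


rows = [
    {"region": "EU", "status": "open", "id": 1},
    {"region": "US", "status": "open", "id": 2},
    {"region": "EU", "status": "open", "id": 3},
]
eu_open = find_where_sketch(rows, ["region", "status"], {"region": "EU", "status": "open"})
first_only = find_where_sketch(rows, ["region"], {"region": "EU"}, limit=1)
```

Because every cached row is visited, the cost grows linearly with the size of the cache, which is why the method is flagged as expensive compared with a keyed find.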
- get_hashable_combined_key(row: ROW_TYPES) Sequence
- get_versions_collection(row: ROW_TYPES) MutableMapping[datetime, Row]
This method exists for compatibility with range caches
- Parameters:
row – The row containing the key values to search for
- Return type:
A MutableMapping of rows
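One way to read the compatibility note: a range cache keeps several dated versions per key, so code written against range caches expects a mapping of datetime to row. A lookup without date ranges can satisfy the same interface by wrapping its single row in a one-entry mapping. The sketch below assumes the fixed key is the COLLECTION_INDEX value listed in the class attributes, and it uses a plain dict in place of OOBTree so that it runs without the BTrees package. The function name is hypothetical.

```python
from datetime import datetime

# Value taken from the COLLECTION_INDEX class attribute documented above.
COLLECTION_INDEX = datetime(1900, 1, 1, 0, 0)


def versions_collection_sketch(cached_row):
    """Wrap a single cached row so it looks like a one-version range collection."""
    # Assumption: non-range lookups store their only version under COLLECTION_INDEX.
    return {COLLECTION_INDEX: cached_row}


versions = versions_collection_sketch({"customer_id": 1, "name": "Ann"})

# Callers written for range caches can iterate over versions uniformly.
effective_dates = sorted(versions)
latest_row = versions[effective_dates[-1]]
```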
- has_done_get_estimate_row_size()
- has_row(row: ROW_TYPES) bool
Returns whether the row exists in the cache (for any date, if it is a date-range cache).
- Parameters:
row –
- property lookup_keys_set
- row_iteration_header_has_lookup_keys(row_iteration_header: RowIterationHeader) bool
- static rstrip_key_value(val: object) object
Since most, if not all, DBs consider two strings that only differ in trailing blanks to be equal, we need to rstrip any string values so that the lookup does the same.
- Parameters:
val –
- Returns:
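The normalisation described for rstrip_key_value can be sketched as below. This is a minimal illustration of the stated behaviour, not necessarily the library's code; the function name is hypothetical. Note that str.rstrip() with no argument removes all trailing whitespace (tabs and newlines included), not only spaces.

```python
def rstrip_key_value_sketch(val: object) -> object:
    """Strip trailing whitespace from string key values; pass other types through."""
    if isinstance(val, str):
        return val.rstrip()
    return val


# Two keys that a database would typically treat as equal now hash identically.
key_from_source = (rstrip_key_value_sketch("ABC   "), rstrip_key_value_sketch(42))
key_from_cache = (rstrip_key_value_sketch("ABC"), rstrip_key_value_sketch(42))
same_key = key_from_source == key_from_cache
```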
- uncache_row(row: ROW_TYPES)
- uncache_set(row: ROW_TYPES)