bi_etl.lookups.disk_lookup module
Created on May 15, 2015
@author: Derek Wood
- class bi_etl.lookups.disk_lookup.DiskLookup(lookup_name: str, lookup_keys: list, parent_component: ETLComponent, config: BI_ETL_Config_Base = None, use_value_cache: bool = True, path=None, init_parent: bool = True, **kwargs)
- Bases: Lookup
 - COLLECTION_INDEX = datetime.datetime(1900, 1, 1, 0, 0)
 - DB_LOOKUP_WARNING = 1000
 - DEFAULT_PATH = None
 - VERSION_COLLECTION_TYPE
- alias of OOBTree
 - __init__(lookup_name: str, lookup_keys: list, parent_component: ETLComponent, config: BI_ETL_Config_Base = None, use_value_cache: bool = True, path=None, init_parent: bool = True, **kwargs)
- The optional parameter path specifies where the lookup files should be persisted to disk.
 - cache_row(row: Row, allow_update: bool = True, allow_insert: bool = True)
- Adds the given row to the cache for this lookup.
- Raises:
- ValueError – If allow_update is False and the row passed in has a lookup key that already exists in the cache.
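The allow_update rule described above can be sketched with a plain dictionary. This is only an illustration under stated assumptions: TinyLookupCache and its helper methods are hypothetical stand-ins, not the bi_etl implementation, and allow_insert is left out because its behavior is not described here.

```python
from typing import Any, Dict, Tuple


class TinyLookupCache:
    """Hypothetical stand-in for a lookup cache; not the bi_etl implementation."""

    def __init__(self, lookup_keys):
        self.lookup_keys = list(lookup_keys)
        # Maps a tuple of lookup key values to the cached row.
        self._cache: Dict[Tuple[Any, ...], dict] = {}

    def _key_for(self, row: dict) -> tuple:
        # Build a hashable key from the lookup key columns of the row.
        return tuple(row[name] for name in self.lookup_keys)

    def cache_row(self, row: dict, allow_update: bool = True) -> None:
        key = self._key_for(row)
        # Refuse to overwrite an existing entry when updates are not allowed.
        if not allow_update and key in self._cache:
            raise ValueError(f"Key {key!r} is already cached and allow_update is False")
        self._cache[key] = row


cache = TinyLookupCache(["customer_id"])
cache.cache_row({"customer_id": 1, "name": "Ann"})
cache.cache_row({"customer_id": 1, "name": "Ann B."})  # update allowed by default

try:
    cache.cache_row({"customer_id": 1, "name": "Other"}, allow_update=False)
    rejected = False
except ValueError:
    rejected = True
```

In this sketch the rejected call leaves the previously cached row untouched, so a caller can treat the ValueError as a signal that the key was unexpectedly loaded twice.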
 
 - cache_set(lk_tuple: tuple, version_collection: OOBTree[datetime, Row], allow_update: bool = True)
- Adds the given set of rows to the cache for this lookup.
- Raises:
- ValueError – If allow_update is False and the lookup key passed in already exists in the cache.
 
 - commit()
- Placeholder for other implementations that might need it 
 - estimated_row_size()
 - find(row: ROW_TYPES, fallback_to_db: bool = True, maintain_cache: bool = True, stats: Statistics = None, **kwargs) Row
 - find_in_cache(row: ROW_TYPES, **kwargs) Row
- Find a matching row in the lookup based on the lookup index (keys) 
 - find_in_remote_table(row: ROW_TYPES, **kwargs) Row
- Find a matching row in the lookup based on the lookup index (keys).
- Only works if parent_component is based on bi_etl.components.readonlytable.
- Parameters:
- row – The row with the keys to search for.
- Return type:
- A row 
 
 - find_versions_list(row: ROW_TYPES, fallback_to_db: bool = True, maintain_cache: bool = True, stats: Statistics = None) list
 - find_where(key_names: Sequence, key_values_dict: Mapping, limit: int = None)
- Scan all cached rows (expensive) to find list of rows that match criteria. 
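To show why this kind of scan is expensive, here is a minimal sketch of a linear search over cached rows. It assumes rows behave like mappings and that a row matches when every name in key_names has the value given in key_values_dict; the function find_where_sketch and the sample data are hypothetical and not taken from bi_etl.

```python
from typing import Iterable, List, Mapping, Optional, Sequence


def find_where_sketch(
    rows: Iterable[Mapping],
    key_names: Sequence[str],
    key_values_dict: Mapping,
    limit: Optional[int] = None,
) -> List[Mapping]:
    """Linear scan: every row is inspected until the optional limit is reached."""
    matches = []
    for row in rows:
        # A row matches only if all requested columns equal the wanted values.
        if all(row[name] == key_values_dict[name] for name in key_names):
            matches.append(row)
            if limit is not None and len(matches) >= limit:
                break
    return matches


cached_rows = [
    {"id": 1, "region": "EU", "status": "active"},
    {"id": 2, "region": "US", "status": "active"},
    {"id": 3, "region": "EU", "status": "closed"},
    {"id": 4, "region": "EU", "status": "active"},
]

eu_active = find_where_sketch(
    cached_rows,
    ["region", "status"],
    {"region": "EU", "status": "active"},
)
first_eu = find_where_sketch(cached_rows, ["region"], {"region": "EU"}, limit=1)
```

Because the columns being matched are not the lookup keys, no index can be used and the cost grows with the number of cached rows; passing a limit lets the scan stop early.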
 - get_hashable_combined_key(row: ROW_TYPES) Sequence
 - get_versions_collection(row: ROW_TYPES) MutableMapping[datetime, Row]
- This method exists for compatibility with range caches.
- Parameters:
- row – The row with the keys to search for.
- Return type:
- A MutableMapping of rows 
 
 - has_done_get_estimate_row_size()
 - has_row(row: ROW_TYPES) bool
- Does the row exist in the cache (for any date, if it is a date range cache)?
- Parameters:
- row
 
 - property lookup_keys_set
 - row_iteration_header_has_lookup_keys(row_iteration_header: RowIterationHeader) bool
 - static rstrip_key_value(val: object) object
- Most, if not all, databases treat two strings that differ only in trailing blanks as equal, so string values are rstripped to make the lookup behave the same way.
- Parameters:
- val
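The trailing-blank normalization described above might look roughly like the following. This is a sketch under the assumption that only str values are changed and everything else passes through as-is; rstrip_key_value_sketch is a hypothetical name, not the library's own code.

```python
def rstrip_key_value_sketch(val: object) -> object:
    """Strip trailing whitespace from strings; return other values unchanged."""
    if isinstance(val, str):
        # Note: str.rstrip() removes all trailing whitespace, not only spaces.
        return val.rstrip()
    return val


# Two key tuples that differ only in trailing blanks normalize to the same key.
key_a = tuple(rstrip_key_value_sketch(v) for v in ("ABC  ", 42))
key_b = tuple(rstrip_key_value_sketch(v) for v in ("ABC", 42))
```

With this normalization applied to each key value before the key is hashed, a value read from a fixed-width CHAR column and the same value from a VARCHAR column resolve to the same cache entry.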
 
 - uncache_row(row: ROW_TYPES)
 - uncache_set(row: ROW_TYPES)