bi_etl.components.row.row_case_insensitive module

Created on Sep 17, 2014

@author: Derek Wood

class bi_etl.components.row.row_case_insensitive.RowCaseInsensitive(iteration_header: RowIterationHeaderCaseInsensitive, data: MutableMapping | list | namedtuple | None = None, status: RowStatus = None, allocate_space=True)

Bases: Row

Replacement for core SQLAlchemy, CSV, or other dictionary-based rows. Handles converting column names (keys) between upper and lower case. Handles column names (keys) that are SQLAlchemy column objects. Keeps the order of the columns (see columns_in_order).
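The idea of case-insensitive column names with preserved column order can be illustrated with a small standalone sketch. It does not use bi_etl and is not how the library implements it: the class name CaseInsensitiveRecord and the choice to fold keys to lower case are assumptions made only for this illustration.

```python
from collections.abc import MutableMapping


class CaseInsensitiveRecord(MutableMapping):
    """Hypothetical stand-in: a mapping whose string keys ignore case."""

    def __init__(self, data=None):
        # Python dicts preserve insertion order, which gives us column order.
        self._values = {}
        if data:
            self.update(data)

    @staticmethod
    def _normalize(key):
        # Assumption: keys are folded to lower case; non-strings pass through.
        return key.lower() if isinstance(key, str) else key

    def __getitem__(self, key):
        return self._values[self._normalize(key)]

    def __setitem__(self, key, value):
        self._values[self._normalize(key)] = value

    def __delitem__(self, key):
        del self._values[self._normalize(key)]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


record = CaseInsensitiveRecord({"Customer_ID": 42, "NAME": "Ada"})
record["customer_id"] = 43   # same column, different case
columns = list(record)       # order of first definition is kept
```

Because every access goes through the same normalization step, `record["CUSTOMER_ID"]` and `record["customer_id"]` refer to one column rather than two.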

NUMERIC_TYPES = {<class 'decimal.Decimal'>, <class 'float'>, <class 'int'>}
RAISE_ON_NOT_EXIST_NAME = 'raise_on_not_exist'
RowIterationHeader_Class

alias of RowIterationHeaderCaseInsensitive

__init__(iteration_header: RowIterationHeaderCaseInsensitive, data: MutableMapping | list | namedtuple | None = None, status: RowStatus = None, allocate_space=True)

Note: If data is passed here, it uses bi_etl.components.row.row.Row.update() to map the data into the columns. That is nicely automatic, but slower since it has to try various ways to read the data object.

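The fastest way is to not pass any data values, and to follow with a call to one of the more specific update_from_* methods documented below (for example update_from_dict, update_from_namedtuple, update_from_tuples, or update_from_values).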

property as_dict: dict
property as_key_value_list: list
clear() None.  Remove all items from D.
clone() Row

Create a clone of this row.

property column_count: int

Returns count of how many columns are in this row.

Pass through call to iteration_header.column_count.

column_position(column_name)

Get the column position (1 based) given a column name.

Parameters:

column_name (str) – The column name to find the position of

property column_set: frozenset

An ImmutableSet of the columns of this row. Used to store different row configurations in a dictionary or set.

WARNING: The resulting set is not ordered. Do not use if the column order affects the operation. See positioned_column_set instead.

Pass through call to iteration_header.column_set.

property columns_in_order: Sequence

A list of the columns of this row in the order they were defined.

Note: If the Row was created using a dict or dict like source, there was no order for the Row to work with.

compare_to(other_row: Row, exclude: Iterable = None, compare_only: Iterable = None, coerce_types: bool = True) MutableSequence[ColumnDifference]

Compare one RowCaseInsensitive to another. Returns a list of differences.

Parameters:
  • other_row

  • exclude

  • compare_only

  • coerce_types

Return type:

List of differences
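A rough sketch of how exclude and compare_only narrow a comparison, written against plain dicts. The function compare_rows and the (name, old, new) tuples are hypothetical; the real method returns ColumnDifference objects, and coerce_types is deliberately left out because its exact behavior is not described here.

```python
def compare_rows(row, other_row, exclude=None, compare_only=None):
    """Hypothetical helper: list the columns whose values differ."""
    exclude = set(exclude or ())
    # compare_only, when given, restricts the comparison to those columns.
    names = list(compare_only) if compare_only is not None else list(row)
    differences = []
    for name in names:
        if name in exclude:
            continue
        if row[name] != other_row[name]:
            differences.append((name, row[name], other_row[name]))
    return differences


old = {"id": 1, "name": "Ada", "updated": "2020-01-01"}
new = {"id": 1, "name": "Ada L.", "updated": "2021-06-30"}

all_diffs = compare_rows(old, new)
business_diffs = compare_rows(old, new, exclude=["updated"])
```

Excluding audit-style columns such as the made-up `updated` column is a typical reason to pass exclude.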

get(k[, d]) D[k] if k in D, else d.  d defaults to None.
get_by_position(position)

Get the column value by position. Note: The first column position is 1 (not 0 like a python list).

get_column_name(column_specifier, raise_on_not_exist=True)
get_column_position(column_specifier) int

Get the ordinal column position based on a column name (str or sqlalchemy.sql.schema.Column)

get_name_by_position(position)

Get the column name in a given position. Note: The first column position is 1 (not 0 like a python list).

items() a set-like object providing a view on D's items
iteration_header

The bi_etl.components.row.row_iteration_header.RowIterationHeader instance that provides a shared definition of columns across many Row instances.

NOTE: Changes to the columns, such as adding a new column, will replace the iteration_header of this Row. If two or more Rows get the same change, they will all share the same new RowIterationHeader instance as their iteration_header value.
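The sharing behavior described in the note can be pictured with a simplified model. Header, SimpleRow, and with_added_column are hypothetical names for this sketch only, and the caching strategy is an assumption rather than a description of RowIterationHeader internals.

```python
class Header:
    """Hypothetical stand-in for a shared column definition."""

    def __init__(self, columns):
        self.columns = tuple(columns)
        self._children = {}   # cache: added column name -> derived Header

    def with_added_column(self, name):
        # Rows that make the same change get the same derived header.
        if name not in self._children:
            self._children[name] = Header(self.columns + (name,))
        return self._children[name]


class SimpleRow:
    """Hypothetical row: values only; column names live in the header."""

    def __init__(self, header, values):
        self.header = header
        self.values = list(values)

    def add_column(self, name, value):
        # The row switches to a new header; the old one is left untouched.
        self.header = self.header.with_added_column(name)
        self.values.append(value)


base = Header(["id", "name"])
row_1 = SimpleRow(base, [1, "Ada"])
row_2 = SimpleRow(base, [2, "Grace"])
row_3 = SimpleRow(base, [3, "Edsger"])

row_1.add_column("city", "London")
row_2.add_column("city", "New York")
```

In this model row_1 and row_2 end up sharing one new header, while row_3 still points at the original, so column metadata is stored once per layout instead of once per row.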

keys() a set-like object providing a view on D's keys
property name
pop(k[, d]) v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() (k, v)

Remove and return some (key, value) pair as a 2-tuple; raise KeyError if D is empty.

property positioned_column_set: Set[tuple]

An ImmutableSet of the tuples (column, position) for this row. Used to store different row configurations in a dictionary or set.

Note: column_set would not always work here because the set is not ordered even though the columns are.

Pass through call to iteration_header.positioned_column_set.
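The difference between the two sets can be shown with plain frozenset objects. This is only an illustration: the variable names are made up, and the assumption that the tuples hold 1-based positions is not confirmed by this page.

```python
cols_a = ["id", "name"]
cols_b = ["name", "id"]   # same columns, different order

# Unordered view: only the column names matter.
column_set_a = frozenset(cols_a)
column_set_b = frozenset(cols_b)

# Positioned view: (column, position) pairs. Assumption: positions are 1-based.
positioned_a = frozenset((c, p) for p, c in enumerate(cols_a, start=1))
positioned_b = frozenset((c, p) for p, c in enumerate(cols_b, start=1))

same_columns = column_set_a == column_set_b   # order is ignored
same_layout = positioned_a == positioned_b    # order is captured

# Both kinds of set are hashable, so they can serve as dictionary keys.
layouts = {positioned_a: "layout A", positioned_b: "layout B"}
```

The two layouts collapse into one key under the unordered view but stay distinct under the positioned view.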

property primary_key
remove_columns(remove_list, ignore_missing=False)

Remove columns from this row instance (changes to a new RowIterationHeader)

Parameters:
  • remove_list – A list of column names to remove

  • ignore_missing – Ignore (don’t raise error) if we don’t have a column with a given name. Defaults to False.

rename_column(old_name, new_name, ignore_missing=False)

Rename a column

Parameters:
  • old_name (str) – The name of the column to find and rename.

  • new_name (str) – The new name to give the column.

  • ignore_missing (boolean) – Ignore (don’t raise error) if we don’t have a column with the name in old_name. Defaults to False

rename_columns(rename_map: dict | List[tuple], ignore_missing: bool = False)

Rename many columns at once.

Parameters:
  • rename_map – A dict or list of tuples to use to rename columns. Note: a list of tuples is better to use if the renames need to happen in a certain order.

  • ignore_missing – Ignore (don’t raise error) if we don’t have a column matching one of the old names in rename_map. Defaults to False.
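Why order can matter for renames is easiest to see with a swap. The sketch below works on a plain dict; the standalone rename_columns function is a hypothetical helper that applies renames one at a time, which is an assumption about the semantics rather than the library's code.

```python
def rename_columns(row, rename_map):
    """Hypothetical helper: apply renames one at a time, in the order given."""
    pairs = rename_map.items() if isinstance(rename_map, dict) else rename_map
    result = dict(row)
    for old_name, new_name in pairs:
        # Rebuild the dict so the renamed column keeps its position.
        result = {
            (new_name if name == old_name else name): value
            for name, value in result.items()
        }
    return result


row = {"a": 1, "b": 2}
# Chained renames: "a" must become "tmp" before "b" can take the name "a".
swapped = rename_columns(row, [("a", "tmp"), ("b", "a"), ("tmp", "b")])
```

A list of tuples makes the intended sequence explicit, which is the reason the parameter description recommends it for order-dependent renames.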

set_by_position(position, value)

Set the column value by position. Note: The first column position is 1 (not 0 like a python list).

set_by_zposition(zposition, value)

Set the column value by zposition (zero based). Note: The first column position is 0 for this method.
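The two numbering schemes differ by exactly one. A minimal sketch over a plain list, where the module-level functions are hypothetical stand-ins named after the methods they imitate:

```python
columns = ["id", "name", "city"]
values = [7, "Ada", "London"]


def get_by_position(position):
    """Hypothetical helper: read a value using a 1-based position."""
    if position < 1:
        raise IndexError("positions start at 1")
    return values[position - 1]


def set_by_zposition(zposition, value):
    """Hypothetical helper: write a value using a 0-based position."""
    values[zposition] = value


first = get_by_position(1)         # 1-based: the first column
set_by_zposition(0, 8)             # 0-based: the same first column
first_after = get_by_position(1)
```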

set_by_zposition_unsafe(zposition, value)
set_keeping_parent(column_name: str | Column, value)

Save and restore the iteration header parent in case we are adding the key to the header. This saves time in build_row since it can know the row is “safe” for quick building.

Parameters:
  • column_name

  • value

Returns:

None

setdefault(k[, d]) D.get(k,d), also set D[k]=d if k not in D
status
str_formatted()
subset(exclude: Iterable | None = None, rename_map: dict | List[tuple] | None = None, keep_only: Iterable | None = None) Row

Return a new row instance with a subset of the columns. The original row is not modified. Excludes are done first, then renames, and finally keep_only. The new instance will have a different RowIterationHeader.

Parameters:
  • exclude – A list of column names (before renames) to exclude from the subset. Optional. Defaults to no excludes.

  • rename_map – A dict to use to rename columns. Optional. Defaults to no renames.

  • keep_only – A list of column names (after renames) of columns to keep. Optional. Defaults to keep all.
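The documented order (exclude, then rename, then keep_only) determines which names each argument must use. A sketch on a plain dict, with a hypothetical standalone subset function:

```python
def subset(row, exclude=None, rename_map=None, keep_only=None):
    """Hypothetical helper mirroring the documented order of operations."""
    exclude = set(exclude or ())
    rename_map = dict(rename_map or {})
    # 1. Excludes use the names as they are before any rename.
    result = {k: v for k, v in row.items() if k not in exclude}
    # 2. Renames.
    result = {rename_map.get(k, k): v for k, v in result.items()}
    # 3. keep_only uses the names as they are after the renames.
    if keep_only is not None:
        keep = set(keep_only)
        result = {k: v for k, v in result.items() if k in keep}
    return result


original = {"id": 1, "nm": "Ada", "tmp": None, "city": "London"}
smaller = subset(
    original,
    exclude=["tmp"],
    rename_map={"nm": "name"},
    keep_only=["id", "name"],
)
```

Note that keep_only refers to `name`, the post-rename name, while exclude would have to refer to `nm` if that column were to be dropped.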

transform(column_specifier: str, transform_function: Callable, *args, **kwargs)

Apply a transformation to a column. The transformation function must take the value to be transformed as its first argument.

Parameters:
  • column_specifier (str) – The column name in the row to be transformed

  • transform_function (func) – The transformation function to use. It must take the value to be transformed as its first argument.

  • args – Positional arguments to pass to transform_function

  • kwargs – Keyword arguments to pass to transform_function

Keyword parameters used directly:
  • raise_on_not_exist – Should this function raise an error if the column_specifier doesn’t match an existing column. Must be passed as a keyword arg. Defaults to True.

All other keyword parameters are passed along to the transform_function.
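The calling convention (current value first, then any extra positional and keyword arguments) can be sketched on a plain dict. The standalone transform function below is a hypothetical imitation, not the method itself.

```python
def transform(row, column_specifier, transform_function, *args,
              raise_on_not_exist=True, **kwargs):
    """Hypothetical helper: apply transform_function to one column in place."""
    if column_specifier not in row:
        if raise_on_not_exist:
            raise KeyError(column_specifier)
        return row
    # The current value is always passed as the first positional argument.
    row[column_specifier] = transform_function(
        row[column_specifier], *args, **kwargs
    )
    return row


row = {"price": "19.987", "code": "ab", "hex": "ff"}
transform(row, "price", float)
transform(row, "price", round, 2)      # extra positional argument
transform(row, "code", str.upper)
transform(row, "hex", int, base=16)    # extra keyword argument
transform(row, "missing", str.upper, raise_on_not_exist=False)  # no error
```

Since raise_on_not_exist is consumed by the helper itself, it never reaches the transformation function, matching the "used directly" wording above.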

update(*args, **key_word_arguments)

Update the row values from a data container such as a dict instance. Adds columns for any new names found.

NOTE: This method is easy (nicely automatic) to use but slow since it has to try various ways to read the data container object.

Consider using the appropriate one of the more specific update methods based on the source data container.
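The trade-off between a catch-all update and a type-specific one can be sketched as follows. Both functions are hypothetical and operate on a plain dict; the probing order shown is an assumption, not the library's actual dispatch logic.

```python
from collections import namedtuple


def update_generic(target, data):
    """Hypothetical catch-all: probe the container type before copying."""
    if hasattr(data, "_asdict"):     # namedtuple instance
        target.update(data._asdict())
    elif hasattr(data, "items"):     # dict or other mapping
        target.update(data.items())
    else:                            # assume an iterable of (name, value) pairs
        target.update(dict(data))
    return target


def update_from_dict(target, source_dict):
    """Hypothetical specific path: no probing, the caller states the type."""
    target.update(source_dict)
    return target


Point = namedtuple("Point", ["x", "y"])

row = {}
update_generic(row, Point(1, 2))
update_generic(row, {"z": 3})
update_generic(row, [("label", "origin")])
update_from_dict(row, {"x": 10})
```

The generic path pays for its type checks on every call, which adds up when it runs once per row; the specific path skips them.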

update_from_dataclass(dataclass_inst)

Update the row values from a dataclass instance. Adds columns for any new names found.

update_from_dict(source_dict: dict)

Update the row values from a dict instance. Adds columns for any new names found.

update_from_namedtuple(source_data: namedtuple)

Update the row values from a namedtuple instance. Adds columns for any new names found.

update_from_pydantic(pydantic_inst: BaseModel)

Update the row values from a pydantic instance of BaseModel. Adds columns for any new names found.

update_from_row_proxy(source_row: Row)

Update the row values from a SQLAlchemy result row instance. Adds columns for any new names found.

update_from_tuples(tuples_list: List[tuple])

Update the row values from a list of tuples.

Each tuple should have 2 values:
  1. Column name

  2. Column value

Adds columns for any new names found.

update_from_values(values_list: list)

Update the row from a list of values. The length of the list should be at least as long as the number of columns (un-filled columns will be null). Extra values past the number of columns will be discarded.
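The padding and truncation described above can be sketched with a small helper. fill_from_values is a hypothetical name, and representing "null" as None is an assumption.

```python
def fill_from_values(column_count, values_list):
    """Hypothetical helper: fit a list of values to a fixed number of columns."""
    values = list(values_list[:column_count])               # discard extras
    values.extend([None] * (column_count - len(values)))    # pad with None
    return values


short = fill_from_values(3, [1, "Ada"])
long = fill_from_values(3, [1, "Ada", "London", "extra"])
```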

values() List

Return a list of the row values in the same order as the columns.