converter.runner.pandas
Module Contents
Classes
PandasGroupWrapper | Base class for the pandas implementation for any and all groups
PandasAnyWrapper | Pandas specific implementation of the any expression
PandasAllWrapper | Pandas specific implementation of the all expression
PandasRunner | Default implementation for a pandas-like runner
- converter.runner.pandas.get_logger()
- class converter.runner.pandas.PandasGroupWrapper(values)
Bases: converter.transformers.transform.GroupWrapper
Base class for the pandas implementation for any and all groups
- in_operator(self, x, y)
Checks whether the left hand side of the operator is contained in the right hand side.
- Parameters
x – The left hand side of the operator (lhs)
y – The right hand side of the operator (rhs)
- Returns
True if x in y, False otherwise
- not_in_operator(self, x, y)
Checks whether the left hand side of the operator is not contained in the right hand side.
- Parameters
x – The left hand side of the operator (lhs)
y – The right hand side of the operator (rhs)
- Returns
True if x not in y, False otherwise
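The two membership operators can be illustrated with a minimal sketch. It is not the library's implementation: the standalone functions below are hypothetical and assume that a pandas Series on the left hand side is checked element-wise with `Series.isin`, while plain values fall back to Python's `in`.

```python
import pandas as pd


def in_operator(lhs, rhs):
    """Hypothetical helper: element-wise ``lhs in rhs`` for a Series, plain ``in`` otherwise."""
    if isinstance(lhs, pd.Series):
        # Series.isin returns a boolean Series, one entry per element of lhs
        return lhs.isin(list(rhs))
    return lhs in rhs


def not_in_operator(lhs, rhs):
    """Hypothetical helper: the negation of in_operator."""
    result = in_operator(lhs, rhs)
    # ``~`` inverts a boolean Series; ``not`` handles the scalar case
    return ~result if isinstance(result, pd.Series) else not result


codes = pd.Series(["GB", "FR", "US"])
in_mask = in_operator(codes, ["GB", "US"])
not_in_mask = not_in_operator(codes, ["GB", "US"])
```

Returning a boolean mask rather than a single value keeps the check vectorised, so it can be combined with other column-level conditions.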
- class converter.runner.pandas.PandasAnyWrapper(values)
Bases: PandasGroupWrapper
Pandas specific implementation of the any expression
- check_fn(self, values)
Checks the results of the operator. This should reduce the results in the values list to a single value.
- Parameters
values – The results from the operator comparison
- Returns
The reduced result
- class converter.runner.pandas.PandasAllWrapper(values)
Bases: PandasGroupWrapper
Pandas specific implementation of the all expression
- check_fn(self, values)
Checks the results of the operator. This should reduce the results in the values list to a single value.
- Parameters
values – The results from the operator comparison
- Returns
The reduced result
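The reduction described for `check_fn` can be sketched as follows. This assumes that each entry in the list is a boolean Series produced by one comparison, and that "any" and "all" reduce with element-wise OR and AND respectively. The function names are hypothetical and are not taken from the module.

```python
import operator
from functools import reduce

import pandas as pd


def any_check(values):
    """Hypothetical reduction for an ``any`` group: element-wise OR across all results."""
    return reduce(operator.or_, values)


def all_check(values):
    """Hypothetical reduction for an ``all`` group: element-wise AND across all results."""
    return reduce(operator.and_, values)


col = pd.Series([1, 5, 10])
# Each comparison yields one boolean Series; the reduction folds them into one
any_result = any_check([col == 1, col == 10])
all_result = all_check([col > 0, col < 10])
```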
- converter.runner.pandas.logical_and_transformer(row, lhs, rhs)
- converter.runner.pandas.logical_or_transformer(row, lhs, rhs)
- converter.runner.pandas.logical_not_transformer(row, value)
- converter.runner.pandas.in_transformer(row, lhs, rhs)
- converter.runner.pandas.not_in_transformer(row, lhs, rhs)
- class converter.runner.pandas.StrReplace(series_type)
- __call__(self, row: converter.transformers.transform.RowType, target, *pattern_repl)
- class converter.runner.pandas.StrMatch(series_type)
- __call__(self, row: converter.transformers.transform.RowType, target, pattern: re.Pattern)
- class converter.runner.pandas.StrSearch(series_type)
- __call__(self, row: converter.transformers.transform.RowType, target, pattern: re.Pattern)
- class converter.runner.pandas.StrJoin(series_type)
- to_str(self, obj)
- concat(self, left, right)
- join(self, left, join, right)
- __call__(self, row: converter.transformers.transform.RowType, join, *elements)
- class converter.runner.pandas.ConversionError(value=None, reason=None)
- converter.runner.pandas.type_converter(to_type, nullable, null_values)
- class converter.runner.pandas.PandasRunner(config: converter.config.Config, **options)
Bases: converter.runner.base.BaseRunner
Default implementation for a pandas-like runner
- row_value_conversions
- dataframe_type
- series_type
- coerce_row_types(self, row, conversions: converter.mapping.base.ColumnConversions)
Changes the data type of each input column. If a cast fails, a warning is written to the logs and the row is ignored.
- Parameters
row – The input row.
conversions – The set of conversions to run
- Returns
The updated input row if there are no errors, None if any updates fail.
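The "warn and ignore the row" behaviour can be illustrated with a rough sketch. It is an assumption-laden stand-in rather than the method itself: `coerce_column_types` and its `conversions` mapping (column name to numeric dtype) are hypothetical, only numeric casts are handled, and failing rows are filtered out of a DataFrame instead of returning None for a single row.

```python
import logging

import pandas as pd

logger = logging.getLogger(__name__)


def coerce_column_types(df, conversions):
    """Hypothetical helper: cast columns to numeric dtypes, dropping rows that fail.

    ``conversions`` is an assumed mapping of column name -> numeric dtype string.
    """
    result = df.copy()
    for column, dtype in conversions.items():
        # Values that cannot be parsed become NaN instead of raising
        converted = pd.to_numeric(result[column], errors="coerce")
        # A failure is a value that was present but could not be converted
        failed = converted.isna() & result[column].notna()
        if failed.any():
            logger.warning(
                "Could not convert %d value(s) in column %r; ignoring those rows",
                int(failed.sum()),
                column,
            )
        result = result.loc[~failed].copy()
        result[column] = converted.loc[~failed].astype(dtype)
    return result


raw = pd.DataFrame({"id": ["1", "2", "x"], "value": ["1.5", "oops", "3.0"]})
clean = coerce_column_types(raw, {"id": "int64", "value": "float64"})
```

Only the first row survives here: the second fails the `value` cast and the third fails the `id` cast.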
- create_series(self, index, value)
- get_dataframe(self, extractor: converter.connector.base.BaseConnector) → pandas.DataFrame
Builds a dataframe from the extractor's data
- Parameters
extractor – The extractor providing the input data
- Returns
The created dataframe
- combine_column(self, row, current_column_value: Union[pandas.Series, converter.types.notset.NotSetType], entry: converter.mapping.base.TransformationEntry)
Combines the current column value with the result of the transformation. If the current value is NotSet, the value of the current transformation will be calculated and applied.
- Parameters
row – The row loaded from the extractor
current_column_value – Series representing the current transformed value
entry – The transformation to apply
- Returns
The combined column value
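One way to picture the combination step is the sketch below. The documentation does not say how values are merged, so this assumes that already-set values take priority and that a later transformation only fills the gaps, using `Series.combine_first`. `NOT_SET` is a hypothetical stand-in for `converter.types.notset.NotSet`, and the simplified function takes an already computed result rather than a row and a transformation entry.

```python
import pandas as pd

# Hypothetical sentinel standing in for converter.types.notset.NotSet
NOT_SET = object()


def combine_column(current, new_values):
    """Hypothetical helper: merge a transformation result into the current column."""
    if current is NOT_SET:
        # Nothing has been computed yet, so the new result becomes the column
        return new_values
    # Assumed rule: keep existing values, fill missing ones from the new result
    return current.combine_first(new_values)


first = pd.Series([1.0, None, None])
second = pd.Series([9.0, 2.0, None])
combined = combine_column(combine_column(NOT_SET, first), second)
```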
- assign(self, input_row: pandas.DataFrame, output_row: Union[pandas.DataFrame, converter.types.notset.NotSetType], **assignments)
Helper function for assigning a series to a dataframe. Some implementations of pandas are less efficient when starting from an empty dataframe, so None may be passed instead, in which case the initial dataframe is created from the first assigned series.
- Parameters
input_row – The row loaded from the extractor
output_row – The data frame to assign to or None
assignments – The assignments to apply to the dataframe
- Returns
The updated dataframe
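The lazy-creation idea can be sketched in a few lines. The standalone function below is hypothetical and simplified: it omits `input_row` and uses None as the "no dataframe yet" marker, as in the description, even though the signature above is typed with NotSetType.

```python
import pandas as pd


def assign(output_df, **assignments):
    """Hypothetical helper: create the frame on first use, otherwise add columns."""
    if output_df is None:
        # Build the initial dataframe directly from the first assigned series
        return pd.DataFrame(assignments)
    # DataFrame.assign returns a new frame with the extra columns
    return output_df.assign(**assignments)


out = assign(None, total=pd.Series([3, 7]))
out = assign(out, label=pd.Series(["a", "b"]))
```

Starting from the first series avoids constructing an empty dataframe and then growing it.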
- apply_transformation_entry(self, input_df: pandas.DataFrame, entry: converter.mapping.base.TransformationEntry) → Union[pandas.Series, converter.types.notset.NotSetType]
Applies a single transformation to the dataset, returning the result as a series.
- Parameters
input_df – The dataframe loaded from the extractor
entry – The transformation to apply
- Returns
The transformation result
- transform(self, extractor: converter.connector.base.BaseConnector, mapping: converter.mapping.base.BaseMapping) → Iterable[Dict[str, Any]]
Performs the transformation
- Parameters
extractor – The data connection to extract data from
mapping – Mapping object describing the transformations to apply
- Returns
An iterable containing the transformed data
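The return type, an iterable of dictionaries, suggests that the transformed dataframe is emitted one record per row. A minimal sketch of that final step follows. It assumes the result is held in a pandas DataFrame, and `iter_records` is a hypothetical name rather than part of the module.

```python
import pandas as pd


def iter_records(df):
    """Hypothetical helper: yield each row of the dataframe as a column -> value dict."""
    for record in df.to_dict(orient="records"):
        yield record


transformed = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
rows = list(iter_records(transformed))
```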