pm4py.objects.log.util.pl_lazy_fea_utils module
- pm4py.objects.log.util.pl_lazy_fea_utils.automatic_feature_selection_df(df: polars.LazyFrame, parameters: Dict[Any, Any] | None = None) → polars.LazyFrame
Selects useful features from a Polars LazyFrame for machine-learning purposes.
- pm4py.objects.log.util.pl_lazy_fea_utils.select_number_column(df: polars.LazyFrame, fea_df: polars.LazyFrame, col: str, case_id_key: str = 'case:concept:name') → polars.LazyFrame
Adds a numeric column to the feature LazyFrame.
- Notes on column duplication:
If fea_df already contains col (e.g., after repeated calls or duplicate inputs), Polars would add the incoming column as col_right during the join. Any prior versions of the column are therefore dropped explicitly before the join, which keeps the output schema stable.
The function also ensures that its internal row-number column does not collide with user data.
- pm4py.objects.log.util.pl_lazy_fea_utils.select_string_column(df: polars.LazyFrame, fea_df: polars.LazyFrame, col: str, case_id_key: str = 'case:concept:name', count_occurrences: bool = False) → polars.LazyFrame
Adds one-hot encoded columns for a categorical attribute, or count-encoded columns when count_occurrences is True.
- pm4py.objects.log.util.pl_lazy_fea_utils.select_string_columns(df: polars.LazyFrame, fea_df: polars.LazyFrame, columns: List[str], case_id_key: str = 'case:concept:name', count_occurrences: bool = False) → polars.LazyFrame
Adds one-hot encoded columns for each of the provided categorical attributes, or count-encoded columns when count_occurrences is True.
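The difference between one-hot and count encoding of a categorical attribute can be illustrated in plain Python, independently of Polars. This is only a conceptual sketch: the `encode` helper, the sample events, and the `attr_<value>` feature names are hypothetical and do not reflect the column names or the internals of the functions above.

```python
from collections import Counter, defaultdict

# Hypothetical event rows: (case id, value of one categorical attribute).
events = [
    ("c1", "A"), ("c1", "B"), ("c1", "A"),
    ("c2", "B"),
]


def encode(events, count_occurrences=False):
    """Return {case_id: {feature_name: int}} for one categorical attribute."""
    # Count how often each value occurs within each case.
    counts = defaultdict(Counter)
    for case_id, value in events:
        counts[case_id][value] += 1

    # One feature per distinct value observed anywhere in the log.
    values = sorted({value for _, value in events})

    features = {}
    for case_id, counter in counts.items():
        row = {}
        for value in values:
            n = counter[value]
            # Count encoding keeps n; one-hot encoding only records presence.
            # The "attr_<value>" naming is illustrative only.
            row[f"attr_{value}"] = n if count_occurrences else int(n > 0)
        features[case_id] = row
    return features


one_hot = encode(events)
counted = encode(events, count_occurrences=True)
print(one_hot)
print(counted)
```

In this sketch the two encodings differ only for case `c1`, where value `A` occurs twice: one-hot encoding records its presence, while count encoding records the number of occurrences.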