pm4py.statistics.chaotic_activities.variants.niek_sidorova module
- class pm4py.statistics.chaotic_activities.variants.niek_sidorova.Parameters(*values)
Bases: Enum
- ACTIVITY_KEY = 'pm4py:param:activity_key'
- ALPHA = 'alpha'
- pm4py.statistics.chaotic_activities.variants.niek_sidorova.apply(log: DataFrame | EventLog, parameters: Dict[Any, Any] | None = None) → List[Dict[str, Any]]
Compute information-theoretic metrics used to detect chaotic activities in an event log, as defined in:
Tax, Niek, Natalia Sidorova, and Wil MP van der Aalst. “Discovering more precise process models from event logs by filtering out chaotic activities.” Journal of Intelligent Information Systems 52.1 (2019): 107-139.
The result maps each activity to:
freq – absolute frequency #(a,L)
entropy – H(a,L) (direct entropy)
entropy_smooth – Hₛ(a,L) (Laplace‑smoothed entropy)
entropy_gain – ΔH (drop in total log‑entropy if a is removed)
chaotic_score – simple aggregate = (entropy_smooth+entropy_gain)/2
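The smoothing and the aggregate described above can be sketched in plain Python. This is a minimal illustration, not pm4py's implementation: the helper names `followers`, `entropy`, and `chaotic_score` are hypothetical, the logarithm base (2) is an assumption, and only the directly-follows distribution of an activity is considered, which may be narrower than the definition of H(a,L) in the paper.

```python
import math
from collections import Counter


def followers(traces, activity):
    """Count which activities directly follow `activity` across all traces."""
    counts = Counter()
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            if current == activity:
                counts[nxt] += 1
    return counts


def entropy(counts, alpha=None, n_categories=None):
    """Shannon entropy in bits (assumed base); Lidstone-smoothed when alpha is given."""
    k = n_categories if n_categories is not None else len(counts)
    pseudo = 0.0 if alpha is None else alpha
    total = sum(counts.values()) + pseudo * k
    if total == 0:
        return 0.0
    probs = [(c + pseudo) / total for c in counts.values()]
    # Categories that were never observed still receive the pseudo-count.
    probs += [pseudo / total] * (k - len(counts))
    return -sum(p * math.log2(p) for p in probs if p > 0)


def chaotic_score(entropy_smooth, entropy_gain):
    """Aggregate as documented: the mean of the two metrics."""
    return (entropy_smooth + entropy_gain) / 2


# Toy log (illustrative only): "x" appears at varying positions.
traces = [["a", "b", "c"], ["a", "x", "b", "c"], ["a", "b", "x", "c"]]
alphabet = {act for trace in traces for act in trace}

raw = entropy(followers(traces, "x"))  # alpha=None: raw entropy
smooth = entropy(
    followers(traces, "x"),
    alpha=1 / len(alphabet),           # the typical choice alpha = 1/|A|
    n_categories=len(alphabet),
)
```

With smoothing, successors that were never observed receive a small probability mass, so the smoothed value is at least as large as the raw one whenever the observed distribution does not already cover the whole alphabet.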
- Parameters:
log – Event log or pandas DataFrame
parameters – Variant-specific parameters, including:
Parameters.ALPHA: Laplace/Lidstone smoothing parameter α. None reproduces the raw entropy H(a,L); a typical choice following the paper is α = 1/|A|.
Parameters.ACTIVITY_KEY: the attribute to be used as activity. Default: “concept:name”
- Returns:
List of dictionaries, one per activity, sorted in decreasing order of chaotic score.
- Return type:
chaotic_activities
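A result of this shape can be post-processed with ordinary list operations. The sketch below uses hand-written rows rather than real pm4py output: the numbers are made-up placeholders, the `"activity"` key name is an assumption (the documentation above does not name the key that holds the activity label), and `most_chaotic` is a hypothetical helper.

```python
# Hypothetical rows shaped like the documented return value.
# The values are placeholders and the "activity" key is an assumption.
rows = [
    {"activity": "a", "freq": 3, "entropy_smooth": 0.4, "entropy_gain": 0.2},
    {"activity": "x", "freq": 2, "entropy_smooth": 1.5, "entropy_gain": 0.9},
]

# Recompute the documented aggregate: (entropy_smooth + entropy_gain) / 2.
for row in rows:
    row["chaotic_score"] = (row["entropy_smooth"] + row["entropy_gain"]) / 2

# Sort in decreasing order of chaotic score, matching the documented ordering.
ranked = sorted(rows, key=lambda r: r["chaotic_score"], reverse=True)


def most_chaotic(ranked_rows, top_k=1):
    """Return the labels of the top_k rows of an already-ranked list."""
    return [r["activity"] for r in ranked_rows[:top_k]]
```

Because the documented return value is already sorted, the explicit `sorted` call would only be needed for rows assembled by hand, as here.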
- pm4py.statistics.chaotic_activities.variants.niek_sidorova.chaotic_metrics(traces, alpha=None)
- Parameters:
traces (list[list[str]]) – The event log where each inner list is a trace (ordered events).
alpha (float | None) – Laplace/Lidstone smoothing parameter α. None reproduces the raw entropy H(a,L); a typical choice following the paper is α = 1/|A|.
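The `traces` argument is a plain list of lists, so it can be assembled from raw event records without any pm4py objects. The following is a minimal sketch under stated assumptions: the `(case_id, timestamp, activity)` tuple layout and the helper `to_traces` are hypothetical, and the snippet stops short of calling `chaotic_metrics` itself.

```python
from collections import defaultdict


def to_traces(events):
    """Group (case_id, timestamp, activity) tuples into ordered traces."""
    by_case = defaultdict(list)
    for case_id, timestamp, activity in events:
        by_case[case_id].append((timestamp, activity))
    # Sort each case by timestamp so events appear in execution order.
    return [
        [activity for _, activity in sorted(rows, key=lambda r: r[0])]
        for rows in by_case.values()
    ]


# Illustrative events, deliberately out of order within each case.
events = [
    ("c1", 2, "b"),
    ("c1", 1, "a"),
    ("c2", 1, "a"),
    ("c2", 3, "c"),
    ("c2", 2, "x"),
]

traces = to_traces(events)

# The typical smoothing choice alpha = 1/|A|, with A the set of activities.
alpha = 1 / len({activity for trace in traces for activity in trace})
```

The resulting `traces` and `alpha` have the types documented for the two parameters above.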
- Return type: