pm4py.statistics.chaotic_activities.variants.niek_sidorova module

class pm4py.statistics.chaotic_activities.variants.niek_sidorova.Parameters(*values)

Bases: Enum

ACTIVITY_KEY = 'pm4py:param:activity_key'
ALPHA = 'alpha'

pm4py.statistics.chaotic_activities.variants.niek_sidorova.apply(log: DataFrame | EventLog, parameters: Dict[Any, Any] | None = None) → List[Dict[str, Any]]

Compute the information-theoretic metrics used to detect chaotic activities in an event log, as defined in:

Tax, Niek, Natalia Sidorova, and Wil MP van der Aalst. “Discovering more precise process models from event logs by filtering out chaotic activities.” Journal of Intelligent Information Systems 52.1 (2019): 107-139.

The result maps each activity to:

  • freq – absolute frequency #(a,L)

  • entropy – H(a,L) (direct entropy)

  • entropy_smooth – Hₛ(a,L) (Laplace‑smoothed entropy)

  • entropy_gain – ΔH (drop in total log‑entropy if a is removed)

  • chaotic_score – simple aggregate = (entropy_smooth+entropy_gain)/2
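The entropy and entropy_smooth fields can be illustrated with a minimal sketch. It assumes, following the cited paper, that H(a,L) is the sum of the Shannon entropies of the distributions over what directly follows and what directly precedes a (including artificial start and end markers), and that smoothing adds α pseudo-counts to each of the |A|+1 possible outcomes. The function names, the marker strings and the choice of log base 2 are hypothetical; this is not the pm4py implementation.

```python
import math
from collections import Counter

START, END = "<start>", "<end>"  # hypothetical boundary markers


def neighbour_counts(traces, activity):
    """Count what directly follows and directly precedes `activity`."""
    follows, precedes = Counter(), Counter()
    for trace in traces:
        padded = [START, *trace, END]
        for prev, cur, nxt in zip(padded, padded[1:], padded[2:]):
            if cur == activity:
                precedes[prev] += 1
                follows[nxt] += 1
    return follows, precedes


def entropy(counts, n_outcomes, alpha=None):
    """Shannon entropy in bits; with alpha, every outcome gets alpha pseudo-counts."""
    total = sum(counts.values())
    if alpha is None:
        probs = [c / total for c in counts.values()]
    else:
        denom = total + alpha * n_outcomes
        unseen = n_outcomes - len(counts)
        probs = [(c + alpha) / denom for c in counts.values()]
        probs += [alpha / denom] * unseen
    return -sum(p * math.log2(p) for p in probs if p > 0)


def activity_entropy(traces, activity, alpha=None):
    """Assumed H(a,L) (alpha=None) or smoothed H_s(a,L) (alpha given)."""
    activities = {a for t in traces for a in t}
    n_outcomes = len(activities) + 1  # all activities plus one boundary marker
    follows, precedes = neighbour_counts(traces, activity)
    return entropy(follows, n_outcomes, alpha) + entropy(precedes, n_outcomes, alpha)


traces = [["a", "x", "b"], ["a", "b", "x"], ["x", "a", "b"]]
h_a = activity_entropy(traces, "a")
h_x = activity_entropy(traces, "x")  # "x" appears at a different position in every trace
h_a_smooth = activity_entropy(traces, "a", alpha=1 / 3)
```

In this toy log the activity x has a different neighbour on each side in every trace, so its entropy is higher than that of a; smoothing raises the entropy of a because unseen neighbours receive non-zero probability.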

Parameters:
  • log – Event log or Pandas dataframe

  • parameters – Variant-specific parameters, including:

    • Parameters.ALPHA: Laplace/Lidstone smoothing parameter α. None reproduces the raw entropy H(a,L); a typical choice following the paper is α = 1/|A|.

    • Parameters.ACTIVITY_KEY: the attribute to be used as activity. Default: “concept:name”

Returns:

List of dictionaries, one per activity, sorted in descending order of chaotic score.

Return type:

List[Dict[str, Any]]
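The aggregation and ordering described above can be sketched as follows. The metric values are made up for illustration, and the "activity" key and the helper name rank_by_chaotic_score are hypothetical; the exact keys of the dictionaries returned by apply are not stated in this page.

```python
def rank_by_chaotic_score(metrics):
    """Attach chaotic_score = (entropy_smooth + entropy_gain) / 2 and sort descending."""
    rows = []
    for activity, m in metrics.items():
        score = (m["entropy_smooth"] + m["entropy_gain"]) / 2
        # "activity" is an assumed key name, used here only for illustration
        rows.append({"activity": activity, **m, "chaotic_score": score})
    return sorted(rows, key=lambda r: r["chaotic_score"], reverse=True)


# Illustrative numbers only, not computed from a real log
metrics = {
    "a": {"freq": 3, "entropy": 1.8, "entropy_smooth": 2.0, "entropy_gain": 3.0},
    "x": {"freq": 3, "entropy": 3.1, "entropy_smooth": 3.2, "entropy_gain": 6.8},
}
ranked = rank_by_chaotic_score(metrics)
```

The first element of ranked is then the activity that looks most chaotic under this score.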

pm4py.statistics.chaotic_activities.variants.niek_sidorova.chaotic_metrics(traces, alpha=None)
Parameters:
  • traces (list[list[str]]) – The event log where each inner list is a trace (ordered events).

  • alpha (float | None) – Laplace/Lidstone smoothing parameter α. None reproduces the raw entropy H(a,L); a typical choice following the paper is α = 1/|A|.

Return type:

dict[str, dict] (activity → metrics)
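As a rough illustration of the expected input, the sketch below builds the list[list[str]] structure from flat event records and derives α = 1/|A|. The record layout (case id, timestamp, activity) and the helper to_traces are assumptions made for this example, not part of pm4py.

```python
from collections import defaultdict

# Hypothetical event records: (case id, timestamp, activity)
events = [
    ("c1", 2, "b"), ("c1", 1, "a"),
    ("c2", 1, "a"), ("c2", 3, "c"), ("c2", 2, "b"),
]


def to_traces(rows):
    """Group events by case and order each case by timestamp."""
    by_case = defaultdict(list)
    for case_id, timestamp, activity in rows:
        by_case[case_id].append((timestamp, activity))
    return [
        [activity for _, activity in sorted(evts, key=lambda e: e[0])]
        for evts in by_case.values()
    ]


traces = to_traces(events)

# alpha = 1/|A|, where A is the set of distinct activities in the log
activities = {a for trace in traces for a in trace}
alpha = 1 / len(activities)
```

The resulting traces and alpha have the shapes that the traces and alpha parameters describe.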

pm4py.statistics.chaotic_activities.variants.niek_sidorova.total_entropy(traces, alpha=None)

Return the total entropy of the log: Σₐ H(a,L) when alpha is None, or Σₐ Hₛ(a,L) when a smoothing parameter alpha is given.
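A minimal sketch of the unsmoothed total and of the entropy gain ΔH derived from it, assuming that H(a,L) sums the entropies of the directly-follows and directly-precedes distributions and that removing an activity means dropping its events from every trace. All names, the boundary markers and the log base are hypothetical; pm4py's own function may differ in details.

```python
import math
from collections import Counter


def raw_entropy(traces, activity):
    """Assumed H(a,L): entropy of successors plus entropy of predecessors, in bits."""
    follows, precedes = Counter(), Counter()
    for trace in traces:
        padded = ["<start>", *trace, "<end>"]  # hypothetical boundary markers
        for i in range(1, len(padded) - 1):
            if padded[i] == activity:
                precedes[padded[i - 1]] += 1
                follows[padded[i + 1]] += 1

    def h(counts):
        n = sum(counts.values())
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    return h(follows) + h(precedes)


def total_raw_entropy(traces):
    """Sum of H(a,L) over all activities that occur in the log."""
    activities = {a for trace in traces for a in trace}
    return sum(raw_entropy(traces, a) for a in activities)


def entropy_gain(traces, activity):
    """Drop in total entropy when all events of `activity` are removed."""
    filtered = [[e for e in trace if e != activity] for trace in traces]
    return total_raw_entropy(traces) - total_raw_entropy(filtered)


traces = [["a", "x", "b"], ["a", "b", "x"], ["x", "a", "b"]]
gain_x = entropy_gain(traces, "x")
gain_a = entropy_gain(traces, "a")
```

Removing x leaves three identical traces with zero total entropy, so its gain equals the total entropy of the original log and exceeds the gain obtained by removing a.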