API Reference

core

Core components of the SigAlg library, including fundamental classes and functions for sample spaces, probability measures, probability spaces, time indices, events, \(\sigma\)-algebras and their filtrations, and random variables and vectors.

Event

Bases: SampleSpaceMethods, Index

A class representing an event \(A\) in a sample space \(\Omega\).

In the mathematical theory, an event is a subset \(A\) of a sample space \(\Omega\) that is measurable with respect to a given \(\sigma\)-algebra. SigAlg does not enforce this requirement: any subset of the sample space can be represented as an Event, whether or not it is measurable.

Parameters:

Name Type Description Default
sample_space SampleSpace

The sample space to which this event belongs.

required
name Hashable | None

Name identifier for the event.

"A"
data_name Hashable | None

Name for the index of values.

"sample"

Raises:

Type Description
TypeError

If sample_space is not a SampleSpace instance.

Examples:

>>> from sigalg.core import Event, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=4)
>>> A = Event(name="A", sample_space=Omega).from_list(["omega_0", "omega_1"])
>>> B = Event(name="B", sample_space=Omega).from_list(["omega_1", "omega_2"])
>>> union = A | B
>>> union
Event 'A union B':
['omega_0', 'omega_1', 'omega_2']
>>> intersection = A & B
>>> intersection
Event 'A intersect B':
['omega_1']
>>> complement = ~A
>>> complement
Event 'A complement':
['omega_2', 'omega_3']
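
Because measurability is not enforced, it may be useful to check it by hand. On a finite sample space, a subset is measurable with respect to a \(\sigma\)-algebra exactly when it is a union of that \(\sigma\)-algebra's atoms. The following is a minimal plain-Python sketch of that check; it does not use the SigAlg API, and the names `is_measurable` and `atom_of` are hypothetical, chosen only for illustration.

```python
def is_measurable(event, atom_of):
    """Return True if `event` is a union of atoms.

    `atom_of` maps each sample point to the ID of its atom; it is a
    hypothetical stand-in for a sigma-algebra on a finite sample space.
    """
    event = set(event)
    # IDs of the atoms that the event intersects.
    touched = {atom_of[omega] for omega in event}
    # The event is measurable only if it contains each touched atom entirely.
    return all(
        omega in event for omega, atom in atom_of.items() if atom in touched
    )


atom_of = {"omega_0": 0, "omega_1": 0, "omega_2": 1, "omega_3": 1}
print(is_measurable({"omega_0", "omega_1"}, atom_of))  # True
print(is_measurable({"omega_1", "omega_2"}, atom_of))  # False
```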

complement

complement()

Return the complement of this event.

Returns:

Name Type Description
event Event

An event containing all sample points not in this event.

Examples:

>>> from sigalg.core import Event, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=3)
>>> A = Event(name="A", sample_space=Omega).from_list(indices=["omega_0"])
>>> A.complement()
Event 'A complement':
['omega_1', 'omega_2']

difference

difference(other)

Return the set difference of this event and another event.

Parameters:

Name Type Description Default
other Event

Another event from the same sample space.

required

Returns:

Name Type Description
event Event

An event containing sample points in this event but not in other.

Raises:

Type Description
ValueError

If events are from different sample spaces.

Examples:

>>> from sigalg.core import Event, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=3)
>>> A = Event(name="A", sample_space=Omega).from_list(indices=["omega_0", "omega_1"])
>>> B = Event(name="B", sample_space=Omega).from_list(indices=["omega_1", "omega_2"])
>>> A.difference(B)
Event 'A difference B':
['omega_0']

from_list

from_list(indices)

Create an Event from a list of sample point indices.

Parameters:

Name Type Description Default
indices list[Hashable]

List of sample point indices to include in the event.

required

Returns:

Name Type Description
self Event

The event instance with the specified sample points.

Examples:

>>> from sigalg.core import Event, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=4)
>>> A = Event(name="A", sample_space=Omega).from_list(indices=["omega_0", "omega_2"])
>>> A
Event 'A':
['omega_0', 'omega_2']

intersection

intersection(other)

Return the intersection of this event with another event.

Parameters:

Name Type Description Default
other Event

Another event from the same sample space.

required

Returns:

Name Type Description
event Event

An event containing sample points in both events.

Raises:

Type Description
ValueError

If events are from different sample spaces.

Examples:

>>> from sigalg.core import Event, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=3)
>>> A = Event(name="A", sample_space=Omega).from_list(indices=["omega_0", "omega_1"])
>>> B = Event(name="B", sample_space=Omega).from_list(indices=["omega_1", "omega_2"])
>>> A.intersection(B)
Event 'A intersect B':
['omega_1']

to_sample_space

to_sample_space()

Convert this event to a sample space.

Creates a new SampleSpace containing only the sample points in this event.

Returns:

Name Type Description
sample_space SampleSpace

A sample space containing this event's outcomes.

Examples:

>>> from sigalg.core import Event, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=3)
>>> A = Event(name="A", sample_space=Omega).from_list(indices=["omega_0", "omega_1"])
>>> A.to_sample_space()
Sample space 'A':
['omega_0', 'omega_1']

union

union(other)

Return the union of this event with another event.

Parameters:

Name Type Description Default
other Event

Another event from the same sample space.

required

Returns:

Name Type Description
event Event

An event containing sample points in either event.

Raises:

Type Description
ValueError

If events are from different sample spaces.

Examples:

>>> from sigalg.core import Event, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=3)
>>> A = Event(name="A", sample_space=Omega).from_list(indices=["omega_0"])
>>> B = Event(name="B", sample_space=Omega).from_list(indices=["omega_1"])
>>> A.union(B)
Event 'A union B':
['omega_0', 'omega_1']

EventSpace

Bases: SampleSpaceMethods, SigmaAlgebraMethods

A class representing a measurable space \((\Omega, \mathcal{F})\).

An event space \((\Omega, \mathcal{F})\) consists of a sample space \(\Omega\) and a \(\sigma\)-algebra \(\mathcal{F}\) that defines which subsets of the sample space are measurable events.

EventSpace exposes its underlying components through the sample_space and sigma_algebra attributes. It also inherits from SampleSpaceMethods and SigmaAlgebraMethods, so their methods can be called directly on an EventSpace instance.

Parameters:

Name Type Description Default
sample_space SampleSpace

The underlying sample space containing all possible outcomes.

required
sigma_algebra SigmaAlgebra

Sigma-algebra defining measurable events. If None, a power set sigma-algebra is created, making all subsets measurable.

None

Raises:

Type Description
TypeError

If sample_space is not a SampleSpace instance or sigma_algebra is not a SigmaAlgebra instance.

ValueError

If sigma_algebra's sample space does not match the provided sample_space.

Examples:

>>> from sigalg.core import EventSpace, SampleSpace, SigmaAlgebra
>>> Omega = SampleSpace.generate_sequence(size=3)
>>> # Create with default power set sigma-algebra
>>> event_space = EventSpace(sample_space=Omega)
>>> # Create with custom sigma-algebra
>>> F = SigmaAlgebra(sample_space=Omega).from_dict(
...     sample_id_to_atom_id={"omega_0": 0, "omega_1": 0, "omega_2": 1},
... )
>>> event_space = EventSpace(
...     sample_space=Omega,
...     sigma_algebra=F
... )
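
To make concrete what "measurable events" means for the custom \(\sigma\)-algebra in this example, the sketch below enumerates every union of atoms for the same sample-to-atom mapping. It is plain Python under the assumption of a finite sample space, not SigAlg code; `measurable_events` is a hypothetical helper.

```python
from itertools import combinations


def measurable_events(atom_of):
    """List all unions of atoms for a sample-point -> atom-ID mapping."""
    # Group sample points by the atom that contains them.
    atoms = {}
    for omega, atom in atom_of.items():
        atoms.setdefault(atom, set()).add(omega)
    atom_sets = list(atoms.values())

    # Every measurable event is the union of some subcollection of atoms.
    events = []
    for r in range(len(atom_sets) + 1):
        for combo in combinations(atom_sets, r):
            events.append(frozenset().union(*combo))
    return events


atom_of = {"omega_0": 0, "omega_1": 0, "omega_2": 1}
events = measurable_events(atom_of)
print(len(events))  # 4: empty set, {omega_0, omega_1}, {omega_2}, whole space
```

With one atom per sample point (the power set case) the same function yields all \(2^3 = 8\) subsets.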

sigma_algebra property writable

sigma_algebra

Get the sigma-algebra defining measurable events.

Returns:

Name Type Description
sigma_algebra SigmaAlgebra

The sigma-algebra of this event space.

make_probability_space

make_probability_space(probability_measure=None)

Convert this event space to a probability space.

Creates a ProbabilitySpace by adding a probability measure to this event space. If no probability measure is provided, a uniform probability measure is created.

Parameters:

Name Type Description Default
probability_measure ProbabilityMeasure

Probability measure to use. If None, a uniform probability measure is created.

None

Returns:

Name Type Description
probability_space ProbabilitySpace

A probability space with this event space's sample space and sigma-algebra.

Examples:

>>> from sigalg.core import EventSpace, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=3, prefix="s")
>>> event_space = EventSpace(sample_space=Omega)
>>> prob_space = event_space.make_probability_space()
>>> prob_space.P("s_0")
0.333...
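
The value 0.333... above is what a uniform measure on three sample points gives. As a rough plain-Python illustration (not the SigAlg implementation; `uniform_measure` and `probability` are hypothetical names), a uniform measure assigns mass \(1/n\) to each of \(n\) points, and the probability of an event is the sum of its point masses:

```python
def uniform_measure(sample_points):
    """Assign probability 1/n to each of the n sample points."""
    n = len(sample_points)
    if n == 0:
        raise ValueError("the sample space must be non-empty")
    return {omega: 1.0 / n for omega in sample_points}


def probability(event, measure):
    """Probability of an event as the sum of its point masses."""
    return sum(measure[omega] for omega in event)


P = uniform_measure(["s_0", "s_1", "s_2"])
print(round(probability({"s_0"}, P), 3))  # 0.333
```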

FeatureVector

A class representing a feature vector for a single sample point.

Given a random vector \(X: \Omega \to \mathbb{R}^n\), a FeatureVector represents the output \(X(\omega)\) for a specific sample point \(\omega\) in the sample space \(\Omega\).

data property writable

data

Get the feature vector data.

Returns:

Name Type Description
data Series

The feature values as a pandas Series, indexed by feature names.

feature_at property

feature_at

Get indexer for positional access to features.

Returns:

Name Type Description
indexer _iLocIndexer

Indexer for accessing features by integer position.

name property writable

name

Get the sample point identifier.

Returns:

Name Type Description
name Hashable

The identifier for this sample point.

random_vector property

random_vector

Get the associated random vector.

Returns:

Name Type Description
random_vector RandomVector | None

The random vector from which these features were derived, or None if not set.

from_pandas

from_pandas(data)

Create a FeatureVector from a pd.Series.

Parameters:

Name Type Description Default
data Series

A pd.Series containing feature values, indexed by feature names.

required

Returns:

Name Type Description
self FeatureVector

The created FeatureVector instance.

from_rv

from_rv(sample_index, random_vector)

Associate a RandomVector with this FeatureVector.

Parameters:

Name Type Description Default
random_vector RandomVector

The random vector to associate.

required

Returns:

Name Type Description
self FeatureVector

The updated FeatureVector instance.

sum

sum()

Return the sum of all feature values.

Returns:

Name Type Description
total Any

The sum of all feature values.

FilteredSigmaAlgebra

A class representing a filtered sigma algebra.

Parameters:

Name Type Description Default
filtration Filtration

The filtration associated with this filtered sigma algebra.

required
sigma_algebra SigmaAlgebra | None

The sigma algebra at the finest level of the filtration. If not provided, defaults to the finest sigma algebra of the filtration.

None

Raises:

Type Description
TypeError

If filtration is not an instance of Filtration or if sigma_algebra is not an instance of SigmaAlgebra.

ValueError

If sigma_algebra is provided and does not match the finest sigma algebra of the filtration.

Filtration

A class representing a nested sequence of \(\sigma\)-algebras.

A filtration is an increasing sequence

\[ \mathcal{F}_0 \subset \mathcal{F}_1 \subset \ldots \subset \mathcal{F}_n \]

of \(\sigma\)-algebras defined on the same sample space \(\Omega\).

Parameters:

Name Type Description Default
time Index | None

An index for the time points corresponding to each sigma algebra in the filtration. Does not have to be an instance of Time; may be an instance of the parent class Index.

None
name Hashable | None

An optional name for the filtration.

"Ft"

Raises:

Type Description
TypeError

If name is neither a hashable nor None.

Examples:

>>> from sigalg.core import Filtration, SampleSpace, SigmaAlgebra, Time
>>> # Define sample space and sigma algebras
>>> sample_space = SampleSpace.generate_sequence(size=3)
>>> F = SigmaAlgebra.trivial(sample_space=sample_space, name="F")
>>> G = SigmaAlgebra(sample_space=sample_space, name="G").from_dict(
...     sample_id_to_atom_id={"omega_0": 0, "omega_1": 0, "omega_2": 1},
... )
>>> H = SigmaAlgebra.power_set(sample_space=sample_space, name="H")
>>> # Define continuous time index
>>> time = Time.continuous(start=0.0, stop=1.5, num_points=3)
>>> # Create and print filtration
>>> Ft = Filtration(time=time, name="Ft").from_list([F, G, H])
>>> print(Ft)
Filtration (Ft)
===============

* Time 'T':
[0.0, 0.75, 1.5]

* At time 0.0:
Sigma algebra 'F':
        atom ID
sample
omega_0        0
omega_1        0
omega_2        0

* At time 0.75:
Sigma algebra 'G':
        atom ID
sample
omega_0        0
omega_1        0
omega_2        1

* At time 1.5:
Sigma algebra 'H':
        atom ID
sample
omega_0        0
omega_1        1
omega_2        2
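
In terms of atom IDs, the nesting \(\mathcal{F}_0 \subset \mathcal{F}_1 \subset \ldots \subset \mathcal{F}_n\) means that each atom of a later \(\sigma\)-algebra lies inside a single atom of the earlier one. The sketch below checks this for the three atom-ID columns printed above. It is plain Python, independent of SigAlg; `refines` and `is_filtration` are hypothetical helpers.

```python
def refines(fine, coarse):
    """True if every atom of `fine` lies inside a single atom of `coarse`."""
    image = {}
    for omega, fine_atom in fine.items():
        coarse_atom = coarse[omega]
        # Remember which coarse atom this fine atom first landed in.
        if image.setdefault(fine_atom, coarse_atom) != coarse_atom:
            return False
    return True


def is_filtration(partitions):
    """True if each partition refines the one before it."""
    return all(
        refines(later, earlier)
        for earlier, later in zip(partitions, partitions[1:])
    )


F = {"omega_0": 0, "omega_1": 0, "omega_2": 0}
G = {"omega_0": 0, "omega_1": 0, "omega_2": 1}
H = {"omega_0": 0, "omega_1": 1, "omega_2": 2}
print(is_filtration([F, G, H]))  # True
print(is_filtration([H, G, F]))  # False
```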

at property

at

Get an indexer for accessing sigma algebras at specific times.

Returns:

Name Type Description
indexer _FiltrationIndexer

An indexer for accessing sigma algebras at specific times.

Examples:

>>> from sigalg.core import Filtration, SampleSpace, SigmaAlgebra, Time
>>> # Define sample space and sigma algebras
>>> sample_space = SampleSpace.generate_sequence(size=3)
>>> F = SigmaAlgebra.trivial(sample_space=sample_space, name="F")
>>> G = SigmaAlgebra(sample_space=sample_space, name="G").from_dict(
...     sample_id_to_atom_id={"omega_0": 0, "omega_1": 0, "omega_2": 1},
... )
>>> H = SigmaAlgebra.power_set(sample_space=sample_space, name="H")
>>> # Define continuous time index
>>> time = Time.continuous(start=0.0, stop=1.5, num_points=3)
>>> # Create and print filtration
>>> Ft = Filtration(time=time, name="Ft").from_list([F, G, H])
>>> print(Ft)
Filtration (Ft)
===============

* Time 'T':
[0.0, 0.75, 1.5]

* At time 0.0:
Sigma algebra 'F':
        atom ID
sample
omega_0        0
omega_1        0
omega_2        0

* At time 0.75:
Sigma algebra 'G':
        atom ID
sample
omega_0        0
omega_1        0
omega_2        1

* At time 1.5:
Sigma algebra 'H':
        atom ID
sample
omega_0        0
omega_1        1
omega_2        2
>>> # Access sigma algebra at time 0.0
>>> print(Ft.at[0.0])
Sigma algebra 'F':
        atom ID
sample
omega_0        0
omega_1        0
omega_2        0
>>> # Access sigma algebra at time 0.5 (returns the same as at time 0.0)
>>> print(Ft.at[0.5])
Sigma algebra 'F':
        atom ID
sample
omega_0        0
omega_1        0
omega_2        0
>>> # Access sigma algebra at time 0.75
>>> print(Ft.at[0.75])
Sigma algebra 'G':
        atom ID
sample
omega_0        0
omega_1        0
omega_2        1
>>> # Access sigma algebra at time 1.2 (returns the same as at time 0.75)
>>> print(Ft.at[1.2])
Sigma algebra 'G':
        atom ID
sample
omega_0        0
omega_1        0
omega_2        1
>>> # Access sigma algebra at time 1.5
>>> print(Ft.at[1.5])
Sigma algebra 'H':
        atom ID
sample
omega_0        0
omega_1        1
omega_2        2
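
The lookups at times 0.5 and 1.2 suggest that a query time resolves to the latest time point not exceeding it. One way to sketch that rule in plain Python uses `bisect_right` from the standard library. This is an illustration of the behaviour shown above, not the library's indexer; `sigma_algebra_at` is a hypothetical name, and the handling of times before the first point (an error) or after the last point (the last entry) is an assumption that the documentation does not confirm.

```python
from bisect import bisect_right


def sigma_algebra_at(t, times, sigma_algebras):
    """Return the entry attached to the latest time point <= t.

    `times` must be sorted in increasing order.
    """
    pos = bisect_right(times, t) - 1
    if pos < 0:
        # Assumption: queries before the first time point are rejected.
        raise KeyError(f"time {t} precedes the first time point {times[0]}")
    return sigma_algebras[pos]


times = [0.0, 0.75, 1.5]
names = ["F", "G", "H"]  # stand-ins for the sigma algebras above
for t in (0.0, 0.5, 0.75, 1.2, 1.5):
    print(t, sigma_algebra_at(t, times, names))
```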

coarsest property

coarsest

Get the coarsest sigma algebra in the filtration.

Returns:

Name Type Description
coarsest SigmaAlgebra

The coarsest sigma algebra in the filtration.

data property

data

Get the underlying data of the filtration.

Returns:

Name Type Description
data DataFrame

The underlying data of the filtration.

finest property

finest

Get the finest sigma algebra in the filtration.

Returns:

Name Type Description
finest SigmaAlgebra

The finest sigma algebra in the filtration.

name property writable

name

Get the name of the filtration.

Returns:

Name Type Description
name Hashable | None

The name of the filtration.

sample_space property

sample_space

Get the sample space of the filtration.

Returns:

Name Type Description
sample_space SampleSpace

The sample space common to all sigma algebras in the filtration.

sigma_algebras property

sigma_algebras

Get the list of sigma algebras in the filtration.

Returns:

Name Type Description
sigma_algebras list[SigmaAlgebra]

The list of sigma algebras in the filtration.

time property

time

Get the time index of the filtration.

Returns:

Name Type Description
time Index

The time index of the filtration.

time_to_pos property

time_to_pos

Get the mapping from time points to positions in the sigma algebras list.

Returns:

Name Type Description
time_to_pos dict

A mapping from time points to positions in the sigma algebras list.

from_list

from_list(sigma_algebras)

Initialize the filtration from a list of sigma algebras.

If the time parameter was not provided at initialization, it will be set to a discrete time index of the same length as the provided list of sigma algebras.

Parameters:

Name Type Description Default
sigma_algebras list[SigmaAlgebra]

A list of sigma algebras that form a filtration. The order of the list determines the order of the filtration (i.e., the first element is the coarsest sigma algebra and the last element is the finest sigma algebra).

required

Returns:

Name Type Description
filtration Filtration

The filtration initialized from the provided list of sigma algebras.

from_pandas

from_pandas(data)

Initialize the filtration from a pd.DataFrame.

The columns of the DataFrame represent the atom IDs of the sigma algebras in the filtration.

If the time parameter was not provided at initialization, it will be set to an index matching the columns of the provided DataFrame.

Parameters:

Name Type Description Default
data DataFrame

A DataFrame where each column represents the atom IDs of a sigma algebra in the filtration. The order of the columns determines the order of the filtration (i.e., the first column is the coarsest sigma algebra and the last column is the finest sigma algebra).

required

Raises:

Type Description
TypeError

If data is not a pandas DataFrame or if the time index of the filtration (if given) does not match the columns of the provided DataFrame.

ValueError

If the provided data does not represent a valid filtration (i.e., if the atom IDs in the columns do not form a nested sequence of sigma algebras).

Returns:

Name Type Description
filtration Filtration

The filtration initialized from the provided DataFrame.

Index

A base class representing an ordered collection of hashable items.

The Index class provides a foundation for representing ordered collections with validation, indexing, iteration, equality operations, and other attributes. It wraps a pd.Index internally.

Parameters:

Name Type Description Default
name Hashable | None

Name identifier for the index.

None
data_name Hashable | None

Name for the internal pd.Index.

None
**kwargs

Additional keyword arguments passed to subclasses.

{}

Raises:

Type Description
TypeError

If name or data_name is not None and is not hashable.

Examples:

>>> from sigalg.core import Index
>>> idx = Index(name="an_index").from_list(indices=["a", "b", "c"])
>>> idx
Index 'an_index':
['a', 'b', 'c']

data property writable

data

Get the underlying pd.Index.

Returns:

Name Type Description
data Index

The underlying pd.Index object.

indices property

indices

Get the list of hashable items in the index.

Returns:

Name Type Description
indices list[Hashable]

The list of hashable items in this index.

name property writable

name

Get the name identifier for this index.

Returns:

Name Type Description
name Hashable | None

The name of this index.

from_list

from_list(indices)

Create an Index from a list of hashable items.

Parameters:

Name Type Description Default
indices list[Hashable]

List of hashable items to use for the index.

required

Raises:

Type Description
TypeError

If indices is not a list of hashable items.

ValueError

If indices contains duplicate items.

Returns:

Name Type Description
self Index

The current Index instance with updated indices.

from_pandas

from_pandas(data)

Create an Index from a pd.Index.

Parameters:

Name Type Description Default
data Index

pd.Index object to use for the index.

required

Raises:

Type Description
TypeError

If data is not a pd.Index.

Returns:

Name Type Description
index Index

The current Index instance with updated data.

Examples:

>>> from sigalg.core import Index
>>> import pandas as pd
>>> pd_index = pd.Index(['a', 'b', 'c'])
>>> idx = Index(name="an_index").from_pandas(pd_index)
>>> idx
Index 'an_index':
['a', 'b', 'c']

from_sequence

from_sequence(size, initial_index=0, prefix=None)

Create an Index with sequentially numbered items.

Parameters:

Name Type Description Default
size int

Number of items to generate. Must be positive.

required
initial_index int

Starting index for sequential numbering.

0
prefix Hashable | None

Prefix for index names. If None or a non-string hashable is given, numerical indices are used.

None

Returns:

Name Type Description
index Index

A new Index with automatically generated indices.

Raises:

Type Description
ValueError

If size is not a positive integer.

TypeError

If initial_index is not an integer, prefix is not hashable, name is not hashable, or data_name is not hashable (if given).

Examples:

>>> from sigalg.core import Index
>>> index1 = Index().from_sequence(size=3, prefix="F")
>>> index1
Index:
['F_0', 'F_1', 'F_2']
>>> index2 = Index(name="an_index").from_sequence(size=2, initial_index=5)
>>> index2
Index 'an_index':
[5, 6]
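
The labelling rule visible in these outputs can be sketched in a few lines of plain Python. This mirrors the documented behaviour (string prefixes give labels such as 'F_0', anything else gives integers) but is not the library's code; `sequence_labels` is a hypothetical name.

```python
def sequence_labels(size, initial_index=0, prefix=None):
    """Generate `size` sequential labels starting at `initial_index`."""
    if not isinstance(size, int) or size <= 0:
        raise ValueError("size must be a positive integer")
    numbers = range(initial_index, initial_index + size)
    if isinstance(prefix, str):
        # String prefix: labels of the form '<prefix>_<number>'.
        return [f"{prefix}_{i}" for i in numbers]
    # None or a non-string prefix: plain integer labels.
    return list(numbers)


print(sequence_labels(3, prefix="F"))       # ['F_0', 'F_1', 'F_2']
print(sequence_labels(2, initial_index=5))  # [5, 6]
```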

generate_sequence classmethod

generate_sequence(
    size,
    initial_index=0,
    prefix=None,
    name=None,
    data_name=None,
)

Generate a sequential Index.

Creates an Index with sequentially numbered items, optionally prefixed by a given string.

Parameters:

Name Type Description Default
size int

Number of items to generate. Must be positive.

required
initial_index int

Starting index for sequential numbering.

0
prefix Hashable | None

Prefix for index names. If None or a non-string hashable is given, numerical indices are used.

None
name Hashable | None

Name identifier for the index.

None
data_name Hashable | None

Name for the index of values.

None

Returns:

Name Type Description
index Index

A new Index with automatically generated indices.

Raises:

Type Description
ValueError

If size is not a positive integer.

TypeError

If initial_index is not an integer, prefix is not hashable, name is not hashable, or data_name is not hashable (if given).

Examples:

>>> from sigalg.core import Index
>>> index1 = Index.generate_sequence(size=3, prefix="F")
>>> index1
Index:
['F_0', 'F_1', 'F_2']
>>> index2 = Index.generate_sequence(size=2, initial_index=5, name="an_index")
>>> index2
Index 'an_index':
[5, 6]

with_name

with_name(name)

Return a new Index with the given name.

Parameters:

Name Type Description Default
name Hashable | None

New name for the index.

required

Returns:

Name Type Description
index Index

A new Index with the specified name.

Operators

Class containing operators on random vectors, such as integration, expectation, variance, standard deviation, covariance, correlation, and pushforward of probability measures.

correlation classmethod

correlation(rv1, rv2, probability_measure=None)

Compute the correlation matrix of two random vectors.

Parameters:

Name Type Description Default
rv1 RandomVector

The first random vector.

required
rv2 RandomVector

The second random vector.

required
probability_measure ProbabilityMeasure | None

The probability measure to use. If None, uses rv1.probability_measure.

None

Returns:

Name Type Description
corr DataFrame | Real

If both random vectors have dimension > 1, returns a pd.DataFrame representing the correlation matrix. If both have dimension 1, returns a Real representing the correlation.

Examples:

>>> from sigalg.core import (
...     Operators,
...     ProbabilityMeasure,
...     RandomVariable,
...     RandomVector,
...     SampleSpace,
... )
>>> correlation = Operators.correlation
>>> Omega = SampleSpace().from_sequence(size=3)
>>> P = ProbabilityMeasure(sample_space=Omega).from_dict({0: 0.2, 1: 0.3, 2: 0.5})
>>> X = RandomVector(domain=Omega, name="X").from_dict({0: (1, 2), 1: (2, 1), 2: (3, 4)})
>>> Y = RandomVector(domain=Omega, name="Y").from_dict({0: (3, -2), 1: (1, 5), 2: (6, 8)})
>>> # Correlation of two 2-dimensional random vectors is a 2x2 matrix
>>> correlation(X, Y, probability_measure=P)
feature       Y_0       Y_1
feature
X_0      0.712173  0.972077
X_1      0.998304  0.576119
>>> # Correlation of two random variables is a scalar
>>> Z = RandomVariable(domain=Omega, name="Z").from_dict({0: -1, 1: 4, 2: 6})
>>> W = RandomVariable(domain=Omega, name="W").from_dict({0: 2, 1: -3, 2: 5})
>>> float(correlation(Z, W, probability_measure=P))
0.3273268353539886
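
As a cross-check of the scalar result, the correlation of \(Z\) and \(W\) can be recomputed by hand as the covariance divided by the product of the standard deviations. The sketch below uses only the standard library and the same values as the example; `mean` and `correlation` are hypothetical helpers, not SigAlg functions.

```python
from math import sqrt


def mean(values, P):
    """Probability-weighted mean of a random variable given as a dict."""
    return sum(P[omega] * values[omega] for omega in P)


def correlation(Z, W, P):
    """Covariance of Z and W divided by the product of their std devs."""
    mz, mw = mean(Z, P), mean(W, P)
    cov = sum(P[o] * (Z[o] - mz) * (W[o] - mw) for o in P)
    var_z = sum(P[o] * (Z[o] - mz) ** 2 for o in P)
    var_w = sum(P[o] * (W[o] - mw) ** 2 for o in P)
    return cov / sqrt(var_z * var_w)


P = {0: 0.2, 1: 0.3, 2: 0.5}
Z = {0: -1, 1: 4, 2: 6}
W = {0: 2, 1: -3, 2: 5}
print(round(correlation(Z, W, P), 6))  # 0.327327
```

Here the covariance is 3 and the variances are 7 and 12, so the result equals \(3 / \sqrt{84}\).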

covariance classmethod

covariance(rv1, rv2=None, probability_measure=None)

Compute the covariance matrix of one or two random vectors.

If rv2 is provided, computes the covariance matrix Cov(rv1, rv2). If rv2 is None, computes the covariance matrix Cov(rv1, rv1). If probability_measure is None, uses the probability measure carried by rv1. If both random vectors have dimension 1, returns a scalar covariance.

Parameters:

Name Type Description Default
rv1 RandomVector

The first random vector.

required
rv2 RandomVector | None

The second random vector. If None, computes Cov(rv1, rv1).

None
probability_measure ProbabilityMeasure | None

The probability measure to use. If None, uses rv1.probability_measure.

None

Raises:

Type Description
TypeError

If rv1 is not a RandomVector, or if rv2 is not a RandomVector or None, or if probability_measure is not a ProbabilityMeasure or None.

ValueError

If rv1 and rv2 have different domains or dimensions (when rv2 is not None), or if probability_measure is not defined on the same sample space as rv1 (when probability_measure is not None).

Returns:

Name Type Description
cov DataFrame | Real

If both random vectors have dimension > 1, returns a pd.DataFrame representing the covariance matrix. If both have dimension 1, returns a Real representing the covariance.

Examples:

>>> from sigalg.core import (
...     Operators,
...     ProbabilityMeasure,
...     RandomVariable,
...     RandomVector,
...     SampleSpace,
... )
>>> covariance = Operators.covariance
>>> Omega = SampleSpace().from_sequence(size=3)
>>> P = ProbabilityMeasure(sample_space=Omega).from_dict({0: 0.2, 1: 0.3, 2: 0.5})
>>> # Covariance of two 2-dimensional random vectors is a 2x2 matrix
>>> X = RandomVector(domain=Omega, name="X").from_dict({0: (1, 2), 1: (2, 1), 2: (3, 4)})
>>> Y = RandomVector(domain=Omega, name="Y").from_dict({0: (3, -2), 1: (1, 5), 2: (6, 8)})
>>> covariance(X, Y, probability_measure=P)
feature   Y_0   Y_1
feature
X_0      1.23  2.87
X_1      2.97  2.93
>>> # Covariance of two random variables is a scalar
>>> Z = RandomVariable(domain=Omega, name="Z").from_dict({0: 1, 1: -2, 2: 3})
>>> W = RandomVariable(domain=Omega, name="W").from_dict({0: 5, 1: 6, 2: 1})
>>> covariance(Z, W, probability_measure=P)
-4.73
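
The matrix above has entry \((i, j)\) equal to \(\operatorname{Cov}(X_i, Y_j)\). A minimal plain-Python sketch that reproduces it for the same data is given below; `cross_covariance` is a hypothetical helper, and representing random vectors as dicts of tuples is an assumption made for the illustration.

```python
def cross_covariance(X, Y, P):
    """Matrix of Cov(X_i, Y_j) for random vectors stored as dicts of tuples."""
    dim_x = len(next(iter(X.values())))
    dim_y = len(next(iter(Y.values())))
    # Component-wise means under the probability measure P.
    mean_x = [sum(P[o] * X[o][i] for o in P) for i in range(dim_x)]
    mean_y = [sum(P[o] * Y[o][j] for o in P) for j in range(dim_y)]
    return [
        [
            sum(P[o] * (X[o][i] - mean_x[i]) * (Y[o][j] - mean_y[j]) for o in P)
            for j in range(dim_y)
        ]
        for i in range(dim_x)
    ]


P = {0: 0.2, 1: 0.3, 2: 0.5}
X = {0: (1, 2), 1: (2, 1), 2: (3, 4)}
Y = {0: (3, -2), 1: (1, 5), 2: (6, 8)}
cov = cross_covariance(X, Y, P)
print([[round(c, 2) for c in row] for row in cov])
# [[1.23, 2.87], [2.97, 2.93]]
```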

expectation classmethod

expectation(
    rv, sigma_algebra=None, probability_measure=None
)

Compute the expectation of a RandomVector with respect to a ProbabilityMeasure, optionally conditioned on a SigmaAlgebra.

The conditional expectation of a random variable is another random variable that is constant on each atom of the sigma algebra, its value on an atom being the mean value of the original random variable on that atom. This mean value is computed with respect to the conditional probabilities of the atom. If an atom has probability 0, the expected value is defined to be 0 on this atom.

The unconditional expectation is the same as the conditional expectation with respect to the trivial sigma algebra (with a single atom equal to the entire sample space), so this description applies to the unconditional expectation too. In particular, the unconditional expectation of a random variable is a constant random variable equal to the mean value of the original random variable with respect to the probability measure.

Parameters:

Name Type Description Default
rv RandomVector

The random vector for which to compute the expectation.

required
sigma_algebra SigmaAlgebra | None

The sigma algebra to condition on. If None, computes the unconditional expectation.

None
probability_measure ProbabilityMeasure | None

The probability measure used to compute the expectation. If None, the probability measure carried by the random vector is used (accessed through its probability_measure attribute).

None

Raises:

Type Description
TypeError

If rv is not a RandomVector, or if sigma_algebra is not a SigmaAlgebra or None, or if probability_measure is not a ProbabilityMeasure or None.

Returns:

Name Type Description
exp RandomVector

The expectation of the random vector, optionally conditioned on the sigma algebra.

Examples:

>>> from sigalg.core import Operators, RandomVector, SampleSpace, SigmaAlgebra
>>> expectation = Operators.expectation
>>> domain = SampleSpace().from_sequence(size=3, prefix="omega")
>>> outputs = {"omega_0": (1, 2), "omega_1": (3, 4), "omega_2": (5, 6)}
>>> probabilities = {"omega_0": 0.2, "omega_1": 0.5, "omega_2": 0.3}
>>> X = RandomVector(domain).from_dict(outputs).with_probability_measure(probabilities)
>>> # Compute unconditional expectation
>>> expectation(X)
Random vector 'E(X)':
expectation   E(X)_0  E(X)_1
sample
omega_0          3.2     4.2
omega_1          3.2     4.2
omega_2          3.2     4.2
>>> # Compute conditional expectation given a sigma algebra
>>> F = SigmaAlgebra(domain).from_dict({"omega_0": 0, "omega_1": 0, "omega_2": 1})
>>> expectation(X, F)
Random vector 'E(X|F)':
expectation   E(X|F)_0  E(X|F)_1
sample
omega_0       2.428571  3.428571
omega_1       2.428571  3.428571
omega_2       5.000000  6.000000
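
The per-atom averaging described for the conditional expectation can be sketched in plain Python for a single component. The code below follows the documented convention of returning 0 on atoms of probability 0, but it is an illustration under that description, not the library's implementation; `conditional_expectation` is a hypothetical name.

```python
def conditional_expectation(X, P, atom_of):
    """E(X | sigma-algebra) for a scalar X, as a dict over sample points."""
    mass, weighted = {}, {}
    for omega, atom in atom_of.items():
        # Accumulate P(atom) and the P-weighted sum of X over the atom.
        mass[atom] = mass.get(atom, 0.0) + P[omega]
        weighted[atom] = weighted.get(atom, 0.0) + P[omega] * X[omega]
    # Constant on each atom; 0 on atoms of probability 0 (documented convention).
    return {
        omega: (weighted[atom] / mass[atom] if mass[atom] > 0 else 0.0)
        for omega, atom in atom_of.items()
    }


P = {"omega_0": 0.2, "omega_1": 0.5, "omega_2": 0.3}
X0 = {"omega_0": 1, "omega_1": 3, "omega_2": 5}  # first component of X above
F = {"omega_0": 0, "omega_1": 0, "omega_2": 1}
trivial = {omega: 0 for omega in P}

print(conditional_expectation(X0, P, F))
print(conditional_expectation(X0, P, trivial))
```

On the atom \(\{\omega_0, \omega_1\}\) the value is \((0.2 \cdot 1 + 0.5 \cdot 3) / 0.7 \approx 2.428571\), matching the first column of the output above; with the trivial \(\sigma\)-algebra every sample point gets the unconditional mean 3.2.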

integrate classmethod

integrate(rv, probability_measure=None)

Compute the integral of a RandomVector with respect to a ProbabilityMeasure.

Parameters:

Name Type Description Default
rv RandomVector

The random vector to integrate.

required
probability_measure ProbabilityMeasure | None

The probability measure with respect to which to integrate. If None, the probability measure carried by the random vector is used (accessed through its probability_measure attribute).

None

Returns:

Name Type Description
integral Series | Real

If rv has dimension > 1, returns a pd.Series representing the integral of each component of the random vector. If rv has dimension 1, returns a Real representing the integral.

Examples:

>>> from sigalg.core import (
...     Operators,
...     ProbabilityMeasure,
...     RandomVariable,
...     RandomVector,
...     SampleSpace,
... )
>>> integrate = Operators.integrate
>>> Omega = SampleSpace().from_sequence(size=3)
>>> P = ProbabilityMeasure(sample_space=Omega).from_dict({0: 0.2, 1: 0.3, 2: 0.5})
>>> X = RandomVector(domain=Omega, name="X").from_dict({0: (1, 2), 1: (1, 2), 2: (3, 4)})
>>> # Integral of a 2-dimensional random vector
>>> integrate(rv=X, probability_measure=P)
feature
X_0    2.0
X_1    3.0
Name: integral(X), dtype: float64
>>> # Integral of a random variable
>>> Y = RandomVariable(domain=Omega, name="Y").from_dict({0: 1, 1: 1, 2: 0})
>>> float(integrate(rv=Y, probability_measure=P))
0.5
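
On a finite sample space, as in these examples, the integral reduces to a probability-weighted sum. As a worked check of the two components of \(X\) above (writing \(X_0\) and \(X_1\) for the components, following the feature labels in the output):

```latex
\[
\int_\Omega X_0 \, dP
  = \sum_{\omega \in \Omega} X_0(\omega)\, P(\{\omega\})
  = 1 \cdot 0.2 + 1 \cdot 0.3 + 3 \cdot 0.5
  = 2.0,
\]
\[
\int_\Omega X_1 \, dP
  = 2 \cdot 0.2 + 2 \cdot 0.3 + 4 \cdot 0.5
  = 3.0.
\]
```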

pushforward classmethod

pushforward(rv, probability_measure=None)

Push forward a probability measure on the domain of a random vector to a probability measure on its range.

Given a random vector X: Omega -> S and a probability measure P on Omega, constructs the probability measure P_X on the range X.range.

Parameters:

Name Type Description Default
rv RandomVector

Random vector.

required
probability_measure ProbabilityMeasure | None

Probability measure P defining the probabilities on the domain sample space. If None, the probability measure carried by the random vector is used (accessed through its probability_measure attribute).

None

Raises:

Type Description
TypeError

If rv is not a RandomVector, or if probability_measure is not a ProbabilityMeasure (if given).

ValueError

If rv is not defined on the sample space of probability_measure (if given).

Returns:

Name Type Description
pushforward_measure ProbabilityMeasure

The resulting probability measure P_X.

Examples:

>>> import pandas as pd
>>> from sigalg.core import Operators, ProbabilityMeasure, RandomVector, SampleSpace
>>> pushforward = Operators.pushforward
>>> domain = SampleSpace.generate_sequence(size=3)
>>> X = RandomVector(domain=domain).from_dict(
...     {"omega_0": (1, 2), "omega_1": (3, 4), "omega_2": (3, 4)},
... )
>>> print(X)
Random vector 'X':
feature  X_0  X_1
sample
omega_0    1   2
omega_1    3   4
omega_2    3   4
>>> prob_measure = ProbabilityMeasure(sample_space=domain).from_dict(
...     {"omega_0": 0.2, "omega_1": 0.5, "omega_2": 0.3},
... )
>>> P_X = pushforward(probability_measure=prob_measure, rv=X)
>>> X_range = X.range
>>> print(pd.concat([X_range.data, P_X.data], axis=1))
        X_0  X_1  probability
output
x_0       1   2          0.2
x_1       3   4          0.8
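
The pushforward in this example amounts to summing the probabilities of all sample points that map to the same output. A minimal plain-Python sketch of that grouping, with the hypothetical helper name `pushforward_dict` and dicts standing in for the SigAlg objects:

```python
def pushforward_dict(X, P):
    """Map each output value of X to the total probability of its preimage."""
    P_X = {}
    for omega, value in X.items():
        # Sample points with equal outputs contribute to the same entry.
        P_X[value] = P_X.get(value, 0.0) + P[omega]
    return P_X


X = {"omega_0": (1, 2), "omega_1": (3, 4), "omega_2": (3, 4)}
P = {"omega_0": 0.2, "omega_1": 0.5, "omega_2": 0.3}
P_X = pushforward_dict(X, P)
print(P_X)
```

The output (3, 4) receives mass 0.5 + 0.3 = 0.8, in agreement with the row x_1 above.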

std classmethod

std(rv, sigma_algebra=None, probability_measure=None)

Compute the standard deviation of a random vector.

The conditional standard deviation of a random variable is another random variable that is constant on each atom of the sigma algebra, its value on an atom being the standard deviation of the original random variable on that atom. This standard deviation is computed with respect to the conditional probabilities of the atom.

The unconditional standard deviation is the same as the conditional standard deviation with respect to the trivial sigma algebra (with a single atom equal to the entire sample space), so this description applies to the unconditional standard deviation too. In particular, the unconditional standard deviation of a random variable is a constant random variable equal to the standard deviation of the original random variable with respect to the probability measure.
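
Written out for a finite sample space, and assuming every atom \(A\) of the sigma algebra \(\mathcal{F}\) has \(P(A) > 0\), the description above corresponds to the following formula, applied to each component of a random vector separately. The notation \(E(X \mid A)\) for the conditional mean on an atom is introduced here for illustration only.

\[
\operatorname{std}(X \mid \mathcal{F})(\omega) = \sqrt{\sum_{\omega' \in A} \frac{P(\omega')}{P(A)} \bigl(X(\omega') - E(X \mid A)\bigr)^2},
\qquad
E(X \mid A) = \sum_{\omega' \in A} \frac{P(\omega')}{P(A)}\, X(\omega'),
\qquad \omega \in A.
\]

Taking \(A = \Omega\), so that \(P(A) = 1\), recovers the unconditional standard deviation.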

Parameters:

Name Type Description Default
rv RandomVector

The random vector for which to compute the standard deviation.

required
sigma_algebra SigmaAlgebra | None

The sigma algebra to condition on. If None, computes the unconditional standard deviation.

None
probability_measure ProbabilityMeasure | None

The probability measure to use. If None, uses rv.probability_measure.

None

Raises:

Type Description
TypeError

If rv is not a RandomVector, or if sigma_algebra is not a SigmaAlgebra or None, or if probability_measure is not a ProbabilityMeasure or None.

Returns:

Name Type Description
std RandomVector

The standard deviation of the random vector, optionally conditioned on the sigma algebra.

Examples:

>>> from sigalg.core import (
...     Operators,
...     ProbabilityMeasure,
...     RandomVariable,
...     RandomVector,
...     SampleSpace,
...     SigmaAlgebra,
... )
>>> std = Operators.std
>>> Omega = SampleSpace().from_sequence(size=3)
>>> P = ProbabilityMeasure(sample_space=Omega).from_dict({0: 0.2, 1: 0.3, 2: 0.5})
>>> X = RandomVector(domain=Omega, name="X").from_dict({0: (1, 2), 1: (2, 1), 2: (3, 4)})
>>> # Unconditional standard deviation of a 2-dimensional random vector
>>> std(X, probability_measure=P)
Random vector 'std(X)':
std  std(X)_0  std(X)_1
sample
0        0.781025  1.345362
1        0.781025  1.345362
2        0.781025  1.345362
>>> # Conditional standard deviation of a 2-dimensional random vector
>>> F = SigmaAlgebra(sample_space=Omega, name="F").from_dict({0: 0, 1: 0, 2: 1})
>>> std(X, sigma_algebra=F, probability_measure=P)
Random vector 'std(X|F)':
std  std(X|F)_0  std(X|F)_1
sample
0          0.489898    0.489898
1          0.489898    0.489898
2          0.000000    0.000000
>>> # Unconditional standard deviation of a random variable
>>> Z = RandomVariable(domain=Omega, name="Z").from_dict({0: 1, 1: -2, 2: 3})
>>> std(Z, probability_measure=P)
Random variable 'std(Z)':
    std(Z)
sample
0       2.165641
1       2.165641
2       2.165641
>>> # Conditional standard deviation of a random variable
>>> std(Z, sigma_algebra=F, probability_measure=P)
Random variable 'std(Z|F)':
        std(Z|F)
sample
0       1.469694
1       1.469694
2       0.000000

variance classmethod

variance(rv, sigma_algebra=None, probability_measure=None)

Compute the variance of a random vector, optionally conditioned on a sigma algebra.

The conditional variance of a random variable is another random variable that is constant on each atom of the sigma algebra, its value on an atom being the variance of the original random variable on that atom. This variance is computed with respect to the conditional probabilities of the atom.

The unconditional variance is the same as the conditional variance with respect to the trivial sigma algebra (with a single atom equal to the entire sample space), so this description applies to the unconditional variance too. In particular, the unconditional variance of a random variable is a constant random variable equal to the variance of the original random variable with respect to the probability measure.

Parameters:

Name Type Description Default
rv RandomVector

The random vector for which to compute the variance.

required
sigma_algebra SigmaAlgebra | None

The sigma algebra with respect to which to compute the variance. If None, computes the variance without conditioning.

None
probability_measure ProbabilityMeasure | None

The probability measure to use. If None, uses rv.probability_measure.

None

Raises:

Type Description
TypeError

If rv is not a RandomVector, or if sigma_algebra is not a SigmaAlgebra or None, or if probability_measure is not a ProbabilityMeasure or None.

Returns:

Name Type Description
var RandomVector

The variance of the random vector, optionally conditioned on the sigma algebra.

Examples:

>>> from sigalg.core import (
...     Operators,
...     ProbabilityMeasure,
...     RandomVariable,
...     RandomVector,
...     SampleSpace,
...     SigmaAlgebra,
... )
>>> variance = Operators.variance
>>> Omega = SampleSpace().from_sequence(size=3)
>>> P = ProbabilityMeasure(sample_space=Omega).from_dict({0: 0.2, 1: 0.3, 2: 0.5})
>>> X = RandomVector(domain=Omega, name="X").from_dict({0: (1, 2), 1: (2, 1), 2: (3, 4)})
>>> # Unconditional variance of a 2-dimensional random vector
>>> variance(X, probability_measure=P)
Random vector 'V(X)':
variance  V(X)_0  V(X)_1
sample
0           0.61    1.81
1           0.61    1.81
2           0.61    1.81
>>> # Conditional variance of a 2-dimensional random vector
>>> F = SigmaAlgebra(sample_space=Omega, name="F").from_dict({0: 0, 1: 0, 2: 1})
>>> variance(X, sigma_algebra=F, probability_measure=P)
Random vector 'V(X|F)':
variance  V(X|F)_0  V(X|F)_1
sample
0             0.24      0.24
1             0.24      0.24
2             0.00      0.00
>>> # Unconditional variance of a random variable
>>> Z = RandomVariable(domain=Omega, name="Z").from_dict({0: 1, 1: -2, 2: 3})
>>> variance(Z, probability_measure=P)
Random variable 'V(Z)':
        V(Z)
sample
0       4.69
1       4.69
2       4.69
>>> # Conditional variance of a random variable
>>> variance(Z, sigma_algebra=F, probability_measure=P)
Random variable 'V(Z|F)':
        V(Z|F)
sample
0         2.16
1         2.16
2         0.00
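
The atom-by-atom computation described above can be checked by hand. The following pure-Python sketch is an illustration under the assumption that every atom has positive probability; the function name `conditional_variance` and the dictionary encoding of the sigma algebra (sample point to atom ID) are hypothetical stand-ins for the library's objects.

```python
def conditional_variance(values, prob, atom_of):
    """Return {omega: Var(X | atom containing omega)} on a finite sample space."""
    # Group the sample points by the atom they belong to.
    atoms = {}
    for omega, atom_id in atom_of.items():
        atoms.setdefault(atom_id, []).append(omega)

    result = {}
    for members in atoms.values():
        p_atom = sum(prob[w] for w in members)  # assumed to be positive
        # Mean and variance under the conditional probabilities P(w) / P(atom).
        mean = sum(prob[w] / p_atom * values[w] for w in members)
        var = sum(prob[w] / p_atom * (values[w] - mean) ** 2 for w in members)
        for w in members:
            result[w] = var  # constant on the atom
    return result


prob = {0: 0.2, 1: 0.3, 2: 0.5}
Z = {0: 1, 1: -2, 2: 3}
F = {0: 0, 1: 0, 2: 1}        # atoms {0, 1} and {2}
trivial = {0: 0, 1: 0, 2: 0}  # single atom: the whole sample space

cond_var = conditional_variance(Z, prob, F)
uncond_var = conditional_variance(Z, prob, trivial)
```

Up to floating-point rounding, `cond_var` and `uncond_var` should agree with the values 2.16, 0.00, and 4.69 shown for `Z` in the example above.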

ProbabilityMeasure

Bases: OperatorsMethods

A class representing a probability measure on a sample space.

A probability measure is a mapping from sample space indices to probabilities, which must be non-negative real numbers that sum to 1. The class provides methods to compute probabilities of events and conditional probabilities, and to check for independence between events or sigma algebras.

Parameters:

Name Type Description Default
sample_space SampleSpace

The sample space on which the probability measure is defined.

None
name Hashable

A name for the probability measure.

"P"

Raises:

Type Description
TypeError

If sample_space is not a SampleSpace instance (if given), or if name is not hashable (if given).

Examples:

>>> from sigalg.core import ProbabilityMeasure, SampleSpace
>>> sample_space = SampleSpace.generate_sequence(size=3)
>>> probabilities = {"omega_0": 0.2, "omega_1": 0.5, "omega_2": 0.3}
>>> P = ProbabilityMeasure(sample_space=sample_space).from_dict(probabilities)
>>> float(P("omega_1"))
0.5
>>> A = sample_space.get_event(["omega_0", "omega_1"], name="A")
>>> float(P(A))
0.7

data property

data

Get the probability values as a pd.Series.

Returns:

Name Type Description
data Series

A pd.Series with sample space indices as the index and their associated probabilities as values.

name property writable

name

Get the name of the probability measure.

Returns:

Name Type Description
name Hashable

The name of the probability measure.

probabilities property

probabilities

Get the mapping from sample IDs to their probabilities.

Returns:

Name Type Description
probabilities Mapping[Hashable, Real]

A mapping from sample IDs to their probabilities.

P

P(key)

Get the probability of a sample point or event.

This method is an alias for the __call__ method.

are_independent

are_independent(
    event1=None,
    event2=None,
    algebra1=None,
    algebra2=None,
    tolerance=1e-10,
)

Check if two events or sigma algebras are independent.

Parameters:

Name Type Description Default
event1 Event | None

The first event.

None
event2 Event | None

The second event.

None
algebra1 SigmaAlgebra | None

The first sigma algebra.

None
algebra2 SigmaAlgebra | None

The second sigma algebra.

None
tolerance Real

The numerical tolerance for checking independence.

1e-10

Raises:

Type Description
ValueError

If neither events nor sigma algebras are provided, or if both are provided, or if the provided objects are from a different sample space.

TypeError

If the provided objects are not of the correct type.

Returns:

Name Type Description
is_independent bool

True if the events or sigma algebras are independent, False otherwise.
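
As an illustration of what such a check involves, the sketch below tests the product rule P(A ∩ B) = P(A)P(B) up to a tolerance, and treats two sigma algebras given by finite partitions as independent when every pair of atoms is independent. This is a pure-Python sketch with hypothetical helper names, and the exact criterion used by `are_independent` is an assumption here.

```python
from itertools import product


def probability(prob, event):
    """Probability of an event given as a set of sample points."""
    return sum(prob[w] for w in event)


def events_independent(prob, event1, event2, tolerance=1e-10):
    """Check |P(A and B) - P(A) P(B)| <= tolerance."""
    joint = probability(prob, event1 & event2)
    expected = probability(prob, event1) * probability(prob, event2)
    return abs(joint - expected) <= tolerance


def partitions_independent(prob, atoms1, atoms2, tolerance=1e-10):
    """Check the product rule for every pair of atoms."""
    return all(
        events_independent(prob, a, b, tolerance)
        for a, b in product(atoms1, atoms2)
    )


# Two fair coin flips.
prob = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}
first_heads = {"HH", "HT"}
second_heads = {"HH", "TH"}
both_heads = {"HH"}

first_flip = [{"HH", "HT"}, {"TH", "TT"}]
second_flip = [{"HH", "TH"}, {"HT", "TT"}]
```

With this data, the two flips are independent of each other, while `first_heads` and `both_heads` are not.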

conditional_probability

conditional_probability(event, given)

Compute the conditional probability P(A|B).
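
For reference, the standard definition is given below; it is undefined when \(P(B) = 0\), which is presumably why that case raises a ValueError.

\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0.
\]

For instance, with probabilities 0.2, 0.5, and 0.3 on omega_0, omega_1, and omega_2, and events A = {omega_0, omega_1} and B = {omega_1, omega_2}, this gives \(P(A \mid B) = 0.5 / 0.8 = 0.625\).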

Parameters:

Name Type Description Default
event Event

The event A.

required
given Event

The event B.

required

Raises:

Type Description
ValueError

If event or given are from a different sample space than this probability measure's sample space, or if P(B) = 0.

from_dict

from_dict(probabilities)

Create a ProbabilityMeasure from a dictionary.

If a sample_space was not provided during initialization, it will be created from the keys of the provided dictionary. If it was provided, the keys of the dictionary must match the sample space.

Parameters:

Name Type Description Default
probabilities Mapping[Hashable, Real]

A mapping from sample space indices to their probabilities.

required

from_features classmethod

from_features(rv, pmf, name='P')

Add a probability measure on the domain of a random vector using a function of the features.

Parameters:

Name Type Description Default
rv RandomVector

The random vector whose domain will receive the probability measure.

required
pmf Callable[[FeatureVector | Hashable], Real]

Function mapping feature vectors (in dimension > 1) or hashable values (in dimension 1) to probability values. Must return non-negative values that sum to 1.

required
name Hashable | None

The name of the probability measure.

'P'

Returns:

Name Type Description
prob_measure ProbabilityMeasure

The resulting probability measure.

Examples:

>>> from numbers import Real
>>> from sigalg.core import (
...     FeatureVector, ProbabilityMeasure, RandomVector, SampleSpace
... )
>>> domain = SampleSpace.generate_sequence(size=4)
>>> outputs = {
...     "omega_0": (0, 0),
...     "omega_1": (0, 1),
...     "omega_2": (1, 0),
...     "omega_3": (1, 1),
... }
>>> X = RandomVector(domain=domain).from_dict(outputs)
>>> def pmf(v: FeatureVector) -> Real:
...     v0, v1 = v
...     return 0.75**v0 * 0.25 ** (1 - v0) * 0.6**v1 * 0.4 ** (1 - v1)
>>> P = ProbabilityMeasure.from_features(rv=X, pmf=pmf)
>>> P
Probability measure 'P':
        probability
sample
omega_0         0.10
omega_1         0.15
omega_2         0.30
omega_3         0.45
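
The idea can be sketched without the library: evaluate the pmf on the feature vector of each sample point and validate the result. This is a pure-Python illustration with a hypothetical helper name, assuming the "sum to 1" requirement is taken over the sample points of the domain.

```python
import math


def measure_from_features(outputs, pmf):
    """Assign each sample point the pmf value of its feature vector."""
    probabilities = {omega: pmf(v) for omega, v in outputs.items()}
    if any(p < 0 for p in probabilities.values()):
        raise ValueError("pmf returned a negative value")
    if not math.isclose(sum(probabilities.values()), 1.0):
        raise ValueError("pmf values do not sum to 1 over the sample space")
    return probabilities


def bernoulli_product(v):
    """Product of two Bernoulli pmfs with parameters 0.75 and 0.6."""
    v0, v1 = v
    return 0.75**v0 * 0.25 ** (1 - v0) * 0.6**v1 * 0.4 ** (1 - v1)


outputs = {
    "omega_0": (0, 0),
    "omega_1": (0, 1),
    "omega_2": (1, 0),
    "omega_3": (1, 1),
}
P = measure_from_features(outputs, bernoulli_product)
```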

from_pandas

from_pandas(data)

Create a ProbabilityMeasure from a pd.Series.

If a sample_space was not provided during initialization, it will be created from the index of the provided pd.Series. If it was provided, the index of the pd.Series must match the sample space.

Parameters:

Name Type Description Default
data Series

A pd.Series with sample space indices as the index and their associated probabilities as values.

required

Raises:

Type Description
TypeError

If data is not a pd.Series.

uniform classmethod

uniform(sample_space, name='P')

Create a uniform ProbabilityMeasure on the given sample space.

Parameters:

Name Type Description Default
sample_space SampleSpace

The sample space on which to define the uniform probability measure.

required
name Hashable

A name for the probability measure.

"P"

Raises:

Type Description
ValueError

If the sample space is empty.

Returns:

Name Type Description
prob_measure ProbabilityMeasure

A uniform ProbabilityMeasure instance on the provided sample space.
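
A uniform measure assigns probability 1/n to each of the n sample points, which is undefined for an empty sample space. The sketch below illustrates this with plain Python and exact fractions; the helper name `uniform_measure` is hypothetical and is not part of SigAlg.

```python
from fractions import Fraction


def uniform_measure(sample_points):
    """Assign probability 1/n to each of the n sample points."""
    points = list(sample_points)
    if not points:
        raise ValueError("cannot define a uniform measure on an empty sample space")
    return {w: Fraction(1, len(points)) for w in points}


P = uniform_measure(["s_0", "s_1", "s_2"])
```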

with_name

with_name(name)

Set the name of the probability measure and return self for chaining.

Parameters:

Name Type Description Default
name Hashable

The new name for the probability measure.

required

Returns:

Name Type Description
self ProbabilityMeasure

The current instance with the updated name.

ProbabilitySpace

Bases: SampleSpaceMethods, SigmaAlgebraMethods, ProbabilityMeasureMethods

A class representing a probability space.

A probability space \((\Omega, \mathcal{F}, P)\) consists of a sample space \(\Omega\) containing all possible outcomes, a sigma-algebra \(\mathcal{F}\) defining measurable events, and a probability measure \(P\) assigning probabilities to events.

ProbabilitySpace has attributes sample_space, sigma_algebra, and probability_measure that access the underlying components. It also inherits methods from SampleSpaceMethods, SigmaAlgebraMethods, and ProbabilityMeasureMethods, so their functionality is available directly on the ProbabilitySpace instance.

Parameters:

Name Type Description Default
sample_space SampleSpace

The sample space containing all possible outcomes.

required
sigma_algebra SigmaAlgebra | None

Sigma-algebra defining measurable events. If None, a power set sigma-algebra is created.

None
probability_measure ProbabilityMeasure | None

Probability measure assigning probabilities to outcomes. If None, a uniform probability measure is created.

None

Raises:

Type Description
TypeError

If sample_space is not a SampleSpace, or if sigma_algebra is not a SigmaAlgebra (if given), or if probability_measure is not a ProbabilityMeasure (if given).

ValueError

If sigma_algebra or probability_measure have different sample spaces than the provided sample_space.

Examples:

>>> from sigalg.core import ProbabilitySpace, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=3, prefix="s")
>>> # Create with uniform probability
>>> prob_space = ProbabilitySpace(sample_space=Omega)
>>> prob_space.probability_measure
Probability measure 'P':
        probability
sample
s_0        0.333333
s_1        0.333333
s_2        0.333333
>>> # Create with custom probabilities
>>> prob_space = ProbabilitySpace.from_dict(
...     sample_space=Omega,
...     probabilities={"s_0": 0.5, "s_1": 0.3, "s_2": 0.2}
... )
>>> prob_space.P("s_0")
0.5

probability_measure property writable

probability_measure

Get the probability measure assigning probabilities to events.

Returns:

Name Type Description
probability_measure ProbabilityMeasure

The probability measure of this probability space.

sigma_algebra property writable

sigma_algebra

Get the sigma-algebra defining measurable events.

Returns:

Name Type Description
sigma_algebra SigmaAlgebra

The sigma-algebra of this probability space.

from_dict classmethod

from_dict(
    probabilities, sample_space=None, sigma_algebra=None
)

Create a probability space from a dictionary of probabilities.

Convenience factory method that creates a probability measure from a dictionary mapping outcomes to probabilities, then constructs a probability space.

Parameters:

Name Type Description Default
probabilities dict[Hashable, Real]

Dictionary mapping sample point indices to their probabilities. Probabilities must be non-negative and sum to 1.

required
sample_space SampleSpace | None

The sample space containing all possible outcomes. If None, a sample space is created from the keys of the probabilities dictionary.

None
sigma_algebra SigmaAlgebra | None

Sigma-algebra defining measurable events. If None, a power set sigma-algebra is created.

None

Returns:

Name Type Description
probability_space ProbabilitySpace

A new probability space with the specified probabilities.

Examples:

>>> from sigalg.core import ProbabilitySpace
>>> prob_space = ProbabilitySpace.from_dict(
...     probabilities={"H": 0.6, "T": 0.4}
... )
>>> prob_space.sample_space
Sample space 'Omega':
['H', 'T']
>>> prob_space.P("H")
0.6

get_event_as_probability_space

get_event_as_probability_space(indices)

Create a conditional probability space given an event.

Given a probability space (Omega, F, P) and an event A, this method creates a new probability space (A, F_A, P_A) where F_A is the sigma-algebra restricted to A and P_A is the conditional probability measure on A.

Parameters:

Name Type Description Default
indices list[Hashable]

List of sample point indices defining the conditioning event.

required

Returns:

Name Type Description
probability_space ProbabilitySpace

A new probability space representing the conditional distribution.

Raises:

Type Description
ValueError

If the event has zero probability.

Examples:

>>> from sigalg.core import ProbabilitySpace, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=3, prefix="s")
>>> prob_space = ProbabilitySpace.from_dict(
...     sample_space=Omega,
...     probabilities={"s_0": 0.5, "s_1": 0.3, "s_2": 0.2}
... )
>>> cond_space = prob_space.get_event_as_probability_space(["s_0", "s_1"])
>>> bool(abs(cond_space.P("s_0") - 0.625) < 1e-10)
True
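
The conditional measure P_A can be pictured as restricting P to the points of A and dividing by P(A). The following pure-Python sketch illustrates that step only (it does not build the restricted sigma-algebra); the helper name `condition_on` is hypothetical.

```python
def condition_on(prob, indices):
    """Restrict a probability dictionary to an event and renormalize."""
    p_event = sum(prob[w] for w in indices)
    if p_event == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return {w: prob[w] / p_event for w in indices}


prob = {"s_0": 0.5, "s_1": 0.3, "s_2": 0.2}
cond = condition_on(prob, ["s_0", "s_1"])
```

The restricted probabilities again sum to 1, and `cond["s_0"]` equals 0.5 / 0.8 = 0.625 up to rounding, matching the check in the example above.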

sample

sample(size=1, random_state=None)

Generate random samples from this probability space.

Samples outcomes according to the probability measure, returning a list of sample point indices.

Parameters:

Name Type Description Default
size int

Number of samples to generate. Must be positive.

1
random_state int | None

Random seed for reproducibility. If None, results are not reproducible.

None

Returns:

Name Type Description
sample list[Hashable]

List of sampled outcomes from the sample space.

Raises:

Type Description
ValueError

If size is not a positive integer.

Examples:

>>> from sigalg.core import ProbabilitySpace, SampleSpace
>>> Omega = SampleSpace().from_list(["H", "T"])
>>> prob_space = ProbabilitySpace(sample_space=Omega)
>>> samples = prob_space.sample(size=10, random_state=42)
>>> len(samples)
10
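
Sampling from a finite probability space amounts to a weighted draw with replacement. The sketch below uses the standard library's `random.Random.choices` as a stand-in; which random number generator SigAlg uses internally is not stated here, so this is an assumption, and `sample_outcomes` is a hypothetical helper name.

```python
import random


def sample_outcomes(prob, size=1, random_state=None):
    """Draw `size` outcomes with replacement, weighted by their probabilities."""
    if not isinstance(size, int) or size <= 0:
        raise ValueError("size must be a positive integer")
    rng = random.Random(random_state)  # a fixed seed makes the draw reproducible
    outcomes = list(prob)
    weights = [prob[w] for w in outcomes]
    return rng.choices(outcomes, weights=weights, k=size)


prob = {"H": 0.5, "T": 0.5}
samples = sample_outcomes(prob, size=10, random_state=42)
repeat = sample_outcomes(prob, size=10, random_state=42)
```

Calling the function twice with the same seed returns the same list, which is the reproducibility that `random_state` is meant to provide.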

RandomVariable

Bases: RandomVector

A class representing a random variable, which is a 1-dimensional random vector.

indicator_of classmethod

indicator_of(event)

Create the indicator random variable of a given event.

Parameters:

Name Type Description Default
event Event

The event for which the indicator random variable is to be created.

required

Returns:

Name Type Description
indicator_rv RandomVariable

The indicator random variable of the given event.
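
An indicator random variable takes the value 1 on the event and 0 elsewhere, so its expectation equals the probability of the event. The following pure-Python sketch illustrates this with dictionaries; the helper name `indicator` and the encoding are hypothetical and do not reflect the library's return type.

```python
def indicator(sample_points, event):
    """Map each sample point to 1 if it lies in the event, else 0."""
    return {omega: 1 if omega in event else 0 for omega in sample_points}


prob = {"omega_0": 0.2, "omega_1": 0.5, "omega_2": 0.3}
A = {"omega_0", "omega_1"}

one_A = indicator(prob, A)
# The expectation of the indicator is the probability of the event.
expectation = sum(prob[w] * one_A[w] for w in prob)
```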

RandomVector

Bases: OperatorsMethods

A class representing a random vector mapping between two sample spaces.

An instance of RandomVector represents a mapping X: Omega -> S from a sample space Omega to a feature space S, which is a product of component spaces. The image X(omega) of a sample point omega is therefore a tuple of features, one drawn from each component space, called the feature vector of omega. The number of component spaces (i.e., the length of the feature vector) is called the dimension of the random vector.

Instances of RandomVector can be constructed directly from a domain sample space and a dictionary of outputs, whose keys are the sample points in the domain and whose values are the corresponding feature vectors (as tuples). Alternatively, other methods are provided to construct a RandomVector from a pd.DataFrame or a np.ndarray.

Parameters:

Name Type Description Default
domain SampleSpace | None

The sample space over which the random vector is defined. The None value indicates that the domain will be generated later through a method like from_dict, from_pandas, or from_numpy.

None
index Index | None

The index of the random vector. The None value indicates that the index will be generated later through a method like from_dict, from_pandas, or from_numpy.

None
name Hashable | None

The name of the random vector.

"X"
**kwargs

Additional keyword arguments for subclass constructors.

{}

Raises:

Type Description
TypeError

If domain is not a SampleSpace (if given), or if index is not an Index (if given), or if name is not a Hashable (if given).

Examples:

>>> from sigalg.core import SampleSpace, RandomVector
>>> domain = SampleSpace.generate_sequence(size=3, prefix="s", name="S")
>>> outputs = {"s0": (0.1, 0.2), "s1": (0.3, 0.4), "s2": (0.5, 0.6)}
>>> # Generate a 2-dimensional random vector from outputs dict
>>> X = RandomVector(name="X").from_dict(outputs)
>>> tuple(X("s0"))
(0.1, 0.2)
>>> X.dimension
2
>>> # Generate a 1-dimensional random vector from a pd.Series
>>> import pandas as pd
>>> data = pd.Series([10, 20, 30], index=pd.Index(["s0", "s1", "s2"], name="S"))
>>> Y = RandomVector(name="Y").from_pandas(data)
>>> Y
Random vector 'Y':
       Y
S
s0     10
s1     20
s2     30

data property

data

Get the underlying pandas data structure of a random vector.

If the random vector is of dimension 2 or greater, returns the underlying pd.DataFrame; otherwise, returns the underlying pd.Series for a random vector of dimension 1.

If the random vector was not constructed with from_pandas, the underlying pandas data structure is lazily constructed from the outputs mapping.

Returns:

Name Type Description
data Series | DataFrame

The underlying pd.Series or pd.DataFrame representing the random vector.

Examples:

>>> from sigalg.core import RandomVector, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=2, prefix="s")
>>> outputs_2d = {"s_0": (1, 2), "s_1": (3, 4)}
>>> X = RandomVector(domain=Omega, name="X").from_dict(outputs_2d)
>>> # Dataframes underlie random vectors of dimension 2 or greater
>>> X.data
feature  X_0  X_1
sample
s_0        1   2
s_1        3   4
>>> outputs_1d = {"s_0": 10, "s_1": 20}
>>> Y = RandomVector(domain=Omega, name="Y").from_dict(outputs_1d)
>>> # Series underlie random vectors of dimension 1
>>> Y.data
sample
s_0     10
s_1     20
Name: Y, dtype: int64

index property writable

index

Get the index of the random vector.

Returns:

Name Type Description
index Index | None

The index of the random vector, or None if the random vector is 1-dimensional.

name property writable

name

Get the name of the random vector.

Returns:

Name Type Description
name Hashable

The name of the random vector.

outputs property

outputs

Get the outputs mapping of the random vector.

If the random vector was not constructed with from_dict, the outputs mapping is lazily constructed from the underlying pandas data structure.

Returns:

Name Type Description
outputs Mapping[Hashable, Hashable]

The mapping from sample points in the domain to their corresponding output vectors.

probability_measure property writable

probability_measure

Get the probability measure on the domain of the random vector, if set.

Returns:

Name Type Description
probability_measure ProbabilityMeasure | None

The probability measure on the domain of the random vector, or None if not set.

range property

range

Get the range of the random vector.

Mathematically, the range of a random vector X: Omega -> S is the set of all vectors X(omega), as omega varies over the sample space Omega. In this implementation, the range is represented as another RandomVector, where the domain is a SampleSpace that indexes the unique output vectors of the original random vector, and the outputs are these unique vectors themselves.

If the random vector has a string name (e.g., X), the range random vector is named range(X), the domain of range(X) has indices x_0, x_1, etc., and the feature indices of range(X) match those of X itself. Otherwise, numerical indices are used.

Returns:

Name Type Description
range RandomVector

A RandomVector representing the range of the original random vector.

Examples:

>>> from sigalg.core import SampleSpace, RandomVector
>>> import pandas as pd
>>> outputs = {"omega_0": (1, 2), "omega_1": (3, 4), "omega_2": (3, 4)}
>>> domain = SampleSpace.generate_sequence(size=3)
>>> X = RandomVector(domain=domain, name="X").from_dict(outputs)
>>> pd.concat([X.range.data, X.range_counts.rename("counts")], axis=1)
        X_0  X_1  counts
output
x_0       1   2       1
x_1       3   4       2

range_counts property

range_counts

Get the counts of each unique output in the range.

This property pairs with the range property to identify and provide the frequency of each unique output vector in the random vector's mapping. The dataframe range.data contains the unique output vectors, while range_counts provides the corresponding counts as an index-aligned pd.Series.

Returns:

Name Type Description
range_counts Series

A pd.Series where the index identifies the unique output vectors in the range, and the values represent the counts of each output vector in the original random vector.

Examples:

>>> from sigalg.core import SampleSpace, RandomVector
>>> import pandas as pd
>>> outputs = {"omega_0": (1, 2), "omega_1": (3, 4), "omega_2": (3, 4)}
>>> domain = SampleSpace.generate_sequence(size=3)
>>> X = RandomVector(domain=domain, name="X").from_dict(outputs=outputs)
>>> pd.concat([X.range.data, X.range_counts.rename("counts")], axis=1)
        X_0  X_1  counts
output
x_0       1   2       1
x_1       3   4       2
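
Conceptually, the range and its counts come from tallying how often each distinct output vector occurs. The sketch below does this with `collections.Counter`; labeling the distinct outputs `x_0`, `x_1`, ... in order of first appearance is an assumption made for illustration.

```python
from collections import Counter

outputs = {"omega_0": (1, 2), "omega_1": (3, 4), "omega_2": (3, 4)}

# Count how many sample points map to each distinct output vector.
counts = Counter(outputs.values())

# Label the distinct outputs in order of first appearance (assumed ordering).
range_table = {
    f"x_{i}": (value, n) for i, (value, n) in enumerate(counts.items())
}
```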

sigma_algebra property

sigma_algebra

Get the sigma-algebra induced by the random vector.

Returns:

Name Type Description
sigma_algebra SigmaAlgebra

The sigma-algebra induced by the random vector.

Examples:

>>> from sigalg.core import (
...     RandomVector,
...     SampleSpace,
...     SigmaAlgebra,
... )
>>> domain = SampleSpace.generate_sequence(size=3, prefix="s")
>>> X = RandomVector(domain=domain).from_dict(
...     outputs={"s_0": (1, 2), "s_1": (3, 4), "s_2": (3, 4)},
... )
>>> sigma_algebra = SigmaAlgebra.from_random_vector(X)
>>> sigma_algebra
Sigma algebra 'sigma(X)':
       atom ID
sample
s_0      (1, 2)
s_1      (3, 4)
s_2      (3, 4)
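
The atoms of the induced sigma-algebra are the sets of sample points on which the random vector takes the same value, which is why the atom IDs above are the output vectors themselves. A minimal pure-Python sketch of this grouping, with a hypothetical helper name:

```python
def atoms_of(outputs):
    """Group sample points by their output; each group is one atom of sigma(X)."""
    atoms = {}
    for omega, value in outputs.items():
        atoms.setdefault(value, set()).add(omega)
    return atoms


outputs = {"s_0": (1, 2), "s_1": (3, 4), "s_2": (3, 4)}
atoms = atoms_of(outputs)
```

Every event in the induced sigma-algebra is then a union of these atoms.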

apply

apply(function)

Apply a function to the feature vector of each sample point, returning a new RandomVector.

Parameters:

Name Type Description Default
function Callable[[Hashable | FeatureVector], Hashable]

Function that takes a FeatureVector object (in dimension > 1) or a Hashable (in dimension 1) and returns a new output value.

required

Returns:

Name Type Description
new_rv RandomVector

A new RandomVector with outputs given by applying the function to each sample point's feature vector.

apply_to_features

apply_to_features(function)

Apply a function to the feature vector of each sample point.

Applies the given function to each sample point's feature vector, returning a pd.Series of results indexed by sample points.

Parameters:

Name Type Description Default
function Callable[[FeatureVector | Hashable], Any]

Function that takes a FeatureVector object (in dimension > 1) or a Hashable (in dimension 1) and returns a value.

required

Returns:

Name Type Description
results Series

Series of function results indexed by sample points.

Examples:

>>> from sigalg.core import RandomVector, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=2, prefix="s")
>>> X = RandomVector(domain=Omega, name="X").from_dict(outputs={"s_0": (1, 2), "s_1": (3, 4)})
>>> X.apply_to_features(lambda f: f.sum() + 2)
sample
s_0    5
s_1    9
dtype: int64
>>> Y = RandomVector(domain=Omega, name="Y").from_dict(outputs={"s_0": 5, "s_1": 10})
>>> Y.apply_to_features(lambda x: x * 2)
sample
s_0    10
s_1    20
Name: Y, dtype: int64

from_constant

from_constant(constant)

Create a RandomVector that maps every sample point in the domain to the same constant output vector.

For this construction method, the domain must be provided at construction.

Parameters:

Name Type Description Default
constant Hashable

The constant output vector that every sample point in the domain maps to.

required

Returns:

Name Type Description
self RandomVector

A random vector mapping every sample point in the domain to the same constant output vector.

from_dict

from_dict(outputs)

Create a RandomVector from a dictionary mapping sample points to output vectors.

If the domain sample space is not provided at construction, it is automatically generated from the keys of the outputs dictionary. Similarly, if the index is not provided at construction and the random vector has dimension 2 or greater, a default feature index (i.e., an instance of Index) is also automatically generated. If the domain is provided at construction, the keys of the outputs dictionary must match the indices of the domain.

Parameters:

Name Type Description Default
outputs Mapping[Hashable, Hashable]

A mapping from sample points in the domain to their corresponding output vectors (e.g., tuples of feature values).

required

Raises:

Type Description
ValueError

If the data has dimension greater than 1 and self is an instance of RandomVariable.

Returns:

Name Type Description
self RandomVector

The constructed RandomVector instance.

Examples:

>>> from sigalg.core import RandomVector, SampleSpace
>>> outputs = {"omega_0": (0.1, 0.2), "omega_1": (0.3, 0.4), "omega_2": (0.5, 0.6)}
>>> X = RandomVector(name="X").from_dict(outputs)
>>> tuple(X("omega_1"))
(0.3, 0.4)
>>> X.domain
Sample space 'Omega':
['omega_0', 'omega_1', 'omega_2']
>>> X.index
Index 'index':
['X_0', 'X_1']

from_numpy

from_numpy(array)

Create a RandomVector from a NumPy ndarray.

If the domain sample space is not provided at construction, then it is automatically generated as a default sample space with indices 0, 1, ..., n-1, where n is the number of rows in the provided ndarray. Similarly, if the index is not provided at construction and the random vector has dimension 2 or greater, a default feature index (i.e., an instance of Index) is also automatically generated.

Parameters:

Name Type Description Default
array ndarray

NumPy array where rows are feature vectors of sample points and columns are features.

required

Returns:

Name Type Description
self RandomVector

A random vector constructed from the array.

Raises:

Type Description
TypeError

If array is not a NumPy ndarray.

Examples:

>>> from sigalg.core import Index, RandomVector, SampleSpace
>>> import numpy as np
>>> domain = SampleSpace.generate_sequence(size=3)
>>> index = Index.generate_sequence(size=2, prefix="feature")
>>> arr = np.array([[1, 2], [3, 4], [5, 6]])
>>> X = RandomVector(domain=domain, index=index, name="X").from_numpy(arr)
>>> X
Random vector 'X':
        feature_0  feature_1
sample
omega_0         1         2
omega_1         3         4
omega_2         5         6

from_pandas

from_pandas(data)

Create a RandomVector from a pd.Series or pd.DataFrame.

If the domain sample space is not provided at construction, then it is automatically generated from the index of the provided pd.Series or pd.DataFrame. Similarly, if the index is not provided at construction and the random vector has dimension 2 or greater, a default feature index (i.e., an instance of Index) is also automatically generated. If either domain or index is provided at construction, it must match the index or the columns of the provided data, respectively.

Parameters:

Name Type Description Default
data Series | DataFrame

A pd.Series or pd.DataFrame where each row corresponds to a feature vector of a sample point. If data is a pd.Series, the random vector is 1-dimensional; if data is a pd.DataFrame, the random vector's dimension equals the number of columns.

required

Raises:

Type Description
TypeError

If data is not a pd.Series or pd.DataFrame.

ValueError

If the length of index (if provided) does not match the dimension of the random vector, or if the data has dimension greater than 1 and self is an instance of RandomVariable.

Returns:

Name Type Description
self RandomVector

The constructed RandomVector instance.

Examples:

>>> from sigalg.core import RandomVector
>>> import pandas as pd
>>> # Create a 2-dimensional random vector
>>> data = pd.DataFrame(
...     [[1, 2], [3, 4], [5, 6]],
...     index=pd.Index([0, 1, 2], name="numbers"),
...     columns=pd.Index(["feature1", "feature2"], name="features"),
... )
>>> X = RandomVector(name="X").from_pandas(data)
>>> X
Random vector 'X':
features  feature1  feature2
numbers
0              1         2
1              3         4
2              5         6
>>> # Create a 1-dimensional random vector from a series
>>> data = pd.Series(
...     [10, 20, 30],
...     index=pd.Index([0, 1, 2], name="numbers"),
... )
>>> Y = RandomVector(name="Y").from_pandas(data)
>>> Y
Random vector 'Y':
       Y
numbers
0     10
1     20
2     30
>>> # Create a 1-dimensional random variable from a single-column dataframe
>>> data = pd.DataFrame([1, 2, 3], index=pd.Index([0, 1, 2], name="numbers"))
>>> Z = RandomVector(name="Z").from_pandas(data)
>>> Z
Random vector 'Z':
       Z
numbers
0     1
1     2
2     3

get_component_rv

get_component_rv(index)

Get a component random variable corresponding to a specific feature index.

Parameters:

Name Type Description Default
index Hashable

The feature index for which to get the component random variable.

required

Returns:

Name Type Description
component_rv RandomVariable

A new RandomVariable representing the component random variable.

Raises:

Type Description
ValueError

If the feature index is not found.

Examples:

>>> from sigalg.core import RandomVector, SampleSpace
>>> domain = SampleSpace.generate_sequence(size=2, prefix="s")
>>> outputs = {"s_0": (1, 2), "s_1": (3, 4)}
>>> X = RandomVector(domain=domain).from_dict(outputs)
>>> X_component = X.get_component_rv("X_1")
>>> X_component
Random variable 'X_1':
       X_1
sample
s_0     2
s_1     4

get_sub_vector

get_sub_vector(feature_indices)

Get a sub-vector of the random vector by selecting specific feature indices.

Parameters:

Name Type Description Default
feature_indices list[Hashable]

List of feature indices to select for the sub-vector.

required

Returns:

Name Type Description
sub_vector RandomVector

A new RandomVector containing only the specified feature indices.

Raises:

Type Description
ValueError

If any feature index is not found.

Examples:

>>> from sigalg.core import RandomVector, SampleSpace
>>> domain = SampleSpace.generate_sequence(size=2, prefix="s")
>>> outputs = {"s_0": (1, 2, 3), "s_1": (4, 5, 6)}
>>> X = RandomVector(domain=domain).from_dict(outputs)
>>> X_sub = X.get_sub_vector(feature_indices=["X_0", "X_2"])
>>> X_sub
Random vector 'X_sub':
feature  X_0  X_2
sample
s_0        1    3
s_1        4    6

is_measurable

is_measurable(sigma_algebra)

Check if the random vector is measurable with respect to a given sigma-algebra.

Parameters:

Name Type Description Default
sigma_algebra SigmaAlgebra

The sigma-algebra on the domain sample space.

required

Returns:

Name Type Description
is_measurable bool

True if the random vector is measurable with respect to the given sigma-algebra, False otherwise.

Examples:

>>> from sigalg.core import (
...     RandomVector,
...     SampleSpace,
...     SigmaAlgebra,
... )
>>> domain = SampleSpace.generate_sequence(size=4, prefix="s", name="S")
>>> X = RandomVector(domain=domain, name="X").from_dict(
...     outputs={"s_0": (1, 2), "s_1": (3, 4), "s_2": (3, 4), "s_3": (3, 4)},
... )
>>> Y = RandomVector(domain=domain, name="Y").from_dict(
...     outputs={"s_0": "a", "s_1": "b", "s_2": "c", "s_3": "d"},
... )
>>> F = SigmaAlgebra(sample_space=domain).from_dict(
...     {"s_0": 0, "s_1": 1, "s_2": 1, "s_3": 2},
... )
>>> print(X.is_measurable(F))
True
>>> print(Y.is_measurable(F))
False
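
On a finite sample space, measurability with respect to a sigma-algebra given by atoms amounts to the random vector being constant on every atom. The sketch below illustrates that criterion in plain Python on the data from the example above. It does not call SigAlg, and the helper name is_constant_on_atoms is hypothetical rather than part of the library.

```python
def is_constant_on_atoms(outputs, sample_id_to_atom_id):
    """Return True if `outputs` takes a single value on each atom."""
    value_on_atom = {}
    for sample_id, atom_id in sample_id_to_atom_id.items():
        value = outputs[sample_id]
        # setdefault returns the first value recorded for this atom.
        if value_on_atom.setdefault(atom_id, value) != value:
            return False
    return True


atoms = {"s_0": 0, "s_1": 1, "s_2": 1, "s_3": 2}
x_outputs = {"s_0": (1, 2), "s_1": (3, 4), "s_2": (3, 4), "s_3": (3, 4)}
y_outputs = {"s_0": "a", "s_1": "b", "s_2": "c", "s_3": "d"}

x_measurable = is_constant_on_atoms(x_outputs, atoms)  # constant on each atom
y_measurable = is_constant_on_atoms(y_outputs, atoms)  # differs within atom 1
print(x_measurable, y_measurable)
```

The two results agree with the True and False printed in the example above.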

item

item()

Get the single output value of a 1-dimensional RandomVector with exactly one sample point.

Returns:

Name Type Description
output Hashable

The single output value of the random vector.

Raises:

Type Description
ValueError

If the random vector does not have exactly one sample point or is not 1-dimensional.

iter_features

iter_features()

Iterate over sample points and their feature vectors.

Yields tuples of (sample_index, FeatureVector) for each sample point in the domain, allowing iteration over the random vector's entire domain.

Yields:

Name Type Description
sample_index Hashable

Index of the sample point.

features FeatureVector

Feature vector of the sample point.

Examples:

>>> from sigalg.core import RandomVector, SampleSpace
>>> Omega = SampleSpace.generate_sequence(size=2, prefix="s")
>>> X = RandomVector(domain=Omega).from_dict(outputs={"s_0": (1, 2), "s_1": (3, 4)})
>>> for _, features in X.iter_features():
...     print(features)
Feature vector of 's_0':
         s_0
feature
X_0        1
X_1        2
Feature vector of 's_1':
         s_1
feature
X_0        3
X_1        4
>>> Y = RandomVector(domain=Omega, name="Y").from_dict(outputs={"s_0": 1, "s_1": 2})
>>> for idx, features in Y.iter_features():
...     print(f"Feature of {idx}: ", features)
Feature of s_0:  1
Feature of s_1:  2

print_values_and_probabilities

print_values_and_probabilities()

Print the values of the random vector and their corresponding probabilities.

to_random_variable

to_random_variable()

Convert a 1-dimensional RandomVector to a RandomVariable.

Returns:

Name Type Description
rv RandomVariable

The converted RandomVariable.

Examples:

>>> from sigalg.core import RandomVector, SampleSpace
>>> domain = SampleSpace.generate_sequence(size=2, prefix="s")
>>> outputs = {"s_0": 10, "s_1": 20}
>>> X = RandomVector(domain=domain, name="X").from_dict(outputs=outputs)
>>> X_var = X.to_random_variable()
>>> X_var
Random variable 'X':
        X
sample
s_0    10
s_1    20

with_name

with_name(name, modify_index=False)

Set the name of the random vector and return self for chaining.

Parameters:

Name Type Description Default
name Hashable

The new name for the random vector.

required
modify_index bool

If True and the random vector has a feature index, also updates the feature index to reflect the new name of the random vector.

False

Returns:

Name Type Description
self RandomVector

Returns self to allow method chaining.

with_probability_measure

with_probability_measure(
    probabilities=None, probability_measure=None
)

Set the probability measure on the domain of the random vector and return self for chaining.

The user can provide either a probability_measure or a probabilities mapping, but not both. If a probabilities mapping is provided, it is used to construct a ProbabilityMeasure on the domain of the random vector.

Parameters:

Name Type Description Default
probabilities Mapping[Hashable, Real] | None

A mapping from sample points in the domain to their corresponding probabilities. If given, this is used to construct a ProbabilityMeasure on the domain of the random vector.

None
probability_measure ProbabilityMeasure | None

The probability measure to set on the domain of the random vector.

None

SampleSpace

Bases: Index

A class representing a sample space.

An instance of SampleSpace is not intended to contain data; rather, it is used to model only the labels or indices of possible outcomes of a random experiment. Data is encoded in instances of RandomVariable and RandomVector.

Sample spaces support operations like creating events, converting to probability spaces, and iterating over outcomes.

Parameters:

Name Type Description Default
name Hashable | None

Name identifier for the sample space.

"Omega"
data_name Hashable | None

Name for the internal pd.Index.

"sample"

Examples:

>>> from sigalg.core import SampleSpace
>>> import pandas as pd
>>> # Construction with list
>>> Omega_1 = SampleSpace(name="Omega_1").from_list(["omega_0", "omega_1", "omega_2"])
>>> Omega_1
Sample space 'Omega_1':
['omega_0', 'omega_1', 'omega_2']
>>> # Construction with pd.Index
>>> idx = pd.Index(["a", "b", "c"], name="sample")
>>> Omega_2 = SampleSpace(name="Omega_2").from_pandas(data=idx)
>>> Omega_2
Sample space 'Omega_2':
['a', 'b', 'c']

generate_sequence classmethod

generate_sequence(
    size,
    initial_index=0,
    prefix="omega",
    name="Omega",
    data_name="sample",
)

Generate a default SampleSpace with sequential indices.

Creates a SampleSpace with sequentially numbered sample points, optionally prefixed by a given string.

Parameters:

Name Type Description Default
size int

Number of sample points to generate.

required
initial_index int

Starting integer for generating sample point names.

0
prefix Hashable | None

Prefix for naming sample points. If the prefix is a non-string hashable or None, numerical indices are used instead.

"omega"
name Hashable | None

Name identifier for the sample space.

"Omega"
data_name Hashable | None

Name for the internal pd.Index.

"sample"

Examples:

>>> from sigalg.core import SampleSpace
>>> # Generate sample space with string prefix
>>> Omega1 = SampleSpace.generate_sequence(size=3, prefix="s")
>>> Omega1
Sample space 'Omega':
['s_0', 's_1', 's_2']
>>> # Generate sample space with numerical indices
>>> Omega2 = SampleSpace.generate_sequence(size=2, initial_index=5, prefix=None, name="Numbers")
>>> Omega2
Sample space 'Numbers':
[5, 6]
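
The labelling pattern visible in these examples can be reproduced in a few lines of plain Python. This is only a sketch of the naming scheme inferred from the outputs shown here. The helper sequence_labels is hypothetical and is not how SigAlg is necessarily implemented.

```python
def sequence_labels(size, initial_index=0, prefix="omega"):
    """Hypothetical helper mimicking the labels shown in the examples."""
    numbers = range(initial_index, initial_index + size)
    if isinstance(prefix, str):
        # String prefix: join prefix and running number with an underscore.
        return [f"{prefix}_{i}" for i in numbers]
    # Non-string or None prefix: fall back to plain integers.
    return list(numbers)


labels_prefixed = sequence_labels(size=3, prefix="s")
labels_numeric = sequence_labels(size=2, initial_index=5, prefix=None)
print(labels_prefixed)
print(labels_numeric)
```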

get_event

get_event(event_indices, name='A')

Create an event from a list of sample point indices.

Constructs an Event object representing a subset of this sample space. All provided indices must exist in the sample space.

Parameters:

Name Type Description Default
event_indices list of Hashable

List of sample point indices to include in the event. Must be hashable items that exist in this sample space.

required
name Hashable

Name identifier for the event.

"A"

Returns:

Name Type Description
event Event

An Event object containing the specified sample points.

Examples:

>>> from sigalg.core import SampleSpace
>>> Omega = SampleSpace().from_list(["omega0", "omega1", "omega2", "omega3"])
>>> # Create event with specific sample points
>>> A = Omega.get_event(["omega0", "omega1"], name="A")
>>> # Create event with empty list
>>> empty_event = Omega.get_event([])

make_event_space

make_event_space(sigma_algebra=None)

Convert this sample space to an event space.

Creates an EventSpace object with this sample space as the underlying space. Optionally specify a sigma-algebra to define which events are measurable.

Parameters:

Name Type Description Default
sigma_algebra SigmaAlgebra

Sigma-algebra to use. If None, a power set sigma-algebra will be created.

None

Returns:

Name Type Description
event_space EventSpace

An EventSpace object with this sample space.

Examples:

>>> from sigalg.core import SampleSpace, SigmaAlgebra
>>> Omega = SampleSpace().from_list(["s0", "s1", "s2", "s3"])
>>> # Create with default power set sigma-algebra
>>> event_space = Omega.make_event_space()
>>> # Create with custom sigma-algebra
>>> F = SigmaAlgebra(sample_space=Omega).from_dict(
...     sample_id_to_atom_id={"s0": 0, "s1": 0, "s2": 1, "s3": 1},
... )
>>> event_space = Omega.make_event_space(sigma_algebra=F)

make_probability_space

make_probability_space(
    sigma_algebra=None, probability_measure=None
)

Convert this sample space to a probability space.

Creates a ProbabilitySpace object with this sample space as the underlying space. Optionally specify a sigma-algebra and probability measure. If not provided, defaults will be used.

Parameters:

Name Type Description Default
sigma_algebra SigmaAlgebra

Sigma-algebra to use. If None, a power set sigma-algebra will be created.

None
probability_measure ProbabilityMeasure

Probability measure to use. If None, a uniform probability measure will be created.

None

Returns:

Name Type Description
probability_space ProbabilitySpace

A ProbabilitySpace object with this sample space.

Examples:

>>> from sigalg.core import SampleSpace, ProbabilityMeasure
>>> Omega = SampleSpace().from_list(["s0", "s1", "s2"])
>>> # Create with default uniform measure
>>> prob_space = Omega.make_probability_space()
>>> # Create with custom probability measure
>>> probs = {"s0": 0.5, "s1": 0.3, "s2": 0.2}
>>> P = ProbabilityMeasure(sample_space=Omega).from_dict(probs)
>>> prob_space = Omega.make_probability_space(probability_measure=P)
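
For intuition, a uniform measure on a finite sample space assigns mass 1/n to each of the n sample points, and any custom mapping should be non-negative and sum to 1. The sketch below spells this out in plain Python with exact fractions to avoid floating-point rounding. The helper is_probability_mapping is hypothetical, and whether SigAlg performs these exact checks is an assumption.

```python
from fractions import Fraction

sample_points = ["s0", "s1", "s2"]

# Uniform measure: every sample point gets the same mass 1/n.
uniform = {s: Fraction(1, len(sample_points)) for s in sample_points}


def is_probability_mapping(probs, points):
    """Check matching keys, non-negativity, and total mass 1."""
    return (
        set(probs) == set(points)
        and all(p >= 0 for p in probs.values())
        and sum(probs.values()) == 1
    )


custom = {"s0": Fraction(1, 2), "s1": Fraction(3, 10), "s2": Fraction(1, 5)}
uniform_ok = is_probability_mapping(uniform, sample_points)
custom_ok = is_probability_mapping(custom, sample_points)
print(uniform_ok, custom_ok)
```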

SigmaAlgebra

A class representing a sigma algebra over a sample space.

The sigma algebra is specified by a mapping from sample IDs to atom IDs within a given sample space: sample points that share an atom ID belong to the same atom.

Parameters:

Name Type Description Default
sample_space SampleSpace | None

The sample space over which the sigma algebra is defined. If None, it is inferred by the from_dict or from_pandas method.

None
name Hashable | None

The name of the sigma algebra.

"F"

Raises:

Type Description
TypeError

If name is provided and is not a hashable type, or if sample_space is provided and is not a SampleSpace instance.

Examples:

>>> from sigalg.core import SampleSpace, SigmaAlgebra
>>> sample_id_to_atom_id = {"s_1": "A", "s_2": "A", "s_3": "B"}
>>> F = SigmaAlgebra(name="F").from_dict(
...     sample_id_to_atom_id=sample_id_to_atom_id,
... )
>>> F
Sigma algebra 'F':
    atom ID
sample
s_1        A
s_2        A
s_3        B

atom_id_to_cardinality property

atom_id_to_cardinality

Get a mapping from atom IDs to their cardinalities in this sigma algebra.

Returns:

Name Type Description
atom_id_to_cardinality dict[Hashable, int]

A dictionary mapping each atom ID to the number of sample IDs it contains.

atom_id_to_event property

atom_id_to_event

Get a mapping from atom IDs to Event objects in this sigma algebra.

Returns:

Name Type Description
atom_id_to_event dict[Hashable, Event]

A dictionary mapping each atom ID to its corresponding Event object.

atom_id_to_sample_ids property

atom_id_to_sample_ids

Get a mapping from atom IDs to lists of sample IDs in this sigma algebra.

Returns:

Name Type Description
atom_id_to_sample_ids dict[Hashable, list[Hashable]]

A dictionary mapping each atom ID to a list of sample IDs contained in that atom.
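
Conceptually, the atom-oriented properties are inversions of the sample-to-atom mapping. The following plain-Python sketch shows one way to derive them from the mapping used in the class example. It is an illustration only, and the ordering SigAlg uses inside each list is an assumption.

```python
from collections import defaultdict

sample_id_to_atom_id = {"s_1": "A", "s_2": "A", "s_3": "B"}

# Group sample IDs by the atom they are assigned to.
grouped = defaultdict(list)
for sample_id, atom_id in sample_id_to_atom_id.items():
    grouped[atom_id].append(sample_id)
atom_id_to_sample_ids = dict(grouped)

# The cardinality of an atom is the number of sample IDs it contains.
atom_id_to_cardinality = {
    atom_id: len(ids) for atom_id, ids in atom_id_to_sample_ids.items()
}
print(atom_id_to_sample_ids)
print(atom_id_to_cardinality)
```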

atom_ids property

atom_ids

Get a list of atom IDs in this sigma algebra.

Returns:

Name Type Description
atom_ids list[Hashable]

A list of atom IDs in this sigma algebra.

data property

data

Get the underlying pd.Series.

Returns:

Name Type Description
data Series

A pd.Series mapping sample IDs to atom IDs.

name property writable

name

Get the name identifier for this sigma algebra.

Returns:

Name Type Description
name Hashable

The name of this sigma algebra.

num_atoms property

num_atoms

Get the number of atoms in this sigma algebra.

Returns:

Name Type Description
num_atoms int

The number of atoms in this sigma algebra.

sample_id_to_atom_id property

sample_id_to_atom_id

Get the mapping from sample IDs to atom IDs.

Returns:

Name Type Description
sample_id_to_atom_id Mapping[Hashable, Hashable]

A mapping from sample IDs to atom IDs.

from_dict

from_dict(sample_id_to_atom_id)

Initialize the sigma algebra from a dictionary mapping sample IDs to atom IDs.

If a sample_space was not provided during initialization, it will be created from the keys of the provided mapping. If it was provided, the keys of the mapping must match the sample space.

Parameters:

Name Type Description Default
sample_id_to_atom_id Mapping[Hashable, Hashable]

A mapping from sample IDs to atom IDs.

required

Returns:

Name Type Description
self SigmaAlgebra

The current SigmaAlgebra instance with updated mapping.

from_event classmethod

from_event(event)

Create the sigma algebra generated by a single event.

Parameters:

Name Type Description Default
event Event

The event to generate the sigma algebra from.

required

Returns:

Name Type Description
sigma_algebra SigmaAlgebra

A new SigmaAlgebra instance generated by the given event.
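
Mathematically, the sigma algebra generated by a single event A consists of the empty set, A, its complement, and the whole sample space, so it has at most two atoms. The sketch below builds the corresponding sample-to-atom mapping in plain Python. The helper atoms_generated_by and the labels 1 and 0 are hypothetical choices, not the atom IDs SigAlg necessarily assigns.

```python
def atoms_generated_by(event, sample_space):
    """Map each sample point to 1 if it lies in the event, else 0."""
    event = set(event)
    return {sample_id: int(sample_id in event) for sample_id in sample_space}


omega = ["omega_0", "omega_1", "omega_2", "omega_3"]
mapping = atoms_generated_by(["omega_0", "omega_1"], omega)
print(mapping)
```

If the event is empty or equals the whole sample space, every sample point receives the same label and the result is the trivial sigma algebra.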

from_pandas

from_pandas(data)

Create a SigmaAlgebra from a pd.Series.

If a sample_space was not provided during initialization, it will be created from the index of the provided pd.Series. If it was provided, the index of the pd.Series must match the sample space.

Parameters:

Name Type Description Default
data Series

A pd.Series mapping sample IDs (the index) to atom IDs (the values).

required

Raises:

Type Description
TypeError

If data is not a pd.Series.

Returns:

Name Type Description
self SigmaAlgebra

The current SigmaAlgebra instance with updated data.

Examples:

>>> from sigalg.core import SigmaAlgebra
>>> import pandas as pd
>>> # Create a sigma algebra from a series with custom index
>>> data = pd.Series(['A', 'A', 'B'], index=['s_0', 's_1', 's_2'])
>>> F = SigmaAlgebra().from_pandas(data)
>>> F
Sigma algebra 'F':
    atom ID
sample
s_0          A
s_1          A
s_2          B
>>> # Check the automatically generated sample space
>>> F.sample_space
Sample space 'Omega':
['s_0', 's_1', 's_2']
>>> # Change the name of the sample space
>>> F.sample_space.name = 'S'
>>> F.sample_space
Sample space 'S':
['s_0', 's_1', 's_2']
>>> # Create another sigma algebra from series with default index
>>> new_data = pd.Series([0, 0, 1])
>>> G = SigmaAlgebra(name="G").from_pandas(new_data)
>>> G
Sigma algebra 'G':
        atom ID
sample
0             0
1             0
2             1
>>> G.sample_space
Sample space 'Omega':
[0, 1, 2]

from_random_vector classmethod

from_random_vector(
    rv,
    discretize=False,
    n_bins=10,
    use_pca=False,
    n_components=None,
)

Create a sigma algebra induced by a random vector.

Parameters:

Name Type Description Default
rv RandomVector

The random vector to induce the sigma algebra from.

required
discretize bool

Whether to discretize continuous data using binning.

False
n_bins int

Number of bins per dimension (only used if discretize=True).

10
use_pca bool

Whether to apply PCA before discretization (only used if discretize=True).

False
n_components int | None

Number of principal components (only used if discretize=True and use_pca=True).

None

Returns:

Name Type Description
sigma_algebra SigmaAlgebra

A new SigmaAlgebra instance induced by the given random vector.
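
Without discretization, the sigma algebra induced by a random vector on a finite sample space groups together the sample points on which the vector takes the same value. The plain-Python sketch below shows that grouping. The helper induced_atoms is hypothetical, and numbering atoms in order of first appearance is an assumption about labelling, not documented SigAlg behavior.

```python
def induced_atoms(outputs):
    """Assign one integer atom ID per distinct output value."""
    atom_of_value = {}
    mapping = {}
    for sample_id, value in outputs.items():
        # A value seen for the first time receives the next unused label.
        mapping[sample_id] = atom_of_value.setdefault(value, len(atom_of_value))
    return mapping


outputs = {"s_0": (1, 2), "s_1": (3, 4), "s_2": (3, 4), "s_3": (3, 4)}
induced = induced_atoms(outputs)
print(induced)
```

For continuous data the values would first be replaced by bin labels, which is the role of the discretize and n_bins parameters. The binning scheme itself is not described here, so it is left out of the sketch.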

get_atom_containing

get_atom_containing(sample_id)

Get the atom containing a given sample ID.

Parameters:

Name Type Description Default
sample_id Hashable

The sample ID for which to retrieve the containing atom.

required

Raises:

Type Description
ValueError

If sample_id is not in the sample space of this sigma algebra.

Returns:

Name Type Description
atom Event

The Event object representing the atom that contains the given sample ID.

is_measurable

is_measurable(event)

Check if an event is measurable with respect to this sigma algebra.

Parameters:

Name Type Description Default
event Event

The event to check for measurability.

required

Raises:

Type Description
TypeError

If event is not an Event instance.

ValueError

If event does not have the same sample space as this sigma algebra.

Returns:

Name Type Description
is_measurable bool

True if the event is measurable with respect to this sigma algebra, False otherwise.

Examples:

>>> from sigalg.core import Event, SampleSpace, SigmaAlgebra
>>> sample_space = SampleSpace.generate_sequence(size=3, initial_index=1, prefix="s")
>>> sample_id_to_atom_id = {"s_1": "A", "s_2": "A", "s_3": "B"}
>>> sigma_algebra = SigmaAlgebra(sample_space=sample_space).from_dict(
...     sample_id_to_atom_id=sample_id_to_atom_id,
... )
>>> A = Event(sample_space=sample_space, name="A").from_list(["s_1", "s_2"])
>>> B = Event(sample_space=sample_space, name="B").from_list(["s_3"])
>>> C = Event(sample_space=sample_space, name="C").from_list(["s_1"])
>>> sigma_algebra.is_measurable(A)
True
>>> sigma_algebra.is_measurable(B)
True
>>> sigma_algebra.is_measurable(C)
False
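
On a finite sample space, an event is measurable exactly when it is a union of atoms. The sketch below checks this directly in plain Python on the data from the example above. The helper is_union_of_atoms is hypothetical and is not a claim about how SigAlg implements the check.

```python
def is_union_of_atoms(event, sample_id_to_atom_id):
    """Return True if the event contains every atom it intersects."""
    event = set(event)
    touched = {sample_id_to_atom_id[s] for s in event}
    # All sample points lying in an atom that the event touches.
    covered = {s for s, a in sample_id_to_atom_id.items() if a in touched}
    return covered == event


atoms = {"s_1": "A", "s_2": "A", "s_3": "B"}
a_ok = is_union_of_atoms(["s_1", "s_2"], atoms)  # whole atom "A"
b_ok = is_union_of_atoms(["s_3"], atoms)         # whole atom "B"
c_ok = is_union_of_atoms(["s_1"], atoms)         # only part of atom "A"
print(a_ok, b_ok, c_ok)
```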

power_set classmethod

power_set(sample_space, name='power_set')

Create the power-set sigma algebra over a given sample space.

The power-set sigma algebra contains all possible subsets of the sample space, meaning each sample point is its own atom. It is the finest sigma algebra possible over the given sample space.

Parameters:

Name Type Description Default
sample_space SampleSpace

The sample space over which to create the power-set sigma algebra.

required
name Hashable

Name identifier for the sigma algebra.

'power_set'

Returns:

Name Type Description
sigma_algebra SigmaAlgebra

A new SigmaAlgebra instance representing the power-set sigma algebra.

Examples:

>>> from sigalg.core import SampleSpace, SigmaAlgebra
>>> sample_space = SampleSpace.generate_sequence(size=3, initial_index=1, prefix="s")
>>> G = SigmaAlgebra.power_set(sample_space, name="G")
>>> # Each sample point is its own atom in the power-set sigma algebra
>>> G
Sigma algebra 'G':
    atom ID
sample
s_1        0
s_2        1
s_3        2

to_atoms

to_atoms()

Get a list of atoms as Event objects in this sigma algebra.

Returns:

Name Type Description
atoms list[Event]

A list of Event objects representing the atoms in this sigma algebra.

trivial classmethod

trivial(sample_space, name='trivial')

Create the trivial sigma algebra over a given sample space.

The trivial sigma algebra contains only the empty set and the entire sample space, meaning all sample points belong to the same atom. It is the coarsest sigma algebra possible over the given sample space.

Parameters:

Name Type Description Default
sample_space SampleSpace

The sample space over which to create the trivial sigma algebra.

required
name Hashable

Name identifier for the sigma algebra.

'trivial'

Returns:

Name Type Description
sigma_algebra SigmaAlgebra

A new SigmaAlgebra instance representing the trivial sigma algebra.

Examples:

>>> from sigalg.core import SampleSpace, SigmaAlgebra
>>> sample_space = SampleSpace.generate_sequence(size=3, initial_index=1, prefix="s")
>>> F = SigmaAlgebra.trivial(sample_space, name="F")
>>> # All sample points belong to the same atom in the trivial sigma algebra
>>> F
Sigma algebra 'F':
        atom ID
sample
s_1        0
s_2        0
s_3        0
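
The power-set and trivial sigma algebras are the two extremes of the same representation: one atom per sample point versus a single atom. Because every measurable event is a union of atoms, a sigma algebra with k atoms contains 2**k events. The plain-Python sketch below makes the comparison concrete. The helper num_measurable_events is hypothetical.

```python
sample_points = ["s_1", "s_2", "s_3"]

# Finest case: each sample point is its own atom.
power_set_atoms = {s: i for i, s in enumerate(sample_points)}
# Coarsest case: all sample points share one atom.
trivial_atoms = {s: 0 for s in sample_points}


def num_measurable_events(sample_id_to_atom_id):
    """Count unions of atoms, including the empty union."""
    return 2 ** len(set(sample_id_to_atom_id.values()))


n_power = num_measurable_events(power_set_atoms)
n_trivial = num_measurable_events(trivial_atoms)
print(n_power, n_trivial)
```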

with_name

with_name(name)

Set the name of the sigma algebra and return self for chaining.

Parameters:

Name Type Description Default
name Hashable

The new name for the sigma algebra.

required

Returns:

Name Type Description
self SigmaAlgebra

The current instance with the updated name.

Time

Bases: Index

A class representing a time index.

Parameters:

Name Type Description Default
name Hashable | None

Name identifier for the index.

"T"
data_name Hashable | None

Name for the internal pd.Index.

"time"

Examples:

>>> from sigalg.core import Time
>>> # Discrete time
>>> time_discrete = Time.discrete(start=0, length=5)
>>> time_discrete
Time 'T':
[0, 1, 2, 3, 4, 5]
>>> # Continuous time
>>> time_continuous = Time.continuous(start=0.0, stop=1.0, num_points=9)
>>> time_continuous
Time 'T':
[0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0]

continuous classmethod

continuous(
    start,
    stop,
    dt=None,
    num_points=None,
    name="T",
    data_name="time",
)

Create a continuous time index with real-valued time points.

Generates a time index with real-valued time points either by specifying the time step (dt) or the number of points (num_points). Exactly one of these parameters must be provided.

Parameters:

Name Type Description Default
start Real

Starting time point.

required
stop Real

Ending time point.

required
dt Real

Time step between consecutive points. Mutually exclusive with num_points.

None
num_points int

Number of evenly-spaced points to generate. Mutually exclusive with dt.

None
name Hashable | None

Name identifier for the index.

"T"
data_name Hashable | None

Name for the internal pd.Index.

"time"

Returns:

Name Type Description
time Time

A continuous time index with real-valued time points.

Raises:

Type Description
ValueError

If both dt and num_points are specified, or if neither is specified. Also raised if start is not less than stop, or if dt is not positive, or if num_points is less than 2.

TypeError

If start, stop, or dt (if given) are not real numbers, or if num_points (if given) is not an integer.

Examples:

>>> from sigalg.core import Time
>>> # Using num_points
>>> time1 = Time.continuous(start=0.0, stop=1.0, num_points=3)
>>> list(time1)
[0.0, 0.5, 1.0]
>>> # Using dt
>>> time2 = Time.continuous(start=0.0, stop=1.0, dt=0.25)
>>> len(time2)
5
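
The two ways of specifying the grid can be sketched in plain Python as follows. Both helpers are hypothetical. In particular, how SigAlg treats a dt that does not evenly divide stop - start is not stated here, so the sketch simply assumes that it does.

```python
def grid_from_num_points(start, stop, num_points):
    """Evenly spaced points with both endpoints included."""
    step = (stop - start) / (num_points - 1)
    return [start + i * step for i in range(num_points)]


def grid_from_dt(start, stop, dt):
    """Points spaced dt apart, assuming dt evenly divides stop - start."""
    num_steps = round((stop - start) / dt)
    return [start + i * dt for i in range(num_steps + 1)]


by_count = grid_from_num_points(0.0, 1.0, 3)
by_step = grid_from_dt(0.0, 1.0, 0.25)
print(by_count)
print(len(by_step))
```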

discrete classmethod

discrete(
    length=None,
    start=0,
    stop=None,
    name="T",
    data_name="time",
)

Create a discrete time index with integer time steps.

Generates a time index with consecutive integer time points starting from the specified start. The user may pass either the length of the time interval, or the stop value, but not both. The relation between the three parameters is length = stop - start.

Parameters:

Name Type Description Default
length int | None

Length of the time interval, i.e., stop - start, so the index contains length + 1 time points. Must be positive. Mutually exclusive with stop.

None
start int

Starting time point.

0
stop int | None

Ending time point. Mutually exclusive with length.

None
name Hashable | None

Name identifier for the index.

"T"
data_name Hashable | None

Name for the internal pd.Index.

"time"

Returns:

Name Type Description
time Time

A discrete time index with integer time points.

Raises:

Type Description
ValueError

If length is not a positive integer or if stop is not an integer greater than start, or if both length and stop are specified, or if neither is specified.

TypeError

If start is not an integer.

Examples:

>>> from sigalg.core import Time
>>> time = Time.discrete(start=0, length=5)
>>> list(time)
[0, 1, 2, 3, 4, 5]
>>> time.is_discrete
True
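
The relation length = stop - start, together with the output above, suggests that the stop value is included in the index. The plain-Python sketch below encodes that reading. The helper discrete_points is hypothetical, and the inclusive endpoint is an inference from the example rather than a documented guarantee.

```python
def discrete_points(start=0, length=None, stop=None):
    """Integer time points from start to stop, both included."""
    if (length is None) == (stop is None):
        raise ValueError("pass exactly one of length or stop")
    if stop is None:
        stop = start + length
    return list(range(start, stop + 1))


from_length = discrete_points(start=0, length=5)
from_stop = discrete_points(start=2, stop=4)
print(from_length)
print(from_stop)
```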

find_nearest_time

find_nearest_time(time_point)

Find the nearest time point to the given value.

Parameters:

Name Type Description Default
time_point Real

The value for which to find the nearest time point in the index.

required

Returns:

Name Type Description
time Real

The nearest time point in the Time index.

Raises:

Type Description
ValueError

If the Time index is empty.
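
A nearest-point lookup of this kind can be sketched in plain Python by minimizing the absolute distance. The helper nearest is hypothetical. When two time points are equally close, this sketch returns the earlier one, and the rule SigAlg applies in that case is not specified here.

```python
def nearest(times, time_point):
    """Return the element of `times` closest to `time_point`."""
    if not times:
        raise ValueError("the time index is empty")
    # min keeps the first of several equally close candidates.
    return min(times, key=lambda t: abs(t - time_point))


times = [0.0, 0.25, 0.5, 0.75, 1.0]
closest = nearest(times, 0.3)
print(closest)
```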

from_list

from_list(indices, is_discrete=True)

Create a Time from a list of time points.

The time points can represent either discrete time steps (integers) or continuous time points (real numbers). They must be monotonically increasing and are used as the temporal dimension for stochastic processes and other objects.

Parameters:

Name Type Description Default
indices list[Real]

List of real-valued time points to use for the index.

required
is_discrete bool

Whether the time index represents discrete (True) or continuous (False) time.

True

Returns:

Name Type Description
self Time

The current Time instance with updated indices.

insert_time

insert_time(time)

Insert a new time point into the Time index.

Parameters:

Name Type Description Default
time Real

The time point to insert. Must be an integer for discrete Time indices.

required

Raises:

Type Description
TypeError

If time is not a real number.

ValueError

If the Time index is empty or if time already exists in the Time index.

Returns:

Name Type Description
new_time Time

A new Time object with the inserted time point.
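
Since insert_time returns a new object, the operation can be pictured as a sorted insertion into a copy of the time points. The plain-Python sketch below uses the standard bisect module. The helper with_inserted is hypothetical and works on a list standing in for the Time index.

```python
import bisect


def with_inserted(times, time):
    """Return a new sorted list with `time` added; the input is not modified."""
    if not times:
        raise ValueError("the time index is empty")
    if time in times:
        raise ValueError("time already exists in the index")
    new_times = list(times)
    bisect.insort(new_times, time)  # keeps the copy in increasing order
    return new_times


original = [0, 1, 3]
updated = with_inserted(original, 2)
print(original)
print(updated)
```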

remove_time

remove_time(time=None, pos=None)

Remove a time point from the Time index.

Parameters:

Name Type Description Default
time Real | None

The time point to remove. Must be specified if pos is not provided.

None
pos int | None

The position of the time point to remove. Must be specified if time is not provided.

None

Raises:

Type Description
TypeError

If time is not a real number or pos is not an integer.

ValueError

If the Time index is empty, if time does not exist in the Time index, if pos is out of bounds, if both time and pos are provided, or if neither is provided.

Returns:

Name Type Description
new_time Time

A new Time object with the specified time point removed.

is_refinement

is_refinement(coarser_algebra, finer_algebra)

Check if one sigma algebra is a refinement of another.

Parameters:

Name Type Description Default
coarser_algebra SigmaAlgebra

The candidate coarser algebra.

required
finer_algebra SigmaAlgebra

The candidate finer algebra.

required

Returns:

Name Type Description
is_refinement bool

True if finer_algebra is a refinement of coarser_algebra, False otherwise.
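
Under the usual definition, one sigma algebra refines another when each of its atoms lies inside a single atom of the coarser one. The plain-Python sketch below tests this on sample-to-atom mappings. The helper refines is hypothetical, and this definition is an assumption about what is_refinement checks.

```python
def refines(finer, coarser):
    """Return True if every atom of `finer` sits inside one atom of `coarser`."""
    coarse_atom_of = {}
    for sample_id, fine_atom in finer.items():
        coarse_atom = coarser[sample_id]
        # Each fine atom must map to exactly one coarse atom.
        if coarse_atom_of.setdefault(fine_atom, coarse_atom) != coarse_atom:
            return False
    return True


coarser = {"s0": 0, "s1": 0, "s2": 1, "s3": 1}
finer = {"s0": "a", "s1": "b", "s2": "c", "s3": "c"}
forward = refines(finer, coarser)
backward = refines(coarser, finer)
print(forward, backward)
```

The second call fails because the coarse atom 0 is split across the fine atoms "a" and "b".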

is_subalgebra

is_subalgebra(sub_algebra, super_algebra)

Check if one sigma algebra is a subalgebra of another.

Parameters:

Name Type Description Default
sub_algebra SigmaAlgebra

The candidate subalgebra.

required
super_algebra SigmaAlgebra

The candidate superalgebra.

required

Returns:

Name Type Description
is_subalgebra bool

True if sub_algebra is a subalgebra of super_algebra, False otherwise.

join

join(sigma_algebras, name='join')

Compute the join (least upper bound) of a list of sigma algebras.

Parameters:

Name Type Description Default
sigma_algebras list[SigmaAlgebra]

A list of SigmaAlgebra instances to join.

required
name Hashable | None

Name identifier for the resulting sigma algebra.

"join"

Raises:

Type Description
TypeError

If the input is not a list of SigmaAlgebra instances.

ValueError

If the list is empty or if the SigmaAlgebra instances do not share the same sample space.