Statistics¶

Classes for statistical distributions

class ionworkspipeline.data_fits.stats.Distribution(distribution=None)¶

Base class for sampling from probability distributions.

property argmin: float | ndarray¶

The point at which the normalized negative logpdf (evaluate_to_scalar) attains its minimum value of zero. The normalized negative logpdf is the negative logpdf shifted by a constant so that its minimum is zero. For most distributions, the minimum occurs at the mode of the distribution.

Returns¶

float or np.ndarray: The location of the minimum of the normalized negative logpdf.

Raises¶

NotImplementedError: Raised by the base class; subclasses must implement this property.

cdf(x: float | ndarray) → float | ndarray¶

Cumulative distribution function of the distribution.

Parameters¶

x (float or array_like): Points at which to evaluate the CDF.

Returns¶

cdf (float or ndarray): Cumulative distribution function evaluated at x.

property distribution: Any¶

Returns the underlying scipy.stats distribution object.

Returns¶

distribution (scipy.stats distribution): The underlying distribution object.

evaluate_to_array(x: float | ndarray) → ndarray¶

Compute the residuals whose sum of squares equals evaluate_to_scalar(x). That is, np.sum(evaluate_to_array(x)**2) == evaluate_to_scalar(x).

Parameters¶

x (float or np.ndarray): Point at which to evaluate.

Returns¶

np.ndarray: Residuals whose squared sum equals the normalized negative logpdf.

evaluate_to_scalar(x: float | ndarray) → float¶

Compute the negative log probability density function (negative logpdf), normalized such that the minimum value is zero. This minimum occurs at the distribution’s argmin.

Parameters¶

x (float or np.ndarray): Point at which to evaluate.

Returns¶

float: Normalized negative logpdf, with minimum value of 0 at the distribution’s argmin.

property multivariate: bool¶

Whether the distribution is multivariate.

Returns¶

multivariate (bool): True if the distribution is multivariate, False otherwise.

pdf(x: float | ndarray) → float | ndarray¶

Probability density function of the distribution.

Parameters¶

x (float or array_like): Points at which to evaluate the PDF.

Returns¶

pdf (float or ndarray): Probability density function evaluated at x.

ppf(U: ndarray) → ndarray¶

Percent point function (inverse of CDF) - transform samples from the standard uniform distribution to the distribution of the class.

Parameters¶

U (np.ndarray): Samples from the standard uniform distribution.

Returns¶

samples (np.ndarray): Transformed samples from the target distribution.
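The inverse-CDF mapping can be illustrated without the library. The sketch below is a stand-in under stated assumptions: `inverse_transform_sample` is a hypothetical helper, and the target distribution's PPF is taken from Python's `statistics.NormalDist.inv_cdf`, not from any ionworkspipeline class.

```python
import random
from statistics import NormalDist


def inverse_transform_sample(inv_cdf, uniforms):
    """Map Uniform(0, 1) samples through an inverse CDF (hypothetical helper)."""
    return [inv_cdf(u) for u in uniforms]


# Stand-in target distribution: N(mean=2, std=0.5) from the standard library.
target = NormalDist(mu=2.0, sigma=0.5)

rng = random.Random(0)
# Keep the uniforms strictly inside (0, 1), where inv_cdf is defined.
uniforms = [min(max(rng.random(), 1e-12), 1 - 1e-12) for _ in range(5)]
samples = inverse_transform_sample(target.inv_cdf, uniforms)

# The median of the uniform distribution maps to the median of the target.
median = inverse_transform_sample(target.inv_cdf, [0.5])[0]
```

Because the mapping is monotone, low-discrepancy or stratified uniform samples keep their ordering and spacing structure after the transform.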

rand(n: int | None = None) → ndarray¶

Draw random samples from the distribution.

Parameters¶

n (int, optional): Number of samples to draw.

Returns¶

samples (ndarray): Random samples from the distribution.

property univariate: bool¶

Whether the distribution is univariate.

property zero_variance: bool¶

Whether the distribution has zero variance.

Returns¶

has_zero_variance (bool): True if the distribution has zero variance, False otherwise.

class ionworkspipeline.data_fits.stats.Normal(mean: float, std: float)¶

Univariate normal distribution.

Extends: ionworkspipeline.data_fits.stats.stats.Distribution

property argmin: float¶

The point at which the normalized negative logpdf achieves its minimum value of zero. For the normal distribution, this occurs at the mean, where the negative logpdf is shifted to be zero.

Returns¶

float: The mean of the distribution.

evaluate_to_array(x: float) → ndarray¶

Compute the residual for univariate normal: (x - μ) / σ / √2

When squared and summed, equals evaluate_to_scalar(x).

Parameters¶

x (float): Point at which to evaluate.

Returns¶

np.ndarray: Residual whose square equals the normalized negative logpdf.

evaluate_to_scalar(x: float) → float¶

Compute the normalized negative logpdf for univariate normal: ((x - μ)² / σ²) / 2

The minimum value of 0 occurs at x = μ (the mean).

Parameters¶

x (float): Point at which to evaluate.

Returns¶

float: Normalized negative logpdf, equals 0 at x = μ.
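The relationship between the two formulas can be checked numerically. The following is a minimal standalone sketch, not the library's implementation; the function names are hypothetical and only the formulas are taken from the text.

```python
import math


def normal_neg_logpdf(x, mean, std):
    """Negative log-density of N(mean, std**2)."""
    return 0.5 * ((x - mean) / std) ** 2 + math.log(std * math.sqrt(2 * math.pi))


def normal_residual(x, mean, std):
    """Residual (x - mean) / std / sqrt(2), as given in the text."""
    return (x - mean) / std / math.sqrt(2)


def normal_scalar(x, mean, std):
    """Normalized negative logpdf: ((x - mean)**2 / std**2) / 2."""
    return ((x - mean) ** 2 / std**2) / 2


mean, std, x = 1.0, 0.2, 1.3

# Shifting the negative logpdf by its value at the mean gives the scalar form.
shifted = normal_neg_logpdf(x, mean, std) - normal_neg_logpdf(mean, mean, std)
scalar = normal_scalar(x, mean, std)
residual = normal_residual(x, mean, std)
```

Here `shifted`, `scalar`, and `residual**2` agree up to floating-point error, which is what lets the same prior be used by both scalar minimizers and least-squares solvers.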

property mean: float¶

Mean of the normal distribution.

Returns¶

mean (float): Mean of the distribution.

property multivariate: bool¶

Whether the distribution is multivariate.

Returns¶

multivariate (bool): Always False for univariate normal.

property std: float¶

Standard deviation of the normal distribution.

Returns¶

std (float): Standard deviation of the distribution.

class ionworkspipeline.data_fits.stats.MultivariateNormal(mean: ndarray, cov: ndarray)¶

Multivariate normal distribution.

Extends: ionworkspipeline.data_fits.stats.stats.Distribution

property argmin: ndarray¶

The point at which the normalized negative logpdf achieves its minimum value of zero. For the multivariate normal distribution, this occurs at the mean vector, where the negative logpdf is shifted to be zero.

Returns¶

np.ndarray: The mean vector of the distribution.

property cholesky_cov: ndarray¶

Cholesky factor of the covariance matrix of the multivariate normal distribution.

property cholesky_inv_cov: ndarray¶

Cholesky factor of the inverse covariance matrix of the multivariate normal distribution.

property cov: ndarray¶

Covariance matrix of the multivariate normal distribution.

Returns¶

cov (np.ndarray): Covariance matrix of the distribution.

evaluate_to_array(x: ndarray) → ndarray¶

Compute the residuals for multivariate normal: Σ^(-1/2) (x - μ) / √2

When squared and summed, equals evaluate_to_scalar(x).

Parameters¶

x (np.ndarray): Point at which to evaluate.

Returns¶

np.ndarray: Residuals whose squared sum equals the normalized negative logpdf.

evaluate_to_scalar(x: ndarray) → float¶

Compute the normalized negative logpdf for multivariate normal: (x - μ)ᵀ Σ⁻¹ (x - μ) / 2

The minimum value of 0 occurs at x = μ (the mean vector).

Parameters¶

x (np.ndarray): Point at which to evaluate.

Returns¶

float: Normalized negative logpdf, equals 0 at x = μ.
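A whitening transform turns the quadratic form into a sum of squared residuals. The sketch below assumes the Cholesky factor is used as the square root of the covariance; the text does not state which factorization the library applies, and the function names are hypothetical.

```python
import numpy as np


def mvn_residuals(x, mean, cov):
    """Whitened residuals L^{-1} (x - mean) / sqrt(2), with cov = L @ L.T."""
    chol = np.linalg.cholesky(cov)  # lower-triangular factor
    return np.linalg.solve(chol, x - mean) / np.sqrt(2)


def mvn_scalar(x, mean, cov):
    """Quadratic form (x - mean)^T cov^{-1} (x - mean) / 2."""
    diff = x - mean
    return float(diff @ np.linalg.solve(cov, diff) / 2)


mean = np.array([1.0, -0.5])
cov = np.array([[0.5, 0.1], [0.1, 0.2]])
x = np.array([1.4, -0.1])

residuals = mvn_residuals(x, mean, cov)
scalar = mvn_scalar(x, mean, cov)
```

Any matrix W with Wᵀ W = Σ⁻¹ gives the same sum of squares, so a different choice of square root changes the individual residuals but not the scalar value.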

property inv_cholesky_cov: ndarray¶

Inverse Cholesky factor of the covariance matrix of the multivariate normal distribution.

property inv_cov: ndarray¶

Inverse covariance matrix of the multivariate normal distribution.

property mean: ndarray¶

Mean vector of the multivariate normal distribution.

Returns¶

mean (np.ndarray): Mean vector of the distribution.

property multivariate: bool¶

Whether the distribution is multivariate.

Returns¶

multivariate (bool): Always True for multivariate normal.

ppf(U: ndarray) → ndarray¶

Percent point function (inverse of CDF) - transform samples from the standard uniform distribution to the multivariate normal distribution. Vectorized Rosenblatt transform from Uniform(0,1) samples to a correlated multivariate normal N(μ, Σ).

Note: Unlike univariate distributions, the PPF for multivariate distributions is not unique. Many different points in the multivariate space can have the same cumulative probability. This implementation uses the Rosenblatt transformation to create a deterministic mapping from uniform samples to the multivariate normal distribution while preserving the correlation structure.

Parameters¶

U (np.ndarray): Samples from the standard uniform distribution.

Returns¶

samples (np.ndarray): Samples from the multivariate normal distribution.
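For a Gaussian target, one common way to realize an inverse Rosenblatt transform is to map each uniform coordinate to a standard normal and then apply the lower-triangular Cholesky factor. The sketch below follows that approach as an assumption; `mvn_ppf` is a hypothetical function, and the library's actual implementation may differ.

```python
from statistics import NormalDist

import numpy as np

# Elementwise standard-normal inverse CDF built from the standard library.
_std_normal_ppf = np.vectorize(NormalDist().inv_cdf)


def mvn_ppf(U, mean, cov):
    """Map rows of U in (0, 1)^d to correlated N(mean, cov) points."""
    U = np.atleast_2d(U)
    z = _std_normal_ppf(U)          # independent standard normals
    chol = np.linalg.cholesky(cov)  # cov = chol @ chol.T
    return mean + z @ chol.T        # introduce the correlation


mean = np.array([1.0, -0.5])
cov = np.array([[0.5, 0.1], [0.1, 0.2]])

rng = np.random.default_rng(0)
U = rng.uniform(1e-9, 1 - 1e-9, size=(2000, 2))
samples = mvn_ppf(U, mean, cov)

# The centre of the unit cube maps to the mean.
centre = mvn_ppf(np.array([0.5, 0.5]), mean, cov)
```

The mapping is deterministic: the same `U` always yields the same points, and the sample covariance of `samples` approaches `cov` as the number of rows grows.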

class ionworkspipeline.data_fits.stats.Uniform(lb: float, ub: float)¶

Uniform distribution.

Extends: ionworkspipeline.data_fits.stats.stats.Distribution

property argmin: float¶

The point at which the normalized negative logpdf achieves its minimum value of zero. For the uniform distribution, the negative logpdf equals zero for all points in [lb, ub] and infinity outside this interval. By convention, we return the midpoint of the interval.

Returns¶

float: (lb + ub) / 2, one of the points where the normalized negative logpdf equals zero.

evaluate_to_array(x: float) → ndarray¶

Return a single-element array containing the square root of evaluate_to_scalar(x). Returns 0 for x in [lb, ub], infinity otherwise.

When squared, equals evaluate_to_scalar(x).

Parameters¶

x (float): Point at which to evaluate.

Returns¶

np.ndarray: Single-element array containing 0 if x is in [lb, ub], infinity otherwise.

evaluate_to_scalar(x: float) → float¶

Compute the normalized negative logpdf for uniform distribution. Returns 0 for x in [lb, ub], infinity otherwise.

The minimum value of 0 occurs for all x in [lb, ub].

Parameters¶

x (float): Point at which to evaluate.

Returns¶

float: 0 if x is in [lb, ub], infinity otherwise.
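In a fit, this behaviour makes the uniform distribution act as a hard bound: it adds nothing to the objective inside the interval and rules out every point outside it. A minimal standalone sketch, with hypothetical function names:

```python
import math


def uniform_scalar(x, lb, ub):
    """0 inside [lb, ub], infinity outside."""
    return 0.0 if lb <= x <= ub else math.inf


def uniform_residual(x, lb, ub):
    """Single-element list whose square equals uniform_scalar."""
    return [math.sqrt(uniform_scalar(x, lb, ub))]


inside = uniform_scalar(0.3, 0.0, 1.0)
outside = uniform_scalar(1.5, 0.0, 1.0)
```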

property lb: float¶

Lower bound of the uniform distribution.

Returns¶

lb (float): Lower bound of the distribution.

property multivariate: bool¶

Whether the distribution is multivariate.

Returns¶

multivariate (bool): Always False for univariate uniform.

property ub: float¶

Upper bound of the uniform distribution.

Returns¶

ub (float): Upper bound of the distribution.

class ionworkspipeline.data_fits.stats.LogNormal(mean: float, std: float)¶

Univariate lognormal distribution.

Extends: ionworkspipeline.data_fits.stats.stats.Distribution

property argmin: float¶

The point at which the normalized negative logpdf achieves its minimum value of zero. For the lognormal distribution, this occurs at exp(μ - σ²), where μ and σ are the mean and standard deviation of the underlying normal distribution. The negative logpdf is shifted to be zero at this point.

Returns¶

float: exp(mean - std²), where the normalized negative logpdf equals zero.

evaluate_to_array(x: ndarray) → ndarray¶

Compute the single residual for univariate lognormal. For x > 0, computes: ((log(x) - μ) / σ + σ) / √2

When squared, equals evaluate_to_scalar(x).

Parameters¶

x (np.ndarray): Point at which to evaluate.

Returns¶

np.ndarray: Single residual whose square equals the normalized negative logpdf. Returns infinity for x ≤ 0.

evaluate_to_scalar(x: ndarray) → float¶

Compute the normalized negative logpdf for univariate lognormal. For x > 0, computes: ((log(x) - μ) / σ + σ)² / 2

The minimum value of 0 occurs at x = exp(μ - σ²).

Parameters¶

x (np.ndarray): Point at which to evaluate.

Returns¶

float: Normalized negative logpdf, equals 0 at x = exp(μ - σ²). Returns infinity for x ≤ 0.
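The normalized form follows from completing the square in log(x). The negative logpdf is log(x) + (log(x) - μ)² / (2σ²) plus a constant; subtracting its value at the mode exp(μ - σ²) leaves ((log(x) - μ) / σ + σ)² / 2. The standalone sketch below checks this numerically; the function names are hypothetical.

```python
import math


def lognormal_neg_logpdf(x, mu, sigma):
    """Negative log-density of a lognormal with underlying N(mu, sigma**2)."""
    return (
        math.log(x)
        + math.log(sigma * math.sqrt(2 * math.pi))
        + (math.log(x) - mu) ** 2 / (2 * sigma**2)
    )


def lognormal_scalar(x, mu, sigma):
    """Normalized form from the text; infinity for x <= 0."""
    if x <= 0:
        return math.inf
    return ((math.log(x) - mu) / sigma + sigma) ** 2 / 2


mu, sigma = 0.3, 0.4
mode = math.exp(mu - sigma**2)
x = 2.0

# Shift the negative logpdf so that it is zero at the mode.
shifted = lognormal_neg_logpdf(x, mu, sigma) - lognormal_neg_logpdf(mode, mu, sigma)
scalar = lognormal_scalar(x, mu, sigma)
```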

property mean: float¶

Mean of the underlying normal distribution.

Returns¶

mean (float): Mean of the underlying normal distribution.

property multivariate: bool¶

Whether the distribution is multivariate.

Returns¶

multivariate (bool): Always False for univariate lognormal.

ppf(U: ndarray) → ndarray¶

Percent point function (inverse of CDF) - transform samples from the standard uniform distribution to the lognormal distribution.

Parameters¶

U (np.ndarray): Samples from the standard uniform distribution.

Returns¶

samples (np.ndarray): Samples from the lognormal distribution.

property std: float¶

Standard deviation of the underlying normal distribution.

Returns¶

std (float): Standard deviation of the underlying normal distribution.

class ionworkspipeline.data_fits.stats.MultivariateLogNormal(mean: ndarray, cov: ndarray)¶

Multivariate lognormal distribution.

Extends: ionworkspipeline.data_fits.stats.stats.Distribution

property argmin: ndarray¶

The point at which the normalized negative logpdf achieves its minimum value of zero. For the multivariate lognormal distribution, this occurs at exp(μ - Σ1), where μ is the mean vector, Σ is the covariance matrix of the underlying multivariate normal distribution, and Σ1 is the vector of row sums of Σ. The negative logpdf is shifted to be zero at this point.

Returns¶

np.ndarray: exp(mean - rowsum(cov)), where the normalized negative logpdf equals zero.

cdf(x: ndarray) → float¶

Cumulative distribution function of the multivariate lognormal distribution.

Parameters¶

x (array_like): Points at which to evaluate the CDF.

Returns¶

cdf (float): Cumulative distribution function evaluated at x.

property cov: ndarray¶

Covariance matrix of the underlying multivariate normal distribution.

Returns¶

cov (np.ndarray): Covariance matrix of the underlying multivariate normal distribution.

evaluate_to_array(x: ndarray) → ndarray¶

Compute the residuals for multivariate lognormal. For x > 0, computes: Σ^(-1/2) (log(x) - μ + Σ1) / √2, where Σ1 is the vector of row sums of the covariance matrix.

When squared and summed, equals evaluate_to_scalar(x).

Parameters¶

x (np.ndarray): Point at which to evaluate.

Returns¶

np.ndarray: Residuals whose squared sum equals the normalized negative logpdf. Returns infinity for components where x ≤ 0.
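The row-sum term comes from minimizing the x-dependent part of the negative logpdf, (log(x) - μ)ᵀ Σ⁻¹ (log(x) - μ) / 2 + sum(log(x)). Setting its gradient with respect to log(x) to zero gives log(x) = μ - Σ1. The sketch below checks this numerically for x > 0; it assumes a Cholesky factor as the square root of Σ, and the function names are hypothetical.

```python
import numpy as np


def mv_lognormal_residuals(x, mean, cov):
    """Residuals L^{-1} (log(x) - mean + cov @ 1) / sqrt(2), with cov = L @ L.T."""
    shift = cov.sum(axis=1)  # row sums of the covariance matrix
    chol = np.linalg.cholesky(cov)
    return np.linalg.solve(chol, np.log(x) - mean + shift) / np.sqrt(2)


def mv_lognormal_terms(x, mean, cov):
    """x-dependent part of the negative logpdf: quadratic form / 2 + sum(log(x))."""
    diff = np.log(x) - mean
    return float(diff @ np.linalg.solve(cov, diff) / 2 + np.sum(np.log(x)))


mean = np.array([0.2, -0.1])
cov = np.array([[0.3, 0.1], [0.1, 0.2]])
x = np.array([1.5, 0.7])

minimizer = np.exp(mean - cov.sum(axis=1))
scalar = float(np.sum(mv_lognormal_residuals(x, mean, cov) ** 2))

# Shifting by the value at the minimizer reproduces the sum of squared residuals.
shifted = mv_lognormal_terms(x, mean, cov) - mv_lognormal_terms(minimizer, mean, cov)
```

The row sums reduce to the diagonal of Σ only when Σ is diagonal, in which case each component behaves like an independent univariate lognormal.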

evaluate_to_scalar(x: ndarray) → float¶

Compute the normalized negative logpdf for multivariate lognormal. For x > 0, computes: ((log(x) - μ)ᵀ Σ⁻¹ (log(x) - μ)) / 2 + sum(log(x)), shifted by a constant so that the minimum value is zero.

The minimum value of 0 occurs at x = exp(μ - Σ1), where Σ1 is the vector of row sums of Σ. This equals exp(μ - diag(Σ)) only when Σ is diagonal.

Parameters¶

x (np.ndarray): Point at which to evaluate.

Returns¶

float: Normalized negative logpdf, equals 0 at x = exp(μ - Σ1). Returns infinity if any component of x ≤ 0.

property mean: ndarray¶

Mean vector of the underlying multivariate normal distribution.

Returns¶

mean (np.ndarray): Mean vector of the underlying multivariate normal distribution.

property multivariate: bool¶

Whether the distribution is multivariate.

Returns¶

multivariate (bool): Always True for multivariate lognormal.

pdf(x: ndarray) → float¶

Probability density function of the multivariate lognormal distribution.

Parameters¶

x (array_like): Points at which to evaluate the PDF.

Returns¶

pdf (float): Probability density function evaluated at x.

ppf(U: ndarray) → ndarray¶

Percent point function (inverse of CDF) - transform samples from the standard uniform distribution to the multivariate lognormal distribution.

Parameters¶

U (np.ndarray): Samples from the standard uniform distribution.

Returns¶

samples (np.ndarray): Samples from the multivariate lognormal distribution.

rand(n: int | None = None) → ndarray¶

Draw random samples from the distribution.

Parameters¶

n (int, optional): Number of samples to draw.

Returns¶

samples (np.ndarray): Random samples from the multivariate lognormal distribution.

class ionworkspipeline.data_fits.stats.PointMass(value: float)¶

PointMass distribution (constant value).

Extends: ionworkspipeline.data_fits.stats.stats.Distribution

property argmin: float¶

The point at which the normalized negative logpdf achieves its minimum value of zero. For a point mass distribution, the negative logpdf equals zero only at the point mass value and infinity elsewhere.

Returns¶

float: The point mass value, where the normalized negative logpdf equals zero.

cdf(x: float | ndarray) → float | ndarray¶

Cumulative distribution function of the point mass distribution.

Parameters¶

x (float or array_like): Points at which to evaluate the CDF.

Returns¶

cdf (float or ndarray): Cumulative distribution function evaluated at x (0 for x < value, 1 for x >= value).

evaluate_to_array(x: float) → ndarray¶

Return a single-element array containing the square root of evaluate_to_scalar(x). Returns 0 if x equals the point mass value, infinity otherwise.

When squared, equals evaluate_to_scalar(x).

Parameters¶

x (float): Point at which to evaluate.

Returns¶

np.ndarray: Single-element array containing 0 if x equals the point mass value, infinity otherwise.

evaluate_to_scalar(x: float) → float¶

Compute the normalized negative logpdf for point mass distribution. Returns 0 if x equals the point mass value, infinity otherwise.

The minimum value of 0 occurs only at x = value.

Parameters¶

x (float): Point at which to evaluate.

Returns¶

float: 0 if x equals the point mass value, infinity otherwise.

property multivariate: bool¶

Whether the distribution is multivariate.

Returns¶

multivariate (bool): Always False for point mass.

pdf(x: float | ndarray) → float | ndarray¶

Probability density function of the point mass distribution.

Parameters¶

x (float or array_like): Points at which to evaluate the PDF.

Returns¶

pdf (float or ndarray): Probability density function evaluated at x (infinity at the constant value, 0 elsewhere).

ppf(U: ndarray) → ndarray¶

Percent point function (inverse of CDF) - transform samples from the standard uniform distribution to the point mass distribution.

Parameters¶

U (np.ndarray): Samples from the standard uniform distribution.

Returns¶

samples (np.ndarray): Samples from the point mass distribution (all equal to the constant value).

rand(n: int | None = None) → ndarray¶

Draw random samples from the distribution.

Parameters¶

n (int, optional): Number of samples to draw.

Returns¶

samples (ndarray): Samples from the point mass distribution (all equal to the constant value).

property value: float¶

The constant value of the point mass distribution.

Returns¶

value (float): The constant value.

property zero_variance: bool¶

Whether the distribution has zero variance.

Returns¶

has_zero_variance (bool): Always True for point mass.

class ionworkspipeline.data_fits.stats.Dirichlet(alpha: ndarray)¶

Dirichlet distribution.

Extends: ionworkspipeline.data_fits.stats.stats.Distribution

property alpha: ndarray¶

Concentration parameters of the Dirichlet distribution.

Returns¶

alpha (np.ndarray): Concentration parameters.

property argmin: ndarray¶

The point at which the normalized negative logpdf achieves its minimum value of zero. For the Dirichlet distribution, this occurs at the mode.

Returns¶

np.ndarray: The mode of the distribution, which is defined when all alpha_i > 1.

cdf(x: float | ndarray) → float | ndarray¶

Cumulative distribution function of the distribution.

Parameters¶

x (float or array_like): Points at which to evaluate the CDF.

Returns¶

cdf (float or ndarray): Cumulative distribution function evaluated at x.

evaluate_to_array(x: ndarray) → ndarray¶

Compute the residuals for Dirichlet distribution.

When squared and summed, equals evaluate_to_scalar(x).

Parameters¶

x (np.ndarray): Point at which to evaluate.

Returns¶

np.ndarray: Residuals whose squared sum equals the normalized negative logpdf.

evaluate_to_scalar(x: ndarray) → float¶

Compute the normalized negative logpdf for the Dirichlet distribution. For x on the simplex (x_i > 0 and sum(x) = 1), computes: -sum((alpha_i - 1) * log(x_i)), shifted by a constant so that the minimum value is zero.

The minimum value of 0 occurs at the mode of the distribution.

Parameters¶

x (np.ndarray): Point at which to evaluate.

Returns¶

float: Normalized negative logpdf, equals 0 at the mode. Returns infinity if any x_i <= 0 or sum(x) != 1.
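When every alpha_i > 1, the Dirichlet mode has the closed form (alpha_i - 1) / (sum(alpha) - K), with K the number of components. The sketch below uses that formula to shift the negative logpdf to zero at the mode. It is a standalone illustration with hypothetical function names, and the tolerance used for the sum(x) = 1 check is an assumption, not the library's rule.

```python
import numpy as np


def dirichlet_mode(alpha):
    """Mode (alpha_i - 1) / (sum(alpha) - K), valid when every alpha_i > 1."""
    alpha = np.asarray(alpha, dtype=float)
    return (alpha - 1) / (alpha.sum() - alpha.size)


def dirichlet_scalar(x, alpha):
    """-sum((alpha_i - 1) * log(x_i)), shifted to be zero at the mode."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    # Assumed tolerance for the simplex check; off-simplex points get infinity.
    if np.any(x <= 0) or not np.isclose(x.sum(), 1.0):
        return np.inf
    unshifted = -np.sum((alpha - 1) * np.log(x))
    at_mode = -np.sum((alpha - 1) * np.log(dirichlet_mode(alpha)))
    return float(unshifted - at_mode)


alpha = np.array([2.0, 3.0, 4.0])
mode = dirichlet_mode(alpha)
value_at_mode = dirichlet_scalar(mode, alpha)
value_at_centre = dirichlet_scalar(np.full(3, 1 / 3), alpha)
```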

property multivariate: bool¶

Whether the distribution is multivariate.

Returns¶

multivariate (bool): Always True for Dirichlet distribution.

pdf(x: float | ndarray) → float | ndarray¶

Probability density function of the Dirichlet distribution.

Parameters¶

x (np.ndarray): Points at which to evaluate the PDF. Must be on the simplex (all x_i > 0 and sum(x) = 1).

Returns¶

pdf (float | np.ndarray): Probability density function evaluated at x. Returns 0 if any x_i <= 0 or sum(x) != 1.

ppf(U: ndarray) → ndarray¶

Percent point function (inverse of CDF) for Dirichlet distribution.

For Dirichlet, we use the property that if X_i ~ Gamma(alpha_i, 1) independently, then [X_1, X_2, …, X_k] / sum(X) ~ Dirichlet(alpha_1, alpha_2, …, alpha_k).

Parameters¶

U (np.ndarray): Samples from the standard uniform distribution.

Returns¶

samples (np.ndarray): Samples from the Dirichlet distribution.
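The Gamma normalization property can be demonstrated with NumPy alone. Note the simplification: this sketch draws the Gamma variates directly with a random generator instead of passing uniform samples U through a Gamma inverse CDF, so it illustrates the property rather than the ppf itself. `dirichlet_from_gamma` is a hypothetical helper.

```python
import numpy as np


def dirichlet_from_gamma(alpha, n, rng):
    """Draw n Dirichlet(alpha) samples by normalizing independent Gamma draws."""
    alpha = np.asarray(alpha, dtype=float)
    # One Gamma(alpha_i, 1) variate per component and per sample.
    gammas = rng.gamma(shape=alpha, scale=1.0, size=(n, alpha.size))
    # Dividing each row by its sum projects it onto the simplex.
    return gammas / gammas.sum(axis=1, keepdims=True)


alpha = np.array([2.0, 3.0, 4.0])
samples = dirichlet_from_gamma(alpha, 5000, np.random.default_rng(0))
```

Each row of `samples` is positive and sums to one, and the column means approach alpha / sum(alpha) as the number of samples grows.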