Statistics¶
Classes for statistical distributions
- class ionworkspipeline.data_fits.stats.Distribution(distribution=None)¶
Base class for sampling from probability distributions.
- property argmin: float | ndarray¶
The point at which the normalized negative logpdf (evaluate_to_scalar) achieves its minimum value of zero. The normalized negative logpdf is the negative logpdf shifted by a constant so that its minimum value is zero. For most distributions, the minimum occurs at the mode of the distribution.
Returns¶
- float or np.ndarray
The location of the minimum of the normalized negative logpdf.
Raises¶
- NotImplementedError
Raised by the base class; subclasses must implement this property.
- cdf(x: float | ndarray) float | ndarray ¶
Cumulative distribution function of the distribution.
Parameters¶
- xfloat or array_like
Points at which to evaluate the CDF.
Returns¶
- cdffloat or ndarray
Cumulative distribution function evaluated at x.
- property distribution: Any¶
Returns the underlying scipy.stats distribution object.
Returns¶
- distributionscipy.stats distribution
The underlying distribution object.
- evaluate_to_array(x: float | ndarray) ndarray ¶
Compute the residuals whose sum of squares equals evaluate_to_scalar(x). That is, np.sum(evaluate_to_array(x)**2) equals evaluate_to_scalar(x), up to floating-point rounding.
Parameters¶
- xfloat or np.ndarray
Point at which to evaluate.
Returns¶
- np.ndarray
Residuals whose squared sum equals the normalized negative logpdf.
- evaluate_to_scalar(x: float | ndarray) float ¶
Compute the negative log probability density function (negative logpdf), normalized such that the minimum value is zero. This minimum occurs at the distribution’s argmin.
Parameters¶
- xfloat or np.ndarray
Point at which to evaluate.
Returns¶
- float
Normalized negative logpdf, with minimum value of 0 at the distribution’s argmin.
- property multivariate: bool¶
Whether the distribution is multivariate.
Returns¶
- multivariatebool
True if the distribution is multivariate, False otherwise.
- pdf(x: float | ndarray) float | ndarray ¶
Probability density function of the distribution.
Parameters¶
- xfloat or array_like
Points at which to evaluate the PDF.
Returns¶
- pdffloat or ndarray
Probability density function evaluated at x.
- ppf(U: ndarray) ndarray ¶
Percent point function (inverse of the CDF). Transforms samples from the standard uniform distribution into samples from this distribution.
Parameters¶
- Unp.ndarray
Samples from the standard uniform distribution.
Returns¶
- samplesnp.ndarray
Transformed samples from the target distribution.
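The ppf contract above is inverse-transform sampling. The following is a minimal sketch of that idea and does not use ionworkspipeline itself: the standard-library NormalDist stands in for a distribution with a known inverse CDF, and the variable names are illustrative assumptions.

```python
import numpy as np
from statistics import NormalDist

# Stand-in distribution with a known inverse CDF (not an ionworkspipeline class).
dist = NormalDist(mu=1.0, sigma=2.0)

rng = np.random.default_rng(0)
U = rng.uniform(size=5)                               # samples from Uniform(0, 1)
samples = np.array([dist.inv_cdf(u) for u in U])      # ppf: uniform -> target distribution
roundtrip = np.array([dist.cdf(s) for s in samples])  # cdf maps the samples back to U
```

Applying the CDF to the transformed samples recovers the uniform inputs, which is what makes the ppf usable for quasi-random or stratified sampling schemes that start from uniform points.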
- rand(n: int | None = None) ndarray ¶
Draw random samples from the distribution.
Parameters¶
- nint, optional
Number of samples to draw.
Returns¶
- samplesndarray
Random samples from the distribution.
- property univariate: bool¶
Whether the distribution is univariate.
- class ionworkspipeline.data_fits.stats.Normal(mean: float, std: float)¶
Univariate normal distribution.
Extends:
ionworkspipeline.data_fits.stats.stats.Distribution
- property argmin: float¶
The point at which the normalized negative logpdf achieves its minimum value of zero. For the normal distribution, this occurs at the mean; the negative logpdf is shifted so that it equals zero there.
Returns¶
- float
The mean of the distribution.
- evaluate_to_array(x: float) ndarray ¶
Compute the residual for univariate normal: (x - μ) / σ / √2
When squared and summed, equals evaluate_to_scalar(x).
Parameters¶
- xfloat
Point at which to evaluate.
Returns¶
- np.ndarray
Residual whose square equals the normalized negative logpdf.
- evaluate_to_scalar(x: float) float ¶
Compute the normalized negative logpdf for univariate normal: ((x - μ)² / σ²) / 2
The minimum value of 0 occurs at x = μ (the mean).
Parameters¶
- xfloat
Point at which to evaluate.
Returns¶
- float
Normalized negative logpdf, equals 0 at x = μ.
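As a quick check of the two formulas above, the sketch below reimplements them in plain Python. It is independent of the library; the function names and the numeric values are assumptions chosen for illustration.

```python
import math

def normal_residual(x: float, mean: float, std: float) -> float:
    """Residual (x - mean) / (std * sqrt(2))."""
    return (x - mean) / (std * math.sqrt(2.0))

def normal_neg_logpdf_normalized(x: float, mean: float, std: float) -> float:
    """Normalized negative logpdf ((x - mean)^2 / std^2) / 2."""
    return 0.5 * ((x - mean) / std) ** 2

mean, std = 1.0, 2.0  # illustrative parameters
x = 4.0
r = normal_residual(x, mean, std)
s = normal_neg_logpdf_normalized(x, mean, std)
# r**2 and s agree up to rounding, and s is zero at x == mean.
```

Exposing the residual as well as the scalar lets a least-squares solver treat the prior term as one more residual alongside the data misfit.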
- class ionworkspipeline.data_fits.stats.MultivariateNormal(mean: ndarray, cov: ndarray)¶
Multivariate normal distribution.
Extends:
ionworkspipeline.data_fits.stats.stats.Distribution
- property argmin: ndarray¶
The point at which the normalized negative logpdf achieves its minimum value of zero. For the multivariate normal distribution, this occurs at the mean vector; the negative logpdf is shifted so that it equals zero there.
Returns¶
- np.ndarray
The mean vector of the distribution.
- property cholesky_cov: ndarray¶
Cholesky factor of the covariance matrix of the multivariate normal distribution.
- property cholesky_inv_cov: ndarray¶
Cholesky factor of the inverse covariance matrix of the multivariate normal distribution.
- property cov: ndarray¶
Covariance matrix of the multivariate normal distribution.
Returns¶
- covnp.ndarray
Covariance matrix of the distribution.
- evaluate_to_array(x: ndarray) ndarray ¶
Compute the residuals for multivariate normal: Σ^(-1/2) (x - μ) / √2
When squared and summed, equals evaluate_to_scalar(x).
Parameters¶
- xnp.ndarray
Point at which to evaluate.
Returns¶
- np.ndarray
Residuals whose squared sum equals the normalized negative logpdf.
- evaluate_to_scalar(x: ndarray) float ¶
Compute the normalized negative logpdf for multivariate normal: (x - μ)ᵀ Σ⁻¹ (x - μ) / 2
The minimum value of 0 occurs at x = μ (the mean vector).
Parameters¶
- xnp.ndarray
Point at which to evaluate.
Returns¶
- float
Normalized negative logpdf, equals 0 at x = μ.
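The residual and scalar forms above can be related with a whitening step. The sketch below is a NumPy-only illustration, not the library's code: it assumes the lower Cholesky factor L of Σ (Σ = L Lᵀ) is used as the square-root factor, which is one valid choice among several, and the function names are hypothetical.

```python
import numpy as np

mean = np.array([1.0, -2.0])             # illustrative mean vector
cov = np.array([[2.0, 0.6], [0.6, 1.0]])  # illustrative covariance (positive definite)

def mvn_residuals(x, mean, cov):
    # Whiten with the Cholesky factor: solve L r = (x - mean), then scale by 1/sqrt(2).
    L = np.linalg.cholesky(cov)
    return np.linalg.solve(L, x - mean) / np.sqrt(2.0)

def mvn_neg_logpdf_normalized(x, mean, cov):
    # (x - mean)^T cov^{-1} (x - mean) / 2, without forming the inverse explicitly.
    d = x - mean
    return 0.5 * d @ np.linalg.solve(cov, d)

x = np.array([0.5, 0.0])
r = mvn_residuals(x, mean, cov)
s = mvn_neg_logpdf_normalized(x, mean, cov)
```

Any factor W with WᵀW = Σ⁻¹ gives residuals with the same sum of squares; the residual vectors themselves differ between factors by an orthogonal transformation.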
- property inv_cholesky_cov: ndarray¶
Inverse Cholesky factor of the covariance matrix of the multivariate normal distribution.
- property inv_cov: ndarray¶
Inverse covariance matrix of the multivariate normal distribution.
- property mean: ndarray¶
Mean vector of the multivariate normal distribution.
Returns¶
- meannp.ndarray
Mean vector of the distribution.
- property multivariate: bool¶
Whether the distribution is multivariate.
Returns¶
- multivariatebool
Always True for multivariate normal.
- ppf(U: ndarray) ndarray ¶
Percent point function (inverse of the CDF). Transforms samples from the standard uniform distribution into samples from the multivariate normal distribution, using a vectorized Rosenblatt transform from Uniform(0, 1) samples to a correlated multivariate normal N(μ, Σ).
Note: Unlike univariate distributions, the PPF for multivariate distributions is not unique. Many different points in the multivariate space can have the same cumulative probability. This implementation uses the Rosenblatt transformation to create a deterministic mapping from uniform samples to the multivariate normal distribution while preserving the correlation structure.
Parameters¶
- Unp.ndarray
Samples from the standard uniform distribution.
Returns¶
- samplesnp.ndarray
Samples from the multivariate normal distribution.
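For a Gaussian target, one common way to realize such a transform is to map each uniform coordinate to a standard normal and then correlate the result with the lower Cholesky factor of Σ. The sketch below illustrates that construction under the assumption that U has shape (n_samples, n_dims); it is not taken from the library, and mvn_ppf_sketch is a hypothetical name.

```python
import numpy as np
from statistics import NormalDist

def mvn_ppf_sketch(U, mean, cov):
    """Map Uniform(0, 1) samples of shape (n, d) to N(mean, cov) samples."""
    U = np.atleast_2d(U)
    # Element-wise standard normal inverse CDF (stdlib, vectorized for convenience).
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    Z = inv_cdf(U)                   # independent standard normals
    L = np.linalg.cholesky(cov)      # cov = L @ L.T, L lower triangular
    return mean + Z @ L.T            # row i is mean + L @ Z[i]

mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.6], [0.6, 1.0]])
center = mvn_ppf_sketch(np.array([[0.5, 0.5]]), mean, cov)
batch = mvn_ppf_sketch(np.random.default_rng(0).uniform(size=(3, 2)), mean, cov)
```

Because L is lower triangular, coordinate k of each sample depends only on the first k uniform coordinates, which is the sequential conditioning that characterizes a Rosenblatt-type map. The all-0.5 input maps to the mean.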
- class ionworkspipeline.data_fits.stats.Uniform(lb: float, ub: float)¶
Uniform distribution.
Extends:
ionworkspipeline.data_fits.stats.stats.Distribution
- property argmin: float¶
The point at which the normalized negative logpdf achieves its minimum value of zero. For the uniform distribution, the negative logpdf equals zero for all points in [lb, ub] and infinity outside this interval. By convention, the midpoint of the interval is returned.
Returns¶
- float
(lb + ub) / 2, one of the points where the normalized negative logpdf equals zero.
- evaluate_to_array(x: float) ndarray ¶
Return a single-element array containing the square root of evaluate_to_scalar(x). Returns 0 for x in [lb, ub], infinity otherwise.
When squared, equals evaluate_to_scalar(x).
Parameters¶
- xfloat
Point at which to evaluate.
Returns¶
- np.ndarray
Single-element array containing 0 if x is in [lb, ub], infinity otherwise.
- evaluate_to_scalar(x: float) float ¶
Compute the normalized negative logpdf for uniform distribution. Returns 0 for x in [lb, ub], infinity otherwise.
The minimum value of 0 occurs for all x in [lb, ub].
Parameters¶
- xfloat
Point at which to evaluate.
Returns¶
- float
0 if x is in [lb, ub], infinity otherwise.
- property lb: float¶
Lower bound of the uniform distribution.
Returns¶
- lbfloat
Lower bound of the distribution.
- class ionworkspipeline.data_fits.stats.LogNormal(mean: float, std: float)¶
Univariate lognormal distribution.
Extends:
ionworkspipeline.data_fits.stats.stats.Distribution
- property argmin: float¶
The point at which the normalized negative logpdf achieves its minimum value of zero. For the lognormal distribution, this occurs at exp(μ - σ²), where μ and σ are the mean and standard deviation of the underlying normal distribution. The negative logpdf is shifted to be zero at this point.
Returns¶
- float
exp(mean - std²), where the normalized negative logpdf equals zero.
- evaluate_to_array(x: ndarray) ndarray ¶
Compute the single residual for univariate lognormal. For x > 0, computes: ((log(x) - μ) / σ + σ) / √2
When squared, equals evaluate_to_scalar(x).
Parameters¶
- xnp.ndarray
Point at which to evaluate.
Returns¶
- np.ndarray
Single residual whose square equals the normalized negative logpdf. Returns infinity for x ≤ 0.
- evaluate_to_scalar(x: ndarray) float ¶
Compute the normalized negative logpdf for univariate lognormal. For x > 0, computes: ((log(x) - μ) / σ + σ)² / 2
The minimum value of 0 occurs at x = exp(μ - σ²).
Parameters¶
- xnp.ndarray
Point at which to evaluate.
Returns¶
- float
Normalized negative logpdf, equals 0 at x = exp(μ - σ²). Returns infinity for x ≤ 0.
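The lognormal formula follows from completing the square in log(x): the unshifted negative logpdf is log(x) + (log(x) - μ)² / (2σ²) plus a constant, and subtracting its value at the mode leaves ((log(x) - μ) / σ + σ)² / 2. The sketch below checks this numerically in plain Python; the function names and parameter values are illustrative assumptions, not library code.

```python
import math

def lognormal_residual(x: float, mean: float, std: float) -> float:
    """((log(x) - mean) / std + std) / sqrt(2), or infinity for x <= 0."""
    if x <= 0:
        return math.inf
    return ((math.log(x) - mean) / std + std) / math.sqrt(2.0)

def unshifted_neg_logpdf(x: float, mean: float, std: float) -> float:
    """Negative logpdf without the normalizing constant."""
    return math.log(x) + (math.log(x) - mean) ** 2 / (2.0 * std ** 2)

mean, std = 0.5, 0.8                # parameters of the underlying normal
mode = math.exp(mean - std ** 2)    # documented argmin
x = 2.0
normalized = lognormal_residual(x, mean, std) ** 2
shifted = unshifted_neg_logpdf(x, mean, std) - unshifted_neg_logpdf(mode, mean, std)
```

Here normalized and shifted agree up to rounding, and the residual vanishes at the mode.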
- property mean: float¶
Mean of the underlying normal distribution.
Returns¶
- meanfloat
Mean of the underlying normal distribution.
- property multivariate: bool¶
Whether the distribution is multivariate.
Returns¶
- multivariatebool
Always False for univariate lognormal.
- class ionworkspipeline.data_fits.stats.MultivariateLogNormal(mean: ndarray, cov: ndarray)¶
Multivariate lognormal distribution.
Extends:
ionworkspipeline.data_fits.stats.stats.Distribution
- property argmin: ndarray¶
The point at which the normalized negative logpdf achieves its minimum value of zero. For the multivariate lognormal distribution, this occurs at exp(μ - Σ1), where μ is the mean vector and Σ is the covariance matrix of the underlying multivariate normal distribution, and Σ1 is the vector of row sums of Σ. For a diagonal Σ this reduces to exp(μ - diag(Σ)). The negative logpdf is shifted so that it equals zero at this point.
Returns¶
- np.ndarray
exp(mean - rowsum(cov)), where the normalized negative logpdf equals zero.
- cdf(x: ndarray) float ¶
Cumulative distribution function of the multivariate lognormal distribution.
Parameters¶
- xarray_like
Points at which to evaluate the CDF.
Returns¶
- cdffloat
Cumulative distribution function evaluated at x.
- property cov: ndarray¶
Covariance matrix of the underlying multivariate normal distribution.
Returns¶
- covnp.ndarray
Covariance matrix of the underlying multivariate normal distribution.
- evaluate_to_array(x: ndarray) ndarray ¶
Compute the residuals for multivariate lognormal. For x > 0, computes: Σ^(-1/2) (log(x) - μ + Σ1) / √2, where Σ1 is the vector of row sums of the covariance matrix.
When squared and summed, equals evaluate_to_scalar(x).
Parameters¶
- xnp.ndarray
Point at which to evaluate.
Returns¶
- np.ndarray
Residuals whose squared sum equals the normalized negative logpdf. Returns infinity for components where x ≤ 0.
- evaluate_to_scalar(x: ndarray) float ¶
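Compute the normalized negative logpdf for multivariate lognormal. For x > 0, computes: (log(x) - μ + Σ1)ᵀ Σ⁻¹ (log(x) - μ + Σ1) / 2, where Σ1 is the vector of row sums of the covariance matrix. This equals ((log(x) - μ)ᵀ Σ⁻¹ (log(x) - μ)) / 2 + sum(log(x)) up to the additive constant that makes the minimum zero.
The minimum value of 0 occurs at x = exp(μ - Σ1).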
Parameters¶
- xnp.ndarray
Point at which to evaluate.
Returns¶
- float
Normalized negative logpdf, equals 0 at x = exp(μ - Σ1), where Σ1 is the vector of row sums of Σ. Returns infinity if any component of x ≤ 0.
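Setting the gradient of (log(x) - μ)ᵀ Σ⁻¹ (log(x) - μ) / 2 + sum(log(x)) with respect to log(x) to zero gives log(x) = μ - Σ1, with Σ1 the row sums of Σ (equal to diag(Σ) only when Σ is diagonal). The NumPy sketch below checks this and the residual identity numerically. It is an illustration under stated assumptions: a Cholesky factor is used as the square-root factor, and all names are hypothetical rather than library API.

```python
import numpy as np

mean = np.array([0.2, -0.1])              # mean of the underlying normal
cov = np.array([[0.5, 0.1], [0.1, 0.3]])  # covariance of the underlying normal

def mvlognormal_residuals(x, mean, cov):
    if np.any(x <= 0):
        return np.full(x.shape, np.inf)
    L = np.linalg.cholesky(cov)
    shifted = np.log(x) - mean + cov.sum(axis=1)  # log(x) - mu + Sigma @ 1
    return np.linalg.solve(L, shifted) / np.sqrt(2.0)

def unshifted_neg_logpdf(x, mean, cov):
    # Quadratic form plus the Jacobian term sum(log(x)); constants dropped.
    d = np.log(x) - mean
    return 0.5 * d @ np.linalg.solve(cov, d) + np.sum(np.log(x))

argmin = np.exp(mean - cov.sum(axis=1))
x = np.array([1.5, 0.7])
normalized = np.sum(mvlognormal_residuals(x, mean, cov) ** 2)
shifted = unshifted_neg_logpdf(x, mean, cov) - unshifted_neg_logpdf(argmin, mean, cov)
```

The two quantities normalized and shifted agree up to rounding, and the residuals vanish at argmin.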
- property mean: ndarray¶
Mean vector of the underlying multivariate normal distribution.
Returns¶
- meannp.ndarray
Mean vector of the underlying multivariate normal distribution.
- property multivariate: bool¶
Whether the distribution is multivariate.
Returns¶
- multivariatebool
Always True for multivariate lognormal.
- pdf(x: ndarray) float ¶
Probability density function of the multivariate lognormal distribution.
Parameters¶
- xarray_like
Points at which to evaluate the PDF.
Returns¶
- pdffloat
Probability density function evaluated at x.
- ppf(U: ndarray) ndarray ¶
Percent point function (inverse of the CDF). Transforms samples from the standard uniform distribution into samples from the multivariate lognormal distribution.
Parameters¶
- Unp.ndarray
Samples from the standard uniform distribution.
Returns¶
- samplesnp.ndarray
Samples from the multivariate lognormal distribution.
- class ionworkspipeline.data_fits.stats.PointMass(value: float)¶
PointMass distribution (constant value).
Extends:
ionworkspipeline.data_fits.stats.stats.Distribution
- property argmin: float¶
The point at which the normalized negative logpdf achieves its minimum value of zero. For a point mass distribution, the negative logpdf equals zero only at the point mass value and infinity elsewhere.
Returns¶
- float
The point mass value, where the normalized negative logpdf equals zero.
- cdf(x: float | ndarray) float | ndarray ¶
Cumulative distribution function of the point mass distribution.
Parameters¶
- xfloat or array_like
Points at which to evaluate the CDF.
Returns¶
- cdffloat or ndarray
Cumulative distribution function evaluated at x (0 for x < value, 1 for x >= value).
- evaluate_to_array(x: float) ndarray ¶
Return a single-element array containing the square root of evaluate_to_scalar(x). Returns 0 if x equals the point mass value, infinity otherwise.
When squared, equals evaluate_to_scalar(x).
Parameters¶
- xfloat
Point at which to evaluate.
Returns¶
- np.ndarray
Single-element array containing 0 if x equals the point mass value, infinity otherwise.
- evaluate_to_scalar(x: float) float ¶
Compute the normalized negative logpdf for point mass distribution. Returns 0 if x equals the point mass value, infinity otherwise.
The minimum value of 0 occurs only at x = value.
Parameters¶
- xfloat
Point at which to evaluate.
Returns¶
- float
0 if x equals the point mass value, infinity otherwise.
- property multivariate: bool¶
Whether the distribution is multivariate.
Returns¶
- multivariatebool
Always False for point mass.
- pdf(x: float | ndarray) float | ndarray ¶
Probability density function of the point mass distribution.
Parameters¶
- xfloat or array_like
Points at which to evaluate the PDF.
Returns¶
- pdffloat or ndarray
Probability density function evaluated at x (infinity at the constant value, 0 elsewhere).
- ppf(U: ndarray) ndarray ¶
Percent point function (inverse of the CDF). Transforms samples from the standard uniform distribution into samples from the point mass distribution.
Parameters¶
- Unp.ndarray
Samples from the standard uniform distribution.
Returns¶
- samplesnp.ndarray
Samples from the point mass distribution (all equal to the constant value).
- rand(n: int | None = None) ndarray ¶
Draw random samples from the distribution.
Parameters¶
- nint, optional
Number of samples to draw.
Returns¶
- samplesndarray
Samples from the point mass distribution (all equal to the constant value).
- class ionworkspipeline.data_fits.stats.Dirichlet(alpha: ndarray)¶
Dirichlet distribution.
Extends:
ionworkspipeline.data_fits.stats.stats.Distribution
- property alpha: ndarray¶
Concentration parameters of the Dirichlet distribution.
Returns¶
- alphanp.ndarray
Concentration parameters.
- property argmin: ndarray¶
The point at which the normalized negative logpdf achieves its minimum value of zero. For the Dirichlet distribution, this occurs at the mode.
Returns¶
- np.ndarray
The mode of the distribution, (alpha_i - 1) / (sum(alpha) - K) for K components, which is defined when all alpha_i > 1.
- cdf(x: float | ndarray) float | ndarray ¶
Cumulative distribution function of the distribution.
Parameters¶
- xfloat or array_like
Points at which to evaluate the CDF.
Returns¶
- cdffloat or ndarray
Cumulative distribution function evaluated at x.
- evaluate_to_array(x: ndarray) ndarray ¶
Compute the residuals for Dirichlet distribution.
When squared and summed, equals evaluate_to_scalar(x).
Parameters¶
- xnp.ndarray
Point at which to evaluate.
Returns¶
- np.ndarray
Residuals whose squared sum equals the normalized negative logpdf.
- evaluate_to_scalar(x: ndarray) float ¶
Compute the normalized negative logpdf for Dirichlet distribution. For x on the simplex (x_i > 0 and sum(x) = 1), computes -sum((alpha_i - 1) * log(x_i)), shifted by a constant so that the minimum value is zero.
The minimum value of 0 occurs at the mode of the distribution.
Parameters¶
- xnp.ndarray
Point at which to evaluate.
Returns¶
- float
Normalized negative logpdf, equals 0 at the mode. Returns infinity if any x_i <= 0 or sum(x) != 1.
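A minimal NumPy sketch of this normalization, assuming all alpha_i > 1 so that the mode (alpha_i - 1) / (sum(alpha) - K) lies inside the simplex. The function names and the tolerance used for the sum(x) = 1 check are assumptions for illustration; the library's own handling may differ.

```python
import numpy as np

def dirichlet_mode(alpha):
    # Valid when every alpha_i > 1.
    return (alpha - 1.0) / (alpha.sum() - alpha.size)

def _raw(x, alpha):
    # Negative logpdf without the normalizing constant.
    return -np.sum((alpha - 1.0) * np.log(x))

def dirichlet_neg_logpdf_normalized(x, alpha):
    # Off the simplex the value is infinite (tolerance on the sum is an assumption).
    if np.any(x <= 0) or not np.isclose(x.sum(), 1.0):
        return np.inf
    return _raw(x, alpha) - _raw(dirichlet_mode(alpha), alpha)

alpha = np.array([2.0, 3.0, 4.0])
mode = dirichlet_mode(alpha)  # [1/6, 1/3, 1/2]
at_mode = dirichlet_neg_logpdf_normalized(mode, alpha)
elsewhere = dirichlet_neg_logpdf_normalized(np.array([0.3, 0.3, 0.4]), alpha)
off_simplex = dirichlet_neg_logpdf_normalized(np.array([0.5, 0.5, 0.5]), alpha)
```

Subtracting the value at the mode is what makes the minimum zero; the unshifted sum alone is strictly positive on the simplex when all alpha_i > 1.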
- property multivariate: bool¶
Whether the distribution is multivariate.
Returns¶
- multivariatebool
Always True for Dirichlet distribution.
- pdf(x: float | ndarray) float | ndarray ¶
Probability density function of the Dirichlet distribution.
Parameters¶
- xnp.ndarray
Points at which to evaluate the PDF. Must be on the simplex (all x_i > 0 and sum(x) = 1).
Returns¶
- pdffloat | np.ndarray
Probability density function evaluated at x. Returns 0 if any x_i <= 0 or sum(x) != 1.
- ppf(U: ndarray) ndarray ¶
Percent point function (inverse of CDF) for Dirichlet distribution.
For the Dirichlet distribution, this uses the property that if X_i ~ Gamma(alpha_i, 1) independently, then [X_1, X_2, …, X_k] / sum(X) ~ Dirichlet(alpha_1, alpha_2, …, alpha_k).
Parameters¶
- Unp.ndarray
Samples from the standard uniform distribution.
Returns¶
- samplesnp.ndarray
Samples from the Dirichlet distribution.
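The Gamma construction above can be sketched with scipy.stats.gamma.ppf: each uniform column is pushed through the inverse CDF of Gamma(alpha_i, 1), and each row is then normalized to sum to one. This is an illustration, not the library's implementation; it assumes U has shape (n_samples, k), and dirichlet_ppf_sketch is a hypothetical name.

```python
import numpy as np
from scipy.stats import gamma

def dirichlet_ppf_sketch(U, alpha):
    """Map Uniform(0, 1) samples of shape (n, k) to Dirichlet(alpha) samples."""
    U = np.atleast_2d(U)
    G = gamma.ppf(U, alpha)                    # column i ~ Gamma(alpha_i, 1); alpha broadcasts
    return G / G.sum(axis=1, keepdims=True)    # normalize each row onto the simplex

alpha = np.array([2.0, 3.0, 4.0])
U = np.random.default_rng(0).uniform(size=(4, 3))
samples = dirichlet_ppf_sketch(U, alpha)
```

Each row of samples is strictly positive and sums to one, so it lies on the simplex required by pdf and evaluate_to_scalar.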