Statistics on single collections of uncertain data
These estimators operate on collections of uncertain values. Each element of such a collection can be an uncertain value of any type, such as populations, theoretical distributions, KDE distributions or fitted distributions.
The methods compute the statistic in question by drawing a length-k realisation of the k-element collection. Realisations are drawn by sampling each uncertain point in the collection independently. The statistic is then computed on either a single such realisation (yielding a single value for the statistic) or over multiple realisations (yielding a distribution of the statistic).
Syntax
The syntax for computing a statistic f on a single uncertain value collection x is
- f(x::UVAL_COLLECTION_TYPES), which resamples x once, assuming no element-wise dependence between the elements of x, and computes the statistic on that single draw.
- f(x::UVAL_COLLECTION_TYPES, n::Int, args...; kwargs...), which resamples x n times, assuming no element-wise dependence between the elements of x, then computes the statistic on each of those n independent draws. Returns a distribution of estimates of the statistic.
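As a rough illustration of the two call forms, the sketch below mimics them using only Julia's standard library. It does not use the package's own types: each uncertain value is assumed to be a normal distribution stored as a hypothetical (mean, standard deviation) tuple, and realise and stat are illustrative names, not library functions.

```julia
using Random, Statistics

# Hypothetical stand-in for an uncertain value collection:
# each element is a normal distribution given as (mean, standard deviation).
x = [(1.0, 0.2), (2.0, 0.1), (3.0, 0.3), (4.0, 0.2)]

# Draw one length-L realisation by sampling every element independently.
realise(rng, x) = [μ + σ * randn(rng) for (μ, σ) in x]

# Analogue of f(x): one realisation, one value of the statistic.
stat(f, rng, x) = f(realise(rng, x))

# Analogue of f(x, n, args...; kwargs...): n realisations, n values.
stat(f, rng, x, n::Int, args...; kwargs...) =
    [f(realise(rng, x), args...; kwargs...) for _ in 1:n]

rng = MersenneTwister(42)
single = stat(mean, rng, x)               # a single estimate
many = stat(mean, rng, x, 1000)           # a distribution of 1000 estimates
medians = stat(quantile, rng, x, 1000, 0.5)  # extra positional argument passed on to f
```

The vector returned by the second form can then be summarised like any other sample, for example with quantile(many, [0.025, 0.975]).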
Methods
Mean
Statistics.mean - Method
mean(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the mean of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the mean is computed for each of those length-L realisations, yielding a distribution of mean estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the mean for the realisation.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the mean of x, which are returned as a vector.
Mode
StatsBase.mode - Method
mode(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the mode of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the mode is computed for each of those length-L realisations, yielding a distribution of mode estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the mode of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the mode of x, which are returned as a vector.
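The mode is most informative when the realisations take repeated values. The sketch below therefore assumes, purely for illustration, that each uncertain value is a discrete population of equally likely integer states; realise, most_frequent and mode_draws are hypothetical names, and the tie-breaking rule is a choice made for this example rather than a statement about the library.

```julia
using Random

# Hypothetical discrete populations: each element lists its equally likely states.
x = [[1, 2], [2, 3], [2], [3, 4], [2, 4]]

# One realisation: pick one state from every population independently.
realise(rng, x) = [rand(rng, pop) for pop in x]

# Most frequent value of v; on ties, the value that reaches the top count first wins.
function most_frequent(v)
    counts = Dict{eltype(v),Int}()
    best, bestcount = first(v), 0
    for e in v
        c = counts[e] = get(counts, e, 0) + 1
        if c > bestcount
            best, bestcount = e, c
        end
    end
    return best
end

mode_draws(rng, x, n::Int) = [most_frequent(realise(rng, x)) for _ in 1:n]

rng = MersenneTwister(4)
modes = mode_draws(rng, x, 200)   # one mode per realisation
```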
Quantile
Statistics.quantile - Method
quantile(x::UVAL_COLLECTION_TYPES, q, n::Int)
Obtain a distribution for the quantile(s) q of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the quantile is computed for each of those length-L realisations, yielding a distribution of quantile estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the quantile of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the quantile of x, which are returned as a vector.
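The "quantile(s)" wording suggests q may be a single probability or several. The following sketch shows both cases under stated assumptions: uncertain values are modelled as hypothetical (mean, standard deviation) normal distributions, and quantile_draws is an illustrative name. In this sketch a vector q makes every draw yield a vector of quantiles; whether the library arranges its output the same way is not confirmed by the text above.

```julia
using Random, Statistics

# Hypothetical uncertain values: normal distributions as (mean, standard deviation).
x = [(1.0, 0.2), (2.0, 0.1), (3.0, 0.3), (4.0, 0.2), (5.0, 0.1)]
realise(rng, x) = [μ + σ * randn(rng) for (μ, σ) in x]

# n realisations, each reduced to its quantile(s) q.
quantile_draws(rng, x, q, n::Int) = [quantile(realise(rng, x), q) for _ in 1:n]

rng = MersenneTwister(1)
med = quantile_draws(rng, x, 0.5, 500)            # one number per draw
qs = quantile_draws(rng, x, [0.25, 0.75], 500)    # one 2-element vector per draw
```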
IQR
StatsBase.iqr - Method
iqr(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the interquartile range (IQR), i.e. the 75th percentile minus the 25th percentile, of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the IQR is computed for each of those length-L realisations, yielding a distribution of IQR estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the IQR of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the IQR of x, which are returned as a vector.
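A minimal sketch of the IQR procedure, written with the standard library only. The (mean, standard deviation) tuples stand in for uncertain values and are an assumption of this example, as are the names realise, iqr_of and iqr_draws.

```julia
using Random, Statistics

# Hypothetical uncertain values: normal distributions centred on 1, 2, ..., 10.
x = [(Float64(i), 0.3) for i in 1:10]
realise(rng, x) = [μ + σ * randn(rng) for (μ, σ) in x]

# IQR of one realisation: 75th percentile minus 25th percentile.
iqr_of(v) = quantile(v, 0.75) - quantile(v, 0.25)

iqr_draws(rng, x, n::Int) = [iqr_of(realise(rng, x)) for _ in 1:n]

rng = MersenneTwister(7)
d = iqr_draws(rng, x, 1000)
lo, hi = quantile(d, [0.025, 0.975])   # central 95% of the IQR estimates
```

Summarising the returned vector with an interval such as (lo, hi) is one way to report how much the uncertainty in the data propagates into the statistic.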
Median
Statistics.median - Method
median(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the median of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the median is computed for each of those length-L realisations, yielding a distribution of median estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the median for the realisation.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the median of x, which are returned as a vector.
Middle
Statistics.middle - Method
middle(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the middle (the midpoint between the minimum and the maximum) of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the middle is computed for each of those length-L realisations, yielding a distribution of middle estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the middle of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the middle of x, which are returned as a vector.
Standard deviation
Statistics.std - Method
std(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the standard deviation of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the standard deviation is computed for each of those length-L realisations, yielding a distribution of standard deviation estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the standard deviation of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the standard deviation of x, which are returned as a vector.
Variance
Statistics.var - Method
var(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the variance of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the variance is computed for each of those length-L realisations, yielding a distribution of variance estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the variance of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the variance of x, which are returned as a vector.
Generalized/power mean
StatsBase.genmean - Method
genmean(x::UVAL_COLLECTION_TYPES, p, n::Int)
Obtain a distribution for the generalized/power mean with exponent p of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the generalized mean is computed for each of those length-L realisations, yielding a distribution of generalized mean estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the generalized mean of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the generalized mean of x, which are returned as a vector.
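To make the role of the exponent p concrete, here is a sketch of the power mean applied to realisations. It assumes strictly positive data, so the uncertain values are modelled as hypothetical uniform distributions on positive intervals (lo, hi). The names realise, powermean and powermean_draws are illustrative, and treating p = 0 as the geometric-mean limit is a convention adopted for this example.

```julia
using Random, Statistics

# Hypothetical uncertain values: uniform distributions on positive intervals (lo, hi).
x = [(1.0, 1.5), (2.0, 2.5), (3.0, 3.5), (4.0, 4.5)]
realise(rng, x) = [lo + (hi - lo) * rand(rng) for (lo, hi) in x]

# Power mean with exponent p; p = 0 is handled as the geometric-mean limit.
powermean(v, p) = p == 0 ? exp(mean(log.(v))) : mean(v .^ p)^(1 / p)

powermean_draws(rng, x, p, n::Int) = [powermean(realise(rng, x), p) for _ in 1:n]

rng = MersenneTwister(3)
arith = powermean_draws(rng, x, 1, 200)    # p = 1: arithmetic mean
harm = powermean_draws(rng, x, -1, 200)    # p = -1: harmonic mean
```

For any single realisation the power mean is non-decreasing in p, so the harmonic (p = -1), geometric (p = 0) and arithmetic (p = 1) means of that draw are ordered accordingly.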
Generalized variance
StatsBase.genvar - Method
genvar(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the generalized sample variance of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the generalized sample variance is computed for each of those length-L realisations, yielding a distribution of generalized sample variance estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the generalized sample variance of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the generalized sample variance of x, which are returned as a vector.
Harmonic mean
StatsBase.harmmean - Method
harmmean(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the harmonic mean of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the harmonic mean is computed for each of those length-L realisations, yielding a distribution of harmonic mean estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the harmonic mean for the realisation.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the harmonic mean of x, which are returned as a vector.
Geometric mean
StatsBase.geomean - Method
geomean(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the geometric mean of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the geometric mean is computed for each of those length-L realisations, yielding a distribution of geometric mean estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the geometric mean for the realisation.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the geometric mean of x, which are returned as a vector.
Kurtosis
StatsBase.kurtosis - Method
kurtosis(x::UVAL_COLLECTION_TYPES, n::Int, f = StatsBase.mean)
Obtain a distribution for the kurtosis of a collection of uncertain values.
This is done by first drawing n length-L realisations of x, where L = length(x). Then, the kurtosis is computed for each of those length-L realisations, yielding a distribution of kurtosis estimates.
Optionally, a center function f can be specified. This function is used to compute the center of each draw, i.e. for the i-th draw, call StatsBase.kurtosis(draw_i, f(draw_i)).
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the kurtosis for the realisation.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the kurtosis of x, which are returned as a vector.
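The effect of the optional center function can be sketched as follows. This example assumes the excess-kurtosis convention (fourth central moment over squared second moment, minus 3, with 1/L normalisation) and models uncertain values as hypothetical (mean, standard deviation) normal distributions. The names realise, excess_kurtosis and kurtosis_draws are illustrative only.

```julia
using Random, Statistics

# Hypothetical uncertain values: 200 standard normal distributions.
x = [(0.0, 1.0) for _ in 1:200]
realise(rng, x) = [μ + σ * randn(rng) for (μ, σ) in x]

# Excess kurtosis of v about a given center m (moments use 1/L normalisation).
function excess_kurtosis(v, m)
    z = v .- m
    return mean(z .^ 4) / mean(z .^ 2)^2 - 3
end

# f plays the role of the center function: it is evaluated on every draw.
function kurtosis_draws(rng, x, n::Int, f = mean)
    return map(1:n) do _
        v = realise(rng, x)
        excess_kurtosis(v, f(v))
    end
end

rng = MersenneTwister(11)
k_mean = kurtosis_draws(rng, x, 300)            # centred on the mean of each draw
k_median = kurtosis_draws(rng, x, 300, median)  # centred on the median instead
```

Because the center is recomputed from each realisation, the choice of f changes every one of the n estimates rather than applying one fixed offset to all of them.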
k-th order moment
StatsBase.moment - Method
moment(x::UVAL_COLLECTION_TYPES, k, n::Int)
Obtain a distribution for the k-th order central moment of a collection of uncertain values.
This is done by first drawing n length-L realisations of x, where L = length(x). Then, the k-th order central moment is computed for each of those length-L realisations, yielding a distribution of k-th order central moment estimates.
The procedure is as follows.
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the k-th order central moment for the realisation.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the k-th order central moment of x, which are returned as a vector.
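For reference, a k-th order central moment of a realisation can be written as the average of the k-th powers of deviations from the realisation's mean. The sketch below applies that definition per draw; the 1/L normalisation, the (mean, standard deviation) model of uncertain values, and the names realise, central_moment and moment_draws are assumptions of this example.

```julia
using Random, Statistics

# Hypothetical uncertain values: 500 normal distributions with mean 0 and std 2.
x = [(0.0, 2.0) for _ in 1:500]
realise(rng, x) = [μ + σ * randn(rng) for (μ, σ) in x]

# k-th order central moment of one realisation (1/L normalisation).
central_moment(v, k::Int) = mean((v .- mean(v)) .^ k)

moment_draws(rng, x, k::Int, n::Int) = [central_moment(realise(rng, x), k) for _ in 1:n]

rng = MersenneTwister(5)
m2 = moment_draws(rng, x, 2, 200)   # second central moment, near 2^2 = 4 here
m3 = moment_draws(rng, x, 3, 200)   # third central moment, near 0 for symmetric draws
```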
Percentile
StatsBase.percentile - Method
percentile(x::UVAL_COLLECTION_TYPES, p, n::Int)
Obtain a distribution for the percentile(s) p of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the percentile is computed for each of those length-L realisations, yielding a distribution of percentile estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the percentile of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the percentile of x, which are returned as a vector.
Renyi entropy
StatsBase.renyientropy - Method
renyientropy(x::UVAL_COLLECTION_TYPES, α, n::Int)
Obtain a distribution for the Rényi (generalized) entropy of order α of a collection of uncertain values.
This is done by first drawing n length-L realisations of x, where L = length(x). Then, the generalized entropy is computed for each of those length-L realisations, yielding a distribution of generalized entropy estimates.
The procedure is as follows.
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the Rényi (generalized) entropy of order α for the realisation.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the Rényi (generalized) entropy of order α of x, which are returned as a vector.
Run-length encoding
StatsBase.rle - Method
rle(x::UVAL_COLLECTION_TYPES, α, n::Int)
Obtain a distribution for the run-length encoding of a collection of uncertain values.
This is done by first drawing n length-L realisations of x, where L = length(x). Then, the run-length encoding is computed for each of those length-L realisations, yielding a distribution of run-length encoding estimates.
Returns a vector of tuples of run-length encodings.
The procedure is as follows.
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the run-length encoding for the realisation. This gives a tuple, where the first element is a vector of the values in the input and the second is a vector with the number of consecutive occurrences of each value.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the run-length encoding of x, which are returned as a vector of run-length encoding tuples.
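Run-length encoding only produces runs longer than one when neighbouring draws can coincide, so the sketch below assumes discrete populations of equally likely integer states as stand-ins for uncertain values. The names realise, run_lengths and rle_draws are hypothetical, and the encoder is hand-written for illustration rather than taken from the library.

```julia
using Random

# Hypothetical discrete populations: each element lists its equally likely states.
x = [[1, 1, 2], [1, 2], [2, 2, 3], [3], [3, 1]]
realise(rng, x) = [rand(rng, pop) for pop in x]

# Run-length encoding of v: (values, number of consecutive occurrences of each value).
function run_lengths(v)
    vals = eltype(v)[]
    lens = Int[]
    for e in v
        if !isempty(vals) && vals[end] == e
            lens[end] += 1      # extend the current run
        else
            push!(vals, e)      # start a new run
            push!(lens, 1)
        end
    end
    return (vals, lens)
end

rle_draws(rng, x, n::Int) = [run_lengths(realise(rng, x)) for _ in 1:n]

rng = MersenneTwister(2)
encodings = rle_draws(rng, x, 100)   # a vector of 100 (values, lengths) tuples
```

Within each tuple the run lengths always sum to L, the length of the realisation that was encoded.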
Standard error of the mean
StatsBase.sem - Method
sem(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the standard error of the mean of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the standard error of the mean is computed for each of those length-L realisations, yielding a distribution of standard error of the mean estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the standard error of the mean of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the standard error of the mean of x, which are returned as a vector.
Skewness
StatsBase.skewness - Method
skewness(x::UVAL_COLLECTION_TYPES, n::Int, f = StatsBase.mean)
Obtain a distribution for the skewness of a collection of uncertain values.
This is done by first drawing n length-L realisations of x, where L = length(x). Then, the skewness is computed for each of those length-L realisations, yielding a distribution of skewness estimates.
Optionally, a center function f can be specified. This function is used to compute the center of each draw, i.e. for the i-th draw, call StatsBase.skewness(draw_i, f(draw_i)).
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the skewness for the realisation.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the skewness of x, which are returned as a vector.
Span
StatsBase.span - Method
span(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the span of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the span is computed for each of those length-L realisations, yielding a distribution of span estimates.
Returns a length-n vector of spans, where the i-th span is the range minimum(draw_x_i):maximum(draw_x_i).
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the span of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the span of x, which are returned as a vector.
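Since the span is described as the range minimum(draw):maximum(draw), it is easiest to picture with integer-valued draws. The sketch below assumes hypothetical integer populations as uncertain values; realise, span_of and span_draws are illustrative names, not library functions.

```julia
using Random

# Hypothetical integer-valued populations, each with equally likely states.
x = [[1, 2, 3], [4, 5], [7, 8, 9], [2, 6]]
realise(rng, x) = [rand(rng, pop) for pop in x]

# Span of one realisation: the range from its minimum to its maximum.
span_of(v) = minimum(v):maximum(v)

span_draws(rng, x, n::Int) = [span_of(realise(rng, x)) for _ in 1:n]

rng = MersenneTwister(9)
spans = span_draws(rng, x, 50)   # one range per realisation, n = 50 in total
```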
Summary statistics
StatsBase.summarystats - Method
summarystats(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the summary statistics of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the summary statistics are computed for each of those length-L realisations, yielding a distribution of summary statistics estimates.
Returns a length-n vector of SummaryStats objects containing the mean, minimum, 25th percentile, median, 75th percentile, and maximum for each draw of x.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the summary statistics of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the summary statistics of x, which are returned as a vector.
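The result here is a vector of structured summaries rather than numbers. As a sketch, the example below uses a named tuple as a stand-in for a SummaryStats object, with the six fields listed above. The (mean, standard deviation) model of uncertain values and the names realise, summarise and summary_draws are assumptions made for illustration.

```julia
using Random, Statistics

# Hypothetical uncertain values: normal distributions centred on 1, 2, ..., 20.
x = [(Float64(i), 0.5) for i in 1:20]
realise(rng, x) = [μ + σ * randn(rng) for (μ, σ) in x]

# Summary of one realisation as a named tuple (stand-in for a SummaryStats object).
function summarise(v)
    q25, q50, q75 = quantile(v, [0.25, 0.5, 0.75])
    return (mean = mean(v), min = minimum(v), q25 = q25,
            median = q50, q75 = q75, max = maximum(v))
end

summary_draws(rng, x, n::Int) = [summarise(realise(rng, x)) for _ in 1:n]

rng = MersenneTwister(21)
s = summary_draws(rng, x, 100)
medians = [d.median for d in s]   # distribution of a single field across draws
```

Extracting one field across all draws, as done for medians, recovers the same kind of vector that the single-statistic methods return.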
Total variance
StatsBase.totalvar - Method
totalvar(x::UVAL_COLLECTION_TYPES, n::Int)
Obtain a distribution for the total variance of a collection of uncertain values. This is done by first drawing n length-L realisations of x, where L = length(x). Then, the total variance is computed for each of those length-L realisations, yielding a distribution of total variance estimates.
Detailed steps:
- First, draw a length-L realisation of x by drawing one random number from each uncertain value in the collection. The draws are independent, so that no element-wise dependencies (e.g. sequential correlations) that are not already present in the data are introduced in the realisation.
- Compute the total variance of the realisation, which is itself a vector of length L.
- Repeat the procedure n times, drawing n independent realisations of x. This yields n estimates of the total variance of x, which are returned as a vector.