List of resampling schemes and their purpose
For collections of uncertain data, sampling constraints can be represented using the ConstrainedValueResampling type. This allows for passing complicated sampling constraints as a single input argument to functions that accept uncertain value collections.
Constrained resampling
UncertainData.Resampling.ConstrainedValueResampling — Type
ConstrainedValueResampling{N_DATASETS}
Indicates that resampling should be done with constraints on the furnishing distributions/populations.
Fields
- constraints. The constraints for the datasets, represented as a tuple of length N_DATASETS, where the i-th tuple element contains the constraints for the i-th dataset. Constraints for each dataset must be supplied either as a single sampling constraint or as a vector of sampling constraints whose length matches the length of the dataset (Union{SamplingConstraint, Vector{<:SamplingConstraint}}). For example, if the i-th dataset contains 352 observations, then constraints[i] must be either a single sampling constraint (e.g. TruncateStd(1.1)) or a vector of 352 different sampling constraints (e.g. [TruncateStd(1.0 + rand()) for i = 1:352]).
- n::Int. The number of draws.
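The shape rule for constraints[i] can be sketched in a few lines of plain Julia. The names below (DemoConstraint, DemoTruncateStd, expand) are hypothetical stand-ins written for this illustration. They are not part of UncertainData.jl and do not show how the package validates its input.

```julia
# Stand-in types for illustration only; these are NOT the UncertainData.jl types.
abstract type DemoConstraint end

struct DemoTruncateStd <: DemoConstraint
    nsigma::Float64
end

# Expand a per-dataset specification into one constraint per observation.
# A single constraint is repeated for every observation in the dataset.
expand(c::DemoConstraint, len::Int) = fill(c, len)

# A vector of constraints must already have one entry per observation.
function expand(cs::Vector{<:DemoConstraint}, len::Int)
    length(cs) == len || throw(DimensionMismatch(
        "got $(length(cs)) constraints for a dataset of length $len"))
    return cs
end

# Both accepted forms for a dataset with 352 observations.
single = expand(DemoTruncateStd(1.1), 352)
varied = expand([DemoTruncateStd(1.0 + rand()) for i = 1:352], 352)
```

Under this sketch, a vector whose length differs from the dataset length is rejected with a DimensionMismatch, which mirrors the requirement stated above.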
Example
Assume we have three collections of uncertain values, each of length L = 50. These should be resampled 250 times. Before resampling, however, the distributions/populations furnishing the uncertain values should be truncated:
- For the first collection, truncate each value at 1.5 times its standard deviation around its mean. This could simulate measurement errors from an instrument that yields stable measurements whose errors are normally distributed, but for which we are not interested in outliers or values beyond 1.5 standard deviations for our analyses.
- For the second collection, truncate each value to its central 80% percentile range. This could simulate measurement errors from an instrument that yields stable measurements whose errors are not normally distributed, so that percentile ranges are better to use than standard deviations. In this case, we're not interested in outliers, and therefore exclude values smaller than the 10th percentile and larger than the 90th percentile of the data.
- For the third collection, truncate the i-th value at a fraction of its standard deviation around the mean that is slightly larger than the fraction used for the (i-1)-th value, so that the truncation width grows from just above 0.5 to 0.5 + L/100 standard deviations. This could simulate, for example, an instrument whose measurement error increases over time.
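using UncertainData

L = 50  # length of each collection
constraints_d1 = TruncateStd(1.5)                        # first collection: mean ± 1.5 standard deviations
constraints_d2 = TruncateQuantiles(0.1, 0.9)             # second collection: 10th to 90th percentile
constraints_d3 = [TruncateStd(0.5 + i/100) for i = 1:L]  # third collection: truncation widens with i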
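To get a feel for what these three truncations keep, the following is a minimal sketch using only Julia's standard library. It works on empirical draws, whereas the package constrains the furnishing distributions/populations themselves, so treat it as an approximation of the idea. The helpers truncate_std and truncate_quantiles and the simulated samples vector are assumptions made for this illustration, not UncertainData.jl functions.

```julia
using Random, Statistics

Random.seed!(1234)
# Stand-in draws for a single uncertain value (normal, mean 10, std 2).
samples = 10.0 .+ 2.0 .* randn(10_000)

# Std-based truncation: keep draws within k standard deviations of the mean.
function truncate_std(x::AbstractVector{<:Real}, k::Real)
    m, s = mean(x), std(x)
    return filter(v -> m - k * s <= v <= m + k * s, x)
end

# Quantile-based truncation: keep draws between the lo and hi quantiles.
function truncate_quantiles(x::AbstractVector{<:Real}, lo::Real, hi::Real)
    qlo, qhi = quantile(x, lo), quantile(x, hi)
    return filter(v -> qlo <= v <= qhi, x)
end

L = 50
d1 = truncate_std(samples, 1.5)              # analogue of the first collection
d2 = truncate_quantiles(samples, 0.1, 0.9)   # analogue of the second collection

# Truncation widths for the third collection, one per value.
widths_d3 = [0.5 + i / 100 for i = 1:L]
d3_first = truncate_std(samples, first(widths_d3))
d3_last = truncate_std(samples, last(widths_d3))

println("kept by 1.5 std truncation:      ", length(d1))
println("kept by 10th-90th percentiles:   ", length(d2))
println("kept by narrowest / widest d3:   ", length(d3_first), " / ", length(d3_last))
```

The widths for the third collection start at 0.5 + 1/100 and end at 0.5 + L/100, so later values in that collection retain more of their distribution than earlier ones.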