Uncertain index-value datasets

Documentation

UncertainData.UncertainDatasets.UncertainIndexValueDataset - Type
UncertainIndexValueDataset{
    IDXTYP<:AbstractUncertainIndexDataset, 
    VALSTYP<:AbstractUncertainValueDataset}

A generic dataset type consisting of a set of uncertain indices (e.g. time, depth, or order) and a set of uncertain values.

The i-th index is assumed to correspond to the i-th value. For example, if data is an instance of an UncertainIndexValueDataset, then

  • data.indices[2] is the index for the value data.values[2].
  • data.values[7] is the value for the index data.indices[7].
  • data[3] is an index-value tuple (data.indices[3], data.values[3]).
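
The correspondence above can be checked on a small dataset. This is a minimal sketch that relies only on the constructors and indexing behavior described in this docstring. The five-point toy data are made up for illustration, and the Distributions package is assumed to be installed to provide Normal and Uniform.

```julia
using UncertainData, Distributions

# Toy data (illustrative only): five uncertain times and five uncertain values.
uncertain_times = [UncertainValue(Uniform, t - 0.1, t + 0.1) for t in 1:5]
uncertain_values = [UncertainValue(Normal, sin(t), 0.2) for t in 1:5]

# Vectors are interpreted as (indices, values), in that order.
data = UncertainIndexValueDataset(uncertain_times, uncertain_values)

# Per the description above, indexing the dataset yields an (index, value) tuple.
idx, val = data[3]

# The same pair, accessed field by field.
idx_field = data.indices[3]
val_field = data.values[3]
```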

Fields

  • indices::T where {T <: AbstractUncertainIndexDataset}: The uncertain indices, represented by some type of uncertain index dataset.
  • values::T where {T <: AbstractUncertainValueDataset}: The uncertain values, represented by some type of uncertain value dataset.

Example

using UncertainData, Distributions

# Simulate some data values measured at specific times.
times = 1:100
values = sin.(0.1 .* times)  # one value per time point

# Assume the data were measured by a device with normally distributed
# measurement uncertainties with fluctuating standard deviations
σ_range = (0.1, 0.7)

uncertain_values = [UncertainValue(Normal, val, rand(Uniform(σ_range...))) 
    for val in values]

# Assume the clock used to record the times is uncertain, but with uniformly 
# distributed noise that doesn't change through time.
uncertain_times = [UncertainValue(Uniform, t-0.1, t+0.1) for t in times]

# Pair the time-value data. If vectors are provided to the constructor,
# the first will be interpreted as the indices and the second as the values.
data = UncertainIndexValueDataset(uncertain_times, uncertain_values)

# A safer option is to first convert to UncertainIndexDataset and 
# UncertainValueDataset, so you don't accidentally mix the indices 
# and the values.
uidxs = UncertainIndexDataset(uncertain_times)
uvals = UncertainValueDataset(uncertain_values)

data = UncertainIndexValueDataset(uidxs, uvals)

Description

UncertainIndexValueDatasets have uncertainties associated with both the indices (e.g. time or depth) and the values of the data points.

Defining an uncertain index-value dataset

Example 1

Defining the values

Let's start by defining the uncertain data values and collecting them in an UncertainValueDataset.

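using UncertainData, Distributions, Plots
gr()

# Ten normally distributed values with random means and standard deviations.
r1 = [UncertainValue(Normal, rand(), rand()) for i = 1:10]
# An uncertain value represented by an empirical sample of 10,000 points.
r2 = UncertainValue(rand(10000))
# A uniform distribution fitted to a sample.
r3 = UncertainValue(Uniform, rand(10000))
# Theoretical distributions with fixed parameters.
r4 = UncertainValue(Normal, -0.1, 0.5)
r5 = UncertainValue(Gamma, 0.4, 0.8)

# Collect all 14 uncertain values in a single dataset.
u_values = [r1; r2; r3; r4; r5]
udata = UncertainValueDataset(u_values);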

Defining the indices

The values were measured at some time indices by an inaccurate clock, so the measurement times are normally distributed values with fluctuating standard deviations.

u_timeindices = [UncertainValue(Normal, i, rand(Uniform(0, 1))) 
    for i = 1:length(udata)]
uindices = UncertainIndexDataset(u_timeindices);

Combining the indices and values

Now, combine the uncertain time indices and measurements into an UncertainIndexValueDataset.

x = UncertainIndexValueDataset(uindices, udata)

The built-in plot recipes make it easy to visualize the dataset. By default, plotting the dataset shows the median of each index and each measurement (scatter plots only), with error bars spanning the 33rd to 67th percentile range in both directions.

plot(x)

You can also tune the error bars by calling plot(udata::UncertainIndexValueDataset, idx_quantiles, val_quantiles), explicitly specifying the quantiles in each direction, like so:

plot(x, [0.05, 0.95], [0.05, 0.95])
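
To make the error bars concrete, here is a minimal sketch of a median with quantile-based bounds for a single uncertain quantity. It uses only Julia's standard library and a made-up sample as a stand-in for draws from an uncertain value. It is not the plot recipe's implementation.

```julia
using Random, Statistics

Random.seed!(1234)

# Stand-in sample (assumption): 10,000 draws from a normal distribution
# with mean 2.0 and standard deviation 0.5.
draws = 2.0 .+ 0.5 .* randn(10_000)

# Central marker: the median of the draws.
center = median(draws)

# Narrow bounds, matching the default 33rd to 67th percentile range.
lo33, hi67 = quantile(draws, [0.33, 0.67])

# Wider bounds, matching the [0.05, 0.95] quantiles used above.
lo05, hi95 = quantile(draws, [0.05, 0.95])
```

Widening the quantile pair lengthens the error bars around the same median, which is what the second plot call shows for every point in both directions.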

Example 2

Defining the indices

Say we had a dataset of 13 values whose time indices are normally distributed, with standard deviations that increase through time.

time_inds = 1:13
uvals = [UncertainValue(Normal, ind, rand(Uniform()) + (ind / 6)) for ind in time_inds]
inds = UncertainIndexDataset(uvals)

That's it. We can also plot the 33rd to 67th percentile range for the indices.

plot(inds, [0.33, 0.67])

Defining the values

Let's define some uncertain values that are associated with the indices.

u1 = UncertainValue(Gamma, rand(Gamma(), 500))
u2 = UncertainValue(rand(MixtureModel([Normal(1, 0.3), Normal(0.1, 0.1)]), 500))
uvals3 = [UncertainValue(Normal, rand(), rand()) for i = 1:11]

measurements = [u1; u2; uvals3]
datavals = UncertainValueDataset(measurements)

Combining the indices and values

Now, we combine the indices and the corresponding data.

d = UncertainIndexValueDataset(inds, datavals)
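
Because the i-th index is paired with the i-th value, it can be worth confirming that both collections have the same length before combining them. The helper below is a hypothetical sketch written for this page, not part of UncertainData. It works on anything that supports length.

```julia
# Hypothetical helper (not part of UncertainData): verify that two
# collections can be paired element by element.
function check_paired(indices, vals)
    n_idx, n_val = length(indices), length(vals)
    n_idx == n_val || throw(DimensionMismatch(
        "got $n_idx indices but $n_val values"))
    return n_idx
end

# Same length, so the check passes and returns the shared length.
n = check_paired(1:13, rand(13))
```

With the variables from this example, check_paired(time_inds, measurements) could be called before constructing d.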

Plot the dataset with error bars in both directions, using the 20th to 80th percentile range for the indices and the 33rd to 67th percentile range for the data values.

plot(d, [0.2, 0.8], [0.33, 0.67])