Combining

Because all uncertainties are handled using a resampling approach, it is trivial to combine uncertain values of different types into a single uncertain value.

Without weights

When no weights are provided, the combined value is computed by resampling each of the N uncertain values n/N times, then combining using kernel density estimation.
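The pooling step described above can be sketched in plain Julia without the package. This is only an illustration under stated assumptions, not UncertainData's implementation: `pool_draws` and the sampler closures are hypothetical names, and each uncertain value is stood in for by a zero-argument function that returns one draw.

```julia
using Random

# Hypothetical stand-ins for uncertain values: each sampler returns one random draw.
samplers = [
    () -> 1.0 + 0.3 * randn(),    # a draw from Normal(1, 0.3)
    () -> 0.8 + 0.4 * randn(),    # a draw from Normal(0.8, 0.4)
    () -> rand([0.1, 0.5, 0.9]),  # a three-state population, equal probabilities
]

# Draw n ÷ N values from each of the N samplers and pool them into one vector.
function pool_draws(samplers, n::Integer)
    N = length(samplers)
    per_value = n ÷ N
    pooled = Float64[]
    for s in samplers
        append!(pooled, [s() for _ in 1:per_value])
    end
    return pooled
end

Random.seed!(1234)
pooled = pool_draws(samplers, 30_000)
```

A kernel density estimate would then be fitted to `pooled` to obtain the combined value.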

UncertainData.combine (method)

combine(uvals::Vector{AbstractUncertainValue}; n = 10000*length(uvals), 
    bw::Union{Nothing, Real} = nothing)

Combine multiple uncertain values into a single uncertain value. This is done by resampling each uncertain value in uvals, n times each, then pooling these draws together. Finally, a kernel density estimate of the pooled distribution is computed over those draws.

The KDE bandwidth is controlled by bw. By default, bw = nothing; in this case, the bandwidth is determined using the KernelDensity.default_bandwidth function.

Tip

For very wide, close-to-normal distributions, the default bandwidth may work well. If you're combining very peaked distributions or discrete populations, however, you may want to lower the bandwidth significantly.
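To see why a rule-of-thumb bandwidth can be too wide for peaked data, here is a small sketch using only the standard library. It assumes the default bandwidth follows Silverman's rule of thumb; that is an assumption about KernelDensity.default_bandwidth, not something this page states, and `silverman_bandwidth` is a hypothetical helper written for this illustration.

```julia
using Statistics

# Rule-of-thumb bandwidth: 0.9 * min(std, IQR / 1.34) * n^(-1/5).
# (Assumed to approximate the default; check the KernelDensity source.)
function silverman_bandwidth(x::AbstractVector{<:Real})
    q25, q75 = quantile(x, [0.25, 0.75])
    width = min(std(x), (q75 - q25) / 1.34)
    return 0.9 * width * length(x)^(-0.2)
end

# Pooled draws from two narrow, well-separated peaks (spread 0.01 around 0 and 5).
peak_spread = 0.01
m = 5_000
draws = vcat(peak_spread .* randn(m), 5.0 .+ peak_spread .* randn(m))

bw_rule = silverman_bandwidth(draws)
```

Because the rule looks at the overall spread of the pooled draws, `bw_rule` comes out far larger than `peak_spread`, so a KDE using it would smear both peaks. Passing a smaller `bw` to combine avoids that.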

Example

using UncertainData, Distributions

v1 = UncertainValue(Normal, 1, 0.3)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Normal, 3.7, 0.8)
uvals = [v1, v2, v3, v4];

combine(uvals)
combine(uvals, n = 20000) # adjust number of total draws

Weights dictating the relative contribution of each uncertain value to the combined value can also be provided. combine works with ProbabilityWeights, AnalyticWeights, FrequencyWeights and the generic Weights.

The example below combines four uncertain values without weights and plots the combined distribution beneath the individual ones.

using UncertainData, Distributions, Plots, LaTeXStrings

v1 = UncertainValue(rand(1000))
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Normal, 3.7, 0.8)
uvals = [v1, v2, v3, v4]

p = plot(title = L"distributions \,\, with \,\, overlapping \,\, supports")
plot!(v1, label = L"v_1", ls = :dash)
plot!(v2, label = L"v_2", ls = :dot)
vline!(v3.values, label = L"v_3") # plot each possible state as vline
plot!(v4, label = L"v_4")

pcombined = plot(combine(uvals), title = L"combine([v_1, v_2, v_3, v_4])", lc = :black, lw = 2)

plot(p, pcombined, layout = (2, 1), link = :x, ylabel = "Density")

With weights

Weights, ProbabilityWeights and AnalyticWeights are functionally the same. Any of them may be used, depending on whether the weights are assigned subjectively or quantitatively. With FrequencyWeights, it is possible to control the exact number of draws from each uncertain value that go into the draw pool before performing KDE.
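One way to picture how relative weights translate into draws is the sketch below. It is a hedged illustration in plain Julia: `draws_per_value` is a hypothetical helper, and the rounding scheme is an assumption; the package may allocate draws differently.

```julia
# Convert relative weights into per-value draw counts that sum to roughly n.
function draws_per_value(weights::AbstractVector{<:Real}, n::Integer)
    normalised = weights ./ sum(weights)   # weights need not sum to 1
    return round.(Int, normalised .* n)
end

counts = draws_per_value([0.2, 0.1, 0.3, 0.2], 20_000)
# → [5000, 2500, 7500, 5000]
```

The weights here sum to 0.8 rather than 1, yet after normalisation the third value still contributes three times as many draws as the second.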

ProbabilityWeights

UncertainData.combine (method)

combine(uvals::Vector{AbstractUncertainValue}, weights::ProbabilityWeights; 
    n = 10000*length(uvals), 
    bw::Union{Nothing, Real} = nothing)

Combine multiple uncertain values into a single uncertain value. This is done by resampling each uncertain value in uvals proportionally to the provided relative probability weights (these are normalised by default, so they don't need to sum to 1), then pooling these draws together. Finally, a kernel density estimate of the pooled distribution is computed over the n total draws.

Providing ProbabilityWeights leads to the exact same behaviour as for AnalyticWeights, but may be more appropriate when, for example, weights have been determined quantitatively.

The KDE bandwidth is controlled by bw. By default, bw = nothing; in this case, the bandwidth is determined using the KernelDensity.default_bandwidth function.

Tip

For very wide, close-to-normal distributions, the default bandwidth may work well. If you're combining very peaked distributions or discrete populations, however, you may want to lower the bandwidth significantly.

Example

using UncertainData, Distributions, StatsBase  # StatsBase provides the weight types

v1 = UncertainValue(Normal, 1, 0.3)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Normal, 3.7, 0.8)
uvals = [v1, v2, v3, v4];

# Two different syntax options
combine(uvals, ProbabilityWeights([0.2, 0.1, 0.3, 0.2]))
combine(uvals, pweights([0.2, 0.1, 0.3, 0.2]), n = 20000) # adjust number of total draws

For example:

v1 = UncertainValue(UnivariateKDE, rand(4:0.25:6, 1000), bandwidth = 0.02)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Gamma, 8, 0.4)
uvals = [v1, v2, v3, v4];

p = plot(title = L"distributions \,\, with \,\, overlapping \,\, supports")
plot!(v1, label = L"v_1: KDE \, over \, empirical \, distribution", ls = :dash)
plot!(v2, label = L"v_2: Normal(0.8, 0.4)", ls = :dot)
# plot each possible state as vline
vline!(v3.values, 
    label = L"v_3: \, Discrete \, population\, [1,2,3], w/ \, weights \, [0.3, 0.3, 0.4]") 
plot!(v4, label = L"v_4: \, Gamma(8, 0.4)")

pcombined = plot(
    combine(uvals, ProbabilityWeights([0.1, 0.3, 0.02, 0.5]), n = 100000, bw = 0.05), 
    title = L"combine([v_1, v_2, v_3, v_4], ProbabilityWeights([0.1, 0.3, 0.02, 0.5]))", 
    lc = :black, lw = 2)

plot(p, pcombined, layout = (2, 1), size = (800, 600), 
    link = :x, 
    ylabel = "Density",
    tickfont = font(12),
    legendfont = font(8), fg_legend = :transparent, bg_legend = :transparent)

AnalyticWeights

UncertainData.combine (method)

combine(uvals::Vector{AbstractUncertainValue}, weights::AnalyticWeights; 
    n = 10000*length(uvals), 
    bw::Union{Nothing, Real} = nothing)

Combine multiple uncertain values into a single uncertain value. This is done by resampling each uncertain value in uvals proportionally to the provided analytic weights indicating their relative importance (these are normalised by default, so they don't need to sum to 1), then pooling these draws together. Finally, a kernel density estimate of the pooled distribution is computed over the n total draws.

Providing AnalyticWeights leads to the exact same behaviour as for ProbabilityWeights, but may be more appropriate when relative importance weights are assigned subjectively, and not based on quantitative evidence.

The KDE bandwidth is controlled by bw. By default, bw = nothing; in this case, the bandwidth is determined using the KernelDensity.default_bandwidth function.

Tip

For very wide, close-to-normal distributions, the default bandwidth may work well. If you're combining very peaked distributions or discrete populations, however, you may want to lower the bandwidth significantly.

Example

v1 = UncertainValue(Normal, 1, 0.3)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Normal, 3.7, 0.8)
uvals = [v1, v2, v3, v4];

# Two different syntax options
combine(uvals, AnalyticWeights([0.2, 0.1, 0.3, 0.2]))
combine(uvals, aweights([0.2, 0.1, 0.3, 0.2]), n = 20000) # adjust number of total draws

For example:

v1 = UncertainValue(UnivariateKDE, rand(4:0.25:6, 1000), bandwidth = 0.02)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Gamma, 8, 0.4)
uvals = [v1, v2, v3, v4];

p = plot(title = L"distributions \,\, with \,\, overlapping \,\, supports")
plot!(v1, label = L"v_1: KDE \, over \, empirical \, distribution", ls = :dash)
plot!(v2, label = L"v_2: Normal(0.8, 0.4)", ls = :dot)
vline!(v3.values, label = L"v_3: \, Discrete \, population\, [1,2,3], w/ \, weights \, [0.3, 0.3, 0.4]") # plot each possible state as vline
plot!(v4, label = L"v_4: \, Gamma(8, 0.4)")

pcombined = plot(combine(uvals, AnalyticWeights([0.1, 0.3, 0.02, 0.5]), n = 100000, bw = 0.05), 
    title = L"combine([v_1, v_2, v_3, v_4], AnalyticWeights([0.1, 0.3, 0.02, 0.5]))", lc = :black, lw = 2)

plot(p, pcombined, layout = (2, 1), size = (800, 600), 
    link = :x, 
    ylabel = "Density",
    tickfont = font(12),
    legendfont = font(8), fg_legend = :transparent, bg_legend = :transparent)

Generic Weights

UncertainData.combine (method)

combine(uvals::Vector{AbstractUncertainValue}, weights::Weights; 
    n = 10000*length(uvals), 
    bw::Union{Nothing, Real} = nothing)

Combine multiple uncertain values into a single uncertain value. This is done by resampling each uncertain value in uvals proportionally to the provided weights (these are normalised by default, so they don't need to sum to 1), then pooling these draws together. Finally, a kernel density estimate of the pooled distribution is computed over the n total draws.

Providing Weights leads to the exact same behaviour as for ProbabilityWeights and AnalyticWeights.

The KDE bandwidth is controlled by bw. By default, bw = nothing; in this case, the bandwidth is determined using the KernelDensity.default_bandwidth function.

Tip

For very wide, close-to-normal distributions, the default bandwidth may work well. If you're combining very peaked distributions or discrete populations, however, you may want to lower the bandwidth significantly.

Example

v1 = UncertainValue(Normal, 1, 0.3)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Normal, 3.7, 0.8)
uvals = [v1, v2, v3, v4];

# Two different syntax options
combine(uvals, Weights([0.2, 0.1, 0.3, 0.2]))
combine(uvals, weights([0.2, 0.1, 0.3, 0.2]), n = 20000) # adjust number of total draws

For example:

v1 = UncertainValue(UnivariateKDE, rand(4:0.25:6, 1000), bandwidth = 0.01)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Gamma, 8, 0.4)
uvals = [v1, v2, v3, v4];

p = plot(title = L"distributions \,\, with \,\, overlapping \,\, supports")
plot!(v1, label = L"v_1: KDE \, over \, empirical \, distribution", ls = :dash)
plot!(v2, label = L"v_2: Normal(0.8, 0.4)", ls = :dot)
# plot each possible state as vline
vline!(v3.values, 
    label = L"v_3: \, Discrete \, population\, [1,2,3], w/ \, weights \, [0.3, 0.3, 0.4]") 
plot!(v4, label = L"v_4: \, Gamma(8, 0.4)")

pcombined = plot(combine(uvals, Weights([0.1, 0.15, 0.1, 0.1]), n = 100000, bw = 0.02), 
    title = L"combine([v_1, v_2, v_3, v_4],  Weights([0.1, 0.15, 0.1, 0.1]))", 
    lc = :black, lw = 2)

plot(p, pcombined, layout = (2, 1), size = (800, 600), 
    link = :x, 
    ylabel = "Density",
    tickfont = font(12),
    legendfont = font(8), fg_legend = :transparent, bg_legend = :transparent)

FrequencyWeights

Using FrequencyWeights, one may specify the number of times each of the uncertain values should be sampled to form the pooled resampled draws on which the final kernel density estimate is performed.

UncertainData.combine (method)

combine(uvals::Vector{AbstractUncertainValue}, weights::FrequencyWeights;
    bw::Union{Nothing, Real} = nothing)

Combine multiple uncertain values into a single uncertain value. This is done by resampling each uncertain value in uvals the number of times given by its frequency weight (weights holds the absolute number of draws for each value), then pooling these draws together. Finally, a kernel density estimate of the pooled distribution is computed over the sum(weights) total draws.
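The difference from the relative-weight methods is that the counts are taken literally. A minimal sketch in plain Julia, with hypothetical names (`freq_samplers`, `freqs`, `freq_pool`) and zero-argument closures standing in for uncertain values; it is not the package's implementation:

```julia
# Hypothetical samplers standing in for two uncertain values.
freq_samplers = [() -> randn(), () -> 3.0 + 0.5 * randn()]
freqs = [100, 500]  # absolute number of draws from each sampler

# Draw exactly freqs[i] values from freq_samplers[i] and pool them.
freq_pool = reduce(vcat, [[s() for _ in 1:k] for (s, k) in zip(freq_samplers, freqs)])
```

The pool holds exactly `sum(freqs)` draws, so there is no separate `n` to choose.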

The KDE bandwidth is controlled by bw. By default, bw = nothing; in this case, the bandwidth is determined using the KernelDensity.default_bandwidth function.

Tip

For very wide and close-to-normal distributions, the default bandwidth may work well. If you're combining very peaked distributions or discrete populations, however, you may want to lower the bandwidth significantly.

Example

v1 = UncertainValue(Normal, 1, 0.3)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Normal, 3.7, 0.8)
uvals = [v1, v2, v3, v4];

# Two different syntax options
combine(uvals, FrequencyWeights([100, 500, 343, 7000]))
combine(uvals, fweights([1410, 550, 223, 801]))

For example:

v1 = UncertainValue(UnivariateKDE, rand(4:0.25:6, 1000), bandwidth = 0.01)
v2 = UncertainValue(Normal, 0.8, 0.4)
v3 = UncertainValue([rand() for i = 1:3], [0.3, 0.3, 0.4])
v4 = UncertainValue(Gamma, 8, 0.4)
uvals = [v1, v2, v3, v4];

p = plot(title = L"distributions \,\, with \,\, overlapping \,\, supports")
plot!(v1, label = L"v_1: KDE \, over \, empirical \, distribution", ls = :dash)
plot!(v2, label = L"v_2: Normal(0.8, 0.4)", ls = :dot)
# plot each possible state as vline
vline!(v3.values, 
    label = L"v_3: \, Discrete \, population\, [1,2,3], w/ \, weights \, [0.3, 0.3, 0.4]") 
plot!(v4, label = L"v_4: \, Gamma(8, 0.4)")

pcombined = plot(combine(uvals, FrequencyWeights([10000, 20000, 3000, 5000]), bw = 0.05), 
    title = L"combine([v_1, v_2, v_3, v_4], FrequencyWeights([10000, 20000, 3000, 5000]))", 
    lc = :black, lw = 2)

plot(p, pcombined, layout = (2, 1), size = (800, 600), 
    link = :x, 
    ylabel = "Density",
    tickfont = font(12),
    legendfont = font(8), fg_legend = :transparent, bg_legend = :transparent)