Tools for validating uncertainty calculations

Functions in this module can be used to validate uncertainty calculations by a simulation method.

If an uncertainty calculation preforms well on simulated data, it can be expected to perform well with real data too.

What does ‘perform well’ mean? Coverage probability (level of confidence) can be understood as the success-rate when a procedure is applied many times to independent data sets. That is, calculated uncertainty statements, for different sets of experimental data, should cover the measurand on roughly 95 out of 100 occasions when the nominal level of confidence is 95%. A simulation should represent the main features of the data generated in actual measurements.

The functions in this module fall into two groups: those that simulate random effects and those that test whether the measurand is covered by the uncertainty statement. Both real and complex-valued measurands can be handled.

There are two groups of function that simulate random effects. One group generates random errors drawn from a particular error distribution. The other group generates a triplet of numbers needed to define uncertain numbers: the estimate, the uncertainty of the estimate and the degrees of freedom associated with the uncertainty.

A suffix in the function names distinguishes between the two types: _err indicates a simple random error generator (e.g., gaussian_err), _est indicates a generator of estimates (e.g., gaussian_est)

Generators of estimates are used to create independent inputs for an uncertainty calculation that varies from one measurement to the next.

The prefix sim is needed to resolve the names of objects defined in this module.

Functions that create simple random error generators (real-valued)

In the case of correlated errors, the function mv_gaussian_err, which is associated with a multivariate Gaussian distribution, can be used.

Functions that generate estimates (real-valued)

  • gaussian_est - an input associated with a Gaussian distribution
  • uniform_est - an input associated with a uniform distribution
  • triangular_est - an input associated with a triangular distribution
  • arcsine_est - an input associated with an arcsine distribution

The generators created by these functions produce data triplets (x, u, df), which are convenient for direct input into GTC calculations.

In the case of correlated inputs the function mv_gaussian_est, which is associated with a multivariate Gaussian distribution, can be used.

Functions that create random error generators (complex-valued)

Functions that generate estimates (complex-valued)

  • bi_gaussian_est - an input associated with a bivariate Gaussian distribution

Utility functions

The mapping error selects random error generator functions by name (without the _err suffix).

The mapping estimate selects estimate generators functions by name (without the _est suffix).

The function seed gives access to the seed function of the Python random module.

The functions:

return True when the measurand is contained by the uncertainty statement

Summing the number of successes provides an estimate of the coverage probability for a procedure. The standard deviation of this estimate is calculated by success_rate_sd.

Module contents

gaussian_err(x, u)

Return a generator of Gaussian errors

Parameters:
  • x – the mean
  • u – the standard deviation

Example:

>>> gen = sim.gaussian_err(1,.5)
>>> gen.next()
1.2333329637100736
>>> gen.next()
-0.26827031760393716
uniform_err(x, a)

Return a generator of uniformly distributed errors

Parameters:
  • x – the mean
  • a – half-width

The standard deviation of the simulated distribution is u = a/sqrt(3).

Example:

>>> gen = sim.uniform_err(1,.5)
>>> gen.next()
0.7138525970522182
>>> gen.next()
1.115028001086297
triangular_err(x, a)

Return a generator of triangular distribution errors

Parameters:
  • x – the mean
  • a – half-width

The standard deviation of the simulated distribution is u = a/sqrt(6).

Example:

>>> gen = sim.triangular_err(1,.5)
>>> gen.next()
0.8490224542656288
>>> gen.next()
1.0586618711505276
arcsine_err(x, a)

Return a generator of arcsine distributed errors

Parameters:
  • x – the mean
  • a – half-width

The standard deviation of the simulated distribution is u = a/sqrt(2).

Example:

>>> gen = sim.arcsine_err(1,.5)
>>> gen.next()
1.285842415423328
>>> gen.next()
0.5577409002191889
gaussian_est(x, u, df=inf)

Return a generator of Gaussian input estimates

Parameters:
  • x – best estimate, used as the mean in simulations
  • u – standard uncertainty, used as standard deviation in simulations
  • df – degrees-of-freedom

The generator produces namedtuples with attributes x, u, df

A chi-squared random number generator provides different estimates of the standard uncertainty when the degrees-of-freedom df is finite.

Example:

>>> gen = sim.gaussian_est(1,.5,10)
>>> gen.next()
simulated_input(x=1.7995, u=0.5191, df=10)
>>> gen.next()
simulated_input(x=0.6995, u=0.6150, df=10)
uniform_est(x, a, df=inf)

Return a generator ofa uniformly distributed input estimates

Parameters:
  • x – best estimate, used as the mean for simulations
  • a – half-width, used to calculate a standard deviation for simulations
  • df – degrees-of-freedom

The generator produces namedtuples with attributes x, u, df

The mean of the simulated distribution is x and the standard deviation u = a/sqrt(3).

A chi-squared random number generator provides different estimates of the standard uncertainty when the degrees-of-freedom df is finite.

Example:

>>> gen = sim.uniform_est(1,.5,10)
>>> gen.next()
simulated_input(x=0.7496, u=0.2880, df=10)
>>> gen.next()
simulated_input(x=0.7080, u=0.3367, df=10)
triangular_est(x, a, df=inf)

Return a generator of triangular input estimates

Parameters:
  • x – best estimate, used as the mean for simulations
  • a – half-width, used to calculate a standard deviation for simulations
  • df – degrees-of-freedom

The generator produces namedtuples with attributes x, u, df

The mean of the simulated distribution is x and the standard deviation u = a/sqrt(6).

A chi-squared random number generator provides different estimates of the standard uncertainty when the degrees-of-freedom df is finite.

Example:

>>> gen = sim.triangular_est(1,.5,10)
>>> gen.next()
simulated_input(x=0.8494, u=0.1831, df=10)
>>> gen.next()
simulated_input(x=1.4484, u=0.1783, df=10)
arcsine_est(x, a, df=inf)

Return a generator of arcsine input estimates

Parameters:
  • x – best estimate, used as the mean for simulations
  • a – half-width, used to calculate a standard deviation for simulations
  • df – degrees-of-freedom

The generator produces namedtuples with attributes x, u, df

The mean of the simulated distribution is x and the standard deviation u = a/sqrt(2).

A chi-squared random number generator provides different estimates of the standard uncertainty when the degrees-of-freedom df is finite.

Example:

>>> gen = sim.arcsine_est(1,.5,10)
>>> gen.next()
simulated_input(x=1.171, u=0.317, df=10)
>>> gen.next()
simulated_input(x=1.236, u=0.437, df=10)
mv_gaussian_err(x, cv)

Return a generator of multivariate Gaussian errors

Parameters:
  • x – the mean
  • cv – the variance-covariance matrix

Example:

>>> z = (1.0,0.5)
>>> cv = la.array( [[0.4,0],[0,0.2]] )
>>> gen = sim.mv_gaussian_err(z,cv)
>>> gen.next()
[1.4171563857707223, 0.8272576411959041]
>>> gen.next()
[1.6483724628734273, 0.2507518428675501]    
bi_gaussian_err(z, cv)

Return a generator of bivariate Gaussian errors

Parameters:
  • z – the mean
  • cv – the covariance matrix

Example:

>>> z = 1+0.5j
>>> cv = la.array( [0.4,0,0,0.2] )
>>> gen = sim.bi_gaussian_err(z,cv)
>>> gen.next()
(0.4011898661840365+0.2985638387342745j)
>>> gen.next()
(0.15916435129849904-0.23052895672783447j)
uniform_ring_err(z, r)

Return a generator of random points on a circle

The generator function produces random points distributed on a circle of radius r, centre z, in the complex plane.

Example:

>>> z,r = (0.34-6j), 1.5
>>> rv = sim.uniform_ring_err(z,r)
>>> rv.next()
(1.7951618239337341-6.3640110797295169j)
uniform_disk_err(z, r)

Return a generator of points in and on a disk

The generator function produces random points distributed uniformly on a disk of radius r, centre z, in the complex plane.

Example:

>>> z,r = (12.34-6j), 5.5
>>> rv = sim.uniform_disk_err(z,r)
>>> rv.next()
(13.680838228974434-1.7635650558700604j)
mv_gaussian_est(x, cv, df=inf)

Return a generator of multivariate Gaussian input estimates

Parameters:
  • x – best estimate, used as the mean for simulations
  • cv – standard variance-covariance matrix
  • df – degrees-of-freedom

The generator produces simulated_vector_input namedtuples with attributes x, cv, df.

A multivariate Gaussian distribution is used to to simulate the observations x.

A Wishart random number generator is used to provide different covariance matrices when the degrees-of-freedom df is finite.

Example:

>>> z = (1.0,0.5)
>>> cv = la.array( [[0.4,0],[0,0.2]] )
>>> gen = sim.mv_gaussian_est(z,cv)
>>> gen.next().x
[0.7746525731799878, 0.926462098877803]
>>> gen.next().cv
array([[0.4, 0],
[0, 0.2]])
>>> gen.next()
simulated_vector_input(
    x=[0.4064169945628521, 0.1324584259774531],
    cv=array([[0.4, 0],[0, 0.2]]),
    df=inf)
bi_gaussian_est(z, cv, df=inf)

Return a generator of bivariate Gaussian input estimates

Parameters:
  • z – best estimate, used as the mean for simulations
  • cv – a 4-element sequence representing the covariance
  • df – degrees-of-freedom

The generator produces simulated_complex_input namedtuples, with attributes z, cv, df.

A bivariate Gaussian distribution is used to to simulate observations z.

A Wishart random number generator is used to provide different covariance matrices when the degrees-of-freedom df is finite.

Example:

>>> z = 1+0.5j
>>> cv = la.array( [0.4,0,0,0.2] )
>>> gen = sim.bi_gaussian_est(z,cv)
>>> gen.next().z
(1.1791619306909931+0.56658874952748051j)
>>> gen.next().cv
(0.40, 0, 0, 0.20)
>>> gen.next()
simulated_complex_input(
    z=(1.5483928415509403+2.4126670838378645j),
    cv=(0.40, 0, 0, 0.20),
    df=inf
)
interval_OK(measurand, x, U)

Return True if the measurand lies inside [x-U,x+U]

Parameters:
  • measurand – the true value (float)
  • x – the estimate (float)
  • U – the expanded uncertainty (float)

Example:

>>> mu, sd = 1.0, 0.1    
>>> k = reporting.k_factor()    
>>> _OK = lambda m,rv: sim.interval_OK(m,rv.x,k*rv.u)
>>> rv = sim.gaussian_est(mu,sd)
>>> N = 10000
>>> success = sum(
...     _OK(mu, rv.next() ) for i in xrange(N)
... )
>>> success
9523
circle_OK(measurand, z, U)

True if measurand is inside a circle

Parameters:
  • measurand – the true value (complex)
  • z – the estimate (complex)
  • U – radius of the expanded uncertainty circle (float)

The circle around z with radius U is an uncertainty statement for an estimate of the measurand.

Example:

>>> k2_sq = reporting.k2_factor_sq()
>>> def _OK(m,rv):
...    U = math.sqrt(k2_sq * rv.cv[0])
...    return sim.circle_OK(mu,rv.z,U)
...
>>> mu = 1.0+0.3j
>>> cv = [1.0,0.0,0.0,1.0]
>>> rv = sim.bi_gaussian_est(mu,cv)
>>> N = 10000
>>> success = sum(
...     _OK(mu, rv.next() ) for i in xrange(N)
... )
>>> success
9468
ellipse_OK(measurand, z, cv, k2_sq, inverted=False)

True when measurand lies inside the uncertainty region

Parameters:
  • measurand – the true value (complex)
  • z – the estimate (complex)
  • cv – 4-element sequence representing the covariance matrix, or its inverse
  • k2_sq – bivariate elliptical coverage factor squared (float)
  • invertedTrue if cv is inverted (Boolean)

The shape of the elliptical uncertainty region around z is determined by the covariance matrix cv.

When the Mahalanobis Distance squared between measurand and z is less than the two-dimensional coverage factor squared k2_sq, the measurand is within the region.

When inverted == True, the inverse of cv is not calculated.

Example:

>>> k2_sq = reporting.k2_factor_sq()
>>> def _OK(m,rv):
...     return sim.ellipse_OK(mu,rv.z,rv.cv,k2_sq)
... 
>>> mu = 1.0+0.3j
>>> cv = [1.0,0.0,0.0,1.0]
>>> rv = sim.bi_gaussian_est(mu,cv)
>>> N = 10000        
>>> success = sum(
...     _OK(mu, rv.next() ) for i in xrange(N) 
... )
>>> success
9496
success_rate_sd(N, p=95)

Return the standard deviation of the expected number of successes

Parameters:
  • N – the number of trials (integer)
  • p – the nominal coverage probability (in %)

Assuming a binomial process with probability p % of success, when N trials are carried out the standard deviation is sqrt( N*p*(1-p) ).

Example:

>>> sim.success_rate_sd(1000)
6.8920243760451143