bet.calculateP package

Submodules

bet.calculateP.calculateError module

This module provides methods for calculating error estimates for computed probability measures. See Butler et al. 2015, <http://arxiv.org/pdf/1407.3851>.

  • cell_connectivity_exact calculates the connectivity of cells.

  • boundary_sets calculates which cells are on the boundary and strictly interior for contour events.

  • sampling_error is for calculating error estimates due to sampling.

  • model_error is for calculating error estimates due to error in the solution of QoIs.

bet.calculateP.calculateError.boundary_sets(disc, nei_list)

Calculates which cells are strictly interior to, and which are on the boundary of, each contour event.

Parameters
  • disc (bet.sample.discretization) – An object containing the discretization information.

  • nei_list (list) – list of lists defining contour events of neighboring cells.

Return type

tuple

Returns

\((B_N, C_N)\), where \(B_N\) are the cells strictly on the interior of a contour event and \(C_N\) are the cells on the boundary of a contour event, as defined in Butler et al. 2015, <http://arxiv.org/pdf/1407.3851>.
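As a rough illustration of the interior/boundary split, the sketch below classifies cells from a neighbor list. It is one plain reading of the description above, not the BET implementation: the function name and the inputs cell_event and neighbors are hypothetical, and the precise definition in Butler et al. 2015 may differ.

```python
def interior_and_boundary(cell_event, neighbors):
    """Split cells into per-event interior and boundary lists.

    cell_event[i] is the contour event that cell i belongs to and
    neighbors[i] lists the cells adjacent to cell i (hypothetical inputs).
    """
    interior, boundary = {}, {}
    for cell, event in enumerate(cell_event):
        # A cell counts as interior when every neighbor lies in the same event.
        same_event = all(cell_event[n] == event for n in neighbors[cell])
        target = interior if same_event else boundary
        target.setdefault(event, []).append(cell)
    return interior, boundary


# Five cells on a line: cells 0-2 form event 0, cells 3-4 form event 1.
cell_event = [0, 0, 0, 1, 1]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
interior, boundary = interior_and_boundary(cell_event, neighbors)
print(interior)  # {0: [0, 1], 1: [4]}
print(boundary)  # {0: [2], 1: [3]}
```

Only cells 2 and 3 touch a cell from the other event, so they are the only boundary cells.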

bet.calculateP.calculateError.cell_connectivity_exact(disc)

Calculates the contour events of each cell and its neighbors.

Parameters

disc (bet.sample.discretization) – An object containing the discretization information.

Return type

list

Returns

list of lists of neighboring cells

class bet.calculateP.calculateError.model_error(disc)

Bases: object

A class for calculating the error due to numerical error for a discretization.

calculate_for_contour_events()

Calculate the numerical error for each contour event.

Return type

list

Returns

er_list, a list of the error estimates for each contour event.

calculate_for_sample_set_region(s_set, region, emulated_set=None)

Calculate the numerical error estimate for a region of the input space defined by a sample set object.

Parameters
  • s_set (bet.sample.sample_set_base) – sample set for which to calculate error

  • region (int) – region of s_set for which to calculate error

  • emulated_set – sample set for volume emulation

Return type

float

Returns

er_est, the numerical error estimate for the region

calculate_for_sample_set_region_mc(s_set, region)

Calculate the numerical error estimate for a region of the input space defined by a sample set object, using the MC assumption.

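Parameters
  • s_set (bet.sample.sample_set_base) – sample set for which to calculate error

  • region (int) – region of s_set for which to calculate error

Return type

float

Returns

er_est, the numerical error estimate for the region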

disc

bet.sample.discretization defining the problem

disc_new

bet.sample.discretization from adding error estimates

num

number of inputs and outputs

class bet.calculateP.calculateError.sampling_error(disc, exact=True)

Bases: object

A class for calculating the error due to sampling for a discretization.

B_N

dictionary of interior sets

C_N

dictionary of boundary sets

calculate_for_contour_events()

Calculate the sampling error bounds for each contour event. If volumes are already calculated these are used. If not, volume emulation is used if an emulated input set exists. Otherwise the MC assumption is made.

Return type

tuple

Returns

(up_list, low_list) where up_list is a list of the upper bounds for each contour event and low_list is a list of the lower bounds.

calculate_for_sample_set_region(s_set, region, emulated_set=None)

Calculate the sampling error bounds for a region of the input space defined by a sample set object which defines an event \(A\).

Parameters
  • s_set (bet.sample.sample_set_base) – sample set for which to calculate error

  • region (int) – region of s_set for which to calculate error

  • emulated_set (bet.sample.sample_set_base) – sample set for volume emulation

Return type

tuple

Returns

(upper_bound, lower_bound) the upper and lower bounds for the error.

disc

bet.sample.discretization that defines the problem

num

number of inputs and outputs

exception bet.calculateP.calculateError.wrong_argument_type

Bases: Exception

Exception for when the argument is not one of the acceptable types.

bet.calculateP.calculateP module

This module provides methods for calculating the probability measure \(P_{\Lambda}\).

  • prob_on_emulated_samples provides a skeleton class and calculates the probability for a set of emulation points.

  • prob estimates the probability based on pre-defined volumes.

  • prob_with_emulated_volumes estimates the probability using volume emulation.

  • prob_from_sample_set estimates the probability based on probabilities from another sample set on the same space.

bet.calculateP.calculateP.prob(discretization, globalize=True)

Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples}})\), the probability associated with a set of cells defined by the model solves at \((\lambda_{samples})\) where the volumes of these cells are provided.

Parameters
  • discretization (bet.sample.discretization) – An object containing the discretization information.

  • globalize (bool) – Makes local variables global.
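The following is a minimal sketch of how probability can be spread over input cells when cell volumes are known. It assumes the usual measure-based rule that each output cell's probability is shared among the input cells mapping into it in proportion to their volumes. The function and the arrays io_ptr, output_probs, and volumes are hypothetical stand-ins, not attributes of the discretization object.

```python
def cell_probabilities(io_ptr, output_probs, volumes):
    """Distribute each output-cell probability over the input cells mapping to it.

    io_ptr[i]       -- index of the output cell containing Q(lambda_i)
    output_probs[k] -- probability of output cell k
    volumes[i]      -- volume of the input cell around lambda_i
    All three names are hypothetical.
    """
    # Total input volume that maps into each output cell.
    volume_in_bin = [0.0] * len(output_probs)
    for k, vol in zip(io_ptr, volumes):
        volume_in_bin[k] += vol
    # Each input cell receives a volume-proportional share of its bin's probability.
    return [output_probs[k] * vol / volume_in_bin[k]
            for k, vol in zip(io_ptr, volumes)]


io_ptr = [0, 0, 1, 1]
output_probs = [0.25, 0.75]
volumes = [0.125, 0.375, 0.25, 0.25]
probs = cell_probabilities(io_ptr, output_probs, volumes)
print(probs)  # [0.0625, 0.1875, 0.375, 0.375]
```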

bet.calculateP.calculateP.prob_from_discretization_input(disc, set_new)

Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) where \(\lambda_{samples_old}\) come from an input discretization.

Parameters
  • disc (discretization) – Discretization on which probabilities have already been calculated

  • set_new (sample_set_base) – Sample set for which probabilities will be calculated.

bet.calculateP.calculateP.prob_from_sample_set(set_old, set_new)

Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) using the MC assumption with respect to set_old.

Parameters
  • set_old (sample_set_base) – Sample set on which probabilities have already been calculated

  • set_new (sample_set_base) – Sample set for which probabilities will be calculated.
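One way to read the MC assumption here is that each old cell is represented by its sample point, so its probability goes to whichever new cell contains that point. The 1-D sketch below follows that reading; it is an assumption about the approach rather than the BET code, and transfer_probabilities is a hypothetical name.

```python
def transfer_probabilities(old_points, old_probs, new_points):
    """Move each old cell's probability to the nearest new sample (1-D sketch)."""
    new_probs = [0.0] * len(new_points)
    for x, p in zip(old_points, old_probs):
        # Nearest new sample = the new Voronoi cell that contains the old sample.
        nearest = min(range(len(new_points)),
                      key=lambda j: abs(new_points[j] - x))
        new_probs[nearest] += p
    return new_probs


old_points = [0.1, 0.3, 0.6, 0.9]
old_probs = [0.125, 0.125, 0.5, 0.25]
new_points = [0.25, 0.75]
new_probs = transfer_probabilities(old_points, old_probs, new_points)
print(new_probs)  # [0.25, 0.75]
```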

bet.calculateP.calculateP.prob_from_sample_set_with_emulated_volumes(set_old, set_new, set_emulate=None)

Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) using a set of emulated points that are distributed with respect to the volume measure.

Parameters
  • set_old (sample_set_base) – Sample set on which probabilities have already been calculated

  • set_new (sample_set_base) – Sample set for which probabilities will be calculated.

  • set_emulate (sample_set_base) – Sample set for volume emulation

bet.calculateP.calculateP.prob_on_emulated_samples(discretization, globalize=True)

Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{emulate}})\), the probability associated with a set of Voronoi cells defined by num_l_emulate iid samples \((\lambda_{emulate})\). This is added to the emulated input sample set object.

Parameters
  • discretization (bet.sample.discretization) – An object containing the discretization information.

  • globalize (bool) – Makes local variables global.

bet.calculateP.calculateP.prob_with_emulated_volumes(discretization)

Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples}})\), the probability associated with a set of cells defined by the model solves at \((\lambda_{samples})\) where the volumes are calculated with the given emulated input points.

Parameters
  • discretization (bet.sample.discretization) – An object containing the discretization information.

  • globalize – Makes local variables global.
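Volume emulation can be pictured as a Monte Carlo estimate of Voronoi cell volumes: draw many points from the volume measure and count how many land nearest to each sample. The sketch below does this on the unit interval. The function name, the domain, and the sample values are illustrative assumptions, not BET code.

```python
import random


def emulate_volumes(samples, num_emulate, rng):
    """Estimate relative Voronoi cell volumes on [0, 1] by Monte Carlo."""
    counts = [0] * len(samples)
    for _ in range(num_emulate):
        x = rng.random()  # emulated point, uniform w.r.t. the volume measure
        nearest = min(range(len(samples)), key=lambda j: abs(samples[j] - x))
        counts[nearest] += 1
    return [c / num_emulate for c in counts]


rng = random.Random(0)
samples = [0.2, 0.4, 0.9]
volumes = emulate_volumes(samples, 20000, rng)
# The exact cell lengths are 0.3, 0.35 and 0.35 (cell edges at 0.3 and 0.65).
print(volumes)
```

The estimates approach the exact cell lengths as the number of emulated points grows.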

bet.calculateP.calculateR module

This module contains functions for the density-based approach that utilizes a ratio of observed to predicted densities to update an initial density on the parameter space.
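To make the ratio concrete, the sketch below weights output values by observed density over predicted density. It swaps in analytic normal densities where kernel density estimates would normally be used; the helper names and the distribution parameters are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution, used here in place of a KDE."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def density_ratio(q, observed, predicted):
    """Ratio of observed to predicted density at output value q."""
    return normal_pdf(q, *observed) / normal_pdf(q, *predicted)


# Hypothetical (mean, std) pairs for the predicted and observed output densities.
predicted = (0.0, 2.0)
observed = (1.0, 0.5)
outputs = [-1.0, 0.0, 1.0, 2.0]

# Each input sample is reweighted by the ratio evaluated at its output value.
weights = [density_ratio(q, observed, predicted) for q in outputs]
print(weights)
```

Samples whose outputs fall where the observed density is high relative to the predicted density receive the largest weights.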

bet.calculateP.calculateR.generate_output_kdes(discretization, bw_method=None)

Generate Kernel Density Estimates on predicted and observed output sample sets.

Parameters
Returns

prediction set, prediction kdes, observation set, observation kdes, number of clusters

Return type

bet.discretization.sample_set, list, bet.discretization.sample_set, list, int

bet.calculateP.calculateR.invert(discretization, bw_method=None)

Solve the data consistent stochastic inverse problem, solving for input sample weights.

Parameters
Returns

acceptance rate, mean acceptance rate, pointers for samples to clusters

Return type

list, np.ndarray, list

bet.calculateP.calculateR.invert_rejection_sampling(discretization, bw_method=None)

Solve the data consistent stochastic inverse problem by rejection sampling.

Parameters
Returns

sample set containing samples

Return type

bet.sample.sample_set
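For orientation, here is a minimal sketch of textbook rejection sampling driven by density ratios: a sample is kept with probability equal to its ratio divided by the largest ratio. Whether BET bounds the ratio this way is an assumption, and rejection_sample and its inputs are hypothetical.

```python
import random


def rejection_sample(samples, ratios, rng):
    """Keep sample i with probability ratios[i] / max(ratios)."""
    bound = max(ratios)
    return [s for s, r in zip(samples, ratios) if rng.random() < r / bound]


rng = random.Random(1)
samples = [0.1, 0.2, 0.3, 0.4]
ratios = [0.0, 1.0, 4.0, 2.0]
accepted = rejection_sample(samples, ratios, rng)
print(accepted)
```

A sample with ratio zero is never kept, and the sample with the largest ratio is always kept.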

bet.calculateP.calculateR.invert_to_gmm(discretization, bw_method=None)

Solve the data consistent stochastic inverse problem, solving for a Gaussian mixture model.

Parameters
Returns

means, covariances, and weights for Gaussians

Return type

list, list, list

bet.calculateP.calculateR.invert_to_kde(discretization, bw_method=None)

Solve the data consistent stochastic inverse problem, solving for a weighted kernel density estimate.

Parameters
Returns

marginal probabilities and cluster weights

Return type

list, np.ndarray

bet.calculateP.calculateR.invert_to_multivariate_gaussian(discretization, bw_method=None)

Solve the data consistent stochastic inverse problem, solving for a multivariate Gaussian.

Parameters
Returns

marginal probabilities and cluster weights

Return type

list, np.ndarray

bet.calculateP.calculateR.invert_to_random_variable(discretization, rv, num_reweighted=10000, bw_method=None)

Solve the data consistent stochastic inverse problem, fitting a random variable.

rv can take multiple types of formats depending on type of distribution.

A string is used for the same distribution with default parameters in each dimension, e.g. rv = 'uniform' or rv = 'beta'.

A list or tuple of length 2 is used for the same distribution with fixed user-defined parameters, given as a dictionary, in each dimension, e.g. rv = ['uniform', {'floc': -2, 'fscale': 5}] or rv = ['beta', {'fa': 2, 'fb': 5, 'floc': -2, 'fscale': 5}].

A list of length dim whose entries are lists or tuples of length 2 is used for different distributions with fixed user-defined parameters, given as a dictionary, in each dimension, e.g. rv = [['uniform', {'floc': -2, 'fscale': 5}], ['beta', {'fa': 2, 'fb': 5, 'floc': -2, 'fscale': 5}]].
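The three accepted shapes of rv can be written out as Python literals. The helper normalize_rv below is hypothetical and not part of BET; it only shows how each shape expands to one (name, parameters) pair per dimension.

```python
# Same distribution with default parameters in every dimension.
rv_default = 'uniform'

# Same distribution with fixed, user-defined parameters in every dimension.
rv_fixed = ['beta', {'fa': 2, 'fb': 5, 'floc': -2, 'fscale': 5}]

# One [name, parameters] pair per dimension (here dim = 2).
rv_per_dim = [
    ['uniform', {'floc': -2, 'fscale': 5}],
    ['beta', {'fa': 2, 'fb': 5, 'floc': -2, 'fscale': 5}],
]


def normalize_rv(rv, dim):
    """Hypothetical helper: expand rv to a list of (name, params), one per dimension."""
    if isinstance(rv, str):
        return [(rv, {}) for _ in range(dim)]
    if len(rv) == 2 and isinstance(rv[0], str):
        return [(rv[0], dict(rv[1])) for _ in range(dim)]
    return [(name, dict(params)) for name, params in rv]


for rv in (rv_default, rv_fixed, rv_per_dim):
    print(normalize_rv(rv, dim=2))
```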

Parameters
Returns

marginal probabilities and cluster weights

Return type

list, np.ndarray

bet.calculateP.simpleFunP module

This module provides methods for creating simple function approximations to be used by calculateP. These simple function approximations are returned as bet.sample.sample_set objects.

bet.calculateP.simpleFunP.check_inputs(data_set, Q_ref)

Checks inputs to methods.

bet.calculateP.simpleFunP.check_inputs_no_reference(data_set)

Checks inputs to methods.

bet.calculateP.simpleFunP.check_type(val, data_set=None)

Add support for different data types that can be passed as keyword arguments. Attempt to infer dimension and set it correctly.

bet.calculateP.simpleFunP.infer_Q(data_set)

Attempt to infer reference value around which to define a sample set.

bet.calculateP.simpleFunP.normal_partition_normal_distribution(data_set, Q_ref=None, std=1, M=1, num_d_emulate=1000000.0)

Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a multivariate normal probability density centered at Q_ref with standard deviation std using M bins sampled from the given normal distribution.

Parameters
  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • M (int) – Defines the number M of samples in D used to define \(\rho_{\mathcal{D},M}\). The choice of M is something of an “art”: play around with it, and you can get reasonable results with a relatively small number such as 50.

  • num_d_emulate (int) – Number of samples used to emulate using an MC assumption

  • Q_ref (ndarray of size (mdim,)) – \(Q(\lambda_{reference})\)

  • std (ndarray of size (mdim,)) – The standard deviation of each QoI

Return type

voronoi_sample_set

Returns

sample_set object defining simple function approximation

bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_domain(data_set, rect_domain, cells_per_dimension=1)

Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density over the hyperrectangular domain specified by rect_domain.

Since \(\rho_\mathcal{D}\) is a uniform distribution on a hyperrectangle we are able to represent it exactly.

Parameters
  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • rect_domain (numpy.ndarray of shape (2, mdim)) – The domain over which \(\rho_\mathcal{D}\) is uniform.

  • cells_per_dimension (list) – number of cells per dimension

Return type

rectangle_sample_set

Returns

sample_set object defining simple function approximation

bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_scaled(data_set, Q_ref=None, rect_scale=1, cells_per_dimension=1)

Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density centered at Q_ref whose width is rect_scale times the width of D. If Q_ref is not given, the reference_value of the sample set is used.

Since \(\rho_\mathcal{D}\) is a uniform distribution on a hyperrectangle we are able to represent it exactly.

Parameters
  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • rect_scale (double or list) – The scale used to determine the width of the uniform distribution as rect_size = (data_max-data_min)*rect_scale

  • Q_ref (ndarray of size (mdim,)) – \(Q(\lambda_{reference})\)

  • cells_per_dimension (list) – number of cells per dimension

Return type

rectangle_sample_set

Returns

sample_set object defining simple function approximation
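The relation rect_size = (data_max-data_min)*rect_scale, together with centering at Q_ref, fixes the support of the density. The sketch below computes that support per dimension; scaled_rectangle and the sample data are hypothetical, and the code is not taken from BET.

```python
def scaled_rectangle(data, q_ref, rect_scale):
    """Return per-dimension (lower, upper) bounds of the scaled rectangle."""
    lower, upper = [], []
    for d in range(len(q_ref)):
        column = [row[d] for row in data]
        # Width of the support in this dimension.
        rect_size = (max(column) - min(column)) * rect_scale
        lower.append(q_ref[d] - 0.5 * rect_size)
        upper.append(q_ref[d] + 0.5 * rect_size)
    return lower, upper


data = [[0.0, 10.0], [4.0, 18.0], [2.0, 12.0]]
q_ref = [2.0, 14.0]
lower, upper = scaled_rectangle(data, q_ref, rect_scale=0.5)
print(lower, upper)  # [1.0, 12.0] [3.0, 16.0]
```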

bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_size(data_set, Q_ref=None, rect_size=None, cells_per_dimension=1)

Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density centered at Q_ref and supported on a hyperrectangle of width rect_size. If Q_ref is not given, the reference_value of the sample set is used.

Since \(\rho_\mathcal{D}\) is a uniform distribution on a hyperrectangle we can represent it exactly.

Parameters
  • rect_size (double or list) – The size used to determine the width of the uniform distribution

  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • Q_ref (ndarray of size (mdim,)) – \(Q(\lambda_{reference})\)

  • cells_per_dimension (list) – number of cells per dimension.

Return type

rectangle_sample_set

Returns

sample_set object defining simple function approximation

bet.calculateP.simpleFunP.uniform_partition_normal_distribution(data_set, Q_ref=None, std=1, M=1, num_d_emulate=1000000.0)

Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a multivariate normal probability density centered at Q_ref with standard deviation std using M bins sampled from a uniform distribution whose support extends 4 standard deviations in each direction.

Parameters
  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • M (int) – Defines the number M of samples in D used to define \(\rho_{\mathcal{D},M}\). The choice of M is something of an “art”: play around with it, and you can get reasonable results with a relatively small number such as 50.

  • num_d_emulate (int) – Number of samples used to emulate using an MC assumption

  • Q_ref (ndarray of size (mdim,)) – \(Q(\lambda_{reference})\)

  • std (ndarray of size (mdim,)) – The standard deviation of each QoI

Return type

voronoi_sample_set

Returns

sample_set object defining simple function approximation

bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_data_samples(data_set)

Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density over the entire data_domain. Here the data_domain is the union of Voronoi cells defined by data. In other words we assign each sample the same probability, so M = len(data) or rather len(d_distr_samples) == len(data). The purpose of this method is to approximate uniform distributions over irregularly shaped domains.

Parameters

data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

Return type

voronoi_sample_set

Returns

sample_set object defining simple function approximation

bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_domain(data_set, rect_domain, M=50, num_d_emulate=1000000.0)

Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle defined by rect_domain. The simple function approximation is then defined by determining M Voronoi cells (i.e., “bins”) partitioning \(\mathcal{D}\). These bins are only implicitly defined by M samples in \(\mathcal{D}\). Finally, the probability of each of these bins is computed by sampling from \(\rho_{\mathcal{D}}\) and using nearest neighbor searches to bin these samples in the M implicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).

Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).

Parameters
  • M (int) – Defines the number M of samples in D used to define \(\rho_{\mathcal{D},M}\). The choice of M is something of an “art”: play around with it, and you can get reasonable results with a relatively small number such as 50.

  • rect_domain (double or list) – The support of the density

  • num_d_emulate (int) – Number of samples used to emulate using an MC assumption

  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • Q_ref (ndarray of size (mdim,)) – \(Q(\lambda_{reference})\)

Return type

voronoi_sample_set

Returns

sample_set object defining simple function approximation
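The binning procedure described for this function can be sketched in one dimension: M points define Voronoi bins implicitly, samples are drawn from the uniform density, and each bin's probability is the fraction of samples nearest to its point. Everything below (names, domain, sample counts) is an illustrative assumption, not the BET implementation.

```python
import random


def simple_function_approximation(bin_points, draw_sample, num_emulate):
    """Estimate bin probabilities by nearest-neighbor binning of emulated samples."""
    counts = [0] * len(bin_points)
    for _ in range(num_emulate):
        q = draw_sample()
        # The nearest bin point identifies the Voronoi cell containing q.
        nearest = min(range(len(bin_points)),
                      key=lambda j: abs(bin_points[j] - q))
        counts[nearest] += 1
    return [c / num_emulate for c in counts]


rng = random.Random(0)
rect_domain = (2.0, 3.0)  # support of the uniform density (hypothetical)
M = 5
# M points in a data space D = [0, 5] implicitly define the Voronoi bins.
bin_points = [rng.uniform(0.0, 5.0) for _ in range(M)]
probs = simple_function_approximation(
    bin_points, lambda: rng.uniform(*rect_domain), num_emulate=10000)
print(probs)
```

Bins whose Voronoi cells do not intersect rect_domain end up with probability zero.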

bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_scaled(data_set, Q_ref=None, rect_scale=0.2, M=50, num_d_emulate=1000000.0)

Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle centered at Q_ref. If Q_ref is not given, the reference_value of the sample set is used. The support of this density is defined by rect_scale, which determines the size of the generalized rectangle by scaling the circumscribing generalized rectangle of \(\mathcal{D}\). The simple function approximation is then defined by determining M Voronoi cells (i.e., “bins”) partitioning \(\mathcal{D}\). These bins are only implicitly defined by M samples in \(\mathcal{D}\). Finally, the probability of each of these bins is computed by sampling from \(\rho_{\mathcal{D}}\) and using nearest neighbor searches to bin these samples in the M implicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).

Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).

Parameters
  • M (int) – Defines the number M of samples in D used to define \(\rho_{\mathcal{D},M}\). The choice of M is something of an “art”: play around with it, and you can get reasonable results with a relatively small number such as 50.

  • rect_scale (double or list) – The scale used to determine the support of the uniform distribution as rect_size = (data_max-data_min)*rect_scale

  • num_d_emulate (int) – Number of samples used to emulate using an MC assumption

  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • Q_ref (ndarray of size (mdim,)) – \(Q(\lambda_{reference})\)

Return type

voronoi_sample_set

Returns

sample_set object defining simple function approximation

bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_size(data_set, Q_ref=None, rect_size=None, M=50, num_d_emulate=1000000.0)

Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle centered at Q_ref. If Q_ref is not given, the reference_value of the sample set is used. The support of this density is defined by rect_size, which determines the size of the generalized rectangle. The simple function approximation is then defined by determining M Voronoi cells (i.e., “bins”) partitioning \(\mathcal{D}\). These bins are only implicitly defined by M samples in \(\mathcal{D}\). Finally, the probability of each of these bins is computed by sampling from \(\rho_{\mathcal{D}}\) and using nearest neighbor searches to bin these samples in the M implicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).

Note

data_set is only used to determine dimension.

Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).

Parameters
  • M (int) – Defines the number M of samples in D used to define \(\rho_{\mathcal{D},M}\). The choice of M is something of an “art”: play around with it, and you can get reasonable results with a relatively small number such as 50.

  • rect_size (double or list) – Determines the size of the support of the uniform distribution on a generalized rectangle

  • num_d_emulate (int) – Number of samples used to emulate using an MC assumption

  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • Q_ref (ndarray of size (mdim,)) – \(Q(\lambda_{reference})\)

Return type

voronoi_sample_set

Returns

sample_set object defining simple function approximation

bet.calculateP.simpleFunP.user_partition_user_distribution(data_set, data_partition_set, data_distribution_set)

Creates a user defined simple function approximation of a user defined distribution. The simple function discretization is specified in the data_partition_set, and the set of i.i.d. samples from the distribution is specified in the data_distribution_set.

Parameters
  • data_set (discretization or sample_set or ndarray) – Sample set that the probability measure is defined for.

  • data_partition_set (discretization or sample_set or ndarray) – Sample set defining the discretization of the data space into Voronoi cells on which a simple function is defined.

  • data_distribution_set (discretization or sample_set or ndarray) – Sample set containing the i.i.d. samples from the distribution on the data space that are binned within the Voronoi cells implicitly defined by the data_partition_set.

Return type

voronoi_sample_set

Returns

sample_set object defining simple function approximation

exception bet.calculateP.simpleFunP.wrong_argument_type

Bases: Exception

Exception for when the argument for data_set is not one of the acceptable types.

Module contents

This subpackage provides classes and methods for calculating the probability measure \(P_{\Lambda}\).

  • calculateP provides methods for approximating probability densities in the measure-based approach.

  • simpleFunP provides methods for creating simple function approximations of probability densities for the measure-based approach.

  • calculateR provides methods for the density-based approach.

  • calculateError provides methods for approximating numerical and sampling errors.