bet.calculateP package¶
Submodules¶
bet.calculateP.calculateError module¶
This module provides methods for calculating error estimates of the probability measure for calculate probability measures. See Butler et al. 2015. <http://arxiv.org/pdf/1407.3851>.
cell_connectivity_exactcalculates the connectivity of cells.boundary_setscalculates which cells are on the boundary and strictly interior for contour events.sampling_erroris for calculating error estimates due to sampling.model_erroris for calculating error estimates due to error in solution of QoIs
- bet.calculateP.calculateError.boundary_sets(disc, nei_list)¶
 Calculates the the neighboring Voronoi cells for each cell.
- Parameters
 disc (
bet.sample.discretization) – An object containing the discretization information.nei_list (list) – list of lists defining contour events of neighboring cells.
- Return type
 tuple
- Returns
 (\(B_N, C_N\)) where B_N are the cells strictly on the interior of a contour event and C_N are the cells on the boundary of a contour eventas defined in Butler et al. 2015. <http://arxiv.org/pdf/1407.3851>
- bet.calculateP.calculateError.cell_connectivity_exact(disc)¶
 Calculates contour events of the cells and its neighbors.
- Parameters
 disc (
bet.sample.discretization) – An object containing the discretization information.- Return type
 list
- Returns
 list of lists of neighboring cells
- class bet.calculateP.calculateError.model_error(disc)¶
 Bases:
objectA class for calculating the error due to numerical error for a discretization.
- calculate_for_contour_events()¶
 Calculate the numerical error for each contour event.
- Return type
 list
- Returns
 er_list, a list of the error estimates for each contour event.
- calculate_for_sample_set_region(s_set, region, emulated_set=None)¶
 Calculate the numerical error estimate for a region of the input space defined by a sample set object.
- Parameters
 s_set (
bet.sample.sample_set_base) – sample set for which to calculate errorregion (int) – region of s_set for which to calculate error
emulated_set – sample set for volume emulation
- Return type
 - Returns
 er_est, the numerical error estimate for the region
- calculate_for_sample_set_region_mc(s_set, region)¶
 Calculate the numerical error estimate for a region of the input space defined by a sample set object, using the MC assumption.
- Parameters
 s_set (
bet.sample.sample_set_base) – sample set for which to calculate errorregion (int) – region of s_set for which to calculate error
:rtype float :returns:
er_est, the numerical error estimate for the region
- disc¶
 bet.sample.discretiztiondefining the problem
- disc_new¶
 bet.sample.discretiztionfrom adding error estimates
- num¶
 number of inputs and outputs
- class bet.calculateP.calculateError.sampling_error(disc, exact=True)¶
 Bases:
objectA class for calculating the error due to sampling for a discretization.
- B_N¶
 dictionaries of interior and boundary sets
- C_N¶
 dictionaries of interior and boundary sets
- calculate_for_contour_events()¶
 Calculate the sampling error bounds for each contour event. If volumes are already calculated these are used. If not, volume emulation is used if an emulated input set exists. Otherwise the MC assumption is made.
- Return type
 tuple
- Returns
 (
up_list,low_list) whereup_listis a list of the upper bounds for each contour event andlow_listis a list of the lower bounds.
- calculate_for_sample_set_region(s_set, region, emulated_set=None)¶
 Calculate the sampling error bounds for a region of the input space defined by a sample set object which defines an event \(A\).
- Parameters
 s_set (
bet.sample.sample_set_base) – sample set for which to calculate errorregion (int) – region of s_set for which to calculate error
emulated_set (
bet.sample_set_base) – sample set for volume emulation
- Return type
 tuple
- Returns
 (
upper_bound,lower_bound) the upper and lower bounds for the error.
- disc¶
 bet.sample.discretizationthat defines the problem
- num¶
 number of inputs and outputs
- exception bet.calculateP.calculateError.wrong_argument_type¶
 Bases:
ExceptionException for when the argument is not one of the acceptible types.
bet.calculateP.calculateP module¶
This module provides methods for calculating the probability measure \(P_{\Lambda}\).
prob_on_emulated_samplesprovides a skeleton class and calculates the probability for a set of emulation points.probestimates the probability based on pre-defined volumes.prob_with_emulatedestimates the probability using volume emulation.prob_from_sample_setestimates the probability based on probabilities from another
sample set on the same space.
- bet.calculateP.calculateP.prob(discretization, globalize=True)¶
 Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples}})\), the probability associated with a set of cells defined by the model solves at \((\lambda_{samples})\) where the volumes of these cells are provided.
- Parameters
 discretization (class:bet.sample.discretization) – An object containing the discretization information.
globalize (bool) – Makes local variables global.
- bet.calculateP.calculateP.prob_from_discretization_input(disc, set_new)¶
 Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) where \(\lambda_{samples_old}\) come from an input discretization.
- Parameters
 disc (
discretization) – Discretiztion on which probabilities have already been calculatedset_new (
sample_set_base) – Sample set for which probabilities will be calculated.
- bet.calculateP.calculateP.prob_from_sample_set(set_old, set_new)¶
 Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) using the MC assumption with respect to set_old.
- Parameters
 set_old (
sample_set_base) – Sample set on which probabilities have already been calculatedset_new (
sample_set_base) – Sample set for which probabilities will be calculated.
- bet.calculateP.calculateP.prob_from_sample_set_with_emulated_volumes(set_old, set_new, set_emulate=None)¶
 Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) using a set of emulated points are distributed with respect to the volume measure.
- Parameters
 set_old (
sample_set_base) – Sample set on which probabilities have already been calculatedset_new (
sample_set_base) – Sample set for which probabilities will be calculated.set_emulate (
sample_set_base) – Sample set for volume emulation
- bet.calculateP.calculateP.prob_on_emulated_samples(discretization, globalize=True)¶
 Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{emulate}})\), the probability associated with a set of Voronoi cells defined by
num_l_emulateiid samples \((\lambda_{emulate})\). This is added to the emulated input sample set object.- Parameters
 discretization (class:bet.sample.discretization) – An object containing the discretization information.
globalize (bool) – Makes local variables global.
- bet.calculateP.calculateP.prob_with_emulated_volumes(discretization)¶
 Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples}})\), the probability associated with a set of cells defined by the model solves at \((\lambda_{samples})\) where the volumes are calculated with the given emulated input points.
- Parameters
 discretization (class:bet.sample.discretization) – An object containing the discretization information.
globalize – Makes local variables global.
bet.calculateP.calculateR module¶
This module contains functions for the density-based approach that utilizes a ratio of observed to predicted densities to update an initial density on the parameter space.
generate_output_kdes()generates KDEs on output sets.invert_to_kde()solves SIP for weighted KDEs.invert_to_gmm()solves SIP for a Gaussian Mixture Model.invert_to_multivariate_gaussian()solves SIP for a multivariate Gaussian.invert_to_random_variable()solves SIP for random variables.invert_rejection_sampling()solves SIP with rejection sampling.
- bet.calculateP.calculateR.generate_output_kdes(discretization, bw_method=None)¶
 Generate Kernel Density Estimates on predicted and observed output sample sets.
- Parameters
 discretization (
bet.sample.discretization) – Discretization used to calculate KDesbw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
 prediction set, prediction kdes, observation set, observation kdes, number of clusters
- Return type
 bet.discretization.sample_set, list,bet.discretization.sample_set, list, int
- bet.calculateP.calculateR.invert(discretization, bw_method=None)¶
 Solve the data consistent stochastic inverse problem, solving for input sample weights.
- Parameters
 discretization (
bet.sample.discretization) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
 acceptance rate, mean acceptance rate, pointers for samples to clusters
- Return type
 list, np.ndarray, list
- bet.calculateP.calculateR.invert_rejection_sampling(discretization, bw_method=None)¶
 Solve the data consistent stochastic inverse problem by rejection sampling.
- Parameters
 discretization (
bet.sample.discretization) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
 sample set containing samples
- Return type
 
- bet.calculateP.calculateR.invert_to_gmm(discretization, bw_method=None)¶
 Solve the data consistent stochastic inverse problem, solving for a Gaussian mixture model.
- Parameters
 discretization (
bet.sample.discretization) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
 means, covariances, and weights for Gaussians
- Return type
 list, list, list
- bet.calculateP.calculateR.invert_to_kde(discretization, bw_method=None)¶
 Solve the data consistent stochastic inverse problem, solving for a weighted kernel density estimate.
- Parameters
 discretization (
bet.sample.discretization) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
 marginal probabilities and cluster weights
- Return type
 list, np.ndarray
- bet.calculateP.calculateR.invert_to_multivariate_gaussian(discretization, bw_method=None)¶
 Solve the data consistent stochastic inverse problem, solving for a multivariate Gaussian.
- Parameters
 discretization (
bet.sample.discretization) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
 marginal probabilities and cluster weights
- Return type
 list, np.ndarray
- bet.calculateP.calculateR.invert_to_random_variable(discretization, rv, num_reweighted=10000, bw_method=None)¶
 Solve the data consistent stochastic inverse problem, fitting a random variable.
rv can take multiple types of formats depending on type of distribution.
A string is used for the same distribution with default parameters in each dimension. ex. rv = ‘uniform’ or rv = ‘beta’
A list or tuple of length 2 is used for the same distribution with fixed user-defined parameters in each dimension as a dictionary. ex. rv = [‘uniform’, {‘floc’:-2, ‘fscale’:5}] or rv = [‘beta’, {‘fa’: 2, ‘fb’:5, ‘floc’:-2, ‘fscale’:5}]
A list of length dim which entries of lists or tuples of length 2 is used for different distributions with fixed user-defined parameters in each dimension as a dictionary. ex. rv = [[‘uniform’, {‘floc’:-2, ‘fscale’:5}],
[‘beta’, {‘fa’: 2, ‘fb’:5, ‘floc’:-2, ‘fscale’:5}]]
- Parameters
 discretization (
bet.sample.discretization) – Discretization on which to perform inversion.rv (str, list, or tuple) – Type and parameters for continuous random variables.
num_reweighted (int) – number of reweighted samples for fitting
bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
 marginal probabilities and cluster weights
- Return type
 list, np.ndarray
bet.calculateP.simpleFunP module¶
This module provides methods for creating simple function approximations to be
used by calculateP. These simple function approximations
are returned as bet.sample.sample_set objects.
- bet.calculateP.simpleFunP.check_inputs(data_set, Q_ref)¶
 Checks inputs to methods.
- bet.calculateP.simpleFunP.check_inputs_no_reference(data_set)¶
 Checks inputs to methods.
- bet.calculateP.simpleFunP.check_type(val, data_set=None)¶
 Add support for different data types that can be passed as keyword arguments. Attempt to infer dimension and set it correctly.
- bet.calculateP.simpleFunP.infer_Q(data_set)¶
 Attempt to infer reference value around which to define a sample set.
- bet.calculateP.simpleFunP.normal_partition_normal_distribution(data_set, Q_ref=None, std=1, M=1, num_d_emulate=1000000.0)¶
 Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a multivariate normal probability density centered at
Q_refwith standard deviationstdusingMbins sampled from the given normal distribution.- Parameters
 data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
Q_ref (
ndarrayof size (mdim,)) – \(Q(\lambda_{reference})\)std (
ndarrayof size (mdim,)) – The standard deviation of each QoI
- Return type
 - Returns
 sample_set object defining simple function approximation
- bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_domain(data_set, rect_domain, cells_per_dimension=1)¶
 Creates a simple function appoximation of \(\rho_{\mathcal{D},M}\) where \(\rho{\mathcal{D}, M}\) is a uniform probablity density over the hyperrectangular domain specified by
rect_domain.Since \(\rho_\mathcal{D}\) is a uniform distribution on a hyperrectangle we are able to represent it exactly.
- Parameters
 data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.rect_domain (
numpy.ndarrayof shape (2, mdim)) – The domain overwhich \(\rho_\mathcal{D}\) is uniform.cells_per_dimension (list) – number of cells per dimension
- Return type
 - Returns
 sample_set object defining simple function approximation
- bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_scaled(data_set, Q_ref=None, rect_scale=1, cells_per_dimension=1)¶
 Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density centered at
Q_ref(or thereference_valueof a sample set. IfQ_refis not given the reference value is used.) withrect_scaleof the width of D.Since rho_D is a uniform distribution on a hyperrectanlge we are able to represent it exactly.
- Parameters
 data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.rect_scale (double or list) – The scale used to determine the width of the uniform distribution as
rect_size = (data_max-data_min)*rect_scaleQ_ref (
ndarrayof size (mdim,)) – \(Q(\lambda_{reference})\)cells_per_dimension (list) – number of cells per dimension
- Return type
 - Returns
 sample_set object defining simple function approximation
- bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_size(data_set, Q_ref=None, rect_size=None, cells_per_dimension=1)¶
 Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density centered at
Q_ref(or thereference_valueof a sample set. IfQ_refis not given the reference value is used) withrect_sizeof the width of a hyperrectangle.Since rho_D is a uniform distribution on a hyperrectanlge we can represent it exactly.
- Parameters
 rect_size (double or list) – The size used to determine the width of the uniform distribution
data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.Q_ref (
ndarrayof size (mdim,)) – \(Q(\lambda_{reference})\)cells_per_dimension (list) – number of cells per dimension.
- Return type
 - Returns
 sample_set object defining simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_normal_distribution(data_set, Q_ref=None, std=1, M=1, num_d_emulate=1000000.0)¶
 Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a multivariate normal probability density centered at
Q_refwith standard deviationstdusingMbins sampled from a uniform distribution with a size 4 standard deviations in each direction.- Parameters
 data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
Q_ref (
ndarrayof size (mdim,)) – \(Q(\lambda_{reference})\)std (
ndarrayof size (mdim,)) – The standard deviation of each QoI
- Return type
 - Returns
 sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_data_samples(data_set)¶
 Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density over the entire
data_domain. Here thedata_domainis the union of Voronoi cells defined bydata. In other words we assign each sample the same probability, soM = len(data)or ratherlen(d_distr_samples) == len(data). The purpose of this method is to approximate uniform distributions over irregularly shaped domains.- Parameters
 data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.- Return type
 - Returns
 sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_domain(data_set, rect_domain, M=50, num_d_emulate=1000000.0)¶
 Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle defined by
rect_domain. The simple function approximation is then defined by determiningMVoronoi cells (i.e., “bins”) partitioning \(\mathcal{D}\). These bins are only implicitly defined byM ``samples in :math:`\mathcal{D}`. Finally, the probabilities of each of these bins is computed by sampling from :math:`\rho{\mathcal{D}}` and using nearest neighbor searches to bin these samples in the ``Mimplicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).
- Parameters
 M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
rect_domain (double or list) – The support of the density
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.Q_ref (
ndarrayof size (mdim,)) – \(Q(\)lambda_{reference})`
- Return type
 - Returns
 sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_scaled(data_set, Q_ref=None, rect_scale=0.2, M=50, num_d_emulate=1000000.0)¶
 Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle centered at
Q_refor thereference_valueof a sample set. IfQ_refis not given the reference value is used.. The support of this density is defined byrect_scale, which determines the size of the generalized rectangle by scaling the circumscribing generalized rectangle of \(\mathcal{D}\). The simple function approximation is then defined by determiningM `` Voronoi cells (i.e., "bins") partitioning :math:`\mathcal{D}`. These bins are only implicitly defined by ``Msamples in \(\mathcal{D}\). Finally, the probabilities of each of these bins is computed by sampling from \(\rho{\mathcal{D}}\) and using nearest neighbor searches to bin these samples in theMimplicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).
- Parameters
 M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
rect_scale (double or list) – The scale used to determine the support of the uniform distribution as
rect_size = (data_max-data_min)*rect_scalenum_d_emulate (int) – Number of samples used to emulate using an MC assumption
data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.Q_ref (
ndarrayof size (mdim,)) – \(Q(\)lambda_{reference})`
- Return type
 - Returns
 sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_size(data_set, Q_ref=None, rect_size=None, M=50, num_d_emulate=1000000.0)¶
 Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle centered at
Q_refor thereference_valueof a sample set. IfQ_refis not given the reference value is used. The support of this density is defined byrect_size, which determines the size of the generalized rectangle. The simple function approximation is then defined by determiningMVoronoi cells (i.e., “bins”) partitioning \(\mathcal{D}\). These bins are only implicitly defined byMsamples in \(\mathcal{D}\). Finally, the probabilities of each of these bins is computed by sampling from \(\rho{\mathcal{D}}\) and using nearest neighbor searches to bin these samples in theMimplicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).Note
data_setis only used to determine dimension.Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).
- Parameters
 M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
rect_size (double or list) – Determines the size of the support of the uniform distribution on a generalized rectangle
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.Q_ref (
ndarrayof size (mdim,)) – \(Q(\)lambda_{reference})`
- Return type
 - Returns
 sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.user_partition_user_distribution(data_set, data_partition_set, data_distribution_set)¶
 Creates a user defined simple function approximation of a user defined distribution. The simple function discretization is specified in the
data_partition_set, and the set of i.i.d. samples from the distribution is specified in thedata_distribution_set.- Parameters
 data_set (
discretizationorsample_setorndarray) – Sample set that the probability measure is defined for.data_partition_set (
discretizationorsample_setorndarray) – Sample set defining the discretization of the data space into Voronoi cells for which a simple function is defined upon.data_distribution_set (
discretizationorsample_setorndarray) – Sample set containing the i.i.d. samples from the distribution on the data space that are binned within the Voronoi cells implicitly defined by the data_discretization_set.
- Return type
 - Returns
 sample_set object defininng simple function approximation
- exception bet.calculateP.simpleFunP.wrong_argument_type¶
 Bases:
ExceptionException for when the argument for data_set is not one of the acceptible types.
Module contents¶
This subpackage provides classes and methods for calculating the probability measure \(P_{\Lambda}\).
calculatePprovides methods for approximating probability densities in the measure-based approach.simpleFunPprovides methods for creating simple function approximations of probability densities for the measure-based approach.calculateRprovides methods for density-based approach.calculateErrorprovides methods for approximating numerical and sampling errors.