bet.calculateP package¶
Submodules¶
bet.calculateP.calculateError module¶
This module provides methods for calculating error estimates of the probability measure for calculate probability measures. See Butler et al. 2015. <http://arxiv.org/pdf/1407.3851>.
cell_connectivity_exact
calculates the connectivity of cells.boundary_sets
calculates which cells are on the boundary and strictly interior for contour events.sampling_error
is for calculating error estimates due to sampling.model_error
is for calculating error estimates due to error in solution of QoIs
- bet.calculateP.calculateError.boundary_sets(disc, nei_list)¶
Calculates the the neighboring Voronoi cells for each cell.
- Parameters
disc (
bet.sample.discretization
) – An object containing the discretization information.nei_list (list) – list of lists defining contour events of neighboring cells.
- Return type
tuple
- Returns
(\(B_N, C_N\)) where B_N are the cells strictly on the interior of a contour event and C_N are the cells on the boundary of a contour eventas defined in Butler et al. 2015. <http://arxiv.org/pdf/1407.3851>
- bet.calculateP.calculateError.cell_connectivity_exact(disc)¶
Calculates contour events of the cells and its neighbors.
- Parameters
disc (
bet.sample.discretization
) – An object containing the discretization information.- Return type
list
- Returns
list of lists of neighboring cells
- class bet.calculateP.calculateError.model_error(disc)¶
Bases:
object
A class for calculating the error due to numerical error for a discretization.
- calculate_for_contour_events()¶
Calculate the numerical error for each contour event.
- Return type
list
- Returns
er_list
, a list of the error estimates for each contour event.
- calculate_for_sample_set_region(s_set, region, emulated_set=None)¶
Calculate the numerical error estimate for a region of the input space defined by a sample set object.
- Parameters
s_set (
bet.sample.sample_set_base
) – sample set for which to calculate errorregion (int) – region of s_set for which to calculate error
emulated_set – sample set for volume emulation
- Return type
- Returns
er_est
, the numerical error estimate for the region
- calculate_for_sample_set_region_mc(s_set, region)¶
Calculate the numerical error estimate for a region of the input space defined by a sample set object, using the MC assumption.
- Parameters
s_set (
bet.sample.sample_set_base
) – sample set for which to calculate errorregion (int) – region of s_set for which to calculate error
:rtype float :returns:
er_est
, the numerical error estimate for the region
- disc¶
bet.sample.discretiztion
defining the problem
- disc_new¶
bet.sample.discretiztion
from adding error estimates
- num¶
number of inputs and outputs
- class bet.calculateP.calculateError.sampling_error(disc, exact=True)¶
Bases:
object
A class for calculating the error due to sampling for a discretization.
- B_N¶
dictionaries of interior and boundary sets
- C_N¶
dictionaries of interior and boundary sets
- calculate_for_contour_events()¶
Calculate the sampling error bounds for each contour event. If volumes are already calculated these are used. If not, volume emulation is used if an emulated input set exists. Otherwise the MC assumption is made.
- Return type
tuple
- Returns
(
up_list
,low_list
) whereup_list
is a list of the upper bounds for each contour event andlow_list
is a list of the lower bounds.
- calculate_for_sample_set_region(s_set, region, emulated_set=None)¶
Calculate the sampling error bounds for a region of the input space defined by a sample set object which defines an event \(A\).
- Parameters
s_set (
bet.sample.sample_set_base
) – sample set for which to calculate errorregion (int) – region of s_set for which to calculate error
emulated_set (
bet.sample_set_base
) – sample set for volume emulation
- Return type
tuple
- Returns
(
upper_bound
,lower_bound
) the upper and lower bounds for the error.
- disc¶
bet.sample.discretization
that defines the problem
- num¶
number of inputs and outputs
- exception bet.calculateP.calculateError.wrong_argument_type¶
Bases:
Exception
Exception for when the argument is not one of the acceptible types.
bet.calculateP.calculateP module¶
This module provides methods for calculating the probability measure \(P_{\Lambda}\).
prob_on_emulated_samples
provides a skeleton class and calculates the probability for a set of emulation points.prob
estimates the probability based on pre-defined volumes.prob_with_emulated
estimates the probability using volume emulation.prob_from_sample_set
estimates the probability based on probabilities from another
sample set on the same space.
- bet.calculateP.calculateP.prob(discretization, globalize=True)¶
Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples}})\), the probability associated with a set of cells defined by the model solves at \((\lambda_{samples})\) where the volumes of these cells are provided.
- Parameters
discretization (class:bet.sample.discretization) – An object containing the discretization information.
globalize (bool) – Makes local variables global.
- bet.calculateP.calculateP.prob_from_discretization_input(disc, set_new)¶
Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) where \(\lambda_{samples_old}\) come from an input discretization.
- Parameters
disc (
discretization
) – Discretiztion on which probabilities have already been calculatedset_new (
sample_set_base
) – Sample set for which probabilities will be calculated.
- bet.calculateP.calculateP.prob_from_sample_set(set_old, set_new)¶
Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) using the MC assumption with respect to set_old.
- Parameters
set_old (
sample_set_base
) – Sample set on which probabilities have already been calculatedset_new (
sample_set_base
) – Sample set for which probabilities will be calculated.
- bet.calculateP.calculateP.prob_from_sample_set_with_emulated_volumes(set_old, set_new, set_emulate=None)¶
Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_new}})\) from \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples_old}})\) using a set of emulated points are distributed with respect to the volume measure.
- Parameters
set_old (
sample_set_base
) – Sample set on which probabilities have already been calculatedset_new (
sample_set_base
) – Sample set for which probabilities will be calculated.set_emulate (
sample_set_base
) – Sample set for volume emulation
- bet.calculateP.calculateP.prob_on_emulated_samples(discretization, globalize=True)¶
Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{emulate}})\), the probability associated with a set of Voronoi cells defined by
num_l_emulate
iid samples \((\lambda_{emulate})\). This is added to the emulated input sample set object.- Parameters
discretization (class:bet.sample.discretization) – An object containing the discretization information.
globalize (bool) – Makes local variables global.
- bet.calculateP.calculateP.prob_with_emulated_volumes(discretization)¶
Calculates \(P_{\Lambda}(\mathcal{V}_{\lambda_{samples}})\), the probability associated with a set of cells defined by the model solves at \((\lambda_{samples})\) where the volumes are calculated with the given emulated input points.
- Parameters
discretization (class:bet.sample.discretization) – An object containing the discretization information.
globalize – Makes local variables global.
bet.calculateP.calculateR module¶
This module contains functions for the density-based approach that utilizes a ratio of observed to predicted densities to update an initial density on the parameter space.
generate_output_kdes()
generates KDEs on output sets.invert_to_kde()
solves SIP for weighted KDEs.invert_to_gmm()
solves SIP for a Gaussian Mixture Model.invert_to_multivariate_gaussian()
solves SIP for a multivariate Gaussian.invert_to_random_variable()
solves SIP for random variables.invert_rejection_sampling()
solves SIP with rejection sampling.
- bet.calculateP.calculateR.generate_output_kdes(discretization, bw_method=None)¶
Generate Kernel Density Estimates on predicted and observed output sample sets.
- Parameters
discretization (
bet.sample.discretization
) – Discretization used to calculate KDesbw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
prediction set, prediction kdes, observation set, observation kdes, number of clusters
- Return type
bet.discretization.sample_set
, list,bet.discretization.sample_set
, list, int
- bet.calculateP.calculateR.invert(discretization, bw_method=None)¶
Solve the data consistent stochastic inverse problem, solving for input sample weights.
- Parameters
discretization (
bet.sample.discretization
) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
acceptance rate, mean acceptance rate, pointers for samples to clusters
- Return type
list, np.ndarray, list
- bet.calculateP.calculateR.invert_rejection_sampling(discretization, bw_method=None)¶
Solve the data consistent stochastic inverse problem by rejection sampling.
- Parameters
discretization (
bet.sample.discretization
) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
sample set containing samples
- Return type
- bet.calculateP.calculateR.invert_to_gmm(discretization, bw_method=None)¶
Solve the data consistent stochastic inverse problem, solving for a Gaussian mixture model.
- Parameters
discretization (
bet.sample.discretization
) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
means, covariances, and weights for Gaussians
- Return type
list, list, list
- bet.calculateP.calculateR.invert_to_kde(discretization, bw_method=None)¶
Solve the data consistent stochastic inverse problem, solving for a weighted kernel density estimate.
- Parameters
discretization (
bet.sample.discretization
) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
marginal probabilities and cluster weights
- Return type
list, np.ndarray
- bet.calculateP.calculateR.invert_to_multivariate_gaussian(discretization, bw_method=None)¶
Solve the data consistent stochastic inverse problem, solving for a multivariate Gaussian.
- Parameters
discretization (
bet.sample.discretization
) – Discretization on which to perform inversion.bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
marginal probabilities and cluster weights
- Return type
list, np.ndarray
- bet.calculateP.calculateR.invert_to_random_variable(discretization, rv, num_reweighted=10000, bw_method=None)¶
Solve the data consistent stochastic inverse problem, fitting a random variable.
rv can take multiple types of formats depending on type of distribution.
A string is used for the same distribution with default parameters in each dimension. ex. rv = ‘uniform’ or rv = ‘beta’
A list or tuple of length 2 is used for the same distribution with fixed user-defined parameters in each dimension as a dictionary. ex. rv = [‘uniform’, {‘floc’:-2, ‘fscale’:5}] or rv = [‘beta’, {‘fa’: 2, ‘fb’:5, ‘floc’:-2, ‘fscale’:5}]
A list of length dim which entries of lists or tuples of length 2 is used for different distributions with fixed user-defined parameters in each dimension as a dictionary. ex. rv = [[‘uniform’, {‘floc’:-2, ‘fscale’:5}],
[‘beta’, {‘fa’: 2, ‘fb’:5, ‘floc’:-2, ‘fscale’:5}]]
- Parameters
discretization (
bet.sample.discretization
) – Discretization on which to perform inversion.rv (str, list, or tuple) – Type and parameters for continuous random variables.
num_reweighted (int) – number of reweighted samples for fitting
bw_method (str) – bandwidth method for scipy.stats.gaussian_kde. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html.
- Returns
marginal probabilities and cluster weights
- Return type
list, np.ndarray
bet.calculateP.simpleFunP module¶
This module provides methods for creating simple function approximations to be
used by calculateP
. These simple function approximations
are returned as bet.sample.sample_set objects.
- bet.calculateP.simpleFunP.check_inputs(data_set, Q_ref)¶
Checks inputs to methods.
- bet.calculateP.simpleFunP.check_inputs_no_reference(data_set)¶
Checks inputs to methods.
- bet.calculateP.simpleFunP.check_type(val, data_set=None)¶
Add support for different data types that can be passed as keyword arguments. Attempt to infer dimension and set it correctly.
- bet.calculateP.simpleFunP.infer_Q(data_set)¶
Attempt to infer reference value around which to define a sample set.
- bet.calculateP.simpleFunP.normal_partition_normal_distribution(data_set, Q_ref=None, std=1, M=1, num_d_emulate=1000000.0)¶
Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a multivariate normal probability density centered at
Q_ref
with standard deviationstd
usingM
bins sampled from the given normal distribution.- Parameters
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
Q_ref (
ndarray
of size (mdim,)) – \(Q(\lambda_{reference})\)std (
ndarray
of size (mdim,)) – The standard deviation of each QoI
- Return type
- Returns
sample_set object defining simple function approximation
- bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_domain(data_set, rect_domain, cells_per_dimension=1)¶
Creates a simple function appoximation of \(\rho_{\mathcal{D},M}\) where \(\rho{\mathcal{D}, M}\) is a uniform probablity density over the hyperrectangular domain specified by
rect_domain
.Since \(\rho_\mathcal{D}\) is a uniform distribution on a hyperrectangle we are able to represent it exactly.
- Parameters
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.rect_domain (
numpy.ndarray
of shape (2, mdim)) – The domain overwhich \(\rho_\mathcal{D}\) is uniform.cells_per_dimension (list) – number of cells per dimension
- Return type
- Returns
sample_set object defining simple function approximation
- bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_scaled(data_set, Q_ref=None, rect_scale=1, cells_per_dimension=1)¶
Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density centered at
Q_ref
(or thereference_value
of a sample set. IfQ_ref
is not given the reference value is used.) withrect_scale
of the width of D.Since rho_D is a uniform distribution on a hyperrectanlge we are able to represent it exactly.
- Parameters
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.rect_scale (double or list) – The scale used to determine the width of the uniform distribution as
rect_size = (data_max-data_min)*rect_scale
Q_ref (
ndarray
of size (mdim,)) – \(Q(\lambda_{reference})\)cells_per_dimension (list) – number of cells per dimension
- Return type
- Returns
sample_set object defining simple function approximation
- bet.calculateP.simpleFunP.regular_partition_uniform_distribution_rectangle_size(data_set, Q_ref=None, rect_size=None, cells_per_dimension=1)¶
Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density centered at
Q_ref
(or thereference_value
of a sample set. IfQ_ref
is not given the reference value is used) withrect_size
of the width of a hyperrectangle.Since rho_D is a uniform distribution on a hyperrectanlge we can represent it exactly.
- Parameters
rect_size (double or list) – The size used to determine the width of the uniform distribution
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.Q_ref (
ndarray
of size (mdim,)) – \(Q(\lambda_{reference})\)cells_per_dimension (list) – number of cells per dimension.
- Return type
- Returns
sample_set object defining simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_normal_distribution(data_set, Q_ref=None, std=1, M=1, num_d_emulate=1000000.0)¶
Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a multivariate normal probability density centered at
Q_ref
with standard deviationstd
usingM
bins sampled from a uniform distribution with a size 4 standard deviations in each direction.- Parameters
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
Q_ref (
ndarray
of size (mdim,)) – \(Q(\lambda_{reference})\)std (
ndarray
of size (mdim,)) – The standard deviation of each QoI
- Return type
- Returns
sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_data_samples(data_set)¶
Creates a simple function approximation of \(\rho_{\mathcal{D},M}\) where \(\rho_{\mathcal{D},M}\) is a uniform probability density over the entire
data_domain
. Here thedata_domain
is the union of Voronoi cells defined bydata
. In other words we assign each sample the same probability, soM = len(data)
or ratherlen(d_distr_samples) == len(data)
. The purpose of this method is to approximate uniform distributions over irregularly shaped domains.- Parameters
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.- Return type
- Returns
sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_domain(data_set, rect_domain, M=50, num_d_emulate=1000000.0)¶
Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle defined by
rect_domain
. The simple function approximation is then defined by determiningM
Voronoi cells (i.e., “bins”) partitioning \(\mathcal{D}\). These bins are only implicitly defined byM ``samples in :math:`\mathcal{D}`. Finally, the probabilities of each of these bins is computed by sampling from :math:`\rho{\mathcal{D}}` and using nearest neighbor searches to bin these samples in the ``M
implicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).
- Parameters
M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
rect_domain (double or list) – The support of the density
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.Q_ref (
ndarray
of size (mdim,)) – \(Q(\)lambda_{reference})`
- Return type
- Returns
sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_scaled(data_set, Q_ref=None, rect_scale=0.2, M=50, num_d_emulate=1000000.0)¶
Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle centered at
Q_ref
or thereference_value
of a sample set. IfQ_ref
is not given the reference value is used.. The support of this density is defined byrect_scale
, which determines the size of the generalized rectangle by scaling the circumscribing generalized rectangle of \(\mathcal{D}\). The simple function approximation is then defined by determiningM `` Voronoi cells (i.e., "bins") partitioning :math:`\mathcal{D}`. These bins are only implicitly defined by ``M
samples in \(\mathcal{D}\). Finally, the probabilities of each of these bins is computed by sampling from \(\rho{\mathcal{D}}\) and using nearest neighbor searches to bin these samples in theM
implicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).
- Parameters
M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
rect_scale (double or list) – The scale used to determine the support of the uniform distribution as
rect_size = (data_max-data_min)*rect_scale
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.Q_ref (
ndarray
of size (mdim,)) – \(Q(\)lambda_{reference})`
- Return type
- Returns
sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.uniform_partition_uniform_distribution_rectangle_size(data_set, Q_ref=None, rect_size=None, M=50, num_d_emulate=1000000.0)¶
Creates a simple function approximation of \(\rho_{\mathcal{D}}\) where \(\rho_{\mathcal{D}}\) is a uniform probability density on a generalized rectangle centered at
Q_ref
or thereference_value
of a sample set. IfQ_ref
is not given the reference value is used. The support of this density is defined byrect_size
, which determines the size of the generalized rectangle. The simple function approximation is then defined by determiningM
Voronoi cells (i.e., “bins”) partitioning \(\mathcal{D}\). These bins are only implicitly defined byM
samples in \(\mathcal{D}\). Finally, the probabilities of each of these bins is computed by sampling from \(\rho{\mathcal{D}}\) and using nearest neighbor searches to bin these samples in theM
implicitly defined bins. The result is the simple function approximation denoted by \(\rho_{\mathcal{D},M}\).Note
data_set
is only used to determine dimension.Note that all computations in the measure-based approach that follow from this are for the fixed simple function approximation \(\rho_{\mathcal{D},M}\).
- Parameters
M (int) – Defines number M samples in D used to define \(\rho_{\mathcal{D},M}\) The choice of M is something of an “art” - play around with it and you can get reasonable results with a relatively small number here like 50.
rect_size (double or list) – Determines the size of the support of the uniform distribution on a generalized rectangle
num_d_emulate (int) – Number of samples used to emulate using an MC assumption
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.Q_ref (
ndarray
of size (mdim,)) – \(Q(\)lambda_{reference})`
- Return type
- Returns
sample_set object defininng simple function approximation
- bet.calculateP.simpleFunP.user_partition_user_distribution(data_set, data_partition_set, data_distribution_set)¶
Creates a user defined simple function approximation of a user defined distribution. The simple function discretization is specified in the
data_partition_set
, and the set of i.i.d. samples from the distribution is specified in thedata_distribution_set
.- Parameters
data_set (
discretization
orsample_set
orndarray
) – Sample set that the probability measure is defined for.data_partition_set (
discretization
orsample_set
orndarray
) – Sample set defining the discretization of the data space into Voronoi cells for which a simple function is defined upon.data_distribution_set (
discretization
orsample_set
orndarray
) – Sample set containing the i.i.d. samples from the distribution on the data space that are binned within the Voronoi cells implicitly defined by the data_discretization_set.
- Return type
- Returns
sample_set object defininng simple function approximation
- exception bet.calculateP.simpleFunP.wrong_argument_type¶
Bases:
Exception
Exception for when the argument for data_set is not one of the acceptible types.
Module contents¶
This subpackage provides classes and methods for calculating the probability measure \(P_{\Lambda}\).
calculateP
provides methods for approximating probability densities in the measure-based approach.simpleFunP
provides methods for creating simple function approximations of probability densities for the measure-based approach.calculateR
provides methods for density-based approach.calculateError
provides methods for approximating numerical and sampling errors.