[WIP] Add marginal likelihood estimation via bridge sampling #2040
Co-authored-by: @junpenglao
Description
Adds an estimate of the (log) marginal likelihood computed via bridge sampling (as described in Gronau, Quentin F., et al., 2017), building on an implementation from @junpenglao. This could be expanded to add Bayes factor functionality if desired.
The bridge sampler uses samples from the posterior, so the `log_marginal_likelihood_bridgesampling` function takes as a parameter an InferenceData object that has a posterior group, as well as an unnormalized log probability function (e.g. `model.logp_array` from a pymc model).
Because we fit a multivariate normal proposal distribution to the posterior samples, it is helpful to have samples that are transformed, e.g. to have support on the real line instead of on a bounded interval. Although these transformed samples are created as part of the NUTS sampling, I believe they're not currently included in InferenceData (see issue #230). So, `log_marginal_likelihood_bridgesampling` currently takes a dict whose keys are variable names and whose values are the associated transformation (or the identity). You could build this dict from a pymc model, although maybe there's a better way.
Curious to hear any thoughts or feedback! I'm happy to write tests for this as well, but wanted to wait to get initial feedback before doing so.
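To make the estimator concrete for reviewers, here is a minimal, self-contained sketch of the iterative bridge sampling scheme described in Gronau et al. (2017), using only NumPy and SciPy. It is not the code in this PR: the function name `bridge_log_marginal`, its arguments, and the toy Gaussian target are all hypothetical and chosen for illustration. The toy target already has support on the whole real line, so no transformation dict is needed here.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal


def bridge_log_marginal(samples, logp, n_iter=1000, tol=1e-10, seed=0):
    """Hypothetical helper: log normalizing constant of exp(logp) by bridge sampling.

    samples: array of shape (n_draws, n_dims), draws from the normalized target.
    logp:    callable returning the unnormalized log density of one point.
    """
    rng = np.random.default_rng(seed)

    # Use the first half of the draws to fit the multivariate normal proposal
    # and keep the second half for the iterative scheme.
    half = samples.shape[0] // 2
    fit, post = samples[:half], samples[half:]
    proposal = multivariate_normal(
        mean=fit.mean(axis=0), cov=np.cov(fit, rowvar=False)
    )

    # Draw as many proposal samples as retained posterior samples.
    prop = proposal.rvs(size=post.shape[0], random_state=rng).reshape(post.shape)
    n1, n2 = post.shape[0], prop.shape[0]

    # log of l = (unnormalized target density) / (proposal density)
    log_l1 = np.array([logp(x) for x in post]) - proposal.logpdf(post)
    log_l2 = np.array([logp(x) for x in prop]) - proposal.logpdf(prop)

    log_s1 = np.log(n1 / (n1 + n2))
    log_s2 = np.log(n2 / (n1 + n2))

    # Fixed-point iteration, carried out entirely in log space:
    # r <- mean_j[l2_j / (s1*l2_j + s2*r)] / mean_i[1 / (s1*l1_i + s2*r)]
    log_r = 0.0
    for _ in range(n_iter):
        log_num = logsumexp(
            log_l2 - np.logaddexp(log_s1 + log_l2, log_s2 + log_r)
        ) - np.log(n2)
        log_den = logsumexp(
            -np.logaddexp(log_s1 + log_l1, log_s2 + log_r)
        ) - np.log(n1)
        new_log_r = log_num - log_den
        converged = abs(new_log_r - log_r) < tol
        log_r = new_log_r
        if converged:
            break
    return log_r


# Toy check: an unnormalized 2-d Gaussian whose normalizer is known exactly.
sigma, dim = 2.0, 2


def toy_logp(x):
    # Unnormalized log density of N(0, sigma^2 I).
    return -0.5 * np.sum(x ** 2) / sigma ** 2


draws = np.random.default_rng(1).normal(scale=sigma, size=(4000, dim))
estimate = bridge_log_marginal(draws, toy_logp)
true_log_z = 0.5 * dim * np.log(2 * np.pi * sigma ** 2)
print(f"estimate={estimate:.4f}  exact={true_log_z:.4f}")
```

Splitting the draws, with one half used to fit the proposal and the other half used in the iteration, follows the recommendation in the paper. For a model with bounded parameters, the draws would first need to be mapped to the real line and `logp` would need to include the corresponding log-Jacobian term, which is the role of the transformation dict described above.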
Checklist