
Integration with MeasureTheory.jl #8

Open
ablaom opened this issue Oct 30, 2021 · 5 comments

ablaom commented Oct 30, 2021

The atomic objects defined in this package are just non-negative measures over a labelled sample space (see here) and so ought to fit into the MeasureTheory.jl framework.

@cscherrer It would be great if you could give a run-down of what's required. This package is a port of functionality still in MLJBase, with plans to replace it there. While it's now publicly available, I've not promoted it at all, and there's scope for fixing things you may not like.

@davibarreira

Very nice package! I've been looking for a proper way to deal with finite discrete measures for a while, since there is no such type of distribution in Distributions.jl. Now, am I misled by the name, or is this implementation not well suited for genuinely multivariate distributions, i.e. when instead of unordered labels we have something like many samples from R^n?

@cscherrer

Hi @ablaom, thanks for the ping :)

Generally, for some d::D to be a Measure requires two things:

  1. At least one method of MeasureBase.logdensity(d::D, x) or MeasureBase.density(d::D, x), returning a float
  2. A method basemeasure(d::D), returning another value satisfying the same interface.

It's valid for a measure to have itself as a base measure, which would make that measure "primitive". In this case we should get a log-density of 0.0.
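For concreteness, here is a minimal sketch (mine, not from this thread) of what that two-method interface could look like for a toy labelled measure. It assumes the MeasureBase.logdensity / MeasureBase.basemeasure names described above; LabelCounting and FiniteLabelled are hypothetical types invented purely for illustration.

```julia
using MeasureBase

# Hypothetical "primitive" base measure over labels: it is its own base
# measure, and its log-density with respect to itself is 0.0.
struct LabelCounting end
MeasureBase.logdensity(::LabelCounting, x) = 0.0
MeasureBase.basemeasure(b::LabelCounting) = b

# Hypothetical finite measure over labels, defined relative to LabelCounting.
struct FiniteLabelled{L}
    logweights::Dict{L,Float64}   # label => log-weight
end
MeasureBase.logdensity(d::FiniteLabelled, x) = d.logweights[x]
MeasureBase.basemeasure(::FiniteLabelled) = LabelCounting()

# Usage: a two-label example
d = FiniteLabelled(Dict(:a => log(0.2), :b => log(0.8)))
MeasureBase.logdensity(d, :b)   # ≈ log(0.8)
```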

The interface is still in development, and I'd welcome collaboration. "Primitiveness" should probably be trait-based.

If A → B means "A.jl has B.jl as a dependency", I can imagine three possible setups:

  1. MeasureTheory → CategoricalDistributions. This could make sense if CD is light-weight, flexible, and fairly general, with good performance.
  2. CategoricalDistributions → MeasureBase. This would require a little more for things to work well, in that you'd need to define a basemeasure instance. That should be easy, but it also makes your package a bit heavier (but still not bad IMO, MB is pretty light-weight)
  3. MeasureBase ← CategoricalMeasures → CategoricalDistributions. Here the middle package would be a new glue package extending the other two to work well together.

In any case, in MeasureTheory we usually have a logdensity that depends only on the data, with other terms pushed into the base measure. This may not be an issue here; IIRC there's no normalization factor in this case. But it may come up if non-normalized weights are included. Anyway, if there is a normalization, I'd suggest having a logpdf method that's something like

logpdf(cd::CategoricalDistribution, x) = datadependentterms(cd, x) + normalizationterms(cd)

though probably with different names ;)

Splitting things up in this way makes it easy to optimize product measures, pulling the normalization terms out of the loop.
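As a rough illustration of that optimization (my sketch, not from the thread; datadependentterms and normalizationterms are the hypothetical helpers from the snippet above):

```julia
# Sketch: log-density of n i.i.d. observations under the same distribution.
# Per-observation terms are summed in the loop, while the normalization term
# is computed once and multiplied by n, i.e. pulled out of the loop.
function logpdf_iid(cd, xs)
    s = sum(x -> datadependentterms(cd, x), xs)
    return s + length(xs) * normalizationterms(cd)
end
```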

ablaom commented Oct 31, 2021

@cscherrer Thanks for that.

@davibarreira No, you are not misled. This is not yet multivariate.

@davibarreira

Thanks for the answer @ablaom.
