
Implement Gaussian Process as a model #53

Open

oesteban opened this issue Sep 14, 2021 · 2 comments

@oesteban (Member)

In order to compare baseline performance vs. FSL eddy.

oesteban added the effort: high (estimated high-effort task), impact: high (estimated high-impact task), and models labels on Sep 14, 2021
oesteban added this to the 0.2.0 milestone on Sep 14, 2021
oesteban self-assigned this on Sep 14, 2021
@josephmje (Contributor) commented on Dec 7, 2021

Adding notes from our 2021-12-07 meeting here:

A first stab at implementing the Gaussian process model is at #60.

At the moment, we haven't done any tuning or optimization of the model. If we are going to eventually compare it to FSL eddy, that will need to be done.

For testing, some suggestions are to use:

  • simulated data from FiberFox
  • IXI dataset
  • a high-quality dataset with certain voxels masked (perhaps near a white matter tract or some CSF in the middle of a ventricle); a rough sketch of this masking idea follows below
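
As a rough illustration of that masked-voxel evaluation (all shapes and names here are hypothetical placeholders, not eddymotion API):

import numpy as np

rng = np.random.default_rng(0)
dwi = rng.random((10, 10, 10, 60))    # hypothetical 4D dMRI volume
mask = np.zeros(dwi.shape[:3], dtype=bool)
mask[4:6, 4:6, 4:6] = True            # voxels to hide from the model

train_signal = dwi[~mask]             # (n_visible_voxels, n_gradients)
held_out_signal = dwi[mask]           # ground truth for scoring
# Fit the GP on train_signal, predict at the masked locations, then score
# the predictions against held_out_signal (e.g., RMSE per gradient direction).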

Will eventually need to add better parallelization. Currently, the DTI model chunks the data before fitting the model; we can also implement that approach here. Further down the road, we may need to dig into DIPY, and we could use joblib/dask to further improve the parallelization (a rough sketch follows below).
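
A hedged sketch of what that chunked, joblib-based fitting could look like (`fit_chunk` is a hypothetical placeholder, not the DTI model's actual code):

import numpy as np
from joblib import Parallel, delayed

def fit_chunk(chunk):
    # Placeholder per-chunk fit; a real version would fit a GP per chunk.
    return chunk.mean(axis=-1)

def fit_parallel(data, n_chunks=8, n_jobs=-1):
    # Flatten voxels, split them into chunks, and fit each chunk in parallel.
    voxels = data.reshape(-1, data.shape[-1])
    chunks = np.array_split(voxels, n_chunks)
    results = Parallel(n_jobs=n_jobs)(delayed(fit_chunk)(c) for c in chunks)
    return np.concatenate(results)

fitted = fit_parallel(np.random.rand(1000, 60))  # toy (n_voxels, n_gradients) data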

See also comments: https://github.com/nipreps/eddymotion/pull/60/files#r764144040

@dPys (Contributor) commented on Aug 27, 2022

@josephmje - Not sure if you or anyone else is still working on the GP model, but I figured I'd chime in with a few thoughts to get the conversation going again:

The GP model needs at least a proof-of-concept (PoC) design for the kernel before it can be tuned. Here's a first template for what that might look like.

High-level requirements:

  • Isotropy: uniform affinity for all angular directions.
  • Symmetry around the origin (antipodal gradient directions should be treated as equivalent).

The way most folks seem to address this in the context of q-space is to factorize the covariance function into (a) an angular part and (b) a radial part, plus (c) an additive error term: k(x, x') = k_angular(x, x') · k_radial(x, x') + σ² δ(x, x').

The angular part can be constructed using Legendre polynomials, whereby symmetry about the origin is achieved by excluding the odd-degree terms and ensuring that the remaining coefficients are positive. Past work seems to have used an order <= 6, with the consensus being that grabbing the first four even terms (degrees 0, 2, 4, and 6) is best.
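
A quick numeric check of that symmetry argument (even-degree Legendre polynomials are even functions, so antipodal directions get identical covariance):

import numpy as np
from numpy.polynomial.legendre import Legendre

x = np.linspace(-1.0, 1.0, 11)        # cos(theta) values
for degree in (0, 2, 4, 6):           # the first four even terms
    coeff = np.zeros(degree + 1)
    coeff[degree] = 1.0               # one-hot coefficients select P_degree
    P = Legendre(coeff)
    assert np.allclose(P(x), P(-x))   # would fail for odd degrees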

For the radial part, we could start with sklearn's standard RBF kernel and see how far it gets us. My biggest concern would be performance...

and then the final kernel would look something like the following, with a handful of hyperparameters that'd still need to be tuned:

import numpy as np
from numpy.polynomial.legendre import Legendre
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process import kernels
from sklearn.gaussian_process.kernels import GenericKernelMixin, Kernel, StationaryKernelMixin


def even_legendre_sum(cos_theta, order=6):
    """Sum of the even-degree Legendre polynomials P_0, P_2, ..., P_order at cos(theta).

    Keeping only even degrees (with positive, here unit, coefficients)
    enforces symmetry about the origin, since P_n(-x) == P_n(x) for even n.
    """
    degrees = range(0, order + 1, 2)
    total = np.zeros_like(cos_theta)
    for degree in degrees:
        coeff = np.zeros(degree + 1)
        coeff[degree] = 1.0  # one-hot coefficient vector selects P_degree
        total += Legendre(coeff)(cos_theta)
    # P_n(1) == 1 for every degree, so dividing by the number of terms
    # normalizes the covariance between identical directions to 1.
    return total / len(degrees)


class LegendreKernel(StationaryKernelMixin, GenericKernelMixin, Kernel):
    """Angular covariance between (unit) diffusion gradient directions."""

    def __init__(self, order=6):
        self.order = order

    def __call__(self, X, Y=None, eval_gradient=False):
        X = np.atleast_2d(X)
        if Y is None:
            Y = X
        elif eval_gradient:
            raise ValueError("Gradient can only be evaluated when Y is None.")
        else:
            Y = np.atleast_2d(Y)

        # Cosine of the angle between every pair of gradient directions.
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
        cos_theta = np.clip(Xn @ Yn.T, -1.0, 1.0)

        K = even_legendre_sum(cos_theta, order=self.order)
        if eval_gradient:
            # No tunable hyperparameters (yet), hence an empty gradient.
            return K, np.empty((X.shape[0], X.shape[0], 0))
        return K

    def diag(self, X):
        # Identical directions have cos(theta) == 1, which the kernel
        # normalizes to a covariance of 1.
        return np.ones(np.atleast_2d(X).shape[0])

    def is_stationary(self):
        return True


q_lengthscale = 1
noise_level = 1e-2

# Factorized covariance: angular part (even Legendre) times radial part
# (RBF over q-space coordinates), plus an additive white-noise error term.
kernel = (
    kernels.RBF(length_scale=q_lengthscale, length_scale_bounds=(1e-2, 1e3))
    * LegendreKernel(order=6)
    + kernels.WhiteKernel(noise_level=noise_level, noise_level_bounds=(1e-10, 1e1))
)

param = {"solver": GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10, alpha=0.1)}
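
For what it's worth, a quick smoke test of the kernel above on made-up unit directions (continuing from the snippet; the data here is purely illustrative):

rng = np.random.default_rng(0)
bvecs = rng.normal(size=(30, 3))
bvecs /= np.linalg.norm(bvecs, axis=1, keepdims=True)  # unit gradient directions
signal = rng.random(30)                                # fake single-voxel signal

gpr = param["solver"].fit(bvecs, signal)
print(gpr.predict(bvecs[:5]))                          # predictions at a few training inputs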

The bulk of the work on this would be the initial round of tuning done during a pre-training stage, using a synthetic, FiberFox-generated sample as you originally proposed. The objective we'd want to minimize is the negative log marginal likelihood (NLML), and the data would be synthesized such that the training and validation sets have similar amounts of noise / orders of magnitude, but perhaps different numbers of directions, compared to the evaluation data.
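
To make that objective concrete, here's a minimal sketch of such a selection loop on toy data standing in for a FiberFox sample; sklearn's GaussianProcessRegressor already maximizes the log marginal likelihood during fit, so comparing fitted models by NLML reduces to reading off an attribute:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
X_train = rng.normal(size=(50, 3))                     # toy inputs
y_train = np.sin(X_train).sum(axis=1) + 0.05 * rng.normal(size=50)

best = None
for length_scale in (0.1, 1.0, 10.0):                  # candidate initializations
    gpr = GaussianProcessRegressor(
        kernel=RBF(length_scale=length_scale) + WhiteKernel(noise_level=1e-2),
        alpha=0.1,
    ).fit(X_train, y_train)
    nlml = -gpr.log_marginal_likelihood_value_         # objective to minimize
    if best is None or nlml < best[0]:
        best = (nlml, gpr)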

A ton of work is involved in this, obviously, but I figured it couldn't hurt to discuss it more. WDYT?
