Generalize cross correlation histogram function #597

morales-gregorio · 2023-10-16T11:46:39Z

Hi,

I was using the CCH function for some analysis and was very limited by the fact that it only accepted BinnedSpiketrains, because I also wanted to use it to compute CCH of non-spike data.

I went ahead and started modifying the function to enable using numpy arrays and encountered many many issues that I think made the code more difficult to understand and prone to unexpected behaviour. Here is the list of the things I found confusing and changed in this PR, we can roll some back if you still think they are worth having:

A lot of the code is there to deal with cross correlating arrays of different shape. But I cannot think of any use case where anyone would want to do that. The neurons should have been recorded simultaneously and it makes no sense to me to correlate one time series with another after one of them has finished. I removed all functionality relating to this, which already simplified the code greatly.
Another big issue is the use of spiketrains that have non-aligned t_start. I think the resulting statistics from spikes jumping around bins is not necessarily linearly predictable. The best practice here should be to slice the spiketrains before binning to the same t_start, hence I removed the functionality and included the strict criterion of having the same t_start.
Why include a kernel function in this function? Seems like a postprocessing step that people should do to their result if they want to. I suggest removing it, if someone wants to use a kernel, they can do so with the output of the function.
Same goes to binarize. This is a preprocessing step, and an option of BinnedSpikeTrain, the user should apply it before feeding the data to this function, seems ad hoc and dangerous to have this kind of functionality in multiple places.
Finally, it was unclear to me what the border correction was doing. It would only kick in for the different length spiketrains case (which I don't allow anymore), so I removed it.

Stylistically, there was a class with helper functions. But the class was only instantiated once and the helper functions therein only used once. I see no point in scattering the code around in this way for functions that are only used once. Since all the options I removed have significantly shortened the code I put it all into the main function, hoping that it is clear what the code does at any given time without having to jump back and forth between helpers and the main function.

Also there were examples interleaved in the code comments, which made the code more confusing to me and I have never seen in any scientific programming language, so I also removed those. I added comments where I saw fit to clarify what certain code blocks are supposed to do.

I hope all these changes improve the usability and clarity of the cross_correlation_histogram function. I know you worked hard to include some of this functions, but to me they made the function more difficult to understand and use, while introducing a lot of uncertainty about what really was happening with some of the pre/post-processing steps.

Let me know what you think, I will work on the tests once the main function is agreed upon.

Probably of interest to @Kleinjohann

Best,
Aitor

…s and detangle code.

pep8speaks · 2023-10-16T11:46:45Z

Hello @morales-gregorio! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-02-05 14:53:44 UTC

Moritz-Alexander-Kern · 2023-10-17T15:59:10Z

Hey Aitor,

thank you for contributing to Elephant.

This is what I understand from your description and my "first glance review":

Simplification of Code:
- Removed code related to cross-correlating arrays of different shapes, as this scenario is not expected.
- Eliminated functionality for handling spiketrains with non-aligned t_start, emphasizing the need for the same t_start.
- Suggested removing the kernel function and binarization, as they should be post-processing or preprocessing steps respectively.
Border Correction:
- Removed the border correction feature, as it only applied to different-length spiketrains (which are no longer allowed).
Code Organization:
- Integrated helper functions into the main function for improved code clarity.
- Removed interleaved examples from code comments for better readability.
- Added comments where needed to clarify code blocks.

I'll need some time to conduct a more thorough review of this PR, and, importantly, I believe it's essential for @Kleinjohann and @mdenker to also examine it.

Several of these points are not directly related to software development or Python programming; instead, they pose questions of a neuroscientific nature e.g. A lot of the code is there to deal with cross correlating arrays of different shape. But I cannot think of any use case where anyone would want to do that. Consequently, I'm hesitant to make comments or decisions regarding these particular aspects.

todo:

change unit tests accordingly

morales-gregorio · 2023-12-26T18:20:31Z

Hi, any updates? I would wait for feedback and approval of changes before putting work on updating the unit tests

morales-gregorio · 2024-02-05T11:35:30Z

We discussed this in the meeting on 2024.02.05

It is not clear if there is a good use case for differently shaped arrays.
- Numpy allows this in its function
- @mdenker will think about this use case and whether we really need it.
There exists both cross_correlation_histogram() and cross_correlation_function(); cohesive framework merging these two functions and possibly also spike triggered average. @mdenker supports merging these functions in the new elephant version.

Moritz-Alexander-Kern · 2024-03-26T07:18:05Z

When refactoring the cross-correlation functionalities, also inspect TSPE (Total Spiking Probability Estimation) implementations in elephant.functional_connectivity.

More specifically in elephant/functional_connectivity_src/total_spiking_probability_edges.py:

elephant/elephant/functional_connectivity_src/total_spiking_probability_edges.py

Lines 174 to 193 in 6363690

    
           def normalized_cross_correlation( 
        
               spike_trains: BinnedSpikeTrain, 
        
               delay_times: Union[int, List[int], Iterable[int]] = 0, 
        
           ) -> np.ndarray: 
        
               r""" 
        
               Normalized cross correlation using std deviation 
        
               Computes the normalized_cross_correlation between all 
        
               Spiketrains inside a BinnedSpikeTrain-Object at a given delay_time 
        
               The underlying formula is: 
        
               .. math:: 
        
                   NCC_{X\arrY(d)} = \frac{1}{N_{bins}}\sum_{i=-\inf}^{\inf}{ 
        
                   \frac{(y_{(i)} - \bar{y}) \cdot (x_{(i-d) - \bar{x})}{\sigma_x 
        
                   \cdot \sigma_y}}} 
        
               The subtraction of mean-values is omitted, since it offers little added 
        
               accuracy but increases the compute-time immensely. 
        
               """

Refact cross correlation histogram to enable using np arrays as input…

72a0bc9

…s and detangle code.

Fix PEP8 line length issues

77923b9

Moritz-Alexander-Kern self-requested a review October 17, 2023 14:14

Moritz-Alexander-Kern added the enhancement Editing an existing module, improving something label Oct 17, 2023

Moritz-Alexander-Kern modified the milestone: v1.1.0 Jan 17, 2024

mdenker added this to the v1.1.0 milestone Jan 17, 2024

Moritz-Alexander-Kern assigned mdenker Feb 5, 2024

Merge branch 'master' into generalize_cross_correlation

cb60f91

mdenker self-requested a review February 8, 2024 14:30

Moritz-Alexander-Kern modified the milestones: v1.1.0, v1.2.0 Mar 18, 2024

Moritz-Alexander-Kern modified the milestones: v1.2.0, v1.3.0 Oct 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generalize cross correlation histogram function #597

Generalize cross correlation histogram function #597

morales-gregorio commented Oct 16, 2023

pep8speaks commented Oct 16, 2023 •

edited

Loading

Moritz-Alexander-Kern commented Oct 17, 2023 •

edited

Loading

morales-gregorio commented Dec 26, 2023

morales-gregorio commented Feb 5, 2024

Moritz-Alexander-Kern commented Mar 26, 2024

Generalize cross correlation histogram function #597

Are you sure you want to change the base?

Generalize cross correlation histogram function #597

Conversation

morales-gregorio commented Oct 16, 2023

pep8speaks commented Oct 16, 2023 • edited Loading

Comment last updated at 2024-02-05 14:53:44 UTC

Moritz-Alexander-Kern commented Oct 17, 2023 • edited Loading

morales-gregorio commented Dec 26, 2023

morales-gregorio commented Feb 5, 2024

Moritz-Alexander-Kern commented Mar 26, 2024

pep8speaks commented Oct 16, 2023 •

edited

Loading

Moritz-Alexander-Kern commented Oct 17, 2023 •

edited

Loading