Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential wrong row-standardization in lee.py #207

Closed
ChaozhongLiu opened this issue May 27, 2022 · 4 comments · Fixed by #245
Closed

Potential wrong row-standardization in lee.py #207

ChaozhongLiu opened this issue May 27, 2022 · 4 comments · Fixed by #245
Assignees

Comments

@ChaozhongLiu
Copy link

I could be wrong about this but it's confusing me for some time.

In spatial pearson statistic functions lee.py, line 83 and line 198, row-standardization is writen as:
standard_connectivity = self.connectivity / self.connectivity.sum(axis=1)

However, becaused the shape of self.connectivity.sum(axis=1) is (n,), when broadcasting it is row-wise division instead of column-wise.

Just by checking standard_connectivity.sum(axis=1) you will find it is not equal to 1. And this is problematic to the local spatial pearson statistic.

Possible editting could be standard_connectivity = self.connectivity / self.connectivity.sum(axis=1)[:, numpy.newaxis]?

Looking forward to your discussion!

@ljwolf
Copy link
Member

ljwolf commented May 27, 2022

Yes, thanks! That does look like an issuse. I think we need to use self.connectivity.sum(axis=1, keepdims=True) to fix. Thanks! Will get this done ASAP.

@ljwolf ljwolf self-assigned this May 27, 2022
@ljwolf ljwolf added the bug label May 27, 2022
@ChaozhongLiu
Copy link
Author

Thanks! @ljwolf

@jGaboardi
Copy link
Member

xref #91

@ljwolf ljwolf linked a pull request May 24, 2023 that will close this issue
@ljwolf
Copy link
Member

ljwolf commented May 24, 2023

Actually, this is correct as written thanks to the vagaries of the matrix API:

>>> from scipy import sparse
>>> import numpy
>>> numpy.random.seed(111211)
>>> connectivity = sparse.random(10,10,density=.4)
>>> connectivity.sum()
matrix([[2.50042315],
        [3.55327513],
        [0.79089601],
        [1.44499382],
        [1.63995059],
        [2.36368386],
        [1.9093926 ],
        [3.15104416],
        [2.07759805],
        [0.80082077]])

The documentation says that connectivity must be of type scipy.sparse.matrix, and is what comes from using w.sparse.

If you pass a numpy array describing connectivity that you've built yourself, this will be silently incorrect. But, we document the expected input type, and for that type, the code is correct.

I will add a fix that ensures this correction works for array types as well. Eventually, we need to move to scipy.sparse.csc_array(), but that will occur later.

@ljwolf ljwolf added rough edge and removed bug labels May 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants