[Bug] Ineffective alpha-level for SPADE #620
Comments
Hello @Idavr , at first glance I agree with your assessment. However, the SPADE module has a number of functions and parameters. Did you use the …
It would be very helpful if you could provide a minimal code example or give more details on how you ran SPADE. Hope this helps; unfortunately, I'm not an expert on the method itself.
This might also be related to #439.
Thanks for responding, and apologies for the delay in getting back to you. Using PyCharm, I ran this specific line of code, where bins = 10, from inside my environment for elephant: …
The statistical correction used is … However, the …
As a workaround, I am trying to see whether the same information can be stored for the non-significant patterns as well, since they are currently stored only with their signature and p-value. Also storing the neuron numbers and time windows for the non-significant patterns would at least give me data to work with, regardless of the final statistical value.
Hey @Idavr , I created the following minimal code example to investigate this:

```python
import quantities as pq
import elephant
import numpy as np

np.random.seed(4542)

# Generate spike trains with synchronous spikes
spiketrains = elephant.spike_train_generation.compound_poisson_process(
    rate=5 * pq.Hz,
    amplitude_distribution=[0] + [0.974] + [0] * 7 + [0.006] + [0.02],
    t_stop=10 * pq.s,
)
len(spiketrains)

# Generate random spike trains
spiketrains.extend(
    elephant.spike_train_generation.StationaryPoissonProcess(
        rate=5 * pq.Hz, t_stop=10 * pq.s
    ).generate_n_spiketrains(90)
)

# Set alpha
alpha = 0.05

# Mine patterns with SPADE
patterns = elephant.spade.spade(
    spiketrains,
    bin_size=10 * pq.ms,
    winlen=1,
    dither=5 * pq.ms,
    spectrum="3d#",
    min_spikes=6,
    min_occ=3,
    min_neu=2,
    alpha=alpha,
    n_surr=1000,
    psr_param=[0, 0, 0],
)

# Print result
print("Patterns dict:")
print(patterns)

# Print p-value spectrum
print("pvalue_spectrum: [size, number of occurrences, duration, p-value]")
print(patterns["pvalue_spectrum"])

# Print non-significant signatures
print("Non-significant signatures of 'pvalue_spectrum': [size, number of occurrences, duration]")
print(patterns["non_sgnf_sgnt"])

# Find p-values for non-significant signatures and print them:
for non_sgnf_sgnt in patterns["non_sgnf_sgnt"]:
    for pvalue_spectrum in patterns["pvalue_spectrum"]:
        if pvalue_spectrum[:3] == [int(i) for i in non_sgnf_sgnt]:
            print(
                f"Non-significant signature: {non_sgnf_sgnt}, p-value: {pvalue_spectrum[3]}"
            )
```

which yields: …
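Since the run's output was not captured above, the following standalone sketch shows how entries of a `pvalue_spectrum`-shaped list can be filtered against `alpha` without any correction. The values are made up for illustration; a real spectrum comes from `elephant.spade.spade`.

```python
# Toy p-value spectrum entries: [size, number of occurrences, duration, p-value]
# (made-up values, not from an actual SPADE run)
pvalue_spectrum = [
    [6, 3, 0, 0.001],
    [6, 4, 0, 0.032],
    [7, 3, 0, 0.048],
    [8, 3, 0, 0.310],
]
alpha = 0.05

# Signatures that would pass an *uncorrected* threshold test.
# After a multiple-testing correction, some of these can still end up
# in the non-significant set, which is the behavior discussed here.
uncorrected_significant = [
    entry[:3] for entry in pvalue_spectrum if entry[3] < alpha
]
print(uncorrected_significant)
```

With these toy numbers, three of the four signatures fall below the raw threshold, yet a correction across all tests could reject fewer of them.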
I understand your initial inquiry concerns why certain patterns are labeled as non-significant even though their p-values are lower than the specified threshold value for … Upon examining the implementation, it appears that the p-values generated by the … If you prefer to manage the filtering manually, you can use:

```python
patterns = elephant.spade.spade(
    spiketrains,
    bin_size=10 * pq.ms,
    winlen=1,
    dither=5 * pq.ms,
    spectrum="3d#",
    min_spikes=6,
    min_occ=3,
    min_neu=2,
    alpha=None,
    n_surr=1000,
    psr_param=[0, 0, 0],
    stat_corr='no',
)
```

For the …
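To make the effect of a multiple-testing correction concrete, here is a minimal, self-contained sketch of a Holm-Bonferroni step-down procedure, one standard correction of the kind selected via `stat_corr`. The p-values below are made up for illustration; exactly which correction elephant applies depends on the `stat_corr` argument, so this is a sketch of the principle, not of elephant's internals.

```python
def holm_bonferroni(pvalues, alpha=0.05):
    """Return a list of booleans: True where the test is rejected,
    i.e. the pattern stays significant after correction."""
    m = len(pvalues)
    # Sort indices by ascending p-value (step-down order).
    order = sorted(range(m), key=lambda i: pvalues[i])
    rejected = [False] * m
    for k, idx in enumerate(order):
        # Compare the k-th smallest p-value against alpha / (m - k).
        if pvalues[idx] <= alpha / (m - k):
            rejected[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return rejected

# Made-up p-values, ordered ascending for readability
pvalues = [0.001, 0.012, 0.031, 0.046, 0.78]
print(holm_bonferroni(pvalues, alpha=0.05))
```

Note that 0.031 and 0.046 are both below alpha = 0.05, yet fail the corrected test; this is exactly the kind of outcome described in the original report.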
Thanks for looking into this. I have previously run a test with … Does this make sense to you? Based on your explanation and what I thought myself, the p-values for the non-significant patterns should indeed be the non-corrected ones, so as I was trying to get at the non-corrected significant patterns, I thought the …
Hi @Idavr , While I don't fully grasp the reasoning behind it, it seems plausible that using …
Please allow me some time to look into the code. It seems the variation of those parameters can lead to behavior which is not obvious, i.e. there are different code execution paths for combinations of …
Thank you for looking into this; I am grasping more of how this works now. I am currently running a larger analysis with those parameters and will let you know how it goes. I appreciate your time, and fingers crossed this solves my conundrum.
Just wanted to give an update on the progress here. I have let the analysis of 31 seconds of data run for over two weeks with the parameters specified above, but it looks like it got locked in some kind of loop after computing the p-value spectrum, as most of those two weeks of runtime came after that processing step. This is far longer than any other parameter test I have done (on a relatively powerful computer), and without any other reference points or outputs I will have to abort it soon if the parameters have actually pushed it into a loop. I cannot immediately see from the code why this should happen. When time dragged on I assumed it was just taking this long because it was going over and writing out all the identified patterns, but as time goes on my belief that this is the case is diminishing. If you have any input or knowledge of what might be going on, any thoughts are appreciated.
Hi @Idavr , To date, I haven't come across a scenario where SPADE becomes stuck in an endless loop. If that's the case here, it would be great to find and fix the issue. My understanding is that suboptimal parameter choices can result in excessively long runtimes. However, without insight into the code and the specifics of the machine you're using for your analysis, offering meaningful advice is difficult. If you could provide a minimal code snippet that replicates the problem, I'd be able to investigate further; as it stands, my assessment would be somewhat speculative. Generally speaking, analysis workflows are often IO-bound, so I agree with your assessment that reading from and writing to disk could be a bottleneck here.
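One way to check whether a long run is compute-bound or stuck in IO, without modifying elephant itself, is to profile the call with the standard library's `cProfile`. The `workload` function below is a stand-in for illustration; in practice you would replace its body with your actual `elephant.spade.spade(...)` call.

```python
import cProfile
import io
import pstats

def workload():
    # Stand-in for the real analysis call, e.g. elephant.spade.spade(...)
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Print the ten most time-consuming functions by cumulative time.
# For a hanging SPADE run, look for disproportionate time spent in
# file IO or surrogate-generation routines.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
print(stream.getvalue())
```

For a run that never finishes, attaching a sampling profiler or simply interrupting the process and inspecting the traceback gives similar information about where time is being spent.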
I understand. I am looking into being able to share the data, but until then I am running additional tests. One of these is to confirm that MPI is actually being utilized in this process, which I am now a bit uncertain about. In the spade.py script I put these lines at the very beginning, with the rest of the imports: …
However, this is not being printed when I run the program. Neither are any of the other print statements I inserted deeper into the script at any point where … was written, making it say: …
Still no print messages mentioning MPI are printed at all. This worries me, as I am unable to tell whether MPI is activated or not. Have you encountered this before?
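One thing worth ruling out before concluding that MPI is inactive: under `mpiexec`, stdout is often block-buffered, so print statements may not appear until the run ends, or not at all if the process is killed mid-run. Forcing a flush, or writing to stderr, makes diagnostics visible immediately. A minimal sketch (the `report` helper is hypothetical, just for illustration):

```python
import sys

def report(message):
    # Flush stdout explicitly so the message appears even if the
    # process is killed mid-run (stdout may be buffered under mpiexec).
    print(message, flush=True)
    # stderr is typically unbuffered, so duplicate the message there too.
    print(message, file=sys.stderr)

report("checking MPI setup...")
```

If messages appear with this change, the original prints were likely being executed but buffered, rather than skipped.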
Hi @Idavr , It sounds like you're facing some challenges with ensuring MPI is correctly set up and utilized in your script. It's essential to confirm MPI's activation to ensure proper parallelization. This might also be the reason for your analysis getting stuck in a deadlock (a race condition could also occur; this might also be relevant for #627?). From your description, it seems the print statements related to MPI aren't being executed, which indeed should not be the case. One reason might be a wrong indentation level for the block:

```python
if HAVE_MPI:  # pragma: no cover
    comm = MPI.COMM_WORLD  # create MPI communicator
    rank = comm.Get_rank()  # get rank of current MPI task
    size = comm.Get_size()  # get the total number of MPI processes
    print('Using MPI for parallelization!')
    print('Number of MPI processes: {}'.format(size))
else:
    rank = 0
    print('No MPI.')
```

Here are a few additional points to consider:

- MPI installation: Ensure that MPI is properly installed and configured on your system. Sometimes issues arise from mismatched versions or incomplete installations. Did you follow the setup process described in the docs?
- Script execution: When running your Python script with MPI, make sure you're executing it in a manner that MPI recognizes. Depending on your MPI implementation (e.g., Open MPI, MPICH), the command to run MPI-enabled Python scripts might differ. Typically, you'd use a command like `mpiexec` or `mpirun` followed by your Python script:

  ```
  $ mpiexec -n 4 python spade.py
  ```

- mpi4py documentation: If you've verified the above and still encounter issues, reaching out to the mpi4py maintainers or consulting their documentation might provide valuable insights. They may offer specific guidance or troubleshooting steps tailored to mpi4py usage: https://mpi4py.readthedocs.io/en/stable/index.html

Hope this helps, feel free to reach out again.
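To confirm whether mpi4py is importable at all in the interpreter PyCharm uses (as opposed to a terminal environment), a standalone check like the following can help; it mirrors the style of the `HAVE_MPI` guard discussed above. Run it both directly and under `mpiexec` and compare the reported size.

```python
# Standalone mpi4py availability check (run directly and via mpiexec)
try:
    from mpi4py import MPI
    HAVE_MPI = True
except ImportError:
    HAVE_MPI = False

if HAVE_MPI:
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # rank of this process
    size = comm.Get_size()   # total number of MPI processes
    print(f"mpi4py available: rank {rank} of {size}")
else:
    rank = 0
    print("mpi4py not importable in this interpreter")
```

If this script reports mpi4py as unavailable inside PyCharm but available in the terminal, the two are using different Python environments, which would explain the missing print messages.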
We have not heard from you recently and we could not reproduce the errors that you are reporting. |
Hello there! I am trying to identify patterns in my neural recordings and am now testing SPADE, which initially seems very promising for what we want to do. However, I discovered that many of the patterns flagged as "non-significant" had p-values below the alpha value I specified in the parameters (alpha = 0.05). Is there something I am missing in how the statistical significance is computed and corrected for?
Best,
Idavr
Environment
- Installed via (conda, pip, source): pip
- neo python package version: 0.13.0
- elephant python package version: 1.0.0