Demo dataset #63

Open
leminhngoc13021983 opened this issue Jun 11, 2024 · 12 comments

Comments

@leminhngoc13021983

leminhngoc13021983 commented Jun 11, 2024

Dear Dave Rowenhorst,

I want to practice analyzing data with PyEBSDIndex, but I don't know where the example.up1 and SLMtest/scan2v3.up1 datasets can be downloaded.
Please provide links to them.

Thank you.
Best regards,
Ngoc

@drowenhorst-nrl
Collaborator

Great question - in an effort to keep the GitHub repo as small as possible, we have not included a test dataset, hoping that users will have their own. However, we did include a test dataset with our paper on PyEBSDIndex, and it can be found at https://zenodo.org/records/8400425
This is a good reminder that we should update the tutorial notebook to point to the test dataset on Zenodo.

@leminhngoc13021983
Author

Thank you :)

@leminhngoc13021983
Author

[Screenshot: ERROR1]
Dear Dave Rowenhorst,
I used JupyterLab with Python 3.12.4 to run the Ti64_index.ipynb file. At the step "Index the original Ti64 patterns", an error occurred.
It says that the module 'pyebsdindex.ebsd_index' has no attribute 'index_pats_distributed'.
Can you tell me how to solve this issue?
Best regards
Ngoc

@drowenhorst-nrl
Collaborator

I will assume you are using a conda environment? This error indicates that Ray did not get installed. Currently Ray is not packaged in conda-forge, but you should be able to install it with pip.

conda activate {your environment name} 
pip install ray

More details here: https://docs.ray.io/en/latest/ray-overview/installation.html

If that should fail,
1 - let me know.
2 - you should be able to run in single process mode:

data, bnddata = ebsd_index.index_pats(filename = './Ti64_930C4hr_Scan4_cr.up1', patstart = 0, npats = -1, ebsd_indexer_obj = indxer)
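
If it is unclear whether the distributed function is available, a notebook could pick the indexing function at run time. This is a minimal sketch that relies only on the two function names mentioned in this thread (`index_pats_distributed` and `index_pats`); the helper `pick_index_function` and the stand-in modules are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def pick_index_function(ebsd_index_module):
    """Return (name, function): the distributed indexer if the module has it,
    otherwise the single-process one."""
    for name in ("index_pats_distributed", "index_pats"):
        fn = getattr(ebsd_index_module, name, None)
        if callable(fn):
            return name, fn
    raise AttributeError("no indexing function found on module")


# Hypothetical stand-ins for pyebsdindex.ebsd_index with and without Ray.
with_ray = SimpleNamespace(index_pats_distributed=lambda **kw: "distributed",
                           index_pats=lambda **kw: "single")
without_ray = SimpleNamespace(index_pats=lambda **kw: "single")

name_a, _ = pick_index_function(with_ray)
name_b, fn_b = pick_index_function(without_ray)
print(name_a, name_b)
```

In a real notebook you would pass the imported `ebsd_index` module itself and then call the returned function with the keyword arguments shown in the line above.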

@drowenhorst-nrl
Collaborator

drowenhorst-nrl commented Jun 13, 2024

I will also note that I am not sure Ray supports Python 3.12 yet, so that might be the issue. If making a conda environment:

conda create --name {name of your environment} python=3.11 pyebsdindex

should work, then follow the instructions above to install Ray.
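
Before re-running the notebook, it may help to confirm what the active environment actually contains. The following is a small sketch using only the Python standard library; the function name `environment_report` is made up for this example, and it only checks whether the packages can be found, not whether they work.

```python
import importlib.util
import sys


def environment_report():
    """Collect the Python version and whether ray / pyebsdindex can be found."""
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "ray_installed": importlib.util.find_spec("ray") is not None,
        "pyebsdindex_installed": importlib.util.find_spec("pyebsdindex") is not None,
    }


report = environment_report()
print(report)
```

If `ray_installed` is False inside the activated environment, the `pip install ray` step above has probably not been run in that environment.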

@leminhngoc13021983
Author

I have installed Python 3.11.7 and Anaconda, and as a result the notebook now runs well on one of my computers.
However, I got a different error on another computer. I will look into it further.

[Screenshot: Capture]

@drowenhorst-nrl
Collaborator

Looks like some kind of Ray error again. This might be related to a Windows firewall or VPN blocking ray from starting up the virtual network? If you can turn those off, that might work.
Another alternative is to only install ray[default] which does not install the dashboard (this is just a guess at the issue).

pip uninstall ray
pip install ray[default]

That is only my best guess at this point.

@leminhngoc13021983
Author

leminhngoc13021983 commented Jun 19, 2024

I have some questions for you again.
My dataset was recorded as a map of 400×400 µm² with a step size of 0.5 µm. The EBSD patterns were collected at 78×78 pixels per pattern with background subtraction, and the recorded file is larger than 4 GB. However, when running NLPAR pattern processing, the array cannot be reshaped. Do you know what the reason is? I also tried searchradius values of 5, 7, 8, and 9, but it didn't work.

[Screenshots: Capture, Capture 2]

On what basis, and how, should I adjust the parameters nT, nR, tSig, and rSig to optimize them?

Thank you very much.

@drowenhorst-nrl
Collaborator

So there are two questions here. First, what is wrong with NLPAR, and second, best parameters for indexing:
1 - I think what is happening with NLPAR is that the number of rows listed in the header of the file is NOT what is actually in the scan. This can happen if the scan was interrupted. If that is the case, you are the second person to run into this, and thus I am likely to change how up1/up2 files are read in to protect against it. To note, you say the scan is 400 µm x 400 µm with a 0.5 µm step; however, my reader thinks that there should be 924 rows (according to the up1 header).

What you can try in the meantime is to manually set the number of rows when you create the NLPAR object:

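from pyebsdindex import nlpar  # NLPAR lives in the pyebsdindex.nlpar module

file0 = './A6061CS.up1'
nlobj = nlpar.NLPAR(file0, nrows=800)  # override the row count read from the header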

This should process only the first 800 rows, and the remaining data will be zeroed out.
I will take a look at what we might do in the future to make this more automatic.

2 - The default parameters listed are a fine starting point. The largest adjustment is the value of rSig, which depends on how wide your bands are (on average) across your detector. I have found the range from 1.0 to 3.0 to be typical. You want the peaks in the Radon image to be as prominent as possible.

Given the smaller pattern size, you could index somewhat faster with a smaller Radon (and thus scale tSig and rSig appropriately), but you want nR to be approximately equal to the smallest dimension of your patterns. I might try this:

nT = 90
nR = 90
tSig = 1.5
rSig = 1.5
rhomask = 0.1
backgroundsub = False
nbands = 8

If you are indexing multiple phases, and they are not separating well, you might want to try more bands; 8 to 15 is typical.

@leminhngoc13021983
Author

leminhngoc13021983 commented Jun 19, 2024

Thank you very much. Adding nrows = 800 worked for me. By the way, my data set was recorded with a hexagonal grid, not a square grid. Perhaps this is causing the issue. Does your program handle these two types of data grids differently?

In addition, I've encountered another issue with pattern indexing. My desktop has a weak configuration: an i5 CPU with 4 cores, 16 GB DDR4 RAM, and no GPU. I set the ncpu parameter to 4, and the indexing process succeeded only once. Since then, it hasn't worked again.

The error messages always show 6 or 4 running processes remaining and "list index out of range", even though completion was 100%.

I look forward to your suggestions to help me resolve this issue.

[Screenshot: Capture 3]

@drowenhorst-nrl
Collaborator

GAH! Hexagonal grid?! Nuts. Yeah, I have not done anything to account for those. I never collect them, and I was hoping no one else did either. Probably a really bad assumption. But this would explain the issues. Mainly, it is that I am reading the number of columns/rows out of the header of the file assuming that it is a square grid. The fixes for this will take some time, as I am not exactly sure how those files are always constructed.
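
A quick arithmetic check is consistent with this explanation. It assumes (this geometry is an assumption, not something confirmed in the thread) that the hexagonal grid keeps the 0.5 µm step within a row and spaces successive rows by step × √3/2. Under that assumption a 400 µm tall map has about 924 rows, which matches the header value, whereas a square grid would have 800.

```python
import math

map_height_um = 400.0
step_um = 0.5

# Square grid: rows are one step apart.
square_rows = round(map_height_um / step_um)

# Hexagonal grid (assumed geometry): rows are step * sqrt(3)/2 apart.
hex_row_spacing_um = step_um * math.sqrt(3) / 2
hex_rows = round(map_height_um / hex_row_spacing_um)

print(square_rows, hex_rows)  # 800 924
```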

Hmm - no GPU? Or just integrated graphics? Probably the latter, since your i5 does have a GPU built in, and that should be better than running on just the CPU. But that status line does indicate that no GPUs were detected. I am not sure if that is because you set it to use only CPUs, or you did not install pyopencl, or I need to double-check that integrated graphics are detected properly. I have a few guesses as to what is up; the most likely is that I was too conservative in estimating the timeout for a job being stuck within the band detection process, a timeout that would need to be much longer on the CPU than on the GPU. No easy fix until I send out a new release.

However, you could try running non-distributed on your machine to get you through. It will take a while on that hardware, and will not print a lot of info, but it should get there ... eventually.

data, bnddata = ebsd_index.index_pats(filename = file, patstart = 0, npats = -1, ebsd_indexer_obj = indxer, verbose=2)

You might also be hitting an issue indexing that same file from before - it might be that it is trying to index patterns that are not really present at the end of the file. Try indexing with patstart = 0, npats = int(800*800) and see if that works (or, if you know the exact number of patterns in the file, you can instruct it to index just that many).
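
If the exact number of patterns is not known, it can be estimated from the file size. This is a rough sketch that assumes 8-bit pixels for `.up1` and a header length that you supply yourself; the helper name and the `header_bytes` parameter are hypothetical, and the real header size depends on the file version, which is not stated in this thread.

```python
def estimate_pattern_count(file_size_bytes, pat_width, pat_height,
                           bytes_per_pixel=1, header_bytes=0):
    """Number of whole patterns that fit in the file after the header."""
    pattern_bytes = pat_width * pat_height * bytes_per_pixel
    return max(0, (file_size_bytes - header_bytes) // pattern_bytes)


# Sanity check with the numbers from this thread: 800 x 800 patterns of
# 78 x 78 pixels at 1 byte per pixel (assumed).
expected_size = 800 * 800 * 78 * 78
npats = estimate_pattern_count(expected_size, 78, 78)
print(npats)  # 640000
```

For a real file, `os.path.getsize(filename)` gives the size in bytes; the result is only an estimate to pass as `npats`.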

@leminhngoc13021983
Author

I tried 'npats = int(800*800)' and it was successful. I think the hexagonal-grid dataset is the cause of my issues. Next time I'll record my dataset on a square grid and try running the program again.
Best regards
