use glsl 3.0 programs throughout - closes #81 closes #62 #7
base: jebeck/update-tfjs-core
Conversation
Sorry @duhaime, I was on PTO last week, and I just replied to a different closed PR because I was going through my inbox in order. So sorry for the second message. This is a very, very out-of-date fork of tensorflow/tfjs-tsne. Could you redirect these efforts upstream? Or would this not fix the issue happening there with building against current tfjs-core?
haha, no worries! The long story short is these changes allow one to build against the current core, but the resulting behavior is not as expected. I'm working through the innards and hope to have a PR for the tfjs-tsne repo soon...
Ok awesome. If for some reason the changes don't work upstream and you need to use this code (which I wouldn't recommend, the hacking here to get OffscreenCanvas functionality is extremely gross), I would say just fork it. We're not really using this anymore (preferring umap-js for client-side dimensionality reduction) so it's not gonna be a priority for me to review anytime soon, sorry!
That sounds great! I saw your PR on the umap-js repo too--my team is also running dimension reduction in workers, but my understanding is that performance would be greatest using the GPGPU approach taken by tfjs-tsne. If you've seen umap-js perform comparably with tfjs-tsne in browsers aside from Chrome, it'd be great to hear that! No worries about these PRs--I've been focusing energy on the upstream master...
No, you're right that tfjs-tsne far outperforms umap-js, except that we do find the umap-js algorithm to also be pretty markedly superior in its results. Anecdotally, umap-js in a Worker is a bit painful on 5K MNIST digits but doable, but on all 10K you'll be waiting a real long while. Not sure if there might be something about the umap-js implementation (and/or how browsers/JavaScript work) that results in this poor scaling; I do believe the Python implementation is supposed to scale pretty well, and I think our data scientists have found that to be true. Aside from my open PR on umap-js, I started recently experimenting with porting umap-js to the tfjs-core backend to try to use the GPU backend, so you and I are definitely on the same wavelength :) (I don't think I really have the math expertise to do the port but was going to naively give it a shot anyways, with lots of tests to verify similarity of code...definitely let me know if you think that approach might work and if you might want to collaborate on it! Though with the caveat that my time on this is very limited right now, and I should finish that umap-js Workers PR first...)
Yes, it appears we're thinking through the same questions! :) I'm super keen on porting umap-js to a GPGPU backend. The tfjs-tsne repo that @Nicola17 devised made use of some very clever linear optimizations using shader textures. Finding similar opportunities in the umap case would be awesome. I know the rapids-ai team has ported umap to a massively parallel GPU implementation, and I wonder if their model could be leveraged effectively in a GPGPU context. Out of curiosity, have you tried running preliminary dimension reduction like PCA or NMF (in JavaScript) on your native data before passing those observations to umap-js? I wonder if a pipeline along those lines would be sufficiently snappy to keep the compute on the CPU for now. I'll try some experiments...
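For concreteness, the PCA-first pipeline mentioned above might look something like the sketch below. This is a hedged illustration, not anything shipped by umap-js: `pcaReduce` is a hypothetical helper (a minimal top-k PCA via power iteration on the covariance matrix), and a real pipeline would hand its output rows on to umap-js rather than stopping here.

```javascript
// Hypothetical sketch: project data onto its top `k` principal components
// before handing the reduced rows to umap-js. `pcaReduce` is illustrative;
// it is not part of the umap-js API.
function pcaReduce(data, k) {
  const n = data.length, d = data[0].length;
  // Center each column.
  const mean = new Array(d).fill(0);
  for (const row of data) row.forEach((v, j) => (mean[j] += v / n));
  const X = data.map(row => row.map((v, j) => v - mean[j]));
  // Covariance matrix (d x d).
  const cov = Array.from({ length: d }, () => new Array(d).fill(0));
  for (const row of X)
    for (let i = 0; i < d; i++)
      for (let j = 0; j < d; j++) cov[i][j] += (row[i] * row[j]) / (n - 1);
  const components = [];
  for (let c = 0; c < k; c++) {
    // Power iteration for the dominant eigenvector of the (deflated) matrix.
    let v = Array.from({ length: d }, () => Math.random());
    for (let iter = 0; iter < 100; iter++) {
      const w = cov.map(row => row.reduce((s, m, j) => s + m * v[j], 0));
      const norm = Math.hypot(...w);
      v = w.map(x => x / norm);
    }
    components.push(v);
    // Deflate: subtract the found component so the next iteration
    // converges to the next-largest eigenvector.
    const lambda = v.reduce(
      (s, vi, i) => s + vi * cov[i].reduce((t, m, j) => t + m * v[j], 0), 0);
    for (let i = 0; i < d; i++)
      for (let j = 0; j < d; j++) cov[i][j] -= lambda * v[i] * v[j];
  }
  // Project the centered rows onto the components.
  return X.map(row =>
    components.map(comp => row.reduce((s, x, j) => s + x * comp[j], 0)));
}
```

Whether this buys enough to keep UMAP itself on the CPU is exactly the open question in the thread; PCA only helps if the bottleneck is the input dimensionality rather than the number of observations.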
Have tried PCA on native data prior to UMAP, but in that case didn't have the opportunity to compare against pre-PCA because the data was published post-PCA. Also just a heads up that my time for this project has dropped to pretty much zero at the moment. Def keep me apprised of your progress, but don't expect updates from me anytime soon!
Amen, no worries! I'll reach out if we have any solid breakthroughs... |
@duhaime
somewhere near here: https://github.com/stitchfix/tfjs-tsne/blob/master/src/tsne_optimizer.ts#L580
This is really great work! I wanted to send along a little PR that resolves the shader linking problem. Here I just set some declarations to move all glsl programs to glsl 3.0, which allows the shaders to compile and allows the demos to run:
I also had to change a few function calls, as some of the built-in functions changed in glsl 3.0. I'm happy to make any changes you see fit!
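To illustrate the kind of mechanical rewrite described above: moving a WebGL1-era fragment shader (GLSL ES 1.00) to GLSL ES 3.00 typically means adding a `#version 300 es` directive as the very first line, replacing `varying` with `in`, replacing the removed `texture2D()` builtin with `texture()`, and declaring an explicit `out` variable in place of `gl_FragColor`. The helper below is purely illustrative; this PR edits tfjs-tsne's shader strings directly rather than transforming them at runtime.

```javascript
// Hedged sketch: upgrade a GLSL ES 1.00 fragment shader string to GLSL ES
// 3.00. `upgradeFragmentShader` is a hypothetical helper, not part of
// tfjs-tsne or tfjs-core.
function upgradeFragmentShader(src) {
  return [
    '#version 300 es',     // must be the very first line of the shader
    'out vec4 fragColor;', // replaces the removed gl_FragColor builtin
    src
      .replace(/\bvarying\b/g, 'in')        // varyings become inputs
      .replace(/\btexture2D\b/g, 'texture') // texture2D() was removed
      .replace(/\bgl_FragColor\b/g, 'fragColor'),
  ].join('\n');
}

const legacy = `
precision highp float;
varying vec2 vUv;
uniform sampler2D points;
void main() {
  gl_FragColor = texture2D(points, vUv);
}`;
console.log(upgradeFragmentShader(legacy));
```

A regex pass like this is too blunt for shaders that use these tokens in comments or identifiers, which is presumably why the PR makes the edits by hand per program.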