Use KMeans++ For ColorScape Generation #3164

Open: wants to merge 4 commits into base: develop

@F0x1 commented Sep 14, 2024

Changed

  • Changed: Switched to the K-means++ clustering algorithm for improved initial centroid selection and parallelized the assignment of points to their nearest centroids.

Implemented K-means++ for improved initial centroid selection
Parallelized point assignment to nearest centroids
Added early termination when centroids converge
Optimized distance calculations using squared distances
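
For readers unfamiliar with the technique, the four changes listed above can be sketched roughly as follows. This is a minimal illustration in Python, not the C# code from this pull request: the function names, the `tolerance` and `max_iterations` parameters, and the use of RGB tuples as points are all assumptions made for the example. The point assignment is kept sequential here; the pull request parallelizes that step.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance. Skipping the square root is safe because
    it does not change which centroid is nearest."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: each new centroid is drawn with probability
    proportional to its squared distance from the nearest chosen centroid."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        weights = [min(squared_distance(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:
            # All points coincide with a chosen centroid; fall back to uniform.
            centroids.append(rng.choice(points))
        else:
            centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


def kmeans(points, k, max_iterations=100, tolerance=1e-6, seed=0):
    """Hypothetical helper: returns k centroids for a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = kmeans_pp_init(points, k, rng)
    for _ in range(max_iterations):
        # Assignment step (sequential in this sketch).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        # Early termination once no centroid moves more than the tolerance.
        shift = max(squared_distance(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tolerance:
            break
    return centroids
```

Called as `kmeans(pixels, k)` on a list of `(r, g, b)` tuples, it returns `k` centroid colors. Because the seeding is randomized, different seeds can yield different palettes.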

Performance Improvements

Execution time: 15.86x faster (3.58 ms, down from 56.85 ms)
Memory allocation: 99% reduction (532.07 KB, down from 97,486.85 KB)
GC Gen0 collections: 99% reduction (175.7, down from 31777.7)
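
The ratios above can be rechecked from the rounded figures quoted in this description. The short Python sketch below does only that arithmetic; the helper names are made up for the example. From the rounded timings the speedup works out to about 15.88x, so the reported 15.86x was presumably computed from unrounded measurements that are not shown here.

```python
def speedup(before, after):
    """How many times faster the new measurement is."""
    return before / after


def reduction_percent(before, after):
    """Percentage by which the measurement dropped."""
    return (1 - after / before) * 100


# Figures copied from the benchmark summary above.
time_speedup = speedup(56.85, 3.58)                      # milliseconds
memory_reduction = reduction_percent(97486.85, 532.07)   # kilobytes
gen0_reduction = reduction_percent(31777.7, 175.7)       # Gen0 collections

print(f"speedup: {time_speedup:.2f}x")
print(f"memory reduction: {memory_reduction:.1f}%")
print(f"Gen0 reduction: {gen0_reduction:.1f}%")
```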

This optimization passes the test cases and returns the same values as the original function, so functionality is maintained alongside the performance gains.
These enhancements significantly improve the efficiency of parsing through different series, reducing resource usage and improving overall system performance.

…ment

- Implemented caching using ConcurrentDictionary to store normalized results
- Added cache lookup before normalization to improve performance on repeated calls
- Changed ToLower() to ToLowerInvariant() for culture-insensitive lowercase conversion
- Removed unnecessary string.Empty parameter in Trim() call

Performance improvements:
- Mean execution time reduced from 231.819 μs to 16.711 μs
- Memory allocation reduced from 65019 B to 32 B
- GC pressure eliminated (Gen0 collections reduced from 20.5078 to 0)
- Passed Unit Tests
- Implemented K-means++ for better initial centroid selection
- Parallelized point assignment to nearest centroids
- Added early termination when centroids converge
- Optimized distance calculations using squared distances
- Created benchmark to compare original and optimized versions
Benchmark results:
- Execution time: 3.58 ms (down from 56.85 ms, 15.86x faster)
- Memory allocation: 532.07 KB (down from 97,486.85 KB, 99% reduction)
- GC Gen0 collections: 175.7 (down from 31777.7, 99% reduction)

Test cases showed matching results for the optimized function
@majora2007
Member

I'll take a look at your enhancements with K-means++ clustering, but unfortunately it's too late to include them in the initial release of the ColorScape system, as we have just wrapped up all the testing for the next stable.

@F0x1
Author

F0x1 commented Sep 15, 2024

That's fine, I appreciate you taking a look at least.

@majora2007 changed the title from "General Parsing Enhancements" to "Use KMeans++ For ColorScape Generation" on Sep 28, 2024
@majora2007
Member

I tested this and the colors it comes out with are a bit questionable. I've made a lot of tweaks recently to ColorScapes, so I may need to revisit this and play with the grouping.


2 participants