Releases: enjalot/latent-scope

Table improvements & embedding visualization

26 Jul 17:38

This release fixes a few bugs in the Explore page table UI and in nearest neighbor search, making both more reliable and performant.

Thank you to @hydrosquall for issues & PRs! #49 #50 #52

A new experimental feature for directly visualizing embeddings in the table is ready to try.

Use any Sentence Transformer from HuggingFace

23 Jul 19:40

This release adopts Sentence Transformers for embedding, using local open-source models that are downloaded automatically from the HuggingFace Hub.

It also keeps track of recently used models and brings it all together in a much improved selector component on the frontend.

Also includes a PR from @hydrosquall that fixed a bug using truncated embeddings in the nearest neighbor search.

One minor note: for now, truncating Sentence Transformer embeddings isn't supported, because we have no general way to tell whether a given model supports it. We could maintain a separate list of Matryoshka-enabled models.
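
As a rough illustration of what searching over truncated embeddings involves, here is a minimal sketch in plain Python. It is not Latent Scope's implementation, and the helper names `truncate_and_normalize` and `nearest` are hypothetical. It assumes cosine similarity and shows that the query and the stored vectors must be cut to the same number of dimensions and renormalized before they are compared.

```python
import math


def truncate_and_normalize(vec, dims):
    """Keep the first `dims` components, then rescale to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


def nearest(query, vectors, dims, k=2):
    """Return the indices of the k stored vectors most similar to the query."""
    q = truncate_and_normalize(query, dims)
    scored = []
    for i, v in enumerate(vectors):
        t = truncate_and_normalize(v, dims)
        # Dot product of unit vectors equals cosine similarity.
        scored.append((sum(a * b for a, b in zip(q, t)), i))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:k]]


# Toy 4-dimensional vectors, invented for this example.
vectors = [
    [1.0, 0.0, 0.0, 0.9],
    [0.0, 1.0, 0.0, 0.1],
    [0.9, 0.1, 0.0, -0.9],
]
query = [1.0, 0.0, 0.0, -1.0]

full = nearest(query, vectors, dims=4)   # ranking with all dimensions
short = nearest(query, vectors, dims=2)  # ranking after truncation
```

With these toy vectors the two rankings differ, which is why truncation is only safe for models trained to keep the most useful information in the leading dimensions.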

export interactive plots

05 Jul 15:53

Optionally export interactive DataMapPlots instead of static ones, thanks to @dhruv-anand-aintech.

Fixes an unpinned dependency that was breaking transformers models.

Export static plots

21 Jun 17:27

Implements #23, creating a UI to easily export static plots using datamapplot

Supports more input file types, thanks to #40.
Supports proxy servers and alternate OpenAI-compatible endpoints (#44).
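
To make "OpenAI-compatible endpoint" concrete, here is a small sketch using only the standard library. It builds, but does not send, a request against a custom base URL. The `/embeddings` path and the `model`/`input` payload follow the OpenAI embeddings API. The function name, base URL, key, and model name are placeholders, and the sketch does not show how Latent Scope itself is configured.

```python
import json
import urllib.request


def build_embedding_request(base_url, api_key, model, texts):
    """Build (but do not send) an OpenAI-style embeddings request."""
    url = base_url.rstrip("/") + "/embeddings"
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values: point base_url at a proxy or a local compatible server.
req = build_embedding_request(
    "http://localhost:8000/v1",
    "sk-placeholder",
    "my-embedding-model",
    ["hello world"],
)
```

Only the base URL changes between the hosted API and a compatible server, so the rest of a client can stay the same.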

The requirements.txt has been loosened, so Python 3.12 should now work and newer versions of some important pip packages will be installed.

new models

15 May 13:54

Improve setup flow

13 May 13:42

Minor improvements to the setup flow

Refined data export

05 May 23:47

Creating a scope now also creates a combined parquet of the input data and the scope annotations.

This makes loading curated scopes much easier in other workflows.

0.2.0 Explore Overhaul

01 May 22:06

This release makes a number of improvements to the exploring and curation part of Latent Scope. You can now filter in a number of ways from a unified interface and perform bulk actions on the filtered points.

The following issues were closed:

  • #12 guide for setup page
  • #11 guide for explore page
  • #19 filtering by dataset column

The following issue wasn't closed, but images can now be shown in the data table if there is an image URL:

  • #24 showing images

Improved documentation and a number of guides have been published to https://enjalot.github.io/latent-scope/

v0.1.8

21 Mar 19:16

Fixed a long-standing performance issue with loading the Python module. Imports are now done on demand, shaving ~4 seconds off loading the library (including starting the server or running any of the scripts).

Implemented the improved ingest flow (#34). When data is ingested we now:

  • Check types of columns and generate summary statistics. This will enable future UI features
  • Check for array columns and suggest importing those as embeddings
  • Check for name collisions when uploading a dataset, giving a warning if a collision is detected
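
The checks listed above can be sketched in plain Python as follows. This is an assumption-laden illustration rather than the project's ingest code: the table is a list of dicts, and `inspect_columns`, `name_collides`, and the `possible_embedding` flag are hypothetical names.

```python
def inspect_columns(rows):
    """Summarize each column of a list-of-dicts table."""
    report = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        info = {"types": sorted({type(v).__name__ for v in values})}
        # Summary statistics for purely numeric columns.
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
            info["min"] = min(values)
            info["max"] = max(values)
            info["mean"] = sum(values) / len(values)
        # A column of equal-length numeric lists looks like an embedding.
        is_numeric_list = all(
            isinstance(v, list) and v and all(isinstance(x, (int, float)) for x in v)
            for v in values
        )
        if is_numeric_list and len({len(v) for v in values}) == 1:
            info["possible_embedding"] = True
        report[name] = info
    return report


def name_collides(new_name, existing_names):
    """Case-insensitive check against already ingested dataset names."""
    return new_name.lower() in {n.lower() for n in existing_names}


# Toy rows, invented for this example.
rows = [
    {"text": "a cat", "score": 1, "vec": [0.1, 0.2]},
    {"text": "a dog", "score": 3, "vec": [0.3, 0.4]},
]
report = inspect_columns(rows)
```

Here `report["vec"]` is flagged as a possible embedding column, and `name_collides("Demo", ["demo"])` would trigger a warning.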

0.1.7 Setup process improvements

19 Mar 19:41
  • A number of styling improvements were made to the setup page.
  • Fixed an annoying loading issue that was adding ~4 seconds to each part of the process.
  • Added nltk top words as a CPU-friendly way to summarize clusters
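
To show why top words are a cheap way to label a cluster, here is a minimal sketch. It does not use nltk: the tiny `STOPWORDS` set is a stand-in for a real stopword list, and `top_words` is a hypothetical helper, not the function the release added.

```python
import re
from collections import Counter

# Tiny stand-in stopword list, an assumption for this example only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on"}


def top_words(texts, n=3):
    """Return the n most frequent non-stopword tokens across the texts."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return [word for word, _ in counts.most_common(n)]


# Toy cluster, invented for this example.
cluster = ["The cat sat on the mat", "A cat and a dog", "The dog chased the cat"]
label = top_words(cluster, n=2)
```

Counting words needs no model or GPU, so a label like this can be produced for every cluster on a laptop.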