
[PROPOSAL] Add support for asymmetric embedding models to neural-search #620

Open
br3no opened this issue Feb 29, 2024 · 13 comments
Labels: Features (Introduces a new unit of functionality that satisfies a requirement)
br3no commented Feb 29, 2024

What/Why

What are you proposing?

With opensearch-project/ml-commons#1799, OpenSearch can now use asymmetric embedding models such as e5. Asymmetric embedding models differ from symmetric ones in that they embed queries and passages differently. Many of them require the text to be prefixed with special strings (e.g. "query: " and "passage: " for the e5 model family). The issue above documents the work that led to ml-commons supporting these kinds of models: new model configurations were added, a new AsymmetricTextEmbeddingParameters type was introduced, and the correct handling of inference for these models was implemented. Here I propose adding support for this in the neural-search plugin.
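To make the asymmetry concrete, here is a minimal sketch of the input convention. The prefixes come from the e5 model cards; the helper function itself is purely illustrative and not part of any OpenSearch API:

```python
# Sketch of the input convention used by asymmetric models such as the
# e5 family: queries and passages get different prefixes before encoding.
def prepare_for_embedding(text: str, content_type: str) -> str:
    """Prefix `text` according to its role ('query' or 'passage')."""
    prefixes = {"query": "query: ", "passage": "passage: "}
    if content_type not in prefixes:
        raise ValueError(f"unknown content_type: {content_type!r}")
    return prefixes[content_type] + text

# A query and a passage are prepared differently before being embedded:
print(prepare_for_embedding("how do solar panels work", "query"))
print(prepare_for_embedding("Solar panels convert sunlight into power.", "passage"))
```

The embedding model then produces different vectors for the two roles, which is exactly what neural-search would need to account for at ingest time versus query time.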

What users have asked for this feature?

I'm a user and would like to have this feature. 😉
Many of the best-performing embedding models are asymmetric (cf. the MTEB leaderboard: https://huggingface.co/spaces/mteb/leaderboard). For OpenSearch it is important to keep up with the latest developments in information-retrieval research and to give users the freedom to choose from as many different models as possible.

What problems are you trying to solve?

Neural-search should be able to use the newly introduced capability in ml-commons.

What is the developer experience going to be?

My proposal does not affect any API. The idea is for neural-search to automatically detect that a particular model is asymmetric (this can be inferred from the model configuration) and then call the inference method with the right parameters. In the embedding processor we need to add the information that we are embedding a passage; in the query builder, that we are embedding a query. That's it.
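For illustration, the ml-commons work referenced above lets the prefixes live in the model configuration, which is what would allow neural-search to detect asymmetry and pick the right prefix automatically. A registration body might look roughly like the following; the query_prefix/passage_prefix field names follow that work, but treat the exact shape as a sketch rather than authoritative API documentation:

```json
{
  "name": "intfloat/multilingual-e5-small",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers",
    "query_prefix": "query: ",
    "passage_prefix": "passage: "
  }
}
```

A model whose configuration carries such fields can be treated as asymmetric; one without them would behave exactly as before, which is why no API change is needed.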

Are there any security considerations?

No.

Are there any breaking changes to the API

No.

What is the user experience going to be?

The user experience will not change in any way.

Are there breaking changes to the User Experience?

No.

Why should it be built? Any reason not to?

Not building this reduces the attractiveness of OpenSearch for dense retrieval. Other vector databases already support these models.

What will it take to execute?

The ground work has been done in ml-commons, here we only need to build on top of it. I will open a PR soon on which we can base the discussion.

Any remaining open questions?

I can't think of any.

model-collapse (Collaborator) commented Mar 4, 2024

It seems you are trying to add prompts for text embedding, right? Here are my thoughts.

1. It is tough to deploy a GPT-based LLM inside an OpenSearch cluster, since we don't support model-parallel deployment of LLMs across GPUs. We therefore strongly suggest deploying e5-Mistral-7b on a third-party model inference platform such as TorchServe, Triton, or AWS SageMaker.
2. With the remote connector feature, we create an ml-commons model that connects to the inference endpoint.
3. The remote connector can be configured with a request body template, where you can put the prompt in. Please refer to this doc: https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/index/
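To sketch point 3: the prefix can be baked into the connector's request body template. Everything below that is not part of the connector format itself (the endpoint URL, the inputs field expected by the remote service) is hypothetical and depends on the serving platform:

```json
{
  "name": "e5 passage-embedding connector",
  "version": 1,
  "protocol": "http",
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://my-inference-host/embed",
      "headers": { "Content-Type": "application/json" },
      "request_body": "{ \"inputs\": \"passage: ${parameters.input}\" }"
    }
  ]
}
```

Note that a template like this hard-codes a single prefix, so queries and passages would need separate connectors (or separate models), which is part of what this proposal tries to avoid.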

model-collapse (Collaborator)

@br3no does my reply solve your problem? If yes, this issue will be closed.

br3no (Author) commented Mar 13, 2024

@model-collapse, these have been busy days; I haven't had time to go through your reply yet. Please don't close the issue, I'll get to it soon.

@vamshin vamshin added Features Introduces a new unit of functionality that satisfies a requirement and removed untriaged labels Mar 21, 2024
reuschling

I also vote for supporting embedding models that need their input wrapped in a template in order to produce meaningful embeddings. multilingual-e5 in particular is one of the few multilingual embedding models currently available.
Making direct requests to these models, which are hosted locally (crucial for us), seems to be supported now, but support inside neural-search, e.g. for hybrid search, is still missing, which is a real blocker at the moment.

br3no (Author) commented Apr 12, 2024

@model-collapse my use-case is the same as @reuschling's. I'm aware that running larger models in OpenSearch might not be the best approach, but there are plenty of small asymmetric models that could be interesting (cf. https://huggingface.co/intfloat/multilingual-e5-small).

reuschling

Are there any new thoughts on this?
Currently it is possible to load these embedding models into OpenSearch and use them via _predict. It is simply not possible to use them in neural search (ingest and search pipelines) in order to implement a document search.

br3no (Author) commented Apr 19, 2024

I have a PR ready that will allow you to use the neural search plugin with an e5 family model. As soon as opensearch-project/ml-commons#2318 gets merged I will open it here.

@model-collapse maybe you could have a look at the issue above. There’s just one approval missing.

br3no (Author) commented Apr 25, 2024

@reuschling @model-collapse I have opened the PR (#710) mentioned above.

model-collapse (Collaborator)

@br3no @reuschling if possible, let's book a video meeting to talk about this. You can reach me in the public slack channel. The name is Charlie Yang.

yuye-aws (Member)

@br3no You can use the script processor in OpenSearch: https://opensearch.org/docs/latest/ingest-pipelines/processors/script/. Here is a processor that concatenates two fields.

{
  "description": "Concatenate two fields",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "ctx.concatenated_field = ctx.field1 + ' ' + ctx.field2"
      }
    }
  ]
}
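Applied to this thread's problem, the same mechanism could prepend the passage prefix before the embedding processor runs. The field names (text, text_to_embed) below are placeholders, and the downstream embedding processor would need to be pointed at the prefixed field:

```json
{
  "description": "Prepend the e5 passage prefix before embedding",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "ctx.text_to_embed = 'passage: ' + ctx.text"
      }
    }
  ]
}
```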

zhichao-aws (Member)

Hi @br3no, have you considered using search templates for the query? We could create a search template that concatenates the query prefix and run queries through that template.
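A sketch of that idea, assuming a vector field named passage_embedding and a made-up template name; the mustache source is shown as an object for readability:

```json
PUT _scripts/e5_neural_search
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "neural": {
          "passage_embedding": {
            "query_text": "query: {{query_text}}",
            "model_id": "{{model_id}}",
            "k": 10
          }
        }
      }
    }
  }
}
```

This covers the query side only; the passage prefix at ingest time would still need a separate mechanism, such as a script processor.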

reuschling commented Oct 16, 2024

I have the impression that we need to refresh our understanding of what this proposal and the corresponding PR #710 are about. In December 2023, @br3no created the issue Support for asymmetric embedding models in ml-commons; the corresponding PR, asymmetric embeddings, was merged in February 2024. It makes it possible to add a situation-aware prefix specification to a model configuration at model registration time.

The two supported 'situations' are query and passage, because asymmetric embedding models are trained to create different embeddings for the query situation and for the doc/chunk indexing situation (passage). To specify the current situation, it is now possible to set "content_type": "query" inside the parameters section when calling the model. This currently works if you do it manually with the _predict API; see an example here.
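For reference, such a manual call looks roughly like this (model ID elided; the body shape follows the ml-commons text-embedding _predict API, with the content_type parameter described above):

```json
POST /_plugins/_ml/models/<model_id>/_predict
{
  "parameters": {
    "content_type": "query"
  },
  "text_docs": ["how do solar panels work?"]
}
```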

But you almost never do this manually: in OpenSearch you index/ingest documents and add embeddings with an ingest processor, which has to specify the situation "passage" (rename it to 'index' if you like). The ingest processor has to do it because it is the caller of the model; only it can set this simple parameter, which doesn't influence anything else (it just calls an existing Java method: AsymmetricTextEmbeddingParameters.builder().embeddingContentType(EmbeddingContentType.PASSAGE).build()). There is no other way to modify this call to the model. This is what PR #710 does for this proposal, in the TextEmbeddingProcessor.

The same applies to the situation "query": it has to be specified when you make a neural query call, and PR #710 sets it inside the NeuralQueryBuilder.

The functionality is already available in ml-commons models; it takes only these ~two lines of code so that it can be used in OpenSearch search and indexing as well.

model-collapse (Collaborator) commented Oct 19, 2024

@reuschling thanks for this clarification! We weren't aware that the fundamentals had already been merged in ml-commons. Although we have some concerns about this design, consistency between the ml-commons and neural-search repos is the primary value. We are OK to move forward and will start code review ASAP. cc: @br3no
Later, we might raise another issue to propose a more generic solution. Comments welcome.
