[BUG] Slow convert processor in opensearch 2.11.0 #16436

Open
shikeli opened this issue Oct 22, 2024 · 2 comments

shikeli commented Oct 22, 2024

Describe the bug

We are migrating from Elasticsearch 6.8.1 to OpenSearch 2.11.0 and saw ingestion throughput drop by 70%. After investigating, we found that the convert processor is much slower than in Elasticsearch 6.8.1. As a workaround, we removed the processor. I know most teams don't run processors outside of indexing; we run the processors first and then do the indexing.

Related component

Indexing:Performance

To Reproduce

Write a plugin that calls the convert processor.

Expected behavior

We expect the same performance or better.


shikeli added the bug and untriaged labels Oct 22, 2024
@andrross
Member

Thanks @shikeli. Can you share more detail? For example: what kind of documents are you sending, what does your ingest pipeline configuration look like, and do you have any profiling information? If you have a basic reproduction you can share, that would be very helpful too.

@shikeli
Author

shikeli commented Oct 23, 2024

We have one pipeline, which uses the convert processor:


processors:
  - convert: { field: 'foo', target_field: 'bar', type: 'long', ignore_failure: true }

Our ingestion plugin has a class, PipelineExecutor, that runs every pipeline against each document before we do bulk indexing. My profiling data is below.

[Image: OSSlow]
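
For anyone trying to build the basic reproduction requested above, a standalone timing harness might be a starting point. This is only a rough sketch: `ConvertTimingSketch`, `convertField`, and `averageNanosPerDoc` are hypothetical names, and the conversion is a plain `Long.parseLong` stand-in, not the OpenSearch convert processor or the PipelineExecutor class. To measure the real thing, the stand-in would need to be replaced with a call into the actual processor. The field names `foo` and `bar` mirror the pipeline config above.

```java
import java.util.HashMap;
import java.util.Map;

class ConvertTimingSketch {

    // Hypothetical stand-in for a "convert to long" step.
    // This is NOT the OpenSearch ConvertProcessor; swap in the real call to measure it.
    static void convertField(Map<String, Object> doc, String field,
                             String targetField, boolean ignoreFailure) {
        Object value = doc.get(field);
        try {
            if (value == null) {
                throw new IllegalArgumentException("field [" + field + "] is missing");
            }
            doc.put(targetField, Long.parseLong(value.toString()));
        } catch (IllegalArgumentException e) {
            // NumberFormatException is a subclass of IllegalArgumentException.
            // With ignoreFailure set, the document is left without the target field.
            if (!ignoreFailure) {
                throw e;
            }
        }
    }

    // Average wall-clock nanoseconds per document over `iterations` single-field documents.
    static double averageNanosPerDoc(String rawValue, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Map<String, Object> doc = new HashMap<>();
            doc.put("foo", rawValue);
            convertField(doc, "foo", "bar", true);
        }
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        averageNanosPerDoc("12345", 100_000); // warm-up so the JIT has compiled the loop
        System.out.printf("parsable value:   %.1f ns/doc%n",
                averageNanosPerDoc("12345", 1_000_000));
        System.out.printf("unparsable value: %.1f ns/doc%n",
                averageNanosPerDoc("not-a-number", 1_000_000));
    }
}
```

Timing parsable and unparsable values separately is just one variable worth isolating when describing the documents being sent. It is not a claim about where the slowdown comes from. A proper benchmark harness such as JMH would give more trustworthy numbers than this loop.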
