Update root readme adding key features, repo structure details, user guide reference, jira reference, details for repo clarity. #1066

Merged
merged 7 commits into from
Oct 16, 2024
233 changes: 172 additions & 61 deletions README.md
@@ -1,102 +1,199 @@
## OpenSearch upgrades, migrations, and comparison tooling

OpenSearch upgrade, migration, and comparison tooling facilitates OpenSearch migrations and upgrades. With these tools, you can set up a proof-of-concept environment locally using Docker containers or deploy to AWS using a one-click deployment script. Once set up and deployed, users can redirect their production traffic from a source cluster to a provisioned target cluster, enabling a comparison of results between the two clusters. All traffic directed to the source cluster is stored for future replay. Meanwhile, traffic to the target cluster is replayed at an identical rate to ensure a direct "apple-to-apple" comparison. This toolset empowers users to fine-tune cluster configurations and manage workloads more effectively.
# OpenSearch Migrations Engine

## Table of Contents
1. [Overview](#overview)
2. [Key Features](#key-features)
3. [Supported Versions and Platforms](#supported-versions-and-platforms)
4. [Issue Tracking](#issue-tracking)
5. [Project Structure](#project-structure)
6. [Documentation](#documentation)
7. [Getting Started](#getting-started)
- [Local Deployment](#local-deployment)
- [AWS Deployment](#aws-deployment)
8. [Development](#development)
- [Prerequisites](#prerequisites)
- [Building the Project](#building-the-project)
- [Running Tests](#running-tests)
- [Code Style](#code-style)
- [Pre-commit Hooks](#pre-commit-hooks)
9. [Contributing](#contributing)
10. [Publishing](#publishing)
11. [Security](#security)
12. [License](#license)
13. [Acknowledgments](#acknowledgments)

## Overview

The OpenSearch Migrations Engine is a comprehensive set of tools designed to facilitate upgrades, migrations, and comparisons for OpenSearch and Elasticsearch clusters. This project aims to simplify the process of moving between different versions and platforms while ensuring data integrity and performance.

## Key Features

- [OpenSearch upgrades, migrations, and comparison tooling](#opensearch-upgrades-migrations-and-comparison-tooling)
- [Table of Contents](#table-of-contents)
- [Supported cluster versions and platforms](#supported-cluster-versions-and-platforms)
- [Supported Source and Target Versions](#supported-source-and-target-versions)
- [Supported Source and Target Platforms](#supported-source-and-target-platforms)
- [Build and deploy](#build-and-deploy)
- [Local deployment](#local-deployment)
- [AWS deployment](#aws-deployment)
- [Developer contributions](#developer-contributions)
- [Code Style](#code-style)
- [Pre-commit hooks](#pre-commit-hooks)
- [Traffic Capture Proxy and Replayer](#traffic-capture-proxy-and-replayer)
- [Running Tests](#running-tests)
- [Security](#security)
- [License](#license)
- [Releasing](#releasing)
- [Publishing](#publishing)
- **Upgrade and Migration Support**: Provides tools for migrating between different versions of Elasticsearch and OpenSearch.
- **[Metadata Migration](MetadataMigration/README.md)**: Migrate essential cluster components such as configuration, settings, templates, and aliases.
- **Multi-Version Upgrade**: Easily migrate across major versions (e.g., from Elasticsearch 6.8 to OpenSearch 2.15), skipping intermediate upgrades and reducing time and risk.
- **Downgrade Support**: Downgrade to an earlier version if needed (e.g., from Elasticsearch 7.17 to 7.10.2).
Member

There aren't any workflows that allow for downgrades. Is this a more idealized version of what the toolset can support or something that we want to promote as possible in the tool today?

Member

I'd recommend removing this bullet until we've got more concrete support and validation in place.

Collaborator Author

Are you concerned about this specific migration path? This solution is recommended for users who want to move from versions greater than 7.10.2 back to 7.10.2.

- **Existing Data Migration with [Reindex-from-Snapshot](RFS/docs/DESIGN.md)**: Migrate indices and documents using snapshots, updating your data to the latest Lucene version quickly without impacting the target cluster.
- **Live Traffic Capture with [Capture-and-Replay](docs/TrafficCaptureAndReplayDesign.md)**: Capture live traffic from the source cluster and replay it on the target cluster for validation. This ensures the target cluster can handle real-world traffic patterns before fully migrating.

- **Zero-Downtime Migration with [Live Traffic Routing](docs/ClientTrafficSwinging.md)**: Tools to seamlessly switch client traffic between clusters while keeping services fully operational.

## Supported cluster versions and platforms
- **Migration Rollback**: Keep your source cluster synchronized during the migration, allowing you to monitor the target cluster's performance before fully committing to the switch. You can safely revert if needed.

There are numerous combinations of source clusters, target clusters, and platforms. While the tools provided in this repository might work with various combinations, they might not support breaking changes between different source and target versions. Below is a list of supported source and target versions and platforms.
- **User-Friendly Interface via [Migration Console](https://github.com/opensearch-project/opensearch-migrations/blob/main/docs/migration-console.md)**: Command Line Interface (CLI) that guides you through each migration step.

### Supported Source and Target Versions
* Elasticsearch 6.x (Coming soon...)
* Elasticsearch 7.0 - 7.17.x
* OpenSearch 1.x
* OpenSearch 2.x
- **Flexible Deployment Options**:
- **[AWS Deployment](https://aws.amazon.com/solutions/implementations/migration-assistant-for-amazon-opensearch-service/)**: Fully automated deployment to AWS.
- **[Local Docker Deployment](/TrafficCapture/dockerSolution/README.md)**: Run the solution locally in a container for testing and development.
Member

Local Docker doesn't work in a number of scenarios, so I'd recommend pulling it from this list. By following the developer guide this is technically possible, but we haven't made an effort to make sure this is a good user experience.

Collaborator Author

While this may not be consistent across the solution, it’s crucial to support it. This approach enables rapid development and allows others to test the solution without fully committing to using it in the cloud. It also keeps the solution flexible for broader community adoption. We encourage users to explore these options and open issues if they encounter any challenges. This feedback not only helps improve the solution but also enhances the onboarding experience for everyone.

Collaborator

It does work for a number of scenarios and we have at least one engagement that leveraged this, especially in their early development. We should be striving to get a solution that any student or developer could stand up and test anywhere.

Instead of pulling this out, we should open up issues for where this doesn't work and we can track them. My guess is that K8s work can subsume all of this, but we'll run into this case in the future if we don't hold all of our projects to the same standard of building for at least two deployment environments. This documentation is the only thing that we have to uphold that contract.


### Supported Source and Target Platforms
* Self-managed (hosted by cloud provider)
* Self-managed (on-premises)
* Managed cloud offerings (e.g., Amazon OpenSearch, Amazon OpenSearch Serverless)
## Supported Versions and Platforms

## Build and deploy
- **Tested Migration Paths**:
- Elasticsearch 6.8 to OpenSearch 1.3, 2.14
- Elasticsearch 7.10.2 to OpenSearch 1.3, 2.14
- Elasticsearch 7.17 to OpenSearch 1.3, 2.14
- OpenSearch 1.3 to OpenSearch 2.14

### Local deployment
Note that testing is done on specific minor versions, but any minor versions within a listed major version are expected to work.

A containerized end-to-end solution can be deployed locally using the
[Docker Solution](TrafficCapture/dockerSolution/README.md).
- **Platforms**:
- Self-managed (cloud provider hosted)
- Self-managed (on-premises)
- Managed cloud offerings (e.g., Amazon OpenSearch, Amazon OpenSearch Serverless)

### AWS deployment
While untested, alternative cloud providers are expected to work.

Refer to [AWS Deployment](deployment/README.md) to deploy this solution to AWS.
## Issue Tracking

## Developer contributions
We encourage users to open bugs and feature requests in this GitHub repository.

### Code Style
**Encountering a compatibility issue or missing feature?**

- [Search existing issues](https://github.com/opensearch-project/opensearch-migrations/issues) to see if it’s already reported. If it is, feel free to **upvote** and **comment**.
- Can’t find it? [Create a new issue](https://github.com/opensearch-project/opensearch-migrations/issues/new/choose) to let us know.

For issue prioritization and management, the migrations team uses Jira, but uses GitHub issues for community intake:

https://opensearch.atlassian.net/

## Project Structure

- [`CreateSnapshot`](CreateSnapshot/README.md): Tools for creating cluster snapshots.
- [`DocumentsFromSnapshotMigration`](DocumentsFromSnapshotMigration/README.md): Utilities for migrating documents from snapshots.
- [`MetadataMigration`](MetadataMigration/README.md): Core functionality for migrating cluster metadata.
- [`RFS`](RFS/README.md) (Reindex-From-Snapshot):
- Migration utilities for document reindexing and metadata migration.
- Includes tracing contexts for both document and metadata migrations.
- [`TrafficCapture`](TrafficCapture/README.md) (Capture-and-Replay): Projects for proxying, capturing, and replaying HTTP traffic.
- [`migrationConsole`](TrafficCapture/dockerSolution/src/main/docker/migrationConsole/README.md): A comprehensive CLI tool for executing the migration workflow.
- [`console_api`](TrafficCapture/dockerSolution/src/main/docker/migrationConsole/console_api/README.md) (experimental): Django-based API for orchestrating migration tasks.
- [`lib/console_link`](TrafficCapture/dockerSolution/src/main/docker/migrationConsole/lib/console_link/README.md): Core library for migration operations.
- Provides CLI interface (`cli.py`) for user interactions.
- Implements various middleware components for error handling, metadata management, metrics collection, and more.
- Includes models for clusters, backfill operations, replay functionality, and other migration-related tasks.
- Supports various migration scenarios including backfill, replay, and metrics collection.
- Integrates with AWS services like ECS and CloudWatch for deployment and monitoring.
- [`deployment`](deployment/README.md): AWS deployment scripts and configurations.
- `dev-tools`: Development utilities and API request templates.
- `docs`: Project documentation and architecture diagrams.
- `libraries`: Shared libraries used across the project.
- [`test`](test/README.md): End-to-end testing scripts and configurations.
- `transformation`: Data transformation utilities for migration processes.
- [`dashboardsSanitizer`](dashboardsSanitizer/README.md): CLI tool for sanitizing dashboard configurations.
- `testHelperFixtures`: Test utilities including HTTP client for testing.

The migration console CLI provides users with a centralized interface to execute and manage the entire migration workflow, including:
- Configuring source and target clusters
- Managing backfill operations
- Controlling traffic replay
- Monitoring migration progress through metrics
- Handling snapshots and metadata
- Integrating with various deployment environments (Docker locally and AWS ECS)

Users can interact with the migration process through the CLI, which orchestrates the different components of the migration toolkit to perform a seamless migration between Elasticsearch and OpenSearch clusters.

## Documentation

User guide documentation is available in the [OpenSearch Migrations Wiki](https://github.com/opensearch-project/opensearch-migrations/wiki).

## Getting Started

### Local Deployment

For local development and testing, use the Docker solution:

```bash
cd TrafficCapture/dockerSolution
# Follow instructions in the README.md file
```

### AWS Deployment

To deploy the solution on AWS, follow the steps outlined in [Migration Assistant for Amazon OpenSearch Service](https://aws.amazon.com/solutions/implementations/migration-assistant-for-amazon-opensearch-service/), specifically [deploying the solution](https://docs.aws.amazon.com/solutions/latest/migration-assistant-for-amazon-opensearch-service/deploy-the-solution.html).

There are many different source types in this project. The overall style is enforced via `./gradlew spotlessCheck` and is verified on all pull requests. Spotless can resolve these issues automatically with `./gradlew spotlessApply`. A recommended Eclipse formatter, [formatter.xml](./formatter.xml), is available at the root of the project; consult your IDE extensions/plugins for how to use this formatter during development.

### Pre-commit hooks
## Development

Developers must run the "install_githooks.sh" script in order to add any pre-commit hooks. Developers should run these hooks before opening a pull request to ensure checks pass and prevent potential rejection of the pull request.
### Prerequisites

### Traffic Capture Proxy and Replayer
- Java Development Kit (JDK)
- Gradle
- Python
- Docker and Docker Compose (for local deployment)
- AWS CLI (for AWS deployment)
- CDK (for AWS deployment)
- Node (for AWS deployment)
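
As a rough aid for the list above, the following sketch reports which tools are not on `PATH`. The helper name `missing_tools` is hypothetical, and the executable names in the usage comment (`java`, `python3`, `docker`, `aws`, `cdk`, `node`) are assumptions about how these prerequisites are typically installed; it does not check versions or Docker Compose.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository).
# Prints each given command name that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    # `command -v` succeeds only if the shell can resolve the name.
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example usage (executable names are assumptions):
#   missing_tools java python3 docker aws cdk node
```

An empty result means every named command was found. A system-wide `gradle` is left out of the example because the build commands in this README go through the `./gradlew` wrapper.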

The TrafficCapture directory hosts a set of projects designed to facilitate the proxying and capturing of HTTP traffic, which can then be offloaded and replayed to other HTTP(S) server(s).
### Building the Project

More documentation on this directory including the projects within it can be found here: [Traffic Capture](TrafficCapture/README.md).
```bash
./gradlew build
```

### Running Tests

Developers can run a test script which will verify the end-to-end Local Docker Solution.
```bash
./gradlew test
```

More documentation on this test script can be found here:
[End-to-End Testing](test/README.md)
### Continuous Integration / Continuous Deployment
We use a combination of GitHub Actions and Jenkins so that we can publish releases on a weekly basis and provide attestation for users interested in migration tooling.

## Security
Jenkins pipelines are available [here](https://migrations.ci.opensearch.org/)

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

## License
### Code Style

This project is licensed under the Apache-2.0 License.
We use Spotless for code formatting. To check and apply the code style:

```bash
./gradlew spotlessCheck
./gradlew spotlessApply
```
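
The two commands can be chained so that formatting is applied only when the check fails. This is a minimal sketch, not a script shipped with the project: the function name `format_or_fix` and the `GRADLEW` override variable are hypothetical, and only the `spotlessCheck` and `spotlessApply` tasks come from the text above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository).
# Prints "clean" if spotlessCheck passes, "fixed" if spotlessApply had to
# rewrite files, and "failed" (with a non-zero status) if both tasks fail.
format_or_fix() {
  # GRADLEW is an assumed override for testing; defaults to the wrapper.
  local gradlew="${GRADLEW:-./gradlew}"
  # Gradle output is sent to stderr so stdout carries only the status word.
  if "$gradlew" spotlessCheck 1>&2; then
    echo "clean"
  elif "$gradlew" spotlessApply 1>&2; then
    echo "fixed"
  else
    echo "failed"
    return 1
  fi
}
```

When the result is `fixed`, review the rewritten files (for example with `git diff`) before committing.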

### Pre-commit Hooks

## Releasing
Install the pre-commit hooks:

```bash
./install_githooks.sh
```

The release process is standard across repositories in this org and is run by a release manager volunteering from amongst [maintainers](MAINTAINERS.md).
## Contributing

1. Create a tag, e.g. 0.1.0, and push it to this GitHub repository.
2. The [release-drafter.yml](.github/workflows/release-drafter.yml) will be automatically kicked off and a draft release will be created.
3. This draft release triggers the [jenkins release workflow](https://build.ci.opensearch.org/job/opensearch-migrations-release), which releases the opensearch-migrations toolset and publishes it on artifacts.opensearch.org, for example as https://artifacts.opensearch.org/migrations/0.1.0/opensearch-migrations-0.1.0.tar.gz.
4. Once the above release workflow is successful, the drafted release on GitHub is published automatically.
Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct and the process for submitting pull requests.

## Publishing

This project can be published to a local maven repository with:
This project can be published to a local Maven repository with:
```sh
./gradlew publishToMavenLocal
```

And subsequently imported into a separate gradle project with (replacing name with any subProject name)
And subsequently imported into a separate Gradle project with (replacing name with any subProject name)
```groovy
repositories {
mavenCentral()
@@ -115,8 +212,22 @@ The entire list of published subprojects can be viewed with
```


To include a testFixture dependency, define the import like
To include a test Fixture dependency, define the import like

```groovy
testImplementation testFixtures('org.opensearch.migrations.trafficcapture:trafficReplayer:0.1.0-SNAPSHOT')
```
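
To confirm that `publishToMavenLocal` produced the artifact before importing it, it can help to look in the local repository directly. The sketch below maps Maven coordinates to the standard local repository layout (group ID dots become directories, followed by artifact ID and version). The helper name `m2_dir` is hypothetical, and it assumes the default location `~/.m2/repository`, which a custom Maven configuration may override.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository).
# Converts "group:artifact:version" into the matching directory under the
# default local Maven repository (assumed to be ~/.m2/repository).
m2_dir() {
  local coords="$1"
  local group="${coords%%:*}"    # text before the first colon
  local rest="${coords#*:}"      # text after the first colon
  local artifact="${rest%%:*}"
  local version="${rest#*:}"
  printf '%s/.m2/repository/%s/%s/%s\n' \
    "$HOME" "$(echo "$group" | tr . /)" "$artifact" "$version"
}

# Example usage with the coordinates from the snippet above:
#   ls "$(m2_dir 'org.opensearch.migrations.trafficcapture:trafficReplayer:0.1.0-SNAPSHOT')"
```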
## Security

See [SECURITY.md](SECURITY.md) for information about reporting security vulnerabilities.

## License

This project is licensed under the Apache-2.0 License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- OpenSearch Community
- Contributors and maintainers

For more detailed information about specific components, please refer to the README files in the respective directories.