diff --git a/FetchMigration/Dockerfile b/FetchMigration/Dockerfile
index 873732470..8d9b5ab99 100644
--- a/FetchMigration/Dockerfile
+++ b/FetchMigration/Dockerfile
@@ -1,12 +1,10 @@
-# TODO Move away from snapshot version after OS source is released
-# https://github.com/opensearch-project/data-prepper/issues/1985
-FROM opensearch-data-prepper:2.5.0-SNAPSHOT
+FROM opensearchproject/data-prepper:2.5.0
 COPY python/requirements.txt .
 
 # Install dependencies to local user directory
-RUN apk update
-RUN apk add --no-cache python3 py-pip
-RUN pip install --user -r requirements.txt
+RUN apt -y update
+RUN apt -y install python3 python3-pip
+RUN pip3 install --user -r requirements.txt
 
 ENV FM_CODE_PATH /code
 WORKDIR $FM_CODE_PATH
@@ -21,4 +19,4 @@ RUN echo "ssl: false" > $DATA_PREPPER_PATH/config/data-prepper-config.yaml
 RUN echo "metricRegistries: [Prometheus]" >> $DATA_PREPPER_PATH/config/data-prepper-config.yaml
 
 # Include the -u flag to have stdout logged
-ENTRYPOINT python -u ./fetch_orchestrator.py $DATA_PREPPER_PATH $FM_CODE_PATH/input.yaml http://localhost:4900
+ENTRYPOINT python3 -u ./fetch_orchestrator.py $DATA_PREPPER_PATH $FM_CODE_PATH/input.yaml http://localhost:4900
diff --git a/FetchMigration/README.md b/FetchMigration/README.md
index bef2cb542..3685ed9ed 100644
--- a/FetchMigration/README.md
+++ b/FetchMigration/README.md
@@ -1,43 +1,38 @@
-# Index Configuration Tool
+# "Fetch" Data Migration / Backfill
 
-Python package that automates the creation of indices on a target cluster based on the contents of a source cluster.
-Index settings and index mappings are correctly copied over, but no data is transferred.
-This tool seeks to eliminate the need to [specify index templates](https://github.com/awslabs/logstash-output-amazon_es#optional-parameters) when migrating data from one cluster to another.
-The tool currently supports ElasticSearch or OpenSearch as source and target.
+Fetch Migration provides an easy-to-use tool that simplifies the process of moving indices and their data from a
+"source" cluster (either Elasticsearch or OpenSearch) to a "target" OpenSearch cluster. It automates the process of
+comparing indices between the two clusters and only creates index metadata (settings and mappings) that does not already
+exist on the target cluster. Internally, the tool uses [Data Prepper](https://github.com/opensearch-project/data-prepper)
+to migrate data for these created indices.
 
-## Parameters
+The Fetch Migration tool is implemented in Python.
+A Docker image can be built using the included [Dockerfile](./Dockerfile).
 
-The first required input to the tool is a path to a [Data Prepper](https://github.com/opensearch-project/data-prepper) pipeline YAML file, which is parsed to obtain the source and target cluster endpoints.
-The second required input is an output path to which a modified version of the pipeline YAML file is written.
-This version of the pipeline adds an index inclusion configuration to the sink, specifying only those indices that were created by the index configuration tool.
-The tool also supports several optional flags:
+## Components
 
-| Flag | Purpose |
-| ------------- | ------------- |
-| `-h, --help` | Prints help text and exits |
-| `--report, -r` | Prints a report of indices indicating which ones will be created, along with indices that are identical or have conflicting settings/mappings. |
-| `--dryrun` | Skips the actual creation of indices on the target cluster |
+The tool consists of three components:
+* A "metadata migration" module that handles metadata comparison between the source and target clusters.
+This can output a human-readable report as well as a Data Prepper pipeline `yaml` file.
+* A "migration monitor" module that monitors the progress of the migration and shuts down the Data Prepper pipeline
+once the target document count has been reached.
+* An "orchestrator" module that sequences these steps as a workflow and manages the kick-off of the Data Prepper
+process between them.
 
-### Reporting
-
-If `--report` is specified, the tool prints all processed indices organized into 3 buckets:
-* Successfully created on the target cluster
-* Skipped due to a conflict in settings/mappings
-* Skipped since the index configuration is identical on source and target
+The orchestrator module is the Docker entrypoint for the tool, though each component can be executed separately
+via Python. Help text for each module can be printed by supplying the `-h / --help` flag.
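+
+For example, the two module entry points referenced elsewhere in this README can each print their usage:
+
+```shell
+python python/fetch_orchestrator.py --help
+python python/metadata_migration.py --help
+```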
 
 ## Current Limitations
 
-* Only supports ElasticSearch and OpenSearch endpoints for source and target
-* Only supports basic auth
-* Type mappings for legacy indices are not handled
-* Index templates and index aliases are not copied
-* Index health is not validated after creation
-
-## Usage
+* Fetch Migration runs as a single instance and does not support vertical scaling or data slicing
+* The tool does not support customizing the list of indices included for migration
+* Metadata migration only supports basic auth
+* The migration does not filter out `red` indices
+* In the event that the migration fails or the process dies, the created indices on the target cluster are not rolled back
 
-### Command-Line
+## Execution
 
-#### Setup:
+### Python
 
 * [Clone](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository) this GitHub repo
 * Install [Python](https://www.python.org/)
@@ -47,15 +42,13 @@ If `--report` is specified, the tool prints all processed indices organized into
 Navigate to the cloned GitHub repo. Then, install the required Python dependencies by running:
 
 ```shell
-python -m pip install -r index_configuration_tool/requirements.txt
+python -m pip install -r python/requirements.txt
 ```
 
-#### Execution:
-
-After [setup](#setup), the tool can be executed using:
+The Fetch Migration workflow can then be kicked off via the orchestrator module:
 
 ```shell
-python index_configuration_tool/metadata_migration.py
+python python/fetch_orchestrator.py --help
 ```
 
 ### Docker
@@ -67,42 +60,48 @@ docker build -t fetch-migration .
 ```
 
 Then run the `fetch-migration` image.
-Replace `<pipeline_yaml_path>` in the command below with the path to your Logstash config file:
+Replace `<pipeline_yaml_path>` in the command below with the path to your Data Prepper pipeline `yaml` file:
 
 ```shell
-docker run -p 4900:4900 -v <pipeline_yaml_path>:/code/input.yaml ict
+docker run -p 4900:4900 -v <pipeline_yaml_path>:/code/input.yaml fetch-migration
 ```
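+
+Alternatively, the orchestrator also checks an `INLINE_PIPELINE` environment variable (see `fetch_orchestrator.py` below) holding a base64-encoded pipeline, which it decodes and writes out before starting. A sketch of that usage:
+
+```shell
+export INLINE_PIPELINE=$(base64 < <pipeline_yaml_path>)
+docker run -p 4900:4900 -e INLINE_PIPELINE fetch-migration
+```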
+
+### AWS deployment
+
+Refer to [AWS Deployment](../deployment/README.md) to deploy this solution to AWS.
+
 ## Development
 
-The source code for the tool is located under the `index_configuration_tool/` directory. Please refer to the [Setup](#setup) section to ensure that the necessary dependencies are installed prior to development.
+The source code for the tool is located under the `python/` directory, with unit tests in the `tests/` subdirectory.
+Please refer to the [Setup](#setup) section to ensure that the necessary dependencies are installed prior to development.
 
 Additionally, you'll also need to install development dependencies by running:
 
 ```shell
-python -m pip install -r index_configuration_tool/dev-requirements.txt
+python -m pip install -r python/dev-requirements.txt
 ```
 
 ### Unit Tests
 
-Unit tests are located in a sub-directory named `tests`. Unit tests can be run using:
+Unit tests can be run from the `python/` directory using:
 
 ```shell
-python -m unittest
+python -m coverage run -m unittest
 ```
 
 ### Coverage
 
-Code coverage metrics can be generated by first running unit tests using _coverage run_:
+_Code coverage_ metrics can be generated after a unit-test run. A report can either be printed on the command line:
 
 ```shell
-python -m coverage run -m unittest
+python -m coverage report --omit "*/tests/*"
 ```
 
-Then a report can either be printed on the command line or generated as HTML.
-Note that the `--omit` parameter must be specified to avoid tracking code coverage on unit test code:
+or generated as HTML:
 
 ```shell
 python -m coverage report --omit "*/tests/*"
 python -m coverage html --omit "*/tests/*"
-```
\ No newline at end of file
+```
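+
+For CI systems, an XML report can likewise be produced with coverage's standard `xml` subcommand (not otherwise referenced in this repo):
+
+```shell
+python -m coverage xml --omit "*/tests/*"
+```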
+
+Note that the `--omit` parameter must be specified to avoid tracking code coverage on unit test code itself.
\ No newline at end of file
diff --git a/FetchMigration/python/dev-requirements.txt b/FetchMigration/python/dev-requirements.txt
index ecef90389..2efcff09b 100644
--- a/FetchMigration/python/dev-requirements.txt
+++ b/FetchMigration/python/dev-requirements.txt
@@ -1 +1,2 @@
-coverage>=7.2.3
\ No newline at end of file
+coverage>=7.3.2
+pur>=7.3.1
\ No newline at end of file
diff --git a/FetchMigration/python/fetch_orchestrator.py b/FetchMigration/python/fetch_orchestrator.py
index ec5985002..1c5b467c4 100644
--- a/FetchMigration/python/fetch_orchestrator.py
+++ b/FetchMigration/python/fetch_orchestrator.py
@@ -59,8 +59,9 @@ def run(dp_base_path: str, dp_config_file: str, dp_endpoint: str):
     cli_args = arg_parser.parse_args()
     base_path = os.path.expandvars(cli_args.data_prepper_path)
 
-    if os.environ["INLINE_PIPELINE"] is not None:
-        decoded_bytes = base64.b64decode(os.environ["INLINE_PIPELINE"])
+    inline_pipeline = os.environ.get("INLINE_PIPELINE", None)
+    if inline_pipeline is not None:
+        decoded_bytes = base64.b64decode(inline_pipeline)
         with open(cli_args.config_file_path, 'wb') as config_file:
             config_file.write(decoded_bytes)
-    run(base_path, cli_args.config_file_path, cli_args.dp_endpoint)
+    run(base_path, cli_args.config_file_path, cli_args.data_prepper_endpoint)
diff --git a/FetchMigration/python/requirements.txt b/FetchMigration/python/requirements.txt
index dff9f1b69..03e801384 100644
--- a/FetchMigration/python/requirements.txt
+++ b/FetchMigration/python/requirements.txt
@@ -1,5 +1,5 @@
 jsondiff>=2.0.0
 prometheus-client>=0.17.1
-pyyaml>=6.0
+pyyaml>=6.0.1
 requests>=2.31.0
-responses>=0.23.1
\ No newline at end of file
+responses>=0.23.3
diff --git a/README.md b/README.md
index 3e721a156..6eac48970 100644
--- a/README.md
+++ b/README.md
@@ -32,7 +32,7 @@ A containerized end-to-end solution can be deployed locally using the
 
 ### AWS deployment
 
-Refer to [AWS Deployment](deployment/copilot/README.md) to deploy this solution to AWS.
+Refer to [AWS Deployment](deployment/README.md) to deploy this solution to AWS.
 
 ## Developer contributions
 
@@ -46,6 +46,12 @@ The TrafficCapture directory hosts a set of projects designed to facilitate the
 
 More documentation on this directory including the projects within it can be found here: [Traffic Capture](TrafficCapture/README.md).
 
+### Fetch Migration
+
+The FetchMigration directory hosts tools that simplify the process of backfilling / moving data from one cluster to another.
+
+Further documentation can be found here: [Fetch Migration README](FetchMigration/README.md).
+
 ### Running Tests
 
 Developers can run a test script which will verify the end-to-end Local Docker Solution.
diff --git a/TrafficCapture/dockerSolution/src/main/docker/migrationConsole/runTestBenchmarks.sh b/TrafficCapture/dockerSolution/src/main/docker/migrationConsole/runTestBenchmarks.sh
index 37581919b..acec04354 100644
--- a/TrafficCapture/dockerSolution/src/main/docker/migrationConsole/runTestBenchmarks.sh
+++ b/TrafficCapture/dockerSolution/src/main/docker/migrationConsole/runTestBenchmarks.sh
@@ -5,6 +5,7 @@ endpoint="https://capture-proxy-es:9200"
 auth_user="admin"
 auth_pass="admin"
 no_auth=false
+no_ssl=false
 
 # Override default values with optional command-line arguments
 while [[ $# -gt 0 ]]; do
@@ -29,6 +30,10 @@ while [[ $# -gt 0 ]]; do
             no_auth=true
             shift
             ;;
+        --no-ssl)
+            no_ssl=true
+            shift
+            ;;
         *)
             shift
             ;;
@@ -42,8 +47,13 @@ else
     auth_string=",basic_auth_user:${auth_user},basic_auth_password:${auth_pass}"
 fi
 
+if [ "$no_ssl" = true ]; then
+    base_options_string=""
+else
+    base_options_string="use_ssl:true,verify_certs:false"
+fi
+
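+# For illustration, a run against a plain-HTTP source without auth, using only the
+# flags defined above, might look like: ./runTestBenchmarks.sh --no-ssl --no-auth
+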
 # Construct the final client options string
-base_options_string="use_ssl:true,verify_certs:false"
 client_options="${base_options_string}${auth_string}"
 
 echo "Running opensearch-benchmark workloads against ${endpoint}"
diff --git a/deployment/README.md b/deployment/README.md
index 6f8727b39..0a005486d 100644
--- a/deployment/README.md
+++ b/deployment/README.md
@@ -1,34 +1,248 @@
-### Deployment
-This directory is aimed at housing deployment/distribution methods for various migration related images and infrastructure. It is not specific to any given platform and should be expanded to more platforms as needed.
+# Deploying to AWS using CDK and Copilot
+Copilot is a tool for deploying containerized applications on AWS ECS. Official documentation can be found [here](https://aws.github.io/copilot-cli/docs/overview/).
 
-### Deploying Migration solution to AWS
+**Notice**: These tools are free to use, but the user is responsible for the cost of underlying infrastructure required to operate the solution. We welcome feedback and contributions to optimize costs.
 
-**Note**: These features are still under development and subject to change
+## Initial Setup
 
-Detailed instructions for deploying the CDK and setting up its prerequisites can be found in the opensearch-service-migration [README](./cdk/opensearch-service-migration/README.md). This could involve setting CDK context parameters to customize your OpenSearch Service Domain and VPC as well as setting needed migration parameters. A sample **testing** `cdk.context.json` for an E2E migration setup could be:
+### Install Prerequisites
+
+#### Docker
+Docker is used by Copilot to build container images. If not installed, follow the steps [here](https://docs.docker.com/engine/install/) to set up. Later versions are recommended.
+
+#### Git
+Git is used by the opensearch-migrations repo to fetch associated repositories (such as the traffic-comparator repo) for constructing their respective Dockerfiles. Steps to set up can be found [here](https://github.com/git-guides/install-git).
+
+#### Java 11
+Java is used by the opensearch-migrations repo and Gradle, its associated build tool. The current required version is Java 11.
+
+#### Creating Dockerfiles
+This project needs to build the required Dockerfiles that Copilot will use in its services. From the `TrafficCapture` directory, the following command can be run to build these files:
 ```
+./gradlew :dockerSolution:buildDockerImages
+```
-{
-  "engineVersion": "OS_1.3",
-  "domainName": "aos-test-domain",
-  "dataNodeCount": 2,
-  "vpcEnabled": true,
-  "availabilityZoneCount": 2,
-  "openAccessPolicyEnabled": true,
-  "domainRemovalPolicy": "DESTROY",
-  "migrationAssistanceEnabled": true,
-  "MSKARN": "arn:aws:kafka:us-east-1:12345678912:cluster/logging-msk-cluster/123456789-12bd-4f34-932b-52060474aa0f-7",
-  "MSKBrokers": [
-    "b-2-public.loggingmskcluster.abc123.c7.kafka.us-east-1.amazonaws.com:9198",
-    "b-1-public.loggingmskcluster.abc123.c7.kafka.us-east-1.amazonaws.com:9198"
-  ],
-  "MSKTopic": "logging-traffic-topic"
-}
+More details can be found [here](../TrafficCapture/dockerSolution/README.md)
+
+#### Setting up Copilot CLI
+If you are on Mac, the following Homebrew command can be run to set up the Copilot CLI:
+```
+brew install aws/tap/copilot-cli
+```
+Otherwise, please follow the manual instructions [here](https://aws.github.io/copilot-cli/docs/getting-started/install/)
+
+## Deployment
+### Using the devDeploy script
 
-Once prerequisites are met and context parameters are set, deploying the resources to AWS can be done simply by running the following command:
+The following script command can be executed to deploy both the CDK infrastructure and Copilot services for a development environment:
+```shell
+./devDeploy.sh
+```
+Options can be found with:
+```shell
+./devDeploy.sh --help
 ```
-cdk deploy "*"
-```
\ No newline at end of file
+
+Before using the script, please ensure that prerequisites have been completed, and AWS credentials have been configured.
+
+### Step-by-step deployment
+
+#### CDK
+
+Before starting this section, ensure that AWS credentials have been configured and are still valid.
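+One quick way to verify this, assuming the AWS CLI is installed, is to check the active identity:
+
+```shell
+aws sts get-caller-identity
+```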
+
+* Ensure that `cdk/opensearch-service-migration` is your working directory
+* Export environment variables required by CDK and Copilot:
+  * `export COPILOT_APP_NAME=migration-copilot`
+  * `export CDK_DEPLOYMENT_STAGE=dev && export COPILOT_DEPLOYMENT_STAGE=dev`
+  * If necessary, `export AWS_DEFAULT_REGION=<region>`
+  * `export COPILOT_REGION=$AWS_DEFAULT_REGION`
+* Set up the `cdk.context.json` file as necessary. For more documentation on the available options, see the README [here](cdk/opensearch-service-migration/README.md)
+* If required, update/customize the [dp_pipeline_template.yaml](cdk/opensearch-service-migration/dp_pipeline_template.yaml) Data Prepper pipeline configuration file. Consult the [Data Prepper documentation](https://github.com/opensearch-project/data-prepper/blob/main/docs/overview.md) for more information
+  * Note: leave the `<*_CLUSTER_HOST>` values as-is - these will be replaced by CDK
+* Deploy all stacks using CDK:
+  * `cdk bootstrap`
+  * `cdk deploy -O cdkOutput.json --require-approval never --concurrency 2 "*"`
+* If the historical migration stack was deployed, note down the ECS run task command from the CDK output (a sketch of its general shape follows this list)
+* Finally, eval the exports from the CDK output
+  * `eval "$(grep -o "export [a-zA-Z0-9_]*=[^\\;\"]*" cdkOutput.json | sed 's/=/="/' | sed 's/.*/&"/')"`
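+
+The exact run task command comes from `cdkOutput.json`, but its general shape is roughly the following (all values are placeholders to be taken from the CDK output):
+
+```shell
+aws ecs run-task --cluster <fetch-migration-cluster> \
+  --task-definition <fetch-migration-task-definition> \
+  --launch-type FARGATE \
+  --network-configuration "awsvpcConfiguration={subnets=[<private-subnet-id>],securityGroups=[<security-group-id>]}"
+```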
+
+Additionally, the following export is needed for the Replayer Copilot service:
+```
+export MIGRATION_REPLAYER_COMMAND=/bin/sh -c "/runJavaWithClasspath.sh org.opensearch.migrations.replay.TrafficReplayer $MIGRATION_DOMAIN_ENDPOINT --insecure --kafka-traffic-brokers $MIGRATION_KAFKA_BROKER_ENDPOINTS --kafka-traffic-topic logging-traffic-topic --kafka-traffic-group-id default-logging-group --kafka-traffic-enable-msk-auth --auth-header-user-and-secret $MIGRATION_DOMAIN_USER_AND_SECRET_ARN | nc traffic-comparator 9220"
+```
+
+#### Copilot
+
+##### Overview
+
+It is **vital** to run any `copilot` commands from within the `deployment/copilot` directory. When components are initialized, the name given will be searched for in the immediate directory structure to look for an existing `manifest.yml` for that component. If found, it will use the existing manifest and not create its own. This Copilot app already has existing manifests for each of its services and a dev environment, which should be used for proper operation.
+
+When initially setting up Copilot, each component (apps, services, and environments) needs to be initialized. Beware when initializing an environment in Copilot: it will prompt you for values even if you've defined them in the `manifest.yml`, though values input at the prompt are ignored in favor of what was specified in the file.
+
+If using temporary environment credentials when initializing an environment:
+* Copilot will prompt you to enter each variable (AWS Access Key ID, AWS Secret Access Key, AWS Session Token). If these variables are already available in your environment, these three prompts can be skipped by pressing `enter`.
+* When prompted ` Would you like to use the default configuration for a new environment?` select `Yes, use default.` as this will ultimately get ignored for what has been configured in the existing `manifest.yml`
+* The last prompt will ask for the desired deployment region and should be filled out, as Copilot will store this internally.
+
+This Copilot app supports deploying the Capture Proxy and Elasticsearch as a single service `capture-proxy-es` (as shown below) or as separate services `capture-proxy` and `elasticsearch`.
+
+**Note**: This app also contains `kafka-broker` and `kafka-zookeeper` services which are currently experimental; usage of MSK is preferred. These services do not need to be deployed, and so are not listed below.
+
+##### Commands
+
+Before starting this section, ensure that AWS credentials have been configured and are still valid.
+
+* Unset the AWS region environment variable to prevent potential conflicts in Copilot
+  * `export AWS_DEFAULT_REGION=`
+* Initialize the app: `copilot app init $COPILOT_APP_NAME`
+* Initialize the environment using the environment variables exported from the CDK:
+
+```shell
+copilot env init -a $COPILOT_APP_NAME --name $COPILOT_DEPLOYMENT_STAGE --import-vpc-id $MIGRATION_VPC_ID --import-public-subnets $MIGRATION_PUBLIC_SUBNETS --import-private-subnets $MIGRATION_PRIVATE_SUBNETS --aws-access-key-id $AWS_ACCESS_KEY_ID --aws-secret-access-key $AWS_SECRET_ACCESS_KEY --aws-session-token $AWS_SESSION_TOKEN --region $COPILOT_REGION
+```
+
+* Initialize services by name:
+
+```shell
+copilot svc init --name traffic-replayer
+copilot svc init --name traffic-comparator
+copilot svc init --name traffic-comparator-jupyter
+copilot svc init --name capture-proxy-es
+copilot svc init --name migration-console
+```
+
+* Deploy the environment: `copilot env deploy -a $COPILOT_APP_NAME --name $COPILOT_DEPLOYMENT_STAGE`
+* Deploy services by name:
+
+```shell
+copilot svc deploy --name traffic-comparator-jupyter --env $COPILOT_DEPLOYMENT_STAGE
+copilot svc deploy --name traffic-comparator --env $COPILOT_DEPLOYMENT_STAGE
+copilot svc deploy --name traffic-replayer --env $COPILOT_DEPLOYMENT_STAGE
+copilot svc deploy --name capture-proxy-es --env $COPILOT_DEPLOYMENT_STAGE
+copilot svc deploy --name migration-console --env $COPILOT_DEPLOYMENT_STAGE
+```
+
+Currently, Copilot does not support deploying all services at once (issue [here](https://github.com/aws/copilot-cli/issues/3474)) or creating dependencies between separate services. Thus, services need to be initialized and deployed one at a time as shown above.
+
+Note - When deploying a service with the Copilot CLI, a status bar will be displayed that gets updated as the deployment progresses. The command will complete when the specific service has all its resources created and health checks are passing on the deployed containers.
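+
+Since each deploy command blocks until its service is healthy, the five deployments above can also be wrapped in a simple loop:
+
+```shell
+for svc in traffic-comparator-jupyter traffic-comparator traffic-replayer capture-proxy-es migration-console; do
+  copilot svc deploy --name "$svc" --env $COPILOT_DEPLOYMENT_STAGE
+done
+```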
+
+## Working with the deployed solution
+
+### Executing Commands on a Service
+
+A command shell can be opened in the service's container if that service has enabled `exec: true` in its `manifest.yml` and the SSM Session Manager plugin is installed when prompted.
+```
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n traffic-comparator-jupyter -c "bash"
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n traffic-comparator -c "bash"
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n traffic-replayer -c "bash"
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n elasticsearch -c "bash"
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n capture-proxy -c "bash"
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n capture-proxy-es -c "bash"
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n migration-console -c "bash"
+```
+
+### Running OpenSearch Benchmarks
+
+Once the solution is deployed, the easiest way to test it is to access the migration-console container and run a benchmark test through it, as the following steps illustrate:
+
+```
+// Access Migration Console container
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n migration-console -c "bash"
+
+// Run simple opensearch-benchmark workload (e.g. geonames, nyc_taxis, http_logs)
+
+// Option 1: Automated script
+./runTestBenchmarks.sh
+
+// Option 2: Manually execute command
+opensearch-benchmark execute-test --distribution-version=1.0.0 --pipeline=benchmark-only --kill-running-processes --workload=geonames --workload-params "target_throughput:10,bulk_size:1000,bulk_indexing_clients:2,search_clients:1" --client-options "use_ssl:true,verify_certs:false,basic_auth_user:admin,basic_auth_password:admin" --target-host=https://capture-proxy-es:9200
+```
+
+After the benchmark has been run, the indices and documents of the source and target clusters can be checked from the same migration-console container to confirm that data has migrated:
+```
+// Option 1: Automated script
+./catIndices.sh
+
+// Option 2: Manually execute cluster requests
+// Check source cluster
+curl https://capture-proxy-es:9200/_cat/indices?v --insecure -u admin:admin
+
+// Check target cluster
+curl https://$MIGRATION_DOMAIN_ENDPOINT:443/_cat/indices?v --insecure -u admin:Admin123!
+```
+
+### Kicking off Fetch Migration
+
+* First, access the Migration Console container:
+
+```shell
+copilot svc exec -a $COPILOT_APP_NAME -e $COPILOT_DEPLOYMENT_STAGE -n migration-console -c "bash"
+```
+
+* Assume the Fetch Migration Task Execution role:
+
+```shell
+export $(printf "AWS_ACCESS_KEY_ID=%s AWS_SECRET_ACCESS_KEY=%s AWS_SESSION_TOKEN=%s" \
+$(aws sts assume-role --duration-seconds 900 --role-session-name FetchMigrationExec --output text \
+--query "Credentials.[AccessKeyId,SecretAccessKey,SessionToken]" \
+--role-arn $EXEC_ROLE_ARN))
+```
+
+* Execute the ECS run task command that you noted when deploying the CDK
+  * The status of the ECS Task can be monitored from the AWS Console. Once the task is in the `Running` state, logs and progress can be viewed via CloudWatch.
+
+The pipeline configuration file can be viewed (and updated) via AWS Secrets Manager.
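+For example, assuming the name (or ARN) of the pipeline secret is known, it can be fetched with the AWS CLI:
+
+```shell
+aws secretsmanager get-secret-value --secret-id <pipeline-config-secret> --query SecretString --output text
+```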
+
+## Teardown
+
+To remove the resources installed from the steps above, follow these instructions:
+1. `./devDeploy.sh --destroy-env` - Destroy all CDK and Copilot CloudFormation stacks deployed, excluding the Copilot app level stack, for the given env/stage and return to a clean state.
+2. `./devDeploy.sh --destroy-all-copilot` - Destroy the Copilot app and all Copilot CloudFormation stacks deployed for the given app across all regions.
+3. After execution of the above steps, a CDK bootstrap stack remains. To remove this stack, begin by deleting the S3 objects and the associated bucket. After that, you can delete the stack using the AWS Console or CLI.
+
+## Frequently Asked Questions (FAQs)
+
+### How is an Authorization header set for requests from the Replayer to the target cluster?
+
+See the Replayer explanation [here](../TrafficCapture/trafficReplayer/README.md#authorization-header-for-replayed-requests)
+
+### How to run multiple Replayer scenarios
+
+The migration solution has support for running multiple Replayer services simultaneously, such that captured traffic from the Capture Proxy (which has been stored on Kafka) can be replayed on multiple different cluster configurations at the same time. These additional independent and distinct Replayer services can either be spun up together initially to replay traffic as it comes in, or added later, in which case they will begin processing captured traffic from the beginning of what is stored in Kafka.
+
+A **prerequisite** to using this functionality is that the migration solution has been deployed with the `devDeploy.sh` script, so that necessary environment values from CDK resources like the VPC, MSK, and EFS volume can be retrieved for additional Replayer services.
+
+To test this scenario, you can create an additional OpenSearch Domain target cluster within the existing VPC by executing the following series of commands:
+```shell
+# Assuming you are in the copilot directory and the default "dev" environment was used for ./devDeploy.sh
+source ./environments/dev/envExports.sh
+cd ../cdk/opensearch-service-migration
+# Pick a name to be used for identifying this new domain stack that is different from the one used for ./devDeploy.sh
+export CDK_DEPLOYMENT_STAGE=dev2
+cdk deploy "*" --c domainName="test-domain-2-7" --c engineVersion="OS_2.7" --c dataNodeCount=2 --c vpcEnabled=true --c vpcId="$MIGRATION_VPC_ID" --c vpcSecurityGroupIds="[\"$MIGRATION_DOMAIN_SG_ID\"]" --c availabilityZoneCount=2 --c openAccessPolicyEnabled=true --c domainRemovalPolicy="DESTROY" --c migrationAssistanceEnabled=false --c enableDemoAdmin=true --require-approval never --concurrency 3
+```
+To launch an additional Replayer service that directs traffic to this new Domain, run a command like the one below. In this command, `id` is a unique label for the Replayer service, `target-uri` is the endpoint of the target cluster where traffic will be replayed, and `extra-args` specifies a Replayer Auth header option to use. You can obtain the below endpoint and secret name (or secret ARN) either from the CDK command output mentioned earlier or from the AWS Console:
+```shell
+./createReplayer.sh --id test-os-2-7 --target-uri https://vpc-aos-domain-123.us-east-1.es.amazonaws.com:443 --extra-args "--auth-header-user-and-secret admin demo-user-secret-dev2-us-east-1" --tags migration_deployment=0.1.0
+```
+More options can be found with:
+```shell
+./createReplayer.sh --help
+```
+
+## Appendix
+
+### Useful Copilot Commands
+
+`copilot app show` - Provides details on the current app \
+`copilot svc show` - Provides details on a particular service
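+
+For example, to inspect the migration console service and tail its logs (the `logs` subcommand assumes a recent Copilot CLI version):
+
+```shell
+copilot svc show --name migration-console
+copilot svc logs --name migration-console --follow
+```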
+
+### Addons
+
+Addons are a Copilot concept for adding additional AWS resources outside the core ECS resources that it sets up.
+
+An example of this can be seen in the `traffic-replayer/addons/taskRole.yml` service which has an `addons` directory and yaml file.
+
+That yaml file adds an IAM ManagedPolicy to the task role that Copilot creates for the service. This added policy is to allow communication with MSK. (Note that `taskRole.yml` will only exist after building.)
+
+Official documentation on Addons can be found [here](https://aws.github.io/copilot-cli/docs/developing/addons/workload/).
\ No newline at end of file
diff --git a/deployment/cdk/opensearch-service-migration/README.md b/deployment/cdk/opensearch-service-migration/README.md
index 3ab8dfb98..0ba5e35e6 100644
--- a/deployment/cdk/opensearch-service-migration/README.md
+++ b/deployment/cdk/opensearch-service-migration/README.md
@@ -1,6 +1,6 @@
 # OpenSearch Service Domain CDK
 
-This repo contains an IaC CDK solution for deploying an OpenSearch Service Domain. Users have the ability to easily deploy their Domain using default values or provide [configuration options](#Configuration-Options) for a more customized setup. The goal of this repo is not to become a one-size-fits-all solution for users. Supporting this would be unrealistic, and likely conflicting at times, when considering the needs of many users. Rather this code base should be viewed as a starting point for users to use and add to individually as their custom use case requires.
+This repo contains an IaC CDK solution for deploying an OpenSearch Service Domain. Users have the ability to easily deploy their Domain using default values or provide [configuration options](#Configuration-Options) for a more customized setup. The goal of this repo is not to become a one-size-fits-all solution for users - rather, this code base should be viewed as a starting point for users to use and add to individually as their custom use case requires.
 
 ### Getting Started
@@ -57,12 +57,12 @@ This is the core required stack of this CDK which is responsible for deploying t
 
 #### Network Stack (OSServiceNetworkCDKStack-STAGE-REGION)
 This optional stack will be used when the Domain is configured to be placed inside a VPC and will contain resources related to the networking of this VPC such as Security Groups and Subnets. It has a dependency on the Domain stack.
 
-#### Historical Capture Stack (OSServiceHistoricalCDKStack-STAGE-REGION)
-This optional stack sets up a ECS cluster to host/run Fetch Migration tasks for historical data migration. It has dependencies on both the Domain and Network stacks.
-
 #### Migration Assistance Stack (OSServiceMigrationCDKStack-STAGE-REGION)
 This optional stack is used to house the migration assistance resources which are in the process of being developed to assist in migrating to an OpenSearch domain. It has dependencies on both the Domain and Network stacks.
 
+#### Fetch Migration Stack (OSServiceHistoricalCDKStack-STAGE-REGION)
+This optional stack sets up an ECS cluster to host / run Fetch Migration tasks for data backfill / historical data migration. It has dependencies on the Domain, Network, and Migration Assistance stacks.
+
 ### Configuration Options
 
 The available configuration options are listed below. The vast majority of these options do not need to be provided, with only `domainName` and `engineVersion` being required. All non-required options can be provided as an empty string `""` or simply not included, and in each of these cases the option will be allocated with the CDK Domain default value
@@ -113,7 +113,7 @@ Additional context on some of these options, can also be found in the Domain con
 | mskBrokerNodeCount | false | number | 2 | The number of broker nodes to be used by the MSK cluster |
 | historicalCaptureEnabled | false | boolean | false | Creates ECS resources to enable the kick off of historical Fetch Migration tasks from the Migration Console |
 | sourceClusterEndpoint | false | string | `"https://source-cluster.elb.us-east-1.endpoint.com"` | The endpoint for the source cluster from which Fetch Migration will pull data. Required if `historicalCaptureEnabled` is set to `true` |
-| dpPipelineTemplatePath | false | string | "path/to/config.yaml" | Path to a local Data Prepper pipeline configuration YAML file that Fetch Migratino will use to derive source and target cluster endpoints and other settings. Default value is the included template file i.e. [dp_pipeline_template.yaml](dp_pipeline_template.yaml)|
+| dpPipelineTemplatePath | false | string | "path/to/config.yaml" | Path to a local Data Prepper pipeline configuration YAML file that Fetch Migration will use to derive source and target cluster endpoints and other settings. Default value is the included template file i.e. [dp_pipeline_template.yaml](dp_pipeline_template.yaml)|
 
 A template `cdk.context.json` to be used to fill in these values is below:
 
@@ -122,38 +122,41 @@
 ```
 {
   "engineVersion": "",
   "domainName": "",
   "dataNodeType": "",
-  "dataNodeCount": "",
+  "dataNodeCount": 2,
   "dedicatedManagerNodeType": "",
-  "dedicatedManagerNodeCount": "",
+  "dedicatedManagerNodeCount": 3,
   "warmNodeType": "",
-  "warmNodeCount": "",
+  "warmNodeCount": 3,
   "accessPolicies": "",
-  "useUnsignedBasicAuth": "",
+  "useUnsignedBasicAuth": false,
   "fineGrainedManagerUserARN": "",
   "fineGrainedManagerUserName": "",
   "fineGrainedManagerUserSecretManagerKeyARN": "",
-  "enableDemoAdmin": "",
-  "enforceHTTPS": "",
+  "enableDemoAdmin": false,
+  "enforceHTTPS": true,
   "tlsSecurityPolicy": "",
-  "ebsEnabled": "",
-  "ebsIops": "",
-  "ebsVolumeSize": "",
+  "ebsEnabled": true,
+  "ebsIops": 4000,
+  "ebsVolumeSize": 15,
   "ebsVolumeType": "",
-  "encryptionAtRestEnabled": "",
+  "encryptionAtRestEnabled": true,
   "encryptionAtRestKmsKeyARN": "",
-  "loggingAppLogEnabled": "",
+  "loggingAppLogEnabled": true,
   "loggingAppLogGroupARN": "",
-  "nodeToNodeEncryptionEnabled": "",
-  "vpcEnabled": "",
+  "nodeToNodeEncryptionEnabled": true,
+  "vpcEnabled": true,
   "vpcId": "",
   "vpcSubnetIds": "",
   "vpcSecurityGroupIds": "",
-  "availabilityZoneCount": "",
-  "openAccessPolicyEnabled": "",
+  "availabilityZoneCount": 2,
+  "openAccessPolicyEnabled": false,
   "domainRemovalPolicy": "",
   "mskARN": "",
-  "mskEnablePublicEndpoints": "",
-  "mskBrokerNodeCount": ""
+  "mskEnablePublicEndpoints": false,
+  "mskBrokerNodeCount": 2,
+  "historicalCaptureEnabled": false,
+  "sourceClusterEndpoint": "",
+  "dpPipelineTemplatePath": ""
 }
 ```
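+
+As an illustration, a minimal set of context values that would also enable the Fetch Migration stack (the endpoint and names here are placeholders) could be passed on the command line using the `--c` syntax shown elsewhere in this repo:
+
+```shell
+cdk deploy "*" --c domainName="test-domain" --c engineVersion="OS_2.7" --c migrationAssistanceEnabled=true --c historicalCaptureEnabled=true --c sourceClusterEndpoint="https://source-cluster.example.com:9200"
+```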
diff --git a/deployment/cdk/opensearch-service-migration/lib/historical-capture-stack.ts b/deployment/cdk/opensearch-service-migration/lib/historical-capture-stack.ts
index 9e1017961..684d4edd3 100644
--- a/deployment/cdk/opensearch-service-migration/lib/historical-capture-stack.ts
+++ b/deployment/cdk/opensearch-service-migration/lib/historical-capture-stack.ts
@@ -25,8 +25,8 @@ export class HistoricalCaptureStack extends Stack {
 
         // ECS Task Definition
         const historicalCaptureFargateTask = new FargateTaskDefinition(this, "historicalCaptureFargateTask", {
-            memoryLimitMiB: 2048,
-            cpu: 512
+            memoryLimitMiB: 4096,
+            cpu: 1024
         });
 
         // Create Historical Capture Container
diff --git a/deployment/copilot/README.md b/deployment/copilot/README.md
deleted file mode 100644
index c6700773b..000000000
--- a/deployment/copilot/README.md
+++ /dev/null
@@ -1,214 +0,0 @@
-## Deploying to AWS
-
-### Copilot Deployment
-Copilot is a tool for deploying containerized applications on AWS ECS. Official documentation can be found [here](https://aws.github.io/copilot-cli/docs/overview/).
-
-**Notice**: These tools are free to use, but the user is responsible for the cost of underlying infrastructure required to operate the solution. We welcome feedback and contributions to optimize costs.
-
-### Initial Setup
-
-#### Install Prerequisites
-
-###### Docker
-Docker is used by Copilot to build container images. If not installed, follow the steps [here](https://docs.docker.com/engine/install/) to set up. Later versions are recommended.
-
-###### Git
-Git is used by the opensearch-migrations repo to fetch associated repositories (such as the traffic-comparator repo) for constructing their respective Dockerfiles. Steps to set up can be found [here](https://github.com/git-guides/install-git).
-
-###### Java 11
-Java is used by the opensearch-migrations repo and Gradle, its associated build tool. The current required version is Java 11.
-
-#### Creating Dockerfiles
-This project needs to build the required Dockerfiles that Copilot will use in its services. From the `TrafficCapture` directory the following command can be ran to build these files
-```
-./gradlew :dockerSolution:buildDockerImages
-```
-More details can be found [here](../../TrafficCapture/dockerSolution/README.md)
-
-#### Setting up Copilot CLI
-If you are on Mac the following Homebrew command can be run to set up the Copilot CLI:
-```
-brew install aws/tap/copilot-cli
-```
-Otherwise, please follow the manual instructions [here](https://aws.github.io/copilot-cli/docs/getting-started/install/)
-
-### Deploy with an automated script
-
-The following script command can be executed to deploy both the CDK infrastructure and Copilot services for a development environment
-```shell
-./devDeploy.sh
-```
-Options can be found with:
-```shell
-./devDeploy.sh --help
-```
-
-Requirements:
-* AWS credentials have been configured
-* CDK and Copilot CLIs have been installed
-
-#### How is an Authorization header set for requests from the Replayer to the target cluster?
-
-See Replayer explanation [here](../../TrafficCapture/trafficReplayer/README.md#authorization-header-for-replayed-requests)
-
-### How to run multiple Replayer scenarios
-
-The migration solution has support for running multiple Replayer services simultaneously, such that captured traffic from the Capture Proxy (which has been stored on Kafka) can be replayed on multiple different cluster configurations at the same time. These additional independent and distinct Replayer services can either be spun up together initially to replay traffic as it comes in, or added later, in which case they will begin processing captured traffic from the beginning of what is stored in Kafka.
-
-A **prerequisite** to use this functionality is that the migration solution has been deployed with the `devDeploy.sh` script, so that necessary environment values from CDK resources like the VPC, MSK, and EFS volume can be retrieved for additional Replayer services
-
-To test this scenario, you can create an additional OpenSearch Domain target cluster within the existing VPC by executing the following series of commands:
-```shell
-# Assuming you are in the copilot directory and the default "dev" environment was used for ./devDeploy.sh
-source ./environments/dev/envExports.sh
-cd ../cdk/opensearch-service-migration
-# Pick a name to be used for identifying this new domain stack that is different from the one used for ./devDeploy.sh
-export CDK_DEPLOYMENT_STAGE=dev2
-cdk deploy "*" --c domainName="test-domain-2-7" --c engineVersion="OS_2.7" --c dataNodeCount=2 --c vpcEnabled=true --c vpcId="$MIGRATION_VPC_ID" --c vpcSecurityGroupIds="[\"$MIGRATION_DOMAIN_SG_ID\"]" --c availabilityZoneCount=2 --c openAccessPolicyEnabled=true --c domainRemovalPolicy="DESTROY" --c migrationAssistanceEnabled=false --c enableDemoAdmin=true --require-approval never --concurrency 3
-```
-To launch an additional Replayer service that directs traffic to this new Domain, run a command like the one below. In this command, `id` is a unique label for the Replayer service, `target-uri` is the endpoint of the target cluster where traffic will be replayed, and `extra-args` is specifying a Replayer Auth header option to use. You can obtain the below endpoint and secret name (or secret ARN) from either from the CDK command output mentioned earlier or from the AWS Console:
-```shell
-./createReplayer.sh --id test-os-2-7 --target-uri https://vpc-aos-domain-123.us-east-1.es.amazonaws.com:443 --extra-args "--auth-header-user-and-secret admin demo-user-secret-dev2-us-east-1" --tags migration_deployment=0.1.0
-```
-More options can be found with:
-```shell
-./createReplayer.sh --help
-```
-
-### Deploy commands one at a time
-
-The following sections list out commands line-by-line for deploying this solution
-
-#### Importing values from CDK
-The typical use case for this Copilot app is to initially use the `opensearch-service-migration` CDK to deploy the surrounding infrastructure (VPC, OpenSearch Domain, Managed Kafka (MSK)) that Copilot requires, and then deploy the desired Copilot services. Documentation for setting up and deploying these resources can be found in the CDK [README](../cdk/opensearch-service-migration/README.md).
-
-The provided CDK will output export commands once deployed that can be ran on a given deployment machine to meet the required environment variables this Copilot app uses i.e.:
-```
-export MIGRATION_DOMAIN_SG_ID=sg-123;
-export MIGRATION_DOMAIN_ENDPOINT=vpc-aos-domain-123.us-east-1.es.amazonaws.com;
-export MIGRATION_DOMAIN_USER_AND_SECRET_ARN=admin arn:aws:secretsmanager:us-east-1:12345678912:secret:demo-user-secret-123abc
-export MIGRATION_VPC_ID=vpc-123;
-export MIGRATION_CAPTURE_MSK_SG_ID=sg-123;
-export MIGRATION_COMPARATOR_EFS_ID=fs-123;
-export MIGRATION_COMPARATOR_EFS_SG_ID=sg-123;
-export MIGRATION_REPLAYER_OUTPUT_EFS_ID=fs-124
-export MIGRATION_REPLAYER_OUTPUT_EFS_SG_ID=sg-124
-export MIGRATION_PUBLIC_SUBNETS=subnet-123,subnet-124;
-export MIGRATION_PRIVATE_SUBNETS=subnet-125,subnet-126;
-export MIGRATION_KAFKA_BROKER_ENDPOINTS=b-1-public.loggingmskcluster.123.45.kafka.us-east-1.amazonaws.com:9198,b-2-public.loggingmskcluster.123.46.kafka.us-east-1.amazonaws.com:9198
-```
-Additionally, if not using the deploy script, the following export is needed for the Replayer service:
-```
-export MIGRATION_REPLAYER_COMMAND=/bin/sh -c "/runJavaWithClasspath.sh org.opensearch.migrations.replay.TrafficReplayer $MIGRATION_DOMAIN_ENDPOINT --insecure --kafka-traffic-brokers $MIGRATION_KAFKA_BROKER_ENDPOINTS --kafka-traffic-topic logging-traffic-topic --kafka-traffic-group-id default-logging-group --kafka-traffic-enable-msk-auth --auth-header-user-and-secret $MIGRATION_DOMAIN_USER_AND_SECRET_ARN | nc traffic-comparator 9220"
-```
-
-#### Setting up existing Copilot infrastructure
-
-It is **important** to run any `copilot` commands from within this directory (`deployment/copilot`). When components are initialized the name given will be searched for in the immediate directory structure to look for an existing `manifest.yml` for that component. If found it will use the existing manifest and not create its own. This Copilot app already has existing manifests for each of its services and a dev environment, which should be used for proper operation.
-
-When initially setting up Copilot, each component (apps, services, and environments) need to be initialized. Beware when initializing an environment in Copilot, it will prompt you for values even if you've defined them in the `manifest.yml`, though values input at the prompt are ignored in favor of what was specified in the file.
-
-If using temporary environment credentials when initializing an environment:
-* Copilot will prompt you to enter each variable (AWS Access Key ID, AWS Secret Access Key, AWS Session Token). If these variables are already available in your environment, these three prompts can be `enter`'d through and ignored.
-* When prompted ` Would you like to use the default configuration for a new environment?` select `Yes, use default.` as this will ultimately get ignored for what has been configured in the existing `manifest.yml`
-* The last prompt will ask for the desired deployment region and should be filled out as Copilot will store this internally.
-
-This Copilot app supports deploying the Capture Proxy and Elasticsearch as a single service `capture-proxy-es` (as shown below) or as separate services `capture-proxy` and `elasticsearch`
-
-**Note**: This app also contains `kafka-broker` and `kafka-zookeeper` services which are currently experimental and usage of MSK is preferred. These services do not need to be deployed, and as so are not listed below.
-```
-// Initialize app
-copilot app init
-
-// Initialize env with required "dev" name
-// Be cautious to specify the proper region as this will dictate where resources are deployed
-copilot env init --name dev
-
-// Initialize services with their respective required name
-copilot svc init --name traffic-replayer
-copilot svc init --name traffic-comparator
-copilot svc init --name traffic-comparator-jupyter
-copilot svc init --name capture-proxy-es
-copilot svc init --name migration-console
-
-```
-
-#### Deploying Services to an Environment
-When deploying a service with the Copilot CLI, a status bar will be displayed that gets updated as the deployment progresses. The command will complete when the specific service has all its resources created and health checks are passing on the deployed containers.
-
-Currently, it seems that Copilot does not support deploying all services at once (issue [here](https://github.com/aws/copilot-cli/issues/3474)) or creating dependencies between separate services. In light of this, services need to be deployed one at a time as show below.
-
-```
-// Deploy environment
-copilot env deploy --name dev
-
-// Deploy services to a deployed environment
-copilot svc deploy --name traffic-comparator-jupyter --env dev
-copilot svc deploy --name traffic-comparator --env dev
-copilot svc deploy --name traffic-replayer --env dev
-copilot svc deploy --name capture-proxy-es --env dev
-copilot svc deploy --name migration-console --env dev
-```
-
-### Running Benchmarks on the Deployed Solution
-
-Once the solution is deployed, the easiest way to test the solution is to exec into the migration-console container and run a benchmark test through, as the following steps illustrate
-
-```
-// Exec into container
-copilot svc exec -a migration-copilot -e dev -n migration-console -c "bash"
-
-// Run opensearch-benchmark workload (i.e. geonames, nyc_taxis, http_logs)
-
-// Option 1: Automated script
-./runTestBenchmarks.sh
-
-// Option 2: Manually execute command
-opensearch-benchmark execute-test --distribution-version=1.0.0 --target-host=https://capture-proxy-es:9200 --workload=geonames --pipeline=benchmark-only --test-mode --kill-running-processes --workload-params "target_throughput:0.5,bulk_size:10,bulk_indexing_clients:1,search_clients:1" --client-options "use_ssl:true,verify_certs:false,basic_auth_user:admin,basic_auth_password:admin"
-```
-
-After the benchmark has been run, the indices and documents of the source and target clusters can be checked from the same migration-console container to confirm
-```
-// Option 1: Automated script
-./catIndices.sh
-
-// Option 2: Manually execute cluster requests
-// Check source cluster
-curl https://capture-proxy-es:9200/_cat/indices?v --insecure -u admin:admin
-
-// Check target cluster
-curl https://$MIGRATION_DOMAIN_ENDPOINT:443/_cat/indices?v --insecure -u admin:Admin123!
-```
-
-### Executing Commands on a Deployed Service
-
-A command shell can be opened in the service's container if that service has enabled `exec: true` in their `manifest.yml` and the SSM Session Manager plugin is installed when prompted.
-```
-copilot svc exec -a migration-copilot -e dev -n traffic-comparator-jupyter -c "bash"
-copilot svc exec -a migration-copilot -e dev -n traffic-comparator -c "bash"
-copilot svc exec -a migration-copilot -e dev -n traffic-replayer -c "bash"
-copilot svc exec -a migration-copilot -e dev -n elasticsearch -c "bash"
-copilot svc exec -a migration-copilot -e dev -n capture-proxy -c "bash"
-copilot svc exec -a migration-copilot -e dev -n capture-proxy-es -c "bash"
-copilot svc exec -a migration-copilot -e dev -n migration-console -c "bash"
-```
-
-### Addons
-
-Addons are a Copilot concept for adding additional AWS resources outside the core ECS resources that it sets up.
-
-An example of this can be seen in the `traffic-replayer/addons/taskRole.yml` service which has an `addons` directory and yaml file.
-
-That yaml file adds an IAM ManagedPolicy to the task role that Copilot creates for the service. This added policy is to allow communication with MSK. (Note that `taskRole.yml` will only exist after building.)
-
-Official documentation on Addons can be found [here](https://aws.github.io/copilot-cli/docs/developing/addons/workload/).
-
-### Useful Commands
-
-`copilot app show` - Provides details on the current app \
-`copilot svc show` - Provides details on a particular service
-
-### Removing deloyed resources from AWS
-
-To remove the resources installed from the steps above, follow these instructions:
-1. `./devDeploy.sh --destroy-env` - Destroy all CDK and Copilot CloudFormation stacks deployed, excluding the Copilot app level stack, for the given env/stage and return to a clean state.
-2. `./devDeploy.sh --destroy-all-copilot` - Destroy Copilot app and all Copilot CloudFormation stacks deployed for the given app across all regions
-3. After execution of the above steps, a CDK bootstrap stack remains. To remove this stack, begin by deleting the S3 objects and the associated bucket. After that, you can delete the stack using the AWS Console or CLI.