Migrate data from Elasticsearch to OpenSearch 2
In this How-to you will learn how to migrate data from your existing Elasticsearch service instance to a new OpenSearch 2 service instance.
The recommended strategy to migrate the full dataset from an Elasticsearch service instance to an OpenSearch 2 service instance is to use the reindex Data Operation.
Requirements
To follow this How-to, you need the following tools installed:
- curl – Command line tool and library for transferring data with URL syntax
- jq – Command-line JSON processor
- cf CLI – Cloud Foundry command line interface
Preparation
Before starting the migration, ensure the following steps are completed:
- Isolate the source service instance so no new data is written.
- Trigger a manual backup of the source service instance via the Service Dashboard and wait until it finishes.
- Generate service credentials for the source service instance if not already available.
- Check database consistency of the source instance.
- Order a new OpenSearch 2 destination instance and create a credentials key.
Migration strategy
The migration is performed using the reindex operation. Reindexing must be done one index at a time.
For each index, the following steps must be executed on the destination OpenSearch 2 service instance.
Check database consistency
    # refresh index
    curl -k -u <service-instance-username>:<service-instance-password> https://<service-instance-host>:<service-instance-port>/<target_index>/_refresh

    # list all indices
    curl -k -u <service-instance-username>:<service-instance-password> https://<service-instance-host>:<service-instance-port>/_cat/indices?v

Documentation
- Elasticsearch: Refresh API
- Elasticsearch: Cat indices API
- OpenSearch: Cat indices
Verification checklist
- Both service instances contain the same indices created by your application
- Each index has the same docs.count
- Sample documents are equivalent (for example using a specific timestamp)
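The docs.count comparison from the checklist can be scripted. The following is a minimal sketch, assuming both listings were fetched with the `h=index,docs.count` column selector of the Cat indices API and saved to files. The function name `compare_counts`, the file names, and the variable names in the comments are illustrative only, and skipping indices whose name starts with a dot is an assumption about which indices belong to your application.

```shell
#!/bin/sh
# Sketch: compare per-index document counts between source and destination.
# Each input file holds lines of the form "<index> <docs.count>", for example
# produced like this (hypothetical variable names):
#   curl -k -s -u "$SRC_USER:$SRC_PASS" \
#     "https://$SRC_HOST:$SRC_PORT/_cat/indices?h=index,docs.count" > source.txt
#   curl -k -s -u "$DST_USER:$DST_PASS" \
#     "https://$DST_HOST:$DST_PORT/_cat/indices?h=index,docs.count" > destination.txt

compare_counts() {
  # $1 = source listing, $2 = destination listing
  awk '
    $1 ~ /^\./           { next }                 # assumption: dot-prefixed indices are system indices
    FILENAME == ARGV[1]  { src[$1] = $2; next }   # remember source counts
                         { dst[$1] = $2 }         # remember destination counts
    END {
      status = 0
      for (name in src) {
        if (!(name in dst)) {
          printf "MISSING %s\n", name
          status = 1
        } else if (src[name] != dst[name]) {
          printf "MISMATCH %s source=%s destination=%s\n", name, src[name], dst[name]
          status = 1
        }
      }
      exit status   # non-zero if any index is missing or differs
    }
  ' "$1" "$2"
}
```

Calling `compare_counts source.txt destination.txt` prints one line per missing or mismatching index and returns a non-zero exit status if it finds any. Refresh the indices on both sides first so the counts are current.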
Source instance actions (Elasticsearch)
List available indices

    curl -k -u <source-username>:<source-password> \
      https://<source-host>:<source-port>/_cat/indices?v

Export index settings

    curl -k -u <source-username>:<source-password> \
      -X GET https://<source-host>:<source-port>/<index_name>/_settings?pretty

Export mappings

    curl -k -u <source-username>:<source-password> \
      -X GET https://<source-host>:<source-port>/<index_name>/_mapping?pretty

Export index templates

    curl -k -u <source-username>:<source-password> \
      -X GET https://<source-host>:<source-port>/_template?pretty

Export index aliases

    curl -k -u <source-username>:<source-password> \
      -X GET https://<source-host>:<source-port>/_alias?pretty

Destination instance actions (OpenSearch 2)
Create index with settings and mappings

    curl -k -u <destination-username>:<destination-password> \
      -X PUT https://<destination-host>:<destination-port>/<target_index> \
      -H 'Content-Type: application/json' \
      -d '{
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "properties": {
            "age": { "type": "integer" }
          }
        }
      }'

OpenSearch documentation: Create Index
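The exported settings and mappings usually cannot be sent back unchanged. The export nests them under the index name, and the settings contain keys that are assigned by the cluster (`uuid`, `version`, `creation_date`, `provided_name`). The following is a rough sketch of how jq could assemble a create-index body from the two export files. The function name `build_create_body` and the file names are hypothetical, and the sketch assumes the exported mappings are typeless (no mapping type name between `mappings` and `properties`).

```shell
#!/bin/sh
# Sketch: merge exported settings and mappings into a create-index body.
#   $1 = index name
#   $2 = file containing the output of GET /<index_name>/_settings
#   $3 = file containing the output of GET /<index_name>/_mapping
build_create_body() {
  jq -n --arg idx "$1" --slurpfile s "$2" --slurpfile m "$3" '
    {
      settings: {
        # drop keys that the cluster assigns itself and rejects on creation
        index: ($s[0][$idx].settings.index
                | del(.uuid, .version, .creation_date, .provided_name))
      },
      mappings: $m[0][$idx].mappings
    }'
}

# Possible usage (hypothetical variable and file names):
#   build_create_body my_index settings.json mapping.json |
#     curl -k -u "$DST_USER:$DST_PASS" \
#       -X PUT "https://$DST_HOST:$DST_PORT/my_index" \
#       -H 'Content-Type: application/json' -d @-
```

Review the generated body before sending it, since other exported settings may also need adjusting for the destination instance.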
Set index templates

    curl -k -u <destination-username>:<destination-password> \
      -X PUT https://<destination-host>:<destination-port>/_index_template/<template_name> \
      -H 'Content-Type: application/json' \
      -d '{
        "index_patterns": ["logs-2020-01-*"],
        "template": {
          "aliases": {
            "my_logs": {}
          },
          "settings": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          },
          "mappings": {
            "properties": {
              "timestamp": {
                "type": "date",
                "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
              },
              "value": { "type": "double" }
            }
          }
        }
      }'

OpenSearch documentation: Create a template
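The templates exported from the source via `_template` use the legacy layout, with `settings`, `mappings`, and `aliases` at the top level, while the `_index_template` endpoint expects them wrapped in a `template` object. The following is a sketch of one possible conversion with jq. The function name `legacy_to_composable` is hypothetical, and reusing the legacy `order` value as `priority` is an assumption: legacy templates are merged in order, whereas only the highest-priority composable template applies, so the result should be reviewed by hand.

```shell
#!/bin/sh
# Sketch: convert one legacy template into a body for
# PUT /_index_template/<template_name>.
#   $1    = name of the legacy template
#   stdin = JSON returned by GET /_template
legacy_to_composable() {
  jq --arg name "$1" '
    .[$name]
    | {
        index_patterns: .index_patterns,
        priority: (.order // 0),          # assumption: reuse "order" as "priority"
        template: {
          settings: (.settings // {}),
          mappings: (.mappings // {}),
          aliases:  (.aliases  // {})
        }
      }'
}

# Possible usage (hypothetical file and template names):
#   legacy_to_composable my_template < templates.json > my_template.composable.json
```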
Improve migration performance (optional)

    curl -k -u <destination-username>:<destination-password> \
      -X PUT https://<destination-host>:<destination-port>/<target_index>/_settings \
      -H 'Content-Type: application/json' \
      -d '{
        "index": {
          "refresh_interval": -1,
          "number_of_replicas": 0
        }
      }'

OpenSearch documentation: Update settings
Perform migration
The reindex operation requires an internal hostname and port of your source Elasticsearch service instance.
This information can be retrieved using the CF CLI (Cloud Foundry command line interface).
1. Log in via CF CLI
   Cloud Foundry Runtime is not required in this case, as STACKIT Data Services are located in technical organizations.
2. Choose your technical organization
   Technical organization names always start with the prefix stackit_portal and contain your STACKIT project name.

       cf target -o <technical_organization_name>
       # example
       cf target -o stackit_portal_prod_my_project_h4GU6Tew

3. List your Data Services
   Identify the Elasticsearch source service instance.

       cf services

4. Identify the internal Elasticsearch service instance
   You should find two service instances with the same name:
   - one regular instance
   - one instance with the suffix -internal
   The -internal instance is required for the reindex operation.
5. Create a service key for the internal instance

       cf create-service-key <my_internal_instance_name> <my_key_name>
       # example
       cf create-service-key my_elasticsearch-internal mykey

6. Retrieve internal credentials
   This command returns the internal hostname, port, username, and password.

       cf service-key <my_internal_instance_name> <my_key_name>
       # example
       cf service-key my_elasticsearch-internal mykey

7. Start the reindex operation

       curl -k -u <destination-opensearch-service-instance-username>:<destination-opensearch-service-instance-password> \
         -X POST https://<destination-opensearch-service-instance-host>:<destination-opensearch-service-instance-port>/_reindex?wait_for_completion=false \
         -H 'Content-Type: application/json' \
         -d '{
           "source": {
             "remote": {
               "host": "https://<internal-source-elasticsearch-service-instance-host>:<internal-source-elasticsearch-service-instance-port>",
               "username": "<internal-source-elasticsearch-service-instance-username>",
               "password": "<internal-source-elasticsearch-service-instance-password>",
               "socket_timeout": "<socket-timeout>"
             },
             "index": "<target_index>",
             "size": <number-of-documents-to-reindex-by-batch>
           },
           "dest": {
             "index": "<target_index>"
           }
         }'

OpenSearch documentation: reindex data
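Because reindexing runs one index at a time, it can help to generate the request body per index instead of editing it by hand. The following is a minimal sketch that uses jq to build the body, so that special characters in the password are escaped correctly. The function name `reindex_body` and all variable names in the usage comment are hypothetical, and the batch size and timeout values are placeholders rather than recommendations.

```shell
#!/bin/sh
# Sketch: build the _reindex request body for a single index.
#   $1 = index name (used for both source and destination)
#   $2 = internal source URL, e.g. https://<host>:<port>
#   $3 = internal source username
#   $4 = internal source password
#   $5 = number of documents per batch
#   $6 = socket timeout, e.g. 1m
reindex_body() {
  jq -n --arg index "$1" --arg host "$2" --arg user "$3" --arg pass "$4" \
        --argjson size "$5" --arg timeout "$6" '
    {
      source: {
        remote: {
          host: $host,
          username: $user,
          password: $pass,
          socket_timeout: $timeout
        },
        index: $index,
        size: $size
      },
      dest: { index: $index }
    }'
}

# Possible usage, one request per index (hypothetical variable names):
#   for index in orders users; do
#     reindex_body "$index" "https://$SRC_INT_HOST:$SRC_INT_PORT" \
#                  "$SRC_INT_USER" "$SRC_INT_PASS" 1000 1m |
#       curl -k -s -u "$DST_USER:$DST_PASS" \
#         -X POST "https://$DST_HOST:$DST_PORT/_reindex?wait_for_completion=false" \
#         -H 'Content-Type: application/json' -d @- |
#       jq -r '.task'    # prints the task ID used for monitoring
#   done
```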
Monitor reindex process

    curl -k -u <destination-username>:<destination-password> \
      -X GET https://<destination-host>:<destination-port>/_tasks/<task-id>?pretty

OpenSearch documentation: Tasks
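Checking the task by hand can be replaced with a small polling loop. The following is a sketch that assumes the Tasks API response contains a top-level `completed` flag and, for reindex tasks, the counters `task.status.created` and `task.status.total`. The function names and the variable names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: helpers that read a Tasks API response (GET /_tasks/<task-id>) from stdin.

# Returns 0 only if the response reports the task as completed.
task_is_completed() {
  jq -e '.completed == true' >/dev/null
}

# Prints a one-line progress summary.
task_summary() {
  jq -r '"completed=\(.completed) created=\(.task.status.created)/\(.task.status.total)"'
}

# Possible usage (hypothetical variable names); polls every 10 seconds:
#   until curl -k -s -u "$DST_USER:$DST_PASS" \
#           "https://$DST_HOST:$DST_PORT/_tasks/$TASK_ID" | task_is_completed; do
#     sleep 10
#   done
```

Note that the loop also keeps waiting if the request itself fails, so a real script would likely add a retry limit.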
Post-migration checks
- Repeat database consistency checks on the destination instance
- Optionally verify settings and mappings
Revert index settings
After reindexing is complete, restore your desired settings:

    curl -k -u <destination-username>:<destination-password> \
      -X PUT https://<destination-host>:<destination-port>/<target_index>/_settings \
      -H 'Content-Type: application/json' \
      -d '{
        "index": {
          "refresh_interval": "1s",
          "number_of_replicas": 1
        }
      }'