Data Migration to a New AWS Elasticsearch Service Domain

This story provides guidelines for migrating data to a new AWS Elasticsearch Service domain.

Overview

Data migration to the new AWS Elasticsearch Service domain consists of two steps:

  1. Creating a manual snapshot of the existing Elasticsearch Service domain's data in an S3 bucket.
  2. Restoring that snapshot from the S3 bucket to the new Elasticsearch Service domain.

Assumption

I am assuming that you already know how to create an AWS Elasticsearch Service domain.

Manual Snapshot/Backup

  1. Create an S3 bucket in the same region as the existing Elasticsearch Service domain. This bucket will store the manual snapshot (see the boto3 sketch below if you prefer to script this step).
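
If you prefer to script this step, here is a minimal boto3 sketch; the bucket name and region are placeholders, and it assumes your AWS credentials are already configured:

import boto3

bucket_name = '<bucket-name>'                      # placeholder
region = '<elasticsearch service domain region>'   # placeholder

s3 = boto3.client('s3', region_name=region)

# us-east-1 does not accept a LocationConstraint; every other region requires one.
if region == 'us-east-1':
    s3.create_bucket(Bucket=bucket_name)
else:
    s3.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={'LocationConstraint': region},
    )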

2. Create an IAM policy from the JSON given below. It allows listing the bucket and reading, writing, and deleting objects in it (replace <bucket-name> with the bucket created in step 1):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::<bucket-name>"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::<bucket-name>/*"
      ]
    }
  ]
}

3. Create an IAM role that the Elasticsearch service will use to access the S3 bucket; its ARN is referenced in later steps as the role created in step 3 (a boto3 sketch covering steps 3-5 follows the trust policy below).

4. Attach the policy created in step 2 to this role.

5. Add a trust relationship to the role so that the Elasticsearch service can assume it:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "es.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
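
If you want to create the role from code instead of the IAM console, here is a minimal boto3 sketch. It assumes the policy JSON from step 2 and the trust policy above are saved locally as snapshot-role-policy.json and trust-policy.json; the role name is a placeholder:

import boto3

iam = boto3.client('iam')

# Create the role with the trust policy from step 5, so es.amazonaws.com can assume it.
with open('trust-policy.json') as f:
    role = iam.create_role(
        RoleName='<snapshot-role-name>',
        AssumeRolePolicyDocument=f.read(),
    )

# Attach the S3 access policy from step 2 as an inline policy on the role.
with open('snapshot-role-policy.json') as f:
    iam.put_role_policy(
        RoleName='<snapshot-role-name>',
        PolicyName='es-snapshot-s3-access',
        PolicyDocument=f.read(),
    )

# This ARN is the <arn of role created in step-3> used in later steps.
print(role['Role']['Arn'])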

6. Create an IAM user with programmatic access (an access key ID and secret access key). This user will be used to register the manual snapshot repository. Attach the inline JSON policy given below (a boto3 sketch for this step follows the policy):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "<role-arn-created-in-step-3>"
    },
    {
      "Effect": "Allow",
      "Action": "es:ESHttpPut",
      "Resource": "<elasticsearch-domain-arn>"
    }
  ]
}
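
This step can also be scripted with boto3. A minimal sketch, assuming the inline policy above is saved locally as snapshot-user-policy.json and using a placeholder user name:

import boto3

iam = boto3.client('iam')

# Create the user that will register the manual snapshot repository.
iam.create_user(UserName='<snapshot-user-name>')

# Attach the inline policy shown above.
with open('snapshot-user-policy.json') as f:
    iam.put_user_policy(
        UserName='<snapshot-user-name>',
        PolicyName='es-register-snapshot-repo',
        PolicyDocument=f.read(),
    )

# Generate the access key pair that "aws configure" will ask for in the next step.
keys = iam.create_access_key(UserName='<snapshot-user-name>')['AccessKey']
print(keys['AccessKeyId'], keys['SecretAccessKey'])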

7. Configure the AWS CLI with the user created in step 6, using its access key ID and secret access key:

aws configure

Enter the appropriate value at each prompt.
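
The prompts look like this (the values shown are placeholders for the keys generated in step 6):

AWS Access Key ID [None]: <access-key-id>
AWS Secret Access Key [None]: <secret-access-key>
Default region name [None]: <elasticsearch service domain region>
Default output format [None]: json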

8. Install pip (using your distribution's package manager) and the required Python packages:

sudo apt-get install python-pip
sudo pip install boto3 requests requests-aws4auth

9. Create a Python file and paste the script given below:

import boto3
import requests
from requests_aws4auth import AWS4Auth

host = '<existing elasticsearch service domain url>'  # include https:// and a trailing slash
region = '<elasticsearch service domain region>'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

# Register repository
path = '_snapshot/<snapshot-repository-name>' # the Elasticsearch API endpoint
url = host + path

payload = {
    "type": "s3",
    "settings": {
        "bucket": "<enter bucket name created in step-1>",
        "region": "<bucket region>",
        "role_arn": "<arn of role created in step-3>"
    }
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)

print(r.status_code)

10. Run the script. On success it prints the status code given below:

Note: If the aws configure command was run with sudo, run the Python script with sudo as well; otherwise, sudo is not needed.

200

11. Take the manual snapshot using either the Elasticsearch API or the Kibana Dev Tools console (a Python alternative is sketched after the command):

PUT _snapshot/<snapshot-repository-name>/<date/snapshot-name>
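
If you prefer to trigger the snapshot from Python instead of Kibana, you can append a few lines to the script from step 9. This minimal sketch reuses its host, awsauth and headers variables; the snapshot name is a placeholder:

# Appended to the script from step 9 (reuses host, awsauth and headers).
snapshot_path = '_snapshot/<snapshot-repository-name>/<snapshot-name>'

r = requests.put(host + snapshot_path, auth=awsauth, headers=headers)
print(r.status_code, r.text)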

12. To check that the snapshot has been created successfully and to see which indices are part of it:

GET _snapshot/<snapshot-repository-name>/_all?pretty

13. Check the S3 bucket to confirm that the snapshot data has been written to it.

Restore Snapshot

  1. Create a new Elasticsearch Service domain.

2. Reuse the IAM role created during the manual snapshot process; its ARN is referenced below as the role referred to in step 2.

3. Attach the inline JSON policy given below to the IAM user that will register the snapshot repository on the new domain:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "<role-arn-referred-to-in-step-2>"
    },
    {
      "Effect": "Allow",
      "Action": "es:ESHttp*",
      "Resource": "<new-elasticsearch-domain-arn>"
    }
  ]
}

4. Configure the AWS CLI with this user's credentials on your system:

aws configure

5. Install pip and the required packages (skip this step if they are already installed):

sudo apt-get install python-pip
sudo pip install boto3 requests requests-aws4auth

6. Create a Python file and paste the script given below:

import boto3
import requests
from requests_aws4auth import AWS4Auth

host = '<new elasticsearch service domain url>'  # include https:// and a trailing slash
region = '<new elasticsearch service domain region>'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

# Register repository
path = '_snapshot/<snapshot-repository-name-used-in-manual-snapshot>' # the Elasticsearch API endpoint
url = host + path

payload = {
    "type": "s3",
    "settings": {
        "bucket": "<bucket name used in the manual snapshot process>",
        "region": "<bucket region>",
        "role_arn": "<arn of role referred to in step 2>"
    }
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)

print(r.status_code)

7. Run the Python script. On success it prints the status code given below:

Note: If the aws configure command was run with sudo, run the Python script with sudo as well; otherwise, sudo is not needed.

200

8. To confirm that the snapshot repository is configured, list the existing snapshots using either the Elasticsearch API or the Kibana Dev Tools console:

GET _snapshot/<snapshot-repository-name>/_all?pretty

It must show the snapshot that was created in the manual snapshot process.

9. To list the existing indices:

GET _aliases?pretty=true

10. Restore the snapshot using either the Elasticsearch API or the Kibana Dev Tools console (a Python alternative is sketched after the request):

POST _snapshot/<snapshot-repository-name>/<date/snapshot-name>/_restore
{
  "indices": "<index-name>",
  "ignore_unavailable": false,
  "include_global_state": false
}
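
The same restore request can be sent from Python by appending a few lines to the script from step 6. This sketch reuses its host, awsauth and headers variables; the index and snapshot names are placeholders:

# Appended to the script from step 6 (reuses host, awsauth and headers).
restore_path = '_snapshot/<snapshot-repository-name>/<snapshot-name>/_restore'
restore_payload = {
    "indices": "<index-name>",
    "ignore_unavailable": False,
    "include_global_state": False
}

r = requests.post(host + restore_path, auth=awsauth, json=restore_payload, headers=headers)
print(r.status_code, r.text)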

11. Verify that the index has been restored:

GET _aliases?pretty=true

12. Verify the data of the index:

GET /<index-name>/_search
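
A quick sanity check is to compare the document count of the index on the old and new domains, for example:

GET /<index-name>/_count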

Final Thoughts

I hope you liked this tutorial. Please share any feedback about what could be improved. Thank you.

DevSecOps Engineer https://irtizaali.com/