Upgrading to Elasticsearch 5.2.2 on Amazon ECS

In one of my previous posts, I talked about how I set up Elasticsearch 2.3.5 on ECS. I got a comment on that post that prompted me to update the setup for Elasticsearch 5. It’s been a while, but better late than never, right? So I gave it a go! In this post I’d like to share what I found in the process.

A handful of configuration changes were required to upgrade from 2.3.5 to 5.2.2. Most of them weren’t difficult, except one that may deter you from using ECS with Elasticsearch 5, for the time being at least.

Main Caveat

At this point, I’ll mention a caveat that will likely save you an hour of headache and trouble.

Long story short, you will need to SSH into the ECS instances and run a command on the host (the parent of the container) to get past the error message below. I am not aware of any other solution, but if you know of one, feel free to let me know in the comments section below!

elasticsearch:5.2.2 max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]

This docker-library/elasticsearch GitHub issue suggests either running sudo sysctl -w vm.max_map_count=262144 on the host or running the docker container with the --sysctl option to fix this problem.
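Before starting the task, it can help to confirm what the host currently reports. The following is a minimal sketch, assuming a Linux host where the setting is readable at /proc/sys/vm/max_map_count; the function names are my own and not part of any Elasticsearch or ECS tooling.

```python
from pathlib import Path

# Minimum value quoted in the Elasticsearch error message above.
REQUIRED_MAX_MAP_COUNT = 262144


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the current vm.max_map_count as an int (Linux procfs path assumed)."""
    return int(Path(path).read_text().strip())


def max_map_count_ok(current, required=REQUIRED_MAX_MAP_COUNT):
    """True when the current value meets the minimum Elasticsearch asks for."""
    return current >= required


def describe(current, required=REQUIRED_MAX_MAP_COUNT):
    """Build a one-line, human-readable verdict for the given value."""
    if max_map_count_ok(current, required):
        return f"vm.max_map_count = {current}: OK"
    return (
        f"vm.max_map_count = {current}: too low, "
        f"run: sudo sysctl -w vm.max_map_count={required}"
    )
```

Running print(describe(read_max_map_count())) on the EC2 instance itself (not inside the container) should tell you whether the sysctl step is still needed.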

However, because of how the ECS agent is currently implemented, many of the docker run options are not available. This is well-documented over at the amazon-ecs-agent GitHub repository, so I won’t repeat the details here. It does seem like plenty of others are running into the same issue.

Continuing..

In my opinion, this makes the combination slightly less than ideal, because the manual configuration work required on the EC2 instances takes away some of the benefits of running Elasticsearch in ECS.

If you’re okay with manually running that command on the instances, for example because you plan to provision a few instances and leave them running for a while, then this hiccup won’t do much damage.

Let’s forge on.

Configuration Changes

The starting point is the Dockerfile for Elasticsearch 2.3.5 in my docker-elasticsearch-ecs repo:

FROM elasticsearch:2.3

WORKDIR /usr/share/elasticsearch

RUN bin/plugin install cloud-aws
RUN bin/plugin install mobz/elasticsearch-head
RUN bin/plugin install analysis-phonetic

COPY elasticsearch.yml config/elasticsearch.yml
COPY logging.yml config/logging.yml
COPY elasticsearch-entrypoint.sh /docker-entrypoint.sh

And modified to:

FROM elasticsearch:5.2.2

COPY elasticsearch.yml config/elasticsearch.yml
COPY logging.yml config/logging.yml
COPY elasticsearch-entrypoint.sh /docker-entrypoint.sh

RUN bin/elasticsearch-plugin install discovery-ec2

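Notable changes include bumping the version, using bin/elasticsearch-plugin instead of bin/plugin, and replacing the cloud-aws plugin with discovery-ec2, which is the new plugin for node discovery on EC2. The elasticsearch-head and analysis-phonetic plugins are no longer installed.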

File Descriptors and Ulimits

I needed to change the docker-compose file slightly to include ulimits for the number of open files (nofile), which is a new mandatory configuration item. You can find out more in this documentation.

version: '2'
services:
  data:
    build: ./docker-data/
    volumes:
      - /usr/share/elasticsearch/data

  search:
    build: ./docker-elasticsearch/
    volumes_from:
      - data
    ports:
      - "9200:9200"
      - "9300:9300"
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
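To confirm that the ulimits actually reach the container, the open-file limit can be read from inside it. This is a rough sketch using Python's standard resource module (Unix only). It assumes a Python interpreter is available in the container, which the stock image may not provide, and the helper names are hypothetical.

```python
import resource

# Value used for both the soft and hard nofile limit in the compose file above.
REQUIRED_NOFILE = 65536


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_ok(limit, required=REQUIRED_NOFILE):
    """True if the limit is unlimited or at least the required value."""
    # RLIM_INFINITY can be a negative sentinel on some platforms,
    # so it is compared explicitly instead of relying on >=.
    return limit == resource.RLIM_INFINITY or limit >= required
```

For example, soft, hard = nofile_limits() followed by limit_ok(soft) returns False on a host left at a low default such as 4096.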

elasticsearch.yml

The settings plugin.mandatory: cloud-aws, discovery.type: EC2 and discovery.zen.ping.multicast.enabled: false have been removed or replaced, leaving the following:

script.inline: true
bootstrap.memory_lock: false
network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
discovery.zen.hosts_provider: ec2
discovery.ec2.groups: dockerecs
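If you are migrating an existing elasticsearch.yml, a quick scan for the old settings can catch leftovers before the container fails to start. The sketch below is illustrative only: the key list comes from the settings mentioned in this post and is not an exhaustive migration check, and it only understands flat key: value lines rather than full YAML.

```python
# Settings this post reports as removed or replaced when moving to 5.2.2.
LEGACY_KEYS = {
    "plugin.mandatory",
    "discovery.type",
    "discovery.zen.ping.multicast.enabled",
}


def find_legacy_settings(yml_text):
    """Return legacy keys that still appear as flat 'key: value' lines."""
    found = []
    for line in yml_text.splitlines():
        stripped = line.strip()
        # Skip blanks, comments and anything that is not a key/value pair.
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        # Split on the first colon only, since values may contain colons
        # (for example _ec2:privateIp_).
        key = stripped.split(":", 1)[0].strip()
        if key in LEGACY_KEYS:
            found.append(key)
    return found
```

Calling find_legacy_settings(open("elasticsearch.yml").read()) on the configuration above should return an empty list.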

Task Definition / Heap Size

In Elasticsearch 5, the heap size is also a mandatory configuration. I set it directly in ECS via the JSON task definition, where I had to add the ES_JAVA_OPTS environment variable for it to work.

ES_JAVA_OPTS="-Xms1g -Xmx1g"
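For reference, here is roughly where that variable sits in a task definition. This is a sketch that builds the JSON with Python rather than my actual task definition: the family name, container name, image name and memory value are placeholders I made up, while environment, ulimits and portMappings are standard container definition fields.

```python
import json


def build_task_definition(image, heap="1g", memory_mib=2048):
    """Build a minimal ECS task definition dict for one Elasticsearch container."""
    container = {
        "name": "search",          # placeholder, mirrors the compose service name
        "image": image,            # placeholder image reference
        "memory": memory_mib,      # assumed value; should exceed the JVM heap
        "essential": True,
        "portMappings": [
            {"containerPort": 9200, "hostPort": 9200},
            {"containerPort": 9300, "hostPort": 9300},
        ],
        # Heap size is passed to the JVM through ES_JAVA_OPTS.
        "environment": [
            {"name": "ES_JAVA_OPTS", "value": f"-Xms{heap} -Xmx{heap}"},
        ],
        # Same file descriptor limits as in the docker-compose file.
        "ulimits": [
            {"name": "nofile", "softLimit": 65536, "hardLimit": 65536},
        ],
    }
    return {"family": "elasticsearch", "containerDefinitions": [container]}


def to_json(task_definition):
    """Serialize the task definition for pasting into the ECS console or a file."""
    return json.dumps(task_definition, indent=2)
```

Something like print(to_json(build_task_definition("example/elasticsearch-ecs:5.2.2"))) produces JSON you can adapt. Keeping -Xms and -Xmx equal matches the setting shown above.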

Wrapping up

It isn’t a whole lot of changes, but it did take some time to google each of the issues that came up as I tried to start the services on ECS. In the end I also had to SSH into the instance to set vm.max_map_count before I managed to get the cluster up.

This is obviously less than ideal in a deployment process that could otherwise be fully automated. But if you’re still looking to use Elasticsearch 5 in ECS, I hope the above steps serve you well!
