Orchestration With Kubernetes, Docker Swarm, and Mesos

Kubernetes, Docker Swarm, and Apache Mesos are the three best-known container orchestration platforms. Let’s see their architecture and capabilities in action.
Container orchestration for multiple containers across a fleet of machines has the potential to solve issues across the range of scaling, replication, fault tolerance and container communication.
In this GraphConnect presentation, Dippy Aggarwal walks you through the basics of Docker containerization and why Docker alone isn’t enough to handle today’s challenges of scale and multi-cluster deployments. The three most popular container orchestration tools she examines are Docker Swarm, Kubernetes, and Apache Mesos.
Aggarwal first demonstrates the advantages and tradeoffs of Docker Swarm, including the familiarity of the Docker API. She also dives into the three main deployment strategies: spread, binpack and random. Aggarwal concludes with a demo of Docker Swarm.
Next, Aggarwal presents on the strengths of Kubernetes, most notably the greater number of concepts, a higher level of abstraction, better stability and more maturity. As before, Aggarwal includes a hands-on demo of deploying to Kubernetes.
Finally, Aggarwal shares the advantages of using Apache Mesos for container orchestration. She points out how Mesos can be used to manage both Docker and non-Docker jobs, providing you with a framework to launch both of these heterogeneous workloads onto the same cluster. Again, she ends with an Apache Mesos demo.
What we’re going to be talking about today are three different container orchestration tools: Kubernetes, Docker Swarm, and Apache Mesos.
My dissertation is largely about the use of graph databases and provenance to study the impact assessment of schema evolution in the context of data warehouses. The following presentation was inspired by an internship I did at the Cincinnati Children’s Hospital.
Cincinnati Children’s is a premier research organization that is ranked third in the country. As an infrastructure team, we set up and manage clusters and provision resources for all the teams and groups across the Children’s Hospital.
Over the last several months we have seen a joint interest across teams in launching applications as Docker containers, which is great because the benefits of containerization carry across multiple infrastructures and applications. But containerization also brings its own issues to overcome, such as scaling, replication, and monitoring.
The following post will provide a conceptual and hands-on introduction to deploying Neo4j containers across a cluster using the three most popular container orchestration tools: Docker Swarm, Kubernetes, and Mesos.
I’ll start by giving an introduction to orchestration, followed by an overview of these three orchestration tools and demos of automated cluster deployment of Neo4j with each of the three approaches.
Containers allow you to avoid “runs on my machine” issues and to package your application as a standardized unit. You package your application with all its dependencies, application code and run-time environment and ship it as a container, which offers the benefits of portability and isolation. And while the idea of containerization is not new, Docker really deserves credit for making it popular.
Datadog compiled a report that shows the interest trends in Docker. As you can see below, interest in the software increased by 30% in one year, and the number of containers launched increased fivefold:
Below is an example Neo4j Dockerfile, a simple text file with commands you would typically run in a shell. Here it contains environment variables, commands for downloading and installing Neo4j, the ports to expose, and the command to run:
To build and run one container, we run one command to build an image and another to launch a container based on that image. But we quickly encounter a challenge: Docker by itself just isn’t good enough. When you start talking about large numbers of containers, which can scale up into the millions, you encounter the challenges that come with distributed systems, such as scaling, replication, fault tolerance and container communication.
This is where orchestration comes in: the idea of going from launching a container on one machine to launching multiple containers across a fleet of machines.
Given that containerization has become very popular, the orchestration space has become extremely crowded. A report released by the New Stack shows that Kubernetes, Docker Swarm, and Apache Mesos are amongst the most popular. Cincinnati Children’s Hospital is leaning towards Kubernetes.
At GraphConnect San Francisco in 2015, Patrick Chanezon and David Makogon gave the talk Containerized Neo4j: Automating Deployments with Docker on Azure, which focuses on launching an image on Microsoft Azure and is a helpful supplement to this post.
The main idea behind Docker Swarm is that you want to put a Docker API in front of a cluster. If you’re talking about one container running on a host, you have a Docker command-line interface.
When you have multiple containers and start scaling, an option is to have a Docker API talking to each of your containers, but that won’t scale when you have thousands of machines. What you need is a single, unified interface that can talk to all of these Docker engines. And that’s exactly what Docker Swarm does:
On the right is a single node, the Swarm manager, which can talk to each of the Docker engines deployed on the Swarm agents. In addition to providing a unified interface, Swarm makes the scheduling decisions and serves as a single pool of resources, which means that it maintains information such as the state of each agent and which containers are deployed on them.
Below is a more elaborate picture of the Docker Swarm architecture:
This shows that we have three agents running the Docker engine along with a manager that is communicating with each of them. The big rectangles show the containers, so each agent is running three containers.
The important thing to note is that our manager is composed of two components: a scheduler and a discovery service. The discovery service allows the manager to identify when new nodes join the cluster or when nodes leave it.
There are different ways you can provide this discovery mechanism. You can use a token-based service, or you can use a distributed store like ZooKeeper. In the following demo, I’ll use the token-based service.
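To make the idea of discovery concrete, here is a minimal Python sketch of a token-based registry. It is a toy model, not Swarm’s actual implementation: the class name TokenDiscovery, the 30-second time-to-live, the token string and the addresses are all assumptions made up for illustration. It only shows the general pattern of agents announcing themselves under a shared cluster token and dropping out of the member list when they stop reporting in.

```python
class TokenDiscovery:
    """Toy registry: agents register under a cluster token; stale ones drop out."""

    def __init__(self, ttl_seconds=30.0):
        # Hypothetical time-to-live; an agent silent for longer is considered gone.
        self.ttl = ttl_seconds
        self._last_seen = {}  # (token, address) -> time of last heartbeat

    def heartbeat(self, token, address, now):
        """Record that the agent at `address` is alive in the cluster `token`."""
        self._last_seen[(token, address)] = now

    def members(self, token, now):
        """Return the addresses that have reported within the TTL."""
        return sorted(
            addr
            for (tok, addr), seen in self._last_seen.items()
            if tok == token and now - seen <= self.ttl
        )


discovery = TokenDiscovery(ttl_seconds=30.0)
discovery.heartbeat("demo-token", "192.168.99.101:2375", now=0.0)
discovery.heartbeat("demo-token", "192.168.99.102:2375", now=0.0)
early = discovery.members("demo-token", now=10.0)   # both agents are fresh

# Only the first agent keeps reporting; the second goes silent.
discovery.heartbeat("demo-token", "192.168.99.101:2375", now=25.0)
later = discovery.members("demo-token", now=40.0)   # second agent has expired
print(early)
print(later)
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the sketch deterministic and easy to reason about.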
What does a scheduler do? Well, when a manager receives a request to launch a container, how does it decide which agent to launch the container on? Or if I want to launch this container and make sure it has certain memory requirements, how does it filter nodes? A scheduler takes care of all these decisions.
In Docker Swarm, there are three different types of scheduling strategies:
The first is the spread strategy, which is the default. Whenever you launch a container, it deploys the container on the node that is running the fewest containers.
The binpack strategy favors the node which is most packed, so instead of spreading the containers across nodes, it will try to fill one node first before moving on to the next. While this avoids fragmentation, you stand to lose a lot if the node where you have packed your containers fails.
And the third strategy is random, which, as the name suggests, selects a node randomly.
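The three strategies can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Swarm’s scheduler: the function pick_node and the agent names are hypothetical, and the sketch ranks nodes by container count only, ignoring capacity limits and any other resources a real scheduler would weigh.

```python
import random


def pick_node(containers_per_node, strategy="spread", rng=None):
    """Return the node a toy scheduler would choose for the next container.

    containers_per_node maps a (hypothetical) node name to its running
    container count.
    """
    nodes = sorted(containers_per_node)  # sorted names make tie-breaking predictable
    if strategy == "spread":
        # The node with the fewest running containers wins.
        return min(nodes, key=lambda n: containers_per_node[n])
    if strategy == "binpack":
        # The most packed node wins (capacity limits are ignored in this sketch).
        return max(nodes, key=lambda n: containers_per_node[n])
    if strategy == "random":
        # Any node, regardless of load.
        return (rng or random).choice(nodes)
    raise ValueError(f"unknown strategy: {strategy}")


cluster = {"agent-1": 3, "agent-2": 1, "agent-3": 2}
spread_choice = pick_node(cluster, "spread")      # least loaded node
binpack_choice = pick_node(cluster, "binpack")    # most loaded node
random_choice = pick_node(cluster, "random", rng=random.Random(0))
print(spread_choice, binpack_choice, random_choice)
```

With this cluster, spread picks agent-2 and binpack picks agent-1, which shows the tradeoff described above: binpack keeps agent-2 and agent-3 free for larger workloads, at the cost of concentrating risk on agent-1.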
Swarm filters allow the manager to eliminate some of the nodes when it has to launch a container. The example under “Swarm Filters” in the slide above shows that we want to run two containers on the same host, on nodes that meet certain constraints: health checks and storage.
In this example, I’m showing the affinity filter, which says, “I will launch my logger container and make sure that it runs alongside the front-end container,” so the manager takes that into account when launching the container.
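Filtering can be pictured as a step that runs before the scheduling strategy and narrows down the candidate nodes. The Python sketch below is an illustration only: the Node class, the filter_nodes function, the storage label and the container names are assumptions chosen for this example and do not reflect Swarm’s internal data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    labels: dict = field(default_factory=dict)      # e.g. {"storage": "ssd"}
    containers: list = field(default_factory=list)  # names of running containers


def filter_nodes(nodes, constraints=None, affinity_container=None):
    """Keep only nodes that satisfy every constraint and the affinity, if given."""
    constraints = constraints or {}
    candidates = []
    for node in nodes:
        # Constraint filter: every requested label must match exactly.
        labels_ok = all(node.labels.get(k) == v for k, v in constraints.items())
        # Affinity filter: the node must already run the named container.
        affinity_ok = (
            affinity_container is None or affinity_container in node.containers
        )
        if labels_ok and affinity_ok:
            candidates.append(node)
    return candidates


nodes = [
    Node("agent-1", {"storage": "ssd"}, ["frontend"]),
    Node("agent-2", {"storage": "disk"}, ["db"]),
    Node("agent-3", {"storage": "ssd"}, []),
]
# Constraint: only nodes labelled with SSD storage remain.
ssd_nodes = [n.name for n in filter_nodes(nodes, constraints={"storage": "ssd"})]
# Affinity: the logger must land next to the front-end container.
logger_nodes = [n.name for n in filter_nodes(nodes, affinity_container="frontend")]
print(ssd_nodes)
print(logger_nodes)
```

Whatever nodes survive the filters would then be handed to the scheduling strategy, which picks one of them.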
Below is the environment I’m using for the Docker Swarm demo. I have VirtualBox running, along with one master and two slave nodes:
Since I mentioned discovery services and the different strategies, I also want to show how you can change the strategy. By default, Docker Swarm follows the spread strategy, so it will just launch your container on the node which has the fewest containers.
But if you want to change that, you can launch containers using binpack instead. The demo also shows how to use the discovery service with tokens.
Now let’s dive into the demo:
In a nutshell, I think the biggest advantage of Docker Swarm is its simplicity. If somebody is familiar with Docker, the commands we use with Swarm are the same except for a few simple constructs you need in order to set up the cluster. Other than that, if you’re familiar with Docker, the learning curve isn’t steep.
And with Docker 1.12, several features such as scaling are now much simpler. As an example, if you want to scale your application, you can just run docker service create with the name of the application and the new workload.
Docker Swarm 1.12 also leverages some of the same concepts as Kubernetes, such as the levels of abstraction.
