
Microservices Patterns With Envoy Sidecar Proxy: Part I


This tutorial series shows how to connect and manage microservices with the Envoy Sidecar Proxy and Istio.io. In Part 1, we deal with circuit breaking.
This blog post looks deeper at Envoy Proxy and Istio.io and how they enable a more elegant way to connect and manage microservices. Follow me @christianposta to stay up to date with these blog post releases.
This first blog post introduces you to Envoy Proxy’s implementation of circuit-breaking functionality. These demos are intentionally simple so that I can illustrate the patterns and usage individually. Please download the source code for this demo and follow along!
This demo consists of a client and a service. The client is a Java HTTP application that simulates making HTTP calls to the "upstream" service (note: we're using Envoy's terminology here, and throughout this repo). The client is packaged in a Docker image named docker.io/ceposta/http-envoy-client:latest. Alongside the HTTP-client Java application is an instance of Envoy Proxy. In this deployment model, Envoy is deployed as a sidecar alongside the service (the HTTP client in this case). When the HTTP client makes outbound calls (to the "upstream" service), all of the calls go through the Envoy Proxy sidecar.
The "upstream" service for these examples is httpbin.org. httpbin.org allows us to easily simulate HTTP service behavior. It's awesome, so check it out if you've not seen it.
The circuit-breaker demo has its own envoy.json configuration file. I definitely recommend taking a look at the reference documentation for each section of the configuration file to help understand the full configuration. There's also a nice intro to Envoy and its configuration which you should check out too.
To run the circuit-breaker demo, familiarize yourself with the demo framework and then run:
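The command itself isn't preserved in this excerpt. In the demo repo, a wrapper script takes the demo name as an argument; something like the following (the script name is an assumption based on the repo's conventions):

```
./docker-run.sh -d circuit-breaker
```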
The Envoy configuration for circuit breakers looks like this (see the full configuration here):
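The snippet itself is missing from this excerpt. Based on the settings discussed below (max_connections of 1, max_pending_requests of 1, plus a max_retries budget), a sketch of the circuit-breaker stanza inside the cluster definition in envoy.json would look roughly like this (the exact values, especially max_retries, are assumptions):

```json
"circuit_breakers": {
  "default": {
    "max_connections": 1,
    "max_pending_requests": 1,
    "max_retries": 3
  }
}
```

Note that these settings live under the upstream cluster definition (httpbin_service in this demo), so each cluster gets its own thresholds.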
This configuration allows us to limit the number of connections, pending requests, and retries made to the upstream cluster.
Let's take a look at each configuration. We'll ignore the max retry settings right now for two reasons:
In any event, the retries setting here allows us to avoid large retry storms, which in most cases can serve to compound problems when dealing with connectivity to all instances in a cluster. It's an important setting that we'll come back to in the retries demo.
Let's see what Envoy does when too many threads in an application try to make too many concurrent connections to the upstream cluster.
Recall that our circuit-breaking settings for our upstream httpbin cluster look like this (see the full configuration here):
If we look at the ./circuit-breaker/http-client.env settings file, we'll see that initially we'll start by running a single thread, which creates a single connection, makes five calls, and shuts down:
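The settings file isn't reproduced in this excerpt; a hypothetical sketch of what ./circuit-breaker/http-client.env contains (the variable names are illustrative assumptions, not confirmed from the repo):

```
NUM_THREADS=1
DELAY_BETWEEN_CALLS=0
NUM_CALLS_PER_CLIENT=5
PARALLEL_SENDS=false
```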
Let’s verify this. Run the demo:
This sets up the application with its client libraries and also sets up Envoy Proxy. We will send traffic directly to Envoy Proxy to handle circuit breaking for us. Let’s call our service:
We should see output like this:
We can see all five of our calls succeeded!
Let’s take a look at some of the metrics collected by Envoy Proxy:
WOW! That’s a lot of metrics Envoy tracks for us! Let’s grep through that:
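The stats dump isn't reproduced in this excerpt. The sketch below filters a hypothetical sample of Envoy's admin /stats output down to our cluster; against a live Envoy you would pipe the stats endpoint (e.g., `curl -s localhost:9901/stats`, with the port depending on the demo's admin configuration) into the same grep:

```shell
# Hypothetical sample of Envoy's admin /stats output
# (a real dump contains many more lines than this).
cat > /tmp/envoy_stats.txt <<'EOF'
cluster.httpbin_service.upstream_cx_http1_total: 1
cluster.httpbin_service.upstream_rq_total: 5
cluster.httpbin_service.upstream_rq_2xx: 5
cluster.httpbin_service.upstream_rq_200: 5
cluster.httpbin_service.upstream_rq_pending_overflow: 0
listener.0.0.0.0_8001.downstream_cx_total: 2
EOF

# Keep only the stats for our configured upstream cluster.
grep "cluster.httpbin_service." /tmp/envoy_stats.txt
```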
This will show the metrics for our configured upstream cluster named httpbin_service. Take a quick look through some of these statistics and look up their meaning in the Envoy documentation. The important ones to note are called out here:
This tells us we had 1 HTTP/1 connection, with 5 requests (total), and 5 of them ended in HTTP 2xx (and even 200). Great! But what happens if we try to use two concurrent connections?
First, let’s reset the statistics:
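This excerpt omits the reset command. Envoy's admin interface exposes a POST /reset_counters endpoint for exactly this; assuming the admin listener is reachable on localhost:9901 (the port depends on the demo's admin configuration):

```
curl -X POST localhost:9901/reset_counters
```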
Let’s invoke these calls with 2 threads:
We should see some output like this:
Whoa... one of our threads had 5 successes, but the other didn't! That thread had all 5 of its requests fail! Let's take a look at the Envoy stats again:
Now our stats from above look like this:
From this output we can see that only one of our connections succeeded! We ended up with 5 requests that resulted in HTTP 200 and 5 requests that ended up with HTTP 503. We also see that upstream_rq_pending_overflow has been incremented to 5. That is our indication that the circuit breaker did its job here: it short-circuited any calls that didn't match our configuration settings.
Note, we've set our max_connections setting to an artificially low number (1, in this case) to illustrate Envoy's circuit-breaking functionality. This is not a realistic setting but hopefully serves to illustrate the point.
Let’s run some similar tests to exercise the max_pending_requests setting.
Recall that our circuit-breaking settings for our upstream httpbin cluster look like this (see the full configuration here):
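As before, the snippet isn't reproduced in this excerpt; a sketch of the relevant envoy.json stanza, with max_pending_requests capped at 1 (the values, especially max_retries, are assumptions based on the surrounding discussion):

```json
"circuit_breakers": {
  "default": {
    "max_connections": 1,
    "max_pending_requests": 1,
    "max_retries": 3
  }
}
```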
What we want to do is simulate multiple simultaneous requests happening on a single HTTP connection (since we're only allowed max_connections of 1). We expect the requests to queue up, but Envoy should reject the queued-up messages since we have max_pending_requests set to 1. We want to set upper limits on our queue depths and not allow retry storms, rogue downstream requests, DoS, and bugs in our system to cascade.
Continuing from the previous section, let’s reset the Envoy stats:
Let's invoke the client with one thread (i.e., one HTTP connection) but send our requests in parallel (in batches of five by default). We will also want to randomize the delays we get on sends so that things can queue up:
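The invocation is omitted from this excerpt; a hypothetical sketch of the adjusted http-client.env for this run (variable names and the delay value are illustrative assumptions):

```
NUM_THREADS=1
DELAY_BETWEEN_CALLS=3000
PARALLEL_SENDS=true
NUM_CALLS_PER_CLIENT=5
```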
We should see output similar to this:
Damn! Four of our requests failed… let’s check the Envoy stats:
Sure enough, we see that four of our requests were short-circuited:
We've seen what circuit-breaking facilities Envoy has for short-circuiting and bulkheading threads to clusters, but what if nodes in a cluster go down (or appear to go down) completely?
Envoy has settings for "outlier detection" which can detect when hosts in a cluster are not reliable and can eject them from the cluster rotation completely (for a period of time). One interesting phenomenon to understand is that, by default, Envoy will eject hosts from the load-balancing algorithms only up to a certain point. If too many of the hosts (i.e., more than 50%) have been deemed unhealthy, Envoy's load-balancer algorithms cross a "panic threshold" and will just go back to load balancing against all of them.
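Outlier detection is configured per upstream cluster. A sketch of what such a stanza looks like in an envoy.json cluster definition (the field names follow Envoy's outlier-detection settings; the values are illustrative):

```json
"outlier_detection": {
  "consecutive_5xx": 5,
  "interval_ms": 10000,
  "base_ejection_time_ms": 30000,
  "max_ejection_percent": 50
}
```

The panic threshold itself is governed by Envoy's upstream.healthy_panic_threshold runtime key, which defaults to 50%.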
