
Virtual Panel: Microservices Interaction and Governance model


The recent trend in application architectures is to transition from monolithic applications to a microservices model. This transition without a good service interaction model will most likely result in chaos and a service landscape that’s hard to govern and maintain. InfoQ spoke with domain experts on this topic and compiled their responses in this virtual panel article.
The recent trend in application architectures is to transition from monolithic applications to a microservices model. Software development teams are trying to develop a unified architecture that will work for most of the use cases in real-world applications: online transactional, batch, IoT, and big data processing pipelines.
Making this transition without a good service interaction model will most likely result in chaos and a service landscape that’s hard to govern and maintain: hundreds of microservices communicating with each other, with no governance to ensure that only authorized microservices can call the others.
The focus of this virtual panel article is to discuss the pros and cons of service orchestration vs. service choreography and best practices on how to implement business process services that require calling multiple microservices.
InfoQ: What are the different options for managing microservices interaction? Please discuss any design considerations for how microservices should communicate with each other.
Chris Richardson: One simple way to think about microservice interaction is in terms of commands and queries. Commands are external requests (i.e. from outside the application) that create, update or delete data. Queries are, as the name suggests, external requests that simply read data. The challenge in a microservice architecture is that in order for services to be loosely coupled, each service has its own database (the Database per Service pattern). As a result, some queries need to retrieve data from multiple services and some commands must update data in multiple services.
Queries are usually the easiest to implement. Simpler queries retrieve data from a single service, but more complex queries must retrieve data from multiple services. For example, the order details might be scattered across multiple services including the order service and the delivery service. One way to implement this kind of query is through the API Composition pattern: call each service that has the data and join the results together. Another approach is the CQRS pattern, which maintains an easily queried view that pre-joins the data from multiple services. The view is kept up to date by subscribing to the domain events that are published by services when their data changes.
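To make the API Composition pattern concrete, here is a minimal sketch in Java that fans out to a hypothetical order service and delivery service in parallel and joins the two fragments into one response. The service URLs and response shapes are assumptions for illustration, not part of any real API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// API Composition sketch: fetch the order fragment and the delivery
// fragment from their owning services, then join them into one view.
public class OrderDetailsComposer {
    private static final HttpClient client = HttpClient.newHttpClient();

    private static CompletableFuture<String> fetch(String url) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                     .thenApply(HttpResponse::body);
    }

    // Fan out to both services in parallel, then combine the results.
    public static CompletableFuture<String> orderDetails(String orderId) {
        CompletableFuture<String> order =
                fetch("http://order-service/orders/" + orderId);        // hypothetical URL
        CompletableFuture<String> delivery =
                fetch("http://delivery-service/deliveries?orderId=" + orderId);
        return order.thenCombine(delivery,
                (o, d) -> "{\"order\":" + o + ",\"delivery\":" + d + "}");
    }
}
```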
Commands are more challenging. Complex commands need to update data in multiple services. For example, when the user places an order, the application must create an Order in the OrderService, and redeem a Coupon in the CouponService. Distributed transactions are not an option. Instead, an application must use the Saga pattern. A saga is a sequence of local transactions in the participating services that are coordinated using messages or events.
Daniel Bryant: The two most popular methods of communication for microservices are Remote Procedure Calls (RPC), typically via HTTP and JSON or something like gRPC, and messaging/eventing, often using something like RabbitMQ or Apache Kafka.
The key difference in interaction patterns within any kind of distributed system is the coupling in time and space. For example, RPC tends to be synchronous and point-to-point. Messaging is typically asynchronous, delivered via topic queues or pub/sub, and often there are multiple consumers for a message. In general, synchronous interaction is easier to reason about, design and debug, but is less flexible, more costly to make fault tolerant, and more challenging to scale operationally in comparison with asynchronous interaction.
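As a rough illustration of that difference in coupling, the sketch below contrasts a synchronous point-to-point call with an asynchronous topic that fans an event out to several subscribers. The in-process Topic class is only a stand-in for a real broker such as RabbitMQ or Kafka, and all of the names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class InteractionStyles {

    // Synchronous, point-to-point: the caller knows the callee and blocks on it.
    interface InventoryService { int stockFor(String sku); }

    static void placeOrderRpc(InventoryService inventory, String sku) {
        if (inventory.stockFor(sku) > 0) {          // coupled in time: we wait here
            System.out.println("order accepted for " + sku);
        }
    }

    // Asynchronous pub/sub: the producer only knows the topic; zero or more
    // consumers react later, on their own threads.
    static class Topic<T> {
        private final List<Consumer<T>> subscribers = new CopyOnWriteArrayList<>();
        private final ExecutorService pool = Executors.newCachedThreadPool();

        void subscribe(Consumer<T> consumer) { subscribers.add(consumer); }

        void publish(T event) {
            for (Consumer<T> c : subscribers) pool.submit(() -> c.accept(event));
        }
    }

    public static void main(String[] args) {
        Topic<String> orderPlaced = new Topic<>();
        orderPlaced.subscribe(sku -> System.out.println("billing saw " + sku));
        orderPlaced.subscribe(sku -> System.out.println("shipping saw " + sku));
        orderPlaced.publish("sku-42");              // decoupled in time and space
        placeOrderRpc(sku -> 3, "sku-42");          // coupled in time and space
    }
}
```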
An additional coordination layer is often added on top of the communication mechanism, which deals with things like service discovery or orchestration.
Glenn Engstrand: In order to understand the options for managing microservice interaction, we should first study its history. Let’s look back to a time that is almost a decade before microservices really took off. In the early 2000s, the book Enterprise Integration Patterns was published. The corresponding web site for EIP remains an important reference for service interaction even to this day.
Workflow engines were a popular option back in the days of Service Oriented Architecture, Business Process Management, and the Enterprise Service Bus. The promise at that time was that you could create orchestration style APIs without needing to employ a fully trained engineer. They are still around but there isn’t much interest in this option for microservice interaction anymore, primarily because they could not deliver on that promise.
The most natural way to break a monolith up into microservices is to carve out the data access parts from the code base and replace them with calls to RESTful data-centric APIs. This was the concept of the original API gateway, but that term now refers to something else (see below). Though this has been going on for quite some time, the technology-focused media started to recognize this trend as BfFs (Backends for Frontends) a couple of years ago. A slight variation with BfFs is that you can have a different orchestration service for desktop and mobile experiences.
The first article on Staged Event Driven Architecture (SEDA) was originally published in December 2001. SEDA started gaining in popularity at about the same time as workflow engines. Since then SEDA has been eclipsed by reactive programming, which has become very popular in the past few years. It is possible to write a single program in a reactive style, but the most common use cases for reactive programming are distributed and involve a message broker.
BfFs were originally homegrown but the more framework oriented parts, such as authentication, request logging, and rate limiting, became a vendor category known as API Gateways.
Blockchains are most famous for their cryptocurrency capability, but IBM and the Linux Foundation believe that the consensus and smart contract parts could become a popular option for microservice interaction.
There are a lot of design considerations to take into account when formulating architectures for microservice interaction: dependency management, data integrity, eventual consistency, clustering, service definition, objects vs. aspects, authentication, monitoring, and resiliency.
When you split a monolithic application up into microservices, dependency management becomes a thing. There is a growing trend in IT right now, called monorepo, which basically is a response to this issue.
In the time of the monolith with a relational database, data integrity was taken for granted because you ran every change needed to maintain data integrity in a single transaction which either failed or succeeded atomically. That is not really a viable option in the world of microservices. How can you maintain consistent state in the case of partial failure?
Reactive systems are popular now because they have a lot of advantages but they do come with a price which is eventual consistency. What users want emotionally and what is easiest for GUI developers to code is that, when you call an API that mutates state and the API returns a success status, then you can count on that change as being immediately in effect. That is not the case with eventual consistency. In my personal experience, eventual consistency is not really an issue with social applications but is a deal breaker with financial applications.
In this 12-factor world, no service is run by a single host. Rather, a cluster of hosts, each running an instance of the service, sits behind a load balancer which proxies each request to a different node in the pool. Some technologies, such as Memcached and Cassandra, require client-side load balancing. Compound this with the ephemeral nature of the cloud, and you realize why technologies such as Kubernetes are quickly gaining in popularity.
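A minimal sketch of that client-side load balancing might look like the following. The host list here is a hypothetical static snapshot; a real client would refresh it from service discovery as nodes come and go.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Client-side round-robin balancing of the kind Memcached and Cassandra
// clients perform internally: each call picks the next node in the pool.
public class RoundRobinBalancer {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinBalancer(List<String> hosts) {
        this.hosts = List.copyOf(hosts);
    }

    // floorMod keeps the index valid even after the counter overflows.
    public String pick() {
        return hosts.get(Math.floorMod(next.getAndIncrement(), hosts.size()));
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(
                List.of("cache-1:11211", "cache-2:11211", "cache-3:11211"));
        for (int i = 0; i < 5; i++) System.out.println(lb.pick());
    }
}
```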
When you have to manage over a dozen rapidly evolving APIs, you come to appreciate any mechanism that can keep the clients (usually written in different programming languages) and the server code in sync. That is the problem that service definition technologies, such as Swagger, Apache Thrift, and gRPC, are attempting to solve.
Study modern enterprise microservices written in Java and you will find a mixture of both Object Oriented Programming and Aspect Oriented Programming. Conventional wisdom dictates that AOP is best for cross-cutting concerns. It is not yet clear whether microservice interaction is best considered a cross-cutting concern. Many technologies that facilitate microservice interaction offer both, so you have to choose which one is right for your team. Right now, there is better support for OOP than AOP in most IDEs (Integrated Development Environments).
Authentication is always an important design consideration. For microservices, you will most likely want to adopt the OAuth 2 standard. Many API gateways make it easy to integrate with OAuth 2. There are a few situations where that doesn’t really work, such as when a web browser needs to launch a spreadsheet.
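As a rough sketch of the kind of check an API gateway or service edge performs, the handler below rejects any request that does not carry a Bearer token. The validate() method is a deliberate stub, and the endpoint is hypothetical; a real service would verify the token against the OAuth 2 provider, for example via signature validation or token introspection.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class BearerTokenGate {

    // Placeholder only: a real implementation verifies the token with
    // the OAuth 2 authorization server.
    static boolean validate(String token) {
        return !token.isEmpty();
    }

    static void handle(HttpExchange exchange) throws IOException {
        String auth = exchange.getRequestHeaders().getFirst("Authorization");
        if (auth == null || !auth.startsWith("Bearer ") || !validate(auth.substring(7))) {
            exchange.sendResponseHeaders(401, -1);   // reject unauthenticated calls
            exchange.close();
            return;
        }
        byte[] body = "hello, authorized caller".getBytes();
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream out = exchange.getResponseBody()) { out.write(body); }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", BearerTokenGate::handle);
        server.start();
    }
}
```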
Monitoring becomes a very important design consideration in a microservice world. Application and access logs need to be aggregated and surfaced in a way that is easily searchable by both developers and user ops folks. Performance related metrics, such as per minute latency (average and percentiles), throughput, error rate, and utilization, need to be aggregated and surfaced in a way that is easily searchable by developers, Quality Engineers, and technical operations support. An alerting system needs to be in place where uncharacteristic patterns and Service Level Agreement violations in logging and metrics are detected with corresponding notifications sent to the right personnel in a way that minimizes alert fatigue.
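As a small illustration of the per-minute latency aggregation just described, the sketch below records per-request latencies and flushes the average and 99th percentile once a minute. The scheduling is assumed to happen elsewhere, and a real system would ship these numbers to a metrics backend rather than stdout.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Collects one minute's worth of request latencies, then reports the
// average and p99 and resets for the next window.
public class LatencyWindow {
    private final List<Long> samples = new ArrayList<>();

    public synchronized void record(long millis) {
        samples.add(millis);
    }

    // Intended to be called once per minute by a scheduler (not shown).
    public synchronized void flush() {
        if (samples.isEmpty()) return;
        Collections.sort(samples);
        double avg = samples.stream().mapToLong(Long::longValue).average().orElse(0);
        long p99 = samples.get((int) Math.ceil(samples.size() * 0.99) - 1);
        System.out.printf("latency avg=%.1fms p99=%dms n=%d%n", avg, p99, samples.size());
        samples.clear();
    }
}
```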
The last design consideration is resiliency. Study the code from most junior developers and you will find lots of assumptions in the code that dependent services are always available. That is not necessarily the case and it would be best if your service did not destabilize when dependent services become degraded or unresponsive. Techniques such as rate limiting, circuit breaking, and bulkheading become relevant here.
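To illustrate one of those techniques, here is a minimal circuit breaker sketch: after a run of consecutive failures the circuit opens and calls fail fast until a cooldown elapses, so a degraded dependency cannot drag the calling service down with it. The thresholds and half-open behavior are simplified relative to production libraries such as Resilience4j.

```java
import java.util.function.Supplier;

public class CircuitBreaker {
    private final int threshold;
    private final long cooldownMillis;
    private int failures = 0;
    private long openedAt = 0;

    public CircuitBreaker(int threshold, long cooldownMillis) {
        this.threshold = threshold;
        this.cooldownMillis = cooldownMillis;
    }

    public synchronized <T> T call(Supplier<T> remoteCall) {
        // While open and inside the cooldown window, fail fast.
        if (failures >= threshold &&
                System.currentTimeMillis() - openedAt < cooldownMillis) {
            throw new IllegalStateException("circuit open: failing fast");
        }
        try {
            T result = remoteCall.get();
            failures = 0;                       // success closes the circuit
            return result;
        } catch (RuntimeException e) {
            if (++failures >= threshold) {
                openedAt = System.currentTimeMillis();  // open (or re-open) the circuit
            }
            throw e;
        }
    }
}
```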
Josh Long: There are a lot of different ways for services to interact. It’s useful to think about interaction patterns and what qualities you want in those interactions. What level of robustness? What levels of decoupling?
Messaging – where systems communicate through a messaging fabric like Apache Kafka or RabbitMQ – supports a higher degree of service decoupling while introducing another moving part into a system. Messaging forms the backbone of a number of other patterns like CQRS, the Saga pattern and eventual consistency. Messaging gives you locational decoupling – the consumer doesn’t need to know where the producer is, so long as it knows where the broker is. It gives you temporal decoupling – the consumer can process requests at whatever time or pace. A producer and a consumer are coupled by the payload (its schema) of the message. They’re also, technically, constrained by the broker itself, but it is a well-understood recipe to stand up highly available message brokers or fabrics that cluster, so that if any node were to fail the system would correctly endure.
RPC – where clients and services interact in terms of function or procedure or method invocations. In RPC, remote services are made to appear like local invocations of functions or methods on objects. There are countless options for building RPC-centric applications including gRPC, RMI, XML-RPC and RPC-style SOAP (which is different from document literal-style SOAP).
REST – where clients and services interact in terms of HTTP requests and replies, using the qualities of HTTP, like request parameters, headers, and content negotiation, to build services. REST has some of the benefits and constraints of both messaging and RPC. Its main benefit is that it’s suitable for the open web, where all languages and platforms have support for talking to HTTP services.
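A minimal REST endpoint needs nothing beyond the JDK. The sketch below uses the built-in com.sun.net.httpserver server and leans on HTTP itself: a query parameter selects the resource representation's content, and the Accept header drives crude content negotiation. The path and payload shapes are illustrative only.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class GreetingResource {

    static void handle(HttpExchange exchange) throws IOException {
        String query = exchange.getRequestURI().getQuery();        // e.g. name=Ada
        String name = (query != null && query.startsWith("name="))
                ? query.substring(5) : "world";

        // Content negotiation: honor an Accept header asking for JSON.
        String accept = exchange.getRequestHeaders().getFirst("Accept");
        boolean json = accept != null && accept.contains("application/json");
        String body = json ? "{\"greeting\":\"hello " + name + "\"}"
                           : "hello " + name;
        exchange.getResponseHeaders().set("Content-Type",
                json ? "application/json" : "text/plain");

        byte[] bytes = body.getBytes();
        exchange.sendResponseHeaders(200, bytes.length);
        try (OutputStream out = exchange.getResponseBody()) { out.write(bytes); }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/greeting", GreetingResource::handle);
        server.start();
        // try: curl -H "Accept: application/json" "http://localhost:8080/greeting?name=Ada"
    }
}
```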
There are a lot of other ways for nodes running in separate processes to communicate, but I usually limit the discussion to these three options when talking about microservices as other approaches are more readily described as integration.
Microservices come in systems. It’s a natural consequence of moving to the architecture: you need to address the complexity implied in building distributed systems. The complexity arises in their interactions, their connections, which may be imperiled for any number of reasons including latency, overwhelming demand, or network partitions. There is no perfect solution, only solutions that support the trade-offs that you are comfortable making. I talk about some of the trade-offs for different types of communication styles in the first question. Consider levels of reliability.
Alex Silva: A few options are:
- Transport data over HTTP, using a serialization format such as JSON, Avro or protobuf.
- Use some type of message broker software, such as RabbitMQ.
- Use a distributed log that supports data replication and streaming semantics, such as Kafka.
InfoQ: What are the pros and cons of the service orchestration approach? What types of use cases are better candidates for service orchestration?
Chris Richardson: Orchestration-based sagas use an orchestration object, which tells the participants what actions to perform. For example, a CreateOrderSaga object would tell the OrderService to create an order and tell the CouponService to redeem the coupon. If one of those steps fails, the CreateOrderSaga would tell the participants to execute compensating transactions to undo the work that they had already committed.
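A sketch of that saga in code might look like the following. The participant interfaces are hypothetical stand-ins for the remote services, and for brevity the coordination is collapsed into direct calls where a real saga would exchange asynchronous messages or events.

```java
// Orchestration-based saga: the orchestrator tells each participant what
// to do and, on failure, tells the participants that already committed
// to run their compensating transactions.
public class CreateOrderSaga {

    interface OrderService  { String createOrder();           void rejectOrder(String id); }
    interface CouponService { void redeemCoupon(String code); void restoreCoupon(String code); }

    private final OrderService orders;
    private final CouponService coupons;

    public CreateOrderSaga(OrderService orders, CouponService coupons) {
        this.orders = orders;
        this.coupons = coupons;
    }

    public void execute(String couponCode) {
        String orderId = orders.createOrder();      // local transaction 1
        try {
            coupons.redeemCoupon(couponCode);       // local transaction 2
        } catch (RuntimeException e) {
            orders.rejectOrder(orderId);            // compensating transaction
            throw e;
        }
    }
}
```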
There are several benefits of this approach. The logic of the saga is centralized and so it is easy to understand. It also simplifies the participants. Another benefit is that it eliminates cyclic design-time dependencies. The orchestrator depends on the participants but not vice versa.
One potential drawback of orchestration is that there is a risk of the participants being anemic because the orchestrator implements too much logic. Care must be taken to ensure that the orchestrator is as simple as possible. Another drawback is that you need to implement the orchestrator. As I describe below, some event-based, choreography-based sagas can be very simple to implement.
Daniel Bryant: Service orchestration is generally easier to implement, and often gives better visibility into the processes and data flow involved, both at design time and runtime. Platforms that support orchestration also tend to offer cross-cutting concerns as part of the deployment fabric or framework, such as service discovery, flow control and fault tolerance.
On the flip side, orchestration can be somewhat inflexible as you have to describe every interaction, and it doesn’t always scale well when dealing with large and complex processes. Depending on the implementation, it can be the case that adding a new task to an existing process results in having to deploy the entire system (i.e. monolithic deploys). This can be problematic if the process is continually changing and evolving.
Glenn Engstrand: To understand why these pros and cons are intrinsic to service orchestration, let’s imagine an illustrative yet typical sample service interaction. There are three services, B, C, and D. Service B needs something from service C in order to successfully finish an API call. Service C needs something from service D in order to successfully finish an API call. In the orchestration approach, there would be a fourth service, called A, that first calls D and takes the response from that call and includes it in the call to service C then takes the response from that call and includes it in the call to service B. Service B doesn’t know how to call C nor does it depend on C directly. Service C doesn’t know how to call D nor does it depend on D directly.
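That interaction could be sketched as follows, with hypothetical interfaces standing in for the remote services.

```java
// Orchestrator A chains the calls: D first, D's response into C, and
// C's response into B. B, C, and D never call each other directly.
public class ServiceA {

    interface ServiceB { String handle(String fromC); }
    interface ServiceC { String handle(String fromD); }
    interface ServiceD { String handle(); }

    private final ServiceB b;
    private final ServiceC c;
    private final ServiceD d;

    public ServiceA(ServiceB b, ServiceC c, ServiceD d) {
        this.b = b; this.c = c; this.d = d;
    }

    // Only A knows the call order and how the responses feed each other.
    public String orchestrate() {
        String fromD = d.handle();
        String fromC = c.handle(fromD);
        return b.handle(fromC);
    }
}
```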
The biggest advantage to service orchestration is less code complexity. Of the four services described above, only service A has to concern itself with resiliency. If you want to understand how service B depends on service C or how service C depends on service D, then all you have to do is study service A. This is a concept known as clear separation of concerns.
The biggest disadvantage to service orchestration is greater release complexity. Sometimes a new feature will require changes to multiple services. Each service has its own independent release schedule. You have to coordinate the changes in such a way that services that are depended on are released prior to services that depend on them. No release can be backwards breaking. The orchestration service may have to go through multiple interim releases in order for correct behavior to occur at all times. Imagine a new feature that requires changes to all four services of our sample service interaction. The worst case would be that you would have to release D first, then A, then C, then A, then B, then A.
The most common use case for service orchestration is the BfF.
Josh Long: Orchestration refers to integrating disparate services towards a desired result. The idea is that there is a single actor that involves other services. These services may not be aware of the desired goal. The actor guards the process state. In choreography, all actors in the system are aware of the global outcome desired and play their role towards that outcome. If something goes wrong in an orchestration, the orchestrator (a business process management engine like Activiti or a saga coordinator, for example) is responsible for recovering. In choreography, individual actors are aware of what must be done to recover. Individual actors must work harder to be robust because there is nothing else, no other actor, that will compensate and recover.
Orchestration works well if you need to be explicit about process state because the individual actors in the process aren’t aware of the encompassing process. The drawback, of course, is that you need to involve another actor which in theory introduces another moving part which may fail. In orchestration, each individual component may be blissfully focused on doing one thing, ignorant of the encompassing process or desired result. Choreography, on the other hand, works well when you have full control over the actors involved in the process and they all work towards a common goal; you don’t need an extra moving part in the system and that moving part doesn’t need to be so robust as to ensure that every other actor in the system works.
