
Designing High-Volume Systems Using Event-Driven Architectures


Learn how to design and build a system meant to handle heavy loads while embracing cloud-native architectures.
Microservices-style application architecture is taking root and rapidly growing in popularity, with services possibly scattered across different parts of the enterprise ecosystem. Organizing and efficiently operating them in a multi-cloud environment, organizing data around microservices, and making that data as real-time as possible are emerging as some of the key challenges. Thanks to the latest developments in Event-Driven Architecture (EDA) platforms such as Kafka, and in data management techniques such as data meshes and data fabrics, designing microservices-based applications is now much easier. However, to ensure these applications perform at the requisite levels, critical Non-Functional Requirements (NFRs) must be taken into consideration at design time itself.

In a series of blog articles, my colleagues Tanmay Ambre and Harish Bharti, along with myself, describe a cohesive approach to designing for NFRs. We take a use-case-based approach. In this first installment, we describe designing for "performance" as the first critical NFR. This article focuses on the architectural and design decisions that are the basis of high-volume, low-latency processing. To make these decisions clear and easy to understand, we apply them to a high-level use case of funds transfer, which we have simplified to focus mainly on performance.

Electronic Fund Transfer (EFT) is a very important way of sending and receiving money these days through digital channels. We consider this a good candidate for explaining performance-related decisions because it involves a high volume of requests, coordination with many distributed systems, and no margin for error (i.e., the system needs to be reliable and fault-tolerant). In the past, a fund transfer would take days and involve a visit to a branch or writing a cheque.
However, with the emergence of digital channels, new payment mechanisms, payment gateways, and regulations, fund transfer has become instantaneous. For example, in September 2021, 3.6 billion transactions worth 6.5 trillion INR were executed on the UPI network in real time. Customers expect real-time payments across a wide variety of channels, and regulations such as PSD2, Open Banking, and country-specific regulations have made it mandatory for institutions to expose their payment mechanisms to trusted third-party application developers.

Typically, a customer of a bank places a fund transfer request using one of the available channels (e.g., mobile app, online portal, or a visit to the institution). Once the request is received, the following needs to be performed (the figure below gives an overview):

Note: For the sake of this article, this use case comprises the operations done within the financial institution where the request originates. It relies on already established payment gateways that execute the actual fund transfer; operations of the payment gateway are outside the scope of this use case.

Here are the most critical NFRs that we must address:

The implementation model of this use case follows a cloud-native style: microservices, APIs, containers, event streams, and distributed data management with eventual-consistency-style data persistence for integrity. Please note that this architecture is based on the architectural best practices outlined in Architectural Considerations for Event-Driven Microservices-Based Systems. The following is the set of key architectural patterns considered to implement this user story:

The following diagram provides an overview of the solution architecture: The application architecture is organized as a set of independently operable microservices. In addition, an orchestrator service (another microservice) coordinates the full transaction, ensuring end-to-end process execution.
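Before looking at how the services are wired together, it helps to have a concrete picture of the request that flows through them. The following is a minimal sketch of the fund transfer request event a channel-facing API might publish; the record name, field names, and factory method are our own illustrative assumptions, not part of the reference architecture.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.UUID;

// Illustrative shape of the event a channel API publishes when a customer
// requests a fund transfer. All names here are assumptions for the sketch.
record FundTransferRequested(
        UUID eventId,          // unique event id, usable as an idempotency key
        String sourceAccount,  // debtor account at the originating institution
        String targetAccount,  // creditor account, possibly at another institution
        BigDecimal amount,
        String currency,
        String channel,        // e.g. MOBILE_APP, PORTAL, BRANCH
        Instant requestedAt) {

    // Convenience factory that stamps the event with an id and timestamp.
    static FundTransferRequested of(String source, String target,
                                    BigDecimal amount, String currency,
                                    String channel) {
        return new FundTransferRequested(UUID.randomUUID(), source, target,
                amount, currency, channel, Instant.now());
    }
}
```

In a real deployment this payload would be serialized as Avro or JSON (per the technology stack discussed later) and published to the orchestrator's input topic.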
The different services of fund transfer are wired together as a set of producers, processors, and consumers of events. There are four main processors: The API publishes an event to the input topic of the Fund Transfer Orchestrator, which is the primary coordinator for fund transfer requests. Events are first-class citizens and are persistent: they accumulate in an event store, enabling the event-sourcing architectural pattern. Based on the event context and the payload, the orchestrator transforms the event and publishes the state of the fund transfer to another topic. The fund transfer state transitions are also recorded in a state store, which can be used to regenerate the state in case of system-level failures.

This state is consumed by the Fund Transfer Request Router, which makes routing decisions and routes the request to other systems (either a single system or multiple systems simultaneously). Those systems do their own processing and publish the outcome as an event to the input topic; these outcomes are then correlated and processed by the Fund Transfer Orchestrator, resulting in a state change of the fund transfer request. Functional exceptions are likewise processed by the Fund Transfer Orchestrator, and the request state is updated accordingly.

Fund transfer state changes are also consumed by the real-time Fund Transfer Statistics Service, which aggregates fund transfer statistics across multiple dimensions so that the operations team has a near-real-time view of fund transfer activity.

To implement the above application architecture, we decided on a set of key technical building blocks. The following is an indicative technology stack that can be used to build this system. Most of the components are open source, and it is possible to use other technologies, e.g., Quarkus.
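The orchestrator described above is essentially an event-sourced state machine: each incoming event is appended to a durable log, a transition table derives the next state, and the current state can always be rebuilt by replaying the log after a failure. The sketch below illustrates that core idea in plain Java; the state and event names are our own simplification, and an in-memory list stands in for the Kafka-backed event store.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the orchestrator's event-sourcing pattern. State/event
// names and the in-memory "event store" are illustrative assumptions.
class FundTransferOrchestrator {

    enum State { RECEIVED, VALIDATED, ROUTED, COMPLETED, FAILED }
    enum Event { VALIDATION_OK, ROUTED_TO_GATEWAY, GATEWAY_CONFIRMED, FUNCTIONAL_EXCEPTION }

    // Allowed transitions; any event not listed for a state is treated
    // as a functional exception and moves the request to FAILED.
    private static final Map<State, Map<Event, State>> TRANSITIONS = new EnumMap<>(State.class);
    static {
        TRANSITIONS.put(State.RECEIVED,  Map.of(Event.VALIDATION_OK, State.VALIDATED,
                                                Event.FUNCTIONAL_EXCEPTION, State.FAILED));
        TRANSITIONS.put(State.VALIDATED, Map.of(Event.ROUTED_TO_GATEWAY, State.ROUTED,
                                                Event.FUNCTIONAL_EXCEPTION, State.FAILED));
        TRANSITIONS.put(State.ROUTED,    Map.of(Event.GATEWAY_CONFIRMED, State.COMPLETED,
                                                Event.FUNCTIONAL_EXCEPTION, State.FAILED));
    }

    private final List<Event> eventStore = new ArrayList<>(); // stand-in for a Kafka topic
    private State state = State.RECEIVED;

    // Append the event to the log, derive the new state, and (in a real
    // system) publish the state change to the outbound topic.
    State handle(Event event) {
        eventStore.add(event);
        state = TRANSITIONS.getOrDefault(state, Map.of()).getOrDefault(event, State.FAILED);
        return state;
    }

    // Rebuild the state purely from the event log: the recovery path that
    // the article's state store enables after a system-level failure.
    State replay() {
        State s = State.RECEIVED;
        for (Event e : eventStore) {
            s = TRANSITIONS.getOrDefault(s, Map.of()).getOrDefault(e, State.FAILED);
        }
        return s;
    }
}
```

The key design point is that `handle` and `replay` share the same transition table, so a state regenerated from the log is guaranteed to match the live state.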
Capability and implementation choices:

- Programming Language: Java (Spring Boot), Quarkus, Golang
- Event Backbone: Apache Kafka, RabbitMQ
- Event / Message Format: Avro, JSON
- In-Memory Cache: Apache Ignite, Redis, Hazelcast
- NoSQL Database: MongoDB, Cassandra, CouchDB
- Relational DB: PostgreSQL, MariaDB, MySQL
- NewSQL (Distributed RDBMS): YugabyteDB, CockroachDB

DevOps capability and tool choices:

- CI/CD: Jenkins/Tekton, Maven/Gradle, Nexus/Quay
- Deployment Automation: Ansible, Chef
- Monitoring & Visualization: Prometheus, Grafana, Micrometer, Spring Boot Actuator
- Service Mesh (auto-heal, autoscale, canary, and more): Istio
- Log Streams and Analytics: EFK (Elasticsearch, Filebeat, Kibana)
- Continuous Code Quality Management: SonarQube, CAST
- Config & Source Code Management: Git & Spring Cloud Config

The following are the top three critical design decisions addressing these highly dynamic and complex NFRs.
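With Kafka as the event backbone, much of the high-volume, low-latency behavior comes down to producer configuration: batching and compression for throughput, idempotence and full acknowledgments for reliability. The following is a minimal configuration sketch; the broker address and the specific values are illustrative assumptions to be tuned against your own load tests, not recommendations from the reference architecture.

```java
import java.util.Properties;

// Sketch of Kafka producer settings that trade a few milliseconds of
// latency for throughput while keeping delivery reliable. Values are
// illustrative starting points, not tuned recommendations.
class FundTransferProducerConfig {

    static Properties producerProperties() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");  // placeholder broker address
        props.put("acks", "all");                      // wait for all in-sync replicas
        props.put("enable.idempotence", "true");       // no duplicate events on retry
        props.put("linger.ms", "5");                   // brief wait so batches fill up
        props.put("batch.size", "65536");              // 64 KiB batches
        props.put("compression.type", "lz4");          // cheap CPU, smaller network payloads
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // JSON-as-string here; an Avro serializer with a schema registry is
        // the other option from the stack above.
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }
}
```

Enabling idempotence matters particularly for the funds transfer use case: a retried publish must not result in a duplicate transfer event reaching the orchestrator.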
