Microsoft Ignite Data and Analytics roundup: Platform extensions are the key theme

Microsoft is introducing a new Apache Cassandra Azure service, while adding new on-ramps to Azure Synapse Analytics. It is also the first third-party cloud to clean-room engineer its own MongoDB 4.0 API.
The shift toward online digital conferences has prompted Microsoft to reconvene Ignite about six months early this year. Across the data and analytics announcements, the overriding theme is extending the reach of the portfolio of Azure data platforms. For data and analytics, the headlines this go-round include a new Azure Managed Instance for Apache Cassandra; support for a MongoDB 4.0 API in Azure Cosmos DB; the general availability of Azure Synapse Link for Cosmos DB; and some enhancements to the Azure Cache for Redis offering. Microsoft is also introducing new tools for data warehouse users to automate their migration to Azure Synapse Analytics. On the hybrid cloud front, there are several announcements for the software-defined hybrid platform Azure Arc, including support for Kubernetes (K8s) and the addition of Azure Machine Learning to the small but growing stable of Azure services available on Arc.
We’re splitting data and analytics coverage into two parts. We’ll focus on the data platforms and the K8s support on Arc, while Big on Data bro Andrew Brust will turn the spotlight on Power BI, Azure Purview, and Azure Machine Learning. Now let’s get down to business.
Microsoft is announcing the preview of a new lift-and-shift option for Cassandra customers: Azure Managed Instance for Apache Cassandra. It mirrors Azure SQL Managed Instance, a similar offering for SQL Server customers, in that it is designed to replicate the customer’s environment in a single-tenant implementation, but as a partially managed cloud service where Azure picks up the server provisioning, software maintenance, and automatic backups. Managed Instance joins Azure Cosmos DB in presenting Cassandra users with a second path. The two are very different services, although Microsoft is also providing a migration path that could allow Managed Instance to act as a stepping stone to Cosmos DB if the customer wants it.
There is a baseline similarity, however, as both support multi-AZ and multi-region deployment. The differences start at the storage engine: Managed Instance is a pure implementation of Apache Cassandra, whereas Cosmos DB has its own canonical storage engine that supports a compatible implementation via API, in a manner akin to how AWS delivers Amazon Keyspaces. In fact, Cassandra is one of many data models available through Cosmos DB, where Microsoft offers a selection of APIs.
Consistency is another differentiator: in Managed Instance, customers set consistency the way they would with the Cassandra tooling that they already use, whereas in Cosmos DB, there are five preset consistency options. And of course, there is the deployment environment: designed to replicate the customer’s on-premises environment, Managed Instance is a single-tenant (or bare metal) implementation, whereas Cosmos DB is multi-tenant. There are other subtle differences as well.
As noted, Managed Instance is designed for customers who want either to take advantage of the cloud to simplify running Cassandra or to use it as a waystation for moving their implementation to Cosmos DB. For the latter scenario, they can use a managed replication connector to populate Cassandra data into Cosmos DB.
The addition of Managed Instance is the latest example of the growing richness of Cassandra cloud services, which is a very recent phenomenon. Although Apache Cassandra has been one of the most popular databases as ranked by DB-Engines, until the past year it had not gotten much love in the cloud. Apache Cassandra was known as a highly robust, scalable, write-centric operational database suited for global deployments.
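For readers who want the consistency contrast drawn above in concrete terms, here is a minimal Python sketch. It does not call either service. It only lists the five Cosmos DB presets by name and shows the standard quorum arithmetic that Cassandra users apply when they tune consistency per request. The names `CosmosConsistency`, `quorum`, and `reads_see_latest_write` are hypothetical helpers invented for illustration, and the replication factor of 3 is an assumed value, not something taken from the announcement.

```python
from enum import Enum


class CosmosConsistency(Enum):
    """The five preset Azure Cosmos DB consistency levels, strongest first."""
    STRONG = "Strong"
    BOUNDED_STALENESS = "Bounded staleness"
    SESSION = "Session"
    CONSISTENT_PREFIX = "Consistent prefix"
    EVENTUAL = "Eventual"


def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a Cassandra QUORUM read or write."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor, for illustration only
    q = quorum(rf)
    print(f"QUORUM at RF={rf}: {q} replicas")
    print("QUORUM read + QUORUM write overlap:", reads_see_latest_write(q, q, rf))
    print("ONE read + ONE write overlap:", reads_see_latest_write(1, 1, rf))
    print("Cosmos DB presets:", [level.value for level in CosmosConsistency])
```

The point of the sketch is the shape of the choice: with Cassandra tooling the operator picks read and write levels per request and reasons about replica overlap, while Cosmos DB asks the customer to pick one of a fixed set of named levels.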
