Azure Synapse Analytics is a new product in the Microsoft Azure portfolio. It adds a new control plane over well-known services such as SQL Warehouse (rebranded to SQL Provisioned Pool), integrated Data Factory Pipelines, and Azure Data Lake Storage, and it adds new components such as Serverless SQL and Spark Pools. The integrated Azure Synapse Workspace helps handle security and protection of data in one place for all data lake, data analytics, and warehousing needs, but it also requires learning some new concepts.

At GFT, working with financial institutions all over the world, we pay particular attention to the security aspects of the solutions that we provide to our customers. Synapse Analytics is a welcome new tool in this area.

The first visible difference, compared to other services, is that Synapse Analytics has a separate Workspace portal at https://web.azuresynapse.net/ that provides access to code, notebooks, SQL, pipelines, monitoring, and management panels. The portal is available on the public Internet and uses Azure AD access controls to govern access to any Synapse Analytics instance in any tenant that we have access to.

Synapse Analytics also introduces a new way to connect to the portal from Internet-isolated on-premises networks and offices: Private Link Hubs. Unlike Private Links, which protect access to services and databases, this solution routes traffic to a web portal. In conjunction with an Azure AD Conditional Access policy, the new Synapse Analytics Workspace can be protected with both network and authentication policies.

In terms of authorization, Synapse Analytics introduces a new concept: Synapse Roles. This is another layer of role assignment, applied to the Analytics Workspace and to internal workspace items.
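To make the idea of role assignments on the workspace and on individual workspace items more concrete, here is a minimal Python sketch. It is only a toy model and not the Synapse API: the `RoleAssignment` class, the `is_covered` helper, the slash-separated scope strings, and the principal names are all hypothetical, the role names are used purely as labels, and the rule that an assignment on the workspace also covers the items inside it is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    principal: str  # user, group, or service principal (hypothetical names)
    role: str       # role label, illustrative only
    scope: str      # "workspace" or "workspace/<item type>/<item name>" (made-up format)


def is_covered(assignments, principal, role, target_scope):
    """Return True if `principal` holds `role` at `target_scope` or at a parent scope."""
    for a in assignments:
        if a.principal != principal or a.role != role:
            continue
        # Assumption: an assignment applies to its own scope and to everything nested below it.
        if target_scope == a.scope or target_scope.startswith(a.scope + "/"):
            return True
    return False


assignments = [
    # Workspace-wide assignment.
    RoleAssignment("alice", "Synapse Administrator", "workspace"),
    # Assignment limited to a single workspace item.
    RoleAssignment("bob", "Synapse Compute Operator", "workspace/sparkPools/pool1"),
]

# alice is covered on every item; bob only on the one pool he was assigned to.
print(is_covered(assignments, "alice", "Synapse Administrator", "workspace/sparkPools/pool1"))
print(is_covered(assignments, "bob", "Synapse Compute Operator", "workspace/sparkPools/pool2"))
```

In this model, narrowing an assignment to a single item is what gives the fine-grained control: the same role label grants far less when its scope is an individual pool rather than the whole workspace.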
The granularity of role assignments allows for detailed access control for administrators, support specialists, data scientists, and developers. Creating custom Synapse Roles is not currently supported, and most Synapse Roles are in preview, which means that users must be prepared for changes. It is worth mentioning that these roles do not protect access to data sets in ADLS.

Azure Synapse Analytics is a PaaS solution, and this is most apparent when using Serverless SQL Pools, Spark Pools, and Integration Runtimes. The compute part of the platform is provided by Microsoft and never exists inside the owner's subscription. The same applies even to a more "traditional" data warehouse with SQL Provisioned Pools. As a result, in the default configuration, Synapse Analytics uses public endpoints for communication and cannot connect to isolated vNETs.

The alternative deployment model with a Managed Virtual Network should be considered when using Synapse in an environment that requires higher network isolation standards. With this solution, all Synapse components (SQL Pools, Spark Pools, and Integration Runtimes) can use Managed Private Endpoints to connect to other services (databases, storage accounts) that allow access through private IPs only. Enabling the additional Data Exfiltration protection ensures that Synapse runtimes deployed to the virtual network communicate over private endpoints at all times and cannot access any external resources.
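The network choices described above can be summarized as a small checklist. The Python sketch below is a hedged illustration only: the settings dictionary and its keys (`managed_virtual_network`, `data_exfiltration_protection`, `managed_private_endpoints`, `private_only_targets`) are hypothetical names invented for this example and do not correspond to any Azure SDK or template schema. It also assumes, following the word "additional" in the text, that Data Exfiltration protection only makes sense on top of a Managed Virtual Network.

```python
def isolation_findings(ws):
    """Return human-readable findings for a (hypothetical) workspace settings dict."""
    findings = []
    managed_vnet = ws.get("managed_virtual_network", False)
    exfil = ws.get("data_exfiltration_protection", False)
    endpoints = set(ws.get("managed_private_endpoints", []))

    if not managed_vnet:
        # Default deployment model: communication goes over public endpoints.
        findings.append("no managed virtual network: public endpoints are used")
        if exfil:
            # Assumption: exfiltration protection builds on the managed virtual network.
            findings.append("data exfiltration protection requires a managed virtual network")

    # Services that only accept private IPs need a managed private endpoint.
    for target in ws.get("private_only_targets", []):
        if not managed_vnet or target not in endpoints:
            findings.append(f"{target}: not reachable over a private endpoint")

    return findings


# Default configuration trying to reach a storage account that is private-only.
default_ws = {"private_only_targets": ["storage1"]}

# Isolated configuration with a managed private endpoint for the same target.
isolated_ws = {
    "managed_virtual_network": True,
    "data_exfiltration_protection": True,
    "managed_private_endpoints": ["storage1"],
    "private_only_targets": ["storage1"],
}

for finding in isolation_findings(default_ws):
    print(finding)
print(isolation_findings(isolated_ws))
```

A check of this kind could run as part of a deployment review, so that a workspace intended for an isolated environment is not accidentally created in the default, public-endpoint mode.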