Databricks provides a powerful, Spark-centric, cloud-based analytics platform that lets users rapidly process, transform, and explore data. However, because of the flexibility it offers, its preconfigured security can be insufficient for regulating or monitoring confidential information. This is of particular concern to highly regulated enterprises, such as financial and health-care companies. Policy-as-code is a paradigm that can help these companies manage the additional technical overhead of compliance and governance as they migrate more and more sensitive data to Databricks.

Policy-as-code is a way to automate data security and privacy controls, allowing organizations to apply centralized policies to data science and analytics projects. Immuta’s policy-as-code feature is a simple and efficient way to extend Databricks’ security model with additional protection while giving the entire organization improved visibility into data use and access. With Immuta and Databricks, policy-as-code can be implemented quickly and easily. For example, you can create data sources, policies, projects, and purposes through API calls defined by endpoints, methods, query parameters, and payloads. Additionally, Immuta allows you to audit data access for compliance and create data governance policies without writing code.

To set up the integration, Databricks provides an API that can be used to write and deploy policies using JSON. These policies can be used to define who has access to what datasets and for how long, as well as the granularity of access granted on a dataset. Immuta also allows users to easily audit data usage and enforce rules on data governance across all Databricks clusters. A sample policy might look something like this:

{
    "name": "My Databricks Dataset",
    "actions": {
        "read": [
            "group: Databricks"
        ],
        "write": []
    },
    "time_restrictions": {
        "start": "<start time>",
        "end": "<end time>"
    }
}
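A policy of this shape can also be generated and sanity-checked programmatically before it is deployed. The following is a minimal Python sketch, assuming only the field names shown in the sample. The helpers `build_policy` and `is_allowed` are hypothetical and are not part of the Immuta or Databricks APIs, the ISO 8601 timestamp format is an assumption, and the dates are purely illustrative.

```python
import json
from datetime import datetime, timezone


def build_policy(name, readers, writers, start, end):
    """Build a policy dict shaped like the sample (hypothetical helper)."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {
        "name": name,
        "actions": {"read": list(readers), "write": list(writers)},
        # Assumption: timestamps are serialized as ISO 8601 strings.
        "time_restrictions": {
            "start": start.isoformat(),
            "end": end.isoformat(),
        },
    }


def is_allowed(policy, principal, action, at):
    """Local sanity check, not an enforcement engine: is `principal`
    granted `action` on the dataset at time `at`?"""
    window = policy["time_restrictions"]
    start = datetime.fromisoformat(window["start"])
    end = datetime.fromisoformat(window["end"])
    in_window = start <= at < end
    return in_window and principal in policy["actions"].get(action, [])


# Illustrative values only; the dates are not taken from the sample.
policy = build_policy(
    name="My Databricks Dataset",
    readers=["group: Databricks"],
    writers=[],
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end=datetime(2024, 7, 1, tzinfo=timezone.utc),
)

# Serialize to JSON, the format the policy is written in.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Keeping the policy in a plain data structure like this means it can be stored in version control, reviewed, and tested like any other code, which is the core idea behind policy-as-code.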

Immuta’s policy-as-code feature provides Databricks with a comprehensive and industry-tested security model, allowing enterprises to ensure that confidential data is properly managed and secured. Organizations can swiftly deploy centralized policies for data science and analytics projects. Users can create data sources, policies, projects, and purposes, and specify the endpoints, methods, query parameters, and payloads involved. The feature can also be used to audit data access for compliance and to create data governance policies without writing code.