Managing Databricks Secrets on Windows

Databricks provides secret management, which lets users store sensitive data such as JDBC connection details or Snowflake credentials. Rather than keeping credentials in plaintext, Databricks stores them for you in an internal database, and you refer to each value by its scope and key.
While the official documentation lists the commands, this post is specifically about setting things up on Windows.

Most of the work is done from the Command Prompt of your Windows system (Windows 10 is used here) rather than the Databricks UI. However, you also need access to the UI, as well as a premium account.

Installing Python on Windows

Download the installer from python.org. While running it, make sure to check the ‘Add Python to PATH’ option. With default settings, this step is a breeze.

In Command Prompt, a couple of simple commands verify the installation.

C:\>python --version
Python 3.10.0

C:\>pip --version
21.2.3

Setting up authentication to Databricks from Command Prompt

Using an access token is the most straightforward way to access your Databricks account from Command Prompt. To set that up, generate a token from the ‘User Settings -> Access Tokens’ section of the UI.

Then install databricks-cli and configure the access token. The commands below also test the connection, first with a wrong token to show the error case and then with a valid one.

C:\>pip install databricks-cli

Collecting databricks-cli
Downloading databricks-cli-0.16.2.tar.gz (58 kB)
Downloading click-8.0.3-py3-none-any.whl (97 kB)            
Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB)
Downloading tabulate-0.8.9-py3-none-any.whl (25 kB)
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Downloading colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Downloading urllib3-1.26.7-py2.py3-none-any.whl (138 kB)
Downloading idna-3.3-py3-none-any.whl (61 kB)
Downloading charset_normalizer-2.0.7-py3-none-any.whl (38 kB)
Downloading certifi-2021.10.8-py2.py3-none-any.whl (149 kB)

Installing collected packages:...
Running setup.py install for databricks-cli … done
Successfully installed.

C:\>databricks configure --token

Databricks Host (should begin with https://):
https://piyushroutray.cloud.databricks.com
Token: wrong_token_provided

C:\>databricks clusters list
Error: b'<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>\n<title>Error 403 Invalid access token.</title>\n</head>\n<body><h2>HTTP ERROR 403</h2>\n<p>Problem accessing /api/2.0/clusters/list. Reason:\n<pre>    Invalid access token.</pre></p>\n</body>\n</html>\n'
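
For context, the 403 above comes from the REST endpoint named in the error message, /api/2.0/clusters/list. The Python sketch below shows roughly what such a call looks like. It assumes the configure step wrote the host and token to a .databrickscfg file in your user profile folder under a [DEFAULT] section, which is how the legacy databricks-cli is understood to behave (verify on your machine). The function names read_profile, clusters_list_request and list_clusters are made up for this illustration.

```python
import configparser
import json
import urllib.error
import urllib.request


def read_profile(cfg_path, profile="DEFAULT"):
    """Return (host, token) from a .databrickscfg-style INI file.

    Assumption: the file has `host` and `token` entries in the given section.
    """
    parser = configparser.ConfigParser()
    parser.read(cfg_path)
    section = parser[profile]
    # Drop a trailing slash so the URL below is not built with "//".
    return section["host"].rstrip("/"), section["token"]


def clusters_list_request(host, token):
    """Build (but do not send) a GET request for the clusters list endpoint."""
    return urllib.request.Request(
        f"{host}/api/2.0/clusters/list",
        headers={"Authorization": f"Bearer {token}"},
    )


def list_clusters(host, token):
    """Send the request; turn an HTTP error such as 403 into a readable message."""
    try:
        with urllib.request.urlopen(clusters_list_request(host, token)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        raise RuntimeError(f"Databricks rejected the request: HTTP {err.code}") from err
```

With a wrong token, list_clusters would be expected to raise the RuntimeError; with a valid one it should return the parsed JSON response. To try it, pass the path of your own .databrickscfg file to read_profile and feed the result to list_clusters.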

C:\>databricks configure --token

Databricks Host (should begin with https://):
https://piyushroutray.cloud.databricks.com
Token: N0tTh3@ctu@lT0K3N

C:\>databricks clusters list
1234-567890-qwerty  job-123-run-123   RUNNING
1234-567890-asdfgh  job-123-run-4567  RUNNING
1234-567890-zxcvbn  job-089-run-3038  TERMINATED
1234-567890-yuiop[  job-089-run-1111  TERMINATED

Creating new Scope and storing Secrets!

The official documentation lists the commands and should be consulted for the current syntax.
Here are some of the commands used to manage scopes and secrets in our case.

C:\>databricks secrets create-scope --scope tableau
C:\>databricks secrets list-scopes
Scope      Backend     KeyVault URL
---------  ----------  --------------
demo       DATABRICKS  N/A
tableau    DATABRICKS  N/A

C:\>databricks secrets delete-scope --scope demo

C:\>databricks secrets put --scope tableau --key tableau_pd

C:\>databricks secrets list --scope tableau
Key name        Last updated
------------  --------------
tableau_pd     1637122617123

When you execute the put command shown above (the one with --key tableau_pd), a new Notepad window opens in which you enter and save the actual password!
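
If the Notepad step is inconvenient, for example in a script, the same put command can be given the value directly. The sketch below assumes the --string-value option of the legacy databricks-cli (check databricks secrets put --help for your version); put_secret_command and put_secret are illustrative names, not part of the CLI. A value passed on the command line may be visible to other processes on the machine, so treat this as a convenience for test setups.

```python
import getpass
import subprocess


def put_secret_command(scope, key, value):
    """Return the argument list for a non-interactive `databricks secrets put`.

    Assumption: the installed CLI accepts --string-value.
    """
    return [
        "databricks", "secrets", "put",
        "--scope", scope,
        "--key", key,
        "--string-value", value,
    ]


def put_secret(scope, key):
    """Prompt for the value without echoing it, then run the CLI."""
    value = getpass.getpass(f"Value for {scope}/{key}: ")
    # check=True raises CalledProcessError if the CLI exits with an error.
    subprocess.run(put_secret_command(scope, key, value), check=True)
```

Calling put_secret("tableau", "tableau_pd") would prompt for the password in the console instead of opening Notepad.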

In the case above, ‘tableau_pd’ is the key name and NOT the actual password. This key is what you refer to when using the credential in Databricks notebooks.
If you issue a write request with a key that already exists, the new value overwrites the existing value.

If any user (including the owner) tries to print the secret, it shows as [REDACTED].
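
In a notebook, the stored value is read through dbutils.secrets.get with the scope and key created above. A minimal sketch follows; the helper name jdbc_properties and the user name passed to it are hypothetical, and dbutils is taken as an argument only so the function can also be exercised outside a notebook.

```python
def jdbc_properties(dbutils, user, scope="tableau", key="tableau_pd"):
    """Build a JDBC-style properties dict whose password comes from a secret.

    `dbutils` is the object Databricks provides inside notebooks; the default
    scope and key are the ones created in this walkthrough.
    """
    password = dbutils.secrets.get(scope=scope, key=key)
    return {"user": user, "password": password}
```

Inside a notebook this could be called as jdbc_properties(dbutils, "some_user"). Printing the returned password there shows [REDACTED] rather than the value, as described above.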

Each user (principal) who needs the secret must also be granted the appropriate permission on the scope.

C:\>databricks secrets list-acls --scope tableau
Principal                   Permission
--------------------------  ------------
proutray@piyushroutray.com  MANAGE

C:\>databricks secrets put-acl --scope tableau --principal amohanty@piyushroutray.com --permission READ

C:\Users\proutray>databricks secrets list-acls --scope tableau
Principal                   Permission
--------------------------  ------------
proutray@piyushroutray.com  MANAGE
amohanty@piyushroutray.com  READ
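
To grant access to several users at once, the put-acl command shown above can be looped from Python. This is only a sketch: put_acl_command and grant_all are made-up helper names, and the set of permission levels (READ, WRITE, MANAGE) comes from the Databricks secret ACL documentation rather than from this walkthrough.

```python
import subprocess

# Permission levels for secret ACLs (assumption: unchanged in your workspace).
VALID_PERMISSIONS = ("READ", "WRITE", "MANAGE")


def put_acl_command(scope, principal, permission):
    """Return the argument list for `databricks secrets put-acl`."""
    if permission not in VALID_PERMISSIONS:
        raise ValueError(f"Unknown permission: {permission!r}")
    return [
        "databricks", "secrets", "put-acl",
        "--scope", scope,
        "--principal", principal,
        "--permission", permission,
    ]


def grant_all(scope, grants):
    """Run put-acl once per (principal, permission) pair in `grants`."""
    for principal, permission in grants.items():
        # check=True stops at the first grant the CLI rejects.
        subprocess.run(put_acl_command(scope, principal, permission), check=True)
```

For example, grant_all("tableau", {"amohanty@piyushroutray.com": "READ"}) would issue the same put-acl command as in the transcript above.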
