Getting started with CipherStash Encrypt
CipherStash Encrypt enables you to encrypt your data in-use by using searchable encryption. This tutorial walks you through how to get started with CipherStash Encrypt. You will:
- Install the prerequisites
- Define the data to encrypt
- Prepare for encryption
- Encrypt your data
- Query your encrypted data
At the end of the guide, you will have encrypted your data using searchable encryption.
Installing prerequisites
This guide assumes you have a CipherStash account, CipherStash Proxy running, the CipherStash CLI, a PostgreSQL database, and a table with data you want to encrypt. If you haven't already, follow the Getting started with CipherStash Proxy guide, and/or create a CipherStash account by visiting https://dashboard.cipherstash.com/.
PostgreSQL
CipherStash CLI
Defining the data to encrypt
Before you can encrypt your data, you need to define what data you want to encrypt. This could be a single column, multiple columns, or an entire table. For this guide, we'll encrypt a single column called email in a table called users.
A dataset holds configuration for one or more database tables that contain data to be encrypted. A client allows an application to programmatically access a dataset. A dataset can have many clients (for example, different applications working with the same data), but a client belongs to exactly one dataset.
Step 1: Define a dataset
We'll use the CipherStash CLI to define a dataset that will be used to encrypt and decrypt data. The dataset will define which database columns should be encrypted and how the data should be indexed. Make sure you are logged in to the CipherStash CLI before continuing.
stash login
Step 2: Create a dataset
Next, we need to create a dataset for tracking what data needs to be encrypted. To create our first dataset run the following command:
stash datasets create users --description "my_dataset"
The output will look like this:
Dataset created:
ID         : <a UUID style ID>
Name       : users
Description: my_dataset
Note down the dataset ID, as you'll need it in the next step. You can also set a local environment variable to make it easier to use the dataset ID in the next step:
export CS_DATASET_ID=<your dataset ID>
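Before moving on, it can help to confirm that the variable holds something ID-shaped. The sketch below assumes the "UUID style ID" is a standard 8-4-4-4-12 hexadecimal UUID; `is_uuid` is a hypothetical helper, not part of the CipherStash CLI.

```shell
# Hypothetical helper: succeeds if $1 looks like a UUID (8-4-4-4-12 hex digits).
# Assumption: dataset IDs follow the standard UUID layout.
is_uuid() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Warn early if the variable is unset or was pasted incorrectly.
if is_uuid "${CS_DATASET_ID:-}"; then
  echo "CS_DATASET_ID looks valid"
else
  echo "CS_DATASET_ID is unset or not UUID-shaped" >&2
fi
```

The same check can be reused for the client ID created in the next step.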
Workspace regions
If you are using a region other than the default, you will need to set the region in the CipherStash CLI. You can set the region by using the --vitur-host option in the CLI. See the Dataset reference for more information.
Step 3: Create a client
Clients are used to authenticate with the CipherStash API. We will use a client key to authenticate with the CipherStash API when we deploy CipherStash Proxy. To create a client, run the following command with the dataset ID from Step 2: Create a dataset (substitute your own dataset ID if you didn't set the environment variable):
stash clients create --dataset-id $CS_DATASET_ID "application_name"
The output will look like this:
Client created:
Client ID  : <a UUID style ID>
Name       : application_name
Description:
Dataset ID : <your provided dataset ID>

#################################################
#                                               #
#  Copy and store these credentials securely.   #
#                                               #
#  THIS IS THE LAST TIME YOU WILL SEE THE KEY.  #
#                                               #
#################################################

Client ID          : <a UUID style ID>

Client Key [hex]   : <a long hex string>
Note down the client key somewhere safe, like a password vault. You can also set local environment variables to make it easier to use the key info in the next steps:
export CS_CLIENT_ID=<your client ID>
export CS_CLIENT_KEY=<your client key>
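The remaining steps rely on all three variables being present in your shell. As a minimal sketch, a small guard can report which ones are missing before you run further commands. `require_env` is a hypothetical helper written for this guide, not a CipherStash command.

```shell
# Hypothetical helper: reports every named variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

if require_env CS_DATASET_ID CS_CLIENT_ID CS_CLIENT_KEY; then
  echo "all CipherStash variables are set"
fi
```

Only pass variable names you control to this helper, since it uses eval for the lookup.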
Step 4: Create the dataset configuration
A dataset is a collection of tables and fields that you want to encrypt. We want to encrypt the email field in the users table. Create a file called dataset.yml with the following configuration, or a configuration that fits your use case.
tables:
  - path: users
    fields:
      - name: email
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
You can view the full list of configuration options and descriptions in the reference section.
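To build some intuition for the match index settings, the sketch below lists the lowercase 3-character n-grams of a value, which is what an ngram tokenizer with token_length: 3 combined with a downcase filter describes. This is an illustration only and is not CipherStash's implementation; `ngrams` is a hypothetical helper.

```shell
# Illustration only: print the downcased n-grams of a value, one per line.
# $1 is the value, $2 is the token length (defaults to 3).
ngrams() {
  printf '%s\n' "$1" | awk -v n="${2:-3}" '{
    s = tolower($0)
    for (i = 1; i + n - 1 <= length(s); i++) print substr(s, i, n)
  }'
}

ngrams "Ada@x.io"
# prints: ada, da@, a@x, @x., x.i, .io (one per line)
```

Shorter token lengths produce more, less selective tokens; longer ones require longer search terms to produce any token at all.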
If you want help defining a dataset for your application, book a demo to connect with one of our Solutions Engineers.
Step 5: Upload the dataset to CipherStash
Now that you have a dataset configuration, you need to upload it to CipherStash using the CipherStash CLI. Run the following command, replacing $CS_CLIENT_ID and $CS_CLIENT_KEY with the client ID and client key from Step 3: Create a client if you didn't set the environment variables:
stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY
Adding PostgreSQL dependencies
CipherStash requires additional columns to be added to the database tables that you want to encrypt. These columns are used to store the encrypted data and the indexes that are used to search the encrypted data (searchable metadata).
Column | Type | Description |
---|---|---|
__email_encrypted | text | Encrypted source value for email |
__email_ore | public.ore_64_8_v1 | Encrypted ORE index for email |
__email_match | integer[] | Encrypted match index for email |
__email_unique | text | Encrypted unique index for email |
Postgres custom type
A custom type is used for the ORE index. The custom type is public.ore_64_8_v1. You can create this custom type by running this SQL.
Step 1: Add the searchable metadata columns
Once you have added the custom type to your database, you can add the columns to the users table or the table you want to encrypt. The following SQL command is an example of how to add the columns to the users table, but you will need to adjust the table name and column names to match your database schema for your use case.
ALTER TABLE users
  ADD COLUMN __email_encrypted text,
  ADD COLUMN __email_ore public.ore_64_8_v1,
  ADD COLUMN __email_match integer[],
  ADD COLUMN __email_unique text;
Based on your application architecture, you will either need to apply these changes manually or use a migration tool that is best suited for your application.
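If you need the same statement for other tables or columns, a small generator can save some typing. This sketch assumes the __<column>_<suffix> naming shown above and that all three index kinds (ore, match, unique) are configured for the column; `encrypted_columns_sql` is a hypothetical helper.

```shell
# Hypothetical helper: print the ALTER TABLE statement for one table and column.
# It does not quote identifiers, so only use it with trusted, simple names.
encrypted_columns_sql() {
  table=$1
  column=$2
  cat <<SQL
ALTER TABLE ${table}
  ADD COLUMN __${column}_encrypted text,
  ADD COLUMN __${column}_ore public.ore_64_8_v1,
  ADD COLUMN __${column}_match integer[],
  ADD COLUMN __${column}_unique text;
SQL
}

encrypted_columns_sql users email
```

The output can be pasted into a migration file or reviewed before you run it against the database.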
Step 2: Update the CipherStash Proxy configuration
You will need to have CipherStash Proxy running to encrypt and decrypt your data, so make sure you have followed the Getting started with CipherStash Proxy guide. You will need the client ID and client key from Step 3: Create a client to update the CipherStash Proxy configuration.
For this example, we will be modifying the configuration file rather than using environment variables:
Setting | Description | Default | Environment Variables |
---|---|---|---|
encryption.mode | Encryption mode; can be encrypted or passthrough | passthrough | CS_ENCRYPTION__MODE |
encryption.client_id | Client ID for encryption, required if mode is passthrough | None (Required) | CS_ENCRYPTION__CLIENT_ID |
encryption.client_key | Client key for encryption, required if mode is passthrough | None (Required) | CS_ENCRYPTION__CLIENT_KEY |
workspace_id = "workspace_id"
client_access_key = "client_access_key"

[encryption]
mode = "encrypted"
client_id = "client_id"
client_key = "client_key"

[database]
username = "username"
password = "password"
name = "database_name"
host = "localhost"
port = 5432
You can see all the configuration options for CipherStash Proxy in the CipherStash Proxy documentation.
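If you would rather not write credentials into the configuration file, the encryption settings can also be supplied through the environment variables listed in the settings table. The sketch below is the environment-variable equivalent of the [encryption] section; the fallback values are the same placeholders used in the file and must be replaced with your own.

```shell
# Environment-variable equivalent of the [encryption] section.
# Variable names come from the settings table; values fall back to
# placeholders when CS_CLIENT_ID / CS_CLIENT_KEY are not set in this shell.
export CS_ENCRYPTION__MODE="encrypted"
export CS_ENCRYPTION__CLIENT_ID="${CS_CLIENT_ID:-client_id}"
export CS_ENCRYPTION__CLIENT_KEY="${CS_CLIENT_KEY:-client_key}"
```

How these variables reach the proxy process depends on how you run it (for example, your container or service configuration).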
Step 3: Restart CipherStash Proxy
After updating the configuration file, you will need to restart CipherStash Proxy to apply the changes. Depending on how you have the proxy running, you will need to restart the Docker container or the service.
Encrypting your data
At this point, you have CipherStash Proxy running and configured to encrypt and decrypt data. Data sent to the database is encrypted and stored in the encrypted columns, and data retrieved from the database is decrypted by CipherStash Proxy.
Step 1: Encrypt the existing data
Before you can use CipherStash Proxy to encrypt and decrypt your data, you need to encrypt the existing data in the fields you have defined in the dataset configuration.
CipherStash Encryption Migrator is bundled into the cipherstash-proxy Docker image.
Assuming that your running container is called cipherstash-proxy and that the primary key of your table is id, to encrypt existing values in the email column in the users table, run:
docker exec cipherstash-proxy cipherstash-migrator --table users --columns email
To specify a different primary key use the --primary-key option.
See CipherStash Encryption Migrator for further details.
Output
There will be no output when calling docker exec. When executing via a container, the output of the cipherstash-migrator command stays inside the container runtime. If you view the output of the Docker container, you will see the output of the command, including any error messages.
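One common way to view a container's output is docker logs. The sketch below wraps it in a hypothetical helper, `migrator_logs`, and assumes the container is named cipherstash-proxy as in the migrator command; adjust the name and line count for your setup.

```shell
# Hypothetical helper: show the most recent output of the proxy container.
# $1 is the container name (default: cipherstash-proxy),
# $2 is the number of lines to show (default: 50).
migrator_logs() {
  docker logs --tail "${2:-50}" "${1:-cipherstash-proxy}"
}

# Example (requires Docker and a running container):
#   migrator_logs
#   migrator_logs my-proxy-container 200
```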
Step 2: Move to full encryption
Once you have encrypted the existing data, you can now move to full encryption. Full encryption mode means that all data will be encrypted in the database and all queries will be executed on the encrypted data rather than the plaintext data.
Step 2.1: Update your dataset configuration
Update your dataset configuration to change the mode of the fields you want to encrypt from plaintext-duplicate to encrypted.
tables:
  - path: users
    fields:
      - name: email
        in_place: false
        cast_type: utf8-str
        mode: encrypted
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
Now upload the updated configuration to CipherStash and restart your application:
stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY
Step 3: Drop the plaintext columns
Once you've updated the dataset configuration, you can modify your database to set all the plaintext columns to a string literal value like redacted. Execute this step with caution: it removes the plaintext data from your database, and it must be run directly on the database without being routed through CipherStash Proxy.
UPDATE users SET email = 'redacted' WHERE email IS NOT NULL;
Data loss warning
This operation will remove the plaintext data from your database. Ensure you have a backup of your data before executing this command.
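One way to take that backup is pg_dump, limited to the table you are about to redact. The sketch below is a minimal example with a hypothetical helper, `backup_table`. It assumes your usual libpq connection settings (PGHOST, PGPORT, PGUSER, and so on) point directly at PostgreSQL rather than at CipherStash Proxy.

```shell
# Hypothetical helper: dump a single table to a SQL file before redacting it.
# $1 is the table, $2 is the database name,
# $3 is the output file (default: <table>_backup.sql).
backup_table() {
  table=$1
  database=$2
  outfile=${3:-"${table}_backup.sql"}
  pg_dump --table="$table" --file="$outfile" "$database"
}

# Example (requires pg_dump and database access):
#   backup_table users database_name
```

The dump contains the plaintext values, so store it as carefully as you would the original data.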
Querying your encrypted data
Now that you have encrypted your data, you can query the encrypted data using any PostgreSQL client that is connected to the CipherStash Proxy. When you query the data, the CipherStash Proxy will decrypt the data before returning it to you.
Here is a psql example of how you can query the users table:
SELECT * FROM users;
When you run this query, you will see the decrypted data returned to you.
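To run such queries from a script, you can point psql at the proxy instead of at PostgreSQL directly. In this sketch, `query_via_proxy` is a hypothetical helper, and PROXY_HOST and PROXY_PORT are placeholder variables for wherever your CipherStash Proxy listens; this guide does not specify those values.

```shell
# Hypothetical helper: run one SQL statement through CipherStash Proxy.
# $1 is the database name, $2 is the SQL to run.
# PROXY_HOST / PROXY_PORT are assumptions; set them to your proxy's address.
query_via_proxy() {
  psql --host="${PROXY_HOST:-localhost}" \
       --port="${PROXY_PORT:?set PROXY_PORT to the proxy port}" \
       --dbname="$1" \
       --command="$2"
}

# Example (requires psql and a running proxy):
#   PROXY_PORT=<your proxy port> query_via_proxy database_name "SELECT * FROM users;"
```

Which filters and orderings work on an encrypted column depends on the indexes defined in your dataset configuration, so check the reference before relying on a particular query shape.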
Querying encrypted data
When you query the data, the CipherStash Proxy will decrypt the data before returning it to you and the data remains completely encrypted in the database.
That's it
You have now successfully encrypted your data using CipherStash Encrypt!