Getting started with CipherStash Encrypt

CipherStash Encrypt uses searchable encryption to keep your data encrypted while it is in use. This tutorial walks you through getting started with CipherStash Encrypt. You will:

  • Install the prerequisites
  • Define the data to encrypt
  • Prepare for encryption
  • Encrypt your data
  • Query your encrypted data

At the end of the guide, you will have encrypted your data using searchable encryption.

Installing prerequisites

This guide assumes you have a CipherStash account, a running CipherStash Proxy, the CipherStash CLI, a PostgreSQL database, and a table with data you want to encrypt. If you don't have these yet, follow the Getting started with CipherStash Proxy guide and create a CipherStash account at https://console.cipherstash.com/.

PostgreSQL Note

This guide assumes you have a PostgreSQL instance running locally on port 5432. If you want to use a hosted PostgreSQL service, we highly recommend using Supabase or AWS RDS.

Defining the data to encrypt

Before you can encrypt your data, you need to define what data you want to encrypt. This could be a single column, multiple columns, or an entire table. For this guide, we'll encrypt a single column called email in a table called users.

A dataset holds configuration for one or more database tables that contain data to be encrypted. A client allows an application to programmatically access a dataset. A dataset can have many clients (for example, different applications working with the same data), but a client belongs to exactly one dataset.

Step 1: Define a dataset

We'll use the CipherStash CLI to define a dataset that will be used to encrypt and decrypt data. The dataset will define which database columns should be encrypted and how the data should be indexed. Make sure you are logged in to the CipherStash CLI before continuing.

stash login

Step 2: Create a dataset

Next, create a dataset to track which data needs to be encrypted. To create your first dataset, run the following command:

stash datasets create users --description "my_dataset"

The output will look like this:

Dataset created:
ID         : <a UUID style ID>
Name       : users
Description: my_dataset

Note down the dataset ID, as you'll need it in the next step. You can also set a local environment variable to make it easier to use the dataset ID in the next step:

export CS_DATASET_ID=<your dataset ID>

Step 3: Create a client

Clients are used to authenticate with the CipherStash API. We will use a client key to authenticate with the CipherStash API when we deploy CipherStash Proxy. Create a client with the dataset ID from Step 2: Create a dataset (substitute your own dataset ID if you didn't set the environment variable):

stash clients create --dataset-id $CS_DATASET_ID "application_name"

The output will look like this:

Client created:
Client ID  : <a UUID style ID>
Name       : application_name
Description:
Dataset ID : <your provided dataset ID>

#################################################
#                                               #
#  Copy and store these credentials securely.   #
#                                               #
#  THIS IS THE LAST TIME YOU WILL SEE THE KEY.  #
#                                               #
#################################################

Client ID          : <a UUID style ID>

Client Key [hex]   : <a long hex string>

Note down the client key somewhere safe, like a password vault. You can also set local environment variables to make it easier to use the key info in the next steps:

export CS_CLIENT_ID=<your client ID>
export CS_CLIENT_KEY=<your client key>

Step 4: Create the dataset configuration

A dataset is a collection of tables and fields that you want to encrypt. We want to encrypt the email field in the users table. Create a file called dataset.yml with the following configuration or the configuration that fits your use case.

tables:
  - path: users
    fields:
      - name: email
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique

You can view the full list of configuration options and descriptions in the reference section.
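
To get a feel for what the match index settings mean, here is a small sketch of the terms produced by an ngram tokenizer with token_length 3 and a downcase filter. It is illustrative only and assumes a plain sliding window over the input; the helper name trigrams is hypothetical, this is not CipherStash's implementation, and the k, m, and include_original options are not modelled.

```shell
# Illustrative only: lowercase the input (the `downcase` filter) and print
# every 3-character sliding window (an `ngram` tokenizer, token_length 3).
trigrams() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' |
    awk '{ for (i = 1; i + 2 <= length($0); i++) print substr($0, i, 3) }'
}

# Prints one trigram per line for a sample value.
trigrams "Ada@x.io"
```

Each printed line is one term that a substring search could be matched against, which is why short search strings (fewer than three characters here) produce no terms.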

If you want help defining a dataset for your application, book a demo to connect with one of our Solutions Engineers.

Step 5: Upload the dataset to CipherStash

Now that you have a dataset configuration, upload it to CipherStash using the CipherStash CLI. Run the following command. If you didn't set the environment variables, replace $CS_CLIENT_ID and $CS_CLIENT_KEY with the client ID and client key from Step 3: Create a client:

stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY
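
If you script this step, a small wrapper can catch a missing file or missing credentials before the CLI is called. The following is a minimal sketch; the function name upload_dataset_config is hypothetical, and the stash command inside it is the same one shown in this step.

```shell
# Hypothetical wrapper around the upload command from this step.
# Fails early if the file is missing or the credentials are not exported.
upload_dataset_config() {
  file=${1:-dataset.yml}
  if [ ! -f "$file" ]; then
    echo "Dataset file not found: $file" >&2
    return 1
  fi
  if [ -z "${CS_CLIENT_ID:-}" ] || [ -z "${CS_CLIENT_KEY:-}" ]; then
    echo "CS_CLIENT_ID and CS_CLIENT_KEY must both be set" >&2
    return 1
  fi
  stash datasets config upload --file "$file" \
    --client-id "$CS_CLIENT_ID" --client-key "$CS_CLIENT_KEY"
}
```

Call it as upload_dataset_config dataset.yml. The wrapper never prints the client key, so it is safe to use in logged CI output.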

Adding PostgreSQL dependencies

CipherStash requires additional columns to be added to the database tables that you want to encrypt. These columns are used to store the encrypted data and the indexes that are used to search the encrypted data (searchable metadata).

| Column | Type | Description |
| --- | --- | --- |
| __email_encrypted | text | Encrypted source value for email |
| __email_ore | public.ore_64_8_v1 | Encrypted ORE index for email |
| __email_match | integer[] | Encrypted match index for email |
| __email_unique | text | Encrypted unique index for email |

Postgres custom type

A custom type is used for the ORE index. The custom type is public.ore_64_8_v1. You can create this custom type by running the SQL command defined in this GitHub Gist.

Step 1: Add the searchable metadata columns

Once you have added the custom type to your database, you can add the columns to the users table or the table you want to encrypt. The following SQL command is an example of how to add the columns to the users table, but you will need to adjust the table name and column names to match your database schema for your use case.

ALTER TABLE users
  ADD COLUMN __email_encrypted text,
  ADD COLUMN __email_ore public.ore_64_8_v1,
  ADD COLUMN __email_match integer[],
  ADD COLUMN __email_unique text;

Based on your application architecture, you will either need to apply these changes manually or use a migration tool that is best suited for your application.
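
If you need the same columns for other tables or fields, the statement can be generated rather than typed by hand. This is a sketch under assumptions: the helper name encrypt_columns_sql is hypothetical, it follows the __<column>_<suffix> naming shown above, and it assumes the field uses the same match, ore, and unique indexes as the email example. It does no quoting or escaping, so pass only trusted identifiers.

```shell
# Hypothetical helper: prints the ALTER TABLE statement for one table/column
# pair, using the column names and types from the email example.
encrypt_columns_sql() {
  table=$1
  column=$2
  cat <<SQL
ALTER TABLE ${table}
  ADD COLUMN __${column}_encrypted text,
  ADD COLUMN __${column}_ore public.ore_64_8_v1,
  ADD COLUMN __${column}_match integer[],
  ADD COLUMN __${column}_unique text;
SQL
}

# Reproduces the statement used in this guide.
encrypt_columns_sql users email
```

The output can be pasted into a migration file or reviewed before it is applied to the database.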

Step 2: Update the CipherStash Proxy configuration

You will need to have CipherStash Proxy running to encrypt and decrypt your data, so make sure you have followed the Getting started with CipherStash Proxy guide. You will need the client ID and client key from Step 3: Create a client to update the CipherStash Proxy configuration.

For this example, we will be modifying the configuration file rather than using environment variables:

| Setting | Description | Default | Environment Variable |
| --- | --- | --- | --- |
| encryption.mode | Encryption mode; can be encrypted or passthrough | passthrough | CS_ENCRYPTION__MODE |
| encryption.client_id | Client ID for encryption, required if mode is encrypted | None (Required) | CS_ENCRYPTION__CLIENT_ID |
| encryption.client_key | Client key for encryption, required if mode is encrypted | None (Required) | CS_ENCRYPTION__CLIENT_KEY |

username = "username"
password = "password"

workspace_id = "workspace_id"
client_access_key = "client_access_key"

[encryption]
mode = "encrypted"
client_id = "client_id"
client_key = "client_key"

[database]
name = "database_name"
host = "localhost"
port = 5432

You can see all the configuration options for CipherStash Proxy in the CipherStash Proxy documentation.
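
If you prefer environment variables over the configuration file, the three encryption settings from the table above can be exported instead. This sketch reuses the CS_CLIENT_ID and CS_CLIENT_KEY variables set earlier in this guide; how the variables reach the proxy process (for example, through your container or service definition) depends on your deployment and is not shown here.

```shell
# Environment-variable equivalent of the [encryption] section, using the
# variable names from the settings table. CS_CLIENT_ID and CS_CLIENT_KEY are
# the values exported after creating the client; they expand to empty strings
# if they were never set.
export CS_ENCRYPTION__MODE=encrypted
export CS_ENCRYPTION__CLIENT_ID="${CS_CLIENT_ID:-}"
export CS_ENCRYPTION__CLIENT_KEY="${CS_CLIENT_KEY:-}"
```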

Step 3: Restart CipherStash Proxy

After updating the configuration file, you will need to restart CipherStash Proxy to apply the changes. Depending on how you have the proxy running, you will need to restart the Docker container or the service.

Encrypting your data

At this point, you have CipherStash Proxy running and configured to encrypt and decrypt data. Every time data is sent to the database, it is encrypted and stored in the encrypted columns, and all data retrieved from the database is decrypted by CipherStash Proxy.

Step 1: Encrypt the existing data

Before you can use CipherStash Proxy to encrypt and decrypt your data, you need to encrypt the existing data in the fields you have defined in the dataset configuration. At the moment, the most efficient way to do this is to touch every row in the table to trigger the encryption process. This process will be dependent on your application architecture.

An example SQL statement to "touch" every row in the users table to trigger the encryption process is:

UPDATE users SET email = plain.email FROM users plain WHERE plain.id = users.id;

This will update every row in the users table with the same value that is already there, and it will trigger the encryption process to populate the encrypted columns.
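
Before moving on, it can be worth checking that no row was missed. The sketch below prints a query that counts rows which still have a plaintext value but no value in the encrypted column; a result of 0 suggests every row was touched. The helper name pending_rows_sql is hypothetical, and the query is meant to be run directly against PostgreSQL so that you see the raw columns.

```shell
# Hypothetical helper: prints a query counting rows that have plaintext
# in <column> but nothing yet in __<column>_encrypted.
pending_rows_sql() {
  table=$1
  column=$2
  printf 'SELECT count(*) FROM %s WHERE %s IS NOT NULL AND __%s_encrypted IS NULL;\n' \
    "$table" "$column" "$column"
}

# Prints the check for the users.email example from this guide.
pending_rows_sql users email
```

You can pipe the output into psql, for example pending_rows_sql users email | psql "$DATABASE_URL", where DATABASE_URL is an assumed variable holding a direct connection string to your database.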

Improvements coming soon

In the future, we will provide a more efficient way to encrypt existing data. If you'd like to know more about this, please reach out to our Solutions Engineers via our book a demo page.

Step 2: Move to full encryption

Once you have encrypted the existing data, you can now move to full encryption. Full encryption mode means that all data will be encrypted in the database and all queries will be executed on the encrypted data rather than the plaintext data.

Step 2.1: Update your dataset configuration

Update your dataset configuration to change the mode of the fields you want to encrypt from plaintext-duplicate to encrypted.

tables:
  - path: users
    fields:
      - name: email
        mode: encrypted
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique

Now upload the updated configuration to CipherStash and restart your application:

stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY

Step 3: Clear the plaintext columns

Once you've updated the dataset configuration, you can set all the plaintext columns to NULL. Execute this step with caution: it removes the plaintext data from your database, and it must be run directly against the database rather than through CipherStash Proxy.

UPDATE users SET email = NULL WHERE email IS NOT NULL;

Data loss warning

This operation will remove the plaintext data from your database. Ensure you have a backup of your data before executing this command.

Querying your encrypted data

Now that you have encrypted your data, you can query the encrypted data using any PostgreSQL client that is connected to the CipherStash Proxy. When you query the data, the CipherStash Proxy will decrypt the data before returning it to you.

Here is a psql example of how you can query the users table:

SELECT * FROM users;

When you run this query, you will see the decrypted data returned to you.
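
Beyond a plain SELECT, the three indexes configured for email are what make filtering and sorting on encrypted data possible. The queries below are a sketch of what that might look like. Which index serves which query is an assumption here (unique for exact matches, match for substring search, ore for ordering), the helper name example_queries and the sample values are hypothetical, and you should check the reference section for the operators your version supports.

```shell
# Hypothetical helper: prints three example queries against the users table.
# The index named in each comment is an assumption, not confirmed behaviour.
example_queries() {
  cat <<'SQL'
-- Exact lookup (assumed to use the unique index)
SELECT id, email FROM users WHERE email = 'ada@example.com';
-- Substring search (assumed to use the match index)
SELECT id, email FROM users WHERE email LIKE '%example%';
-- Sorting (assumed to use the ore index)
SELECT id, email FROM users ORDER BY email LIMIT 10;
SQL
}

example_queries
```

Pipe the output into a psql session that is connected to CipherStash Proxy, not directly to the database, so that the queries are run against the encrypted columns and the results are decrypted.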

Querying encrypted data

When you query the data, the CipherStash Proxy will decrypt the data before returning it to you and the data remains completely encrypted in the database.

That's it!

You have now successfully encrypted your data using CipherStash Encrypt!
