Getting started with CipherStash Encrypt

CipherStash Encrypt lets you encrypt your data in use with searchable encryption. This tutorial walks you through getting started with CipherStash Encrypt. You will:

  • Install the prerequisites
  • Define the data to encrypt
  • Prepare for encryption
  • Encrypt your data
  • Query your encrypted data

At the end of the guide, you will have encrypted your data using searchable encryption.

Installing prerequisites

This guide assumes you have a CipherStash account, CipherStash Proxy running, the CipherStash CLI, a PostgreSQL database, and a table with data you want to encrypt. If you haven't already, follow the Getting started with CipherStash Proxy guide, and/or create a CipherStash account by visiting https://dashboard.cipherstash.com/.

PostgreSQL Note

This guide assumes you have a PostgreSQL instance running locally on port 5432. If you want to use a hosted PostgreSQL service, we highly recommend using Supabase or AWS RDS.

Defining the data to encrypt

Before you can encrypt your data, you need to define what data you want to encrypt. This could be a single column, multiple columns, or an entire table. For this guide, we'll encrypt a single column called email in a table called users.

A dataset holds configuration for one or more database tables that contain data to be encrypted. A client allows an application to programmatically access a dataset. A dataset can have many clients (for example, different applications working with the same data), but a client belongs to exactly one dataset.

Step 1: Log in to the CipherStash CLI

We'll use the CipherStash CLI to define a dataset that will be used to encrypt and decrypt data. The dataset will define which database columns should be encrypted and how the data should be indexed. Make sure you are logged in to the CipherStash CLI before continuing.

stash login

Step 2: Create a dataset

Next, we need to create a dataset for tracking what data needs to be encrypted. To create our first dataset run the following command:

stash datasets create users --description "my_dataset"

The output will look like this:

Dataset created:
ID         : <a UUID style ID>
Name       : users
Description: my_dataset

Note down the dataset ID, as you'll need it in the next step. You can also set a local environment variable to make it easier to use the dataset ID in the next step:

export CS_DATASET_ID=<your dataset ID>

Workspace regions

If you are using a region other than the default, you will need to set the region in the CipherStash CLI. You can set the region by using the --vitur-host option in the CLI. See the Dataset reference for more information.

Step 3: Create a client

Clients are used to authenticate with the CipherStash API. We will use a client key to authenticate with the CipherStash API when we deploy CipherStash Proxy. Create a client with the dataset ID from Step 2: Create a dataset (substitute your own dataset ID if you didn't set the environment variable):

stash clients create --dataset-id $CS_DATASET_ID "application_name"

The output will look like this:

Client created:
Client ID  : <a UUID style ID>
Name       : application_name
Description:
Dataset ID : <your provided dataset ID>

#################################################
#                                               #
#  Copy and store these credentials securely.   #
#                                               #
#  THIS IS THE LAST TIME YOU WILL SEE THE KEY.  #
#                                               #
#################################################

Client ID          : <a UUID style ID>

Client Key [hex]   : <a long hex string>

Note down the client key somewhere safe, like a password vault. You can also set local environment variables to make it easier to use the key info in the next steps:

export CS_CLIENT_ID=<your client ID>
export CS_CLIENT_KEY=<your client key>
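
The later steps rely on the variables exported so far, so a quick check that they are all set can save a confusing CLI error. The following is a minimal sketch, not part of the CipherStash tooling; the helper name missing_variables is hypothetical, and the variable names are the ones used in this guide.

```python
import os

# Variable names exported earlier in this guide.
REQUIRED = ("CS_DATASET_ID", "CS_CLIENT_ID", "CS_CLIENT_KEY")


def missing_variables(env=None, required=REQUIRED):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"CS_DATASET_ID": "example-dataset-id"}
print(missing_variables(example_env))  # ['CS_CLIENT_ID', 'CS_CLIENT_KEY']
```

Calling missing_variables() with no arguments checks the environment of the current process.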

Step 4: Create the dataset configuration

A dataset is a collection of tables and fields that you want to encrypt. We want to encrypt the email field in the users table. Create a file called dataset.yml with the following configuration or the configuration that fits your use case.

tables:
  - path: users
    fields:
      - name: email
        in_place: false
        cast_type: utf8-str
        mode: plaintext-duplicate
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique

You can view the full list of configuration options and descriptions in the reference section.
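
To build some intuition for the match index settings above, here is an illustrative sketch of a trigram tokenizer, a downcase filter, and a Bloom-filter-style index. It is not CipherStash's implementation. It assumes that k is the number of bit positions set per term, that m is the filter size in bits, and that include_original adds the whole value as one extra term. The HMAC-based hashing and the demo key are stand-ins chosen for the example.

```python
import hashlib
import hmac


def ngrams(text, n=3):
    """Return the overlapping n-character substrings of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def match_terms(value, token_length=3, include_original=True):
    """Apply the downcase filter, then split the value into n-grams."""
    lowered = value.lower()
    terms = set(ngrams(lowered, token_length))
    if include_original:
        terms.add(lowered)  # assumption: the whole value is kept as one extra term
    return terms


def bit_positions(term, key, k=6, m=2048):
    """Map one term to at most k positions in an m-bit filter using a keyed hash."""
    positions = set()
    for i in range(k):
        digest = hmac.new(key, f"{i}:{term}".encode("utf-8"), hashlib.sha256).digest()
        positions.add(int.from_bytes(digest[:8], "big") % m)
    return positions


def match_index(value, key, include_original=True):
    """Return the sorted bit positions set for every term of the value."""
    bits = set()
    for term in match_terms(value, include_original=include_original):
        bits |= bit_positions(term, key)
    return sorted(bits)


def may_contain(index, query, key):
    """True if every bit set by the query's n-grams is also set in the index."""
    return set(match_index(query, key, include_original=False)) <= set(index)


DEMO_KEY = b"not-a-real-key"  # hypothetical key, for illustration only
stored = match_index("Alice@Example.com", DEMO_KEY)
print(may_contain(stored, "EXAMPLE", DEMO_KEY))  # True: every trigram of "example" occurs in the value
```

A Bloom filter can report false positives but never false negatives, which is why may_contain is phrased as "may". Larger m and suitable k reduce the false positive rate at the cost of a bigger index.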

If you want help defining a dataset for your application, book a demo to connect with one of our Solutions Engineers.

Step 5: Upload the dataset to CipherStash

Now that you have a dataset configuration, upload it to CipherStash with the CipherStash CLI. Run the following command, replacing $CS_CLIENT_ID and $CS_CLIENT_KEY with the client ID and client key from Step 3: Create a client if you didn't set the environment variables:

stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY

Adding PostgreSQL dependencies

CipherStash requires additional columns to be added to the database tables that you want to encrypt. These columns are used to store the encrypted data and the indexes that are used to search the encrypted data (searchable metadata).

| Column | Type | Description |
| --- | --- | --- |
| __email_encrypted | text | Encrypted source value for email |
| __email_ore | public.ore_64_8_v1 | Encrypted ORE index for email |
| __email_match | integer[] | Encrypted match index for email |
| __email_unique | text | Encrypted unique index for email |

Postgres custom type

A custom type is used for the ORE index. The custom type is public.ore_64_8_v1. You can create this custom type by running this SQL.

Step 1: Add the searchable metadata columns

Once you have added the custom type to your database, you can add the columns to the users table or the table you want to encrypt. The following SQL command is an example of how to add the columns to the users table, but you will need to adjust the table name and column names to match your database schema for your use case.

ALTER TABLE users
  ADD COLUMN __email_encrypted text,
  ADD COLUMN __email_ore public.ore_64_8_v1,
  ADD COLUMN __email_match integer[],
  ADD COLUMN __email_unique text;

Based on your application architecture, you will either need to apply these changes manually or use a migration tool that is best suited for your application.
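
If you need the same statement for several columns, it can be generated rather than written by hand. The sketch below assumes the naming pattern used in this guide (a double-underscore prefix, the source column name, and a suffix for each of the four columns); the helper name add_columns_sql is hypothetical, and the output should be reviewed before it goes into a migration.

```python
import re

# Suffix and SQL type for each searchable metadata column used in this guide.
INDEX_COLUMNS = [
    ("encrypted", "text"),
    ("ore", "public.ore_64_8_v1"),
    ("match", "integer[]"),
    ("unique", "text"),
]

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def add_columns_sql(table, column):
    """Build the ALTER TABLE statement for one source column."""
    for name in (table, column):
        # Reject anything that is not a plain identifier to avoid SQL injection.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain SQL identifier: {name!r}")
    clauses = [
        f"  ADD COLUMN __{column}_{suffix} {sql_type}"
        for suffix, sql_type in INDEX_COLUMNS
    ]
    return f"ALTER TABLE {table}\n" + ",\n".join(clauses) + ";"


print(add_columns_sql("users", "email"))
```

For users and email this prints the same statement shown above.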

Step 2: Update the CipherStash Proxy configuration

You will need to have CipherStash Proxy running to encrypt and decrypt your data, so make sure you have followed the Getting started with CipherStash Proxy guide. You will need the client ID and client key from Step 3: Create a client to update the CipherStash Proxy configuration.

For this example, we will be modifying the configuration file rather than using environment variables:

| Setting | Description | Default | Environment Variable |
| --- | --- | --- | --- |
| encryption.mode | Encryption mode; can be encrypted or passthrough | passthrough | CS_ENCRYPTION__MODE |
| encryption.client_id | Client ID for encryption, required if mode is encrypted | None (Required) | CS_ENCRYPTION__CLIENT_ID |
| encryption.client_key | Client key for encryption, required if mode is encrypted | None (Required) | CS_ENCRYPTION__CLIENT_KEY |

workspace_id = "workspace_id"
client_access_key = "client_access_key"

[encryption]
mode = "encrypted"
client_id = "client_id"
client_key = "client_key"

[database]
username = "username"
password = "password"
name = "database_name"
host = "localhost"
port = 5432

You can see all the configuration options for CipherStash Proxy in the CipherStash Proxy documentation.
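
If you prefer environment variables over the configuration file, the three encryption settings in the table above appear to follow a simple naming pattern: a CS_ prefix, upper case, and a double underscore in place of the dot. The sketch below encodes that inferred pattern. It is only verified against those three rows, so check the CipherStash Proxy documentation before relying on it for other settings. Both helper names are hypothetical.

```python
def env_var_name(setting):
    """Convert a setting path such as 'encryption.mode' to its variable name."""
    return "CS_" + setting.upper().replace(".", "__")


def encryption_env(mode, client_id, client_key):
    """Return the environment variables for the [encryption] settings."""
    values = {
        "encryption.mode": mode,
        "encryption.client_id": client_id,
        "encryption.client_key": client_key,
    }
    return {env_var_name(setting): value for setting, value in values.items()}


print(env_var_name("encryption.client_id"))  # CS_ENCRYPTION__CLIENT_ID
```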

Step 3: Restart CipherStash Proxy

After updating the configuration file, you will need to restart CipherStash Proxy to apply the changes. Depending on how you have the proxy running, you will need to restart the Docker container or the service.

Encrypting your data

At this point, you have CipherStash Proxy running and configured to encrypt and decrypt data. Every time data is sent to the database, it will be encrypted and stored in the encrypted columns and all data retrieved from the database will be decrypted in the CipherStash Proxy.

Step 1: Encrypt the existing data

Before you can use CipherStash Proxy to encrypt and decrypt your data, you need to encrypt the existing data in the fields you have defined in the dataset configuration.

CipherStash Encryption Migrator is bundled into the cipherstash-proxy Docker image.

Assuming that your running container is called cipherstash-proxy and that the primary key of your table is id, to encrypt existing values in the email column in the users table, run:

docker exec cipherstash-proxy cipherstash-migrator --table users --columns email

To specify a different primary key use the --primary-key option.

See CipherStash Encryption Migrator for further details.
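
When the migration is part of a deployment script, the command can be assembled programmatically. This is a minimal sketch that only uses the flags named in this guide. It assumes that --primary-key takes the column name as its next argument, and the helper name migrator_command is hypothetical.

```python
import shlex


def migrator_command(table, column, primary_key=None, container="cipherstash-proxy"):
    """Build the docker exec argument list for the migrator."""
    argv = [
        "docker", "exec", container,
        "cipherstash-migrator",
        "--table", table,
        "--columns", column,
    ]
    if primary_key is not None:
        # Assumption: the option is followed by the primary key column name.
        argv += ["--primary-key", primary_key]
    return argv


print(shlex.join(migrator_command("users", "email")))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with unusual table or column names.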

Output

There will be no output when calling docker exec. When executing via a container, the output of the cipherstash-migrator command stays inside the container runtime. If you view the output of the Docker container, you will see the output of the command, including any error messages.

Step 2: Move to full encryption

Once you have encrypted the existing data, you can now move to full encryption. Full encryption mode means that all data will be encrypted in the database and all queries will be executed on the encrypted data rather than the plaintext data.

Step 2.1: Update your dataset configuration

Update your dataset configuration to change the mode of the fields you want to encrypt from plaintext-duplicate to encrypted.

tables:
  - path: users
    fields:
      - name: email
        in_place: false
        cast_type: utf8-str
        mode: encrypted
        indexes:
          - version: 1
            kind: match
            tokenizer:
              kind: ngram
              token_length: 3
            token_filters:
              - kind: downcase
            k: 6
            m: 2048
            include_original: true
          - version: 1
            kind: ore
          - version: 1
            kind: unique
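
If your configuration covers many fields, the mode change can be applied with a small script instead of by hand. The sketch below works on the file's text with a regular expression, so it needs no YAML library. It assumes each mode setting sits on its own line, as in the example above, and the helper name switch_to_encrypted is hypothetical.

```python
import re

# Matches a whole "mode: plaintext-duplicate" line, keeping its indentation.
_MODE_LINE = re.compile(
    r"^([ \t]*mode:[ \t]*)plaintext-duplicate[ \t]*$", re.MULTILINE
)


def switch_to_encrypted(config_text):
    """Return (new_text, count) with each plaintext-duplicate mode set to encrypted."""
    return _MODE_LINE.subn(r"\1encrypted", config_text)


sample = (
    "tables:\n"
    "  - path: users\n"
    "    fields:\n"
    "      - name: email\n"
    "        mode: plaintext-duplicate\n"
)
updated, changed = switch_to_encrypted(sample)
print(changed)  # 1
```

Review the resulting file before uploading it, since the change applies to every field that was in plaintext-duplicate mode.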

Now upload the updated configuration to CipherStash and restart your application:

stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY

Step 3: Drop the plaintext columns

Once you've updated the dataset configuration, you can overwrite the values in the plaintext columns with a string literal such as redacted. Run this step with caution: it removes the plaintext data from your database, and it must be executed directly against the database rather than routed through CipherStash Proxy.

UPDATE users SET email = 'redacted' WHERE email IS NOT NULL;

Data loss warning

This operation will remove the plaintext data from your database. Ensure you have a backup of your data before executing this command.
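
In addition to taking a backup, it may be worth confirming that no row still has a plaintext email without an encrypted counterpart before you run the UPDATE. The sketch below demonstrates one such check. It uses an in-memory SQLite database purely as a stand-in so the example is self-contained; against your real database you would run the same SELECT with your PostgreSQL client. It assumes a non-null __email_encrypted value means the row was migrated.

```python
import sqlite3

# Counts rows that still hold plaintext but have no encrypted value.
CHECK_SQL = """
SELECT count(*) FROM users
WHERE email IS NOT NULL AND __email_encrypted IS NULL
"""


def unencrypted_rows(conn):
    """Return the number of rows that would lose data if redacted now."""
    return conn.execute(CHECK_SQL).fetchone()[0]


# Stand-in database with one migrated row, one unmigrated row, one empty row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, __email_encrypted TEXT)"
)
conn.executemany(
    "INSERT INTO users (email, __email_encrypted) VALUES (?, ?)",
    [("a@example.com", "ciphertext-1"), ("b@example.com", None), (None, None)],
)
print(unencrypted_rows(conn))  # 1
```

A result other than 0 suggests the migration has not covered every row, and the redaction should wait.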

Querying your encrypted data

Now that you have encrypted your data, you can query the encrypted data using any PostgreSQL client that is connected to the CipherStash Proxy. When you query the data, the CipherStash Proxy will decrypt the data before returning it to you.

Here is a psql example of how you can query the users table:

SELECT * FROM users;

When you run this query, you will see the decrypted data returned to you.

Querying encrypted data

When you query the data, the CipherStash Proxy will decrypt the data before returning it to you and the data remains completely encrypted in the database.

That's it

You have now successfully encrypted your data using CipherStash Encrypt!
