Dataset configuration
A dataset is a collection of tables and fields that you want to encrypt. The configuration includes:
- The types of indexes set for each column in the table
- The mode for each index
- The data type
- Settings for match indexes, e.g. tokenization settings
We suggest creating a separate dataset for each environment in which you handle sensitive data. This allows the dataset configuration to be updated and tested in one environment without affecting the others. When creating the dataset, specify a clear and unique description that identifies what the dataset is used for.
Dataset management
Creating a dataset
To create a dataset, run the following command in the CipherStash CLI. If you don't have the CLI installed, please follow the getting started guide.
stash datasets create my_dataset_name --description "Test application"
Uploading a dataset
Use the CipherStash CLI to upload a dataset configuration to your account. Note that you must create a client key before you can upload a dataset configuration.
stash datasets config upload --file dataset.yml --client-id $CS_CLIENT_ID --client-key $CS_CLIENT_KEY
Setting the Workspace region
You may have multiple workspaces in different regions. To set the region for the workspace you are working in, use the following option in the CLI.
--vitur-host https://us-east-1.aws.viturhosted.net/
Where us-east-1 is the region you are working in. You can find the region you are working in by checking the workspace overview in the CipherStash dashboard.
Configuration reference
Option | Description | Example Setting |
---|---|---|
tables | List of tables to encrypt | users |
tables.path | Name of the table to encrypt | users |
tables.fields | List of fields to encrypt | name, email |
tables.fields.name | Name of the field to encrypt | name, email |
tables.fields.in_place | Whether encrypted data is stored in the same column as plaintext | false |
tables.fields.cast_type | Type of data stored in the column | utf8-str |
tables.fields.mode | Encryption mode | plaintext |
tables.fields.indexes | List of indexes to create for the field | |
tables.fields.indexes.version | Version of the index | |
tables.fields.indexes.kind | Type of index | match |
tables.fields.indexes.tokenizer | Tokenizer used to tokenize the data | ngram |
tables.fields.indexes.tokenizer.kind | Type of tokenizer | ngram |
tables.fields.indexes.tokenizer.token_length | Length of the tokens generated by the tokenizer | 3 |
tables.fields.indexes.token_filters | List of filters applied to the tokens | downcase, stop |
tables.fields.indexes.token_filters.kind | Type of filter | |
tables.fields.indexes.k | Number of tokens generated for each value | 6 |
tables.fields.indexes.m | Number of buckets used to store tokens | 2048 |
tables.fields.indexes.include_original | Whether the original value is stored in the index | true |
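To show how the options above might fit together, here is a minimal sketch of a dataset.yml file. The nesting is inferred from the dotted option names in the table and is an assumption, not a confirmed schema. The values are taken from the Example Setting column, except for the index version, for which the table gives no example; the value shown is a placeholder.

```yaml
# Sketch of a dataset configuration; the layout is inferred from the
# option names in the table above and may differ from the real schema.
tables:
  - path: users                 # name of the table to encrypt
    fields:
      - name: email             # name of the field to encrypt
        in_place: false         # encrypted data is not stored in the plaintext column
        cast_type: utf8-str     # type of data stored in the column
        mode: plaintext         # encryption mode
        indexes:
          - version: 1          # placeholder; the table gives no example value
            kind: match         # type of index
            tokenizer:
              kind: ngram       # type of tokenizer
              token_length: 3   # length of the generated tokens
            token_filters:
              - kind: downcase  # filter applied to the tokens
            k: 6                # number of tokens generated for each value
            m: 2048             # number of buckets used to store tokens
            include_original: true  # store the original value in the index
```

A file like this would then be passed to the upload command through the --file option.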