# Dataset config
A dataset is a collection of tables and fields that you want to encrypt. The configuration includes:
- The types of indexes configured for each column in the table
- The encryption mode for each field
- The data type of each field
- Settings for match indexes, e.g. tokenization settings.
We suggest creating a separate dataset for each environment in which you handle sensitive data. For details on how to do this, see Configuring datasets.
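As a rough illustration of the structure described above, here is a minimal sketch of a dataset configuration. The YAML layout, nesting, and the `users`/`email` names are assumptions made for illustration only; check Configuring datasets for the exact file format and syntax.

```yaml
# Minimal dataset configuration sketch (YAML layout and nesting are assumed for illustration)
tables:
  - path: users              # table to encrypt
    fields:
      - name: email          # field to encrypt
        in_place: false      # store ciphertext in a separate column, not in the plaintext column
        cast_type: utf8-str  # type of data stored in the column
        mode: plaintext      # encryption mode for this field
```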
| Option | Description | Example Setting |
|---|---|---|
| tables | List of tables to encrypt | users |
| tables.path | Name of the table to encrypt | users |
| tables.fields | List of fields to encrypt | name, email |
| tables.fields.name | Name of the field to encrypt | email |
| tables.fields.in_place | Whether encrypted data is stored in the same column as the plaintext | false |
| tables.fields.cast_type | Type of data stored in the column | utf8-str |
| tables.fields.mode | Encryption mode | plaintext |
| tables.fields.indexes | List of indexes to create for the field | |
| tables.fields.indexes.version | Version of the index | |
| tables.fields.indexes.kind | Type of index | match |
| tables.fields.indexes.tokenizer | Tokenizer used to split values into tokens | ngram |
| tables.fields.indexes.tokenizer.kind | Type of tokenizer | ngram |
| tables.fields.indexes.tokenizer.token_length | Length of the tokens generated by the tokenizer | 3 |
| tables.fields.indexes.token_filters | List of filters applied to the tokens | downcase, stop |
| tables.fields.indexes.token_filters.kind | Type of filter | |
| tables.fields.indexes.k | Number of tokens generated for each value | 6 |
| tables.fields.indexes.m | Number of buckets used to store tokens | 2048 |
| tables.fields.indexes.include_original | Whether the original value is stored in the index | true |
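To show how the index options in the table combine, the sketch below extends the earlier example with a match index on the `email` field. The nesting is assumed, and the specific values simply reuse the example settings from the table; treat this as an illustrative sketch rather than a reference configuration.

```yaml
# Sketch of a field with a match index (values mirror the example settings above; layout is assumed)
tables:
  - path: users
    fields:
      - name: email
        in_place: false
        cast_type: utf8-str
        mode: plaintext
        indexes:
          - version: 1               # index version (illustrative value)
            kind: match              # match index
            tokenizer:
              kind: ngram            # ngram tokenizer
              token_length: 3        # 3-character tokens
            token_filters:
              - kind: downcase       # normalize tokens to lower case
            k: 6                     # number of tokens generated for each value
            m: 2048                  # number of buckets used to store tokens
            include_original: true   # store the original value in the index
```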