Dataset config

A dataset is a collection of tables and fields that you want to encrypt. The configuration includes:

  • The types of indexes set for each column in the table
  • The mode for each index
  • The data type of each column
  • Settings for match indexes, e.g. tokenization settings

We suggest creating a separate dataset for each environment in which you handle sensitive data. For details on how to do this, see Configuring datasets.

| Option | Description | Example setting |
|---|---|---|
| tables | List of tables to encrypt | users |
| tables.path | Name of the table to encrypt | users |
| tables.fields | List of fields to encrypt | name, email |
| tables.fields.name | Name of the field to encrypt | name, email |
| tables.fields.in_place | Whether encrypted data is stored in the same column as plaintext | false |
| tables.fields.cast_type | Type of data stored in the column | utf8-str |
| tables.fields.mode | Encryption mode | plaintext |
| tables.fields.indexes | List of indexes to create for the field | |
| tables.fields.indexes.version | Version of the index | |
| tables.fields.indexes.kind | Type of index | match |
| tables.fields.indexes.tokenizer | Tokenizer used to tokenize the data | ngram |
| tables.fields.indexes.tokenizer.kind | Type of tokenizer | ngram |
| tables.fields.indexes.tokenizer.token_length | Length of the tokens generated by the tokenizer | 3 |
| tables.fields.indexes.token_filters | List of filters applied to the tokens | downcase, stop |
| tables.fields.indexes.token_filters.kind | Type of filter | |
| tables.fields.indexes.k | Number of tokens generated for each value | 6 |
| tables.fields.indexes.m | Number of buckets used to store tokens | 2048 |
| tables.fields.indexes.include_original | Whether the original value is stored in the index | true |
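
To show how the dotted option names might nest, here is a minimal sketch that writes the example settings from the table as a Python dictionary. The nesting is an assumption inferred from the option paths, not the product's actual file format, and `version` is left out because the table gives no example value for it. The `ngram_tokens` helper is hypothetical and only illustrates what an ngram tokenizer with `token_length` 3 and a `downcase` filter would produce. It is not the library's implementation.

```python
# Illustrative only: the nesting below is inferred from the dotted option
# names in the table and is not a confirmed configuration format.
dataset_config = {
    "tables": [
        {
            "path": "users",
            "fields": [
                {
                    "name": "email",
                    "in_place": False,
                    "cast_type": "utf8-str",
                    "mode": "plaintext",
                    "indexes": [
                        {
                            # "version" omitted: the table gives no example value.
                            "kind": "match",
                            "tokenizer": {"kind": "ngram", "token_length": 3},
                            "token_filters": [{"kind": "downcase"}],
                            "k": 6,
                            "m": 2048,
                            "include_original": True,
                        }
                    ],
                }
            ],
        }
    ]
}


def ngram_tokens(value, index):
    """Hypothetical helper: lowercase the value if a downcase filter is
    configured, then split it into overlapping n-grams of token_length."""
    filter_kinds = [f["kind"] for f in index["token_filters"]]
    if "downcase" in filter_kinds:
        value = value.lower()
    n = index["tokenizer"]["token_length"]
    return [value[i:i + n] for i in range(len(value) - n + 1)]


index = dataset_config["tables"][0]["fields"][0]["indexes"][0]
print(ngram_tokens("Alice", index))  # ['ali', 'lic', 'ice']
```

With these settings, a search for "lic" or "LIC" could match the stored value "Alice", because both the stored value and the query would be reduced to the same lowercase three-character tokens.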