Troubleshooting
Diagnose and resolve issues with CipherStash Proxy, including ZeroKMS connection problems and slow statement performance.
Troubleshooting ZeroKMS connections
Recommended log settings
For a quick check on ZeroKMS latency and connection issues, set:
CS_LOG__ZEROKMS_LEVEL=debug
CS_LOG__ENCRYPT_LEVEL=debugFor a deeper investigation, also enable slow statement detection:
CS_LOG__ZEROKMS_LEVEL=trace
CS_LOG__SLOW_STATEMENTS=true
CS_LOG__SLOW_STATEMENT_MIN_DURATION_MS=500
CS_LOG__SLOW_DB_RESPONSE_MIN_DURATION_MS=50What to look for in logs
| Signal | Meaning |
|---|---|
Initializing ZeroKMS ScopedCipher (cache miss) | A network call to ZeroKMS is about to happen. Frequent occurrences indicate cache churn. |
Connected to ZeroKMS with high init_duration_ms | Slow cipher init. Healthy values are under 200ms. Values over 1s trigger a warning. |
Use cached ScopedCipher | Cache hit (fast path, no network call). |
ScopedCipher evicted from cache with cause: Size | Cache too small for workload. Increase cipher_cache_size. |
Error initializing ZeroKMS with high init_duration_ms | Network timeout to ZeroKMS. |
Error initializing ZeroKMS with low init_duration_ms | Credential or configuration error. |
Key metrics
Enable Prometheus with CS_PROMETHEUS__ENABLED=true and watch these metrics:
| Metric | Why |
|---|---|
cipherstash_proxy_keyset_cipher_init_duration_seconds | Distribution of ZeroKMS init times including network latency. |
cipherstash_proxy_keyset_cipher_cache_hits_total / cache_miss_total | Cache hit ratio. Should be above 95% in steady state. |
cipherstash_proxy_statements_session_duration_seconds minus execution_duration_seconds | Encryption overhead per statement. |
cipherstash_proxy_encryption_error_total / decryption_error_total | Spikes indicate ZeroKMS connectivity issues. |
Useful PromQL queries
# P99 cipher init latency
histogram_quantile(0.99, rate(cipherstash_proxy_keyset_cipher_init_duration_seconds_bucket[5m]))
# Cache hit ratio
rate(cipherstash_proxy_keyset_cipher_cache_hits_total[5m])
/ (rate(cipherstash_proxy_keyset_cipher_cache_hits_total[5m])
+ rate(cipherstash_proxy_keyset_cipher_cache_miss_total[5m]))Quick checklist
- Cache hit ratio low? Tune
CS_SERVER__CIPHER_CACHE_SIZE(default: 64) andCS_SERVER__CIPHER_CACHE_TTL_SECONDS(default: 3600). init_duration_msabove 1s? Network latency to ZeroKMS. Check DNS, firewall rules, and regional proximity.- Large gap between session and execution duration? Overhead is in encrypt/decrypt, not the database.
- Frequent evictions? Increase
cipher_cache_sizeto match your workload's keyset count.
Slow statement logging
CipherStash Proxy includes built-in slow statement logging for diagnosing performance issues.
Configuration
Enable slow statement logging via environment variables:
# Enable slow statement logging (required)
CS_LOG__SLOW_STATEMENTS=true
# Optional: Set minimum duration threshold (default: 2000ms)
CS_LOG__SLOW_STATEMENT_MIN_DURATION_MS=500
# Optional: Set log level (default: warn when enabled)
CS_LOG__SLOW_STATEMENTS_LEVEL=warn
# Recommended: Use structured logging for parsing
CS_LOG__FORMAT=structuredSlow statement log format
When a statement exceeds the threshold, the proxy logs a detailed breakdown:
{
"client_id": 1,
"duration_ms": 10500,
"statement_type": "INSERT",
"protocol": "extended",
"encrypted": true,
"encrypted_values_count": 3,
"param_bytes": 1024,
"query_fingerprint": "a1b2c3d4",
"keyset_id": "uuid",
"mapping_disabled": false,
"breakdown": {
"parse_ms": 5,
"encrypt_ms": 450,
"server_write_ms": 12,
"server_wait_ms": 9800,
"server_response_ms": 233
}
}Query fingerprints
Query fingerprints are ephemeral and instance-local. Each proxy instance generates a unique random key at startup to compute query_fingerprint values. Fingerprints change when the proxy restarts and cannot be correlated across different proxy instances. This is intentional for security. Use fingerprints for correlation within a single proxy instance's runtime only.
Prometheus histogram labels
Duration histograms include labels for filtering:
statement_type:insert,update,delete,select,otherprotocol:simple,extendedmapped:true,falsemulti_statement:true,false
Example PromQL queries:
# Average INSERT duration
histogram_quantile(0.5, rate(cipherstash_proxy_statements_session_duration_seconds_bucket{statement_type="insert"}[5m]))
# Compare encrypted vs passthrough
histogram_quantile(0.99, rate(cipherstash_proxy_statements_session_duration_seconds_bucket{mapped="true"}[5m]))ZeroKMS cipher init metric
cipherstash_proxy_keyset_cipher_init_duration_seconds measures time for cipher initialization including the ZeroKMS network call. High values indicate ZeroKMS connectivity issues.
Interpreting results
| Symptom | Likely Cause |
|---|---|
High encrypt_ms | ZeroKMS latency or large payload. |
High server_wait_ms | Database latency. |
High cipher_init_duration | ZeroKMS cold start or network issues. |
High parse_ms | Complex SQL or schema lookup. |