AWS KMS for Envelope Encryption

April 24, 2021

I think many AWS users first encounter the KMS service as a way to encrypt other AWS resources. It has a really good usability story for things like block storage: here’s a managed encryption service where you enable a checkbox at instance launch and now your disk is encrypted. The resulting encrypted snapshots are no use if exfiltrated without access to the KMS key.

That’s why what I say next requires some nuance: encrypting AWS resources is Good Actually because of the resource-level protection it offers, but it does not (for example) magically encrypt columns in an RDS database; they are still in plaintext to anyone with access to the database. Encrypting the columns requires changes to the app that manages the database.

This is hard – but KMS makes it a little less hard! Beyond encrypting AWS resources for compliance wins, KMS offers an API for symmetrical keys that can transform plaintext into ciphertext and back again (it offers other things, including asymmetrical keys, but we’re going to focus on symmetrical encryption). Those encryption APIs are what this post will cover.

I’m going to walk through three ways to use KMS to accomplish this, each with trade-offs in terms of cost, simplicity, and latency. For the sake of brevity, all examples will use Python with zero error checking (but, y’know, definitely error check in production).


KMS Encrypt + Decrypt

KMS provides an encrypt and decrypt API for every symmetrical key, which is useful for handling small amounts of data. The official docs hint that you should maybe not use the encrypt/decrypt APIs directly outside of simple use cases, and at the end of this example we’ll see why.
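
To put a number on “small amounts of data”: as far as I know, the Encrypt API accepts at most 4096 bytes of plaintext for a symmetrical key. Here is a minimal sketch of a guard around that limit. The helper name `encrypt_small` and the constant are made up for illustration, and `kms_client` is assumed to be a boto3 KMS client like the ones created in the examples in this post.

```python
# Sketch only: `kms_client` is assumed to be a boto3 KMS client,
# e.g. boto3.client("kms", "ca-central-1").
KMS_MAX_PLAINTEXT_BYTES = 4096  # Encrypt API limit for symmetrical KMS keys


def encrypt_small(kms_client, key_id: str, plaintext: bytes) -> bytes:
    """Encrypt directly with KMS, refusing payloads the API would reject."""
    if len(plaintext) > KMS_MAX_PLAINTEXT_BYTES:
        raise ValueError(
            f"{len(plaintext)} bytes is too large for KMS Encrypt; "
            "use a data key instead"
        )
    response = kms_client.encrypt(KeyId=key_id, Plaintext=plaintext)
    return response["CiphertextBlob"]
```

Failing locally saves a round trip to KMS for something that was never going to work.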

Let’s assume you want to protect some PII columns in your users table:

postgres=> SELECT name, email FROM users;
     name     |       email
--------------+--------------------
 Bobby Tables | devnul@example.org
 Duke Caboom  | keanu@example.org
(2 rows)

Here’s how we would use the encrypt API to turn plaintext into ciphertext prior to storing it:

import base64
import boto3

# Get a KMS client
client = boto3.client("kms", "ca-central-1")

# Encrypt the plaintext
response = client.encrypt(KeyId="<your-kms-key-arn>", Plaintext=b"Bobby Tables")

# Encode into base64 for easy storage
encoded = base64.b64encode(response["CiphertextBlob"])

# Don't actually print it -- store it in the database!
print(encoded)
# Output:
b'YjExYzg2NjEtYzMwNS00MDJlLWJmMmEtYTRiOTg0OThiOGJiy29t+lPlRTdVYudcruNdAYlEh8gw/Ys/mRKdH7wNBf0ye4g+sZktHw=='

Note that we don’t have to base64 encode the resulting ciphertext, but it can make handling the ciphertext easier. Anyway, if you encrypt your values prior to storing them in the database, you’ll get tables that look like this:

postgres=> SELECT name, email FROM users;
     name     |       email
--------------+--------------------
 YjExYzg2N... | ZjgzZmQ0M2YtOGE...
 YTIzMzE2O... | NTQ1ZjM4ZGQtN2N...
(2 rows)

As stolen database dumps go, this one’s not too valuable.

There are some design trade-offs though. For example: your app would no longer be able to perform lookups by email address when the address is encrypted. Unlike hashing, KMS encryption produces different ciphertext each time you feed it the same plaintext. This is especially inconvenient if the app uses the email address as a login credential. If the app doesn’t need to look up by email address, great, no problem, but if it does, you would need to store a second version of the address, deterministically hashed with a secret salt that is the same for every row, so that your app can look that up.
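
Here is one way that lookup column could be sketched. I’m using a keyed hash (HMAC-SHA256) rather than a plain salted hash, since the “salt” has to be identical for every row and kept secret anyway. The function name, the `index_key` value, and the `email_token` column are all hypothetical, and lowercasing the address is an assumption about how your app treats emails.

```python
import hashlib
import hmac


def email_lookup_token(email: str, index_key: bytes) -> str:
    """Deterministic keyed hash of a normalized email address."""
    normalized = email.strip().lower()
    return hmac.new(index_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration; in practice load it from a secret store.
index_key = b"replace-with-a-random-32-byte-key"

token = email_lookup_token("Devnul@Example.org ", index_key)
# Store `token` in its own indexed column (say, email_token) and query on it:
#   SELECT ... FROM users WHERE email_token = %s
```

The same address always produces the same token, so the app can find the row, but the token is no use to anyone without `index_key`.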

When your app needs to decrypt a column, run the encryption steps in reverse:

import base64
import boto3

# Get a KMS client
client = boto3.client("kms", "ca-central-1")

# Decode the base64 to ciphertext
encrypted = base64.b64decode(b'YjExYzg2NjEtYzMwNS00MDJlLWJmMmEtYTRiOTg0OThiOGJiy29t+lPlRTdVYudcruNdAYlEh8gw/Ys/mRKdH7wNBf0ye4g+sZktHw==')

# Decrypt to plaintext
response = client.decrypt(CiphertextBlob=encrypted)

# Don't actually print it -- just do what the app needs to do and then throw it away!
print(response["Plaintext"])
# Output:
b'Bobby Tables'

Notice that for the decrypt call we didn’t need to provide a KMS key ID! That’s because KMS encodes the key ID into the ciphertext itself during encryption. AWS recommends specifying the KeyId during decrypt if you know it to make sure you use the key you intended to, but it’s not required. This is so that users can (in their own words) “decrypt ciphertext decades after it was encrypted, even if they’ve lost track of the CMK ID.”
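
If you do know the key, pinning it is one extra argument. A minimal sketch, assuming `kms_client` is a boto3 KMS client; the helper name `decrypt_with_key` is made up for this post. As I understand it, KMS rejects the call if the ciphertext was encrypted under a different key.

```python
def decrypt_with_key(kms_client, key_id: str, ciphertext_blob: bytes) -> bytes:
    """Decrypt, but only with the KMS key we expect to have been used."""
    response = kms_client.decrypt(KeyId=key_id, CiphertextBlob=ciphertext_blob)
    return response["Plaintext"]
```

This keeps your app from happily decrypting ciphertext that was produced under some other key it happens to have access to.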

Score card

Ok, that’s it for the simple example, let’s see how we did:

This workflow looks perfect for simple use cases like small, infrequently accessed secrets.

Let’s look next at envelope encryption.


Envelope Encryption

Let’s say we like the simplicity of KMS, but we want to reduce the number of trips we’re making to the KMS API. We could do that by using our own non-KMS private key to encrypt columns, and then protect that key with a single call to KMS. This is what’s known as envelope encryption.

Envelope encryption is the practice of encrypting plaintext data with a data key, and then encrypting the data key under another key. – Via KMS concepts

So you encrypt your plaintext with a run-of-the-mill private key (“data key” in KMS terms) and then protect that private key by encrypting it with KMS via its encrypt/decrypt APIs. KMS does not divulge its backing keys, so the plaintext data key only ever exists in memory when your app decrypts it with KMS. The KMS backing key can’t be stolen, which makes it a good top-level key.

You can choose to generate these data keys yourself and then make a call to KMS to encrypt it for you, but KMS has a convenience API to cover this for you. The generate_data_key KMS API will quickly generate a brand new data key in both a plaintext format (which you can use immediately to encrypt stuff!) and a second version of the data key already pre-encrypted by the KMS backing key.
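
For comparison, the do-it-yourself route might look roughly like this. It’s a sketch: `make_data_key` is a made-up helper name, and `kms_client` is assumed to be a boto3 KMS client.

```python
import secrets
from typing import Tuple


def make_data_key(kms_client, key_id: str) -> Tuple[bytes, bytes]:
    """Generate a 32-byte data key locally and have KMS encrypt it."""
    plain_key = secrets.token_bytes(32)
    response = kms_client.encrypt(KeyId=key_id, Plaintext=plain_key)
    # Same pair that generate_data_key hands back: plaintext + encrypted copy
    return plain_key, response["CiphertextBlob"]
```

generate_data_key gets you the same pair in a single call, with KMS generating the key material instead of your app.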

Here’s an example that uses a per-row data key:

import base64
import boto3
from cryptography.fernet import Fernet

# Get a KMS client
client = boto3.client("kms", "ca-central-1")

# Generate a new data key
response = client.generate_data_key(KeyId="<your-kms-key-arn>", KeySpec="AES_256")

# The response includes both plaintext and encrypted versions of the data key.
# Fernet expects its key as URL-safe base64, so encode the plaintext key that way.
plain_data_key = base64.urlsafe_b64encode(response["Plaintext"])
encrypted_data_key = base64.b64encode(response["CiphertextBlob"])

# Use the plaintext key to encrypt our data, and then throw it away
f = Fernet(plain_data_key)
crypted = f.encrypt(b"Bobby Tables")

# Don't print this -- store it!
print(crypted)
print(encrypted_data_key)
# Output:
b'gAAAAABggL2WGRqo7SUnsorKhlIRMOv10M09_YNj0v9tYTSKCtNTP7V8G6BMmma44_vtXKRGCTz5XdQZ4CUSWMJkVLsceF363Q=='
b'OWFhNjdhOTktM2I0Yy00MmIyLTllYjEtNzE5NjQwMjk3ZTM2asaLbT2LseatVNtdMi+EtBpijsIC834zTrdIIGem6eRtBpYp+cJQYQ=='

The idea here is to generate a data key for each new row, use the plaintext key to encrypt all of the columns for that row, and then store the encrypted columns alongside the KMS-encrypted data key. The tables should look something like this:

postgres=# SELECT name, email, data_key FROM users;
     name     |       email        |    data_key
--------------+--------------------+----------------
 gAAAAABgg... | gAAAAABggL_38lg... | OWFhNjdhOTk...
 gAAAAABgg... | gAAAAABggL-d_gH... | N2ZhZjhhZmY...
(2 rows)

The name and email are encrypted via the data key, and the data key is encrypted via KMS.
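
One detail worth spelling out: Fernet wants its key as 32 bytes of key material encoded with URL-safe base64, and an AES_256 data key from KMS is 32 raw bytes, which is why the key gets base64 encoded before being handed to Fernet. A small sketch of that conversion, with a made-up helper name:

```python
import base64


def to_fernet_key(raw_key: bytes) -> bytes:
    """Turn 32 raw key bytes into the URL-safe base64 form Fernet expects."""
    if len(raw_key) != 32:
        raise ValueError("Fernet needs exactly 32 bytes of key material")
    return base64.urlsafe_b64encode(raw_key)
```

A KeySpec of AES_128 would hand back 16 bytes, which Fernet would reject, so the length check is there to fail early and loudly.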

As before, decrypting the data means running everything in reverse:

import base64
import boto3
from cryptography.fernet import Fernet

# Get a KMS client
client = boto3.client("kms", "ca-central-1")

# Decrypt the data key from the data_key column
response = client.decrypt(CiphertextBlob=base64.b64decode(b'OWFhNjdhOTktM2I0Yy00MmIyLTllYjEtNzE5NjQwMjk3ZTM2asaLbT2LseatVNtdMi+EtBpijsIC834zTrdIIGem6eRtBpYp+cJQYQ=='))

# Re-encode the decrypted key as URL-safe base64, which is what Fernet expects
plain_data_key = base64.urlsafe_b64encode(response["Plaintext"])

# Use the plaintext key to decrypt the user name data
f = Fernet(plain_data_key)
plaintext = f.decrypt(b'gAAAAABggL2WGRqo7SUnsorKhlIRMOv10M09_YNj0v9tYTSKCtNTP7V8G6BMmma44_vtXKRGCTz5XdQZ4CUSWMJkVLsceF363Q==')

# Don't actually print it -- use it and then throw it away!
print(plaintext)
# Output:
b'Bobby Tables'

Score card

Envelope encryption is a bit more involved. We needed to get the encrypted data key from the database, decrypt it using KMS, and then use that decrypted key to decrypt the other columns. Let’s see how this stacks up to the first example:

We have a bit of additional complexity here but with lower cost and lower latency. Also to reiterate: every database row has its own data key. An attacker with a database dump must have access to the KMS key to work around this because there is no scenario where the KMS backing key can be exported to decrypt the data keys offline.

For the last example: what if we have an app moving to AWS that has been doing column encryption all along using its own private key? How can we start using KMS without a giant storage rewrite?


Envelope Encryption: Startup Edition

Apps that grow up through the twelve factor methodology usually store their configuration in environment variables. The Startup Edition of column encryption often involves a single static private key that was generated on someone’s laptop and then deployed as an environment variable when the app goes to prod.

Encryption and decryption will be handled something like this:

from cryptography.fernet import Fernet
import os

# Get encryption key from environment variable
secret_key = os.environ["SECRET_KEY"]
f = Fernet(secret_key)

# Encrypt and decrypt things with it
f.encrypt(...)
f.decrypt(...)

A lot of noise has been made about configuration in environment variables. Put that aside for a sec and pat yourself on the back for encrypting columns. That’s a win. It’s also a pretty hard sell to completely rewrite the way this app stores its data if there’s a lot of it (it’s doable, but that’s for another time). Even if you are going to commit to rewriting the data, you might want a temporary split-the-difference solution until you get there.

Envelope Encryption: Startup Edition to the rescue.

This temporary solution involves a couple manual steps, but consider that security improvements are often incremental, and it’s hard to draw the whole owl without a starting point. As a one-time action, take that private key and then push it through KMS:

import base64
import boto3
import os

# Get encryption key from environment variable
secret_key = os.environ["SECRET_KEY"]

# Get a KMS client
client = boto3.client("kms", "ca-central-1")

# Encrypt the secret key
response = client.encrypt(KeyId="<your-kms-key-arn>", Plaintext=secret_key.encode())

# Encode into base64 for easy storage
encoded = base64.b64encode(response["CiphertextBlob"])

# Output it somewhere safe, where it won't be logged, but only this one time!
print(encoded)
# Output:
b'N2RkMjE1ZmQtMmQ3MS00ZjkxLTgxYjgtMDRjNTFiMWI2ZTQ4gg5ItuHJgV38W7/mwtU72QbTa2/WJPH2kTaoy7ipEIvZ79Pp6BL3CBzjYmZURPfI4xlD/gGFqa3oPk66'

Tada.

Now you set the app’s SECRET_KEY environment variable to the encrypted blob that was output above. “Isn’t that a new key?” Only if you use it that way! The idea is that when your app starts up, it will use KMS to decrypt the SECRET_KEY ciphertext back into its original secret key which it can then use to carry on encrypting and decrypting database columns.

import base64
import boto3
from cryptography.fernet import Fernet
import os

# Get encrypted key from environment variable
encrypted_secret_key = os.environ["SECRET_KEY"]

# Get a KMS client
client = boto3.client("kms", "ca-central-1")

# Decode the base64 to ciphertext
encrypted = base64.b64decode(encrypted_secret_key.encode())

# Decrypt to plaintext
response = client.decrypt(CiphertextBlob=encrypted)

# Retrieve secret key, which can now be used the same as before
secret_key = response["Plaintext"]
f = Fernet(secret_key)

# Encrypt and decrypt things with it
f.encrypt(...)
f.decrypt(...)
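
Since the key never changes while the app is running, it only needs to be decrypted once per process. Here is a sketch of one way to arrange that, assuming `kms_client` is a boto3 KMS client; `make_key_loader` and `load_secret_key` are made-up names.

```python
import base64
import functools
import os


def make_key_loader(kms_client, env_var: str = "SECRET_KEY"):
    """Return a zero-argument function that decrypts the key once and caches it."""

    @functools.lru_cache(maxsize=1)
    def load_secret_key() -> bytes:
        # Only runs on the first call; later calls return the cached bytes
        blob = base64.b64decode(os.environ[env_var])
        return kms_client.decrypt(CiphertextBlob=blob)["Plaintext"]

    return load_secret_key
```

Call the returned function wherever the app needs the key. The trade-off is that the plaintext key stays in memory for the life of the process, which was already true of the original environment variable approach.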

This method used to be how Lambdas gained access to secrets, since Lambda configuration was typically done through environment variables. To add a secret environment variable, one would encrypt it with KMS first, and the Lambda would decrypt that secret during its cold start.

That was in the days before Encrypted SSM Parameters though. Encrypted SSM Parameters can be much simpler for storing one-off or static secrets, and the encryption is still handled by KMS, just transparently. The use case for static secrets in KMS is far different from database column encryption, but if you only have static secrets, it takes a lot less boilerplate to retrieve them:

import boto3
from cryptography.fernet import Fernet

# Get an SSM client
client = boto3.client("ssm", "ca-central-1")

# Get the parameter; KMS will decrypt it transparently
response = client.get_parameter(Name="/path/to/secret", WithDecryption=True)

# Retrieve secret key, which can now be used the same as before
secret_key = response["Parameter"]["Value"]
f = Fernet(secret_key)

# Encrypt and decrypt things with it
f.encrypt(...)
f.decrypt(...)

Score card

High marks for the developer experience and opex. The lower score for “Protection” reveals what a sham these score cards are, since it’s pretty contextual, and you are still encrypting your database columns at least.


Wrap-Up

I’ve been trying to write shorter posts so this write-up is a bit of a failure. KMS is complicated though, so for this post to be useful, it needs to fail at least a little bit.

As always there are a zillion gotchas around encryption and not enough words to cover them. In particular:

And probably more.

Happy encrypting, everyone.