It’s been a fun rabbit hole. Maybe I could have saved myself a lot of trouble by paying attention to this throw-away line about the JWT header emitted when using OIDC Authentication on an Application Load Balancer:
Standard libraries are not compatible with the padding that is included in the Application Load Balancer authentication token in JWT format.
JWTs are neat, and this blag post won’t exhaustively detail them. Instead, it is mostly for folks who kind of, sort of, mostly know what a JWT is, are getting weird verification errors with ALB JWTs, and want to know why.
In brief though, a common JWT format comprises a header, a payload, and a signature. Each is encoded as a base64url string, and the three are joined with periods. E.g.
header = {"alg":"ES256","typ":"JWT"}
payload = {"user":"wolfman","exp":1742765592}
header/encoded = eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9Cg==
payload/encoded = eyJ1c2VyIjoid29sZm1hbiIsImV4cCI6MTc0Mjc2NTU5Mn0K
signature = SnVzdCAzMiBjaGFyYWN0ZXJzLCBoYW5naW5nIG91dC4K
jwt = eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9Cg==.eyJ1c2VyIjoid29sZm1hbiIsImV4cCI6MTc0Mjc2NTU5Mn0K.SnVzdCAzMiBjaGFyYWN0ZXJzLCBoYW5naW5nIG91dC4K
This example is wrong though. The terminology section of the JWT RFC specifies that the trailing = characters will be omitted from the encoded components, whereas I’ve left the padding in the above example’s header. This JWT is technically not compliant and will cause fun problems depending on the strictness of the JWT library that tries to parse it.
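To make the padding difference concrete, here is a minimal Go sketch using only the standard library. The helper names encodeBoth and strictDecode are made up for illustration, and the header bytes include a trailing newline, which is an assumption about how the padded string in the example above came about (27 bytes of JSON alone would not need padding).

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeBoth returns the padded and the unpadded base64url form of raw.
// (Hypothetical helper, named for this sketch only.)
func encodeBoth(raw []byte) (padded, unpadded string) {
	return base64.URLEncoding.EncodeToString(raw),
		base64.RawURLEncoding.EncodeToString(raw)
}

// strictDecode decodes a segment the way a strict parser might:
// RawURLEncoding treats any '=' as an illegal character.
func strictDecode(segment string) ([]byte, error) {
	return base64.RawURLEncoding.DecodeString(segment)
}

func main() {
	// 28 bytes: not a multiple of three, so the padded encoder emits "==".
	header := []byte("{\"alg\":\"ES256\",\"typ\":\"JWT\"}\n")

	padded, unpadded := encodeBoth(header)
	fmt.Println("padded:  ", padded)
	fmt.Println("unpadded:", unpadded)

	_, err := strictDecode(padded)
	fmt.Println("strict decoder rejects padded segment:", err != nil)

	_, err = strictDecode(unpadded)
	fmt.Println("strict decoder rejects unpadded segment:", err != nil)
}
```

A lenient library would instead trim or tolerate the trailing = characters, which is where the trouble described later starts.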
AWS Application Load Balancers also emit non-spec-compliant JWTs because, as the throw-away line in the docs indicates, they too include padding in their format. This leads to an interesting (and also confusing) failure: a library successfully parses the JWT’s claims but fails to validate the token, even as the PyJWT library from the AWS example succeeds.
Ergonomics, Refactoring, and Surprises
The reason for this specific failure is not strictly due to how correctly a library implements the spec. It’s certainly not pettiness on the maintainer’s part. Rather, it has a lot to do with how code is organized for maintainability and simplicity. The spec plays a role all right, but more as a secondary effect of the code organization.
It might seem reasonable that when a library parses and verifies a JWT, it looks at the <header>.<payload>, compares it to the signature, and then goes about decoding the claims. That’s the thing: most of the time it’s really convenient to parse claims and perform validation as a single step. Something like:
token, err := jwt.Parse(data, pubkey)
You could split this into two separate actions, but you will almost always want your claims parsed if validation succeeds, and you will almost never want to bother with the claims if validation fails. It’s also convenient for this function to return an error if decoding fails for other reasons, or if the token is expired, or if you want to quickly reject a token lacking specific claims:
token, err := jwt.Parse(data, pubkey, jwt.WithExpectedClaim("group"))
Hold up. Expired? We’d have to extract the claims to know that. Is it more important to first extract the claims, or to verify the token? Trick question: it’s an implementation detail that mostly affects the maintainer’s life. The way the library is organized and how different parts interact contributes to the resulting order of operations. You imported the library because you didn’t want to re-invent this stuff, but someone has to!
And that’s how we arrive at the funny edge case of successfully parsing the claims of an Application Load Balancer’s JWT, while also failing to verify it. Consider:
- The library parses the claims first because the caller might be asking about specific claims
- Now that the payload has been successfully decoded, it throws away the base64 representation
- The internal model of the JWT is passed around to various other helper functions
- Eventually, the header and payload are re-encoded to base64 and sent for verification
Any padding that used to exist will have been stripped by the last step. The signature was generated for the padded version, so verification of the re-encoded version fails.
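The sequence above can be reproduced with a rough Go sketch that uses only the standard library. It is not how any particular JWT library is written: reencode, verify, and demo are hypothetical names, the key is generated on the spot, and the signature uses Go’s ASN.1 ECDSA encoding rather than the raw format a real ES256 token carries. The only point is to show the same signature passing against the original bytes and failing against the re-encoded ones.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strings"
)

// reencode mimics a lenient library: it tolerates '=' padding while decoding,
// keeps only the decoded bytes, and later rebuilds the segment without padding.
func reencode(segment string) (string, error) {
	raw, err := base64.RawURLEncoding.DecodeString(strings.TrimRight(segment, "="))
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// verify checks sig against the SHA-256 digest of "<header>.<payload>".
func verify(pub *ecdsa.PublicKey, signingInput string, sig []byte) bool {
	digest := sha256.Sum256([]byte(signingInput))
	return ecdsa.VerifyASN1(pub, digest[:], sig)
}

// demo signs padded segments (as the issuer would) and verifies twice:
// once against the original text, once against the re-encoded text.
func demo() (originalOK, reencodedOK bool, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, false, err
	}

	// The 35-byte payload is not a multiple of three, so it gains one '='.
	header := base64.URLEncoding.EncodeToString([]byte(`{"alg":"ES256","typ":"JWT"}`))
	payload := base64.URLEncoding.EncodeToString([]byte(`{"user":"wolfman","exp":1742765592}`))
	original := header + "." + payload

	digest := sha256.Sum256([]byte(original))
	sig, err := ecdsa.SignASN1(rand.Reader, key, digest[:])
	if err != nil {
		return false, false, err
	}

	h, err := reencode(header)
	if err != nil {
		return false, false, err
	}
	p, err := reencode(payload)
	if err != nil {
		return false, false, err
	}
	rebuilt := h + "." + p // padding is gone, so this differs from original

	return verify(&key.PublicKey, original, sig), verify(&key.PublicKey, rebuilt, sig), nil
}

func main() {
	originalOK, reencodedOK, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("verifies against original bytes:  ", originalOK)
	fmt.Println("verifies against re-encoded bytes:", reencodedOK)
}
```

In this sketch the claims decode fine either way; only the signing input changes, which is why parsing succeeds while verification does not.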
In Summary
AWS ALB JWTs are not spec-compliant, which causes the padding to be “lost” by JWT libraries that can successfully decode non-spec-compliant JWTs, but then correctly re-encode them prior to validation. The re-encoded header and payload are different from the original ones (the ones which were signed!) causing validation to fail.
I’m certainly not the first person to encounter this. I will certainly not be the last – and because of that, I’m eternally hopeful that this post will help someone. As for what to do about it, well, that’s tricky.
I’m currently noodling on a few different thoughts. I’ll get back to y’all.