It’s been a fun rabbit hole. Maybe I could have saved myself a lot of trouble by paying attention to this throw-away line about the JWT header emitted when using OIDC Authentication on an Application Load Balancer:

Standard libraries are not compatible with the padding that is included in the Application Load Balancer authentication token in JWT format.

JWTs are neat and this blag post won’t exhaustively detail them; instead, this post is mostly for folks who kind of sort of mostly know what a JWT is, and are getting weird verification errors with ALB JWTs and want to know why.

In brief though, a common JWT format consists of a header, a payload, and a signature. Each is encoded into a base64url string (base64 with a URL-safe alphabet), and the three are joined with periods. E.g.

header  = {"alg":"ES256","typ":"JWT"}
payload = {"user":"wolfman","exp":1742765592}

header/encoded  = "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9Cg=="
payload/encoded = "eyJ1c2VyIjoid29sZm1hbiIsImV4cCI6MTc0Mjc2NTU5Mn0K"
signature       = "SnVzdCAzMiBjaGFyYWN0ZXJzLCBoYW5naW5nIG91dC4K"

jwt = "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9Cg==.eyJ1c2VyIjoid29sZm1hbiIsImV4cCI6MTc0Mjc2NTU5Mn0K.SnVzdCAzMiBjaGFyYWN0ZXJzLCBoYW5naW5nIG91dC4K"

This example is wrong though. The terminology section of the JWT RFC (by way of the base64url definition it borrows from the JWS RFC) specifies that the trailing = characters are omitted from the encoded components, whereas I’ve left the padding in the above example’s header. This JWT is technically not compliant and will cause fun problems depending on the strictness of the JWT library that tries to parse it.
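To make the padding difference concrete, here is a minimal Go sketch that encodes the example header both ways using only the standard library. The helper name encodeSegment is mine, not something from any JWT library, and the trailing newline in the header is there only because the example above has one.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeSegment (a hypothetical helper) returns the padded and unpadded
// base64url forms of one JWT segment.
func encodeSegment(raw []byte) (padded, unpadded string) {
	padded = base64.URLEncoding.EncodeToString(raw)      // keeps trailing '='
	unpadded = base64.RawURLEncoding.EncodeToString(raw) // omits trailing '='
	return padded, unpadded
}

func main() {
	// The trailing newline mirrors the example above. Without it the header
	// is 27 bytes (a multiple of 3) and would need no padding at all.
	header := []byte("{\"alg\":\"ES256\",\"typ\":\"JWT\"}\n")

	padded, unpadded := encodeSegment(header)
	fmt.Println("padded:  ", padded)
	fmt.Println("unpadded:", unpadded)
}
```

The two strings decode to the same bytes, but they are not the same string, which matters a great deal once a signature is computed over one of them.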

AWS Application Load Balancers also emit non-spec-compliant JWTs because, as the throw-away line in the docs indicates, they too include padding in their format. This is where I encountered an interesting (and also confusing) failure: a library that successfully parsed a JWT’s claims, but then failed to validate it, even as a different library (PyJWT, from the AWS example) succeeded.

Ergonomics, Refactoring, and Surprises

The reason for this specific failure is not strictly due to how correctly a library implements the spec. Rather, it has a lot to do with how the code in these libraries is organized for maintainability and simplicity. The spec plays a role all right, but it’s more as a secondary effect here.

It might seem reasonable that when a library parses and verifies a JWT, it looks at the <header>.<payload>, compares it to the signature, and then goes about decoding the claims. That may be multiple steps inside the library, but for someone consuming it we’re almost always interested in both steps, so libraries make it convenient to do both in a single call. Something like:

token, err := jwt.Parse(data, pubkey)

Sure, you could split this into two separate actions, but you will almost always want your claims parsed if validation succeeds, and you will almost never want to bother with the claims if validation fails. If one of these two steps fails, the err can tell you it was either a decoding issue or a validation issue or something else. This is also safe because a consumer of the library has to go out of their way to successfully parse a JWT while failing to check its signature, validity, or expiration.

It’s also convenient for this function to return an error if decoding fails for other reasons, or if the token is expired, or if you want to quickly reject a token lacking specific claims:

token, err := jwt.Parse(data, pubkey, jwt.WithExpectedClaim("group"))

Back up a sec. “Expired?” We’d have to extract the claims to know that. Is it more important for us to extract the claims first, or verify the signature? Trick question: it’s an implementation detail that mostly affects the maintainer’s life. The way the library is organized and how different parts interact contributes to the resulting order of operations. You imported the library because you didn’t want to re-invent this stuff, but someone has to!
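As a rough illustration of why the claims have to be decoded before expiry can be checked, here is a small Go sketch that pulls exp out of the example token from earlier. The names claims, decodeClaims, and expiredAt are made up for this post, and the sketch deliberately skips signature verification, so it is not something to trust a token with.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// claims holds only the fields used in this sketch.
type claims struct {
	User string `json:"user"`
	Exp  int64  `json:"exp"`
}

// decodeClaims decodes the payload segment of a compact JWT WITHOUT
// verifying the signature. Padding is trimmed first so that padded and
// unpadded segments are both accepted.
func decodeClaims(token string) (claims, error) {
	var c claims
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return c, errors.New("expected three dot-separated segments")
	}
	raw, err := base64.RawURLEncoding.DecodeString(strings.TrimRight(parts[1], "="))
	if err != nil {
		return c, fmt.Errorf("decode payload: %w", err)
	}
	if err := json.Unmarshal(raw, &c); err != nil {
		return c, fmt.Errorf("parse payload: %w", err)
	}
	return c, nil
}

// expiredAt reports whether the exp claim is at or before the given Unix time.
func (c claims) expiredAt(nowUnix int64) bool {
	return nowUnix >= c.Exp
}

func main() {
	// The (non-compliant) example token from earlier in this post.
	token := "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9Cg==." +
		"eyJ1c2VyIjoid29sZm1hbiIsImV4cCI6MTc0Mjc2NTU5Mn0K." +
		"SnVzdCAzMiBjaGFyYWN0ZXJzLCBoYW5naW5nIG91dC4K"

	c, err := decodeClaims(token)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("user:", c.User, "expired:", c.expiredAt(time.Now().Unix()))
}
```

Note that by the time exp is available, the original base64 text of the payload is no longer needed for anything except the signature check.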

And that’s how we arrive at the funny edge case of successfully parsing the claims of an Application Load Balancer’s JWT, while also failing to verify it. Consider:

  1. The library parses the claims first because the caller might be asking about specific claims
  2. Now that the payload has been successfully decoded, it throws away the base64 representation
  3. The internal (decoded) model of the JWT is passed around to various other helper functions
  4. Eventually, the header and payload are re-encoded to base64 and sent for verification

Any padding that previously existed in step 1 will have been stripped by step 4. The signature was generated for the padded version, so when the JWT is re-encoded without padding, the signature check fails.
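The four steps above can be sketched in Go with nothing but the standard library. To keep it short this uses HMAC-SHA256 as a stand-in for the ES256 signatures that ALBs actually use, and every name in it (sign, verify, reencode, demo, the key, the kid value) is invented for illustration rather than taken from a real library.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strings"
)

// sign returns an HMAC-SHA256 over the signing input. This is a stand-in
// for ES256, chosen only to keep the sketch short.
func sign(signingInput string, key []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(signingInput))
	return mac.Sum(nil)
}

// verify checks sig against the given signing input.
func verify(signingInput string, sig, key []byte) bool {
	return hmac.Equal(sign(signingInput, key), sig)
}

// reencode decodes a segment (tolerating padding) and re-encodes it the
// spec-compliant way, without padding, as in steps 1 to 4.
func reencode(segment string) (string, error) {
	raw, err := base64.RawURLEncoding.DecodeString(strings.TrimRight(segment, "="))
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// demo signs a padded token, then verifies it against both the original
// text and the re-encoded text.
func demo() (originalOK, rebuiltOK bool, err error) {
	key := []byte("hypothetical-shared-secret")

	// The issuer signs the padded header and payload exactly as emitted.
	header := base64.URLEncoding.EncodeToString([]byte(`{"alg":"HS256","kid":"k1"}`))
	payload := base64.URLEncoding.EncodeToString([]byte(`{"user":"wolfman"}`))
	original := header + "." + payload
	sig := sign(original, key)

	h, err := reencode(header)
	if err != nil {
		return false, false, err
	}
	p, err := reencode(payload)
	if err != nil {
		return false, false, err
	}
	return verify(original, sig, key), verify(h+"."+p, sig, key), nil
}

func main() {
	originalOK, rebuiltOK, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("original text verifies:  ", originalOK)
	fmt.Println("re-encoded text verifies:", rebuiltOK)
}
```

The first check passes and the second fails, even though both strings decode to identical JSON. Verifying against the text exactly as it arrived avoids the problem, but whether a given library lets you do that depends on how it is organized internally.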

In Summary

AWS ALB JWTs are not spec-compliant. JWT libraries that can successfully decode non-spec-compliant JWTs, but then re-encode them to a different (spec-compliant) string prior to validation, “lose” the padding along the way. The re-encoded header and payload are different from the original ones (the ones which were signed!), causing validation to fail.

I’m certainly not the first person to encounter this. I will certainly not be the last, which is to say I hope this blag post will help the next person who encounters it. As for what to do, well, that’s tricksy.

I’m currently noodling on a few different thoughts. I’ll get back to y’all.

Update postscript!

My favourite library for this sort of jazz, jwx, now supports a WithBase64Encoder() option for handling ALB JWT shenanigans! More here and here.