Introduction

Page history last edited by Chris Messina 15 years, 10 months ago

I am not a very bright person. While trying to implement a Lua-based OpenID server, I found that the two primary references left me scratching my head. The flow described in the official specification was too high-level for me to get a clear understanding of the protocol in action. The OpenID specification at this site was too concise for me to glean a clear understanding of the protocol flow and the security implications. After finally finishing the first pass of my Lua OpenID server, I decided it would be wise to put down what I learned about the protocol on the way.

This page is targeted at those who want a full understanding of the protocol flow without having to cross-reference the official specification against an implementation to figure out what is actually going on.

Act I, Scene 1

To begin, let's assign some of our favorite cryptographic names to the [actors][] in the digital play that is OpenID. **Alice** will be our *End User*. She is your average user, browsing the web with her favorite browser, Operfox Explorer. **Operfox Explorer** is the *User-Agent*, the program that handles all of Alice's clicking, typing, and webby needs. Alice has just finished reading the latest entry on her uncle's blog and wants to leave a comment. Her uncle, **Bob**, has recently upgraded his site (Bob's Blog) to support OpenID. He paid a high-school student $20 to integrate an OpenID *Consumer* with his blog's comment system, so Bob's Blog now supports the use of OpenID. Luckily, Alice is already a registered user with **Carol**, her favorite blogging site. Carol is big into this whole OpenID thing, and she provides Alice with a URL, **http://carol.example.com/Alice**, that Alice can use as her OpenID *Identity*. Carol is a real tech-whiz, and she has set things up so that this identity URL can be verified through Carol's OpenID *Server*.

[actors]: http://www.openid.net/specs.bml#terms "OpenID Glossary"

Oh, one last thing. Alice has a twin sister, Eve. Eve is a real prankster, and she loves to play practical jokes on both Alice and Bob. Her favorite trick is to leave comments on Bob's Blog, pretending to be Alice, and getting them all into trouble. She likes to meddle in other ways too. Neither Alice nor Bob really enjoys this behavior, and both would love to be able to put a stop to it once and for all. The only reason Bob parted with a hefty $20 is because he read on the web that OpenID could keep Eve from playing her tricks. So we'll have to watch out for her too.

Let's recap the story so far:

- **Alice**: *End User*
- **Operfox Explorer**: *User-Agent*
- **Bob**: *Consumer*
- **Carol**: *Server*
- **http://carol.example.com/Alice**: *Identity*
- **Eve**: special guest-appearance as the *malicious attacker*

As the curtains draw back, Alice has just finished typing up her comment to Bob, and is found staring at a screen that is prompting her for an OpenID identity to authenticate at Bob's site.

What happens when Alice enters her OpenID identity? How does Bob really know it was Alice leaving the comment (and not Eve)? What did Carol have to do to make this all happen? Where can I download a copy of Operfox Explorer? All these questions and more will be answered next!

Who you callin' dumb?

The high-school kid that Bob paid to set up an OpenID consumer for him was pretty lazy. He told Bob it would take him a week to do, took the $10 down payment, and played Counter-Doom II for six days. On the last day he hurriedly threw together a stateless, form-based web page with no bells or whistles, grabbed his remaining $10, and went to buy burritos and a six-pack of Coca-Dew from the gas station down the street.

In his wake, he left Bob with a dumb consumer. No, seriously. His consumer operates in what the OpenID protocol calls "dumb mode". That is, his consumer is unable to track state. That's OK though, OpenID is down with that. The kid was so lazy, in fact, that he didn't even bother with JavaScript or anything fancy. He just threw together a few server-side CGIs and called it good. So not only was his consumer dumb, but it was old-school too.

This kid's laziness is our lucky break, however, as we can now watch what happens in this simplest of cases.

Down the Rabbit Hole

Alice, full of trepidation, types in her OpenID identity and clicks the "Authenticate" button. Her screen begins to swirl with the magical energies of the seventh planar domain of Cartesia, where Lord Bitrot holds sway over the lesser demons of...

Wait, that's not what happens. So what does happen? Well, Operfox Explorer processes Bob's comment form and submits it to one of the CGIs that the lazy kid wrote for the consumer. This CGI cleans up the identity URL that Alice provided, makes it canonical, and then runs off and fetches the document from that URL. Absolutely nothing special here. Just a regular old HTTP request of a plain HTML file. Now, roll up your sleeves... the magic starts here.

Once the consumer gets the HTML file from the identity URL provided, it scans it over, looking for a specific tag in the head section:

`<link rel="openid.server" href="http://carol.example.com/openid-server.cgi">`
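For the curious, here is a minimal sketch of that scanning step in Python, using only the standard library's `html.parser`. The class name, the function name, and the sample page are made up for illustration, and the sketch doesn't bother to check that the tag really sits in the head section, which a careful consumer probably should.

```python
from html.parser import HTMLParser


class OpenIDLinkParser(HTMLParser):
    """Remembers the href of the first <link rel="openid.server"> tag seen."""

    def __init__(self):
        super().__init__()
        self.server_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.server_url is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before matching
        rels = (attr_map.get("rel") or "").lower().split()
        if "openid.server" in rels:
            self.server_url = attr_map.get("href")


def find_openid_server(html_text):
    """Return the OpenID server URL named in the page, or None if absent."""
    parser = OpenIDLinkParser()
    parser.feed(html_text)
    return parser.server_url


# Hypothetical contents of the page at Alice's identity URL.
ALICE_PAGE = """<html><head>
<title>Alice</title>
<link rel="openid.server" href="http://carol.example.com/openid-server.cgi">
</head><body>Hi, I'm Alice.</body></html>"""

print(find_openid_server(ALICE_PAGE))
```

If the page has no such tag, the function returns `None`, and the consumer would presumably tell Alice that her URL isn't an OpenID identity.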

In Alice's case, the HTML file at her OpenID identity URL contains a link to the OpenID server at Carol's site. Now, as established, Bob's consumer isn't that bright, and it tends to avoid work like the plague. So now that it knows the URL of Carol's OpenID server, it's going to offload as much work as possible. Using a redirect, it sends Alice's Operfox Explorer browser straight over to Carol's OpenID server, providing some extra parameters along the way. In particular, Alice's browser gets redirected to http://carol.example.com/openid-server.cgi, with the following GET parameters:

- openid.mode = checkid_setup
 + This is one of the possible OpenID modes. This particular mode means that we want to check an identity, and 
    we're passing control of the User-Agent off to the server in order to do this. We expect the server to get 
    back to us once it's done.
- openid.identity = http://carol.example.com/Alice
 + This is the identity URL provided by Alice. It is what Bob wants to verify that she really owns, via Carol.
- openid.return_to = http://bob.example.com/comment.cgi?session_id=Alice&nonce=123456
 + This is where Bob wants Alice's browser to go after Carol gets done authenticating her. Bob expects Carol to 
   tack on some extra info to this URL to validate Alice's claim on this identity.
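As a rough sketch, the redirect URL with those three parameters could be assembled like this in Python. The helper name `build_checkid_url` is invented for this example; the parameter names and values are the ones listed above. The main thing to notice is that the return\_to URL has its own query string, so it has to be percent-encoded before it rides along as a parameter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_checkid_url(server_url, identity, return_to, mode="checkid_setup"):
    """Build the URL that Bob redirects Alice's User-Agent to."""
    params = {
        "openid.mode": mode,
        "openid.identity": identity,
        "openid.return_to": return_to,
    }
    # urlencode percent-encodes the nested return_to URL for us
    separator = "&" if "?" in server_url else "?"
    return server_url + separator + urlencode(params)


RETURN_TO = "http://bob.example.com/comment.cgi?session_id=Alice&nonce=123456"

redirect_url = build_checkid_url(
    "http://carol.example.com/openid-server.cgi",
    "http://carol.example.com/Alice",
    RETURN_TO,
)
print(redirect_url)

# Carol's side can recover the original values by parsing the query string.
received = parse_qs(urlsplit(redirect_url).query)
print(received["openid.return_to"][0])
```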

Eve Takes The Stage

OK, let's take a closer look at the openid.return\_to parameter. This is where Bob expects Alice to go once Carol has authenticated her. Somewhere in his blogging software, Bob is tying the session\_id of "Alice" to her pending comment, or whatever it is she's trying to do. We'll just ignore the fact that this is a horrible session identifier, and move on... we're not here to nitpick Bob's blogging software, after all. In addition to the return URL Bob already provided, Bob also expects Carol to tack on some extra info when all is said and done so that he can trust that it's really on the up-and-up.

But, we need more than just a simple session identifier. If Eve is on her game (and she is), all she would have to do to fake Alice's identity is to listen in on this entire conversation, record it, and then replay back the same sequence of messages later. Even if they're cryptographically signed and sealed and she can't see what's inside, she'll still be able to spoof Alice's identity. So, we need something extra to make sure Eve can't do this, and it's the extra nonce parameter. Bob's consumer puts a unique, random identifier in the return\_to URL for each authentication request he makes. Thanks to the nonce, this return\_to URL changes with every authentication request made by Bob, so Eve cannot just replay back a previous sequence of messages. As long as the cryptographic signing of further replies also signs this URL, Eve's replay of a previous session will fail because the nonce for the new authentication request will be different than the one used in the request she is trying to replay.
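Here is one way the nonce bookkeeping might look, sketched in Python. It assumes Bob keeps a small record of the nonces he has handed out, which is a touch more memory than a truly stateless consumer has, so treat the in-memory set and both function names as hypothetical. The idea is simply that each return\_to URL is good for one trip only.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical store of nonces Bob has issued and not yet seen come back.
outstanding_nonces = set()


def make_return_to(base_url, session_id):
    """Build a return_to URL carrying a fresh, unguessable nonce."""
    nonce = secrets.token_urlsafe(16)
    outstanding_nonces.add(nonce)
    return base_url + "?" + urlencode({"session_id": session_id, "nonce": nonce})


def consume_nonce(return_to):
    """True the first time a return_to URL comes back, False on any replay."""
    query = parse_qs(urlsplit(return_to).query)
    nonce = query.get("nonce", [None])[0]
    if nonce in outstanding_nonces:
        outstanding_nonces.remove(nonce)
        return True
    return False


url = make_return_to("http://bob.example.com/comment.cgi", "Alice")
first_try = consume_nonce(url)   # Alice's genuine return trip
replayed = consume_nonce(url)    # Eve replaying the same URL later
print(first_try, replayed)
```

Note that this only helps if the signature Bob eventually checks covers the return\_to URL; otherwise Eve could just swap in a fresh nonce of her own.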

In the words of Swiper the Fox... "awww man!"

Back to Carol

Now, back to Carol. She's got a job to do... Bob has asked her to validate that Alice really owns the URL she claims to own, and has sent Alice's User-Agent back to Carol so they can work it all out. So Carol now has control of Alice's User-Agent and must authenticate that it's really Alice in control. How does Carol do that?

We. Don't. Know. We also don't care. Alice and Carol just have to work that out themselves. Carol's got Alice's User-Agent, so she can do whatever she wants. Maybe Carol has a public-key crypto system installed with two-factor authentication and smart card readers and a DNA scanner. Or, maybe she's hooked her OpenID server to a monkey that randomly says "valid" or "invalid". It's all black-box to us. Not our problem.

Weird, isn't it? But that's part of the mission of OpenID: it's about *identity* and not *trust*. All we know is that, according to Carol, Alice is who she says she is. If we don't trust Carol, then we really can't trust what she has to say about Alice, can we? But that's OK. What we DO know is that Alice is the same Alice that Carol always says she is, assuming Carol is consistent about this kind of thing. Further, Alice can't pretend to be someone she isn't, at least not without Carol's help. And *even then*, she can only pretend to be someone that also claims Carol as their official identity server. If things ever got this bad, we'd probably just stop paying attention to Carol altogether, as well as anyone who tried to use her as their identity server.

So through carefully implemented user accounts, or a random roll of the dice, Carol decides that it's really Alice on the other end of the User-Agent. Now, Carol just has to convince Bob that it's really Alice on the other end, and also convince him that it's really Carol doing the convincing. Easy, right? The first step in this persuasive process is for Carol to send Alice's User-Agent back to Bob's return\_to URL, along with some extra parameters. So now Alice is getting bounced back to http://bob.example.com/comment.cgi?session\_id=Alice&nonce=123456, with the following extra GET parameters tacked on by Carol:

- openid.mode = id_res
 + This value indicates that Carol asserts that Alice really does own this identity. It could also be "cancel", 
   indicating that Alice decided she really didn't want to go through with it, but that's boring.
- openid.return_to = http://bob.example.com/comment.cgi?session_id=Alice&nonce=123456
 + The same URL that Bob sent to Carol for his return path, echoed right back to him.
- openid.identity = http://carol.example.com/Alice
 + Again, the identity URL that Alice is claiming to own and that Carol is asserting she does.
- openid.signed = mode,identity,return_to
 + This is the list of parameters that we are going to provide a signature for. Obviously we'll want to sign the 
   mode and identity that we're authenticating, but we also sign the return_to URL to prevent replay attacks, as 
   discussed before.
- openid.assoc_handle = *opaque handle*
 + We'll cover this in detail next.
- openid.sig = *base 64 encoded HMAC signature*
 + We'll cover this too.

Whoa. What kind of strange voodoo magic is Carol tacking on to this return URL? Most of the items are pretty obvious, but the two that we'll want to take a close look at are the assoc\_handle and the sig(nature). The first, the assoc\_handle, is merely defined in the OpenID spec as "an opaque handle". But an opaque handle to what? Well, it's basically a handle to a secret. A cryptographic secret. Carol needs to be able to take this opaque handle and look up, internally, what secret she used when signing this assertion for Bob. How she does this is entirely up to Carol, but the one thing she does have to differentiate between is normal assoc\_handles and stateless ones. We'll talk about why that is later, but for right now, we're only dealing with stateless ones (because Bob is a dumb, AKA stateless, consumer, remember?)

So Carol makes up this secret and ties it to the assoc\_handle, but what's the secret good for? Well, it's used to create the second item of interest, the signature. Carol slaps together the fields that she said she was going to sign (in our case the mode, identity, and return\_to parameters), and then runs the HMAC-SHA1 signing algorithm on it, using the secret we've tied to the assoc\_handle as the cryptographic key for HMAC. This generates a signature that she then encodes in base 64 to be sent as a parameter back to Bob's return\_to URL.
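To make the "slapping together" a little more concrete, here is a hedged Python sketch of the signing step. It assumes the signed fields are joined as one `key:value` line each, in the order given by openid.signed; check the specification for the exact token format before relying on this. The function name `sign_fields` and the sample secret are invented for the example.

```python
import base64
import hashlib
import hmac


def sign_fields(secret, signed, fields):
    """Return the base64-encoded HMAC-SHA1 signature over the listed fields."""
    # Assumed token layout: one "key:value" line per field named in `signed`
    token = "".join(f"{name}:{fields[name]}\n" for name in signed.split(","))
    digest = hmac.new(secret, token.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


secret = b"carols-made-up-secret"  # in real life this would be random bytes
fields = {
    "mode": "id_res",
    "identity": "http://carol.example.com/Alice",
    "return_to": "http://bob.example.com/comment.cgi?session_id=Alice&nonce=123456",
}
sig = sign_fields(secret, "mode,identity,return_to", fields)
print(sig)
```

Because return\_to is part of the signed token, changing the nonce inside it changes the signature, which is what makes the replay protection discussed earlier stick.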

Showtime Bob! Or is it?

Great! Now Bob has a signed assertion that Alice really is who she claims to be, and everything's good to go, right? Well there's still a few loose ends. First, Bob can't actually check the signature himself. He doesn't know the secret, after all. All he's got is this opaque handle that he can't really do anything with, and a signature that may or may not be valid. Second, all this came straight from Alice's User-Agent anyway, so really, the entire set of bits may just be made up. Bob really has no way of knowing at this point that Carol had anything to do with any of this. So what does he do?

Why, Bob offloads all the work back to Carol again, of course! I *told* you Bob's consumer was lazy. So the CGI that Alice bounced back to at Bob's site, http://bob.example.com/comment.cgi, is going to open up an HTTP session to Carol one last time before it trusts anything Alice has to say. This HTTP session is done server-side as part of the consumer script, so Alice's User-Agent has nothing to do with it. Bob's going to POST to http://carol.example.com/openid-server.cgi with the following parameters:

- openid.mode = check_authentication
 + This tells Carol we want to confirm what Alice said that Carol said.
- openid.signed = mode,identity,return_to
 + The list of items that Alice claims have been signed by the signature.
- openid.assoc_handle = *opaque handle*
 + This is the same handle that Alice's User-Agent provided to Bob's return_to URL.
- openid.sig = *base 64 encoded HMAC signature*
 + The signature Alice claims asserts her identity.
- openid.* = everything else
 + Everything that is in our list of signed items (minus the mode). In this case, we'll need to echo back the
   identity and return_to parameters that we've been bouncing around for a while now.

Once Carol gets this request, she's going to do all the work she already did once, all over again, just to please Bob. She'll tack the list of signed parameters together again, look up the secret that she tied to the assoc\_handle provided, and create an HMAC-SHA1 signature of the parameters using this secret as the cryptographic key, just like she did before. Then she'll compare this signature with the signature that Bob provided in his message. If they match, then Carol knows that it must have been her that sent the original assertion (or else, someone that knows her secrets), and she'll return a plain text file to Bob with her final answer: "is\_valid:true".

Of course if the signatures don't match (or if it is not a stateless assoc\_handle), then Carol will return to Bob with "is\_valid:false", and Bob will know that someone is trying to trick him. Maybe Eve, maybe Alice, or maybe that lazy kid who wrote his consumer. I mean, $20?! How cheap can you get?
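Carol's half of this exchange might look something like the following Python sketch. Everything here is an assumption made for illustration: the in-memory `stateless_secrets` table, the function names, and the `key:value` token layout. One detail worth noticing is that Bob's request arrives with mode set to check\_authentication, so Carol has to put "id\_res" back in the mode slot before recomputing the signature.

```python
import base64
import hashlib
import hmac
import secrets

# Hypothetical table on Carol's side: stateless assoc_handle -> secret.
stateless_secrets = {}


def sign_fields(secret, signed, fields):
    token = "".join(f"{name}:{fields[name]}\n" for name in signed.split(","))
    digest = hmac.new(secret, token.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def issue_assertion(identity, return_to):
    """Create the id_res parameters Carol tacks on to Bob's return_to URL."""
    handle = secrets.token_hex(8)
    secret = secrets.token_bytes(20)
    stateless_secrets[handle] = secret
    signed = "mode,identity,return_to"
    fields = {"mode": "id_res", "identity": identity, "return_to": return_to}
    return {
        "openid.mode": "id_res",
        "openid.identity": identity,
        "openid.return_to": return_to,
        "openid.signed": signed,
        "openid.assoc_handle": handle,
        "openid.sig": sign_fields(secret, signed, fields),
    }


def check_authentication(params):
    """Answer Bob's check_authentication POST with a one-line verdict."""
    secret = stateless_secrets.get(params.get("openid.assoc_handle"))
    if secret is None:
        return "is_valid:false\n"
    signed = params.get("openid.signed", "")
    fields = {}
    for name in signed.split(","):
        # The original assertion was signed with mode=id_res, not the
        # check_authentication mode that Bob's request carries.
        fields[name] = "id_res" if name == "mode" else params.get("openid." + name, "")
    expected = sign_fields(secret, signed, fields).encode("ascii")
    provided = params.get("openid.sig", "").encode("utf-8")
    valid = hmac.compare_digest(expected, provided)
    return "is_valid:true\n" if valid else "is_valid:false\n"


assertion = issue_assertion(
    "http://carol.example.com/Alice",
    "http://bob.example.com/comment.cgi?session_id=Alice&nonce=123456",
)
request = dict(assertion, **{"openid.mode": "check_authentication"})
print(check_authentication(request), end="")
```

Using `hmac.compare_digest` rather than `==` is a small design choice that avoids leaking, through timing, how much of a forged signature happened to match.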

Bob Cleans House With AJAX

Ah, [AJAX][]. Same old stuff, brand new name. Well, trendy catch-phrase or not, Bob was sitting at his favorite internet cafe thinking about how awesome his blog was and how happy he was now that Eve couldn't trick him anymore, when he overheard two web-developers at the next table talking about AJAX. Curious, he listened in more closely, and started to wonder if maybe his OpenID authentication couldn't be a little more spiffy. After all, his current implementation redirected the user agent in circles, and it was kind of annoying when all you wanted to do was sign a simple comment.

[AJAX]: http://en.wikipedia.org/wiki/AJAX "AJAX entry at Wikipedia"

Ever the opportunist, Bob offered the women $50 on the spot if they could somehow make his OpenID authentication use AJAX. Being cutting-edge developers, they of course knew all about OpenID and were quick to agree to the deal. A scant forty-three minutes later (ten of which were spent waiting for Bob to remember how to log in to his test site), they had made his blog commenting authentication as modern and seamless as he had ever hoped. So what mysterious secrets did these two women know that earned them a quick half-Benj?

Well, it was no dark arts that these two ladies applied. No, they were just down with the alternate OpenID protocol command of "checkid\_immediate". Just like its cousin, "checkid\_setup", this openid.mode is sent to Carol's OpenID server with all the trimmings in order to attempt to authenticate an OpenID identity. In fact, it's exactly like the other command and is sent at the same point in the flow as the other would be, with one behavioral difference. "checkid\_immediate", as the name implies, always returns immediately and does not take control of the user agent. If the server can authenticate on the spot then everything continues like normal, and the consumer must follow up with a "check\_authentication" command.

However, if Carol's OpenID server cannot make a positive assertion on the identity, then "checkid\_immediate" returns instantly with a failed assertion and a single "openid.user\_setup\_url" parameter instead of taking control of the user agent. At this point, the consumer can decide what to do. It may redirect the user agent to the "openid.user\_setup\_url" provided, or do this in a popup, or start playing a Beatles song, or whatever it wants. The key point, however, is that the consumer is in control of how the user agent should behave, which makes this command suitable for asynchronous usage in an HTTPRequest-style decoupled architecture.
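The consumer's decision at this point can be sketched in a few lines of Python. This assumes the reply reaches Bob's code as a URL with the OpenID parameters in its query string; the function name, the three outcome labels, and the sample setup URL are all made up for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def handle_immediate_reply(reply_url):
    """Decide what the consumer should do with a checkid_immediate reply."""
    query = parse_qs(urlsplit(reply_url).query)

    def first(key):
        return query.get(key, [None])[0]

    setup_url = first("openid.user_setup_url")
    if setup_url is not None:
        # Carol could not assert on the spot; the consumer chooses how to
        # send the user there (redirect, popup, or something else entirely).
        return ("needs_setup", setup_url)
    if first("openid.mode") == "id_res" and first("openid.sig"):
        # A signed assertion came back; it still has to be verified.
        return ("verify_signature", None)
    return ("failed", None)


BASE = "http://bob.example.com/comment.cgi?"
failed_reply = BASE + urlencode({
    "openid.mode": "id_res",
    "openid.user_setup_url": "http://carol.example.com/login?next=bob",
})
good_reply = BASE + urlencode({"openid.mode": "id_res", "openid.sig": "c2ln"})

print(handle_immediate_reply(failed_reply))
print(handle_immediate_reply(good_reply))
```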

Which is exactly what the two savvy ladies did to Bob's Blog. When a user attempts to authenticate their comment, instead of submitting the form directly to the OpenID server via the user agent, they use JavaScript to open an HTTPRequest to the OpenID server instead. And rather than send a "checkid\_setup" mode, which expects to be able to take control of the user agent, they send a "checkid\_immediate" mode instead. This allows them to pop up a new web window with the "openid.user\_setup\_url" that they (might) get in reply, where the user can finalize the necessary steps at their OpenID server to allow the authentication to take place. Follow this all up with another HTTPRequest to send the "check\_authentication" command, and the entire process has taken place without disrupting the user agent in the slightest.

Now wasn't that an easy $50? And we didn't even bring Alice along for that one!

Bob Needs a Loan

Bob's Blog is so slick and easy to use with his OpenID authentication that it's started to become quite popular. In fact, he's having a hard time keeping up with all the bandwidth being generated by Alice and her many friends. While he's pulling hairs trying to pay the bills for his hosting and bandwidth costs one breezy summer afternoon, Carol drops him an email complaining that Alice and her buddies are pegging her OpenID server CPUs generating all these signatures and secrets and whatnot. Bob suspects there must be a way to trim down on all those authentication messages flying back and forth, and maybe save Carol's poor servers some work in the deal too.

He decides to call up those web-developers he met a while back to see if they can do anything about it. He offers them $200 (half his monthly blogging bill) if they can cut down on his authentication traffic. They pledge to do their best as they try not to snicker in the background. Why are they so happy? Because they expect to hit another quick and easy payday, using stateful handles and the OpenID "associate" mode.

Don't Procrastinate... Associate!

It's high time that Bob's OpenID consumer scripts started handling their share of the work, and that's exactly what our heroic web-developers plan to make happen. You see, if Bob's consumer could just remember what it was talking about for a little while, he can save Carol a lot of work and both of them a lot of bandwidth. The basic idea is simple: in "dumb" mode, Bob's consumer simply parrots back the stateless handles he's been given, and he never actually knows what the secret is that this handle is supposed to represent. But what if Bob could actually remember the shared secret that he and Carol are using, and reuse this shared secret for a reasonable amount of time whenever he needs to talk to her? What would that do for us?

Well, it turns out he can, and it helps out quite a bit. This is the essence of "smart" mode. In smart mode, Bob's consumer establishes a shared secret with Carol's server ahead of time, and remembers it for some reasonable period of time. He establishes this shared secret by sending a POST request to Carol's server, http://carol.example.com/openid-server.cgi, with these parameters:

- openid.mode = associate
 + This tells Carol he wants to establish a shared secret with her.
- openid.assoc_type = HMAC-SHA1
 + The type of secret he wants to share; only HMAC-SHA1 is currently supported.
- openid.session_type = *blank*
 + The session type indicates how the secret should be established. An empty value means that it should be 
   established in the clear, which isn't 100% secure. A type of "DH-SHA1" means that Diffie-Hellman key 
   negotiation should be used instead.
- openid.dh_* = *meh*
 + If Diffie-Hellman key exchange is requested then a few more parameters are required. It's all very mathematical
    and, uh... Bob failed math.

When does Bob do this? Well, probably he'll want to do this the very first time that someone asks him to authenticate an identity with Carol, or if the shared secret he got the last time he did this has expired. But the important thing to note here is that, while this event may be triggered by Alice requesting authentication with Carol's OpenID server, it is not strictly part of their transaction. It is really a completely separate transaction between Bob and Carol that could happen at any time.

So, now that Bob has sent off this associate request, what's Carol going to do about it? She needs to generate a shared secret just like she would have otherwise and tie it to an assoc\_handle. But, instead of a stateless handle, she ties it to a stateful assoc\_handle. She then replies to Bob with a *key:value* formatted document (one pair per line, just like the "is\_valid:true" reply) in response to his POST request that contains all the usual sundry parameters. The two magical parameters, however, are the *assoc\_handle* and the *mac\_key* (or *enc\_mac\_key* if Diffie-Hellman was used). With these two items, Bob can now keep track, internal to his consumer, of that handle and shared secret key.
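Here is a small Python sketch of both ends of that exchange from Bob's point of view: building the POST body and picking apart Carol's reply. The reply is assumed to be colon-separated lines, matching the "is\_valid:true" answer seen earlier, with the mac\_key base64-encoded. The sample reply, the `expires_in` field, and the function names are illustrative assumptions rather than anything quoted from a real server.

```python
import base64
from urllib.parse import urlencode


def build_associate_body():
    """POST body for a plain-text (no Diffie-Hellman) associate request."""
    return urlencode({
        "openid.mode": "associate",
        "openid.assoc_type": "HMAC-SHA1",
        "openid.session_type": "",
    })


def parse_key_value(text):
    """Parse one key:value pair per line; values may contain colons too."""
    pairs = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            pairs[key] = value
    return pairs


# A made-up reply from Carol, for illustration only.
SAMPLE_REPLY = (
    "assoc_type:HMAC-SHA1\n"
    "assoc_handle:bob-carol-1\n"
    "expires_in:3600\n"
    "mac_key:" + base64.b64encode(b"\x01" * 20).decode("ascii") + "\n"
)

reply = parse_key_value(SAMPLE_REPLY)
handle = reply["assoc_handle"]
secret = base64.b64decode(reply["mac_key"])
lifetime = int(reply["expires_in"])
print(build_associate_body())
print(handle, len(secret), lifetime)
```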

So what do these two items buy us? Quite a lot, actually. Now, whenever Bob needs to authenticate an identity with Carol, he sends along the handle with his "checkid\_setup" or "checkid\_immediate" requests instead of just expecting her to generate a new stateless one each time. This saves Carol the overhead of generating a new shared secret for each stateless transaction. Once Bob gets back a signed response from Carol (assuming a positive assertion), he no longer has to send a follow-up "check\_authentication" request because he already knows the shared secret that was used to sign the assertion and can check it himself. Remember, before he had to double-check with Carol because he had no way of knowing whether the signature was authentic or forged by Alice without going back to Carol directly to find out. But now, because only he and Carol know the shared secret (and particularly not Alice), he can check the signature immediately without talking to Carol again. This cuts his (and Carol's) authentication traffic almost in half in the optimal case.
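Bob's local check might be sketched like this in Python. The `associations` cache, the expiry handling, and the three-way return value are assumptions made for the example, as is the `key:value` token layout used for signing. When the handle is unknown or expired, the sketch returns `None` to signal that Bob should fall back to asking Carol directly.

```python
import base64
import hashlib
import hmac
import time

# Hypothetical cache on Bob's side: assoc_handle -> (secret, expiry timestamp).
associations = {}


def remember_association(handle, secret, lifetime_seconds, now=None):
    now = time.time() if now is None else now
    associations[handle] = (secret, now + lifetime_seconds)


def sign_fields(secret, signed, fields):
    token = "".join(f"{name}:{fields[name]}\n" for name in signed.split(","))
    digest = hmac.new(secret, token.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_locally(params, now=None):
    """True or False if Bob can decide himself, None if he must ask Carol."""
    now = time.time() if now is None else now
    entry = associations.get(params.get("openid.assoc_handle"))
    if entry is None or entry[1] <= now:
        return None
    secret = entry[0]
    signed = params.get("openid.signed", "")
    fields = {name: params.get("openid." + name, "") for name in signed.split(",")}
    expected = sign_fields(secret, signed, fields).encode("ascii")
    provided = params.get("openid.sig", "").encode("utf-8")
    return hmac.compare_digest(expected, provided)


# Pretend the associate step already happened at time 1000.
remember_association("bob-carol-1", b"\x01" * 20, 3600, now=1000)
params = {
    "openid.mode": "id_res",
    "openid.identity": "http://carol.example.com/Alice",
    "openid.return_to": "http://bob.example.com/comment.cgi?session_id=Alice&nonce=123456",
    "openid.signed": "mode,identity,return_to",
    "openid.assoc_handle": "bob-carol-1",
}
params["openid.sig"] = sign_fields(
    b"\x01" * 20,
    params["openid.signed"],
    {"mode": "id_res", "identity": params["openid.identity"],
     "return_to": params["openid.return_to"]},
)
print(verify_locally(params, now=1001))
```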

Eve Snorts Packets

Now let's just say that Bob and Carol are both relatively lazy, despite everything they've done so far, and decide not to use Diffie-Hellman key exchange with their "associate" requests. What sort of exposure can they expect from Eve, our industrious and persistent practical joker?

From Eve's point of view, what she really wants is that shared secret that was established during the "associate" command and sent in plain-text from Carol to Bob. That means she has to intercept that one command as it happens and snort it from the wire. If Eve has complete, unfettered access to the network to do this, then Bob and Carol are sunk. Eve will know all the shared secrets and can spoof any identity she wants.

Since we didn't get very far with that scenario, let's consider a more limited case, where Eve can only snort network traffic for a finite period. With the amount of authentication traffic Bob is sending in and out, and the fact that his shared secrets (and their assoc\_handles) are set to expire at reasonable periods, Eve may only get a small set of shared secrets that she can use to spoof with. That may still be enough to cause mischief, but what Eve really wants is to spoof Alice in particular. That means she needs the specific shared secret that Bob and Carol share so that she can spoof Alice's identity. That's a pretty narrow window, so in practice Carol and Bob's shared secret is probably safe.

Eve's best bet would be to try and force the "associate" command to happen while she's able to listen in. For instance, if Eve pretends to authenticate a Carol-managed identity with Bob, she could claim that the shared assoc\_handle being used was invalid, which forces Bob and Carol into dumb mode. That is no advantage to Eve, but now that Bob has invalidated his shared secret, the next time an authentication request comes for a Carol-managed identity (which will be Eve's very next move), Bob may now be tempted to send a new "associate" command to Carol first to get a new shared secret. This is exactly what Eve wants to listen for, so the OpenID protocol specifically guards against this by mandating that Bob should check with Carol directly before he invalidates an assoc\_handle that they share. Because of this, Eve cannot force Bob's cached handles to become invalid, which prevents her from forcing an "associate" command to happen during her narrow attack window.

Of course, if Bob and Carol simply use Diffie-Hellman key exchange to set up their shared secret in the first place, then no amount of packet sniffing will do Eve any good, and their shared secret will be secure.
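Since Bob failed math, here is a toy Python sketch of why Diffie-Hellman helps. It assumes the "DH-SHA1" idea is that Carol masks the MAC key by XOR-ing it with a SHA-1 hash of the Diffie-Hellman shared value, so only Bob can unmask it. The modulus and generator below are toy-sized stand-ins chosen for readability, not the parameters a real deployment would use, and the integer-to-bytes encoding is simplified.

```python
import hashlib
import secrets

# Toy parameters for illustration only; far too small for real use.
P = 2**127 - 1   # a Mersenne prime
G = 5


def int_to_bytes(n):
    """Plain big-endian encoding (a simplification of the real wire format)."""
    return n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def mask_from_shared(shared_value):
    """20-byte mask derived from the Diffie-Hellman shared value."""
    return hashlib.sha1(int_to_bytes(shared_value)).digest()


# Each side picks a private number and publishes G^private mod P.
bob_private = secrets.randbelow(P - 2) + 1
bob_public = pow(G, bob_private, P)
carol_private = secrets.randbelow(P - 2) + 1
carol_public = pow(G, carol_private, P)

# Carol generates the MAC key and sends it masked.
mac_key = secrets.token_bytes(20)
enc_mac_key = xor_bytes(mac_key, mask_from_shared(pow(bob_public, carol_private, P)))

# Bob unmasks it using his own private number and Carol's public one.
recovered = xor_bytes(enc_mac_key, mask_from_shared(pow(carol_public, bob_private, P)))
print(recovered == mac_key)
```

Eve sees only the two public values and the masked key on the wire, and without one of the private numbers she can't compute the mask.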

Another thing Eve could listen in for is a successful authentication attempt made by Alice using a shared secret. If she captures the signed assertion that was sent back to Bob by Carol, she can brute-force the shared secret in her own time. Once she has this key, she can spoof an authentication attempt with Bob on Alice's identity. Carol and Bob will fall back to dumb, stateless mode when there is a problem with their current shared handle, which Eve can easily create by simply lying to Bob about Carol's response. Eve can then create and sign her own assertion using the broken key and send it back to Bob as part of the stateless transaction. Bob, now in dumb mode, will then send it directly back to Carol in a "check\_authentication" request for validation. If Carol doesn't know the difference between stateless and stateful assoc\_handles, she will check the message forwarded by Bob, find that the shared secret used to sign the assertion and the assoc\_handle match (because it was one Carol had truly used in the past), and respond to Bob that it is a valid assertion. Thus it is important that Carol be able to tell the difference between stateless and stateful assoc\_handles, and she should never respond to a "check\_authentication" request for a stateful handle. This is also mandated by the OpenID specification.
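One simple way for Carol to keep the two kinds of handle apart is to record the kind alongside the secret and refuse to look up stateful ones on the "check\_authentication" path. The Python sketch below shows the idea; the `handles` table and both function names are hypothetical.

```python
import secrets

# Hypothetical registry on Carol's side.
# handle -> {"secret": bytes, "stateless": bool}
handles = {}


def new_handle(stateless):
    """Create a fresh secret and remember which kind of handle it backs."""
    handle = secrets.token_hex(8)
    handles[handle] = {"secret": secrets.token_bytes(20), "stateless": stateless}
    return handle


def secret_for_check_authentication(handle):
    """Return the secret only for stateless handles; None for anything else."""
    entry = handles.get(handle)
    if entry is None or not entry["stateless"]:
        # Stateful (associate) handles must never be verified this way,
        # or a cracked shared secret could be laundered through dumb mode.
        return None
    return entry["secret"]


dumb_handle = new_handle(stateless=True)
smart_handle = new_handle(stateless=False)
print(secret_for_check_authentication(dumb_handle) is not None)
print(secret_for_check_authentication(smart_handle) is None)
```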
