Public key infrastructure

Stanislav Kobylansky, January 20th, 2012. Last updated: October 21st, 2012


Some time ago I was asked to create a presentation for my colleagues describing Public Key Infrastructure: its components, its functions, how it generally works, etc. To create that presentation I collected some material on the topic, and it would be a waste to just throw it away. The presentation wasn’t technical at all, and this post is not going to be technical either. It gives just the concept, a high-level picture, which, I believe, can be a good base before you start looking at the details.

I will start with cryptography itself. Why do we need it? There are at least three reasons: Confidentiality, Authentication and Integrity. Confidentiality is the most obvious one. It’s crystal clear that we need cryptography to hide information from others. Authentication confirms that a message was sent by a subject we can identify, and that our claims about it are true. And finally, Integrity ensures that the message wasn’t modified or corrupted in transit.

We may try to use Symmetric Cryptography to achieve our aims. It uses just one shared key, which is also called the secret. The secret is used both for encryption and for decryption of data. Let’s have a look at how it can help us. Does it encrypt messages? Yes. So Confidentiality is solved, as long as nobody except the communicating parties knows the secret. Does it provide Authentication? Mmm… I would say no. If there are just two parties in the conversation it seems OK, but if there are hundreds, then there have to be hundreds of secrets, which are hard to manage and distribute.

What about Integrity? Yes, it works fine: it’s very hard to modify an encrypted message in a meaningful way without knowing the secret. As you can guess, symmetric cryptography has one big problem, and that problem is the “shared secret”. These two words… they don’t even fit together. If something is known by more than one person, it is not a secret any more. Moreover, to be shared, the secret somehow has to be transferred, and during that process there are too many ways for it to be stolen. This means that this type of cryptography hardly solves our problems. But it is still in use and works quite well for its purposes. It’s very fast and can be used for encryption and decryption of large amounts of data, e.g. your hard drive. Also, since it is hundreds or even thousands of times faster than asymmetric cryptography, it’s used in hybrid schemes (like TLS, formerly SSL), where asymmetric cryptography is used just for transferring the symmetric key, and the encryption and decryption itself is done by a symmetric algorithm.
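To make the “one shared key” idea concrete, here is a minimal Python sketch. The cipher is a toy I made up for illustration (a SHA-256 based keystream XORed with the data); it is not a real algorithm and must not be used to protect anything. The names `keystream` and `xor_crypt` are my own. It only shows that the same secret both encrypts and decrypts, and that a different secret does not recover the message.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# Both parties must already hold this secret; distributing it is the hard part.
shared_secret = secrets.token_bytes(32)

message = b"transfer 100 GBP to Bob"
ciphertext = xor_crypt(shared_secret, message)

# The same key turns the ciphertext back into the message.
recovered = xor_crypt(shared_secret, ciphertext)

# Anyone with a different key gets garbage instead of the message.
garbage = xor_crypt(secrets.token_bytes(32), ciphertext)
```

Notice that nothing in the sketch says how `shared_secret` reached the other party. That gap is exactly the problem described above.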

Let’s have a look at Asymmetric Cryptography. It was invented quite recently, about 40 years ago. The first paper (“New Directions in Cryptography”) was published in 1976 by Whitfield Diffie and Martin Hellman. Their work was influenced by Ralph Merkle, who is believed to be the one who came up with the idea of Public Key Cryptography in 1974 (http://www.merkle.com/1974/) and suggested it as a project to his mentor, Lance Hoffman, who rejected it. “New Directions in Cryptography” describes the key exchange algorithm known as “Diffie–Hellman key exchange”. An interesting fact is that the same key exchange algorithm was invented earlier, in 1974, at the Government Communications Headquarters (GCHQ) in the UK by Malcolm J. Williamson, but that information was classified and the fact was disclosed only in 1997.
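For the curious, the arithmetic behind the Diffie–Hellman key exchange fits in a few lines of Python. This is a sketch under my own assumptions: the modulus `P` (the Mersenne prime 2^127 - 1) and the base `G` are chosen only to keep the example short, and are far too small and too structured for real use. The function name `make_keypair` is also mine.

```python
import secrets

# Public parameters, agreed in the open. Illustrative values only.
P = 2**127 - 1  # a known prime, much too small for real-world use
G = 3           # public base


def make_keypair() -> tuple[int, int]:
    """Pick a random private number and derive the public value G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # in the range [2, P - 2]
    public = pow(G, private, P)
    return private, public


# Each side keeps its private number and sends only the public value.
alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Each side combines its own private number with the other side's public value.
alice_shared = pow(bob_public, alice_private, P)
bob_shared = pow(alice_public, bob_private, P)

# Both arrive at the same number, which was never sent over the wire.
print(alice_shared == bob_shared)
```

Both sides compute G^(a·b) mod P, just in a different order, which is why the two results agree.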

Asymmetric Cryptography uses a pair of keys: one Private Key and one Public Key. The Private Key has to be kept secret and not shared with anybody. The Public Key can be available to the public; it doesn’t need to be secret. Information encrypted with the public key can be decrypted only with the corresponding private key. Since the Private Key is not shared, there is no need to distribute it, and the chance that it will be compromised is reasonably small. So this way of exchanging information can solve the Confidentiality problem. What about Authentication and Integrity? These problems are solvable as well, using a mechanism called Digital Signature. The simplest variant of a Digital Signature works like this: the subject creates a hash of the message, encrypts that hash with the Private Key and attaches it to the message.

Now if the recipient wants to verify who created the message, he will decrypt the attached hash using the subject’s public key (that’s Authentication) and compare it with a hash he computes himself from the received message (that’s Integrity). In reality the hash is not exactly encrypted; instead it is used in a special signing algorithm, but the overall concept is the same. It’s important to note that in practice each key pair should serve just one purpose, e.g. if a pair is used for signing, it should not also be used for encryption.
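The hash-then-sign idea can be sketched with textbook RSA and deliberately tiny numbers. Everything here is an illustrative assumption: the key is built from the primes 61 and 53, there is no padding, and the hash is squeezed into the tiny modulus, so this is nowhere near a real signing algorithm. The function names are mine.

```python
import hashlib

# Toy RSA key pair built from the primes 61 and 53 (insecure, for illustration).
N = 61 * 53  # modulus 3233, part of both keys
E = 17       # public exponent
D = 2753     # private exponent, since (E * D) % (60 * 52) == 1


def digest(message: bytes) -> int:
    """Hash the message and reduce it so that it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Owner side: transform the hash with the Private Key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient side: undo the transform with the Public Key and compare hashes."""
    return pow(signature, E, N) == digest(message)


message = b"I owe Bob 10 GBP"
signature = sign(message)

print(verify(message, signature))            # signature made with the matching key
print(verify(message, (signature + 1) % N))  # a forged signature is rejected
```

If the message is changed after signing, its hash changes too, and the comparison in `verify` fails. With a modulus this small a collision is possible by accident, which is one of many reasons real keys are thousands of bits long.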

The Digital Signature is also the basis for the Digital Certificate, AKA Public Key Certificate. A certificate is pretty much the same as your passport. It has identity information, which is similar to the name, date of birth, etc. in a passport. The owner of a certificate has to have the Private Key which matches the Public Key stored in the certificate, just as a passport has a photo of the owner which matches the owner’s face. And, finally, a certificate has a signature, and its meaning is the same as the meaning of the stamp in a passport. The signature proves that the certificate was issued by the organization which made that signature. In the Public Key Infrastructure world such organizations are called Certificate Authorities. If a system discovers that a certificate is signed by a “trusted” Certificate Authority, that system will trust the information in the certificate.

The last paragraph may not be obvious, especially the “trust” part of it. What does “trust” mean in that context? Let’s have a look at a simple example. Every site on the Web which uses an encrypted connection does it via the TLS (SSL) protocol, which is based on certificates. When you go to https://www.amazon.co.uk, it sends its certificate back to your browser. In that certificate there is information about the website and a reference to the Certificate Authority which signed the certificate. First the browser will look at the name in the certificate: it has to be exactly the same as the website’s domain name, in our case “www.amazon.co.uk”. Then the browser will verify that the certificate is signed by a trusted Certificate Authority, which is VeriSign in the case of Amazon. Your browser already has a list of Certificate Authorities (this is just a list of certificates with public keys) which are known as trusted ones, so it can verify that the certificate was issued by one of them. There are some other verification steps, but these two are the most important ones.

Assume that in our case verification was successful (if it’s not, the browser will show a big red warning message): the certificate has the proper name in it and was signed by a trusted Certificate Authority. What does that give us? Just one thing: we know that we are on www.amazon.co.uk and the server behind that name is an Amazon server, not some dodgy website which just looks like Amazon. When we enter our credit card details, we can be relatively sure that they will be sent to Amazon and not to a hacker’s database. Our hope here is based on the assumption that Certificate Authorities like VeriSign do not give out dodgy certificates and that the Amazon server is not compromised. Well, better than nothing :)
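The same two checks are what a program gets from a standard TLS client. As a sketch, Python’s built-in `ssl` module creates a client context that requires a certificate chaining to one of the system’s trusted Certificate Authorities and checks the name against the host you asked for. The function name `fetch_verified_certificate` and the 10-second timeout are my own choices.

```python
import socket
import ssl

# The default client context loads the system's trusted CA list and turns on
# both checks: trusted issuer (CERT_REQUIRED) and matching host name.
context = ssl.create_default_context()


def fetch_verified_certificate(hostname: str, port: int = 443) -> dict:
    """Connect over TLS and return the server certificate as a dict.

    Raises ssl.SSLCertVerificationError if the name does not match or the
    certificate does not chain to a trusted Certificate Authority.
    """
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


# Example usage (needs network access):
#   cert = fetch_verified_certificate("www.amazon.co.uk")
#   print(cert["subject"], cert["issuer"])
```

A failed check here is the programmatic equivalent of the browser’s big red warning: the connection is refused before any application data is sent.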

Another example is servers inside an organization, which use certificates to verify that they can trust each other. The scheme there is very similar to the browser’s one, except for two differences:

Mutual authentication. Certificates are usually verified by both sides, not just by the client. The client has to send its certificate to the server.
The Certificate Authority is hosted inside the company.

When the CA is inside the company we can be almost sure that certificates are going to be issued only to properly validated subjects. This gives some confidence that a hacker can’t inject his own server, even if he has access to the network infrastructure. An attack is possible only if the CA is compromised or some server’s Private Key is compromised.

As we already know, a Certificate Authority is the organization which issues certificates, and on the Internet an example of such an organization is VeriSign. If a certificate is created to be used just inside an organization (intranet), it can be issued by the Information Security Department, which can act as the Certificate Authority.

When someone wants a certificate, he has to send a request, called a Certificate Signing Request, to the Certificate Authority. That request consists of the subject’s identity information, the subject’s public key and a signature created with the subject’s private key, which shows that the subject who sent the request holds the matching private key. Before signing, the Certificate Authority passes the request to a Registration Authority, which verifies all the details, ensures that the proper process is followed, etc. The Certificate Authority can also act as the Registration Authority itself. Finally, if everything is OK, the Certificate Authority creates a new certificate signed with its private key and sends it back to the subject which requested it.

I’ve already mentioned the certificate validation process. Here are some details of it; it’s worth mentioning that these details are still high-level. Validation consists of several steps which, broadly speaking, can be described as:

Certificate data validation: validity dates, presence of required fields, their values, etc.
Verify that the certificate is issued by a trusted Certificate Authority. If you are browsing the Internet, that list is already built into your browser. If it’s communication between two systems, each system has a list of trusted Certificate Authorities; usually that is just a file with certificates.
Verify that the certificate’s signature is valid and was made by the Certificate Authority named as its issuer.
Verify that the certificate is not revoked.
Key verification: proves that the server can decode messages encrypted with the certificate’s Public Key, i.e. that it holds the matching Private Key.
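The non-cryptographic steps from that list can be modelled in a few lines of Python. This is only a simplified model under my own assumptions: the `Certificate` class, its fields and the `validate` function are hypothetical, the issuer is matched by name alone, and the signature and key verification steps are left out completely.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Certificate:
    """Hypothetical, heavily reduced stand-in for a real certificate."""
    serial: int
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime


def validate(cert: Certificate, trusted_issuers: set[str],
             revoked_serials: set[int], now: datetime) -> list[str]:
    """Return the failed checks; an empty list means every modelled check passed."""
    problems = []
    if not cert.subject or not cert.issuer:
        problems.append("missing required field")
    if not (cert.not_before <= now <= cert.not_after):
        problems.append("outside validity period")
    if cert.issuer not in trusted_issuers:
        problems.append("issuer not trusted")
    if cert.serial in revoked_serials:
        problems.append("revoked")
    return problems


utc = timezone.utc
cert = Certificate(
    serial=1001,
    subject="server1.example.internal",
    issuer="Example Internal CA",
    not_before=datetime(2012, 1, 1, tzinfo=utc),
    not_after=datetime(2013, 1, 1, tzinfo=utc),
)
now = datetime(2012, 10, 21, tzinfo=utc)

all_good = validate(cert, {"Example Internal CA"}, set(), now)
after_revocation = validate(cert, {"Example Internal CA"}, {1001}, now)
unknown_issuer = validate(cert, {"Some Other CA"}, set(), now)
```

Returning a list of problems instead of a single yes or no makes it easy to show the user why a certificate was rejected.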

The certificate revocation mentioned above can happen for many reasons: the certificate could be compromised, or, in the corporate world, the employee who owned the certificate left the company, or the server which had the certificate was decommissioned, etc. In order to check for revocation, a browser or any other piece of software has to use one or both of the following techniques:

Certificate Revocation List (CRL). That’s just a file, which can be hosted on an HTTP server. It contains a list of revoked certificate IDs. The method is simple and straightforward and doesn’t take much effort to implement, but it has three disadvantages: it’s just a file, which means it’s not real-time; it can use significant network traffic; and most browsers (I would even say all browsers) don’t check it by default, even if the certificate has a link to a CRL.
Online Certificate Status Protocol (OCSP). That is the preferable solution. It uses a dedicated server which implements a protocol that returns the revocation status of a certificate by its ID. If the browser (at least Firefox newer than v3.0) finds a link to that server in the certificate, it will make a call to verify that the certificate is not revoked. The only disadvantage is that the OCSP server has to be very reliable and able to answer requests all the time.

On the Internet, a certificate usually contains links to the CRL or OCSP server inside it. When certificates are used in a corporate network, these links are usually known to all parties and there is no need to have them in the certificate.

So, finally, what is Public Key Infrastructure? It’s the infrastructure which supports everything described above, and it generally consists of the following elements:

Subscribers. Users of certificates: clients and those who own certificates.
Certificates.
Certificate Authority and Registration Authority.
Certificate Revocation Infrastructure. A server with the Certificate Revocation List, or an OCSP server.
Certificate Policy and Practices documents. They describe the format of a certificate, the format of a certificate request, when certificates have to be revoked, etc. Basically all procedures related to the infrastructure.
Hardware Security Modules, which are usually used to protect the Root CA’s private key.

And that entire infrastructure supports the following functions, which we’ve just discussed: