IKE actually uses other protocols to perform peer authentication and key generation:

ISAKMP (Internet Security Association and Key Management Protocol)

The Internet Security Association and Key Management Protocol defines procedures for establishing, negotiating, modifying, and deleting SAs.  All parameter negotiation, such as header authentication and payload encapsulation, is handled through ISAKMP (headers and modes were discussed earlier).  ISAKMP performs peer authentication, but it does not involve key exchange.  It uses UDP port 500 as both the source and destination port and is referred to as isakmp in the Cisco IOS software.

Oakley (Oakley Key Determination Protocol)

The Oakley protocol uses the Diffie-Hellman algorithm to manage key exchanges across IPsec SAs.  Diffie-Hellman is a cryptographic protocol that permits two end points to establish a shared secret over an insecure channel without ever sending the secret itself.
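The idea behind Diffie-Hellman can be illustrated with a toy sketch in Python. The prime, base, and function names below are assumptions chosen for illustration; real IKE groups use moduli of 768 bits and more, never numbers this small.

```python
import secrets

# Toy public parameters (textbook-sized, for illustration only).
P = 23  # small prime modulus
G = 5   # public base

def dh_keypair():
    """Return (private, public) for one peer."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P-2]
    public = pow(G, private, P)             # this value is sent in the clear
    return private, public

def dh_shared(private, peer_public):
    """Combine our private value with the peer's public value."""
    return pow(peer_public, private, P)

# Each peer generates a key pair and sends only the public half.
a_priv, a_pub = dh_keypair()
b_priv, b_pub = dh_keypair()

# Both sides arrive at the same secret without transmitting it.
secret_a = dh_shared(a_priv, b_pub)
secret_b = dh_shared(b_priv, a_pub)
```

Only `a_pub` and `b_pub` cross the insecure channel; an eavesdropper who sees them still has to recover one of the private values to compute the shared secret.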

IKE Protocol Functionality

Let us study a few terms associated with the protocol functionality of IKE:

Authentication Header (AH) is the protocol that offers data integrity and authentication, but no confidentiality.

Encapsulating Security Payload (ESP) is the protocol that offers confidentiality for IPsec data in addition to integrity and authentication.

These services are often summarized as CIA, which here stands for confidentiality, integrity and authentication. Depending on how much of the C-I-A trio is required for a particular data exchange, we can use AH alone, ESP alone, or ESP in combination with AH.

IKE Phases

The IKE protocol/process is broken into two phases, which create a secure communications channel between two IPsec endpoints.  Although there are two primary and mandatory IKE phases, there is an optional third phase.  The three phases are described below, but before that I suggest you take a close look at this high level diagram which depicts the phases of IKE at a glance.

Explain IKE Protocol Functionality and Phases Fig 1

IKE Phases – High Level View

As you can see, there are actually two tunnels rather than the single tunnel we normally assume, and these tunnels are known as Internet Key Exchange security associations.

Phase I: IKE phase 1 is one of the mandatory IKE phases. A bidirectional SA is established between IPsec peers in phase 1.  This means that data sent between the end devices uses the same key material. Phase 1 may also perform peer authentication to validate the identity of the IPsec endpoints.  There are two IKE modes available for IKE phase 1 to establish the bidirectional SA: main mode and aggressive mode.  IKE modes are described in the next section. Phase 1 consists of parameter negotiation, such as hash methods and transform sets.  The two IPsec peers must agree on these parameters or the IPsec connection cannot be established.

One good way to remember what happens during the first IKE phase is the acronym HAGLE.  As you may know, to “haggle” is an informal word for negotiating.  If you keep this association with negotiation in mind, HAGLE is easy to recall. The acronym is explained below.

Letter    Parameter                                      Options

H         HMAC (Hashed Message Authentication Code)      MD5, SHA-1
A         Authentication                                 RSA signatures, etc.
G         Group (Diffie-Hellman)                         DH1, DH2, DH5, DH7
L         Lifespan                                       Time
E         Encryption                                     AES, DES, or triple DES

The above are also known as ISAKMP policy set parameters and when a parameter is not specified explicitly, certain default values are taken by the Cisco IOS software.  The default values for these parameters are as follows.

H – SHA, A – RSA Signatures, G – DH1, L – one day, E – DES
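The way defaults fill in for unspecified parameters can be sketched as a small Python data class. The class name, field names, and string values here are hypothetical and are not Cisco IOS keywords; only the default values themselves come from the list above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PolicySet:
    """One ISAKMP policy set; every field falls back to the default listed above."""
    hmac_hash: str = "sha"               # H
    authentication: str = "rsa-sig"      # A
    group: int = 1                       # G (DH1)
    lifetime_seconds: int = 24 * 60 * 60 # L (one day)
    encryption: str = "des"              # E

# Nothing specified: every parameter takes its default.
default_policy = PolicySet()

# Only E and G specified: H, A and L still take their defaults.
custom_policy = PolicySet(encryption="aes", group=5)
```

In this sketch `custom_policy` overrides only two of the five HAGLE parameters, so its hash, authentication method and lifespan remain at the default values.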

A lot goes on behind the scenes during phase 1, including SA parameter negotiation, automatic key generation, key refreshing and so forth.  These tasks are accomplished in three steps, as follows.

  • Phase I – Mode 1 – step 1: during this step the HAGLE negotiation described in the table above actually takes place.  The two peers of the virtual private tunnel agree on factors such as the encryption algorithm, the authentication method and the hashed message authentication code. Since the elements of HAGLE are known as policies, they are jointly known as policy sets. It is worth noting that a peer might have multiple policy sets, mainly because it may need to deal with different peers that each require a different policy set.
  • Phase I – Mode 1 – step 2: during this step the shared secret key is established, which forms the basis for further key generation and initiates the secure channel. As noted in the table above, there are four Diffie-Hellman groups to choose from (DH1, DH2, DH5 and DH7). They offer different levels of security and are used with different options for element E, i.e. encryption. Groups 1, 2 and 5 represent increasing levels of security, since they use progressively larger key sizes of 768, 1024 and 1536 bits respectively. Group 7 is the exception: it uses an elliptic curve with a 163-bit field and is intended only for devices with low processing power, such as handheld devices and PDAs.
  • Phase I – Mode 1 – step 3: the secure channel established in the previous step is either kept up or torn down by the initiator, depending on whether the authentication succeeds or fails.

The above mode is also known as the main mode or MM for short.
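The policy agreement in step 1 can be sketched as a search for the first proposal both peers hold. This is a simplified illustration, not the ISAKMP message format: the `Policy` type, the `negotiate` function and the sample values are hypothetical, and the lifespan is left out of the comparison for simplicity.

```python
from typing import NamedTuple, Optional, Sequence

class Policy(NamedTuple):
    hmac_hash: str       # H
    authentication: str  # A
    group: int           # G
    encryption: str      # E

def negotiate(initiator: Sequence[Policy],
              responder: Sequence[Policy]) -> Optional[Policy]:
    """Return the first initiator proposal that the responder also holds."""
    responder_set = set(responder)
    for proposal in initiator:  # proposals are tried in priority order
        if proposal in responder_set:
            return proposal
    return None                 # no common policy set: phase 1 fails

# Each peer holds several policy sets because it talks to different peers.
peer_a = [Policy("sha", "rsa-sig", 5, "aes"),
          Policy("md5", "rsa-sig", 2, "3des")]
peer_b = [Policy("md5", "rsa-sig", 2, "3des"),
          Policy("sha", "rsa-sig", 1, "des")]

agreed = negotiate(peer_a, peer_b)
no_match = negotiate(peer_a, [Policy("sha", "rsa-sig", 1, "des")])
```

Here the peers settle on the MD5 / RSA signatures / DH2 / triple DES policy set, the only one they have in common; when there is no common set, the function returns `None` and the connection cannot be established.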

Intermediate Phase: IKE phase 1.5 is an optional IKE phase.  Phase 1.5 provides an additional layer of authentication, called Xauth, or Extended Authentication.  IPsec authentication provided in Phase 1 authenticates the devices or endpoints used to establish the IPsec connection.  However, there is no means of validating the users behind the devices.  A preconfigured IPsec device can be used by both friends and foes.  Xauth forces the user to authenticate before use of the IPsec connection is granted.

Note that phase 1 can also be carried out in aggressive mode instead of main mode. Aggressive mode completes in fewer exchanges and is therefore considered less secure by most security experts.

Phase II: IKE phase 2 is the second mandatory IKE phase and is also known as quick mode.  To follow what goes on during phase II, we must first understand the meaning of a transform set.  A transform set is a combination of an encryption algorithm and a hashed message authentication code used in quick mode. Keep in mind that transform sets are similar to, yet distinct from, the policy sets described for phase I or main mode.

Having learned about transform sets, we are in a better position to understand what happens during phase II.  The IPsec transform sets are negotiated during this phase, followed by the establishment of at least two IPsec SAs. These SAs are renegotiated periodically to ensure continuous security, and sometimes additional Diffie-Hellman key exchanges are carried out.

As studied in the previous section, the initial negotiation and authentication are completed in phase I through either main mode or aggressive mode.  Hence, when authentication and negotiation occur during quick mode, they already have an established secure channel to run over.  Several other decisions are made during this second phase, such as the networks to be protected, the encryption algorithms used, the use of transport or tunnel mode, and so forth.

Let us take a brief look at the difference between transport mode and tunnel mode.  Transport mode has less overhead, although at the cost of being less secure.  It is normally not used when the peers are connected across a public network such as the Internet.  In transport mode the payload of the IP packet is encrypted while its header is left as it is.  In tunnel mode, the payload of the IP packet and its header are both enclosed inside a new IP packet, so the original packet travels as a packet within a packet.  Tunnel mode obviously has more overhead, but it is the more secure of the two; the extra overhead may not be justified when the peers are on private networks where the risk is lower.  The outer IP header carries the routing information that lets intermediate devices deliver the packet to its destination.
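The difference between the two modes can be sketched as two ways of wrapping a packet. This is a structural illustration only: the `Packet` and `Protected` types, the function names and the addresses are hypothetical, and no real encryption is performed.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Packet:
    src: str      # source address in the IP header
    dst: str      # destination address in the IP header
    payload: Any

@dataclass
class Protected:
    """Stand-in for encrypted content; nothing is actually encrypted here."""
    inner: Any

def transport_mode(pkt: Packet) -> Packet:
    # The original header is kept; only the payload is protected.
    return Packet(pkt.src, pkt.dst, Protected(pkt.payload))

def tunnel_mode(pkt: Packet, outer_src: str, outer_dst: str) -> Packet:
    # The whole original packet (header and payload) is protected and
    # wrapped in a new packet whose outer header is used for routing.
    return Packet(outer_src, outer_dst, Protected(pkt))

original = Packet("10.1.1.10", "10.2.2.20", b"hello")
in_transport = transport_mode(original)
in_tunnel = tunnel_mode(original, "192.0.2.1", "198.51.100.1")
```

In `in_transport` the original addresses remain visible in the header, whereas in `in_tunnel` an observer sees only the outer addresses and the original header is hidden inside the protected part.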

Take a look at the diagram below, which helps you visualize what goes on during phase II of IKE. It depicts the phase two process between two VPN peers named A and B, located on the left and right hand sides respectively, though they are not explicitly labeled in the picture.

Explain IKE Protocol Functionality and Phases Fig 2

IKE Phase II

As you can see, there are two secure channels between the given VPN peers, and each of these SAs is identified by its SPI number, which stands for security parameter index.  These SPI numbers are stored in the security association database of each peer and are marked as either inbound or outbound SAs, as the case may be.

Can you identify which is the inbound and which is the outbound link for peer B in the above diagram?
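The role of the SPI as a lookup key can be sketched with a small in-memory table. This is a simplified model: the class names and SPI values are made up, and a real security association database also stores keys, algorithms and lifetimes for each entry.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class SecurityAssociation:
    spi: int        # security parameter index
    direction: str  # "inbound" or "outbound", from this peer's point of view
    peer: str

class SADatabase:
    """Minimal security association database keyed by SPI."""
    def __init__(self) -> None:
        self._by_spi: Dict[int, SecurityAssociation] = {}

    def add(self, sa: SecurityAssociation) -> None:
        self._by_spi[sa.spi] = sa

    def lookup(self, spi: int) -> Optional[SecurityAssociation]:
        return self._by_spi.get(spi)

# The same SPI appears on both peers with opposite directions:
# what is outbound for A is inbound for B, and vice versa.
sad_a = SADatabase()
sad_a.add(SecurityAssociation(0x1001, "outbound", "B"))
sad_a.add(SecurityAssociation(0x2002, "inbound", "B"))

sad_b = SADatabase()
sad_b.add(SecurityAssociation(0x1001, "inbound", "A"))
sad_b.add(SecurityAssociation(0x2002, "outbound", "A"))
```

When peer B receives a packet carrying SPI `0x1001`, `sad_b.lookup(0x1001)` tells it which SA, and therefore which keys and algorithms, to apply.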

Let us now study how the functionality of C-I-A which we studied towards the beginning of this tutorial is implemented in an IP packet traveling from peer A to peer B over the tunnel.  Take a look at the diagram below to understand it more deeply.

Explain IKE Protocol Functionality and Phases Fig 3

C-I-A Functionality

Most of it is self-explanatory: the cleartext IP packet is encrypted using a cipher and key, resulting in ciphertext.  This process is shown in the top part of the diagram above. Once the encryption is complete, the packet is encapsulated inside another IP packet, in which the ESP header accounts for the C (confidentiality), the ESP trailer accounts for integrity and the ESP authentication trailer accounts for authentication.  The ESP trailer above is nothing but a hash or checksum which is added by peer A before the packet gets wrapped in the new wrapper.  The source address (SA) and destination address (DA) in the IP header of the new packet are those of peer A and peer B respectively.
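The hash that peer A adds and peer B verifies can be sketched with Python's standard `hmac` module. This is a minimal sketch under stated assumptions: the key and data are placeholders, the function names are hypothetical, and the full 20-byte HMAC-SHA1 digest is appended as a simplification rather than the exact ESP wire format.

```python
import hashlib
import hmac

DIGEST_LEN = hashlib.sha1().digest_size  # 20 bytes for SHA-1

def add_check_value(key: bytes, data: bytes) -> bytes:
    """Peer A: append a keyed hash computed over the data."""
    return data + hmac.new(key, data, hashlib.sha1).digest()

def verify_check_value(key: bytes, packet: bytes) -> bool:
    """Peer B: recompute the keyed hash and compare it with the one received."""
    data, received = packet[:-DIGEST_LEN], packet[-DIGEST_LEN:]
    expected = hmac.new(key, data, hashlib.sha1).digest()
    return hmac.compare_digest(received, expected)

key = b"placeholder-shared-key"   # in practice derived from the IKE exchange
sent = add_check_value(key, b"ciphertext bytes")

# Flip the first byte to simulate tampering in transit.
tampered = bytes([sent[0] ^ 0xFF]) + sent[1:]
```

Because the hash is keyed, an attacker who changes the data cannot produce a matching value without the shared key, so `verify_check_value` accepts `sent` and rejects `tampered`.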