Wednesday, October 28, 2009

Network Security

Data on the network is analogous to possessions of a person. It has to be kept secure from others with malicious intent. This intent ranges from bringing down servers on the network to using people's private information like credit card numbers to sabotage of major organizations with a presence on a network. To secure data, one has to ensure that it makes sense only to those for whom it is meant. This is the case for data transactions where we want to prevent eavesdroppers from listening to and stealing data. Other aspects of security involve protecting user data on a computer by providing password restricted access to the data and maybe some resources so that only authorized people get to use these, and identifying miscreants and thwarting their attempts to cause damage to the network among other things.

The various issues in Network security are as follows :

  1. Authentication: We have to check that the person who has requested for something or has sent an e-mail is indeed allowed to do so. In this process we will also look at how the person authenticates his identity to a remote machine.
  2. Integrity: We have to check that the message which we have received is indeed the message which was sent. Here CRC will not be enough because somebody may deliberately change the data. Nobody along the route should be able to change the data.
  3. Confidentiality: Nobody should be able to read the data on the way so we need Encryption
  4. Non-repudiation: Once we sent a message, there should be no way that we can deny sending it and we have to accept that we had sent it.
  5. Authorization: This refers to the kind of service which is allowed for a particular client. Even though a user is authenticated we may decide not to authorize him to use a particular service.
For authentication, if two persons know a secret then we just need to prove that no third person could have generated the message. But for Non-repudiation we need to prove that even the sender could not have generated the message. So authentication is easier than Non-repudiation. To ensure all this, we take the help of cryptography. We can have two kinds of encryption :
  1. Symmetric Key Encryption: There is a single key which is shared between the two users and the same key is used for encrypting and decrypting the message.
  2. Public Key Encryption: There are two keys with each user : a public key and a private key. The public key of a user is known to all but the private key is not known to anyone except the owner of the key. If a user encrypts a message in his private key then it can be decrypted by anyone by using the sender's public key. To send a message securely, we encrypt the message in the public key of the receiver which can only be decrypted by the user with his private key.

Symmetric key encryption is much faster and efficient in terms of performance. But it does not give us Non-repudiation. And there is a problem of how do the two sides agree on the key to be used assuming that the channel is insecure ( others may snoop on our packet ). In symmetric key exchange, we need some amount of public key encryption for authentication. However, in public key encryption, we can send the public key in plain text and so key exchange is trivial. But this does not authenticate anybody. So along with the public key, there needs to be a certificate. Hence we would need a public key infrastructure to distribute such certificates in the world.

Key Exchange in Symmetric Key Schemes

We will first look at the case where we can use public key encryption for this key exchange. . The sender first encrypts the message using the symmetric key. Then the sender encrypts the symmetric key first using it's private key and then using the receiver's public key. So we are doing the encryption twice. If we send the certificate also along with this then we have authentication also. So what we finally send looks like this :

Z : Certificatesender + Publicreciever ( Privatesender ( Ek ) ) + Ek ( M )
Here Ek stands for the symmetric key and Ek ( M ) for the message which has been encrypted in this symmetric key.

However this still does not ensure integrity. The reason is that if there is some change in the middle element, then we will not get the correct key and hence the message which we decrypt will be junk. So we need something similar to CRC but slightly more complicated. This is because somebody might change the CRC and the message consistently. This function is called Digital Signature.

Digital Signatures

Suppose A has to send a message to B. A computes a hash function of the message and then sends this after encrypting it using its own private key. This constitutes the signature produced by A. B can now decrypt it, recompute the hash function of the message it has received and compare the two. Obviously, we would need the hash functions to be such that the probability of two messages hashing to the same value is extremely low. Also, it should be difficult to compute a message with the same hash function as another given message. Otherwise any intruder could replace the message with another that has the same hash value and leave the signatures intact leading to loss of integrity. So the message along with the digital signature looks like this :

Z + Privatesender ( Hash ( M ) )

Digital Certificates

In addition to using the public key we would like to have a guarantee of talking to a known person. We assume that there is an entity who is entrusted by everyone and whose public key is known to everybody. This entity gives a certificate to the sender having the sender's name, some other information and the sender's public key. This whole information is encrypted in the private key of this trusted entity. A person can decrypt this message using the public key of the trusted authority. But how can we be sure that the public key of the authority is correct ? In this respect Digital signatures are like I-Cards. Let us ask ourselves the question : How safe are we with I-Cards? Consider a situation where you go to the bank and need to prove your identity. I-Card is used as a proof of your identity. It contains your signature. How does the bank know you did not make the I-Card yourselves? It needs some proof of that and in the case of I-Cards they contain a counter signature by the director for the purpose. Now how does the bank know the signature I claim to be of the director indeed belongs to him? Probably the director will also have an I-Card with a counter signature of a higher authority. Thus we will get a chain of signing authorities. Thus in addition to signing we need to prove that the signatures are genuine and for that purpose we would probably use multiple I-Cards each carrying a higher level of signature-counter signature pair.
So in order to distribute the public key of this authority we use certificates of higher authority and so on. Thus we get a tree structure where the each node needs the certificates of all nodes above it on the path to the root in order to be trusted. But at some level in the tree the public key needs to be known to everybody and should be trusted by everybody too.

Wireless Networks

Introduction

As the need of communication became more and more demanding, new technologies in the field of networks developed. One of them is the use of wireless networks. It is the transmission of data from source to destination without the use of wires as the physical media.

Why to use Wireless?

Three reasons may be stated for the over-growing use of wireless networks across the world:
  1. They are ubiquitous networks. As the do not require messy wires as a medium of communication, they can be used to connect far-off places.
  2. They are cheaper than wired networks specially in the case of long-distance communication.
  3. They are pretty effective and fast, especially with the modern advancements in this field.

Some Terms and Technologies:

ATM-Asynchronous Transfer Mode:
ATM is a connection-oriented switching technology. It was built to support ISDN (Integrated Services Digital Network). ISDN required high speed cables for both its narrow band (64 Kbps) and broad band (155 Mbps) transmission. There were two technologies available for transmitting data-

  1. Circuit Switching: In this technology, when a user makes a call, the resources are reserved for him. The advantage of this technology is that it prevents collisions among various users. But the disadvantage is that it leads to inefficient utilization of bandwidth-- if the user fails to send data or if the transmission speed is faster than the speed of sending data. then most of the bandwidth is wasted.
  2. Packet Switching: In this technology, resources are never reserved for any particular user. The advantage of this technology is that it leads to efficient utilization of bandwidth i.e. the channel is never free until & unless there are no users, But the disadvantage is that it causes many collision.
ATM was built as a combination of the best features of these two. Also ATM provides QoS (Quality of Service) based on the following priority pattern:
  1. CBR-Constant Bit Rate: Jobs that can tolerate no delay are assigned the CBR priority. These jobs are provided same number of bits every frame.. For example, viewing a video reel definitely requires some blocks in every frame.
  2. VBR-Variable Bit Rate: Jobs that may produce different sized packets at different times are assigned VBR priority. They are provided with a variable number of bits varying between a maximum and a minimum in different frames. e.g.. a document may be compressed differently by different machines. Transmitting it will be a variable transmission.
  3. ABR-Available Bit Rate: This is the same as VBR except that it has only the minimum fixed. If there are no CBR or VBR jobs left, it can use the entire frame,
  4. UBR-Unavailable Bit Rate: These jobs are the least priority jobs. The network does not promise anything but simply tries its best to transmit it.
WLAN-Wireless LAN
This is currently being used as dictated by the standards of IEEE 802.11. It can be installed at the medium access layer and the data transmission can occur using a converter to reach the wired LAN network.( IEEE 802.x)

WATM-Wireless ATM
It is the wireless version of ATM. It provides QoS. It is not yet available in market. because installing it will require the simultaneous installation of ATM infrastructure. It is currently being tested thoroughly.

Coupling of Networks:
The alternatives are:

  1. WLAN LAN
  2. WATM LAN
  3. WLAN ATM
  4. WATM ATM
  1. WLAN-LAN is the simplest of the above. According to the IEEE standards, the IEEE 802.11 (WLAN) can be used with IEEE 802.x (LAN) as follows:
  2. WLAN-ATM- NOT FEASIBLE.
  3. WATM-LAN- NOT FEASIBLE because WATM requires an infrastructure of the type ATM
  4. WATM-ATM-this is also a simple scheme because WATM can run on ATM.

Issues involved in Wireless Networks

  • Cost and Speed: As it is being considered as an alternative to wired networks, it should be faster and cheaper.
  • Quality of Transmission: It gives a higher BER (Bit Error Rate). The BER is greater than 10 -6. This is caused because transmission quality depends highly on the physical media including landscape, weather etc.
  • RayLeigh Fading: The data has to travel the distance through a medium like air. Several rays of the same stream cause Rayleigh fading due to interference. This causes poor transmission.
  • Multipath Propagation: Similarly, due to multipath propagation, the signal received at the destination may be garbled.
  • Hand-Offs: If hand-offs are used i.e., hexagonal cells each having a base station and many mobile terminals, two Mobile terminals that are far enough can use the same bandwidth. This reuse of bandwidth is helpful.
  • Dynamic Physical Characteristics: The terminal may be mobile and constantly moving. Thus the distance between the base station and any active terminal may be constantly changing. This has to be taken into account while designing.
  • Practical Implementation: The practical implementation of any wireless network requires CSMA/CD for proper transmission. The range of any terminal is fixed. So, there may be two terminals that are out of range of each other. These are called HIDDEN TERMINALS. Collisions may be caused due to simultaneous sending of data from any two hidden terminals. The HIDDEN TERMINAL PROBLEM should be overcome with the help of Base Station.
  • .Mobility and Network Topologies: Wireless networks should be effective enough to overcome the problems caused by the topology of the area and the mobility of the terminals
  • Frequency Allocation: Licensed & Unlicensed: For licensed networks, permission has to be taken from the authorities that grant you a fixed bandwidth which is not used by anybody else while unlicensed networking does not require any such permissions. It just provides with some unlicensed bands which can be used by anybody. Unlicensed bands may thus, cause collisions.
  • Capture Effect: If there are more than one terminals requiring the attention of the Base Station, the one nearer to the base station may capture it. This unfair access to the base station should be prevented.
  • Power Requirements and Battery: This problem arises for the Mobile Terminals that run battery or cells. Much dissipation of power is caused when switching from receiving mode to sending mode and vice versa.
  • Human Safety: Not all bandwidths can be used . Also, the intensity should not be very high as it may lead to several complications in human body e.g.. cataract.

Wireless Physical Media

In the wireless physical media, three technologies are used:
  1. Transmission at Infrared frequency: This is easier to build and set-up. It is mainly used for indoor purposes because the beam has to be focussed and can't cross opaque media like walls etc.
  2. Transmission through Microwave: This is preferred as it requires low power consumption. (the bandwidth is fixed) But the basic problem is that it requires Line-of-Sight. Also, it requires license.
  3. Transmission at Radio Frequency: This is the one that is most familiar to us. The bandwidth is pretty large.

Integrity and Security of the signal

Spread Spectrum: To reduce the effect of noise signals, the bandwidth of the signal is increased tremendously. This is costly but assures better transmission. This is called SPREAD-SPECTRUM. This is used in two ways:

  • FHSS (Frequency hopping spread spectrum): The entire packet is not sent at the same bandwidth. Say, it is sent at frequency range A for time T1, frequency range B for time T2, A for T1, B for T2 and so on. The receiver also knows this sequence and so, looks at A for time T1, then at B for time T2 and so on. Thus this sort of understanding between the sender and receiver prevents the signal from being completely garbled .
  • DSSS (Direct Sequence Spread Spectrum): This involves sending of coded data instead of the actual data. This code is known to the destination only which can decipher the data now.
The problem still left undealt is that of bursty errors. If there is lot of traffic, interference may hinder the Base Station from receiving data for a burst of time. This is called "Bursty Errors".
Such problem are looked at by MAC-Medium Access Control.

MEDIUM ACCESS CONTROL

To control the traffic, various techniques are used. MAC fulfills the following requirements:
  1. QoS Requirements: It provides Quality of Service according to the priority of jobs.
  2. Error Control: Error handling is done using some codes.
  3. Frame Size: To transmit maximum data, we want the frame-size to be maximum but at the same time, large frame-size highly increases the probability of errors. So, MAC provides a tradeoff between the above two factors determining the size of the frame.
  4. Secure Transmission: The data meant for a particular receiver is secured from others.
  5. Reasonable Transmission: If the number of users increases, each should get reasonable service. MAC prevents unfair access to channel.
  6. Efficient utilization of Power: If a transmitter is always on, it is continuously using power even if there is no data on the channel for it. This is reduced by sending the transmitter to "sleep mode" whenever the battery is going down. In this mode, the transmitter is unable to receive any data.

Architecture for Wireless Network

There are two types of architecture possible:
  1. AD-HOC NETWORK
  2. INFRASTRUCTURE NETWORK
The Ad-Hoc network can be set up anytime. It does not require a Base Station. It is generally used for indoor purposes.
The Infrastructure network involves Base Station and Mobile Terminals. It provides uplink facility ( link from MT to BS) and downlink facility (link from BS to MT).

THE MAC PROTOCOL

This protocol decides how to assign data slots to different users. The various policies it uses are:
  1. Fixed Assignment Policy
  2. Random Assignment Policy
  3. Centrally Controlled Policy
  4. Distributed Controlled Policy
  5. Hybrid Controlled Policy
Fixed Assignment Policy:
In this policy, each terminal is assigned some sort of data slot to speak. It causes a fixed delay. It is done in 3 ways:
  1. TDMA (TIME DIVISION MULTIPLE ACCESS) : Each user is given a fixed time to speak., after which the chance goes to another user. This cycle continues indefinitely.
  2. FDMA (FREQUENCY DIVISION MULTIPLE ACCESS): Each user is given a fixed bandwidth in which he can speak at all times.
  3. CDMA (CODIVISION MULTIPLE ACCESS): Each user is given different frequencies at different times. This ensures that each user gets a fair amount of channel each time.

Also, sometimes, statistical multiple access is used in which a slot is assigned to a user only if it has data to send.

Random Assignment Policy
In this policy, contention slots are provided to all the users. Problem may arise if the number of users increase drastically. The number of contention slots should be variable. This may cause some limiting of data slots but is necessary to prevent the derailment of the service .

Centrally Controlled Policy:
This is used in an infrastructure architecture. It involves the participation of a Base Station which may assign slots and priorities(CNBR,VBR etc.) to all the users.

Distributed Controlled Policy:
This is used in Ad-Hoc architecture. The control is among the terminals which decide among themselves about who is going to speak first.

Hybrid Controlled Policy:
This combines the best features of centrally controlled and distributed controlled policies.

KINDS OF MAC PROTOCOLS:

There are two kinds of Mac protocols:
  1. FDD (Frequency Division Duplex) This provides two separate bandwidths for uplink and downlink transmission. This leads to inefficient utilization of bandwidth as there is more traffic on downlink than uplink
  2. TDD (Time Division Duplex) This provides an adoptive boundary between the uplink and downlink frequency which depends on the what is being used at that particular time. It works as follows:
Any mobile terminal can be in 3 states : empty state, request state and ready-to-transmit state.
  1. uplink-MT1 sends a random-access request to BS to communicate with MT2
  2. downlink: BS sends a b-bit access id to MT2
  3. uplink: MT1 sends the packet
  4. downlink: BS sends the packet to MT2
The TDD is more in use now-a-days.

Firewalls

Introduction

This lecture discusses about security mechanisms in the Internet namely Firewall . In brief, It's a configuration of routers and networks placed between an organization's internal internet and a connection to an external internet to provide security. In other words, Firewall is a mechanism to provide limited access to machines either from the outside world to internal internet or from internal world to outside world. By, providing these security mechanisms, we are increasing the processing time before one can access a machine. So, there is a trade-off between security and ease of use. A firewall partitions an internet into two regions, referred to informally as the inside and outside.

__
| | _________ Firewall
______________________ | | ____________________
| | | | | |
| | | | | |
| Rest of Internet |________ | |_____ | Intranet |
| | | | | |
|_____________________ | | | |___________________|
|_|
Outside Inside

Security Lapses
  • Vulnerable Services - NFS : A user should not be allowed to export certain files to the outside world and from the outside world also, someone should not be allowed to export our files.
  • Routing based attacks : Some kind of ICMP message should not be allowed to enter my network. e.g.. Source routing and change route ICMP's.
  • Controlled access to our systems : e.g.. Mail server and web pages should be accessible from outside but our individual PC's should not be accessible from the outside world.
  • Authentication : Encryption can be used between hosts on different networks.
  • Enhanced Privacy : Some applications should be blocked. e.g.. finger ...
  • PING & SYN attack : Since these messages are send very frequently, therefore you won't be able to do anything except reply to these messages. So, I should not allow these messages to enter my network.

So. whatever I provide for my security is called Firewall. It is a mechanism and not just a hardware or software.

Firewall Mechanisms

1. Network Policy : Here, we take into consideration, what services are allowed for outside and inside users and the services which are allowed can have additional restrictions. e.g.. I might be allowed to download things from the net but not upload i.e.. some outside users cannot download the things from our net. Some exceptional cases might be there which have to be handled separately. And if some new application comes up then , we choose an appropriate network policy. 2. Authentication mechanism : An application can be designed which ask for a password for authentication.

3. Packet Filtering : Router have information about some particular packets which should not be allowed.

4. Application gateways : or proxy servers.

Certain Problems with Firewall

1. Complacency : There are lots of attacks on the firewall from internal users and therefore, it's limitations should be understood.

2. Encapsulated packets : An encapsulated packet is an IP packet within another IP packet. If we ask the router to drop encapsulated packets then, it will drop the multicast packets also.

3. Throughput :So, in order to check which packets are allowed and which are not, we are doing some processing which can be an overhead and thus affects throughput.

Authentication:

We can use the following mechanisms:

  • One time passwords: passwords are used only once and then it changes. But only the user and the machine knows the changing passwords.
  • password aging : User are forced to change passwords after some time on regular intervals.
  • smart cards : swipe through the PC.
  • biometrics : eyes or finger prints are used.

Packet Filtering :

Terms associated:

  • Source IP address
  • Destination IP address
  • Source port #
  • Destination port #
  • protocol
  • interface

Many commercial routers offer a mechanism that augments normal routing and permits a manager to further control packet processing. Informally called a packet filter, the mechanism requires the manager to specify how the router should dispose of each datagram. For example, the manager might choose to filter (i.e.. block) all datagrams that come from a particular source or those used by a particular application, while choosing to route other datagrams to their destination.

The term packet filter arises because the filtering mechanism does not keep a record of interaction or a history of previous datagrams. Instead, the filter considers each datagrams separately. When a datagram first arrives, the router passes the datagram through its packet filter before performing any other processing. If the filter rejects the datagram, the router drops it immediately.

For example, normally I won't allow TFTP, openwin, RPC, rlogin, rsh packets to pass through the router whether from inside or outside and router just discard these packets. But I might put some restrictions on telnet, ftp, http, and smtp packets in order to pass through the router and therefore some processing is to be done before discarding or allowing these packets.

Because TCP/IP does not dictate a standard for packet filters, each router vendor is free to choose the capabilities of their packet filter as well as the interface the manager uses to configure the filter. Some routers permit a manager to configure separate filter actions for each interface, while others have a single configuration for all interfaces. Usually, when specifying datagrams that the filter should block, a manager can list any combination of source IP address, destination IP address, protocol, source protocol port number, and destination protocol port number.
So, these filtering rules may become more tricky with complex network policies.

Since, Filtering rules are based on port numbers, there is a problem with RPC applications. First, the number of well-known ports is large and growing. Thus, a manager would need to update such a list continually because a simple error of omission could leave the firewall vulnerable. Second, much of the traffic on an internet does not travel to or from a well-known port. In addition to programmers who can choose port numbers for their private client-server applications, services like Remote Procedure Call (RPC) assigns port dynamically. Third, listing ports of well-known services leaves the firewall vulnerable to tunneling, a technique in which one datagram is temporarily encapsulated in another for transfer across part of an internet.


Relay Software (proxies) :

I can run multiple proxy on same machine. They may detect misuse by keeping loops. For example, some machine give login to Ph.D.. students. So, in this case it's better to keep proxy servers than to give login on those machines. But the disadvantage with this is that there are two connections for each process.

_________ __________
| | | |
| User |_______________| Proxy |___________ Outside
| ________| 1. |_________ | 2.

Various Firewall Considerations

1. Packet Filtering Firewall
This is the simplest design and it is considered when the network is small and user don't run many Intranet applications.
__________
| |
Intranet __________| Router |__________ Internet
|________ _ |
|
|
Filter

2. Dual home gateway
This gives least amount of flexibility. Instead of router, we have application gateways.
______________
| Application |
Inside ________ _ | level |___________ Outside
| gateway |
|____________ |
proxy

3. Sreened host Firewall
It's the combination of the above two schemes. Some applications are allowed uninterrupted while some have to be screened. For any reasonable size network, Screened host firewall can get loaded.

_________ ___________
| | | |
Inside _________| Router 1 |_______________________ | Router 2 |______ Outside
|_________| | |__________ |
____|______
| |
| Proxy |
|__________|

The problem with this is that there is only one proxy and thus, it may get overloaded. Therefore, to reduce load, we can use multiple screened host firewalls. And this is what normally used.

_________ __________
| | | |
Inside _____ | Router 1 |______________________________ | Router 2 |_____Outside
|_________| | |__________ |
____|____
| |
| Proxy 1 | Proxy2 .......
|________ |

Modem pool

User can dial and open only a terminal server but he has to give a password. But TELNET and FTP client does not understand proxy. Therefore, people come out with Transparent proxy which means that I have some memory which keeps track of whether this packet was allowed earlier or not and therefore, I need not check this time. Client does not know that there is somebody who is checking my authentication.
So, transparent proxy is used only for checking the IP packets whereas proxy is used when many IP addresses are not available.

Private IP (PIP address)
It is an extension of transparent proxy. Here we also change the IP address (source address) to one of the allocated IP address and send it. So, the client does not know that the IP address has been changed, only the proxy server knows it. The machine that changes the IP address is Network address translator (NAT) . NAT also changes other things like CRC, TCP header checksum ( this is calculated using pseudo IP header). NAT can also change the port number.

e.g.. Port address translation

____________
X -------| |
| NAT |
Y -------|___________ |

X1 , P1 ----> G1 , Pa (IP address, port #)
X1 , P2 ----> G1 , Pb
Y , P3 ----> G1, Pc

I may not like to have global IP address because then, anybody can contact me inspite of these security measures. So, I work with Private IP. In that case, there has to be a one-to-one mapping between private IP and global IP.


Routing in Internet

The Origin of Internet

The response of Internet to the issue of choosing routing tables with complete/par tail information is shown by the following architecture. There are a few nodes having complete routing information and a large number of nodes with partial information. The nodes with complete information, called core gateways, are well connected by a Backbone Network. These nodes talk to each other to keep themselves updated. The non-core gateways are connected to the core gateways. (Historically, this architecture comes from the ARPANET.)

The original internet was structured around a backbone of ARPANET with several core gateways connected to it .These core gateways connected some Local Area Networks (LANs) to the rest of the network. These core gateways talked to themselves and exchanged routing information's. Every core gateway contained complete information about all possible destinations.

How do you do routing ?

The usual IP routing algorithm employs an internet routing table (some times called an IP routing table) on each machine that Stores the information about the possible destinations, and how to reach them.

Default Routes

This technique used to hide information and keep routing table size small consolidates multiple entries into a default case. If no route appears in the routing table, the routing routine sends the data gram to the default router.

Default routing is especially useful when a site has a small set of local addresses and only one connection to the rest of the internet.

Host-Specific Routes

Most IP routing software allows per-host routes to be specified as a special case. Having per-host routes gives the local network administrator more control over network use, permits testing, and can also be used to control access for security purposes. when debugging network connections or routing tables, the ability to specify a special route to one individual machine turns out to be especially useful.

Internet with Two Backbones

As long as there was just one single router connecting ARPANET with NSFNET there was no problem. The core gateways of ARPANET had information about all destinations and the routers inside NSFNET contained information about local destinations and used a default route to send all non-NSFNET traffic to between NSFNET and ARPANET as both of them used different matrices to measure costs. the core gateways through the router between ARPANET and NSFNET. However as multiple connections were made between the two backbones, problems arise. Which route should a packet from net1 to net2 take? Should it be R1 or R2 or R3 or R4 or R5? For this some exchange of routing information between the two backbones was necessary. But, this was again a problem as how should we compare information.

Gateway-To-Gateway Protocol (GGP)

This was the protocol used by the core-routers to exchange routing information among themselves. This is based on Distance Vector Algorithm and uses number of hops as the distance metric. This is a very poor metric as this does not take into account the load on the links and whether a link is slow or fast. A provision is made to manually increment the hop count in case a link is particularly slow.A protocol based on Shortest Path First Algorithm , known as SPREAD ,was also used for the same purpose.

Added Complexity To The Architecture Model

As the number of networks and routers increased, to reduce the load on the core gateways because of the enormous amount of calculations, routing was done with some core gateways keeping complete information and the non-core gateways keeping partial information.

In thisarchitecture, G1 ,G2 ,G3 are all core gateways and G4 and G5 are non-core gateways. We must have a mechanism for someone to tell G2 that it is connected to net2 , net3 and net4 , besides net1. Only G5 can tell this to G2 and so we must provide for a mechanism for G2 to talk to G5 . A concept of one backbone with core gateways connected to Autonomous Systems was developed. An Autonomous system is a group of networks controlled by a single administrative authority. Routers within an autonomous system are free to choose their own mechanisms for discovering , propagating ,validating , and checking the consistency of routes. Each autonomous system must agree to advertise network reachability information to other autonomous systems. Each advertisement propagates through a core router. The assumption made is that most of the routers in the autonomous system have complete information about the autonomous system. One such router will be assigned the task of talking to the core gateway.

Interior Gateway Protocols (IGP)

IGP is a type of protocols used by the routers in an autonomous system to exchange network reachability and routing information. Some of IGPs are given below.

Routing Information Protocol (RIP)

This is one of the most widely used IGP. It was developed at Berkeley. This is also known by the name of the program that implements it, routed .This implements Distance Vector algorithm.Features of RIP:

  • RIP uses a hop count metric to measure the distance to a destination. To compensate for differences in technologies, many RIP implementations allow managers to configure artificially high hop counts when advertising connections to slow networks. All routinfg updates are broadcast. This allows all hosts on the network to know about the routes.
  • To prevent routes from oscillating between two or more equal cost paths, RIP specifies that existing routes should be retained until a new route has strictly lower cost. Since RIP does not explicitly detect routing loops, RIP must either assume participants can be trusted (being part of one autonomous system) or take precautions to prevent such loops.
  • To prevent instabilities, RIP must use a low value for the maximum possible distance.RIP uses 16 as the maximum hop count. This restricts the maximum network diameter of the system to 16.
  • To solve the slow convergence problem arising due to slow propagation of routing information, RIP uses Hold Down. If a particular link is down , any new information about that link is not accepted till some time. This is because the router must wait till the information aboutthe link being down propagates to another router before accepting information from that router about that down link.
  • RIP runs on top of TCP/IP. RIP allows addresses to be of a maximum size of 14 Bytes. The Distance varies from 1 to 16 (where 16 is used to signify infinity). RIP address 0.0.0.0 denotes a default route. There is no explicit size of the RIP message and any number of routes can be advertized.

The message format is as shown:

OSPF(Open Shortest Path First )

This is an Interior Gateway Protocol designed by the Internet Engineering Task Force ( IETF ). This algorithm scales better than the vector distance algorithms. This Protocol tackles several goals:

  • OSPF includes type of service(ToS) routing. So, you can installmultiple routers to a given destination, one for each type of service. When routing a datagram, a router running OSPF uses both the destination address and type of service fields in the IP Header to choose a route.
  • OSPF provides load balancing. If there are multiple routes to a given destination at the same cost, OSPF distributes traffic over all the routes equally.
  • OSPF allows for creation of AREA HIERARCHIES. This makes the growth of the network easier and makes the network at a site easier to manage. Each area is self contained, so, multiple groups within a site can cooperate in the use of OSPF for routing.
  • OSPF protocol specifies that all exchanges between the routers be authenticated. OSPF allows variety of authentication schemes, and even allows one area to choose a different scheme from the other areas.
  • To accomodate multi-access networks like ethernet, OSPF allows every multi-access network to have a designated router( designated gateway).
  • To permit maximum flexibility, OSPF allows the description of a virtual network topology that abstracts away from details of physical connections.
  • OSPF also allows for routers to exchange routing information learned from other sites. The message format distinguishes between information acquired from external sources and information acquired from routers interior to the site, so there is no ambiguity about the source or reliability of routes.
  • It hastoo much overhead of sending LSPs but is gradually becoming popular.

Exterior Gateway Protocol (EGP)

If two routers belonging to two different autonomous systems exchange routing information ,the protocol used is called EGP . EGP consists of:

  • Acquisition Request: A router sends a request to another neighbour router saying 'I want to talk'.
  • Acquisition Confirm: This is a positive reply to the Acquisition request.
  • Acquisition Refuse: This is a negative response to the Acquisition request.
  • Cease Request: This requests termination of neighbour relationship.
  • Cease Confirm: This is a confirmation response to the Cease Request.
  • Hello : This is used to find if the neighbour router is up or down.This requests router to respond if alive.
  • I Heard You: This is a response to the Hello message confirming that the router is alive. Because it is possible for Hello or I Heard You messages to be lost in transit, EGP uses a k-out-of-n rule to determine whether a network is down.At least k of the last n messages must fail for the router to declare its neighbour down.
  • Poll Request: This is a request for network routing update.
  • Routing Update: This conveys routing information about reachable networks to its EGP neighbour. The routing information is the distance vector of the reachable networks.
  • Error: This is a response to an incorrect message.

EGP is used only to find network reachability and not for differentiating between good and bad routes. We can only use distance metric to declare a route plausible and not for comparing it with some other route (unless the two route form part of a same autonomous system). Since there cannot be two different routes to the same network, EGP restricts the topology of any internet to a tree structure in which a core system forms the root. There are no loops among other autonomous systems connected to it. This leads to several problems:

  • Univerasal connectivity fails if the core gateway system fails.
  • EGP can advertise only one path to a given network.
  • EGP does not support load sharing on routers between arbitrary autonomous systems.
  • Multiple backbone networks with multiple connections between them cannot be handled by EGP.

Border Gateway Protocol(BGP)

BGP is a distance-vector protocol used to communicate between different ASes. Instead of maintaining just the cost to each destination,each BGP router keeps track of the exact path used.Similarly,instead of periodically giving each neighbour its estimated cost to each destination, each BGP router tells its neighbours the path it is using.Every BGP router contains a module that examines routes to a given destination and scores them returning a number for destination to each route. Any route violating a policy constraint automatically gets a score of infinity. The router adapts a route with shortest distance.The scoring function is not a part of the BGP protocol and can be any function that the system managers want.BGP easily solves the count to infinity problem that plagues other distance-vector algorithms as whole path is known.

DNS(Continued), BOOTP, DHCP

Resource Record

A Resource Record (RR) has the following:
  • owner which is the domain name where the RR is found.
  • type which is an encoded 16 bit value that specifies the type of the resource in this resource record. It can be one of the following:
    • A a host address
    • CNAME identifies the canonical name of an alias
    • HINFO identifies the CPU and OS used by a host
    • MX identifies a mail exchange for the domain.
    • NS the authoritative name server for the domain
    • PTR a pointer to another part of the domain name space
    • SOA identifies the start of a zone of authority class which is an encoded 16 bit value which identifies a protocol family or instance of a protocol.
  • class One of: IN the Internet system or CH the Chaos system
  • TTL which is the time to live of the RR. This field is a 32 bit integer in units of seconds, an is primarily used by resolvers when they cache RRs. The TTL describes how long a RR can be cached before it should be discarded.
  • RDATA Data in this field depends on the values of the type and class of the RR and a description for each is as follows:
    • for A: For the IN class, a 32 bit IP address For the CH class, a domain name followed by a 16 bit octal Chaos address.
    • for CNAME: a domain name.
    • for MX: a 16 bit preference value (lower is better) followed by a host name willing to act as a mail exchange for the owner domain.
    • for NS: a host name.
    • for PTR: a domain name.
    • for SOA: several fields.
Note: While short TTLs can be used to minimize caching, and a zero TTL prohibits caching, the realities of Internet performance suggest that these times should be on the order of days for the typical host. If a change can be anticipated, the TTL can be reduced prior to the change to minimize inconsistency during the change, and then increased back to its former value following the change. The data in the RDATA section of RRs is carried as a combination of binary strings and domain names. The domain names are frequently used as "pointers" to other data in the DNS.

Aliases and Cannonical Names

Some servers typically have multiple names for convenience. For example www.iitk.ac.in & yamuna.iitk.ernet.in identify the same server. In addition multiple mailboxes might be provided by some organizations. Most of these systems have a notion that one of the equivalent set of names is the canonical or primary name and all others are aliases.

When a name server fails to find a desired RR in the resource set associated with the domain name, it checks to see if the resource set consists of a CNAME record with a matching class. If so, the name server includes the CNAME record in the response and restarts the query at the domain name specified in the data field of the CNAME record.

Name Servers

Name servers are the repositories of information that make up the domain database. The database is divided up into sections called zones, which are distributed among the name servers. Name servers can answer queries in a simple manner; the response can always be generated using only local data, and either contains the answer to the question or a referral to other name servers "closer" to the desired information. The way that the name server answers the query depends upon whether it is operating in recursive mode or iterative mode:
  • The simplest mode for the server is non-recursive, since it can answer queries using only local information: the response contains an error, the answer, or a referral to some other server "closer" to the answer. All name servers must implement non-recursive queries.
  • The simplest mode for the client is recursive, since in this mode the name server acts in the role of a resolver and returns either an error or the answer, but never referrals. This service is optional in a name server, and the name server may also choose to restrict the clients which can use recursive mode.

Recursive Query vs Iterative Query

If the server is supposed to answer a recursive quesry then the response is either the reource record data or a error code. A server operating in this mode will never return the name of any forwarding name server but will contact the appropiate name server itself and try to get the information.

In iterative mode, on the other hand, if the server does not have the information requested locally then it return the address of some name server who might have the information about the query. It is then the responsibility of the contacting application to contact the next name server to resolve its query and do this iteratively until gets an answer or and error.

Relative Names

In place of giving full DNS names like cu2.cse.iitk.ac.in or bhaskar.cc.iitk.ac.in one can give just cu2 or bhaskar.This can be used by the server side as well as the client side.But for this one has to manually specify these extensions in the database of the servers holding the resource records.

BOOTP

The BOOTP uses UDP/IP. It is run when the machine boots. The protocol allows diskless machines to discover their IP address and the address of the server host. Additionally name of the file to be loaded from memory and executed is also supplied to the machine. This protocol is an improvement over RARP which has the follwing limitations:
  1. Networks which do not have a broadcast method can't support RARP as it uses the broadcast method of the MAC layer underneath the IP layer.
  2. RARP is heavily dependent on the MAC protocol.
  3. RARP just supplies the IP address corresponding to a MAC address It doesn't support respond with any more data.
  4. RARP uses the computer hardware's address to identify the machine and hence cannot be used in networks that dynamically assign hardware addresses.

Events in BOOTP

  1. The Client broadcasts its MAC address (or other unique hardware identity number) asking for help in booting.
  2. The BOOTP Server responds with the data that specifies how the Client should be configured (pre-configured for the specific client)
Note: BOOTP doesn't use the MAC layer broadcast but uses UDP/IP.

Configuration Information

The important informations provided are:
  • IP address
  • IP address of the default router for that particular subnet
  • Subnet mask
  • IP addresses of the primary and secondary nameservers
Additionaly it may also provide:
  • Time offset from GMT
  • The IP address of a time server
  • The IP address of a boot server
  • The name of a boot file (e.g. boot image for X terminals)
  • The IP domain name for the client
But the problem with BOOTP is that it again can't be used for the dynamic IP's as in RARP servers.For getting dynamic IP's we use DHCP.

DHCP (Dynamic Host Configuration Protocol)

DHCP (Dynamic Host Configuration Protocol) is a protocol that lets network administrators manage centrally and automate the assignment of Internet Protocol (IP) addresses in an organization's network. If a machine uses Internet's set of protocol (TCP/IP), each machine that can connect to the Internet needs a unique IP address. When an organization sets up its computer users with a connection to the Internet, an IP address must be assigned to each machine. Without DHCP, the IP address must be entered manually at each computer and, if computers move to another location in another part of the network, a new IP address must be entered. DHCP lets a network administrator supervise and distribute IP addresses from a central point and automatically sends a new IP address when a computer is plugged into a different place in the network.

IP Address Allocation Mechanism

DHCP supports three mechanisms for IP address allocation.
  • Automatic allocation: DHCP assigns a permanent IP address to a host.
  • Dynamic allocation: DHCP assigns an IP address to a host for a limited period of time (or until the host explicitly relinquishes the address).
  • Manual allocation: Host's IP address is assigned by the network administrator, and DHCP is used simply to convey the assigned address to the host. A particular network will use one or more of these mechanisms, depending on the policies of the network administrator.

Messages Used by DHCP

  • DHCP Discover - Client broadcast to locate available servers. It is assumed atleast one of the servers will have resources to fulfill the request.( may include additional pointers to specific services required eg. particular subnet, minimum time limit etc ).
  • DHCP Offer - Server to client in response to DHCP Discover with offer of configration parameters.
  • DHCP Request - Client broadcast to servers requesting offered parameters from one server and implicitly declining offers from all others.( also important in case of lease renewal if the alloted time is about to expire ).
  • DHCP Decline - Client to server indicating configration parameters invalid.
  • DHCP Release - Client to server relinquishing network address and cancelling current lease.( in case of a graceful shut down DHCP server is sent a DHCP Release by the host machine).
  • DHCP Ack - Server to client with configration parameters, including committed Network address.
  • DHCP Nack - Server to client refusing request for configratin parameters (eg. requested network address already allocated).

Timers Used

Note that lease time is the time specified by the server for which the services have been provided to the client.
  • Lease Renewal Timer - When this timer expires machine will ask the server for more time sending a DHCP Request.
  • Lease Rebinding Timer - Whenever this timer expires, we have not been receiving any response from the server and so we can assume the server is down. Thus send a DHCP Request to all the servers using IP Broadcast facility. This is only point of difference between Lease renewal and rebinding.
  • Lease Expiry Timer - Whenever this timer expires, the system will have to start crashing as the host does not have a valid IP address in the network.

Timer Configuration Policy

The timers have this usual setting which can be configured depending upon the usage pattern of the network. An example setting has been discussed below. Lease Renewal = 50 % Lease time
Lease Rebinding = 87.5 % Lease time
Lease Expiry = 100 % Lease time

Transport Layer Protocol (TCP)

What is TCP?

TCP was specifically designed to provide a reliable end to end byte stream over an unreliable internetwork. Each machine supporting TCP has a TCP transport entity either a user process or part of the kernel that manages TCP streams and interface to IP layer. A TCP entity accepts user data streams from local processes, breaks them up into pieces not exceeding 64KB and sends each piece as a separate IP datagram. Client Server mechanism is not necessary for TCP to behave properly.

The IP layer gives no guarantee that datagram will be delivered properly, so it is up to TCP to timeout and retransmit, if needed. Duplicate, lost and out of sequence packets are handled using the sequence number, acknowledgements, retransmission, timers, etc to provide a reliable service. Connection is a must for this service.Bit errors are taken care of by the CRC checksum. One difference from usual sequence numbering is that each byte is given a number instead of each packet. This is done so that at the time of transmission in case of loss, data of many small packets can be combined together to get a larger packet, and hence smaller overhead.

TCP connection is a duplex connection. That means there is no difference between two sides once the connection is established.

TCP Connection establishment

The "three-way handshake" is the procedure used to establish a connection. This procedure normally is initiated by one TCP and responded to by another TCP. The procedure also works if two TCP simultaneously initiate the procedure. When simultaneous attempt occurs, each TCP receives a "SYN" segment which carries no acknowledgment after it has sent a "SYN". Of course, the arrival of an old duplicate "SYN" segment can potentially make it appear, to the recipient, that a simultaneous connection initiation is in progress. Proper use of "reset" segments can disambiguate these cases.

The three-way handshake reduces the possibility of false connections. It is the implementation of a trade-off between memory and messages to provide information for this checking.

The simplest three-way handshake is shown in figure below. The figures should be interpreted in the following way. Each line is numbered for reference purposes. Right arrows (-->) indicate departure of a TCP segment from TCP A to TCP B, or arrival of a segment at B from A. Left arrows (<--), indicate the reverse. Ellipsis (...) indicates a segment which is still in the network (delayed). TCP states represent the state AFTER the departure or arrival of the segment (whose contents are shown in the center of each line). Segment contents are shown in abbreviated form, with sequence number, control flags, and ACK field. Other fields such as window, addresses, lengths, and text have been left out in the interest of clarity.

      TCP A                                                TCP B

1. CLOSED LISTEN

2. SYN-SENT --> --> SYN-RECEIVED

3. ESTABLISHED <-- <-- SYN-RECEIVED

4. ESTABLISHED --> --> ESTABLISHED

5. ESTABLISHED --> --> ESTABLISHED

Basic 3-Way Handshake for Connection Synchronisation

In line 2 of above figure, TCP A begins by sending a SYN segment indicating that it will use sequence numbers starting with sequence number 100. In line 3, TCP B sends a SYN and acknowledges the SYN it received from TCP A. Note that the acknowledgment field indicates TCP B is now expecting to hear sequence 101, acknowledging the SYN which occupied sequence 100.

At line 4, TCP A responds with an empty segment containing an ACK for TCP B's SYN; and in line 5, TCP A sends some data. Note that the sequence number of the segment in line 5 is the same as in line 4 because the ACK does not occupy sequence number space (if it did, we would wind up ACKing ACK's!).


Simultaneous initiation is only slightly more complex, as is shown in figure below. Each TCP cycles from CLOSED to SYN-SENT to SYN-RECEIVED to ESTABLISHED.

      TCP A                                            TCP B

1. CLOSED CLOSED

2. SYN-SENT --> ...

3. SYN-RECEIVED <-- <-- SYN-SENT

4. ... --> SYN-RECEIVED

5. SYN-RECEIVED --> ...

6. ESTABLISHED <-- <-- SYN-RECEIVED

7. ... --> ESTABLISHED

Simultaneous Connection Synchronisation
Question: Why is three-way handshake needed? What is the problem if we send only two packets and consider the connection established? What will be the problem from application's point of view? Will the packets be delivered to the wrong application?

Problem regarding 2-way handshake
The only real problem with a 2-way handshake is that duplicate packets from a previous connection( which has been closed) between the two nodes might still be floating on the network. After a SYN has been sent to the responder, it might receive a duplicate packet of a previous connection and it would regard it as a packet from the current connection which would be undesirable.
Again spoofing is another issue of concern if a two way handshake is used.Suppose there is a node C which sends connection request to B saying that it is A.Now B sends an ACK to A which it rejects & asks B to close connection.Beteween these two events C can send a lot of packets which will be delievered to the application..


The first two figures show how a three way handshake deals with problems of duplicate/delayed connection requests and duplicate/delayed connection acknowledgements in the network.The third figure highlights the problem of spoofing associated with a two way handshake.

Some Conventions
1. The ACK contains 'x+1' if the sequence number received is 'x'.
2. If 'ISN' is the sequence number of the connection packet then 1st data packet has the seq number 'ISN+1'
3. Seq numbers are 32 bit.They are byte seq number(every byte has a seq number).With a packet 1st seq number and length of the packet is sent.
4. Acknowlegements are cummulative.
5. Acknowledgements have a seq number of their own but with a length 0.So the next data packet have the seq number same as ACK.

Connection Establish

  • The sender sends a SYN packet with serquence numvber say 'x'.
  • The receiver on receiving SYN packet responds with SYN packet with sequence number 'y' and ACK with seq number 'x+1'
  • On receiving both SYN and ACK packet, the sender responds with ACK packet with seq number 'y+1'
  • The receiver when receives ACK packet, initiates the connection.

Connection Release

  • The initiator sends a FIN with the current sequence and acknowledgement number.
  • The responder on receiving this informs the application program that it will receive no more data and sends an acknowledgement of the packet. The connection is now closed from one side.
  • Now the responder will follow similar steps to close the connection from its side. Once this is done the connection will be fully closed.


Image References

Computer Networks (CS425)- ARP,RARP,ICMP Protocols

Address Resolution Protocol

If a machine talks to another machine in the same network, it requires its physical or MAC address. But ,since the application has given the destination's IP address it requires some mechanism to bind the IP address with its MAC address.This is done through Address Resolution protocol (ARP).IP address of the destination node is broadcast and the destination node informs the source of its MAC address.
  1. Assume broadcast nature of LAN
  2. Broadcast IP address of the destination
  3. Destination replies it with its MAC address.
  4. Source maintains a cache of IP and MAC address bindings
But this means that every time machine A wants to send packets to machine B, A has to send an ARP packet to resolve the MAC address of B and hence this will increase the traffic load too much, so to reduce the communication cost computers that use ARP maintains a cache of recently acquired IP_to_MAC address bindings, i.e. they dont have to use ARP repeatedly. ARP Refinements Several refinements of ARP are possible: When machine A wants to send packets to macine B, it is possible that machine B is going to send packets to machine A in the near future.So to avoid ARP for machine B, A should put its IP_to_MAC address binding in the special packet while requesting for the MAC address of B. Since A broadcasts its initial request for the MAC address of B, every machine on the network should extract and store in its cache the IP_to_MAC address binding of A When a new machine appears on the network (e.g. when an operating system reboots) it can broadcast its IP_to_MAC address binding so that all other machines can store it in their caches. This will eliminate a lot of ARP packets by all other machines, when they want to communicate with this new machine.

Example displaying the use of Address Resolution Protocol:

Consider a scenario where a computer tries to contact some remote machine using ping program, assuming that there has been no exchange of IP datagrams previously between the two machines and therefore arp packet must be sent to identify the MAC address of the remote machine.
The arp request message (who is A.A.A.A tell B.B.B.B where the two are IP addresses) is broadcast on the local area network with an Ethernet protocol type 0x806. The packet is discarded by all the machines except the target machine which responds with an arp response message (A.A.A.A is hh:hh:hh:hh:hh:hh where hh:hh:hh:hh:hh:hh is the Ethernet source address). This packet is unicast to the machine with IP address B.B.B.B. Since the arp request message included the hardware address (Ethernet source address) of the requesting computer, target machine doesn't require another arp message to figure it out.

Reverse Address Resolution Protocol

RARP is a protocol by which a physical machine in a local area network can request to learn its IP address from a gateway server's Address Resolution Protocol table or cache. This is needed since the machine may not have permanently attacded disk where it can store its IP address permanently. A network administrator creates a table in a local area network's gateway router that maps the physical machine (or Medium Access Control - MAC) addresses to corresponding Internet Protocol addresses. When a new machine is set up, its RARP client program requests from the RARP server on the router to be sent its IP address. Assuming that an entry has been set up in the router table, the RARP server will return the IP address to the machine which can store it for future use.

Detailed Mechanism

Both the machine that issues the request and the server that responds use physical network addresses during their brief communication. Usually, the requester does not know the physical address. So, the request is broadcasted to all the machines on the network. Now, the requester must identify istelf uniquely to the server. For this either CPU serial number or the machine's physical network address can be used. But using the physical address as a unique id has two advantages.
  • These addresses are always available and do not have to be bound into bootstrap code.
  • Because the identifying information depends on the network and not on the CPU vendor, all machines on a given network will supply unique identifiers.
Request:
Like an ARP message, a RARP message is sent from one machine to the another encapsulated in the data portion of a network frame. An ethernet frame carrying a RARP request has the usual preamle, Ethernet source and destination addresses, and packet type fields in front of the frame. The frame conatins the value 8035 (base 16) to identify the contents of the frame as a RARP message. The data portion of the frame contains the 28-octet RARP message. The sender braodcasts a RARP request that specifies itself as both the sender and target machine, and supplies its physical network address in the target hardware address field. All machines on the network receive the request, but only those authorised to supply the RARP services process the request and send a reply, such machines are known informally as RARP servers. For RARP to succeed, the network must contain at least one RARP server.
Reply:
Servers answers request by filling in the target protocol address field, changing the message type from request to reply, and sending the reply back directly to the machine making the request.

Timing RARP Transactions
Since RARP uses the physical network directly, no other protocol software will time the response or retransmit the request. RARP software must handle these tasks. Some workstations that rely on RARP to boot, choose to retry indefinitely until the receive a response. Other implementations announce failure after only a few tries to avoid flooding the network with unnecessary broadcast.

Mulitple RARP Servers
Advantage: More reliability. Diadvantage: Overloading may result when all servers respond. So, to get away with disadvantage we have primary and secondary servers. Each machine that makes RARP request is assigned a primary server. Normally, the primary server responds but if it fails, then requester may time out and rebroadcast the request.Whenever a secondary server receives a second copy of the request within a short time of the first, it responds. But, still there might be a problem that all secondary servers respond, thus overloading the network. So, the solution adopted is to avoid having all secondary servers transmit responses simultaneously. Each secondary server that receives the request computes a random delay and then sends a response.

Drawbacks of RARP
  • Since it operates at low level, it requires direct addresss to the network which makes it difficult for an application programmer to build a server.
  • It doesn't fully utilizes the capability of a network like ethernet which is enforced to send a minimum packet size since the reply from the server contains only one small piece of information, the 32-bit internet address.
RARP is formally described in RFC903.

ICMP

This protocol discusses a mechanism that gateways and hosts use to communicate control or error information.The Internet protocol provides unreliable,connectionless datagram service,and that a datagram travels from gateway to gateway until it reaches one that can deliver it directly to its final destination. If a gateway cannot route or deliver a datagram,or if the gateway detects an unusual condition, like network congestion, that affects its ability to forward the datagram, it needs to instruct the original source to take action to avoid or correct the problem. The Internet Control Message Protocol allows gateways to send error or control messages to other gateways or hosts;ICMP provides communication between the Internet Protocol software on one machine and the Internet Protocol software on another. This is a special purpose message mechanism added by the designers to the TCP/IP protocols. This is to allow gateways in an internet to report errors or provide information about unexpecter circumstances. The IP protocol itself contains nothing to help the sender test connectivity or learn about failures.

Error Reporting vs Error Correction
ICMP only reports error conditions to the original source; the source must relate errors to individual application programs and take action to correct problems. It provides a way for gateway to report the error It does not fully specify the action to be taken for each possible error. ICMP is restricted to communicate with the original source but not intermediate sources.
ICMP Message Delivery
ICMP messages travel across the internet in the data portion of an IP datagram,which itself travels across the internet in the data portion of an IP datagram,which itself travels across each physical network in the data portion of a frame.Datagrams carryin ICMP messages are routed exactly like datagrams carrying information for users;there is no additional reliability or priority.An exception is made to the error handling procedures if an IP datagram carrying an ICMP messages are not generated for errors that result from datagrams carrying ICMP error messages.
ICMP Message Format
It has three fields;an 8-bit integer message TYPE field that identifies the message,an 8-bit CODE field that provides further information about the message type,and a 16-bit CHECKSUM field(ICMP uses the same additive checksum algorithm as IP,but the ICMP checksum only covers the ICMP message).In addition , ICMP messages that report errors always include the header and first 64 data bits of the datagram causing the problem. The ICMP TYPE field defines the meaning of the message as well as its format.
The Types include :

TYPE FIELD ICMP MESSAGE TYPE

0 ECHO REPLY
3 DESTINATION UNREACHABLE
4 SOURCE QUENCH
5 REDIRECT(CHANGE A ROUTE)
8 ECHO REQUEST
11 TIME EXCEEDED FOR A DATAGRAM
12 PARAMETER PROBLEM ON A DATAGRAM
13 TIMESTAMP REQUEST
14 TIMESTAMP REPLY
15 INFORMATION REQUEST(OBSOLETE)
16 INFORMATION REPLY(OBSOLETE)
17 ADDRESS MASK REQUEST
18 ADDRESS MASK REPLY TESTING DESTINATION

Reachabilty and Status :
TCP/IP protocols provide facilities to help network managers or users identify network problems.One of the most frequently used debugging tools invokes the ICMP echo request and echo reply messages.A host or gateway sends an ICMP echo request message to a specified destination.Any machine that receives an echo request formulates an echo reply and returns to the original sender.The request contains an optional data area; the reply contains a copy of the data sent in the request.The echo request and associated reply can be used to test whether a destination is reachable and responding.Because both the request and reply travel in IP datagrams,successful receipt of a reply verifies that major pieces of the transport system work.
1.1 : IP software on the source must route the datagram
2.2 : Intermediate gateways between the source and destination must be operating and must route datagram correctly.
3.3 : The destination machine must be running , and both ICMP and IP software must be working.
4.4 : Routes in gateways along the return path must be correct.

Echo Request and Reply
The field listed OPTIONAL DATA is a variable length field that contains data to be returned to the sender.An echo reply always returns exactly the same data as was received in the request.Fields IDENTIFIER and SEQUENCE NUMBER are used by the sender to match replies to request.The value of the TYPE field specifies whether the message is a request(8) or a reply(0).

Reports of Unreachable Destinations

The Code field in a destination unreachable message contains an integer that further describes th problem.Possible values are :

CODE VALUE MEANING

0 NETWORK UNREACHABLE
1 HOST UNREACHABLE
2 PROTOCOL UNREACHABLE
3 PORT UNREACHABLE
4 FRAGMENTATION NEEDED AND DF SET
5 SOURCE ROOT FAILED
6 DESTINATION NETWORK UNKNOWN
7 DESTINATION HOST UNKNOWN
8 SOURCE HOST ISOLATED
9 COMMUNICATION WITH DESTINATION NETWORK ADMINISTRATIVELY PROHIBITED
10 COMMUNICATION WTTH DESTINATION HOST ADMINISTRATIVELY PROHIBITED
11 NETWORK UNREACHABLE FOR TYPE OF SERVICE
12 HOST UNREACHABLE FOR TYPE OF SERVICE

Whenever an error prevents a gateway from routing or delivering a datagram, the gateway sends a destination unreachable message back to the source and then drops the datagram.Network unreachable errors usually imply roting failures ; host unreachable errors imply delivery failures.Because the message contains a short prefix of the datagram that caused the problem, the source will know exactly which address is unreachable. Destinations may be unreachable because hardware is temporarily out of service, because the sender specified a nonexistent destination address, or because the gateway does not have a route to the destination network. Although gateways send destination unreachable messages if they cannot route or deliver datagrams, not all such errors can be detected.If the datagram contains the source route option with an incorrect route, it may trigger a source route failure message.If a gateway needs to fragment adatagram but the "don't fragment" bit is set, the gateway sends a fragmentation needed message back to the source.

Congestion and Datagram Flow Control :
Gateways cannot reserve memory or communication resources in advance of receiving datagrams because IP is connectionless. The result is, gateways can overrun with traffic, a condition known as congestion.Congestion arises due to two reasons :

  1. A high speed computer may be able to generate traffic faster than a network can transfer it .
  2. If many computers sumultaneously need to send datagrams through a single gateway , the gateway can experience congestion, even though no single source causes the problem.
When datagrams arrive too quickly for a host or a gateway to process, it enqueues them in memory temporarily.If the traffic continues, the host or gateway eventually exhausts menory ans must discard additional datagrams that arrive. A machine uses ICMP source quench messages to releive congestion. A source quench message is a request for the source to reduce its current rate of datagram transmission.
There is no ICMP messages to reverse the effect of a source quench.
Source Quench :
Source quench messages have a field that contains a datagram prefix in addition to the usual ICMP TYPE,CODE,CHECKSUM fields.Congested gateways send one source quench message each time they discard a datagram; the datagram prefix identifies the datagram that was dropped.

Route Change Requests From Gateways :
Internet routing tables are initialized by hosts from a configuration file at system startup, and system administrators seldom make routing changes during normal operations.Gateways exchange routing information periodically to accomadate network changes and keep their routes up-to-date.The general rule is , Gateways are assumed to know correct routes; host begin wint minimal routing information and learn new routes from gateways. The GATEWAY INTERNET ADDRESS field contains the address of a gateway that the host is to use to reach the destination mentioned in the datagram header. The INTERNET HEADER field contains IP header plus the next 64 bits of the datagram that triggered the message.The CODE field of an ICMP redirect message further specifies how to interpret the destination address, based on values assigned as follows :

Code Value Meaning
0 REDIRECT DATAGRAMS FOR THE NET
1 REDIRECT DATAGRAMS FOR THE HOST
2 REDIRECT DATAGRAMS FOR THE TYPE OF SERVICE AND NET
3 REDIRECT DATAGRAMS FOR THE TYPE OF SERVICE AND HOST
Gateways only send ICMP redirect requests to hosts and not to other gateways.

Detecting Circular or Excessively Long Routes :
Internet gateways compute a next hop using local tables, errors in routing tables can produce a routing cycle for some destination. A routing cycle can consist of two gateways that each route a datagram for a particular destination to other, or it can consist of several gateways.To prevent datagrams from circling forever in a TCP/IP internet, each IP datagram contains a time-to-live counter , sometimes called a hop count. A gateway decrements the time-to-live counter whenever it processes the datagram and discards the datagram when the count reaches zero. Whenever a gateway discards a datagram because its hop count has reached zero or because a timeout occured while waiting for fragments of a datagram ,it sends an ICMP time exceeded message back to the datagram's source, A gateway sends this message whenever a datagram is discarded because the time-to-live field in the datagram header has reached zero or because its reassembly timer expired while waiting for fragments.
The code field explains the nature of the timeout :
Code Value Meaning
0 TIME-TO-LIVE COUNT EXCEEDED
1 FRAGMENT REASSEMBLY TIME EXCEEDED
Fragment reassembly refers to the task of collecting all the fragments from a datagram.

Reprting Other Problems :
When a gateway or host finds problems with a datagram not covered by previous ICMP error messages it sends a parameter problem message to the original source.To make the message unambigous, the sender uses the POINTER field in the message header to identify the octet in the datagram that caused the problem. Code 1 is used to report that a required option is missing; the POINTER field is not used for code 1.

Clock Synchronization nd Transmit the estimation :
ICMP messages are used to obtain the time from another machine.A requesting machine sends an ICMP timestamp request message to another machine, asking that the second machine return its current value of the time of day. The receiving machine returns a timestamp reply back to the machine making the request. TCP/IP protocol suite includes several protocols that can be used to synchronize clocks. This is one of the simplest techniques used by TCP/IP. The TYPE field idintifies the message as a request (13 ) or a reply ( 14 ); the IDENTIFIER and SEQUENCE NUMBER fields are used by the source to associate replies with requests.The ORIGINATE TIMESTAMP filed is filled in by the original sendet just before the packet is transmitted, the RECEIVE TIMESTAMP field is filled immediately upon receipt of a request, and the TRANSMIT TIMESTAMP field is filled immediately before the reply is transmitted. Hosts use the three timestamp fields to compute estimates of the delay time between them and to synchronize their clock.A host can compute the total time required for a request to travel to a destination, be transformed into a reply, and return. In practice, accurate estimation of round-trip delay can be difficult and substantially restirct the utility of ICMP timestanp messages.To obtain an accurate estimate to round trip delay one must take many measurements and average them.

Obtaining a Subnet Mask:
Subnet addressing is used by the hosts to extract some bits in the hostid portion of their IP address to identify a physical network.To participate in subnet addressing, hosts need to know which bits of the 32-bit internet address correspond to the physical network and which correspond to host identifiers. The information needed to interpret the address is represented in a 32-bit quatity called the subnet mask. To learn the subnet mask used for the local network, a machine can send an address mask request message to a gateway and receive an address mask reply. The TYPE field in an address mask message specifies whether the message is a request ( 17 ) or a reply ( 18 ). A reply contains the nework's subnet address mask in the ADDRESS MASK field.The IDENTIFIER and SEQUENCE NUMBER fields allow a machine to associate replies with requests.

Computer Networks (CS425)- Routing Algorithms

Non-Hierarchical Routing

In this type of routing, interconnected networks are viewed as a single network, where bridges, routers and gateways are just additional nodes.
  • Every node keeps information about every other node in the network
  • In case of adaptive routing, the routing calculations are done and updated for all the nodes.
The above two are also the disadvantages of non-hierarchical routing, since the table sizes and the routing calculations become too large as the networks get bigger. So this type of routing is feasible only for small networks.

Hierarchical Routing

This is essentially a 'Divide and Conquer' strategy. The network is divided into different regions and a router for a particular region knows only about its own domain and other routers. Thus, the network is viewed at two levels:
  1. The Sub-network level, where each node in a region has information about its peers in the same region and about the region's interface with other regions. Different regions may have different 'local' routing algorithms. Each local algorithm handles the traffic between nodes of the same region and also directs the outgoing packets to the appropriate interface.
  2. The Network Level, where each region is considered as a single node connected to its interface nodes. The routing algorithms at this level handle the routing of packets between two interface nodes, and is isolated from intra-regional transfer.
Networks can be organized in hierarchies of many levels; e.g. local networks of a city at one level, the cities of a country at a level above it, and finally the network of all nations.

In Hierarchical routing, the interfaces need to store information about:

  • All nodes in its region which are at one level below it.
  • Its peer interfaces.
  • At least one interface at a level above it, for outgoing packages.
Advantages of Hierarchical Routing :
  • Smaller sizes of routing tables.
  • Substantially lesser calculations and updates of routing tables.
Disadvantage :
  • Once the hierarchy is imposed on the network, it is followed and possibility of direct paths is ignored. This may lead to sub optimal routing.

Source Routing

Source routing is similar in concept to virtual circuit routing. It is implemented as under:
  • Initially, a path between nodes wishing to communicate is found out, either by flooding or by any other suitable method.
  • This route is then specified in the header of each packet routed between these two nodes. A route may also be specified partially, or in terms of some intermediate hops.
Advantages:
  • Bridges do not need to lookup their routing tables since the path is already specified in the packet itself.
  • The throughput of the bridges is higher, and this may lead to better utilization of bandwidth, once a route is established.
Disadvantages:
  • Establishing the route at first needs an expensive search method like flooding.
  • To cope up with dynamic relocation of nodes in a network, frequent updates of tables are required, else all packets would be sent in wrong direction. This too is expensive.

Policy Based Routing

In this type of routing, certain restrictions are put on the type of packets accepted and sent. e.g.. The IIT- K router may decide to handle traffic pertaining to its departments only, and reject packets from other routes. This kind of routing is used for links with very low capacity or for security purposes.

Shortest Path Routing

Here, the central question dealt with is 'How to determine the optimal path for routing ?' Various algorithms are used to determine the optimal routes with respect to some predetermined criteria. A network is represented as a graph, with its terminals as nodes and the links as edges. A 'length' is associated with each edge, which represents the cost of using the link for transmission. Lower the cost, more suitable is the link. The cost is determined depending upon the criteria to be optimized. Some of the important ways of determining the cost are:
  • Minimum number of hops: If each link is given a unit cost, the shortest path is the one with minimum number of hops. Such a route is easily obtained by a breadth first search method. This is easy to implement but ignores load, link capacity etc.
  • Transmission and Propagation Delays: If the cost is fixed as a function of transmission and propagation delays, it will reflect the link capacities and the geographical distances. However these costs are essentially static and do not consider the varying load conditions.
  • Queuing Delays: If the cost of a link is determined through its queuing delays, it takes care of the varying load conditions, but not of the propagation delays.
Ideally, the cost parameter should consider all the above mentioned factors, and it should be updated periodically to reflect the changes in the loading conditions. However, if the routes are changed according to the load, the load changes again. This feedback effect between routing and load can lead to undesirable oscillations and sudden swings.

Routing Algorithms

As mentioned above, the shortest paths are calculated using suitable algorithms on the graph representations of the networks. Let the network be represented by graph G ( V, E ) and let the number of nodes be 'N'. For all the algorithms discussed below, the costs associated with the links are assumed to be positive. A node has zero cost w.r.t itself. Further, all the links are assumed to be symmetric, i.e. if di,j = cost of link from node i to node j, then d i,j = d j,i . The graph is assumed to be complete. If there exists no edge between two nodes, then a link of infinite cost is assumed. The algorithms given below find costs of the paths from all nodes to a particular node; the problem is equivalent to finding the cost of paths from a source to all destinations.

Bellman-Ford Algorithm

This algorithm iterates on the number of edges in a path to obtain the shortest path. Since the number of hops possible is limited (cycles are implicitly not allowed), the algorithm terminates giving the shortest path.

Notation:
d i,j = Length of path between nodes i and j, indicating the cost of the link.
h = Number of hops.
D[ i,h] = Shortest path length from node i to node 1, with upto 'h' hops.
D[ 1,h] = 0 for all h .

Algorithm :

Initial condition : D[ i, 0] = infinity, for all i ( i != 1 )

Iteration : D[i, h+1] = min { di,j + D[j,h] } over all values of j .

Termination : The algorithm terminates when

D[i, h] = D [ i, h+1] for all i .

Principle:
For zero hops, the minimum length path has length of infinity, for every node. For one hop the shortest-path length associated with a node is equal to the length of the edge between that node and node 1. Hereafter, we increment the number of hops allowed, (from h to h+1 ) and find out whether a shorter path exists through each of the other nodes. If it exists, say through node 'j', then its length must be the sum of the lengths between these two nodes (i.e. di,j ) and the shortest path between j and 1 obtainable in upto h paths. If such a path doesn't exist, then the path length remains the same. The algorithm is guaranteed to terminate, since there are utmost N nodes, and so N-1 paths. It has time complexity of O ( N3 ) .

Dijkstra's Algorithm

Notation:
Di = Length of shortest path from node 'i' to node 1.
di,j = Length of path between nodes i and j .

Algorithm
Each node j is labeled with Dj, which is an estimate of cost of path from node j to node 1. Initially, let the estimates be infinity, indicating that nothing is known about the paths. We now iterate on the length of paths, each time revising our estimate to lower values, as we obtain them. Actually, we divide the nodes into two groups ; the first one, called set P contains the nodes whose shortest distances have been found, and the other Q containing all the remaining nodes. Initially P contains only the node 1. At each step, we select the node that has minimum cost path to node 1. This node is transferred to set P. At the first step, this corresponds to shifting the node closest to 1 in P. Its minimum cost to node 1 is now known. At the next step, select the next closest node from set Q and update the labels corresponding to each node using :

Dj = min [ Dj , Di + dj,i ]

Finally, after N-1 iterations, the shortest paths for all nodes are known, and the algorithm terminates.

Principle
Let the closest node to 1 at some step be i. Then i is shifted to P. Now, for each node j , the closest path to 1 either passes through i or it doesn't. In the first case Dj remains the same. In the second case, the revised estimate of Dj is the sum Di + di,j . So we take the minimum of these two cases and update Dj accordingly. As each of the nodes get transferred to set P, the estimates get closer to the lowest possible value. When a node is transferred, its shortest path length is known. So finally all the nodes are in P and the Dj 's represent the minimum costs. The algorithm is guaranteed to terminate in N-1 iterations and its complexity is O( N2 ).

The Floyd Warshall Algorithm

This algorithm iterates on the set of nodes that can be used as intermediate nodes on paths. This set grows from a single node ( say node 1 ) at start to finally all the nodes of the graph. At each iteration, we find the shortest path using given set of nodes as intermediate nodes, so that finally all the shortest paths are obtained.

Notation
Di,j [n] = Length of shortest path between the nodes i and j using only the nodes 1,2,....n as intermediate nodes.

Initial Condition
Di,j[0] = di,j for all nodes i,j .

Algorithm
Initially, n = 0. At each iteration, add next node to n. i.e. For n = 1,2, .....N-1 ,

Di,j[n + 1] = min { Di,j[n] , Di,n+1[n] + Dn+1,j[n] }

Principle
Suppose the shortest path between i and j using nodes 1,2,...n is known. Now, if node n+1 is allowed to be an intermediate node, then the shortest path under new conditions either passes through node n+1 or it doesn't. If it does not pass through the node n+1, then Di,j[n+1] is same as Di,j[n] . Else, we find the cost of the new route, which is obtained from the sum, Di,n+1[n] + Dn+1,j[n]. So we take the minimum of these two cases at each step. After adding all the nodes to the set of intermediate nodes, we obtain the shortest paths between all pairs of nodes together. The complexity of Floyd-Warshall algorithm is O ( N3 ).

It is observed that all the three algorithms mentioned above give comparable performance, depending upon the exact topology of the network.

Followers

Labels

Feedback

Feedback Form

Feedback Form

Enter your name, e-mail and comments about our web-site, services and products. Then click Submit to send your comments. To clear the form, click Reset.

Name
E-mail
Comments