Overview of the Network Layer#
The network layer interconnects networks.
It mainly solves the following problems:
- Provides services to the transport layer (reliable or unreliable);
- Network addressing issues (network numbering and router interfaces);
- Routing selection issues;
Packet Forwarding and Routing Selection#
The main task of the network layer is to transmit packets from the source host through multiple networks and segments to the destination host. This task can be divided into two important functions: packet forwarding and routing selection.
As shown in the figure above, the router needs to handle how to effectively get from A to B.
The routing table is organized so that routes can be recomputed efficiently when the network topology changes.
The forwarding table is derived from the routing table, and its structure is optimized for fast lookups.
Two Services Provided by the Network Layer#
Connection-Oriented Virtual Circuit Service#
==Look at the picture; the core idea is to establish a direct path, as if pulling a physical line==
- Core idea is that "reliable communication should be guaranteed by the network itself".
- A network layer connection must first be established— a virtual circuit (VC)— to ensure all necessary network resources for both communicating parties.
- Both parties send packets along the established virtual circuit. (This is a logical connection; packets are transmitted along this logical connection using the store-and-forward method, not a true physical connection.)
- The packet header uses the complete destination host address only during the connection establishment phase; thereafter, each packet header only needs to carry a virtual circuit number.
- After communication ends, the previously established virtual circuit must be released. (Virtual circuits come in two types; permanent virtual circuits are never released.)
- Using a virtual circuit + reliable transport network protocol ensures that the sent packets ultimately arrive correctly (==error-free, in order, not lost, not duplicated==) at the receiver.
- Many wide-area packet-switched networks use connection-oriented virtual circuit services. For example, the former X.25 and gradually outdated Frame Relay (FR) and Asynchronous Transfer Mode (ATM).
However, the Internet does not adopt virtual circuit services but uses connectionless datagram services.
Connectionless Datagram Service#
Core idea is that "reliable communication should be guaranteed by the user host" (packets travel freely, and any problems are left for the receiver to handle).
- No need to establish a network layer connection. (Packets may have errors, be lost, duplicated, and out of order.)
- Each packet can take different paths. Therefore, the header of each packet must carry the complete address of the destination host.
Internet Protocol (IP)#
The Internet Protocol (IP) is the core protocol in the Internet layer of the TCP/IP architecture.
Communication between TCP and IP.
TCP provides reliability for some application layer protocols, while UDP provides... unreliable service.
Four protocols that work with the IP protocol:
ICMP: Internet Control Message Protocol
IGMP: Internet Group Management Protocol
RARP: Reverse Address Resolution Protocol
ARP: Address Resolution Protocol
Thus, the structure under the Internet layer is:
ICMP IGMP
↓
IP
↓
ARP
Interconnecting Heterogeneous Networks#
The Internet layer treats a group of networks as a unified network for processing.
IPv4#
What is IPv4#
A unique 32-bit identifier assigned to each host (or router) interface on the Internet.
An IPv6 address, by comparison, has 128 bits.
IPv4 addressing methods: classful addressing, subnetting, and classless addressing.
IPv4 Address Representation (Dotted Decimal Notation)#
Dotted decimal notation: each group of 8 bits is separated by a dot.
Convert binary to decimal and then combine.
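The conversion can be sketched in a few lines of Python. This is only an illustration; the function names `to_dotted_decimal` and `from_dotted_decimal` are made up for this note.

```python
def to_dotted_decimal(addr: int) -> str:
    """Render a 32-bit IPv4 address as dotted decimal notation."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("an IPv4 address is a 32-bit unsigned integer")
    # Split the 32 bits into four 8-bit groups, most significant first.
    octets = [(addr >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def from_dotted_decimal(text: str) -> int:
    """Parse dotted decimal notation back into a 32-bit integer."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four decimal groups")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("each group must fit in 8 bits")
        value = (value << 8) | octet
    return value


print(to_dotted_decimal(0b11000000_10101000_00000001_00000001))  # 192.168.1.1
```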
Classful Addressing Method#
Class A Address#
Local software loopback test address, used for communication between processes within the host.
- If a host sends an IP datagram with a destination address of loopback, such as 127.0.0.1, then the protocol software of this host processes the data in the datagram and does not send the datagram to any network. IP datagrams with a destination address of the loopback address will never appear on any network because addresses with a network number of 127 are not network addresses.
- Host number all 0 is the network address: it identifies the network itself and is not assigned to any host.
- Host number all 1 is the broadcast address: it is used to broadcast to all hosts in this network.
Addresses starting with 127 are reserved for local loopback; the bits following 127 can be chosen freely on the local host.
126 is the largest Class A network number (8-bit prefix) that can be assigned to a network.
Each distinct 8-bit prefix identifies one Class A network, and the number of addresses inside each Class A network is determined by the remaining 24 bits.
Class B Address#
Class C Address#
Comparison of Address Classes#
The assignment range of five classes of IPv4 addresses.
The point-to-point link between routers can also be seen as a network. Network numbers are assigned according to how many allocatable IPv4 addresses each network needs:
- A network with 65,534 hosts and one router interface needs 65,535 allocatable IPv4 addresses, which exceeds the number of allocatable addresses in a Class B network, so it can only be assigned a Class A network number, selected from 1 to 126.
- A network with 254 hosts and one router interface needs 255 allocatable IPv4 addresses, which exceeds the number of allocatable addresses in a Class C network, so it can only be assigned a Class B network number, selected from 128.0 to 191.255.
- A network with 42 hosts and one router interface needs 43 allocatable IPv4 addresses, so it can be assigned a Class C network number, selected from 192.0.0 to 223.255.255.
- A network with no hosts but two router interfaces needs two allocatable IPv4 addresses, so it can be assigned a Class C network number, selected from 192.0.0 to 223.255.255, but not the same as the previously assigned Class C network number.
After assigning network numbers to each network, we can assign corresponding IPv4 addresses under the respective network numbers to each host and router interface in each network. Please note that within the same network, the host numbers of different hosts and router interfaces must be different and cannot be all 0 or all 1. Because an address with a host number of all 0 is a network address, and an address with a host number of all 1 is a broadcast address, these two types of addresses cannot be assigned to hosts or router interfaces. This example shows that the biggest drawback of the IPv4 classful addressing method is that it easily leads to a large waste of IPv4 addresses. For example, the network connected to router R1's interface 2 requires 255 allocatable IPv4 addresses. Since a Class C network contains 254 allocatable IPv4 addresses, it cannot meet the network's demand for allocatable IPv4 addresses, so it must assign a Class B network number to this network. However, a Class B network contains 65,534 allocatable IPv4 addresses, while this network only needs 255 addresses, resulting in a ==large waste of addresses==.
IPv4 Subnetting Addressing Method - Phase Two - Three-Level Structure#
When a team applies for a Class B address for all its hosts, but the number of hosts is far below the upper limit of Class B, it leads to waste.
By borrowing some bits from the host number part of the IPv4 address as a subnet number to distinguish different subnets, we can utilize the remaining large number of IPv4 addresses in the original network without applying for new network addresses.
IPv4 Subnetting Addressing Method
- Subnet mask indicates how many bits of the classful IPv4 address's host number part are borrowed as the subnet number.
- Similar to IPv4 addresses, subnet masks are also composed of 32 bits.
- The leftmost consecutive bits of 1 correspond to the network number and subnet number in the IPv4 address;
- The subsequent consecutive bits of 0 correspond to the host number in the IPv4 address.
- Performing a bitwise logical AND operation between the subnetted IPv4 address and the corresponding subnet mask will yield the network address of the subnet where the IPv4 address resides.
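- Default subnet mask: the subnet mask used when no subnetting is done, i.e. 255.0.0.0 for Class A, 255.255.0.0 for Class B, and 255.255.255.0 for Class C. Treating an unsubnetted network this way lets the same AND operation handle both subnetted and non-subnetted addresses.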
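The bitwise AND described above can be sketched with Python's standard `ipaddress` module. The helper name `network_address` and the sample addresses are assumptions chosen for illustration, not values from these notes.

```python
import ipaddress


def network_address(ip: str, mask: str) -> str:
    """Return the subnet's network address: IPv4 address AND subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    # Bits under the mask's 1s (network + subnet number) are kept,
    # bits under the mask's 0s (host number) are cleared.
    return str(ipaddress.IPv4Address(ip_int & mask_int))


# A Class C address with 2 host bits borrowed as the subnet number.
print(network_address("192.168.9.200", "255.255.255.192"))  # 192.168.9.192
```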
Example:
Classless Addressing - CIDR#
Problem: The vast number of Class C networks, due to the small number of addresses contained in each network, are not fully utilized.
Address Mask
The address mask used in classless addressing is similar to the subnet mask, consisting of 32 bits.
- The leftmost consecutive bits of 1 correspond to the network prefix in the IPv4 address;
- The subsequent consecutive bits of 0 correspond to the host number in the IPv4 address.
- The length of the network prefix and host number cannot be determined solely from the IPv4 address itself; it must be used in conjunction with the address mask.
- Using classless addressing, appropriate-sized CIDR address blocks can be allocated according to customer needs, thus allowing for more efficient allocation of IPv4 address space.
Slash Notation:
Example:
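As a minimal sketch of slash notation, the standard `ipaddress` module can report the address mask and the range of a CIDR block. The helper `describe_block` and the address 128.14.35.7/20 are illustrative assumptions, not taken from the original figures.

```python
import ipaddress


def describe_block(cidr: str) -> dict:
    """Summarise the CIDR block that contains the given address/prefix."""
    # strict=False lets us pass any address inside the block, not only
    # the block's first address.
    net = ipaddress.ip_network(cidr, strict=False)
    return {
        "mask": str(net.netmask),            # 1s cover the network prefix
        "first": str(net.network_address),   # host number all 0
        "last": str(net.broadcast_address),  # host number all 1
        "size": net.num_addresses,           # 2 ** (32 - prefix length)
    }


print(describe_block("128.14.35.7/20"))
```

For 128.14.35.7/20 this reports the mask 255.255.240.0 and a block of 4096 addresses running from 128.14.32.0 to 128.14.47.255.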
Route Aggregation#
- The longer the network prefix, the smaller the address block, and the more specific the route.
- If a router finds multiple routing entries matching when forwarding packets, it selects the one with the longest network prefix, which is called longest prefix match, because such a route is more specific.
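Longest prefix match can be sketched as a linear scan over the routing entries. Real routers use specialised lookup structures; the function name, the prefixes, and the next-hop labels below are all hypothetical.

```python
import ipaddress


def longest_prefix_match(dest: str, routes: list):
    """Return the next hop of the matching route with the longest prefix.

    routes is a list of (prefix, next_hop) pairs; returns None if no
    route matches.
    """
    dest_ip = ipaddress.ip_address(dest)
    best_net, best_hop = None, None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        if dest_ip in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop


routes = [
    ("206.0.68.0/22", "R1"),     # larger block, less specific
    ("206.0.71.128/25", "R2"),   # smaller block, more specific
    ("0.0.0.0/0", "default"),    # matches everything
]
print(longest_prefix_match("206.0.71.130", routes))  # R2
```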
When splitting CIDR address blocks, the destination network address is obtained by keeping the prefix bits and setting all host bits to 0.
One point to clarify: the address 192.168.9.128/26 mentioned in the question is itself a subnet, so any aggregate that covers it must have a prefix length less than or equal to 26 bits; a 25-bit prefix, for example, can also be taken.
Application Planning#
Principles to follow:
- First divide into subnets
- That is, look at the topology diagram and decide how many bits must be borrowed from the host number so that every network in the diagram gets its own subnet.
- Then assign host numbers
- The remaining host-number bits are what we use to allocate addresses within each subnet.
Fixed-Length Subnet Mask#
Using the same subnet mask to divide subnets.
Subnets follow one another in sequence, and the number of host-number bits determines the size of each subnet. If the host number is 5 bits, consecutive network addresses are 2 to the power of 5, i.e. 32, apart.
As shown below, each subnet is allocated 32, but the actual need is not that many, leading to unnecessary waste.
Network 5 only needs 4 addresses, but it has been allocated 32.
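The fixed interval of 32 can be reproduced with `ipaddress`. This is a sketch under assumptions: the block 218.75.230.0/24 and the helper name `fixed_length_subnets` are chosen for illustration only.

```python
import ipaddress


def fixed_length_subnets(block: str, host_bits: int) -> list:
    """Split a block into equal subnets that each keep host_bits host bits."""
    net = ipaddress.ip_network(block)
    # Every subnet shares the same mask: prefix length = 32 - host bits.
    new_prefix = 32 - host_bits
    return [str(subnet) for subnet in net.subnets(new_prefix=new_prefix)]


# 5 host bits -> 32 addresses per subnet, so network addresses step by 32.
for subnet in fixed_length_subnets("218.75.230.0/24", 5):
    print(subnet)
```

Each of the eight resulting /27 subnets holds 32 addresses regardless of how many a network actually needs, which is where the waste comes from.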
Variable-Length Subnet Mask#
- Using different subnet masks to divide subnets;
- Each subnet can be allocated a different number of IP addresses to minimize the waste of IP addresses;
Sub-block selection principles:
- The starting position of each sub-block cannot be arbitrarily selected; it can only be selected from addresses where the host number part is an integer multiple of the block size.
- It is recommended to select larger sub-blocks first.
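The two selection principles can be sketched as a small allocator: sort the demands from largest to smallest, round each up to a power of two, and start every sub-block on a multiple of its own size. The function name `vlsm_allocate`, the block 218.75.230.0/24, and the demand figures are assumptions made up for this sketch; each demand counts the network and broadcast addresses as well.

```python
import ipaddress


def vlsm_allocate(block: str, demands: dict) -> dict:
    """Map each network name to a CIDR sub-block large enough for its demand."""
    net = ipaddress.ip_network(block)
    cursor = int(net.network_address)
    end = int(net.broadcast_address) + 1
    plan = {}
    # Principle: allocate the larger sub-blocks first.
    for name, need in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        size = 1
        while size < need:
            size *= 2  # sub-block sizes are powers of two
        # Principle: a sub-block starts at an integer multiple of its size.
        start = (cursor + size - 1) // size * size
        if start + size > end:
            raise ValueError("the block is too small for these demands")
        prefix = 32 - (size.bit_length() - 1)
        plan[name] = f"{ipaddress.IPv4Address(start)}/{prefix}"
        cursor = start + size
    return plan


demands = {"N1": 9, "N2": 28, "N3": 15, "N4": 13, "N5": 4}  # hypothetical
print(vlsm_allocate("218.75.230.0/24", demands))
```

With these sample demands the plan uses only 84 of the 256 addresses, leaving the rest of the block free for later use.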
Example:
Binary Tree Partitioning Method#
【Fixed-Length Subnetting】
【Variable-Length Subnetting】
There may be more than one valid variable-length subnetting plan, but the five sub-blocks cannot be chosen arbitrarily.
【Example】
IPv4 Address and MAC Address#
The encapsulation location of the IPv4 address and MAC address.
Changes in IPv4 Address and MAC Address During Packet Transmission
During the transmission of packets, the source IP address and destination IP address of the packet remain unchanged (that is, the origin and final destination, as shown in the source MAC 1 and destination MAC 2 in the figure).
During the transmission of packets, the source MAC address and destination MAC address change from link to link (or network to network).
Example:
Relationship Between IPv4 Address and MAC Address
The Internet uses IP addresses and MAC addresses to jointly complete the addressing work.
When a router receives an IP datagram, it forwards it based on the network number part of the destination IP address in its header, using its routing table.
The Address Resolution Protocol (ARP) maps IP addresses to MAC addresses.
Address Resolution Protocol ARP#
Each host maintains its own ARP cache table. Whenever it looks up an IP address, it first checks its cache table to see if there is corresponding ARP cache information for the destination IP address.
If not found, it sends an ARP request in a broadcast frame (destination MAC address all F) to all devices on the same network, carrying the following information:
My IP address is xxx, my MAC address is xxx, and I want to know the MAC address of the host whose IP address is xxx.
Every host that receives the request compares the requested IP address with its own. If they do not match, it discards the request. If they match, it sends a unicast response message:
My IP address is xxx, and my MAC address is xxx.
After this response message returns to the sender, the sender will establish the correspondence between this IP address and MAC address in its ARP cache.
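The request/response exchange can be modelled with a toy simulation. This is a sketch, not a real ARP implementation: the `Host` class, its method names, and the sample addresses are hypothetical, and the broadcast is modelled by simply iterating over the hosts on the same network.

```python
import time


class Host:
    def __init__(self, ip: str, mac: str, lifetime: float = 120.0):
        self.ip, self.mac = ip, mac
        self.lifetime = lifetime  # seconds a dynamic record stays valid
        self.arp_cache = {}       # ip -> (mac, expiry time)

    def handle_request(self, target_ip: str):
        """Reply with our MAC only if the request asks for our IP."""
        return self.mac if target_ip == self.ip else None

    def resolve(self, target_ip: str, lan: list):
        """Look up the cache first; on a miss, broadcast an ARP request."""
        entry = self.arp_cache.get(target_ip)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]
        for other in lan:  # stands in for the broadcast frame
            if other is self:
                continue
            reply = other.handle_request(target_ip)
            if reply is not None:  # unicast response received
                expiry = time.monotonic() + self.lifetime
                self.arp_cache[target_ip] = (reply, expiry)
                return reply
        return None  # nobody on this network owns the address


a = Host("192.168.0.1", "00-0C-CF-93-8C-92")
b = Host("192.168.0.2", "00-E0-F9-A3-43-77")
print(a.resolve("192.168.0.2", [a, b]))
```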
Dynamic and Static Types#
-- Dynamic: ARP records are automatically obtained by the host through ARP, with a default lifetime of two minutes. After the lifetime expires, the record is automatically deleted.
-- Static: The record is manually configured by the user or network administrator, and the lifetime varies across different operating systems, for example, it may not exist after a system restart or remain valid after a system restart.
Periodic Cache Table Correspondence#
- The correspondence between MAC addresses and switch interface numbers in the switch forwarding table needs to be periodically deleted because this correspondence is not permanent.
- Similarly, the correspondence between IP addresses and MAC addresses in ARP is also not permanent. For example, the MAC address behind an IP address changes when a user replaces a network card.
Sending and Forwarding Process of IP Datagrams#
- The host sends an IP datagram.
- The router forwards the IP datagram.
For each of its interfaces, the router records the subnet that the interface is attached to and the corresponding subnet mask.
When it receives a datagram, the router performs a bitwise AND between the destination address and the subnet mask recorded for each interface. If the result matches the subnet of one of its interfaces, it forwards the datagram out of that interface.
Default Gateway: how does host A send a datagram to a host on another network? A router on host A's own network must be specified to forward such packets on its behalf. This specified router is called the default gateway.
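The sending host's decision can be sketched as follows: AND the destination with its own mask, deliver directly if the result is its own subnet, and otherwise hand the datagram to the default gateway. The function name and all addresses are assumptions for illustration.

```python
import ipaddress


def next_hop_for(src_ip: str, mask: str, default_gateway: str, dest_ip: str) -> str:
    """Return the IP address the sending host should deliver the frame to."""
    # The host's own subnet, derived from its address and subnet mask.
    own_net = ipaddress.ip_network(f"{src_ip}/{mask}", strict=False)
    if ipaddress.ip_address(dest_ip) in own_net:
        return dest_ip          # same subnet: direct delivery
    return default_gateway      # different subnet: let the router forward


# Host on 192.168.0.0/25 whose default gateway is 192.168.0.126.
print(next_hop_for("192.168.0.10", "255.255.255.128", "192.168.0.126", "192.168.0.100"))
print(next_hop_for("192.168.0.10", "255.255.255.128", "192.168.0.126", "192.168.0.200"))
```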
How does the router forward the received IP datagram?
❶ Check if the received IP datagram is correct.
Is the lifetime expired? Is the header erroneous?
If not correct, the router discards the IP datagram and sends an error report to the source host that sent the IP datagram.
❷ Look up the routing table based on the destination IP address in the header of the IP datagram.
If a matching routing entry is found, forward according to the instructions of that routing entry; otherwise, discard the IP datagram and send an ICMP error report to the source host that sent the IP datagram.
The router ANDs the destination address with the address mask in the routing entry to obtain the destination network address and checks whether the destination network matches.
Router Handling of Broadcast Datagrams#
Routers isolate broadcast domains; other devices in the same subnet can receive broadcasts, but routers do not forward broadcasts.
Example:
Header Format of IP Datagrams#
- The header format and content of IPv4 datagrams are the basis for implementing various functions of this protocol.
- In TCP/IP, various data formats are always described in units of 32 bits, or 4 bytes.
Divided into fixed part and variable part.
Fixed part: the part that every IPv4 datagram must contain (understood as format).
Variable part: the header of certain IPv4 datagrams, in addition to containing the 20-byte fixed part (this is the basis), also contains some optional fields to enhance the functionality of the IPv4 datagram (understood as plugins).
Version:#
4 bits, indicating the version of the IP protocol; both communicating parties must use the same version.
Header Length:#
Indicates the length of the IP datagram header; the field value is in units of 4 bytes.
- The minimum value of the header length is 0101, which is decimal 5, multiplied by 4 bytes per unit, indicating that the ==IPv4 datagram header occupies 20 bytes of fixed part== (no variable part).
- The maximum value of the header length is 1111, which is decimal 15, multiplied by 4 bytes per unit, indicating that the IPv4 datagram header contains 20 bytes of fixed part and a maximum of 40 bytes of variable part.
Optional Fields:#
Length varies from 1 to 40 bytes, used to support error detection, measurement, security, etc.
Used to enhance the functionality of IP datagrams, but also increases the overhead of routers processing IP datagrams, so they are rarely used; (all routers along the entire transmission path of the datagram must check the option fields, reducing the speed of routers processing datagrams).
Padding Field:#
Ensures that the header length is a multiple of 4 bytes, using all 0s for padding.
When the length of the header (20 bytes fixed part + variable part) is not a multiple of 4 bytes, the corresponding number of all 0 bytes is padded to ensure that the length of the IPv4 datagram header is a multiple of 4 bytes.
Differentiated Services:#
- Occupies 8 bits, used to obtain better service; this field is generally not used.
Total Length:#
- Occupies 16 bits, indicating the total length of the IP datagram (header + payload) in bytes, with a maximum of 65535 bytes.
- Identification + Flags + Fragment Offset work together on IPv4 datagram fragmentation.
Identification:#
Occupies 16 bits; all fragments of the same datagram should have the same identification. The IP software maintains a counter, incrementing it by 1 for each generated datagram and assigning this value to the identification field.
Flags:#
Occupies 3 bits.
- Lowest bit: MF bit, 1 indicates there are more fragments, 0 indicates this is the last fragment.
- Middle bit: DF bit, 1 indicates fragmentation is not allowed, 0 indicates it is allowed.
- Highest bit: Reserved bit, must be 0.
Fragment Offset:#
Occupies 13 bits, indicating how many units (in 8-byte units) the data payload of the fragmented datagram is offset from its position in the original datagram.
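How Total Length, MF, and Fragment Offset work together can be sketched with a small calculation. It assumes a 20-byte header without options; the function name `fragment` and the sample sizes (a 3820-byte datagram and an MTU of 1420 bytes) are illustrative assumptions.

```python
def fragment(total_length: int, mtu: int, header_len: int = 20) -> list:
    """Return the total length, MF bit and offset field of each fragment."""
    payload = total_length - header_len
    # Offsets are counted in 8-byte units, so every fragment except the
    # last must carry a payload that is a multiple of 8 bytes.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset_bytes = 0
    while offset_bytes < payload:
        size = min(max_payload, payload - offset_bytes)
        more = offset_bytes + size < payload
        fragments.append({
            "total_length": header_len + size,
            "mf": 1 if more else 0,       # 1: more fragments follow
            "offset": offset_bytes // 8,  # value of the 13-bit field
        })
        offset_bytes += size
    return fragments


for frag in fragment(3820, 1420):
    print(frag)
```

With these numbers the 3800-byte payload is cut into pieces of 1400, 1400, and 1000 bytes, with offset field values 0, 175, and 350.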
Time to Live:#
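Occupies 8 bits, with a maximum value of 255. The field was originally measured in seconds: when a router forwarded an IP datagram, it subtracted the time the datagram spent on that router from the value of this field. In practice the field is now a hop count, and each router decrements it by 1 before forwarding. If the result is not 0, the router forwards the datagram; otherwise, it discards it.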
The role of the Time to Live (TTL) field is to prevent incorrectly routed IPv4 datagrams from being trapped in the Internet indefinitely.
Protocol#
Length of 8 bits, used to indicate what type of protocol data unit (PDU) the data part of the IPv4 datagram is.
ICMP protocol: field set to 1.
IGMP protocol: field set to 2.
TCP protocol: field set to 6.
UDP protocol: field set to 17.
IPv6 protocol: field set to 41.
OSPF protocol: field set to 89.
Header Checksum#
Occupies 16 bits, used to check whether errors occurred in the header during transmission.
Each router that an IPv4 datagram passes through recalculates the header checksum, because certain field values in the header (e.g., Time to Live (TTL), flags, and fragment offset) may change.
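Sending end: set the header checksum field to 0, add up the header as 16-bit words using one's complement addition (a carry out of the top bit is added back into the sum), then take the one's complement of the sum and fill it into the header checksum field.
Receiving end: add up all 16-bit words of the header, including the checksum field, using one's complement addition, and take the one's complement of the sum. A result of 0 means no error was detected; otherwise the datagram is discarded.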
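The sender and receiver calculations can be sketched as below. On the receiving side, the header is treated as error-free when the complemented sum is 0. The function names are made up for this note, and the 20-byte sample header is a commonly used illustration rather than data from these notes.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def header_checksum(header: bytes) -> int:
    """Sender side: zero the checksum field (bytes 10-11), sum, complement."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return ~ones_complement_sum16(zeroed) & 0xFFFF


def header_is_valid(header: bytes) -> bool:
    """Receiver side: sum everything, complement, expect 0."""
    return (~ones_complement_sum16(header) & 0xFFFF) == 0


sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
print(hex(header_checksum(sample)))  # 0xb861
print(header_is_valid(sample))       # True
```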
Since the Internet layer does not provide reliable transmission services to its upper layers, and calculating the header checksum is a time-consuming operation, routers in IPv6 no longer calculate the header checksum, thus forwarding IP datagrams faster.
Source IP Address#
Occupies 32 bits, the sending end's IP address.
Destination IP Address#
Occupies 32 bits, the receiving end's IP address.
Concept of Fragmentation#
Example:
The key to calculating the header checksum is the operation of binary one's complement summation.
Static Route Configuration#
Users or network maintenance personnel manually configure the routing table of the router using relevant commands.
Advantages: Manual configuration is simple and has low overhead.
Disadvantages: Cannot adapt to changes in network status (traffic, topology, etc.) in a timely manner.
Usage environment: Generally only used in small-scale networks.
Concept of Default Route#
Scenario: R1 now has to reach not just the few networks shown above but the Internet, with its enormous number of networks. Manually writing a routing entry for every network on the Internet into the routing table is unrealistic.
Solution: Add a default entry to replace the massive routing entries going to many networks on the Internet.
Specific Host Route#
Manually add a route whose destination is the address of one specific host (with an all-1s mask, 255.255.255.255) and whose next hop is the router interface leading toward the subnet where that host resides.
Classification of Routing Selection#
Within an autonomous system, the interior gateway protocol in use is independent of the exterior gateway protocol.
- Routing between autonomous systems (like AS 1 and AS 2 here) is called inter-domain routing; routing within a single autonomous system, such as AS 1, is called intra-domain routing.
- Inter-domain routing uses an Exterior Gateway Protocol (EGP), such as BGP-4.
- Intra-domain routing uses an Interior Gateway Protocol (IGP), such as RIP.
- Exterior Gateway Protocol (EGP) and Interior Gateway Protocol (IGP) are merely category names for routing protocols, not specific routing protocols.
- The word "gateway" appears in these names because early Internet RFC documents used the term "gateway" rather than "router".
- If the source host and destination host are in different autonomous systems (which may use different interior gateway protocols), then when a datagram reaches the boundary of an autonomous system, a protocol (an EGP) is needed to pass routing information to the other autonomous system.
- First, it should be known: RIP is based on UDP, OSPF is based on IP, and BGP is based on TCP. These routing protocols defined in the TCP/IP protocol stack are used to discover and maintain the shortest path to the destination.
- How to determine which layer a protocol belongs to: look at which lower-layer service the protocol's implementation relies on.
TCP and UDP belong to the transport layer; IP belongs to the network layer;
(1): RIP uses UDP (RIP uses the transport layer), so ==RIP is an application layer protocol==;
(2): OSPF uses IP (OSPF uses the network layer), so ==OSPF is a network layer protocol==; (some consider it a transport layer protocol).
(For example, per the examination syllabus: OSPF does not use UDP datagrams for transmission but is carried directly in IP datagrams, so OSPF is classified as a network layer protocol; follow the examination syllabus on this point);
(3): BGP uses TCP (BGP uses the transport layer), so BGP is an application layer protocol.
Routing Information Protocol (RIP) - Works at the Application Layer, Solving Network Layer Problems#
Basic concepts related to the Routing Information Protocol (RIP):
- Routing Information Protocol (RIP)
- RIP requires each router within an autonomous system (AS) to maintain distance records (Distance Vector, D-V) from itself to each network within the AS;
- RIP uses hop count to measure the distance to the destination network;
- The distance from the router to a directly connected network is defined as 1;
- The distance from the router to a non-directly connected network is defined as the number of routers passed through plus 1;
- A path can contain a maximum of 15 routers, making it suitable only for small internets;
"Plus 1" is because once the destination network is reached, it is delivered directly, while the distance to directly connected networks is already defined as 1.
Principles:#
Principle 1: RIP will choose the route with the fewest routers (even if it is slow, it should take the least number of hops).
Principle 2: When there are multiple routes to the same destination network with equal distance, equal-cost load balancing can be performed to distribute traffic evenly across the equivalent paths.
Three Elements:#
- Who to exchange information with - Only exchange with neighboring routers;
- What information to exchange - The router's own routing table.
That is, the shortest RIP distance from this router to each network in the autonomous system (AS) and the next-hop router that should be taken to reach each network.
- When to exchange information - Periodically exchange. That is, routing information is exchanged at fixed time intervals.
To speed up RIP's convergence, when network topology changes, routers should promptly notify neighboring routers of the updated routing information, which is called triggered updates (updating immediately when changes occur, rather than waiting for fixed time intervals).
【Basic Working Process】:
- When the router first starts working, it only knows that its distance to directly connected networks is 1;
- Each router only periodically exchanges and updates routing information with neighboring routers;
- After several exchanges and updates, each router knows the shortest distance to each network within the AS and the next-hop address, which is called convergence; (Convergence is the process in which all nodes within the autonomous system obtain correct routing selection information.)
The RIP routing selection algorithm is based on distance vectors.
Each node calculates and updates its shortest distance to each destination node based on the direct link distance to its neighboring nodes and the distance vectors exchanged by neighbors, then announces the new distance vector to all its neighbors until the distance vector no longer changes.
For RIP messages sent from the neighboring router with address X, first modify all items in this message:
Change the addresses in the "next hop" field to X and add 1 to all "distance" field values (see the explanation later). Each item has three key data: to destination network N, distance d, and next-hop router X.
【RIP Routing Update Rules】:
- Suppose routers C and D are neighbors, and next hops that do not matter for the example are marked with a question mark in the routing tables. When C's RIP update timer expires, C encapsulates the routing information from its routing table into a RIP update message and sends it to D. D first modifies the received routes (next hop set to C, distance plus 1) and then updates its own routing table according to the update reasons and non-update reasons below.
- Update Reasons
- Discovered a new network, add it.
- To reach the destination network, the same next hop (meaning the existing route already goes through C), the latest message prevails and must be used to update.
- To reach the destination network, different next hops (meaning the existing route does not go through C), the new route is shorter and must be updated.
- To reach the destination network, different next hops, RIP distances are equal, can perform equal-cost load balancing, add it.
- Non-Update Reasons
- To reach the destination network, different next hops, the new route is inferior, do not update.
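These update rules can be sketched as a single function that a router applies to the distance vector received from a neighbor. This is a simplified model: the table layout, the function name `rip_update`, and the network names and distances are all hypothetical.

```python
def rip_update(table: dict, neighbor: str, advertised: dict, infinity: int = 16) -> dict:
    """Merge a neighbor's advertised distances into this router's table.

    table maps destination -> (distance, [next hops]);
    advertised maps destination -> distance as seen by the neighbor.
    """
    for dest, dist in advertised.items():
        new_dist = min(dist + 1, infinity)  # one more hop via the neighbor
        if dest not in table:
            if new_dist < infinity:
                table[dest] = (new_dist, [neighbor])  # new network: add it
            continue
        cur_dist, hops = table[dest]
        if hops == [neighbor]:
            table[dest] = (new_dist, [neighbor])  # same next hop: latest wins
        elif new_dist < cur_dist:
            table[dest] = (new_dist, [neighbor])  # different hop, shorter
        elif new_dist == cur_dist and neighbor not in hops:
            table[dest] = (cur_dist, hops + [neighbor])  # equal cost: add
        elif neighbor in hops and new_dist > cur_dist:
            # Simplification: drop a load-balanced hop that got worse.
            table[dest] = (cur_dist, [h for h in hops if h != neighbor])
        # Otherwise: different next hop and an inferior route, no update.
    return table


d_table = {"N2": (3, ["C"]), "N3": (4, ["F"]), "N6": (8, ["F"]),
           "N8": (4, ["E"]), "N9": (4, ["F"])}
from_c = {"N1": 2, "N2": 4, "N3": 8, "N6": 4, "N8": 3, "N9": 5}
print(rip_update(d_table, "C", from_c))
```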
RIP has a problem: "bad news travels slowly".
"Bad news travels slowly" is also known as the routing loop problem or the count-to-infinity problem.
- This is an inherent problem of the distance vector algorithm (cannot be completely resolved).
- The following measures reduce the probability of this problem occurring or mitigate the harm caused by this problem:
- Limit the maximum RIP distance to 15 (16 indicates unreachable).
- When the routing table changes, immediately send routing update messages (i.e., "triggered updates"), rather than just sending periodically.
- Have routers record the interface on which a specific piece of routing information was received and not send the same routing information back out through this interface (i.e., "split horizon").
5. RIP Versions and Related Message Encapsulation
RIP uses UDP for transmission (UDP port number is 520), and belongs to the application layer.
The encapsulation format of RIP: L2 Header + IP + UDP + RIP.
6. Advantages and Disadvantages of RIP
OSPF Protocol - Open Shortest Path First Protocol#
For larger autonomous systems (AS), OSPF should be used.
OSPF was developed to solve the problems of RIP; the name Open Shortest Path First does not imply that other routing selection protocols are not shortest path first. In fact, every internal gateway protocol within an autonomous system, such as RIP, seeks the shortest path.
The basic features of OSPF are:
- Based on link state management.
- Uses the shortest path algorithm (SPF) to ensure that routing loops do not occur.
- OSPF does not limit network size, has high update efficiency, and fast convergence speed.
- Uses IP datagrams (protocol number 89) for encapsulation, while RIP uses UDP encapsulation (application layer).
- In multi-access networks, OSPF elects a designated router (DR) and a backup designated router (BDR) to maintain neighbor information.
Link State#
OSPF is based on link state, unlike RIP, which is based on distance vector.
Link state refers to which routers are adjacent to this router and the cost (COST) of the corresponding links (the cost can represent expense, distance, delay, bandwidth, and so on, and is set by network administrators).
The cost of OSPF in Cisco routers is: $\displaystyle \frac{100\ \text{Mb/s}}{\text{link bandwidth}}$; if the calculation result is less than 1, the cost is set to 1; if greater than 1, the fractional part is discarded, keeping the integer.
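The cost rule can be sketched directly. The function name `ospf_cost` and its parameter names are assumptions for this note, with bandwidths expressed in Mb/s.

```python
def ospf_cost(link_bandwidth_mbps: float, reference_mbps: float = 100.0) -> int:
    """Cost = reference bandwidth / link bandwidth, truncated, at least 1."""
    if link_bandwidth_mbps <= 0:
        raise ValueError("link bandwidth must be positive")
    # int() discards the fractional part; max() lifts results below 1 to 1.
    return max(1, int(reference_mbps / link_bandwidth_mbps))


print(ospf_cost(10))     # 10 Mb/s link   -> 10
print(ospf_cost(1000))   # 1 Gb/s link    -> 1 (result below 1 is set to 1)
print(ospf_cost(1.544))  # 1.544 Mb/s link -> 64
```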
Hello Packets#
OSPF uses hello packets to maintain information about neighboring nodes.
Hello packets are encapsulated in IP datagrams and sent to the multicast address 224.0.0.5.
224.0.0.5 is a Class D (multicast) address.
The protocol number field in the IP datagram header is set to 89, indicating that the payload of the IP datagram is an OSPF packet.
Remember the IP datagram? It looks like this:
By sending hello packets, each router establishes a neighbor table.
The sending period of hello packets is 10 seconds.
If no hello packets are received from neighbors for 40 seconds, the neighbor router is considered unreachable.
Taking R1 as an example, its interface 0 is connected to R4 and its interface 1 is connected to R2. Each time R1 receives a hello packet from R4 or R2, it resets that neighbor's alive timer, which then counts down again until the next hello packet arrives.
LSA (Link State Advertisement)#
Routers in OSPF generate link state advertisements (LSA).
LSA contains the following two types of link state information:
- Link information of directly connected networks.
- Link state information of neighboring routers.
LSU (Link State Update)#
LSA is encapsulated in LSU packets and sent using a ==reliable== flooding method.
Flooding Method#
The key point of the flooding method is that ==routers send LSU link state update packets to all their neighboring routers==.
Each router that receives this packet ==forwards it to all its neighboring routers== (except for the upstream router), and so on, like a domino effect.
Reliable flooding means that after receiving the LSU link state update packet, an acknowledgment must be sent. Duplicate update packets do not need to be forwarded again, but an acknowledgment must be sent once.
LSDB (Link State Database)#
Each router in OSPF maintains an LSDB, which stores LSA link state advertisements.
Through the flooding of link state update packets (LSU) encapsulated with each router's link state advertisements (LSA), the link state databases (LSDB) of all routers will eventually reach consistency.
Shortest Path First Calculation Based on Link State Database#
Each router runs the shortest path first (SPF) calculation on the LSDB it maintains to construct the shortest paths to all other routers, i.e., to build its routing table.
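A shortest path first calculation is typically done with Dijkstra's algorithm. The sketch below assumes the LSDB has already been reduced to a dictionary of link costs; the topology and costs are made-up example values, and the result records the total cost and the next hop for each destination.

```python
import heapq

def shortest_paths(lsdb, source):
    """Dijkstra over lsdb = {router: {neighbor: cost}}.

    Returns {destination: (total_cost, next_hop)} as seen from `source`.
    """
    best = {source: (0, None)}
    heap = [(0, source, None)]  # (cost so far, router, first hop out of source)
    done = set()
    while heap:
        cost, router, first_hop = heapq.heappop(heap)
        if router in done:
            continue            # stale heap entry for an already settled router
        done.add(router)
        for nbr, link_cost in lsdb[router].items():
            new_cost = cost + link_cost
            hop = nbr if router == source else first_hop
            if nbr not in best or new_cost < best[nbr][0]:
                best[nbr] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, nbr, hop))
    del best[source]
    return best

lsdb = {
    "R1": {"R2": 1, "R4": 5},
    "R2": {"R1": 1, "R3": 2},
    "R3": {"R2": 2, "R4": 1},
    "R4": {"R1": 5, "R3": 1},
}
for dest, (cost, next_hop) in sorted(shortest_paths(lsdb, "R1").items()):
    print(dest, cost, next_hop)
```

Here R1 reaches R4 via R2 and R3 at a total cost of 4, which beats the direct link of cost 5.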
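Five Types of OSPF Packets#
- Hello packets
- Database Description (DD) packets
- Link State Request (LSR) packets
- Link State Update (LSU) packets
- Link State Acknowledgment (LSAck) packets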
OSPF Working Process#
OSPF Routers in Multi-Access Networks#
To reduce the number of hello packets and link state update packets sent via flooding, OSPF adopts the following measures:
Elect a designated router (Designated Router, DR) and a backup designated router (Backup Designated Router, BDR).
All non-DR/BDR routers only establish neighbor relationships with DR/BDR.
Non-DR/BDR routers exchange information through DR/BDR.
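After electing the DR and BDR, the number of neighbor relationships among $n$ routers drops from $\displaystyle \frac{n(n-1)}{2}$ to $\displaystyle 2\times(n-2) + 1$: each of the $n-2$ other routers pairs with both the DR and the BDR, plus one relationship between the DR and the BDR.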
OSPF Area Division#
To enable the OSPF protocol to be used in large-scale networks, OSPF divides an autonomous system (AS) into several smaller ranges, called areas.
The size of each area should not be too large, generally not exceeding 200 routers.
Each area is identified by a 32-bit area identifier, which can be represented in dotted decimal notation.
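As a quick illustration of the dotted decimal form, a 32-bit area identifier can be formatted the same way as an IPv4 address. The helper name `area_id_dotted` is hypothetical; the sketch relies on the standard `ipaddress` module.

```python
import ipaddress

def area_id_dotted(area_id: int) -> str:
    """Format a 32-bit area identifier in dotted decimal notation."""
    if not 0 <= area_id <= 0xFFFFFFFF:
        raise ValueError("area identifier must fit in 32 bits")
    return str(ipaddress.IPv4Address(area_id))

print(area_id_dotted(0))    # area 0, conventionally the backbone area
print(area_id_dotted(1))
print(area_id_dotted(256))
```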
The benefit of dividing areas is that it limits the scope of link state information exchanged using flooding to each area rather than the entire autonomous system (AS), thus reducing communication volume across the entire network.
The backbone area also needs a router dedicated to exchanging routing information with other autonomous systems; this router is called an Autonomous System Border Router (ASBR): R6.
- Routers within the backbone area are called Backbone Routers (BBR): R3, R4, R5, R6, and R7.
- If all of a router's interfaces are within the same area, it is called an Internal Router (IR): R1 and R2 in area 1, R8 in area 2, and R9 in area 3.
- Routers that connect this area with other areas within the same autonomous system are called Area Border Routers (ABR): R3, R4, and R7. An area border router has one interface used to connect to its own area and another interface used to connect to the backbone area.
OSPF Example#
BGP Border Gateway Protocol#
BGP is an external gateway protocol, used for communication between autonomous systems.
- Because the metrics for routing "cost" (distance, bandwidth, cost, etc.) may differ within different ASs, it is not feasible to use a unified "cost" as a metric to find the best route for routing selection between ASs.
For example, in the figure below, there may be multiple shortest path answers based on different metric scales.
- On the other hand, we may have our own requirements for certain paths: routing selection between autonomous systems must take relevant policies into account (political, economic, security, etc.).
- BGP only strives to find a relatively good route that can reach the destination network (without going in circles), rather than the best route.
How BGP Works#
- When configuring BGP, each AS administrator must select at least one router as the "BGP speaker" for that AS.
- Generally, two BGP speakers are connected via a shared network, and a BGP speaker is often a ==BGP border router==.
- The two BGP speakers exchange routing information over a TCP connection, and each refers to the other as its neighbor or peer.
- In addition to running the BGP protocol, the BGP speaker must also run the internal gateway protocol (IGP) used by its AS, such as RIP or OSPF.
- In other words, the speaker plays a role in communicating both internally and externally.
- BGP speakers exchange network reachability information.
- This means the series of autonomous systems that must be traversed to reach a certain network.
- After BGP speakers exchange network reachability information with each other, each BGP speaker uses the adopted policies to find a relatively good route to each autonomous system.
- That is, it constructs a loop-free tree structure of interconnected autonomous systems.
![[Chapter 4 - Network Layer.pdf#page=350]]
Example of BGP Speakers Exchanging Path Vectors#
BGP is suitable for the multi-level structure of the Internet.
![[Chapter 4 - Network Layer.pdf#page=351]]
The paths that can reach the destination are called path vectors.
AS3 must not choose any path that already passes through AS3 itself, since such a path would form a loop.
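The loop check on path vectors can be sketched as follows. This is a toy model: real BGP selection depends on the AS's policies, so picking the shortest remaining path is only an assumed stand-in policy, and the AS names and function names are illustrative.

```python
def usable_paths(local_as, offers):
    """Discard any offered path that already contains the local AS (it would loop)."""
    return [path for path in offers if local_as not in path]

def choose_path(local_as, offers):
    """Assumed policy for this sketch: prefer the path with the fewest ASes."""
    candidates = usable_paths(local_as, offers)
    return min(candidates, key=len) if candidates else None

def advertise(local_as, path):
    """Before passing a path on to a neighbor, prepend the local AS to it."""
    return [local_as] + path

offers_to_as3 = [
    ["AS2", "AS3", "AS5"],   # passes through AS3 itself: rejected
    ["AS2", "AS1", "AS5"],
    ["AS4", "AS5"],
]
chosen = choose_path("AS3", offers_to_as3)
print(chosen)
print(advertise("AS3", chosen))
```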
Four Types of BGP Messages#
BGP-4 is currently the most widely used version; [RFC 4271] specifies four types of BGP-4 messages.
[[Chapter 4 - Network Layer.pdf#page=352]]
- OPEN: Used to establish a relationship with another neighboring BGP speaker, initializing communication.
- KEEPALIVE: Used to periodically confirm the connectivity of neighbors.
- UPDATE: Used to announce information about a certain route and list multiple routes to be withdrawn.
- NOTIFICATION: Used to send detected errors.
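Every BGP-4 message starts with the same fixed header defined in RFC 4271: a 16-byte marker of all ones, a 2-byte total length, and a 1-byte type. The sketch below builds and parses that header only; the names `BgpType`, `build_keepalive`, and `parse_header` are illustrative, and no message bodies are handled.

```python
import struct
from enum import IntEnum

class BgpType(IntEnum):
    OPEN = 1
    UPDATE = 2
    NOTIFICATION = 3
    KEEPALIVE = 4

MARKER = b"\xff" * 16
HEADER = struct.Struct("!16sHB")  # marker, total message length, message type

def build_keepalive() -> bytes:
    """A KEEPALIVE message consists of the 19-byte header and nothing else."""
    return HEADER.pack(MARKER, HEADER.size, BgpType.KEEPALIVE)

def parse_header(data: bytes):
    """Return (length, type) from the start of a BGP message."""
    marker, length, msg_type = HEADER.unpack_from(data)
    if marker != MARKER:
        raise ValueError("unexpected marker")
    return length, BgpType(msg_type)

message = build_keepalive()
print(len(message), parse_header(message))
```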
Basic Working Principles of Routers#
Definition of Routers#
A router is a dedicated computer with multiple input and output ports, whose task is to forward packets.
- Routing Selection
- Routing selection processor
- Routing selection protocol
- Packet Forwarding
- Input
- Output
- Switching structure
Forwarding Process of Routers#
- Signals enter the router through one of its input ports.
- The physical layer converts the signal into a bit stream and delivers it to the data link layer for processing.
- The data link layer identifies frames from the bit stream, removes the frame header and trailer, and delivers it to the network layer for processing.
- If the packet delivered to the network layer is a normal data packet waiting to be forwarded:
- It performs a "match": it looks up the destination address in the forwarding table.
1. If no matching forwarding entry is found, the packet is discarded.
2. Otherwise, it is forwarded according to the port indicated in the matching entry.
- The network layer updates certain field values in the data packet header.
- For example, decrementing the Time to Live (TTL) of the data packet by one.
- It is then delivered to the data link layer for encapsulation.
- The data link layer encapsulates the data packet into a frame and delivers it to the physical layer for processing.
- The physical layer treats it as a bit stream and converts it into the corresponding electrical signal for transmission.
A branch may occur at step four:
- If the packet delivered to the network layer is a routing packet containing routing information,
- It sends the routing packet to the routing selection processor.
- The routing selection processor updates its routing table based on the content of the routing packet.
- The routing table generally only contains mappings from destination networks to the next hop.
- The routing table needs to optimize calculations for changes in network topology, while the forwarding table is derived from the routing table.
- The structure of the forwarding table should optimize the lookup process.
- The routing selection processor not only processes received routing packets but also periodically sends routing information it knows to other routers.
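The data-packet branch of the forwarding process above can be sketched in a few lines. Two assumptions are labeled here: the forwarding table is a plain list of (network, port) pairs, and when several entries match, the longest prefix wins, which is the usual IP rule but is not spelled out in these notes. The function name `forward` and the sample table are illustrative.

```python
import ipaddress

def forward(forwarding_table, dst, ttl):
    """Return (output_port, new_ttl), or None if the packet is discarded."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, port) for net, port in forwarding_table if addr in net]
    if not matches:
        return None                      # no matching entry: discard
    # Assumption: among matching entries, the longest prefix is preferred.
    net, port = max(matches, key=lambda m: m[0].prefixlen)
    new_ttl = ttl - 1                    # the header field updated on each hop
    if new_ttl <= 0:
        return None                      # lifetime used up: discard
    return port, new_ttl

table = [
    (ipaddress.ip_network("10.0.0.0/8"), 0),
    (ipaddress.ip_network("10.1.0.0/16"), 1),
]
print(forward(table, "10.1.2.3", ttl=64))     # matches both entries, /16 wins
print(forward(table, "10.2.0.1", ttl=64))     # matches only the /8 entry
print(forward(table, "192.168.1.1", ttl=64))  # no match
```

A linear scan keeps the sketch short; real forwarding tables use structures built for fast lookup, which is the point made about optimizing the forwarding table's structure.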
Data Packets Delivered to the Network Layer#
Routing Packets Delivered to the Network Layer#
Characteristics of Routers#
- The routing table generally only contains mappings from destination networks to the next hop.
- The routing table needs to optimize calculations for changes in network topology.
- The forwarding table is derived from the routing table.
- The structure of the forwarding table should optimize the lookup process.
- Each port of the router should also have input and output buffers.
- The input buffer is used to temporarily store packets that have just entered the router but have not yet been processed.
- The output buffer is used to temporarily store packets that have been processed but have not yet been sent.
- Router ports generally have both input and output functions (a single port serves both roles; the figures draw separate input and output ports only to simplify the explanation).
- The speed of the switching structure is crucial for the performance of the router.
Switching Structure#
Three ways to implement switching structures:
- Through memory
- Through a bus
- Through an interconnection network
The forwarding rates achievable by these three switching structures increase in that order.
ICMP Internet Control Message Protocol#
Why ICMP Was Invented#
?
To more effectively forward IP datagrams and increase the chances of successful delivery of IP datagrams.
Characteristics of ICMP#
Which Layer Does ICMP Operate On?#
?
ICMP operates at the Internet layer of the TCP/IP architecture.
Where is ICMP Encapsulated?#
?
ICMP messages are encapsulated in IP datagrams for transmission.
Who Uses ICMP and What for?#
Hosts or routers use ICMP to send ==error report messages== and ==inquiry messages==.
Types of ICMP Messages#
ICMP messages can be divided into the following two major categories
[[Chapter 4 - Network Layer.pdf#page=375]]
Error Report Messages#
?
Used to report error conditions to hosts or routers.
Common ICMP error report messages include the following five types
[[Chapter 4 - Network Layer.pdf#page=376]]
Destination Unreachable#
?
- When a router or host cannot deliver an IP datagram,
- It sends a destination unreachable message to the source.
- This can be further subdivided into up to 13 types based on the ICMP code field:
- Destination network unreachable,
- Destination host unreachable,
- Destination protocol unreachable,
- Destination port unreachable,
- Destination network unknown,
- Destination host unknown, etc.
Source Quench#
When a router or host drops an IP datagram due to congestion, it sends a source quench message to the source of the IP datagram, informing it to slow down the sending rate of IP datagrams.
[!HINT]
The sending speed is suppressed.
Time Exceeded (Timeout)#
When a router receives an IP datagram whose destination IP address is not its own, it decrements the value of the Time to Live (TTL) field in its header by 1.
If the result is not 0, the router forwards the datagram;
If the result is 0, the router not only discards the datagram but also sends a time exceeded (timeout) message to the source that sent the IP datagram.
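The TTL rule can be simulated without touching the network. This is a toy model with assumed names: `path` lists the routers in order followed by the destination host, and `send_probe` reports which node answers a datagram sent with a given initial TTL.

```python
def send_probe(path, ttl):
    """Walk a datagram along `path` (routers first, destination host last).

    Returns ("time exceeded", router) if a router decrements the TTL to 0,
    otherwise ("delivered", destination).
    """
    if ttl < 1:
        raise ValueError("initial TTL must be at least 1")
    *routers, destination = path
    for router in routers:
        ttl -= 1                 # each router decrements the TTL by 1
        if ttl == 0:
            return ("time exceeded", router)  # discard and report to the source
    return ("delivered", destination)

path = ["R1", "R2", "H2"]
for initial_ttl in (1, 2, 3):
    print(initial_ttl, send_probe(path, initial_ttl))
```

Sending probes with TTL 1, 2, 3, and so on makes each router on the path reveal itself in turn, which is the idea that route tracing tools build on.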
Parameter Problem#
When a router or destination host receives an IP datagram and finds an error in the header based on the value of the checksum field, it discards the datagram and sends a parameter problem message to the source that sent the datagram.
Redirect#
The router sends a redirect message to the host, informing it that the next IP datagram should be sent to another router, allowing it to reach the destination host via a better route.
Notes on Error Report Messages#
The following situations should not send ICMP error report messages
[[Chapter 4 - Network Layer.pdf#page=386]]
?
- Do not send ICMP error report messages for ICMP error report messages.
- That is, an error caused by an ICMP error report message is not itself reported.
- Do not send ICMP error report messages for any fragment of an IP datagram other than the first fragment.
- That is, errors in the subsequent fragments of a datagram are not reported.
- Do not send ICMP error report messages for IP datagrams with multicast addresses.
- That is, datagrams addressed to a group of hosts do not trigger error reports.
- Do not send ICMP error report messages for IP datagrams with special addresses (e.g., 127.0.0.0 or 0.0.0.0).
- That is, datagrams using such local or unspecified addresses are not reported.
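The four exclusions above can be expressed as one predicate. This is a sketch under stated assumptions: the function name and parameters are hypothetical, the caller already knows whether the offending datagram carried an ICMP error report, and a single address of that datagram is checked for the multicast and special-address rules.

```python
import ipaddress

def may_send_icmp_error(address, fragment_offset=0, is_icmp_error=False):
    """Decide whether an ICMP error report may be sent for an offending datagram.

    address:         address of the offending datagram to check (assumption: one address)
    fragment_offset: 0 for the first (or only) fragment
    is_icmp_error:   True if the datagram itself carries an ICMP error report
    """
    addr = ipaddress.ip_address(address)
    if is_icmp_error:
        return False        # no error reports about error reports
    if fragment_offset != 0:
        return False        # only the first fragment may be reported
    if addr.is_multicast:
        return False        # multicast datagrams are not reported
    if addr.is_loopback or addr.is_unspecified:
        return False        # special addresses such as 127.0.0.0 or 0.0.0.0
    return True

print(may_send_icmp_error("192.0.2.10"))
print(may_send_icmp_error("224.0.0.5"))
print(may_send_icmp_error("192.0.2.10", fragment_offset=185))
```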
Inquiry Messages#
?
Used to inquire about the status of hosts or routers.
Common ICMP inquiry messages include the following two types
[[Chapter 4 - Network Layer.pdf#page=388]]
Echo Request and Reply#
?
Sent by a host or router to a specific destination host or router.
The host or router that receives this message must send an ==ICMP Echo Reply message== back to the source host or router that sent the message.
This inquiry message is used to ==test whether the destination station is reachable and to understand its status==.
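For reference, an Echo Request is a small message: type 8, code 0, a 16-bit checksum, an identifier, a sequence number, and optional data (the reply uses type 0). The sketch below only builds the bytes and computes the standard Internet checksum; it does not send anything, and the function names are illustrative.

```python
import struct

ECHO_REQUEST = 8  # ICMP type for Echo Request; Echo Reply is type 0

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build the ICMP message; the checksum is computed with its field set to 0."""
    header = struct.pack("!BBHHH", ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ECHO_REQUEST, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
print(internet_checksum(packet))  # a correctly checksummed message sums to 0
```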
Timestamp Request and Reply#
?
Used to request a certain host or router to respond with the current date and time.
In the ICMP timestamp reply message, each timestamp is a ==32-bit== field. RFC 792 defines the integer as the number of milliseconds since midnight UT; the often-quoted count of seconds since ==January 1, 1900== is the epoch used by NTP timestamps.
This inquiry message is used for ==clock synchronization and time measurement==.
Typical Applications of ICMP#
Packet Inter-network Probe (Ping)#
?
- Packet inter-network probe PING is used to test the connectivity between hosts or routers.
- PING is an example of the application layer directly using ICMP at the Internet layer of the TCP/IP architecture; it does not go through the transport layer's TCP or UDP.
- The ICMP message type used in the PING application is ==Echo Request and Reply==.
- Some hosts or servers may ignore incoming ICMP Echo Request messages to prevent malicious attacks.
Traceroute#
?
- The traceroute application is used to probe which routers the IP datagram passes through to reach the destination host from the source host.
- The commands and implementation mechanisms of the traceroute application vary across different operating systems:
- In the UNIX version, the command is "==traceroute==". It sends datagrams using the ==UDP protocol== at the ==transport layer== and, at the ==network layer==, uses only ICMP ==error report messages==.
- In the Windows version, the command is "tracert". The application layer uses the Internet layer's ==ICMP protocol== directly, relying on ICMP ==Echo Request== and ==Reply Messages== as well as ==Error Report Messages==.
Traceroute - Example in Windows Version#
[!cite] Video transcript
- The figure shows the command line tool tracert, which I ran on my home host with the Windows operating system to probe which routers the host has to pass through to reach the official website server of the China Internet Network Information Center (CNNIC).
- Each row in the figure has three times because three tests are conducted for each router in the path.
- The appearance of an asterisk in the time indicates that no response message has been received from the router within the timeout period.
- There are various reasons for this situation; for example, a router may report errors for erroneous IP datagrams only selectively.
- For example, only one of ten identical errors is reported, rather than sending a corresponding error report for each identical error, to avoid being subjected to malicious attacks.
- Next, we take the tracert command in the Windows version as an example to see how it uses ICMP to trace routes. As shown in the figure, suppose host H1 wants to know which routers it passes through to reach host H2.
- H1 sends an ICMP Echo Request message to H2, which is encapsulated in an IP datagram.
- The Time to Live (TTL) field value in the IP datagram header is set to 1.
- The IP datagram reaches router R1, and its TTL field value is decremented by 1, resulting in 0.
- Therefore, R1 discards the datagram and sends an IP datagram encapsulated with an ICMP error report message back to the source host H1.
- The ICMP error report type is time exceeded.
- Thus, H1 knows the IP address of the first router in the path to reach H2.
- H1 continues to send the next IP datagram encapsulated with an ICMP Echo Request message, setting the TTL field value in the header to 2.
- After being forwarded by R1, the TTL field value of the datagram is reduced to 1.
- The datagram reaches R2, and its TTL field value is decremented to 0. Therefore, R2 discards the datagram and sends an IP datagram encapsulated with an ICMP error report message back to the source host H1.
- The ICMP error report type is time exceeded, so H1 knows the IP address of the second router in the path to reach H2.
- H1 continues to send the next IP datagram encapsulated with an ICMP Echo Request message, setting the TTL field value in the header to 3.
- After being forwarded by R1 and R2, the datagram reaches H2, and its TTL field value is reduced to 1 by R1 and R2![](ipfs://QmcfY63CZRQ