Packet loss

What does application performance mean in the network?

Controlling packet loss with the Allegro Network Multimeter

TCP protocol reliability

Packet loss disrupts network performance because it causes delays. TCP provides reliable data transmission, but in doing so it masks the effects of packet loss from the application. TCP's reliability is built on a concept called the "sliding window". This mechanism tracks the byte sequences that have been transmitted and the acknowledgements that have been received.
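The sliding window idea can be illustrated with a minimal sketch. It is a simplification, not how any real TCP stack is implemented: the class name `SlidingWindowSender` and the fixed window size are assumptions made for illustration, and byte offsets start at zero instead of a random initial sequence number.

```python
class SlidingWindowSender:
    """Toy sender that may only have `window_size` unacknowledged bytes in flight."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.last_acked = 0    # all bytes below this offset are acknowledged
        self.next_to_send = 0  # offset of the next byte to transmit

    def sendable_bytes(self):
        # Right edge of the window minus what has already been sent.
        return self.last_acked + self.window_size - self.next_to_send

    def send(self, nbytes):
        # Transmit as much as the window allows and return the amount sent.
        allowed = min(nbytes, self.sendable_bytes())
        self.next_to_send += allowed
        return allowed

    def on_ack(self, ack_offset):
        # Cumulative ACK: old or duplicate acknowledgements do not move the window.
        if ack_offset > self.last_acked:
            self.last_acked = min(ack_offset, self.next_to_send)


sender = SlidingWindowSender(window_size=4000)
print(sender.send(3000))        # 3000 bytes fit into the window
print(sender.send(3000))        # only 1000 more bytes fit
sender.on_ack(2000)             # receiver acknowledges the first 2000 bytes
print(sender.sendable_bytes())  # the window has slid forward by 2000 bytes
```

The window only slides forward when an acknowledgement arrives, which is why a lost packet or a lost ACK stalls the sender.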

Using sequence numbers, a receiver can inform the sender about missing data (e.g. after packet loss). Independently, the sender can detect packet loss when the retransmission timer expires. For performance analysis, the impact of packet loss must be understood in order to avoid chasing "ghosts in the machine". This article looks at the behavior and performance of these mechanisms.

The retransmission timer

The sender associates each transmitted segment with a retransmission timer. If the timer expires before the segment has been acknowledged, the segment is declared lost and is retransmitted. In terms of performance, two features of the retransmission timer are important:

  • The default value for the initial retransmission timeout (RTO) has traditionally been 3000 milliseconds (RFC 6298 lowered the recommended initial value to 1 second). This value is then dynamically adjusted to a more realistic value based on the measured round-trip time of the path.
  • The timeout value doubles for each subsequent retransmission of the same packet (exponential backoff).
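Both properties can be sketched in a few lines. The following estimator follows the smoothing formulas of RFC 6298 in simplified form: the clock granularity term is omitted, and the class name `RtoEstimator` and the `min_rto` and `max_rto` defaults are assumptions chosen for illustration. Real stacks use their own limits.

```python
class RtoEstimator:
    """Simplified retransmission timeout estimator in the style of RFC 6298."""

    ALPHA = 1 / 8  # gain for the smoothed round-trip time
    BETA = 1 / 4   # gain for the round-trip time variance

    def __init__(self, initial_rto=3.0, min_rto=1.0, max_rto=60.0):
        self.rto = initial_rto  # seconds; 3.0 is the traditional default
        self.min_rto = min_rto  # assumed lower bound
        self.max_rto = max_rto  # assumed upper bound
        self.srtt = None        # smoothed RTT, unknown until the first sample
        self.rttvar = None      # RTT variance estimate

    def on_rtt_sample(self, rtt):
        """Adjust the RTO after a round-trip time measurement (in seconds)."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        candidate = self.srtt + 4 * self.rttvar
        self.rto = min(self.max_rto, max(self.min_rto, candidate))
        return self.rto

    def on_timeout(self):
        """Double the RTO after a retransmission timeout (exponential backoff)."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto


est = RtoEstimator()
print(est.on_timeout())        # 6.0 after the first timeout
print(est.on_timeout())        # 12.0 after the second timeout
print(est.on_rtt_sample(0.1))  # clamped to the assumed 1.0 s minimum
```

Once a round-trip time has been measured, the timeout drops from seconds to a value close to the actual path delay, which is why losses early in a connection are the most expensive.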

For short data streams (e.g. web traffic), the retransmission timer is used to detect packet loss. A message of only 1000 bytes fits into a single packet. If that packet is lost, the receiver cannot acknowledge it or report it missing, because it has no idea that the packet was ever sent. If the loss occurs early in the life of a TCP connection, e.g. a SYN packet during the three-way handshake, no round-trip time has been measured yet, so the loss is not recovered until the initial timeout of three seconds has expired.
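To make the cost of repeated loss concrete, the following sketch adds up the waiting time when the same packet is lost several times in a row. It assumes the three-second initial timeout and the doubling rule described above; the function name `retransmission_schedule` is hypothetical.

```python
def retransmission_schedule(initial_rto=3.0, losses=3):
    """Return the elapsed time (seconds) at which each retransmission is sent."""
    elapsed = 0.0
    rto = initial_rto
    schedule = []
    for _ in range(losses):
        elapsed += rto            # wait for the current timeout to expire
        schedule.append(elapsed)  # the retransmission goes out now
        rto *= 2                  # exponential backoff for the next attempt
    return schedule


print(retransmission_schedule())  # [3.0, 9.0, 21.0]
```

Under these assumptions, a SYN that is lost three times delays the connection setup by 21 seconds before the fourth attempt is even sent.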

Triple duplicate ACKs

Within larger data flows, a lost packet can be detected before the retransmission timer expires, namely when the sender receives three duplicate ACKs. This mechanism is generally more efficient than waiting for the retransmission timer. When the receiving node gets packets that are out of sequence, i.e. packets that were sent after the missing data, it sends a duplicate ACK for each of them. Every duplicate ACK carries the sequence number that the receiver is still waiting for. When the sending node receives the third duplicate ACK, it assumes that the packet in question was not merely delayed but actually lost, and it retransmits the packet (fast retransmit). The sender also assumes that the network is congested and reduces the congestion window by 50 percent to counteract the congestion. From there, congestion avoidance increases the congestion window again slowly.
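The interplay between receiver and sender can be sketched as follows. This is a simplified model: segments are numbered 1, 2, 3, ... instead of using byte sequence numbers, and the function names `receiver_acks` and `fast_retransmit_points` are hypothetical.

```python
def receiver_acks(arrivals):
    """Yield the cumulative ACK the receiver sends for each arriving segment.

    The ACK value is the number of the next segment the receiver is waiting for.
    """
    received = set()
    expected = 1
    for segment in arrivals:
        received.add(segment)
        while expected in received:
            expected += 1
        yield expected


def fast_retransmit_points(acks, threshold=3):
    """Return the segments the sender retransmits after `threshold` duplicate ACKs."""
    retransmitted = []
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                retransmitted.append(ack)  # third duplicate: treat as lost
        else:
            last_ack = ack
            duplicates = 0
    return retransmitted


# Segment 3 is lost; segments 4, 5 and 6 arrive out of sequence.
acks = list(receiver_acks([1, 2, 4, 5, 6]))
print(acks)                          # [2, 3, 3, 3, 3]
print(fast_retransmit_points(acks))  # [3]
```

The first ACK for segment 3 is a regular acknowledgement; only the three that follow count as duplicates. With fewer than three packets behind the lost one, the threshold is never reached and the sender has to fall back on the retransmission timer.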

Figure: Effects of packet loss (blue diagram) on the congestion window (brown diagram)

For example, if a server transmits a large file to a client, the sending node ramps up its throughput gradually because of the slow-start mechanism. When the congestion window reaches 24, packet loss is detected through a triple duplicate ACK. The server then retransmits the lost data and reduces the congestion window to twelve. From this point on, the sender continues in congestion avoidance mode and the window grows again only slowly. This behavior is often seen in modern networks.
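The window behavior in the example above can be reproduced with a rough simulation. It is a sketch under stated assumptions: the window is counted in segments per round trip, the initial slow-start threshold of 16 is an arbitrary value chosen so that the window passes through 24, and exactly one loss occurs. The function name `cwnd_trace` is hypothetical.

```python
def cwnd_trace(rounds, loss_at_cwnd=24, initial_ssthresh=16):
    """Return the congestion window (in segments) at the start of each round trip."""
    cwnd = 1
    ssthresh = initial_ssthresh  # assumed value, not taken from the article
    loss_pending = True          # a single loss event is simulated
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if loss_pending and cwnd >= loss_at_cwnd:
            # Triple duplicate ACK: halve the window and stay in congestion avoidance.
            ssthresh = cwnd // 2
            cwnd = ssthresh
            loss_pending = False
        elif cwnd < ssthresh:
            # Slow start: the window doubles every round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: the window grows by one segment per round trip.
            cwnd += 1
    return trace


print(cwnd_trace(16))
# [1, 2, 4, 8, 16, 17, 18, 19, 20, 21, 22, 23, 24, 12, 13, 14]
```

In this sketch the sender needs twelve further round trips without loss to get from 12 back to 24, which illustrates why even occasional packet loss holds throughput down on long paths.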

Conclusion and corrective measures

Preventing packet loss caused by congestion clearly improves performance. This is only possible by reducing the congestion caused by competing traffic, which can be achieved in the following ways:

  • QoS policies for queue prioritization
  • Reduction of the total traffic or increase of the bandwidth

If packet loss is due to other conditions, such as a faulty network interface, a misconfigured queue, or a bad cable connection, it is important to ensure that TCP connections are not closed unnecessarily and that TCP sessions do not time out unnecessarily. The retransmission timeout can also be reduced: an initial value of three seconds feels like an eternity in most networks.
