Speeding Up TCP with Selective Loss Prevention
Low latency is an important design goal for reliable data transmission protocols such as TCP and QUIC. However, timeout-based loss recovery can unnecessarily increase end-to-end latency. Previous work on reducing this latency either duplicates every packet to avoid loss or fine-tunes the retransmission timers to shorten timeouts without causing spurious retransmissions. In this work, we propose a new mechanism called Selective Loss Prevention (SLP) to reduce the loss recovery latency of a reliable transport protocol. Through extensive trace analysis, we find that not all lost packets are equal: the loss of packets carrying certain flags, such as SYN and PSH, is more likely to cause a timeout than the loss of other packets. Based on this observation, we propose to selectively duplicate "important" packets, that is, packets whose loss is likely to increase a connection's latency. We design an algorithm that determines when to duplicate a lost packet proactively, and we incorporate it into TCP's congestion control algorithm so that duplicate packets do not congest the network. We implement SLP in the Linux kernel and evaluate its performance. Our results show that SLP reduces the timeout-based latency caused by the loss of important packets in a connection, and that its overhead is low.
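The selection idea can be illustrated with a minimal sketch. This is not the SLP algorithm itself, which the abstract does not specify. It assumes that "important" means carrying a SYN or PSH flag (the two examples named above) and that a duplicate is sent only when the congestion window has room for both the original and its copy. The names `Packet`, `IMPORTANT_FLAGS`, `should_duplicate`, and `transmissions` are hypothetical, and the window is counted in packets for simplicity.

```python
from dataclasses import dataclass

# Assumption: only the two flags named in the abstract are treated as important.
IMPORTANT_FLAGS = frozenset({"SYN", "PSH"})


@dataclass(frozen=True)
class Packet:
    seq: int
    flags: frozenset  # e.g. frozenset({"PSH", "ACK"})


def is_important(pkt: Packet) -> bool:
    """A packet is important if it carries at least one important flag."""
    return bool(pkt.flags & IMPORTANT_FLAGS)


def should_duplicate(pkt: Packet, cwnd: int, in_flight: int) -> bool:
    """Duplicate only important packets, and only when the congestion
    window (in packets) can hold the original plus one copy.
    This window check is a hypothetical stand-in for the paper's algorithm."""
    return is_important(pkt) and in_flight + 2 <= cwnd


def transmissions(pkt: Packet, cwnd: int, in_flight: int) -> list:
    """Return the list of copies that would be put on the wire."""
    copies = 2 if should_duplicate(pkt, cwnd, in_flight) else 1
    return [pkt] * copies


if __name__ == "__main__":
    syn = Packet(seq=1, flags=frozenset({"SYN"}))
    ack = Packet(seq=2, flags=frozenset({"ACK"}))
    print(len(transmissions(syn, cwnd=10, in_flight=3)))
    print(len(transmissions(ack, cwnd=10, in_flight=3)))
```

Charging the duplicate against the congestion window is one simple way to keep duplicates from adding load beyond what congestion control already allows. When the window is nearly full, the sketch falls back to sending a single copy.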