One issue that I continually see reported by customers is slow network performance. Although there are a ton of issues that can affect how fast data moves to and from a server, there is one fix I've found that will resolve this 99% of the time: disable Large Send Offload on the Ethernet adapter.
So what is Large Send Offload (also known as Large Segmentation Offload, and LSO for short)? It's a feature on modern Ethernet adapters that allows the TCP/IP network stack to build a large TCP message of up to 64KB in length before passing it to the Ethernet adapter. The hardware on the Ethernet adapter (what I'll call the LSO engine) then segments the message into smaller data packets (known as "frames" in Ethernet terminology) that can be sent over the wire: up to 1500 bytes for standard Ethernet frames and up to 9000 bytes for jumbo Ethernet frames. In return, this frees the server's CPU from having to segment large TCP messages into smaller packets that fit inside the supported frame size, which means better overall server performance. Sounds like a good deal. What could possibly go wrong?
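To put some numbers on that segmentation work, here is a quick sketch of how many frames a full 64KB TCP message becomes at each frame size (this assumes the common case of 20-byte IP and 20-byte TCP headers with no options; real header sizes can vary):

```python
import math

def segments_needed(message_bytes: int, mtu: int,
                    ip_header: int = 20, tcp_header: int = 20) -> int:
    """Number of frames needed to carry a TCP message of the given size.

    The usable payload per frame (the Maximum Segment Size, or MSS) is
    the MTU minus the IP and TCP headers, because every frame has to
    carry its own copy of both headers.
    """
    mss = mtu - ip_header - tcp_header
    return math.ceil(message_bytes / mss)

message = 64 * 1024  # a maximal 64KB LSO message

print(segments_needed(message, 1500))  # standard frames -> 45
print(segments_needed(message, 9000))  # jumbo frames    -> 8
```

Either way, that is dozens of per-frame header builds the CPU no longer has to do itself when the LSO engine handles it.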
Quite a lot, as it turns out. In order for this to work, the other network devices — the Ethernet switches through which all traffic flows — all have to agree on the frame size. The server cannot send frames that are larger than the Maximum Transmission Unit (MTU) supported by the switches. And this is where everything can, and often does, fall apart.
The server can discover the MTU by asking the switch for the frame size, but there is no way for the server to pass this along to the Ethernet adapter. The LSO engine doesn't have the ability to use a dynamic frame size. It simply uses the default standard value of 1500 bytes, or if jumbo frames are enabled, the size of the jumbo frame configured for the adapter. (Because the maximum size of a jumbo frame can vary between different switches, most adapters allow you to set or select a value.) So what happens if the LSO engine sends a frame larger than the switch supports? The switch silently drops the frame. And this is where a performance enhancement feature becomes a performance degradation nightmare.
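To make the failure mode concrete, here is a toy model of what a switch port does with each frame (the function name and return convention are mine, purely for illustration; this is not any real switch API):

```python
from typing import Optional

def switch_forward(frame_len: int, switch_mtu: int = 1500) -> Optional[int]:
    """Toy model of a switch port: forward a frame or silently drop it.

    Returns the frame length if forwarded, or None for an oversized
    frame. The crucial detail is the silence: the switch sends nothing
    back to the sender, so the server has no idea the frame was lost.
    """
    if frame_len > switch_mtu:
        return None  # silent drop: no ICMP, no error, nothing
    return frame_len

# A jumbo-sized frame from the LSO engine meets a standard-MTU switch:
print(switch_forward(9000))  # dropped without a trace
print(switch_forward(1500))  # forwarded normally
```

The sender gets no error to react to, which is exactly why the only recovery mechanism left is a timeout.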
To understand why this hits network performance so hard, let's follow a typical large TCP message as it traverses the network between two hosts.
- With LSO enabled, the TCP/IP network stack on the server builds a large TCP message.
- The server sends the large TCP message to the Ethernet adapter to be segmented by its LSO engine for the network. Because the LSO engine cannot discover the MTU supported by the switch, it uses a standard default value.
- The LSO engine sends each of the frame segments that make up the large TCP message to the switch.
- The switch receives the frame segments, but because the LSO engine sent frames larger than the MTU, they are silently discarded.
- On the server that is waiting to receive the TCP message, the timeout clock reaches zero when no data arrives, and it sends back a request to retransmit the data. Although the timeout is very short in human terms, it is rather long in computer terms.
- The sending server receives the retransmission request and rebuilds the TCP message. But because this is a retransmission request, the server does not send the TCP message to the Ethernet adapter to be segmented. Instead, it handles the segmentation process itself. This appears to be designed to overcome failures caused by the offloading hardware on the adapter.
- The switch receives the retransmission frames from the server, which are the proper size because the server is able to discover the MTU, and forwards them on toward the receiving server.
- The other server finally receives the TCP message intact.
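The walkthrough above can be sketched as a small simulation (the constants, function name, and log messages are mine, chosen to illustrate the sequence; this is not real network code):

```python
import math

MTU = 1500          # what the switch actually supports
LSO_FRAME = 9000    # what a misconfigured LSO engine emits
HEADERS = 40        # 20-byte IP header + 20-byte TCP header

def send_message(message_bytes: int) -> list:
    """Simulate the send / drop / timeout / retransmit sequence."""
    log = []
    # First attempt: segmented by the LSO engine at the wrong frame size.
    frames = math.ceil(message_bytes / (LSO_FRAME - HEADERS))
    log.append(f"LSO engine sends {frames} oversized frames")
    log.append("switch silently drops all of them")
    log.append("receiver times out, requests retransmission")
    # Retransmission: the host stack segments at the correct size itself.
    frames = math.ceil(message_bytes / (MTU - HEADERS))
    log.append(f"host stack resends {frames} correctly sized frames")
    log.append("message finally delivered")
    return log

for step in send_message(64 * 1024):
    print(step)
```

Every large message pays for two full segmentation passes plus a timeout, and only the second pass ever reaches the wire.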
This can basically be summed up as: offload data, segment data, discard data, wait for timeout, request retransmission, segment retransmission data, resend data. The big delay is waiting for the timeout clock on the receiving server to reach zero. And the whole process repeats the very next time a large TCP message is sent. So is it any wonder that this can cause severe network performance issues?
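To see why the timeout dominates, compare the time a 64KB message actually spends on the wire with a typical retransmission timeout. The 200 ms figure below is an assumption on my part (it matches the commonly cited minimum retransmission timeout on Linux; actual values vary by OS and measured round-trip time):

```python
link_bps = 1_000_000_000   # assume a 1 Gbps link
message_bits = 64 * 1024 * 8

wire_time_ms = message_bits / link_bps * 1000
timeout_ms = 200.0  # assumed minimum retransmission timeout

print(f"wire time: {wire_time_ms:.3f} ms")  # roughly half a millisecond
print(f"timeout:   {timeout_ms:.0f} ms")
print(f"penalty:   {round(timeout_ms / wire_time_ms)}x")
```

Under these assumptions, every dropped message costs hundreds of times longer than simply sending it would have.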
This is by no means an issue that affects only Peer 1. Google is littered with articles by major vendors of both hardware and software telling their customers to turn off Large Send Offload. Nor is it specific to one operating system. It affects both Linux and Windows.
I've found that Intel adapters are by far the worst offenders with Large Send Offload, but Broadcom adapters have problems with it as well. And, naturally, this is a feature that is enabled by default, meaning that you have to explicitly turn it off in the Ethernet driver (preferred) or in the server's TCP/IP network stack.
In the next article, I'll describe how to turn off Large Send Offload on both Linux and Windows systems.