Saturday, December 19, 2015

Efficient file transmission, part 2

Right after I have fixed the SFTP performance problem mentioned in part 1, I found that another feed, this time using FTP, is gathering a backlog. Again the bandwidth was not an issue, latency was on the order of 40 ms, but somehow pushing even the smallest files took no less than 5 seconds. Equipped with Wireshark and a lot of enthusiasm, I started looking into the problem.
FTP is a much less sophisticated protocol than SFTP. It creates a dedicated TCP connection for every file transfer. The entire file is transferred over that connection, and closing the connection marks the end of file. All commands are exchanged over so-called control connection. Normal sequence of events for FTP upload is (after establishing the control connection & logging in):
  • Client sends PASV command
  • Server responds with an address, to which the client should upload
  • Client connects to that address and issues a STOR command
  • Client starts pushing the file through the data connection
The first thing I noticed when examining a wireshark trace was a two second gap between receiving PASV response and sending SYN packet; during that gap, two NBNS packets were sent to the server, with no response. NBNS stands for NetBIOS name service, and was completely out of place at that point, because PASV returns an IP address which does not require any lookups.
I launched the application in Visual Studio debugger, and during the 2 second pause I hit "Break all" to see what was happening. I found that the application was using DNS.Resolve to convert string to convert string to IPAddress object. The method is obsolete in .NET 2.0, and documentation suggests to use DNS.GetHostEntry instead. GetHostEntry (and Resolve) method has an interesting behavior that was not documented until .NET 3.5. When you pass an IP string to the method, it first performs a reverse name lookup to get host name, then it performs a name lookup to get IP addresses associated with the name. Since we only wanted an IPAddress object, I changed the code to use DNS.GetHostAddresses, which does not use name servers when passed an IP literal. This saved 2 seconds on every transfer.

Next interesting thing I noticed was that the client was sending data in 8KB batches; it sent a batch, then waited a long time for the server to acknowledge previous batch. Only then it sent another 8KB batch, which meant that most of the time the client was waiting for an acknowledgement from the server.
Receive window size captured in the traces showed plenty of free space; for a brief moment I suspected the congestion window, until I read that congestion window size doubles every round trip time as long as there are no packet losses, which were absent in the traces. Then I found that the default send buffer size in Windows is 8KB.
I modified the send buffer size to 1MB. After that the client was no longer waiting for acknowledgements, but instead was sending data until the receive window was full; then the server advertised a larger receive window, enabling the client to send even more data, until the network started dropping packets.
Final result: files with average size of 500KB that previously took 5+ seconds to upload, now were sent in less than a second over the same link.

No comments:

Post a Comment