"What can anyone give you greater than now" - William Stafford
The development of high-speed networks such as ATM and Myrinet promised performance gains for network applications. In general, and especially for local-area traffic, these gains have not been realized (see [Keeton95] for a detailed explanation and performance measurements). Examination of these results shows that packet processing overhead is the primary culprit for poor performance - that is, the time spent preparing packets for transmission and receiving them off the network is absurdly high.
The immediate problem, that of high processing overheads, has been licked by systems such as Fast Messages, U-Net, and our very own Active Messages. Unfortunately, all of these systems use new programming models - they deliver high performance only for programs written to use them. If you have a legacy application, you're out of luck. (While U-Net has a version of TCP/IP running on it, you can't get it off their web page, and I can't determine whether they use a standard programming interface.)
The nice thing about most network applications is that they are written to an interface, not a protocol or a network. This means that it is okay to change the underlying implementation or even the protocol, as long as things look the same to the programmer. Our Fast Communications work has exploited this feature. We have developed Fast Sockets, a user-level library which provides the Berkeley Sockets API but yields high-performance communication. It is built upon the Active Messages work done here at the University.
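To make the "written to an interface" point concrete, here is a minimal sketch of a program that touches only sockets calls (socket, bind, listen, accept, connect, send, recv). It uses Python's standard socket module, which wraps the Berkeley Sockets API, and runs over ordinary loopback TCP; it is not Fast Sockets code, and the function names echo_once and run_echo_demo are made up for illustration. Nothing in it names a protocol implementation or a network, which is the property a drop-in sockets library relies on.

```python
import socket
import threading


def echo_once(listener: socket.socket) -> None:
    """Accept one connection and echo back every byte until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: peer has finished sending
                break
            conn.sendall(data)


def run_echo_demo(payload: bytes = b"hello") -> bytes:
    """Send payload to a local echo server and return what comes back."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=echo_once, args=(listener,))
    server.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    client.sendall(payload)
    client.shutdown(socket.SHUT_WR)  # tell the server no more data is coming

    chunks = []
    while True:
        chunk = client.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)

    client.close()
    server.join()
    listener.close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(run_echo_demo(b"hello"))
```

The application logic sees only the socket calls, so whatever sits beneath them can change without the program noticing, as long as the calls behave the same way.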
We have recorded single-byte round-trip times of 81 microseconds using Fast Sockets on the Myrinet/SPARCstation 20 network here at Berkeley; preliminary results on the UltraSPARCs show single-byte round-trips of 54 microseconds. Realized bandwidth is currently about 12 MB/s on the SPARCstation 20s and 18 MB/s on the UltraSPARCs - these figures are held down by a forced copy, which we are trying to eliminate.
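For readers unfamiliar with the metric, a single-byte round-trip time is usually obtained with a ping-pong loop: send one byte, wait for it to come back, repeat, and divide the elapsed time by the number of iterations. The sketch below shows that method using a local socket pair from Python's standard library. It is an assumption about how such a measurement can be made, not the harness used for the numbers above, and it exercises the local kernel socket path rather than Fast Sockets or Myrinet, so its output is not comparable to those figures. The name measure_round_trip is hypothetical.

```python
import socket
import threading
import time


def measure_round_trip(iterations: int = 1000) -> float:
    """Return the mean single-byte round-trip time in microseconds."""
    near, far = socket.socketpair()  # two connected stream sockets

    def reflector() -> None:
        # Bounce each byte straight back to the sender.
        for _ in range(iterations):
            byte = far.recv(1)
            if not byte:  # peer closed early
                break
            far.sendall(byte)

    worker = threading.Thread(target=reflector)
    worker.start()

    start = time.perf_counter()
    for _ in range(iterations):
        near.sendall(b"x")  # ping
        near.recv(1)        # pong
    elapsed = time.perf_counter() - start

    worker.join()
    near.close()
    far.close()
    return elapsed / iterations * 1e6  # seconds -> microseconds


if __name__ == "__main__":
    print(f"mean round trip: {measure_round_trip():.1f} microseconds")
```

Averaging over many iterations matters because a single exchange is shorter than the resolution and jitter of most timers.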