Saturday, March 27, 2010

Comparison of HTTP polling duplex and net.tcp performance in Silverlight 4 RC

Silverlight 4 RC, which shipped recently at MIX 2010, supports a new mode of the HTTP polling duplex protocol with greatly improved performance compared to the version in Silverlight 3. This post compares the performance of the three mechanisms for asynchronous data push from the server to the client available in Silverlight 4 RC: the net.tcp protocol and the two modes of the HTTP polling duplex protocol (SingleMessagePerPoll and MultipleMessagesPerPoll).

The net.tcp protocol was added in Silverlight 4 Beta2. It enables duplex communication with a WCF service exposed using the net.tcp binding. You can read more about the net.tcp protocol in the previous two articles: an introductory post about the net.tcp protocol in Silverlight 4, and a pub/sub sample using the net.tcp protocol.

The HTTP polling duplex protocol has been supported since Silverlight 2. At a high level, the protocol uses the HTTP long polling mechanism to enable the server to send messages to the client asynchronously. The only mode supported by the protocol before Silverlight 4 is described in some depth in my article on the scale-out of the HTTP polling duplex protocol. The distinguishing characteristic of this mode is that the server can send only one message back to the client per HTTP long poll. This mode, now called SingleMessagePerPoll, continues to be supported in Silverlight 4 for backwards compatibility.

The new HTTP polling duplex mode added in Silverlight 4 RC enables the server to send multiple messages back to the client using a single HTTP long poll response. In cases where the server needs to send a large number of messages to the client, this mode can provide dramatic improvements in communication performance compared to the SingleMessagePerPoll mode. The new mode, called MultipleMessagesPerPoll, uses the .NET Framing protocol to frame multiple logical messages on a single HTTP response. The resulting binary octet stream is sent using HTTP response chunking, which is enabled in Silverlight 4 and may allow the client to receive some of the messages before the server is done sending them. In addition, WCF’s binary session encoding is used to encode all messages sent over the single poll response, which reduces bandwidth consumption (ad-hoc measurements indicate about a 50% reduction compared to sending the same set of test messages using text encoding).
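
To make the idea of framing several logical messages on one chunked response more concrete, here is a minimal sketch in Python. It uses a plain 4-byte length prefix per message, which is an assumption made purely for illustration and is not the actual .NET Framing wire format; the helper names `frame_messages` and `iter_frames` are hypothetical.

```python
import struct
from typing import Iterable, Iterator


def frame_messages(messages: Iterable[bytes]) -> bytes:
    """Concatenate messages into one stream, each prefixed by its length."""
    out = bytearray()
    for msg in messages:
        out += struct.pack(">I", len(msg))  # 4-byte big-endian length prefix
        out += msg
    return bytes(out)


def iter_frames(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield each complete message as soon as enough bytes have arrived.

    `chunks` stands in for HTTP response chunks, which may split a
    message (or its length prefix) at arbitrary byte boundaries.
    """
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
        while len(buf) >= 4:
            (size,) = struct.unpack(">I", buf[:4])
            if len(buf) < 4 + size:
                break  # incomplete message: wait for the next chunk
            yield bytes(buf[4:4 + size])
            del buf[:4 + size]


if __name__ == "__main__":
    stream = frame_messages([b"msg-1", b"msg-2", b"msg-3"])
    # Deliver the stream in 4-byte pieces to mimic chunked transfer.
    chunks = [stream[i:i + 4] for i in range(0, len(stream), 4)]
    for message in iter_frames(chunks):
        print(message.decode())
```

Because the reader yields a message as soon as its last byte arrives, the client can start processing early messages while later ones are still in flight, which is the property the chunked response is meant to provide.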

Performance benchmark

To compare the relative performance of the net.tcp protocol and the two modes of the HTTP polling duplex protocol, I have created a small benchmark application. The application does not aspire to simulate a real-world scenario: there is a single Silverlight client and a single WCF backend server, with the server sending a large number of messages to the client. Given the structure of the test, the results of the benchmark should not be used for anything other than setting expectations about the relative performance of the three protocols. In particular, real-world performance will be affected by several factors, including the number of concurrent clients, server and client hardware, and the network environment. Although I am showing absolute throughput numbers below, they should be viewed as the results of an “optimistic case” on the given hardware, since a situation where only one client is connected to a server is highly unrealistic. It is best to view these results as providing an idea of the relative performance of the protocols.

A few words about the configuration of the benchmark: server and client were running on a single Intel dual-core 2.26GHz box with 4GB RAM, Windows 7 Ultimate, IIS7, and .NET Framework 3.5 SP1. The client was Silverlight 4 RC running in IE 8. After the client sent a single request message to the server, the server responded by sending the requested number of messages back to the client using the selected protocol. The client measured the time it took to receive all the messages and calculated the resulting throughput. Each measurement was taken 10 times and the average was calculated. Moreover, two variations of the test were conducted: in one, the client was sending and receiving messages on the UI thread of the Silverlight application; in the other, a worker thread was used. The code of the benchmark application can be downloaded from here.
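
The measurement procedure described above can be sketched as follows. This is an illustrative Python outline rather than the benchmark's actual code: `receive_all` is a hypothetical callable standing in for "send one request and wait until all messages have arrived", and averaging the per-run throughput (rather than the per-run time) is an assumption.

```python
import time
from statistics import mean
from typing import Callable


def measure_throughput(
    receive_all: Callable[[int], None],
    message_count: int,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return messages per second for a single run."""
    start = clock()
    receive_all(message_count)  # blocks until every message has been received
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure")
    return message_count / elapsed


def average_throughput(
    receive_all: Callable[[int], None],
    message_count: int,
    runs: int = 10,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Repeat the measurement `runs` times and average the results."""
    return mean(
        measure_throughput(receive_all, message_count, clock)
        for _ in range(runs)
    )
```

The `clock` parameter exists only so the timing source can be swapped out, for example with a fake clock when checking the arithmetic.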

Results are presented in the graph below. Please note that the scale is logarithmic.

Relative performance of net.tcp and HTTP polling duplex protocols in Silverlight 4 RC

Conclusions

First and foremost, the performance increase of the new MultipleMessagesPerPoll mode of the HTTP polling duplex protocol, compared to the SingleMessagePerPoll mode supported since Silverlight 2, is a whopping 91,000% (that is, 910 times faster) on a worker thread. In fact, using HTTP response chunking to send multiple messages back to the client in a single HTTP response allows the HTTP polling duplex protocol to achieve 88% of the net.tcp performance on a worker thread. This is a great result considering that net.tcp is the WCF protocol that offers the best throughput, and the 12% performance loss compared to net.tcp is a price well worth paying for avoiding the restrictions associated with using net.tcp in Silverlight applications.

All but one of the test variations in which the client initiated communication from a worker thread are substantially faster than the corresponding UI thread variations. A Silverlight application runs a single UI thread, while several worker threads may be created. So in the UI thread variations, the communication bottleneck was clearly related to the need to synchronize response processing on a single thread on the client side. The one surprising exception to this rule is the SingleMessagePerPoll mode, which shows the same performance on the worker thread and the UI thread. This results from a combination of two factors. First, in the SingleMessagePerPoll mode, every message from the server to the client requires a new HTTP long poll from the client. Second, the HTTP implementation in Silverlight 4 synchronizes low-level operations on the UI thread even if the request originated on a worker thread (a known limitation that will be addressed in future versions). The MultipleMessagesPerPoll mode does not suffer from this constraint, since sending multiple messages from the server to the client requires only a single HTTP request.

Also worth calling out is the performance benefit of binary session encoding (which is the default) compared to text encoding in the MultipleMessagesPerPoll mode on a worker thread. Binary encoding offers 138% of the throughput of text encoding. This is related to the reduced processing cost of binary XML compared to text XML, not to mention the reduction in network bandwidth (~50% of text encoding).
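
To put the ratios quoted in these conclusions on a single scale, the following Python sketch normalizes them to net.tcp throughput on a worker thread. It uses only the three ratios stated in the text and assumes that both the 88% and the 910x figures refer to MultipleMessagesPerPoll with the default binary encoding; the function name `relative_to_nettcp` is hypothetical.

```python
# Ratios quoted in the text (worker-thread variations only).
MULTI_BINARY_VS_NETTCP = 0.88  # MultipleMessagesPerPoll reaches 88% of net.tcp
MULTI_VS_SINGLE = 910.0        # MultipleMessagesPerPoll is 910x SingleMessagePerPoll
BINARY_VS_TEXT = 1.38          # binary encoding yields 138% of text throughput


def relative_to_nettcp() -> dict:
    """Express each variation as a fraction of net.tcp throughput.

    Assumption: the 88% and 910x figures both describe the default
    (binary) encoding of MultipleMessagesPerPoll.
    """
    multi_binary = MULTI_BINARY_VS_NETTCP
    return {
        "net.tcp": 1.0,
        "MultipleMessagesPerPoll (binary)": multi_binary,
        "MultipleMessagesPerPoll (text)": multi_binary / BINARY_VS_TEXT,
        "SingleMessagePerPoll": multi_binary / MULTI_VS_SINGLE,
    }


if __name__ == "__main__":
    for name, fraction in relative_to_nettcp().items():
        print(f"{name}: {fraction:.3%} of net.tcp")
```

Under that assumption, text-encoded MultipleMessagesPerPoll comes out at roughly 64% of net.tcp, and SingleMessagePerPoll at roughly 0.1%.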

20 comments:

  1. Could you post the source code for this test?

  2. Fallon, the benchmark code is available at http://janczuk.org/code/samples/nettcpperf.zip.

  3. Is the port restriction removed for net.tcp in Silverlight 4 RC? I think I have seen somewhere that this restriction is removed.

  4. Port restrictions continue to apply when the Silverlight application runs in the browser. There are no restrictions in the out-of-browser mode (this also applies to cross-domain HTTP calls).

  5. More precisely, there are no restrictions for Trusted Applications.

  6. Thanks for posting this entry Tomasz, the information is concise and very helpful!

  7. Great article !!

    Thank you so much for sharing your evaluations of Silverlight duplex communication.

    How much do you think MultipleMessagesPerPoll affects the scalability issues of the polling duplex protocol (approximately)?
    That is, compared to the conditions of your performance test article. I presume it wouldn't be 910 times more scalable, but could we say two times, five times, or 50 times?


    (revisited article of polling duplex scalability with multiple messages per poll would be excellent, but now I'm just being pain in the a** :D )

  8. Ted, I don't have data to highlight any differences in backend scalability between SingleMessagePerPoll and MultipleMessagesPerPoll at this point. However, I would expect the backend to be able to support a very similar number of concurrent client connections in both HTTP long polling duplex modes, given that the threading model is largely the same.

  9. Hmm... There's an example in MSDN called How to: Build a Duplex Service for a Silverlight Client (http://msdn.microsoft.com/en-us/library/cc645027%28VS.95%29.aspx). It's quite different from the one presented here. That example works for me, but the one presented in this blog post, NettcpPerf, just throws an exception, which basically boils down to
    "{System.InvalidOperationException: Unrecognized attribute 'maxPendingMessagesPerSession' in service reference configuration. Note that attribute names are case-sensitive. Note also that only a subset of the Windows Communication Foundation configuration functionality is available in Silverlight.
    at System.ServiceModel.Configuration.PollingDuplexElement.ReadXml(XmlReader reader)}".

  10. How about a comparison of server scalability between the MultipleMessagesPerPoll mode of the HTTP polling duplex protocol and the net.tcp protocol?

  11. Aio, I don't have a comparison of server side scalability between MultipleMessagesPerPoll and net.tcp, but I expect net.tcp to do better in all scenarios due to HTTP overhead over TCP. In some messaging patterns the difference may be negligible; in particular, if the server keeps sending messages back to the client at frequency shorter than MaxOutputDelay, I expect the MultipleMessagesPerPoll to scale to a number of concurrent connections very similar to net.tcp.

  12. From a previous blog QUOTE ==> We are planning to release a sample (in source code form) that demonstrates the consumption of the protocol from .NET 3.5, I will make sure to blog about it once it is available. <== END QUOTE

    Any news about that non-silverlight client sample that would support the long polling duplex protocol?

    Thanks!
    Eric

  13. Hi!

    Thanks for a great article. We are currently implementing a polling duplex (HTTP binding) publish/subscribe service similar to the one you described in a previous article. We are now looking at securing this service and exploring WIF. Do you know of any issues we may have with polling duplex, or alternatively, could you suggest a best practice for securing this service?

    Kind Regards & Thanks in advance
    Vanessa

  14. Vanessa, your options for securing this communication are limited. For server authentication, integrity, and confidentiality you can use HTTPS. For client authentication, you can use transport-level authentication at the HTTPS level or pass a username/password security token at the SOAP level using WS-Security. On the server side both should compose with WIF.

  15. Hi Tomasz, I wonder if you saw the same performance improvement when the client and server are on different machines? I created a server to send a jpeg image (16Kb) to the client on every request. A request is generated by the client on every MouseMove event. It works very fast when everything is local, but when the client is on another machine, the speed is not good (even worse than the SingleMessagePerPoll mode).

  16. Thanh, the benchmark numbers above are for a setup with client and server running on the same machine. I remember running the same benchmark with client and server on separate machines, and the relative performance of http and net.tcp was comparable. However, I have not saved those results. There are many factors that may influence the performance degradation you see in a cross-machine deployment, and it is hard to speculate without in-depth investigation. Sending a 16k message to a client on every mouse move event does sound like a lot of data to me. A few knobs you may want to try to tweak are documented at http://blogs.msdn.com/b/silverlightws/archive/2010/07/16/pollingduplex-multiple-mode-timeouts-demystified.aspx.

  17. Hi Tomasz, I conducted another experiment with the service sending an image (20K) to the client for every request. 50 requests are generated on the client inside a single loop. Here is what I found:

    When the service was hosted outside IIS, I got an overall speed (from beginning to the end) of 9 fps or even more (15, 20 fps). When it was hosted inside IIS, the speed dropped to 5 fps.

    In both cases, all images were returned in a single response. The time it took the client to retrieve the entire stream was 500 ms and 5.5 s respectively.

    Can you explain why there's such a difference between hosting the service inside and outside IIS?

    Thanks

  18. Hi Tomasz,

    I noticed that in your sample NettcpPerf, you have defined PubSubChunkedBinary/PubSubChunkedText using a bindingConfiguration based on the pollingDuplexBinding configuration. In those bindings, you have set the useTextEncoding property as needed. Is this necessary?

    I've noticed other samples around the net that don't include an extra pollingDuplexBinding extension configuration element and simply use the customBinding when creating their endpoints.

    If you don't use this does it have any impact on the results?

    Thanks,
    Kevin

  19. Kevin, you can configure the endpoint using either the http polling duplex binding, or using a custom binding with polling duplex binding _element_. There is no difference other than the custom binding approach allowing you more control over properties that are not exposed from http polling duplex binding.

  20. Hello Tomasz, I know this blog post is almost 4 years old- but if the code for your http://janczuk.org/code/samples/nettcpperf.zip application is still available, it would be a great help for some of the metrics I'm running. The link in the blog post for the nettcpperf.zip is broken. Thanks!!!! Michael.


My name is Tomasz Janczuk. I am currently working on my own venture - Mobile Chapters (http://mobilechapters.com). Formerly at Microsoft (12 years), focusing on node.js, JavaScript, Windows Azure, and .NET Framework.