On performance measurements in general
Performance measurements are highly sensitive to details: a single implementation detail can have a huge impact. The results posted here apply only to the implementation that was used to generate them. You can find the link to that implementation, the WCF Test Harness code (on CodePlex), below.
The main reasons for blogging about these results are to receive feedback and validation from others who are interested in them, and to verify that the implementation doesn’t contain any issues that might invalidate the results. If you find anything wrong with the code, have trouble reproducing the results, etc., please leave a comment and I’ll get back to you as soon as possible.
Throughput vs Latency
The implementation of the WCF Latency Test Harness focuses on the latency of a range of methods implemented by a WCF service and called by a WCF client. For throughput measurements on WCF services, please see this Microsoft whitepaper.
Variations during the performance tests
The WCF Latency Test Harness lets you measure the latency of WCF while varying the following parameters:
- Test scope: Direct, SameAppDomain, SameProcess, SameMachine, Remote
- Bindings: nullTransport, netNamedPipe, netTcp, netMsmq, basicHttp, wsHttp (and .NET Remoting).
- InstancingModes: Single, PerSession, PerCall
- Serializers: DataContractSerializer (DefaultSF (for Remoting, this is the BinaryFormatter)), PreserveObjectReferencesDataContractSerializer (PORDCSF), NetDataContractSerializer (NDCSF), XmlSerializer (XMLSF)
- CallModes: OneWay, TwoWay
- ReceiveDataMethods: several methods are implemented, each using a different data type for the data to be transferred, ranging from byte array, string and Stream to more complex data structures.
- DataSizes: size of the data type/data structure to be transferred.
You’ll find enums for most of the parameters above in the implementation (where the .NET Framework doesn’t already provide them). Most of these parameters can be changed through configuration alone.
Overview of measurement method
Before the tests start, the appropriate configuration files are copied to the service and client private application paths. The service is self-hosted and loads the configuration from those copied files.
The implementation contains a WCF service, which exposes a number of methods via a ServiceContract and DataContract. It also contains a WCF client, which calls the WCF service methods within a loop.
The latency of the WCF service methods is measured using System.Diagnostics.Stopwatch. The client call to the WCF service is synchronized (even in a OneWay scenario) by means of a second WCF netTcp service, hosted by the client, which receives a receipt signal from the service under test. As a result, there is never more than one call on the service at any one time. This also means that the queueing mechanism of WCF is not tested to its full extent. Since the aim of the WCF Latency Test Harness is to measure latency, this is acceptable.
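The measurement loop described above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the harness's actual C# code: `time.perf_counter` stands in for Stopwatch, a `threading.Event` stands in for the receipt signal that the second netTcp service delivers, and `measure_latency`, `fake_one_way_service` and the 5-second timeout are hypothetical names and values chosen for the example.

```python
import threading
import time
from statistics import mean


def measure_latency(call_service, iterations=100, timeout=5.0):
    """Time one call at a time; a call counts as finished when its receipt arrives."""
    latencies = []
    for _ in range(iterations):
        receipt = threading.Event()          # stand-in for the receipt signal
        start = time.perf_counter()          # stand-in for Stopwatch.Start()
        call_service(receipt.set)            # the service signals back on receipt
        if not receipt.wait(timeout=timeout):
            raise TimeoutError("no receipt signal from the service")
        latencies.append(time.perf_counter() - start)
    return latencies


def fake_one_way_service(on_received):
    """Hypothetical OneWay-style operation: returns at once, signals from a worker thread."""
    threading.Thread(target=on_received).start()


if __name__ == "__main__":
    samples = measure_latency(fake_one_way_service, iterations=50)
    print(f"mean latency: {mean(samples) * 1e6:.1f} microseconds")
```

Because the loop waits for each receipt before issuing the next call, the sketch never has more than one call in flight, which mirrors the trade-off noted above: latency is isolated, but queueing behaviour is not exercised.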
In order to test methods with different parameters, the following data structures were used.
Member variable data types:
Data: byte array of size DataSize.
Text: string of size DataSize.
In the tests with the AdvancedContainer and AdvancedTextContainer data structures, an object hierarchy of 1x3x3 was used (one parent with three children, each of which has three children of its own), resulting in a memory footprint of about 10x the footprint of the SimpleContainer and SimpleTextContainer (for the same DataSize).
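To make the 1x3x3 shape concrete, here is a small Python sketch that builds such a hierarchy and counts its nodes. The `Container` class is a hypothetical stand-in for the harness's container types, and the sketch assumes that every node carries its own payload of DataSize bytes; the source does not say how the footprint figure was derived, so treat the numbers as an order-of-magnitude check only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    """Hypothetical stand-in for the harness's container types."""
    data: bytes
    children: List["Container"] = field(default_factory=list)


def build_hierarchy(data_size, fan_out=(3, 3)):
    """Build a tree in which each node at level i has fan_out[i] children."""
    node = Container(data=bytes(data_size))
    if fan_out:
        node.children = [
            build_hierarchy(data_size, fan_out[1:]) for _ in range(fan_out[0])
        ]
    return node


def count_nodes(node):
    """Count the node itself plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.children)


def payload_bytes(node):
    """Sum the raw payload carried by the node and all of its descendants."""
    return len(node.data) + sum(payload_bytes(child) for child in node.children)


if __name__ == "__main__":
    root = build_hierarchy(1000)
    print(count_nodes(root), payload_bytes(root))
```

Under that assumption the hierarchy has 1 + 3 + 9 = 13 nodes, so the raw payload is 13 times that of a single container, which is the same order of magnitude as the roughly 10x footprint reported above.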
Customizing the WCF Latency Test Harness
Additional bindings can be tested by adding some configuration, including binding configuration. Testing the impact of configuration settings can be done by changing the binding configuration file. For other additions or changes (additional data structures, etc.), changes to the code are necessary.
WCF is secure by default. In the WCF Latency Test Harness implementation, in the default configuration (RequestedBindingConfigTags=Default), all security has been disabled for the WCF bindings. The impact of enabled security can be tested by adjusting the configuration files and configuring certificates etc.
Test parameter combinations
Some combinations of the test parameters above just don’t fly, or weren’t implemented for other reasons:
- netMsmq and TwoWay. In a netMsmq scenario, WCF requires operations to be marked with IsOneWay=true on the OperationContractAttribute.
- netMsmq and PerSession.
- .NET Remoting and any serializer other than DefaultSF. .NET Remoting uses the BinaryFormatter, while the WCF bindings use the DataContractSerializer by default. Hence the term “DefaultSF” (Default Serializer Format).
- nullTransport and any test scope other than SameAppDomain.
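The exclusions listed above can be expressed as a simple filter over the parameter space. The Python sketch below is an illustration only, not the harness's own code: it encodes just these four rules, uses the parameter values from the list earlier in this post, and introduces the label "Remoting" and the function names as hypothetical choices. The real harness may skip further combinations.

```python
from itertools import product

# Parameter values taken from the list earlier in the post; "Remoting" is a
# hypothetical label for the .NET Remoting case.
BINDINGS = ["nullTransport", "netNamedPipe", "netTcp", "netMsmq",
            "basicHttp", "wsHttp", "Remoting"]
SCOPES = ["Direct", "SameAppDomain", "SameProcess", "SameMachine", "Remote"]
INSTANCING_MODES = ["Single", "PerSession", "PerCall"]
SERIALIZERS = ["DefaultSF", "PORDCSF", "NDCSF", "XMLSF"]
CALL_MODES = ["OneWay", "TwoWay"]


def is_supported(binding, scope, instancing, serializer, call_mode):
    """Return False for the four combinations ruled out in the list above."""
    if binding == "netMsmq" and call_mode == "TwoWay":
        return False   # netMsmq operations must be one-way
    if binding == "netMsmq" and instancing == "PerSession":
        return False
    if binding == "Remoting" and serializer != "DefaultSF":
        return False   # Remoting always uses the BinaryFormatter
    if binding == "nullTransport" and scope != "SameAppDomain":
        return False
    return True


def supported_combinations():
    """Enumerate every parameter combination that passes the filter."""
    all_combinations = product(
        BINDINGS, SCOPES, INSTANCING_MODES, SERIALIZERS, CALL_MODES)
    return [combo for combo in all_combinations if is_supported(*combo)]


if __name__ == "__main__":
    print(len(supported_combinations()), "combinations pass the filter")
```

Keeping the rules in one predicate makes it easy to add a new binding or scope to the lists without touching the enumeration logic.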
Target system specifications
The measurements were performed on the following system:
- Intel Core 2 Duo E4500 2.20 GHz CPU
- 2.0 GB RAM
- 32-bit Vista Business, .NET 3.5
Results were generated using Changeset 5849 of the source code.
The raw data for the graphs presented below can be found in the file SameAppDomain.Single.zip here. This file also contains the configuration files used to produce these results and the performance counter log output (CPU usage etc.).
N.B. All graphs use double logarithmic (log-log) scales.
Some observations on the graphs:
- basicHttp and wsHttp performed equally. The tests used few of the WS-specific settings, which probably explains much of this.
- nullTransport, netNamedPipe and netMsmq seemed to converge for large DataSizes (10^6+ bytes).
In the last graph, we see that .NET Remoting outperforms netTcp with the default DataContractSerializer when a more complex data structure is used. A detailed discussion and explanation can be found here.
Streaming and WCF TransferMode
In a TwoWay or buffered scenario, the difference between ReceiveData and ReceiveStream was negligible. In a OneWay Streamed scenario, using the transferMode=Streamed setting was beneficial at DataSizes > 10,000.
Performance Counters (CPU Usage)
The performance counters to be logged during the measurements with the WCF Latency Test Harness can be configured in one of the configuration files by adding the logman.exe/perfmon.exe-compatible performance counter names.
The above graphs show that the CPU usage was lower than 100% at all times during the measurements. This indicates that the latency measurements were not distorted by the total system load, since latency may degrade when the total CPU usage nears 100%. Such a scenario (near-100% CPU usage) would be appropriate for throughput measurements, but not for latency measurements.
Other measurements that were not performed but seem interesting:
- Other OS: XP, 64-bit Vista
- Bindings configured to use transport and/or message security
The following people have contributed to the WCF Latency Test Harness:
- Ilan Tavor (KLA-Tencor, making it possible)
- Alon Fliess (technical guidance)
- Sasha Goldshtein (technical input on the serialization aspects; Kudos!)
- Josh Reuben (general technical input)
- Manu Cohen-Yashar