Disappointing Wire/Trigger performance on Windows using XEM6310-LX150

I’ve been benchmarking the XEM6310-LX150 with the PipeTest example to see whether it is a better fit for my application than the older XEM6010 that we’ve been using. On Linux I get very good numbers: 5k to 7k calls per second for Wires and Triggers. I almost stopped there, but I’m glad I didn’t. When I benchmarked on two different Windows 7 machines, the Wire/Trigger performance was just 1k calls per second, which is actually worse than the XEM6010! I’ve attached the full PipeTest benchmark output (which, among other things, shows full USB3 throughput for large Pipe transfers, so I know the device is correctly enumerating as USB SuperSpeed). Both computers are relatively new Dell machines with i7 processors and 16GB RAM, and I made sure that the XEM6310 was the only device plugged into any USB port. Finally, I’m not using a USB hub or anything else that might complicate the test.

With all that in mind, I would hope that there is just something in the low-level system configuration that can be adjusted to improve the performance. I’m hoping this isn’t some fundamental limitation of the USB3 stack in Windows 7, but if it is, it would be nice to know. Has anyone out there actually gotten anything like the USB3 Wire/Trigger performance numbers listed in the FrontPanel User Manual? Are there any low-level Windows settings, such as a polling interval, that can be adjusted to improve performance (preferably without other terrible implications)? The sheer consistency of the results smacks of a 1 ms polling interval embedded somewhere in the stack, but I don’t know much about USB3, so it would be hard for me to go much further than I already have. Any help would be greatly appreciated!

----------PipeTest Output on Windows 7 Machine ------------------

---- Opal Kelly ---- PipeTest Application v2.0 ----
FrontPanel DLL loaded. Built: Mar 14 2015 13:42:08
Found a device: XEM6310-LX150
Device firmware version: 1.16
Device serial number: 14020005YU
Device device ID: 22
FrontPanel support is enabled.
UpdateWireIns (1000 calls) Duration: 0.998 seconds – 1002.00 calls/s
UpdateWireOuts (1000 calls) Duration: 0.999 seconds – 1001.00 calls/s
ActivateTriggerIns (1000 calls) Duration: 1.000 seconds – 1000.00 calls/s
UpdateTriggerOuts (1000 calls) Duration: 0.998 seconds – 1002.00 calls/s
Read BS:0 SS:4194304 TS:67108864 Duration: 0.203 seconds – 315.27 MB/s
Read BS:0 SS:4194304 TS:33554432 Duration: 0.094 seconds – 340.43 MB/s
Read BS:0 SS:4194304 TS:16777216 Duration: 0.046 seconds – 347.83 MB/s
Read BS:0 SS:4194304 TS:8388608 Duration: 0.027 seconds – 296.30 MB/s
Read BS:0 SS:4194304 TS:4194304 Duration: 0.009 seconds – 444.44 MB/s
Read BS:0 SS:1048576 TS:33554432 Duration: 0.109 seconds – 293.58 MB/s
Read BS:0 SS:262144 TS:33554432 Duration: 0.172 seconds – 186.05 MB/s
Read BS:0 SS:65536 TS:16777216 Duration: 0.250 seconds – 64.00 MB/s
Read BS:0 SS:16384 TS:4194304 Duration: 0.265 seconds – 15.09 MB/s
Read BS:0 SS:4096 TS:1048576 Duration: 0.249 seconds – 4.02 MB/s
Read BS:0 SS:1024 TS:1048576 Duration: 1.014 seconds – 0.99 MB/s
Read BS:1024 SS:1024 TS:1048576 Duration: 1.045 seconds – 0.96 MB/s
Read BS:1024 SS:1048576 TS:33554432 Duration: 0.125 seconds – 256.00 MB/s
Read BS:900 SS:1048500 TS:33553800 Block Size Not Supported
Read BS:800 SS:1048000 TS:33554400 Block Size Not Supported
Read BS:700 SS:1047900 TS:33553800 Block Size Not Supported
Read BS:600 SS:1048200 TS:33554400 Block Size Not Supported
Read BS:512 SS:1048576 TS:33554432 Duration: 0.140 seconds – 228.57 MB/s
Read BS:500 SS:1048500 TS:33554000 Block Size Not Supported
Read BS:400 SS:1048400 TS:16777200 Block Size Not Supported
Read BS:300 SS:1048500 TS:16777200 Block Size Not Supported
Read BS:256 SS:1048576 TS:16777216 Duration: 0.078 seconds – 205.13 MB/s
Read BS:200 SS:1048400 TS:8388600 Block Size Not Supported
Read BS:128 SS:1048576 TS:8388608 Duration: 0.047 seconds – 170.21 MB/s
Read BS:100 SS:1048500 TS:8388600 Block Size Not Supported
Write BS:0 SS:4194304 TS:67108864 Duration: 0.203 seconds – 315.27 MB/s
Write BS:0 SS:4194304 TS:33554432 Duration: 0.109 seconds – 293.58 MB/s
Write BS:0 SS:4194304 TS:16777216 Duration: 0.054 seconds – 296.30 MB/s
Write BS:0 SS:4194304 TS:8388608 Duration: 0.023 seconds – 347.83 MB/s
Write BS:0 SS:4194304 TS:4194304 Duration: 0.011 seconds – 363.64 MB/s
Write BS:0 SS:1048576 TS:33554432 Duration: 0.140 seconds – 228.57 MB/s
Write BS:0 SS:262144 TS:33554432 Duration: 0.250 seconds – 128.00 MB/s
Write BS:0 SS:65536 TS:16777216 Duration: 0.517 seconds – 30.95 MB/s
Write BS:0 SS:16384 TS:4194304 Duration: 0.514 seconds – 7.78 MB/s
Write BS:0 SS:4096 TS:1048576 Duration: 0.515 seconds – 1.94 MB/s
Write BS:0 SS:1024 TS:1048576 Duration: 2.044 seconds – 0.49 MB/s
Write BS:1024 SS:1024 TS:1048576 Duration: 2.045 seconds – 0.49 MB/s
Write BS:1024 SS:1048576 TS:33554432 Duration: 0.140 seconds – 228.57 MB/s
Write BS:900 SS:1048500 TS:33553800 Block Size Not Supported
Write BS:800 SS:1048000 TS:33554400 Block Size Not Supported
Write BS:700 SS:1047900 TS:33553800 Block Size Not Supported
Write BS:600 SS:1048200 TS:33554400 Block Size Not Supported
Write BS:512 SS:1048576 TS:33554432 Duration: 0.141 seconds – 226.95 MB/s
Write BS:500 SS:1048500 TS:33554000 Block Size Not Supported
Write BS:400 SS:1048400 TS:16777200 Block Size Not Supported
Write BS:300 SS:1048500 TS:16777200 Block Size Not Supported
Write BS:256 SS:1048576 TS:16777216 Duration: 0.063 seconds – 253.97 MB/s
Write BS:200 SS:1048400 TS:8388600 Block Size Not Supported
Write BS:128 SS:1048576 TS:8388608 Duration: 0.047 seconds – 170.21 MB/s
Write BS:100 SS:1048500 TS:8388600 Block Size Not Supported
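
The small-segment Pipe lines in the output above can be reduced to a time per transfer call, which makes the 1 ms pattern easier to see. Here is a minimal Python sketch of that arithmetic. It assumes SS is the size of each transfer call and TS is the total transferred, so that TS/SS is the number of calls. The helper name `per_transfer_ms` is made up for illustration, and the sample lines are copied from the log with the MB/s column trimmed.

```python
import re

# A few small-segment lines copied from the PipeTest output above.
LOG = """\
Read BS:0 SS:65536 TS:16777216 Duration: 0.250 seconds
Read BS:0 SS:16384 TS:4194304 Duration: 0.265 seconds
Read BS:0 SS:4096 TS:1048576 Duration: 0.249 seconds
Read BS:0 SS:1024 TS:1048576 Duration: 1.014 seconds
Write BS:0 SS:65536 TS:16777216 Duration: 0.517 seconds
Write BS:0 SS:16384 TS:4194304 Duration: 0.514 seconds
Write BS:0 SS:4096 TS:1048576 Duration: 0.515 seconds
Write BS:0 SS:1024 TS:1048576 Duration: 2.044 seconds
"""

LINE = re.compile(
    r"(Read|Write) BS:(\d+) SS:(\d+) TS:(\d+) Duration: ([\d.]+) seconds"
)


def per_transfer_ms(log):
    """Return (op, segment_size, transfers, ms_per_transfer) for each log line.

    Assumption: TS // SS is the number of transfer calls PipeTest issued.
    """
    rows = []
    for m in LINE.finditer(log):
        op = m.group(1)
        ss = int(m.group(3))
        ts = int(m.group(4))
        duration_s = float(m.group(5))
        transfers = ts // ss
        rows.append((op, ss, transfers, 1000.0 * duration_s / transfers))
    return rows


if __name__ == "__main__":
    for op, ss, transfers, ms in per_transfer_ms(LOG):
        print(f"{op:5s} SS={ss:6d} calls={transfers:5d} {ms:.3f} ms/call")
```

Under that assumption, every read in this range works out to roughly 1 ms per call and every write to roughly 2 ms per call, regardless of segment size, which matches the 1k calls/s seen for Wires and Triggers.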

You should note that USB is not intended as a low-latency system for real-time applications. For fast, low-latency calls, PCI Express is a better solution. USB excels at high bandwidth for larger data transfers where the transfer overhead can be minimized.

USB 2.0 defined a polling interval down to 125 µs as part of its shared bus architecture. USB 3.0 is a bit different in that this polling has been eliminated in favor of asynchronous notification, over what effectively amounts to a point-to-point link. While this increases the effective polling rate you can achieve, it (architecturally) improves power consumption and also improves achievable bandwidth.
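
To put the intervals mentioned in this thread side by side, the following Python sketch converts a service interval into a ceiling on blocking calls per second. It assumes each synchronous call completes in a whole number of intervals, which is a simplification for illustration and not a claim about how any particular host controller or driver schedules transfers. The function name `max_calls_per_second` is hypothetical.

```python
def max_calls_per_second(interval_us, intervals_per_call=1):
    """Ceiling on blocking calls/s if each call costs a whole number of intervals.

    interval_us: service interval in microseconds.
    intervals_per_call: how many intervals one call consumes (assumed).
    """
    return 1_000_000 / (interval_us * intervals_per_call)


if __name__ == "__main__":
    print(max_calls_per_second(125))      # 125 us interval
    print(max_calls_per_second(1000))     # 1 ms interval
    print(max_calls_per_second(1000, 2))  # 1 ms interval, two intervals per call
```

A 125 µs interval allows at most 8000 such calls per second, while a 1 ms interval caps them at 1000, which is the figure the Windows 7 Wire/Trigger results sit on.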

We aren’t aware of any host controller knobs that allow this rate to be adjusted.

These are all good points about USB vs. PCIe. However, when you get to the specifics, I still feel that the points you are making about USB3 vs. USB2 throughput are at odds with the performance notes in the FrontPanel User Manual (which show USB3 rates on the order of 5k transactions per second, much higher than USB2 or PCIe, on an old machine running XP SP2). This suggests to me that there is something “different” about my Windows systems that is inhibiting performance. Are you aware of any good utilities on Windows that might give some visibility into the low-level USB3 bus activity? I’m willing to do some legwork on this one and even post the results to this thread to help other users with similar issues, but it’s going to be hard without even basic insight into the bus.
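
Short of a bus-level tool, one way to get some visibility from user space is to time each call individually instead of only the total, to see whether every call costs about 1 ms or whether a few slow calls dominate. Below is a minimal Python sketch of such a harness. It takes any callable, and `fake_wire_update` is a hypothetical stand-in that just sleeps; it would need to be replaced with the real FrontPanel call under test. The names `time_calls` and `summarize` are likewise made up for this example.

```python
import statistics
import time


def time_calls(fn, n=1000):
    """Call fn() n times back to back; return each call's duration in ms."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def summarize(durations_ms):
    """Reduce per-call durations to a rate and a few latency statistics."""
    ordered = sorted(durations_ms)
    total_ms = sum(ordered)
    return {
        "calls_per_s": 1000.0 * len(ordered) / total_ms if total_ms > 0 else float("inf"),
        "min_ms": ordered[0],
        "median_ms": statistics.median(ordered),
        "p99_ms": ordered[int(0.99 * (len(ordered) - 1))],
        "max_ms": ordered[-1],
    }


def fake_wire_update():
    # Hypothetical stand-in for the device call being measured.
    time.sleep(0.001)


if __name__ == "__main__":
    stats = summarize(time_calls(fake_wire_update, n=200))
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

If the minimum and median both sit close to 1 ms, that points to a fixed per-call cost and not to occasional stalls.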

One other thing I noticed is that the XEM6310 I’m using has relatively old device firmware, version 20131217-XS6-1.16. I’m wondering if a newer firmware version has been released and if so how I might obtain that (I checked the website but didn’t see any obvious link to download any firmware files, new or old). My hope is that a newer firmware file may have different performance metrics.

As the notes in the User’s Manual explicitly state, “USB performance can vary significantly …”

Our recorded data also indicates that Mac OS X performs well on these tests.

We’re not aware of any OS-based utilities that would help investigate this.

The firmware will not affect these performance metrics.