USB3 Full-Duplex Operation


Hi, I have several ZEM5305s. I’m not sure if this is an FX3 limitation or a FrontPanel limitation, but my understanding is that there’s no way to take advantage of USB3’s full-duplex nature. Is that right?

I.e. is there no way to have the FPGA simultaneously processing a WriteToPipeIn() and a ReadFromPipeOut()? I know the FX3 has a single 32-bit bidirectional channel, but I was hoping that this could be multiplexed between upstream and downstream traffic at a fine timescale.

In my application, latency is not critical (I’m comfortable with milliseconds between the PC and FPGA), but it is important that the FPGA is not starved of downstream traffic and is not blocked when trying to send upstream traffic. To avoid starvation/stalling while simultaneously achieving decent throughput, it seems I need to implement buffering (with very deep buffers, probably with the DRAM) on the FPGA.

Am I missing something? A nonblocking version of the read/write calls? A way to invoke read and write concurrently?

Edit: Let me rephrase this in terms of my application’s requirements:

I’m doing streaming, bidirectional IO between the PC and a device connected to the ZEM5305’s headers. The PC is constantly sending messages which must be forwarded to the device at precise, periodic times, and the device is constantly, periodically emitting messages which must be forwarded to the PC. The FPGA may buffer the events going in either direction. The system looks like this:

              +---------------------------------------+     +--------+  
              |                ZEM5305                |     |        |  
+----+        |  +-----+       +-------------------+  |     |        |  
|    |        |  |     |       |       FPGA        |  |     |        |  
|    |        |  |     |       | +------+  ||||||| |  |     |        |  
|    |        |  |     |       | |      |->||buf||--------->|        |  
|    |<----------|     |       | |      |  ||||||| |  |     | User   |  
| PC |  USB3  |  | FX3 |<----->| |okHost|          |  |     | Device |  
|    |---------->|     |  32b  | |      |  ||||||| |  |     |        |  
|    |        |  |     |  bus  | |      |<-||buf||<---------|        |  
|    |        |  |     |       | +------+  ||||||| |  |     |        |  
|    |        |  |     |       |                   |  |     |        |  
+----+        |  +-----+       +-------------------+  |     |        |  
              |                                       |     |        |   
              +---------------------------------------+     +--------+

Some latency between the PC and device is tolerable in either direction. However, the device must have uninterrupted input and output streams: if the device’s output is blocked because the USB interface to the PC is blocked, and whatever buffering between the device and the USB interface has been exhausted, that is a problem. Similarly, if there is a gap in the inputs to the device because the buffer ran out of queued-up inputs from the last USB transmission, that is also a problem.
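As a rough illustration of the buffering requirement above: the buffer in each direction has to cover the longest interval during which the USB side is not servicing that direction. The sketch below only does that arithmetic. The stream rate and gap length are made-up placeholders, not measurements from the ZEM5305, and `min_buffer_bytes` is a hypothetical helper, not part of any API.

```python
def min_buffer_bytes(stream_rate_bytes_per_s: int, max_gap_us: int) -> int:
    """Bytes the device produces (or consumes) during one worst-case gap.

    Uses integer ceiling division so the result never rounds down.
    """
    return (stream_rate_bytes_per_s * max_gap_us + 999_999) // 1_000_000


# Assumed numbers for illustration only: a 50 MB/s device stream and a
# 2 ms worst-case interval with no USB service in that direction.
DEVICE_RATE_BYTES_PER_S = 50_000_000
WORST_CASE_GAP_US = 2_000

print(min_buffer_bytes(DEVICE_RATE_BYTES_PER_S, WORST_CASE_GAP_US))  # 100000
```

The same figure applies to both buffers: the upstream one must absorb that many bytes without overflowing, and the downstream one must hold at least that many queued bytes before the gap begins.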

Given my requirements, how do I get the best possible bidirectional bandwidth, while minimizing latency and queue depth requirements?

Given the limitations of the FrontPanel API/hardware as far as I understand them, I only know how to trade off buffer depth and latency for throughput. I don’t know much about the FX3, but it seems like it might be able to communicate full-duplex on the USB3 side and switch between upstream and downstream traffic at a fine time resolution on the bidirectional bus to the FPGA. If this capability existed and could be leveraged through FrontPanel, it would allow near-optimal throughput with only minimal latency and queue depth requirements (effectively, the illusion of full-duplex operation at the application layer, just with bottlenecked throughput because of the FX3’s limited IO bus width).
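To make the trade-off concrete, here is a toy model of strictly alternating one write and one read per cycle over a half-duplex bus. It assumes a fixed per-call overhead and a fixed bus bandwidth; both values, and the function `alternating_schedule`, are hypothetical and only meant to show the shape of the relationship, not how FrontPanel or the FX3 actually behave. Each cycle must move `rate * cycle` bytes per direction, so the cycle time satisfies `cycle = 2 * overhead + (rate_down + rate_up) * cycle / bus_bw`.

```python
def alternating_schedule(rate_down, rate_up, bus_bw, call_overhead_s):
    """Steady-state cycle time and per-cycle block sizes for alternating I/O.

    rate_down, rate_up: sustained stream rates in bytes/s (PC->device, device->PC)
    bus_bw:             assumed half-duplex bus bandwidth in bytes/s
    call_overhead_s:    assumed fixed cost per transfer call, in seconds
    """
    utilization = (rate_down + rate_up) / bus_bw
    if utilization >= 1.0:
        raise ValueError("combined stream rate meets or exceeds bus bandwidth")
    # Solve cycle = 2*overhead + utilization*cycle for cycle.
    cycle_s = 2 * call_overhead_s / (1.0 - utilization)
    return {
        "cycle_s": cycle_s,
        # Each block is also roughly the buffer depth needed in that direction.
        "write_block_bytes": rate_down * cycle_s,
        "read_block_bytes": rate_up * cycle_s,
    }


# Illustrative numbers only: 50 MB/s each way, 200 MB/s bus, 250 us per call.
print(alternating_schedule(50e6, 50e6, 200e6, 250e-6))
```

In this model the cycle time (and with it the latency and buffer depth) grows without bound as the combined rate approaches the bus bandwidth, and shrinks only if the per-call overhead shrinks. That is why fine-grained switching between directions would help.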