By Matthew Myers, Sr. Staff R&D Engineer, Synopsys
When the USB-IF announced that the next generation of USB would more than double the speed of USB 3.0, many news articles focused on how it would be accomplished: by increasing the physical layer data rate from 5 Gbps to 10 Gbps. However, in addition to the PHY changes, this speed increase also requires carefully thought-out protocol changes to fully take advantage of the additional bandwidth.
With a point-to-point connection, in which one PC communicates with only one device, the main challenges of increasing the data rate could be solved at the physical layer: double the data rate and change some encoding, but keep the same packet formats and protocol, as shown in Figure 1. However, USB is not a point-to-point protocol.
Figure 1: Doubling data rates between a single PC and device
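The "more than double" figure can be checked with a little arithmetic. The sketch below is only an illustration: the function name is hypothetical, and the encoding schemes (8b/10b at 5 Gbps, 128b/132b at 10 Gbps) are taken from the USB specifications rather than from this article. It counts line-encoding overhead only and ignores packet framing and link-level traffic.

```python
def payload_rate_gbps(line_rate_gbps, payload_bits, encoded_bits):
    """Bit rate left for data after line-encoding overhead (hypothetical helper)."""
    return line_rate_gbps * payload_bits / encoded_bits


# Assumption: 8b/10b encoding at 5 Gbps, 128b/132b encoding at 10 Gbps.
usb30_rate = payload_rate_gbps(5.0, 8, 10)
usb31_rate = payload_rate_gbps(10.0, 128, 132)

print(f"5 Gbps link after encoding:  {usb30_rate:.2f} Gbps")
print(f"10 Gbps link after encoding: {usb31_rate:.2f} Gbps")
print(f"Ratio: {usb31_rate / usb30_rate:.2f}x")
```

Because the newer encoding spends fewer bits on overhead, the usable rate grows by more than the factor of two seen in the raw line rate.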
One of the reasons USB is so popular is that it is based on a hub topology that allows the expansion of connections. Even if a PC has only one USB port, it is theoretically possible, through multiple levels of hubs, to connect to more than 100 devices. The presence of hubs in the system and the requirement for backward compatibility with slower devices combine to make the protocol much more complicated when trying to increase the data rate.
To demonstrate the complexity that hubs bring to designs, take the example of a PC with a single port connected to a hub which is connected to two devices (see Figure 2).
Figure 2: Using USB hubs to connect to multiple devices
In all generations of the USB specification, the host (the PC) initiates each transaction. When the host wants to read data from a device, it initiates a transaction to an “IN endpoint,” called an “IN transaction.” In USB 3.0, the host reads from a device by sending an Acknowledgement Transaction Packet (ACK TP) to the device, waiting for the device to respond with a Data Packet (DP), and then sending another ACK TP to acknowledge receipt of the data and optionally request more.
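The exchange described above can be sketched as a toy model. Everything here is hypothetical and heavily simplified: the `Packet`, `InDevice`, and `host_read` names are invented for illustration, and real packets carry sequence numbers and other fields that are left out.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str            # "ACK_TP" or "DP"
    endpoint: int
    payload: bytes = b""


class InDevice:
    """Toy device holding queued data for each IN endpoint."""

    def __init__(self, data_by_endpoint):
        self.queues = {ep: list(chunks) for ep, chunks in data_by_endpoint.items()}

    def on_ack_tp(self, tp):
        # Answer an ACK TP with the next Data Packet, if any data is queued.
        queue = self.queues.get(tp.endpoint, [])
        if not queue:
            return None
        return Packet("DP", tp.endpoint, queue.pop(0))


def host_read(device, endpoint, num_packets):
    """Host-initiated IN transaction: ACK TP, DP, ACK TP, DP, ..., ACK TP."""
    received, log = [], []
    log.append("host -> device: ACK TP")          # initial read request
    for _ in range(num_packets):
        dp = device.on_ack_tp(Packet("ACK_TP", endpoint))
        if dp is None:
            break
        log.append("device -> host: DP")
        received.append(dp.payload)
        # This ACK TP acknowledges the DP and, if more is wanted, requests it.
        log.append("host -> device: ACK TP")
    return received, log


device = InDevice({1: [b"first", b"second"]})
data, trace = host_read(device, endpoint=1, num_packets=2)
for line in trace:
    print(line)
```

Note that the device never speaks first in this model: every Data Packet is a response to an ACK TP from the host.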
Now, imagine the host is reading two packets each from two devices: Device 0 operating at 5 Gbps and Device 1 operating twice as fast at 10 Gbps, with no protocol-level changes (see Figure 3).
Figure 3: Wasted bandwidth in host when two devices using different data rates are connected via a hub without protocol-level changes
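On the upstream link to the host, significant bandwidth is wasted while the host waits for the data from the slower device. Two factors from the USB 3.0 specification cause this wasted bandwidth: the host waits for the response to one IN transaction before it proceeds with the next, and the hub passes data straight through from the downstream port to the upstream port with little buffering, so the upstream link is held at the pace of the slower device.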
At this point, it is reasonable to ask why this was not a problem when we moved from USB 2.0 (480 Mbps) to USB 3.0. The answer is that USB 2.0 devices communicate all the way from the host through the hubs to the devices over a set of wires completely separate from those used by USB 3.0 devices. The introduction of USB 3.1’s 10 Gbps rate marks the first time that the USB 3.0 wires are being shared between two different device speeds: 5 Gbps for USB 3.0, and 10 Gbps for USB 3.1.
Without a protocol-level solution, this loss of throughput would be unacceptable to users who purchase a USB 3.1 device and expect it to operate at least twice as fast as a USB 3.0 device, even when connected to a USB 3.1 hub.
The ideal system completely saturates the upstream link to the host with data, fully utilizing the extra bandwidth of the USB 3.1 link. Ignoring the USB 3.0 protocol for a minute, the perfect behavior would be to send an ACK TP to the slower device followed by the faster device, and have the hub buffer up the received data so that it can keep the upstream link busy, as shown in Figure 4.
Figure 4: Ideal system with fully utilized USB 3.1 bandwidth between hub and host
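To make the difference between the wasteful and the ideal case concrete, here is a rough timing model. It is a sketch under stated assumptions, not a description of real link behavior: one packet takes 2 time units on a 5 Gbps link and 1 unit on a 10 Gbps link, ACK TPs and latencies are ignored, and both function names are hypothetical.

```python
def serialized_finish_time(transfers):
    """One IN at a time with no hub buffering.

    Assumption: the upstream link is held for the full downstream
    duration of every packet, so the transfer times simply add up.
    `transfers` is a list of (packet_count, downstream_packet_time).
    """
    return sum(count * packet_time for count, packet_time in transfers)


def pipelined_finish_time(transfers, upstream_packet_time):
    """All devices send at once; the hub buffers whole packets and
    forwards each one upstream as soon as the upstream link is free."""
    arrivals = []
    for count, packet_time in transfers:
        # The i-th packet from a device is fully inside the hub at this time.
        arrivals.extend(packet_time * (i + 1) for i in range(count))
    link_free = 0
    for arrival in sorted(arrivals):
        link_free = max(link_free, arrival) + upstream_packet_time
    return link_free


# Device 0 at 5 Gbps (2 units per packet), Device 1 at 10 Gbps (1 unit per packet).
for packets in (2, 8):
    transfers = [(packets, 2), (packets, 1)]
    serial = serialized_finish_time(transfers)
    pipelined = pipelined_finish_time(transfers, upstream_packet_time=1)
    print(f"{packets} packets per device: serialized={serial}, pipelined={pipelined}")
```

In this model the pipelined case is limited by how fast the slower device can deliver its own data, not by the time the host spends waiting, and the gap widens as more packets are read.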
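To fully utilize the available bandwidth, the USB 3.1 specification addresses the two limiting factors described in the previous section: it allows the host to issue multiple IN transactions without waiting for each response, and it requires hubs to buffer packets so that they can be forwarded on the faster upstream link.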
Over the years, many bus protocols have transitioned from a single transaction architecture to a multiple outstanding transaction architecture. The transition from conventional PCI to PCI Express brought split transactions which decoupled the address phase from the data phase. In addition, the on-chip AMBA AHB bus was updated to support multiple outstanding addresses with out-of-order responses when the AMBA AXI specification was released.
In the same way, USB 3.1 adds a protocol feature named “multiple INs.” The USB 3.1 host is now allowed to issue a read request to an endpoint on a device and then proceed with other transactions without waiting for the response. After issuing multiple requests to multiple endpoints/devices, the data may return to the host in a different order than it was requested due to arbitration taking place on the devices and the upstream ports of hubs. The host is abstracted away from the exact USB topology in this manner because it cannot predict exactly how fast or in what order the transactions will complete. This can be seen in Figure 4, where the host initiated a transaction to Device 0 first, but it received data packets from Device 1 first.
This change to host behavior also has an effect on USB 3.1 devices. Now, a USB 3.1 device may receive a second ACK TP for a different endpoint before it has a chance to respond to the first ACK TP. A simple USB 3.1 device may choose to continue to handle only one request at a time by responding Not Ready (NRDY) to additional ACK TPs. A more complex USB 3.1 device would take advantage of the multiple INs by servicing the requests separately, possibly returning data to each one, reducing the likelihood that NRDY would need to be used. Figure 5 shows the difference in behavior for a USB 3.1 device that has two or more IN endpoints, depending on whether it supports multiple INs or not.
Figure 5: USB 3.1 devices that support multiple INs can handle ACK TPs for different endpoints without the need for NRDY responses
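The two device behaviors can be sketched as follows. This is a hypothetical toy model (the `EndpointDevice` class and its method names are invented here), and it omits details such as how a real device later signals that it has become ready after answering NRDY.

```python
class EndpointDevice:
    """Toy USB 3.1 device with several IN endpoints."""

    def __init__(self, supports_multiple_ins):
        self.supports_multiple_ins = supports_multiple_ins
        self.pending = set()   # endpoints with an IN request still being serviced

    def on_ack_tp(self, endpoint):
        # A simple device refuses a second request while one is in progress.
        if self.pending and not self.supports_multiple_ins:
            return "NRDY"
        self.pending.add(endpoint)
        return "ACCEPTED"

    def send_data(self, endpoint):
        # Returning the Data Packet completes the request for that endpoint.
        self.pending.discard(endpoint)
        return f"DP(ep{endpoint})"


for multiple_ins in (False, True):
    device = EndpointDevice(supports_multiple_ins=multiple_ins)
    first = device.on_ack_tp(1)
    second = device.on_ack_tp(2)   # arrives before endpoint 1 has responded
    print(f"multiple INs={multiple_ins}: ep1 {first}, ep2 {second}")
```

The simple device stays correct at the cost of an extra round trip for the second endpoint, while the more capable device keeps both requests in flight.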
In a system that includes a USB 3.1 host and more than one USB 3.0 device, the host can perform multiple IN transactions to different legacy USB 3.0 devices, while ensuring that each USB 3.0 device has only one outstanding IN transaction. This is possible in cases where the legacy devices are not affected by the multiple IN transactions and reside on different “bus instances.” A bus instance, illustrated in Figure 6, is a link and all of its downstream hubs and devices operating at the same speed. In this example, the host can transmit an ACK TP to each of the three USB 3.0 devices without any dependency between them, but it cannot issue multiple INs to two endpoints on the same device.
Figure 6: Bus instances in a USB 3.1 system (Figure 3-8 from USB 3.1 specification)
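One way to picture the host-side rule is a small bookkeeping sketch. The `HostScheduler` class is hypothetical, the bus instance labels are arbitrary, and the absence of any limit on 10 Gbps bus instances is a simplifying assumption rather than something the specification states.

```python
class HostScheduler:
    """Tracks outstanding IN transactions per bus instance (toy model)."""

    def __init__(self):
        self.outstanding = {}   # bus instance label -> outstanding IN count

    def try_issue_in(self, bus_instance, speed_gbps):
        count = self.outstanding.get(bus_instance, 0)
        # A 5 Gbps (USB 3.0) bus instance gets at most one outstanding IN.
        if speed_gbps == 5 and count > 0:
            return False
        self.outstanding[bus_instance] = count + 1
        return True

    def complete_in(self, bus_instance):
        self.outstanding[bus_instance] -= 1


scheduler = HostScheduler()
print(scheduler.try_issue_in("A", 5))   # first IN on legacy bus instance A
print(scheduler.try_issue_in("B", 5))   # a different legacy bus instance
print(scheduler.try_issue_in("A", 5))   # second IN on A while the first is pending
```

Requests to separate legacy bus instances proceed independently, while a second request into the same legacy bus instance has to wait until the first completes.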
A simple hub design in USB 3.0 was a “cut-through” model, where data flowed directly from a downstream port to an upstream port with little buffering inside the hub. This model was possible because the upstream port and downstream ports on the hub were all operating at the same speed. A USB 3.1 hub, however, must support the scenario of having a faster upstream link than downstream link. Cut-through will not work in this scenario, so the hub needs to buffer at least one packet from the slower port for retransmission on the faster upstream link when that link is available. By performing this “store and forward” service, USB 3.1 hubs take a much more active role in making sure that the flow of data is smooth and the performance the user experiences is not interrupted.
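The store-and-forward idea can be reduced to a queue inside the hub. The sketch below is a minimal illustration with hypothetical names; the buffer depth of four packets and the first-in, first-out forwarding order are assumptions, not values or arbitration rules from the specification.

```python
from collections import deque


class StoreAndForwardHub:
    """Toy hub that buffers whole packets before sending them upstream."""

    def __init__(self, buffer_packets=4):   # assumed depth, for illustration only
        self.buffer = deque()
        self.capacity = buffer_packets

    def on_downstream_packet(self, port, packet):
        """Called once a complete packet has arrived from a downstream port."""
        if len(self.buffer) >= self.capacity:
            return False                     # no room; the packet is not accepted
        self.buffer.append((port, packet))
        return True

    def on_upstream_idle(self):
        """Called when the upstream link can take another packet."""
        return self.buffer.popleft() if self.buffer else None


hub = StoreAndForwardHub()
hub.on_downstream_packet(port=1, packet=b"from fast device")
hub.on_downstream_packet(port=0, packet=b"from slow device")
print(hub.on_upstream_idle())
print(hub.on_upstream_idle())
print(hub.on_upstream_idle())
```

Because each packet is fully received before it is forwarded, the upstream link can carry it at the full upstream rate regardless of how slowly it arrived.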
In USB 3.1, there are significant challenges to make sure that backward compatibility with USB 3.0 devices can be maintained in a hub topology without sacrificing the speed gains available to new USB 3.1 devices. These challenges are not limited to increasing the physical data rate and they permeate every layer of the protocol, including the Link and Protocol layers.