With the Draft 0.7 release, the PCI-SIG process described earlier requires that no new functionality be added, so it’s an excellent time for even the most cautious of early adopters to begin work. Designers can develop even the lowest levels of the PCIe protocol stack and be fairly secure in the solidity of the specification. There is always some risk that a misinterpretation or oversight in the specification will force a slight change to the details of an implementation, but such changes are uncommon and generally have only a minor impact on a design. PCI-SIG members can download the complete 0.7 Draft from the PCI-SIG website (https://members.pcisig.com/wg/PCI-SIG/document/download/9977) for full details.
The evolution from PCIe 8GT/s signaling to 16GT/s is similar to that from PCIe 2.5GT/s to 5GT/s: primarily a new speed, negotiated at link initialization. However, in contrast to earlier data rates, getting to the PCIe 16GT/s data rate requires a two-stage process. First, the link is brought up to 8GT/s using the familiar 4-phase equalization process; then the same 4-phase process is repeated while running at the 8GT/s rate to switch to the 16GT/s rate. This requires some new arcs on the PCIe link state machine, but it re-uses methods now well proven in PCIe 8GT/s. The 128b/130b encoding scheme from PCIe 8GT/s is used at the PCIe 16GT/s data rate, so designers can re-use virtually all of that logic. Naturally, designers need to make some minor changes to the main protocol state machine, the Link Training and Status State Machine (LTSSM), to accommodate the new equalization. A few other minor symbol and test pattern tweaks are specified to ease operation at the higher speed, but overall a PCIe 4.0 16GT/s link will look almost unchanged to those familiar with 8GT/s operation.
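The two-stage bring-up described above can be sketched as an ordered list of equalization steps. This is only an illustration of the sequencing, not LTSSM logic: the function name `equalization_plan` and the rate strings are hypothetical, and the phases are assumed to be numbered 0 through 3.

```python
from typing import List, Tuple

# ASSUMPTION: the four equalization phases are numbered 0..3.
EQ_PHASES = (0, 1, 2, 3)

# Each target rate lists the rates that must be equalized, in order.
# Reaching 16GT/s first equalizes at 8GT/s, then repeats the process for 16GT/s.
STAGES = {
    "8GT/s": ["8GT/s"],
    "16GT/s": ["8GT/s", "16GT/s"],
}


def equalization_plan(target_rate: str) -> List[Tuple[str, int]]:
    """Return ordered (rate_being_equalized, phase) steps to reach target_rate.

    Hypothetical helper for illustration; it models only the ordering of the
    equalization stages, not the state machine that drives them.
    """
    if target_rate not in STAGES:
        raise ValueError(f"no equalization plan for {target_rate!r}")
    return [(rate, phase) for rate in STAGES[target_rate] for phase in EQ_PHASES]


if __name__ == "__main__":
    for rate, phase in equalization_plan("16GT/s"):
        print(f"equalize for {rate}: phase {phase}")
```

Under these assumptions a 16GT/s link walks through eight steps: four phases for 8GT/s followed by the same four phases for 16GT/s.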
One concern raised during development of the PCIe 4.0 specification was that certain devices with specific workloads might not be able to fully utilize the 16GT/s data rate with the existing limits on credits and outstanding transactions. To address this, Draft 0.7 expanded the Tag field in the PCIe 4.0 packet header from 8 bits to 10 bits. Note that one combination of the two new bits is reserved to help detect erroneous hierarchy configurations, leaving 768 of the 1,024 possible tag values available. All devices implementing 16GT/s signaling are required to support receiving 10-bit tags, but may choose whether or not to generate them based on their own needs. Therefore, all designers of PCIe 4.0 16GT/s devices will need to expand their received tag-tracking logic to handle the larger tags, but they can continue to rely on header credits to throttle the total number of simultaneous requests they must accept.
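The figure of 768 tags follows from simple arithmetic: two new bits give four combinations of 256 tag values each, and one combination is set aside. A minimal sketch of that count is shown here. The helper names are hypothetical, and the choice of `0b00` as the reserved combination is an assumption made for illustration; the text above does not say which combination is reserved.

```python
LEGACY_TAG_BITS = 8     # tag width before the expansion
EXTENDED_TAG_BITS = 10  # tag width after the expansion


def usable_tags(tag_bits: int, reserved_combinations: int) -> int:
    """Count tag values left after reserving whole combinations of the new upper bits."""
    new_bits = tag_bits - LEGACY_TAG_BITS
    values_per_combination = 1 << LEGACY_TAG_BITS  # 256 values per upper-bit pattern
    total_combinations = 1 << new_bits             # 4 patterns for 2 new bits
    return (total_combinations - reserved_combinations) * values_per_combination


def is_reserved_tag(tag: int, reserved_upper: int = 0b00) -> bool:
    """True if the tag's new upper bits match the reserved pattern.

    ASSUMPTION: reserved_upper = 0b00 is illustrative only.
    """
    return (tag >> LEGACY_TAG_BITS) == reserved_upper


if __name__ == "__main__":
    print(usable_tags(EXTENDED_TAG_BITS, reserved_combinations=1))
    # Cross-check by enumerating every 10-bit value.
    print(sum(not is_reserved_tag(t) for t in range(1 << EXTENDED_TAG_BITS)))
```

Both calculations arrive at the same count, whichever upper-bit pattern is actually the reserved one.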
To support full utilization of the additional tags, the PCIe 4.0 specification defines a scaling scheme for the flow-control credit mechanism. Devices requiring more credits than previously available can now advertise a scaling factor of 4x or 16x, whereby each numeric credit in the protocol actually represents 4 or 16 credits respectively. Here again, all devices implementing PCIe 4.0 16GT/s are required to support their link partner scaling by 4x or 16x, but are permitted to use 1x scaling for their own credits if desired. Using the new scaling factors, PCIe 3.1’s maximum of 127 header credits can be extended to 508 (using 4x scaling) or 2,032 (using 16x scaling), independently for each Posted (PH), Non-Posted (NPH) or Completion (CPLH) credit type. Likewise, data credits can grow from PCIe 3.1’s 2,047 (~32KB) to 8,188 (~128KB) or 32,752 (~512KB) using 4x or 16x scaling respectively for each Posted (PD), Non-Posted (NPD) or Completion (CPLD) credit type.
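The scaled credit limits quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: `scaled_credits` and `data_credit_kib` are hypothetical helper names, and the byte figures rely on each data credit covering 16 bytes (4 DW), which is what makes 2,047 credits come to roughly 32KB.

```python
MAX_HEADER_CREDITS = 127  # PCIe 3.1 maximum header credits per credit type
MAX_DATA_CREDITS = 2047   # PCIe 3.1 maximum data credits per credit type
DATA_CREDIT_BYTES = 16    # one data credit covers 4 DW (16 bytes)
VALID_SCALES = (1, 4, 16)


def scaled_credits(base_credits: int, scale: int) -> int:
    """Return the effective credit count when each advertised credit stands for `scale` credits."""
    if scale not in VALID_SCALES:
        raise ValueError(f"scale must be one of {VALID_SCALES}, got {scale}")
    return base_credits * scale


def data_credit_kib(data_credits: int) -> float:
    """Convert a data credit count to KiB of payload buffering."""
    return data_credits * DATA_CREDIT_BYTES / 1024


if __name__ == "__main__":
    for scale in VALID_SCALES:
        hdr = scaled_credits(MAX_HEADER_CREDITS, scale)
        data = scaled_credits(MAX_DATA_CREDITS, scale)
        print(f"{scale:>2}x: {hdr} header credits, "
              f"{data} data credits (~{data_credit_kib(data):.0f} KiB)")
```

The limits apply separately to each of the six credit types (PH, NPH, CPLH, PD, NPD, CPLD), so a receiver can pick a different scaling factor wherever its buffering needs differ.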
Probably the most significant item introduced by the 0.7 draft is “Lane Margining at the Receiver.” This feature uses software that runs on the PCIe system board to evaluate how much margin exists in each lane of the PCIe channel, or put another way, how close a given lane is to failing to transfer data reliably. The specification defines a set of registers and a basic command set whereby the host software can instruct each receiver in a PCIe channel to move its sampling point in time (and optionally voltage) to determine roughly how wide (and optionally how high) the signal eye is at the receiver. A critical distinction is that this feature is intended for use as a system diagnostic/evaluation tool to provide an approximate measurement of the PCIe channel and not a measurement of the receiver. Also significant is the fact that supporting Lane Margining is required of all devices supporting PCIe 4.0 16GT/s but use of Lane Margining is not required to run at 16GT/s. Lastly, implementation of this feature in an SoC requires close cooperation between a PCIe 4.0 16GT/s controller and 16GT/s PHY.
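To make the margining idea concrete, here is a minimal sketch of how host software might estimate eye width by stepping a receiver's sampling point left and right until errors appear. Everything in it is hypothetical: the names `margin_one_direction`, `eye_width_steps` and `fake_receiver`, the step counts, and the error limit are assumptions for illustration, and the "receiver" is a simple simulation rather than the register interface defined by the specification.

```python
from typing import Callable


def margin_one_direction(errors_at: Callable[[int], int], direction: int,
                         max_steps: int, error_limit: int) -> int:
    """Step the sampling point away from center in one direction.

    Returns the largest number of steps at which the observed error count
    stayed at or below error_limit.
    """
    passing = 0
    for step in range(1, max_steps + 1):
        if errors_at(direction * step) > error_limit:
            break
        passing = step
    return passing


def eye_width_steps(errors_at: Callable[[int], int],
                    max_steps: int = 20, error_limit: int = 0) -> int:
    """Approximate eye width as passing steps to the left plus passing steps to the right."""
    left = margin_one_direction(errors_at, -1, max_steps, error_limit)
    right = margin_one_direction(errors_at, +1, max_steps, error_limit)
    return left + right


def fake_receiver(offset: int) -> int:
    """Simulated lane: error-free for timing offsets from -6 to +8 steps, errors elsewhere."""
    return 0 if -6 <= offset <= 8 else 5


if __name__ == "__main__":
    print(eye_width_steps(fake_receiver))  # 6 steps left + 8 steps right = 14
```

The same loop could be repeated with voltage offsets where a receiver supports them, giving an approximate eye height. As the text notes, the result is a rough view of channel margin, not a calibrated measurement of the receiver.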