PCI Express 4.0 Draft 0.7 & PIPE 4.4 Specifications - What Do They Mean to Designers?

By: Richard Solomon, Sr. Technical Marketing Manager, Synopsys

The PCI Express® (PCIe®) standard has long been used in applications like personal computers, networking and workstations. Thanks to benefits such as reliability, low power, low latency and scalable bandwidth from 2.5GT/s to 16GT/s, it has also become prevalent in designs for storage, cloud computing, mobile and automotive. The PCI-SIG announced the latest PCIe 4.0 16GT/s (Gen4) specification in November 2011, but it was almost two years later before work began in earnest. The PCIe 4.0 Draft 0.7 specification was recently released to PCI-SIG members, sparking renewed urgency among System-on-Chip (SoC) designers looking to take advantage of the 16GT/s data rate. The complementary Physical Interface for PCI Express (PIPE) 4.4 specification was made available by Intel shortly thereafter. This article describes what designers need to know about these releases and why they should start their PCIe 4.0 designs today.

To understand the importance of the Draft 0.7 release, it is necessary to understand the PCI-SIG specification development process and the history of the PCIe 4.0 releases. There are 5 primary releases/checkpoints in a PCI-SIG specification:

  • Draft 0.3 (Concept): this release may have few details, but outlines the general approach and goals. For PCIe 4.0, this draft was released in February 2014 and included the 16GT/s signaling rate, re-use of the 128/130 encoding scheme developed for the PCIe 3.0 8GT/s mode, and full backwards compatibility, among other goals.
  • Draft 0.5 (First draft): this release has a complete set of architectural requirements and must fully address the goals set out in the 0.3 draft. The PCIe 4.0 Draft 0.5 specification was released in February 2015.
  • Draft 0.7 (Complete draft): this release must have a complete set of functional requirements and methods defined, and no new functionality may be added to the specification after this release. Before the release of this draft, electrical specifications must have been validated via test silicon. For PCIe 4.0, two independent implementations were provided to PCI-SIG workgroup members, one from Synopsys, and the other from Mellanox. The PCIe 4.0 Draft 0.7 was released November 15, 2016.
  • Draft 0.9 (Final draft): this release allows PCI-SIG member companies to perform an internal review for intellectual property, and no functional changes are permitted after this draft.
  • 1.0 (Final release): this is the final and definitive specification. Any later changes are made through Errata documentation, and any enhancements through Engineering Change Notices (ECNs).

Historically, the earliest adopters of a new PCIe specification generally begin designing with the Draft 0.5 as they can confidently build up their application logic around the new bandwidth definition and often even start developing for any new protocol features. At the Draft 0.5 stage, however, there is still a strong likelihood of changes in the actual PCIe protocol layer implementation, so designers responsible for developing these blocks internally may be more hesitant to begin work than those using interface IP from external sources. 

What is new in PCI Express 4.0 and Draft 0.7

With the Draft 0.7 release, the PCI-SIG process described earlier requires that no new functionality be added, so it is an excellent time for even the most cautious of early adopters to begin work. Designers can develop even the lowest levels of the PCIe protocol stack and be fairly secure in the solidity of the specification. There is always some risk that a misinterpretation or oversight in the specification will force a slight change to the details of an implementation, but such changes are uncommon and generally have only a minor impact on a design. PCI-SIG members can download the complete 0.7 Draft from the PCI-SIG website https://members.pcisig.com/wg/PCI-SIG/document/download/9977 for full details.

The evolution from PCIe 8GT/s signaling to 16GT/s is similar to that of PCIe 2.5GT/s to 5GT/s: primarily a new speed, negotiated at link initialization. However, in contrast to earlier data rates, getting to PCIe 16GT/s requires a two-stage process. First, the link is brought up to 8GT/s using the familiar 4-phase equalization process; then the same 4-phase process is repeated while running at 8GT/s to switch to 16GT/s. This requires some new arcs in the PCIe link state machine, but re-uses methods now well proven at PCIe 8GT/s. The 128/130 encoding scheme from PCIe 8GT/s is also used at 16GT/s, so designers can re-use virtually all of that logic. Naturally, designers need to make some minor changes to the main protocol state machine, the Link Training and Status State Machine (LTSSM), to accommodate the new equalization. A few other minor symbol and test pattern tweaks are specified to ease operation at the higher speed, but overall a PCIe 4.0 16GT/s link will look almost unchanged to those familiar with 8GT/s operation.
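The two-stage bring-up described above can be sketched as an ordered list of steps. This is only an illustration of the ordering: the function name and step labels are invented for this example and are not LTSSM state names from the specification.

```python
# Illustrative only: step labels are descriptive, not LTSSM state names.
EQ_PHASES = 4  # the 4-phase equalization process described in the text


def bring_up_sequence(target_gts):
    """Return the ordered steps to reach 8GT/s or 16GT/s.

    Reaching 16GT/s first requires a full equalization to 8GT/s,
    then a second pass of the same 4-phase process.
    """
    if target_gts not in (8, 16):
        raise ValueError("this sketch only covers the 8GT/s and 16GT/s targets")
    steps = []
    for rate in (8, 16):
        if rate > target_gts:
            break
        for phase in range(1, EQ_PHASES + 1):
            steps.append(f"equalization phase {phase} of {EQ_PHASES} for {rate}GT/s")
        steps.append(f"link running at {rate}GT/s")
    return steps


for step in bring_up_sequence(16):
    print(step)
```

The point of the sketch is simply that a 16GT/s link passes through a complete 8GT/s equalization before the second pass begins.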

One concern raised during development of the PCIe 4.0 specification was that certain devices with specific workloads might not be able to fully utilize the 16GT/s data rate with the existing limits on credits and outstanding transactions. To address this, Draft 0.7 expanded the Tag field in the PCIe 4.0 packet header from 8 bits to 10 bits. Note that one combination of the two new bits is reserved to help detect erroneous hierarchy configurations, leaving a total of 768 tags available. All devices implementing 16GT/s signaling are required to support receiving 10-bit tags, but may choose whether or not to generate them based on their own needs. Therefore, all designers of PCIe 4.0 16GT/s devices will need to expand their received tag-tracking logic to handle the larger tags, but they can continue to rely on header credits to throttle the total number of simultaneous requests they must accept.
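The 768 figure follows directly from the numbers above: two new bits give four combinations, one of which is reserved, and each remaining combination spans the full range of the original 8-bit tag. A short check of that arithmetic (the names are illustrative):

```python
LEGACY_TAG_BITS = 8     # Tag field width before PCIe 4.0
EXTENDED_TAG_BITS = 10  # Tag field width with 10-bit tags


def usable_10bit_tags():
    """Count the usable tags when one combination of the new bits is reserved."""
    new_bits = EXTENDED_TAG_BITS - LEGACY_TAG_BITS   # 2 new bits
    new_bit_combinations = 2 ** new_bits             # 4 combinations
    reserved_combinations = 1                        # reserved, per the text
    usable_combinations = new_bit_combinations - reserved_combinations
    return usable_combinations * 2 ** LEGACY_TAG_BITS


print(usable_10bit_tags())  # 3 * 256 = 768
```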

To support full utilization of the additional tags, the PCIe 4.0 specification defines a scaling scheme for the flow-control credit mechanism. Devices requiring more credits than previously available can now advertise a scaling factor of 4x or 16x, whereby each numeric credit in the protocol actually represents 4 or 16 credits respectively. Here again, all devices implementing PCIe 4.0 16GT/s are required to support their link partner scaling by 4x or 16x, but are permitted to use 1x scaling for their own credits if desired. Using the new scaling factors, PCIe 3.1’s maximum of 127 header credits can be extended to 508 (using 4x scaling) or 2,032 (using 16x scaling), independently for each Posted (PH), Non-Posted (NPH) or Completion (CPLH) credit type. Likewise, data credits can grow from PCIe 3.1’s 2,047 (~32KB) to 8,188 (~128KB) or 32,752 (~512KB) using 4x or 16x scaling respectively for each Posted (PD), Non-Posted (NPD) or Completion (CPLD) credit type.
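The credit limits quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes one data credit covers 16 bytes, which is consistent with the article's figure of 2,047 credits being roughly 32KB; the function and constant names are illustrative.

```python
MAX_HEADER_CREDITS = 127     # PCIe 3.1 maximum, per credit type
MAX_DATA_CREDITS = 2047      # PCIe 3.1 maximum, per credit type
BYTES_PER_DATA_CREDIT = 16   # assumption: matches 2,047 credits ~ 32KB
SCALE_FACTORS = (1, 4, 16)


def scaled_credits(max_credits, factor):
    """Return the credit limit when each advertised credit represents `factor` credits."""
    if factor not in SCALE_FACTORS:
        raise ValueError("scaling factor must be 1, 4 or 16")
    return max_credits * factor


for factor in SCALE_FACTORS:
    header = scaled_credits(MAX_HEADER_CREDITS, factor)
    data = scaled_credits(MAX_DATA_CREDITS, factor)
    kib = data * BYTES_PER_DATA_CREDIT / 1024
    print(f"{factor:>2}x: {header:>5} header credits, {data:>6} data credits (~{kib:.0f}KB)")
```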

Probably the most significant item introduced by the 0.7 draft is “Lane Margining at the Receiver.” This feature uses software that runs on the PCIe system board to evaluate how much margin exists in each lane of the PCIe channel, or put another way, how close a given lane is to failing to transfer data reliably. The specification defines a set of registers and a basic command set whereby the host software can instruct each receiver in a PCIe channel to move its sampling point in time (and optionally voltage) to determine roughly how wide (and optionally how high) the signal eye is at the receiver. A critical distinction is that this feature is intended for use as a system diagnostic/evaluation tool to provide an approximate measurement of the PCIe channel and not a measurement of the receiver. Also significant is the fact that supporting Lane Margining is required of all devices supporting PCIe 4.0 16GT/s but use of Lane Margining is not required to run at 16GT/s. Lastly, implementation of this feature in an SoC requires close cooperation between a PCIe 4.0 16GT/s controller and 16GT/s PHY.
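The idea of moving the sampling point to find the eye width can be sketched as a simple sweep. Everything here is hypothetical: the callback, the step units and the toy receiver model are invented for illustration and do not correspond to the registers or commands defined in the specification.

```python
def measure_eye_width(error_count_at, max_steps, error_limit=0):
    """Step the sampling point left and right of center until errors appear.

    error_count_at(offset) is a hypothetical callback that moves the sampling
    point by `offset` timing steps and returns the errors observed there.
    Returns (steps_left, steps_right, total_width_in_steps).
    """
    def last_good_step(direction):
        good = 0
        for step in range(1, max_steps + 1):
            if error_count_at(direction * step) > error_limit:
                break
            good = step
        return good

    left = last_good_step(-1)
    right = last_good_step(+1)
    return left, right, left + right


def toy_receiver(offset):
    """Toy model: error-free from -5 to +7 steps, failing outside that window."""
    return 0 if -5 <= offset <= 7 else 100


print(measure_eye_width(toy_receiver, max_steps=10))  # (5, 7, 12)
```

The same loop applied to voltage offsets instead of timing offsets would give a rough eye height, which the text notes is optional.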

New PIPE 4.4 Specification

Fortunately for those designers procuring their PCIe 4.0 16GT/s PHYs and controllers from different sources, Intel has incorporated PCIe 4.0 16GT/s operation into the Physical Interface for PCI Express (PIPE) specification and made it available to the public with version 4.4. The new PCIe 4.0 16GT/s rate is supported using 32-bit, 16-bit, or 8-bit per-lane datapath options just as the earlier PCIe 2.5GT/s through 8GT/s rates were. This means that designers will be dealing with clock rates topping out at 500MHz using 32-bits per lane, all the way up to a staggering 2GHz using 8-bits per lane!
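The clock rates quoted above follow from dividing the per-lane signaling rate by the datapath width. The sketch below assumes one datapath word is transferred per clock, which matches the article's 500MHz and 2GHz figures; the function name is illustrative.

```python
def pipe_clock_mhz(rate_gts, width_bits):
    """Per-lane interface clock in MHz, assuming one datapath word per clock."""
    return rate_gts * 1000 / width_bits


for width in (32, 16, 8):
    print(f"16GT/s at {width:>2} bits per lane: {pipe_clock_mhz(16, width):.0f}MHz")
```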

The basic PHY-Controller interface signals familiar to users of previous PIPE specifications remain largely unchanged in PIPE 4.4, with obvious and expected changes to indicate PCIe 4.0 16GT/s and details related to the minor physical layer changes mentioned earlier. Extending this signaling to the Lane Margining mechanism, however, would have required a large number of new signals in each direction to exchange the needed control and status information between a PCIe 4.0 16GT/s PHY and controller. Using a mechanism originally proposed by Synopsys engineers, the PIPE specification now uses a generic register-type interface to provide control and communication between PHY and controller. Initially defined only for the PCIe 4.0 16GT/s Lane Margining feature, this interface could greatly simplify numerous PHY features in the future – both existing ones such as L1 Sub-states control, and potential future controls for higher data rates, more complex equalization schemes, etc.

Immediate Availability of DesignWare IP for PCI Express 4.0 Draft 0.7

The PCI-SIG specification development process freezes functionality at Draft 0.7, so now is the ideal time to start designing high-performance SoCs using the PCIe 4.0 16GT/s interface. PCIe 4.0 Draft 0.7 delivers scaled credits (1x, 4x, or 16x) and widened tags (from 8 to 10 bits) to improve link bandwidth utilization, and lane margining at the receiver so that system designers can assess how much margin their channels have. Synopsys’ DesignWare® IP Solution for PCI Express 4.0 supports the latest Draft 0.7 release and is available now. The complete PCIe IP solution consisting of PHYs and controllers is silicon-proven and supports a wide range of foundry process nodes.