# Understanding Bandwidth: Back to Basics

Apr 21, 2016 / 3 min read

One of the questions I’ve been getting a lot recently is along the lines of “How many lanes and of what ‘generation’ of PCI Express do I need?”

This is a fairly straightforward question, and while coming up with a good first-order estimate is also fairly straightforward, the answer isn’t necessarily obvious from the PCI Express specification. Let’s start with the “raw” data rate, which is fairly easy:

| PCI Express | Data Rate |
|-------------|-----------|
| “Gen1”      | 2.5 Gb/s  |
| “Gen2”      | 5 Gb/s    |
| “Gen3”      | 8 Gb/s    |
| “Gen4”      | 16 Gb/s   |

Folks who are new to PCIe may be scratching their heads right about now and thinking “Richard said before that each generation of PCIe has doubled the bandwidth... so what happened between Gen2 and Gen3?!” That leads us to the second piece of the puzzle: the encoding scheme. The original PCI Express specification used “8b10b” encoding, which means every 8 bits of data were expanded to 10 bits when sent on the wire. I won’t go into the details here of why this was done, but it was a common technique for limiting “runs” of 0s and 1s in the data stream. When the 5 Gb/s “Gen2” data rate was developed, it kept the same encoding scheme. However, when “Gen3” was being developed, it was hoped that by limiting the actual signaling rate to something below 10 Gb/s, simpler receivers could be defined (this ultimately didn’t happen, but that’s a story for another Flashback I suppose). To do that and still keep a “doubling”, the encoding scheme for “Gen3” was changed to 128/130, meaning every 128 bits of data get expanded to only 130 bits (instead of to 160, as 8b10b would have).

So 5 Gigabits/second multiplied by 8/10 gives 4 Gigabits/second of effective data transfer, while 8 Gigabits/second multiplied by 128/130 gives 7.88 Gigabits/second, which is close enough to double 🙂.

“Ok Richard, I’ve got it – so I take the data rate, multiply by the encoding factor and I’ve got my real per-lane data rate, right?”

| PCI Express | Data Rate | Encoding Factor |
|-------------|-----------|-----------------|
| “Gen1”      | 2.5 GT/s  | (8/10)          |
| “Gen2”      | 5 GT/s    | (8/10)          |
| “Gen3”      | 8 GT/s    | (128/130)       |
| “Gen4”      | 16 GT/s   | (128/130)       |
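As a rough sketch of that first step, the snippet below multiplies each generation's raw rate by its encoding factor. The table name `GENERATIONS` and the function `encoded_rate_gbps` are hypothetical, made up for illustration; the numbers come straight from the table above.

```python
# Raw per-lane rate (GT/s) and encoding factor for each generation,
# copied from the table above. The names here are illustrative only.
GENERATIONS = {
    "Gen1": (2.5, 8 / 10),
    "Gen2": (5.0, 8 / 10),
    "Gen3": (8.0, 128 / 130),
    "Gen4": (16.0, 128 / 130),
}


def encoded_rate_gbps(generation):
    """Per-lane data rate in Gb/s after removing encoding overhead."""
    raw_rate, encoding_factor = GENERATIONS[generation]
    return raw_rate * encoding_factor


if __name__ == "__main__":
    for gen in GENERATIONS:
        print(f"{gen}: {encoded_rate_gbps(gen):.2f} Gb/s per lane")
```

This only accounts for encoding; it says nothing yet about per-packet overhead.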

## Packet Efficiency

That’s the first step, yes, but I’m afraid there’s one more piece of the puzzle: the packet efficiency. This is just a reflection of the fact that there is overhead to every packet sent on PCI Express. Firstly, every data packet includes a header which is either 3 or 4 DWORDs (a DWORD being a 32-bit, or 4-byte, chunk), so we add 12 or 16 bytes of overhead for that. Every data packet also includes a 1 DWORD LCRC, so add 4 more bytes for that. Then there is a sequence number and some start/stop information; for simplicity we’ll pretend that’s always another 4 bytes total. (While true for “Gen1” and “Gen2”, the 128/130 encoding scheme makes this not exactly accurate for “Gen3” and “Gen4”, but it will do for our purposes at the moment.) Lastly, there is an optional End-to-End CRC called the ECRC which can be included in packets as well, at a cost of another 4 bytes.

Since ECRC isn’t commonly used, let’s just look at 3 DWORD and 4 DWORD header packets and add those 20 or 24 bytes of overhead to our PCI Express packet sizes.  So for 128 byte packets, we actually have to send 128+20=148 or 128+24=152 bytes, which means our packet efficiency is 128/148=0.865 or 128/152=0.842.  Doing that math for the rest of the packet sizes and expressing efficiency as a percentage gives:

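Efficiency (%) for Various Packet Sizes (Bytes)

| Header Size | 128   | 256   | 512   | 1024  | 2048  | 4096  |
|-------------|-------|-------|-------|-------|-------|-------|
| 3-DWORD     | 86.5% | 92.8% | 96.2% | 98.1% | 99.0% | 99.5% |
| 4-DWORD     | 84.2% | 91.4% | 95.5% | 97.7% | 98.8% | 99.4% |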
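If you'd like to reproduce those numbers or try other packet sizes, here is a minimal sketch of the same arithmetic. The function name `packet_efficiency` and its parameters are my own invention for illustration, and it uses the simplified overhead described above: the header, 4 bytes of LCRC, 4 bytes for sequence number and start/stop, and optionally 4 bytes of ECRC.

```python
def packet_efficiency(payload_bytes, header_dwords=3, ecrc=False):
    """Fraction of the bytes on the wire that are payload (simplified model)."""
    if header_dwords not in (3, 4):
        raise ValueError("header must be 3 or 4 DWORDs")
    overhead = header_dwords * 4   # header: 12 or 16 bytes
    overhead += 4                  # LCRC: 1 DWORD
    overhead += 4                  # sequence number + start/stop (simplified)
    if ecrc:
        overhead += 4              # optional End-to-End CRC
    return payload_bytes / (payload_bytes + overhead)


if __name__ == "__main__":
    sizes = [128, 256, 512, 1024, 2048, 4096]
    for header_dwords in (3, 4):
        row = ", ".join(
            f"{packet_efficiency(size, header_dwords) * 100:.1f}%" for size in sizes
        )
        print(f"{header_dwords}-DWORD: {row}")
```

Passing `ecrc=True` shows how much the optional ECRC would cost, which matters most for the smallest packets.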

So *NOW* you’ve got the calculation down!  Take the “raw” data rate, multiply by the encoding factor, then by the packet efficiency to get the effective data rate per lane.  Of course if you’re using a multi-lane implementation, you get to multiply that by the number of lanes.
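Putting all three factors together, a first-order estimate might look like the sketch below. Everything here (the function name `effective_bandwidth_gbps` and its parameter names) is hypothetical, and it relies on the same simplified 20- or 24-byte overhead assumption as before, so treat the result as an estimate rather than a guaranteed throughput figure.

```python
def effective_bandwidth_gbps(raw_rate_gt_s, encoding_factor, payload_bytes,
                             header_dwords=3, lanes=1):
    """First-order effective data rate in Gb/s for a PCIe link.

    raw rate x encoding factor x packet efficiency x number of lanes
    """
    # Simplified per-packet overhead: header + LCRC (4) + sequence/framing (4).
    overhead_bytes = header_dwords * 4 + 4 + 4
    efficiency = payload_bytes / (payload_bytes + overhead_bytes)
    return raw_rate_gt_s * encoding_factor * efficiency * lanes


if __name__ == "__main__":
    # Example: a "Gen3" x4 link moving 256-byte packets with 3-DWORD headers.
    rate = effective_bandwidth_gbps(8.0, 128 / 130, 256, header_dwords=3, lanes=4)
    print(f"{rate:.2f} Gb/s ({rate / 8:.2f} GB/s)")
```

Dividing the result by 8 converts Gigabits/second into Gigabytes/second, which is often the more convenient unit when comparing against an application's needs.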