Manuel Mota, Sr. Staff Product Marketing Manager, Synopsys
A key challenge facing the semiconductor industry is catching product defects early in the production phase. The economic and reputational cost of shipping a defective product is very significant. This is especially true for designers of high-performance computing system-on-chips (SoCs) targeting hyperscale data center, networking, and AI applications, since any product defect can have a catastrophic impact on AI workloads or data processing.
The semiconductor industry has developed an array of test methodologies to improve the speed and coverage of production tests. These methodologies have been standardized to improve efficiency by using common testing metrics and interfaces at different stages of the end-product manufacturing ‒ from wafer testing to chip testing to board-level testing.
This article describes how efficient production testing of system-in-packages (SiPs) using die-to-die PHY IP can ensure the end product is not defective and the production yield is kept as high as possible. It also shows how the internal test features of die-to-die PHY IP can extend test coverage across all dies.
Figure 1: Different packaging technologies with distinct routing features
Individual dies, package “structures” (interposer, TSVs, bumps), and the package assembly can suffer from yield limitations. Even if the yield of each individual element is relatively high, the total SiP yield, which is the cumulative yield of all the different elements, can be prohibitively low, as seen in the following formula:
$\text{Yield}_{\text{SiP}} = \left(\text{Yield}_{\text{Die}}\right)^{N} \times \text{Yield}_{\text{Package}} \times \text{Yield}_{\text{Assembly}}$
where N = number of dies assembled in the same package.
As an example, a SiP with 4 dies, each with a yield of 90%, and a package and assembly yield of 100% has a total SiP yield of only ~65%. For large dies in advanced process nodes, an individual yield of 80% may be good, but the resulting SiP yield may be prohibitively low at approximately 41%. Basically, a defect in one die invalidates the complete SiP, including the remaining three non-defective dies.
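The arithmetic behind these figures can be reproduced with a minimal Python sketch of the yield formula. The function name `sip_yield` and its parameter names are illustrative choices, not part of any tool or standard, and the sketch assumes all dies share the same yield.

```python
def sip_yield(die_yield: float, num_dies: int,
              package_yield: float = 1.0, assembly_yield: float = 1.0) -> float:
    """Cumulative SiP yield: every die, the package, and the assembly must be good.

    Assumes all dies share the same individual yield (as in the article's example).
    """
    return (die_yield ** num_dies) * package_yield * assembly_yield


# The two cases from the text: 4 dies, package and assembly yield of 100%.
for die_yield in (0.90, 0.80):
    total = sip_yield(die_yield, num_dies=4)
    print(f"die yield {die_yield:.0%} -> SiP yield {total:.1%}")
```

Lowering the package or assembly yield below 100% scales the result down further, which is why each element of the SiP needs its own test coverage.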
To improve yield, companies follow two directions. The first is to test each die before assembly so that defective dies are screened out and never reach the package. The second is to bypass or otherwise overcome identified defects by implementing test and repair functionality at the die level and at the assembled-system level. Such test and repair functionality can include redundancy or other schemes and is particularly useful for large, regular structures, such as memories or very wide buses across dies.
Given the complexity of SiP testing, with dies coming from different sources, standardizing test infrastructures and methodologies across the ecosystem is critical to the success of the SiP and chiplet ecosystem. The IEEE and other standards organizations are stepping up with new test architecture standards for 3D packaged dies.
Figure 2: IEEE 1838 test access architecture for testing individual naked dies, assembled dies, and packaged SiP
Figure 4: Die-to-die PHY implementing internal and external loopbacks
A mandatory initial step is to identify defective dies before assembly in the SiP, so that only known good dies (KGDs) are assembled, which significantly improves the overall production yield.
KGD testing is performed on the naked die, prior to packaging. For an IEEE 1838 compliant die, standard serial and parallel test access ports provide access to the complete test infrastructure of the die via a reduced set of test bumps.
The test features within analog blocks, such as the high-speed PHY IP, are connected to the die test infrastructure through an IEEE 1500 compliant wrapper, so the PHY can be tested as well.
Depending on the die’s built-in test capabilities and the individual blocks in the die, the test coverage can be very high, ensuring a KGD is correctly identified. However, even in the best test coverage scenarios, there are items that cannot be adequately covered at the naked die level. For example, faulty bumps or the last stages of sensitive output drivers and first stages of low-noise amplifiers that could not be included in the high-speed PHY’s deep loopback are not covered. Other examples include functions that straddle the two dies such as a control loop.
Coverage is extended to these missing items, as well as to the inter-die connections, in the next steps of the test strategy, which are performed on the assembled SiP.
Assuming both dies are IEEE 1838 compliant, the dies' test infrastructure is seamlessly merged into a single structure that is accessed through the test ports of a single (the “first”) die and extended to the next die via secondary test ports.
It is now possible to launch tests, such as boundary scan EXTEST for digital pins and cross-die loopback tests for high-speed PHYs, extending the test coverage to the periphery of the dies and to the package itself.
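The idea behind an interconnect test of this kind can be sketched in a few lines of Python. This is a toy software model, not an implementation of IEEE 1838 or of boundary scan hardware: the walking-one pattern set, the function names, and the fault model (an open connection always reads 0 at the receiving die) are all assumptions made for illustration.

```python
from typing import Callable, List, Set

Vector = List[int]


def walking_one_patterns(width: int) -> List[Vector]:
    """One vector per pin, with a single 1 that walks across the interface."""
    return [[1 if i == pin else 0 for i in range(width)] for pin in range(width)]


def run_interconnect_test(width: int, link: Callable[[Vector], Vector]) -> Set[int]:
    """Drive each pattern from die A, capture it on die B, return mismatching pins."""
    failing: Set[int] = set()
    for driven in walking_one_patterns(width):
        captured = link(driven)
        failing.update(i for i, (d, c) in enumerate(zip(driven, captured)) if d != c)
    return failing


def make_faulty_link(open_pins: Set[int]) -> Callable[[Vector], Vector]:
    """Toy link model (assumption): an open connection always reads 0 at the receiver."""
    def link(driven: Vector) -> Vector:
        return [0 if i in open_pins else bit for i, bit in enumerate(driven)]
    return link


# Example: an 8-pin interface with opens on pins 2 and 5.
print(sorted(run_interconnect_test(8, make_faulty_link({2, 5}))))  # [2, 5]
```

In silicon, the drive and capture steps would be carried out by the boundary scan cells or the PHY loopback logic rather than by a software function; the sketch only shows how comparing driven and captured values localizes a failing connection.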
Note that, in some special cases, the hierarchical test methodology described above may not be enough to raise yield to the required level.
Consider a wide parallel interface between two dies: for example, high-bandwidth memory (HBM) between a memory and a digital chip, or high-bandwidth interconnect (HBI) / advanced interface bus (AIB) between two digital chips. These interfaces may have thousands of pins that use micro-bumps and very dense traces on an interposer to connect the dies. In this case, the yield of the substrate traces or micro-bumps may be low enough to cause the loss of KGDs that were assembled into a SiP with a defective connection. For these cases, a complementary test and repair strategy, relying on redundant pins on each PHY and the corresponding redundant micro-bumps and traces, enables additional yield recovery after final product assembly.
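A minimal sketch of how redundant lanes might be used for repair is shown below. The remapping scheme (logical lanes are assigned, in order, to the physical lanes that passed test) and the name `build_lane_map` are hypothetical; actual HBM, HBI, or AIB repair mechanisms are defined by their respective specifications and PHY implementations and may be more constrained.

```python
from typing import Dict, Optional, Set


def build_lane_map(num_data_lanes: int, num_spare_lanes: int,
                   failed_lanes: Set[int]) -> Optional[Dict[int, int]]:
    """Map each logical lane to a working physical lane, or None if unrepairable.

    Physical lanes 0..num_data_lanes-1 are the default lanes; the next
    num_spare_lanes physical lanes are the redundant ones (hypothetical layout).
    """
    total_lanes = num_data_lanes + num_spare_lanes
    good = [p for p in range(total_lanes) if p not in failed_lanes]
    if len(good) < num_data_lanes:
        return None  # more failures than spares: the SiP cannot be recovered
    # Logical lanes shift past failed physical lanes, in order.
    return {logical: good[logical] for logical in range(num_data_lanes)}


# One failed lane out of 8, with 2 spares: lanes 3..7 shift up by one.
print(build_lane_map(8, 2, {3}))
# Three failures but only two spares: not repairable.
print(build_lane_map(8, 2, {1, 4, 9}))
```

The failed-lane set would come from the post-assembly interconnect or loopback tests, and both dies would need to apply the same map so that transmitter and receiver agree on the lane assignment.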