# In-Stack Monitoring of Signal and Power Nodes in Three Dimensional Integrated Circuits

Yuuki Araga, Ranto Miura, Nao Ueda, Noriyuki Miura, Makoto Nagata

Graduate School of System Informatics, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan

Email: {araga,r\_miura,ueda,miura,nagata}@cs26.scitec.kobe-u.ac.jp

*Abstract*— An on-chip waveform monitoring technique embodies in-stack evaluation of three-dimensional integrated circuits (3D IC) regarding physical connections using through silicon vias (TSV) and electronic characteristics of signal transmission as well as noise propagation. On-chip generation of reference voltage steps and sampling timings reduces the complexity of analog signal routing in a chip stack and enhances measurement throughputs. The demonstrated 7.6 effective bit resolution with a 5.8 times higher throughput is suitable for in-stack monitoring. Sinusoidal signal transmission in a two-tier 3D IC is on-chip evaluated.

### I. INTRODUCTION

On-chip waveform acquisition techniques have enabled in-place findings on signal and power integrity as well as electromagnetic emission and immunity within VLSI chips. Power supply and ground noise waveforms were observed in digital microprocessors and related to delay variations [1][2]. Substrate noise waveforms were evaluated in the presence of substrate coupling in RF communication chips [3][4]. More recently, the power delivery and signal networks in a chip stack have been validated for three-dimensional routing systems with through silicon vias [5][6][7].

A variety of on-chip monitoring circuits have emerged on the measurement principles using indirect voltage-domain circuit sensitivity [8], high-resolution direct sampling [9], indirect sensing through voltage-frequency translation of voltage controlled oscillators [10], continuous-time analog buffering [11], and others.

While these studies have clearly proven the usability of on-chip waveform monitoring, scenarios for adapting and utilizing monitor circuitry still need to be developed for the technical concerns of specific interest. This paper focuses on an in-stack monitoring technique for power and signal wires within a three-dimensional (3D) integrated circuit. Since a silicon chip is fully covered by the other chips within a chip stack, its internal nodes are not accessible for diagnosis or testing. Only in-stack monitoring provides the opportunity to observe and evaluate power and signal networks deep within a chip stack.

The remainder of this paper is organized as follows. Section II describes the structure of in-stack waveform monitoring. Section III evaluates waveform acquisition performance. Section IV discusses the integration of monitors in a 3D IC chip and demonstrates waveform capturing. A brief summary is given in Section V.



Fig. 1. Embedding OCM system in 3D chip stack.

### II. IN-STACK WAVEFORM MONITORING

### A. Overview

Embedding on-chip waveform monitoring (OCM) functionality in a chip stack is overviewed in Fig. 1. A two-tier chip stack is chosen as an example. Through silicon vias (TSVs) are processed on the thinned top die and vertically connect wires in the tiers. The bottom die remains in the typical thickness and mechanically supports the stack.

Probing front end (PFE) circuits, first proposed in [9], are arrayed in each tier of the stack and capture waveforms at the nodes of interest with in-place digitization. Possible monitoring points include power supply ( $V_{\rm dd}$ ) and ground ( $V_{\rm ss}$ ) wires in the power delivery networks (PDNs), taps on biased wells as well as on the silicon substrate, and signal nodes within circuits. The captured waveforms enable in-place evaluation of 3D chip stacking from viewpoints such as the physical completion of 3D process technologies, the power and signal integrity of 3D routing, and the electromagnetic compatibility (EMC) of 3D PDNs.

The output from a single PFE selectively activated in the array is processed by the digital processing unit (DPU) and forwarded to external processors. The wires for power delivery and for signals common to the PFE circuits are routed in 3D and merged at pads on the top tier, whereas the wires intentionally isolated for each tier are routed horizontally within the tier and then vertically to their respective pads on the top tier. All pads are located on the top tier, which is the only tier accessible in packaging. Another possible packaging technique uses micro bumps formed on the top or on the back surface of the stack.

### B. OCM system

The OCM system diagrams are given in Fig. 2, showing the integration of the in-stack subsystems and external measurement equipment. In the system of Fig. 2(a), the DPU communicates digitally with a field programmable gate array (FPGA) chip acting as a controller, while the PFE is supplied in parallel with the reference voltage ( $V_{\rm ref}$ ) and the sampling timing ( $T_{\rm smp}$ ) from external analog signal sources, namely a digital to analog converter (DAC) and a pulse pattern generator (PPG), respectively. In contrast, the system of Fig. 2(b) generates  $V_{\rm ref}$  and  $T_{\rm smp}$  on chip with the voltage generator (VG) and timing generator (TG), respectively. This reduces the complexity of analog signal routing in a 3D chip stack and eliminates undesirable noise coupling. It also reduces the measurement time, as discussed in Section III.

### EMC'14/Tokyo



Fig. 2. System diagram of OCM including in-stack and off-chip subsystems. (a) Only PFE and DPU are embedded on chip. (b) PFE and DPU are integrated with TG and VG.

A personal computer (PC) serves as a master controller and communicates with PPG through a general-purpose interface bus (GPIB, IEEE488) and also with the FPGA through a universal serial bus (USB). The FPGA works as a digital coprocessor of autonomous waveform acquisition and executes the digitizing algorithm. The PPG provides the global time base for synchronous operation of the OCM system and the circuits under monitoring.

The construction of a PFE circuit is also given in Fig. 2. A source follower (SF) as the input stage of the PFE senses the voltage variation at its input ( $V_{\rm in}$ ) and provides an output voltage ( $V_{\rm sfo}$ ) that follows  $V_{\rm in}$  with a dc offset voltage,  $V_{\rm offset}$ . A latched comparator (LC) as the subsequent stage compares  $V_{\rm ref}$  and  $V_{\rm sfo}$  at the timing of  $T_{\rm smp}$ . The value of  $V_{\rm ref}$  nearest to  $V_{\rm sfo}$  at  $T_{\rm smp}$  is searched for and taken as  $V_{\rm out}(T_{\rm smp})$ , according to the digitizing algorithm with iterative operation of the entire system. The time evolution of  $V_{\rm out}(T_{\rm smp})$  provides a waveform. A set of PFEs is designed for sensing the different voltage domains of  $V_{\rm dd}$ ,  $V_{\rm ss}$ , and signals, where  $V_{\rm offset}$  is tuned for each voltage domain so that  $V_{\rm sfo}$  matches the input voltage range of the LC.

Fig. 3. Updating strategy of  $V_{\rm ref}$  in search algorithm of  $V_{\rm out}$ .



Fig. 4. Chip floor plan and physical layout of OCM system in each tier.

The determination of  $V_{\rm out}$  by the LC is based on the transfer characteristics from the voltage difference between  $V_{\rm ref}$  and  $V_{\rm sfo}$  at its input to the logical probability ( $P_{\rm out}$ ) at its output. The DPU evaluates  $P_{\rm out}$  by counting the output of the LC over repeated comparisons at  $T_{\rm smp}$ . When  $V_{\rm ref}$  is sufficiently greater or smaller than  $V_{\rm sfo}$ ,  $P_{\rm out}$  is fixed at 1.0 or 0.0, respectively. When  $V_{\rm ref}$  approximates  $V_{\rm sfo}$ ,  $P_{\rm out}$  sharply approaches 0.5, reflecting the randomness of binary decisions. The value of  $V_{\rm out}$  is determined and stored as the digital code of the DAC (Fig. 2(a)) or the VG (Fig. 2(b)) at which the first-order slope of  $P_{\rm out}$  is maximized with respect to the adjacent codes. Once  $V_{\rm out}$  is decided,  $T_{\rm smp}$  is updated to the next timing position.
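The decision rule above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian model of comparator noise, the noise level `SIGMA`, the 256-code reference range, the input level `V_SFO`, and all function names are assumptions made for illustration. Only the 0.94 mV step is taken from the measurements reported later in the paper.

```python
import math
import random

def p_out_ideal(v_ref, v_sfo, sigma):
    """Probability that the LC outputs 1, assuming Gaussian input-referred
    noise of standard deviation sigma (a modelling assumption)."""
    return 0.5 * (1.0 + math.erf((v_ref - v_sfo) / (sigma * math.sqrt(2.0))))

def count_p_out(v_ref, v_sfo, sigma, n_iter, rng):
    """Emulate the DPU: repeat the comparison n_iter times and count the ones."""
    ones = sum(1 for _ in range(n_iter)
               if v_ref - v_sfo + rng.gauss(0.0, sigma) > 0.0)
    return ones / n_iter

def decide_code(p_by_code):
    """Return the code whose first-order slope of P_out, taken here as the
    difference between the two adjacent codes, is the largest."""
    best_code, best_slope = None, -1.0
    for k in range(1, len(p_by_code) - 1):
        slope = p_by_code[k + 1] - p_by_code[k - 1]
        if slope > best_slope:
            best_code, best_slope = k, slope
    return best_code

LSB = 0.94e-3    # finest voltage step reported in the paper
SIGMA = 1.0e-3   # hypothetical comparator noise
V_SFO = 0.1234   # hypothetical source follower output at T_smp

p_by_code = [p_out_ideal(code * LSB, V_SFO, SIGMA) for code in range(256)]
print(decide_code(p_by_code))  # prints 131: 131 * 0.94 mV is the step nearest 123.4 mV
```

In hardware the probabilities would come from counting, as in `count_p_out`, so the number of iterations per code trades measurement time against the statistical error of each  $P_{\rm out}$  estimate.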

The search algorithm of  $V_{\rm out}$  uses the updating strategy of  $V_{\rm ref}$  depicted in Fig. 3 [12], based on the fact that the voltage difference between adjacent digitized points becomes small if the sampling interval is small. The initial value of  $V_{\rm ref}$  for searching at the next timing position  $(T_{\rm smp} + \Delta T)$  is set to the value of  $V_{\rm out}$  decided for the current timing position  $(T_{\rm smp})$ . The first calculation of  $P_{\rm out}$  determines whether the next value of  $V_{\rm ref}$  is increased  $(V_{\rm ref} + \Delta V)$  or decreased  $(V_{\rm ref} - \Delta V)$ . In addition, the search step of  $V_{\rm ref}$  is sized from coarse to fine according to the change of  $P_{\rm out}$ . The step sizes  $\Delta T$  and  $\Delta V$  set the timing and voltage resolution, respectively, and are defined by the PPG and DAC in Fig. 2(a) and by the TG and VG in Fig. 2(b).
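One possible reading of this updating strategy is sketched below in Python. It is an illustration under stated assumptions, not the algorithm of [12]: the halving rule for the step, the coarse step of 8 codes, the 256-code range, the Gaussian comparator model, and the names `p_out` and `search_code` are all hypothetical. The search starts from the previously decided code, uses the first  $P_{\rm out}$  to choose the direction, and refines the step from coarse to fine whenever  $P_{\rm out}$  crosses 0.5.

```python
import math

def p_out(code, v_sfo, lsb=0.94e-3, sigma=0.5e-3):
    """Hypothetical comparator model: probability of output 1 at a V_ref code."""
    v_ref = code * lsb
    return 0.5 * (1.0 + math.erf((v_ref - v_sfo) / (sigma * math.sqrt(2.0))))

def search_code(start_code, measure_p, n_codes=256, coarse_step=8):
    """Search the V_ref code nearest to V_sfo, starting from the last V_out.

    Returns (code, number of distinct codes at which P_out was evaluated)."""
    cache = {}

    def p(code):
        if code not in cache:
            cache[code] = measure_p(code)
        return cache[code]

    code = min(max(start_code, 0), n_codes - 1)
    above = p(code) > 0.5              # True: V_ref is above V_sfo
    direction = -1 if above else 1     # step V_ref toward V_sfo
    step = coarse_step
    while step >= 1:
        nxt = min(max(code + direction * step, 0), n_codes - 1)
        if nxt != code and (p(nxt) > 0.5) == above:
            code = nxt                 # still on the same side: keep stepping
        else:
            step //= 2                 # crossed 0.5 or hit the range end: refine
    neighbour = min(max(code + direction, 0), n_codes - 1)
    best = min((code, neighbour), key=lambda c: abs(p(c) - 0.5))
    return best, len(cache)

# Track two consecutive sampling points of a hypothetical waveform.
code, evals = search_code(125, lambda c: p_out(c, 0.1234))
print(code, evals)
code, evals = search_code(code, lambda c: p_out(c, 0.1250))
print(code, evals)
```

Each evaluation of  $P_{\rm out}$  stands for one batch of repeated comparisons, so the count returned by `search_code` can be compared with the number of codes an exhaustive sweep would visit (256 in this sketch).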

### III. WAVEFORM ACQUISITION PERFORMANCE

### A. Design example

An array of PFEs and a DPU are integrated with on-chip VG and TG circuits in a 65 nm CMOS prototype chip, as illustrated in Fig. 4. These circuits use 2.5 V I/O CMOS transistors with a gate length of 0.28  $\mu$ m, for high voltage tolerance as well as rail-to-rail coverage. The physical layout of the PFEs for the voltage domains of interest is regularized so that they can be tiled to form an array. The sizes of the PFEs, DPU, TG, and VG are given in the figure.

### B. Performance measurements

The OCM system acquires sinusoidal waveforms applied externally to the inputs of the PFEs for the evaluation of waveform acquisition performance. The acquired waveforms are resolved into frequency components, all of which are undesirable except the component at the frequency of the input sinusoid. The signal to noise and distortion ratio (SNDR) is then computed from (1) and expressed in dB. This number is often converted to the effective number of bits (ENOB), which represents the resolution of waveform acquisition. The spurious free dynamic range (SFDR) gives another metric. It reflects the linearity of the entire OCM system, depends on the signal frequency, and is often degraded by the primary harmonics due to distortion or by interferers introduced through undesired coupling.

$$SNDR = \frac{P_{\text{signal}}}{P_{\text{noise}} + P_{\text{distortion}}} \tag{1}$$
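As an illustration of how (1) is evaluated, the Python sketch below computes SNDR, ENOB, and SFDR from a captured record with a plain discrete Fourier transform. It is a sketch under assumptions rather than the authors' processing flow: coherent sampling (the input sinusoid falls exactly on one DFT bin) is assumed, the function names are hypothetical, and ENOB uses the common conversion $\mathrm{ENOB} = (\mathrm{SNDR_{dB}} - 1.76)/6.02$, which the paper does not state explicitly. Under that conversion, an ENOB of 7.6 bits corresponds to an SNDR of about 47.5 dB.

```python
import cmath
import math

def spectrum_power(samples):
    """Power of DFT bins 1 .. N/2 - 1 (DC and Nyquist excluded).
    Element 0 of the returned list corresponds to bin 1."""
    n = len(samples)
    mean = sum(samples) / n
    powers = []
    for k in range(1, n // 2):
        acc = sum((x - mean) * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        powers.append(abs(acc) ** 2)
    return powers

def dynamic_metrics(samples, signal_bin):
    """Return (SNDR in dB, ENOB in bits, SFDR in dB) for a coherently
    sampled sinusoid located at DFT bin signal_bin."""
    powers = spectrum_power(samples)
    p_signal = powers[signal_bin - 1]
    others = [p for k, p in enumerate(powers, start=1) if k != signal_bin]
    sndr_db = 10.0 * math.log10(p_signal / sum(others))
    enob = (sndr_db - 1.76) / 6.02
    sfdr_db = 10.0 * math.log10(p_signal / max(others))
    return sndr_db, enob, sfdr_db

# Hypothetical record: unit sinusoid at bin 7 plus a third harmonic at 1 %.
N = 64
record = [math.sin(2 * math.pi * 7 * i / N)
          + 0.01 * math.sin(2 * math.pi * 21 * i / N) for i in range(N)]
sndr_db, enob, sfdr_db = dynamic_metrics(record, signal_bin=7)
print(round(sndr_db, 2), round(enob, 2), round(sfdr_db, 2))  # prints 40.0 6.35 40.0
```

In this synthetic record the only undesired component is the harmonic at 1 % of the signal amplitude, so SNDR and SFDR coincide at 40 dB. In a measured record, noise spread over many bins lowers SNDR while SFDR is set by the single largest spur.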

Figures 5 and 6 provide the dynamic performance of the  $V_{\rm ss}$  and signal PFE channels capturing sinusoids centered at 0.10 V and 1.15 V, respectively. The SNDR increases with the input signal amplitude, while being bounded by distortion at very high amplitudes. The figures also compare the measurements using the external DAC and PPG (Fig. 2(a)) and the on-chip VG and TG circuits (Fig. 2(b)). It was confirmed that waveform capturing using the on-chip generation of  $V_{\rm ref}$  and  $T_{\rm smp}$  provided the highest ENOB of 7.6 bits for the finest resolutions of 0.94 mV and 1.0 ns. While the highest ENOB is almost the same for the PFEs using external sources and those using on-chip generators, the largest input amplitudes, 244 mV for the  $V_{\rm ss}$  PFE and 380 mV for the signal PFE, are obtained with the on-chip generators and are more desirable for in-stack signal and noise capturing.

The throughput (THP) of waveform acquisition indicates the efficacy of a measurement system. The number of transactions between the PC and the measurement equipment has a great impact on THP, since each transaction takes a few hundred milliseconds, involving the update of voltage and timing, the wait for settling, and the receipt of status messages. The digitizing flow and search algorithms have been developed to minimize the number of transactions.

Figure 7 compares the THP of the waveform acquisitions evaluated in Figs. 5 and 6. The use of the on-chip VG/TG circuits efficiently reduces the transactions and enhances the THP by 5.8 times. Note that this gain comes on top of the search algorithm of Fig. 3, which had already achieved a roughly 100 times acceleration over an exhaustive (brute-force) search.

### IV. WAVEFORMS IN 3D CHIP STACK

In-stack waveform capturing is demonstrated with the test structure of Fig. 8. The top and bottom dies have individual PDNs, where p+ taps are specially prepared for biasing the p-type silicon substrate and are connected to the  $V_{\rm ss}$  wires. There is another wiring system ( $V_{\rm sub}$ ) for biasing the p+ guard rings surrounding the OCM system. This also unifies the p-type substrates of the dice; however, its contribution is not significant since its location is distant from the PDNs of interest.



Fig. 5. Dynamic performance of  $V_{\rm ss}$  PFE using (a) external DAC/PPG and (b) on-chip VG/TG. The frequency components for the highest SNDR in respective measurements are also shown in (c) and (d).



Fig. 6. Dynamic performance of Signal PFE using (a) external DAC/PPG and (b) on-chip VG/TG. The frequency components for the highest SNDR in respective measurements are also shown in (c) and (d).

A sinusoid centered at 0.0 V is selectively introduced to the  $V_{\rm ss}$  of either the top PDN or the bottom one, whose pads are both located on the top die. The  $V_{\rm ss}$  waveforms are then captured by the PFE and evaluated for physical connections as well as underlying couplings.

When the sinusoid is introduced to the bottom  $V_{\rm ss}$ , the waveforms captured on the bottom  $V_{\rm ss}$ , shown in Fig. 9(a), clearly exhibit the completion of connections by TSVs. On the other hand, when the sinusoid is given to the top  $V_{\rm ss}$ , the waveforms on the bottom  $V_{\rm ss}$  show a reduced replica of the sinusoid, as in Fig. 9(b). This suggests coupling through the side-wall capacitances of the TSVs, in parallel with the unification of the substrates by the distant  $V_{\rm sub}$  wiring.



Fig. 8. Test structure of in-stack waveform monitoring in 3D chip stack.

The quantitative evaluation of connections and couplings needs further measurements of DC and AC impedances that can be supported by the demonstrated in-stack waveform monitoring.

### V. CONCLUSION

In-stack evaluation of 3D ICs is realized with an on-chip waveform monitoring technique. The probing front end circuits are equipped with on-chip generators of reference voltage steps and sampling timings. The reduced complexity of analog signal routing makes in-stack monitors easier to integrate in 3D ICs. A higher waveform capturing throughput is also achieved, which is essential when a large number of monitoring points must be evaluated within multiple tiers of 3D ICs.

There is a strong need for in-depth experiments on electrical performance in real 3D IC chips. Design principles are in high demand for achieving power and signal integrity as well as electromagnetic compatibility of 3D ICs. The demonstrated waveform capturing of sinusoidal signals in the 3D stack test chip has shown the high potential of in-place evaluation of 3D power and signal networks with TSVs. The analysis of in-stack acquired waveforms will be established in future studies.

#### ACKNOWLEDGMENT

This work was partly supported by Grants-in-Aid for Scientific Research (23360156).



Fig. 9. Sinusoidal waveforms monitored on bottom chip. Sinusoids are input to (a) bottom and (b) top  $V_{\rm ss}$  networks.

#### REFERENCES

- [1] J. Tschanz, N. S. Kim, S. Dighe, J. Howard, G. Ruhl, S. Vangal, S. Narendra, Y. Hoskote, H. Wilson, C. Lam, M. Shuman, C. Tokunaga, D. Somasekhar, S. Tang, D. Finan, T. Karnik, N. Borkar, N. Kurd, and V. De, "Adaptive Frequency and Biasing Techniques for Tolerance to Dynamic Temperature-Voltage Variations and Aging," in *IEEE Int. Solid-State Circuits Conf.*, pp. 292-604, Feb. 2007.
- [2] M. Fukazawa, T. Matsuno, T. Uemura, R. Akiyama, T. Kagemoto, H. Makino, H. Takata, and M. Nagata, "Fine-Grained In-Circuit Continuous-Time Probing Technique of Dynamic Supply Variations in SoCs," in *IEEE Int. Solid-State Circuits Conf.*, pp. 288-289, Feb. 2007.
- [3] M. Badaroglu, S. Donnay, H. J. de Man, Y. A. Zinzius, G. G. E. Gielen, W. Sansen, T. Fonden, and S. Signell, "Modeling and experimental verification of substrate noise generation in a 220-Kgates WLAN system-on-chip with multiple supplies," in *IEEE J. Solid-State Circuits*, pp. 1250-1260, July 2003.
- [4] N. Azuma, T. Makita, S. Ueyama, M. Nagata, S. Takahashi, M. Murakami, K. Hori, S. Tanaka and M. Yamaguchi, "In-system diagnosis of RF ICs for tolerance against on-chip in-band interferers," in *IEEE Int. Test Conference*, pp. S12.03.1-S12.03.9, Sept. 2013.
- [5] P. Jain, J. Dong, W. Xiaofei, and C. H. Kim, "Measurement, analysis and improvement of supply noise in 3D ICs," in *Proc. Symp. VLSI Circuits* 2011, pp. 46-47, June 2011.
- [6] Y. Araga, M. Nagata, G. Van der Plas, Jaemin Kim, N. Minas, P. Marchal, Y. Travaly, M. Libois, A. La Manna, Wenqi Zhang, and E. Beyne, "In-Tier Diagnosis of Power Domains in 3D TSV ICs," in *IEEE Proc. 3DIC 2011*, pp. 7.2.1-7.2.6, January 2012.
- [7] S. Takaya, M. Nagata, A. Sakai, S. Uchiyama, H. Kobayashi and H. Ikeda, "A 100GB/s Wide I/O with 4096b TSVs through an active silicon interposer with in-place waveform capturing," in *IEEE Int. Solid-State Circuits Conf.*, pp. 434-435, Feb. 2013.
- [8] K. M. Fukuda, T. Anbo, T. Tsukada, T. Matsuura, and M. Hotta, "Voltage-Comparator-Based Measurement of Equivalently Sampled Substrate Noise Waveforms in Mixed-Signal Integrated Circuits," in *IEEE J. Solid-State Circuits*, Vol. 31, No. 5, pp. 726-731, May 1996.
- [9] M. Nagata, J. Nagai, T. Morie, and A. Iwata, "Measurements and Analyses of Substrate Noise Waveform in Mixed-Signal IC Environment," in *IEEE Trans. CAD of Integrated Circuits and Systems*, Vol. 19, No. 6, pp. 671-678, June 2000.
- [10] Y. Kanno, Y. Kondoh, T. Irita, K. Hirose, R. Mori, Y. Yasu, S. Komatsu, and H. Mizuno, "In-Situ Measurement of Supply-Noise Maps with Millivolt Accuracy and Nanosecond-Order Time Resolution," in *IEEE J. Solid-State Circuits*, Vol. 42, No. 4, pp. 784-789, April 2007.
- [11] A. Muhtaroglu, G. Taylor, and T. Rahal-Arabi, "On-Die Droop Detector for Analog Sensing of Power Supply Noise," in *IEEE J. Solid-State Circuits*, Vol. 39, No. 4, pp. 651-660, April 2004.
- [12] Y. Araga, T. Hashida, M. Nagata, "An On-Chip Waveform Capturing Technique Pursuing Minimum Cost of Integration," in *IEEE Proc. ISCAS* 2010, pp. 3557-3560, May 2010.