## CMOS 3D-Stacked FSI Multi-Channel Digital SiPM for Time-of-Flight Vision Applications

Francesco Gramuglia\*, Andrada Muntean\*, Carlo Alberto Fenoglio, Esteban Venialgo, Myung-Jae Lee, Scott Lindner, Makoto Motoyoshi, Andrei Ardelean, Claudio Bruschini\*\*, Edoardo Charbon\*\*

Multi-channel digital silicon photomultipliers (MD-SiPMs) [1] were introduced to increase the spatial granularity of time stamping in multi-photon detection with respect to analog SiPMs. Implementing them in CMOS technologies reduces cost and allows new functionality to be integrated close to the point of light detection. To achieve time stamps with high resolution and low dead time, advanced technology nodes are preferable. However, this choice can compromise the performance of the single-photon avalanche diodes (SPADs), the core of the SiPM. 3D integration overcomes this issue by allowing different technologies to be selected for the top- and bottom-tier chips.

The first 3D-integrated SPAD-based sensor was implemented with a backside-illumination (BSI) approach [2]. This chip had very low sensitivity in the 400-500 nm range, making it unsuitable for some applications. Improvements in the technology led to other implementations of 3D stacking for SPADs [3,4]. Although higher sensitivity was achieved at shorter wavelengths, it is still too low for many time-resolved imaging applications, especially in bioimaging. We have therefore developed a 3D-stacked frontside-illuminated (FSI) MD-SiPM [5], following an approach similar to the one first proposed in [6].

The sensor consists of top- and bottom-tier chips, both fabricated in 0.18 µm CMOS technology, that are bonded in a 3D-stacked configuration. The two tiers are connected by means of through-silicon vias (TSVs) and bump bonds (Fig. 1), with one TSV implemented for each SPAD. Fig.
2 shows the back of the top tier, where micropillars are implemented to allow proper electrical connection between the two tiers after 3D stacking (Fig. 2, *right*). The top and bottom tiers are bonded to create the electrical connection and to obtain a flat, mechanically stable structure, helped by the relatively small TSV pitch (Fig. 3). To avoid the need for different TSV structures and to eliminate the risk of breakdown in the oxide surrounding the TSVs, the high voltage is provided to the SPADs through a set of bonding pads located directly on the top tier. In this implementation, the maximum voltage across the TSV oxide never exceeds 5 V. To prevent cracks during the wire-bonding process, a support structure is implemented beneath each bonding pad (Fig. 4).

While the top-tier chip is dedicated to light detection, the bottom tier embeds all the electronics needed for such a system, including the SPAD address tree, photon counters, time-to-digital converters (TDCs), data distribution, a readout scheduler, and a TDC calibration system. The full chip (Fig. 5) measures $7.5 \times 4.2 \text{ mm}^2$ and comprises two arrays of $64 \times 64$ SPADs each. Each array is segmented into 8 × 8 clusters of 64 pixels each (Figs. 5 and 6).

The MD-SiPM is based on an event-driven architecture. Fig. 7 shows a simplified scheme of the cluster architecture. The pixel pitch is 50 µm, the fill factor is 67%, and the peak photon detection probability (PDP) is 55% at 500 nm (Fig. 8). Each SPAD pixel, based on a p-i-n cross section [7,8], uses a cascode transistor to allow a high excess bias voltage, thus improving sensitivity and jitter performance [8,9]. Moreover, active recharge allows the pixel dead time to be tuned down to about 3 ns [8]; it guarantees good temporal compression, thus mitigating the limitation imposed by the dead time of the propagation tree [10,11]. A pixel masking circuit can turn off noisy pixels and thus improve the overall SNR.
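The array segmentation described above can be illustrated with a small addressing sketch. The array size, cluster count, and pixel pitch are taken from the text; the square 8 × 8 cluster shape, the row-major numbering, and the names `locate_spad` and `array_side_mm` are assumptions made only for illustration and are not confirmed by the source.

```python
# Hypothetical addressing sketch for one 64 x 64 SPAD array.
# Values marked "from the text" come from the paper; everything else is assumed.

ARRAY_SIDE = 64          # SPADs per array side (from the text)
CLUSTERS_PER_SIDE = 8    # 8 x 8 clusters per array (from the text)
PIXEL_PITCH_UM = 50.0    # pixel pitch in micrometres (from the text)

# Assumption: each 64-pixel cluster is a contiguous square block of 8 x 8 pixels.
CLUSTER_SIDE = ARRAY_SIDE // CLUSTERS_PER_SIDE


def locate_spad(row, col):
    """Map SPAD coordinates in one array to (cluster_index, local_address).

    Both indices are numbered row-major, which is an assumed convention.
    """
    if not (0 <= row < ARRAY_SIDE and 0 <= col < ARRAY_SIDE):
        raise ValueError("SPAD coordinates out of range")
    cluster_row, local_row = divmod(row, CLUSTER_SIDE)
    cluster_col, local_col = divmod(col, CLUSTER_SIDE)
    cluster_index = cluster_row * CLUSTERS_PER_SIDE + cluster_col
    local_address = local_row * CLUSTER_SIDE + local_col
    return cluster_index, local_address


def array_side_mm():
    """Side length of one SPAD array in millimetres (pitch times pixel count)."""
    return ARRAY_SIDE * PIXEL_PITCH_UM / 1000.0
```

Under these assumptions, each array has 64 clusters with a 6-bit local SPAD address, and the active area of one array is about 3.2 mm on a side, which fits within the stated chip dimensions.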
The pulses generated by the SPADs are propagated through an OR-tree for spatial compression within the cluster, up to the input of the TDC trigger [11]. The 128+1 TDCs, based on a multi-path gated ring oscillator [12,13] (Fig. 9), have an LSB of 15 ps. The power consumption per TDC is about 1.4 mW.

A photon-counting system is added in each cluster, along the propagation tree, to estimate the total number of photons detected. The counting is performed by true single-phase clock (TSPC) counters connected to the fifth level of the OR-tree to minimize the effect of dead time. The counter results are summed by a 6-bit adder, sampled with a frame signal (STOP), and saved in a memory buffer.

Along the propagation OR-tree, a winner-take-all (WTA) tree [2] is inserted to determine the address of the first SPAD that fired, triggering the TDC of its cluster. This strategy achieves spatial granularity at the level of a single SPAD. In addition, it makes it possible to correct the temporal skews caused by imbalance between the signal paths.

A flag bit is asserted every time the TDC is triggered to indicate the presence of valid data. All TDC flags are taken as inputs by a control unit, which acts as a scheduler implementing a priority ceiling protocol in which each cluster has a fixed priority. The use of such a system, especially at low light levels, eliminates the need to read all of the clusters, thus reducing data throughput, readout time, and power consumption.

Each array of the chip can be read out independently through a random-access architecture composed of a row encoder and a column multiplexer (Fig. 6). All the clusters of the same column share the same output bus and access it through a high-impedance buffering stage, enabled by the row encoder. The data are output in parallel to increase readout speed. The system was designed to achieve a readout speed of up to 1 Mframe/s.
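The flag-driven readout described above can be sketched in software as a simple fixed-priority arbiter. This is a simplification for illustration, not the priority ceiling protocol implemented on chip: the function names, the default ordering (lower cluster index served first), and the ideal linear conversion from TDC code to time are all assumptions. Only the 15 ps LSB comes from the text.

```python
# Hypothetical sketch of flag-driven readout; not the on-chip implementation.

TDC_LSB_PS = 15.0  # TDC LSB in picoseconds (from the text)


def schedule_readout(flags, priority=None):
    """Return the clusters to read, in priority order, skipping unflagged ones.

    flags    -- sequence of booleans, one per cluster (True = TDC was triggered)
    priority -- cluster indices from highest to lowest priority; defaults to
                ascending index order, which is an assumed convention
    """
    if priority is None:
        priority = range(len(flags))
    # Only clusters holding valid data are scheduled, so a sparse frame
    # produces a short readout list.
    return [cluster for cluster in priority if flags[cluster]]


def tdc_code_to_ps(code):
    """Convert a TDC code to picoseconds, assuming an ideal linear TDC."""
    if code < 0:
        raise ValueError("TDC code must be non-negative")
    return code * TDC_LSB_PS
```

For example, with 64 clusters of which only clusters 2, 5, and 40 are flagged, `schedule_readout` returns three entries instead of 64, which is the source of the throughput and power savings at low light levels.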
Each output word provides the address of the cluster read, the address of the SPAD that triggered the TDC for that frame, the TDC output, and the result of the counting system.

This work was supported by the Swiss National Science Foundation under Grants No. 169465 and 166289. F. Gramuglia, A. Muntean, C. A. Fenoglio, A. Ardelean, C. Bruschini, and E. Charbon are with EPFL, Switzerland (e-mail: francesco.gramuglia@epfl.ch, andrada.muntean@epfl.ch, carlo.fenoglio@epfl.ch, a.ardelean@epfl.ch).

## References

- [1] S. Mandai and E. Charbon, "Multi-channel digital SiPMs: Concept, analysis and implementation," IEEE Nuclear Science Symposium and Medical Imaging Conference Record (NSS/MIC), pp. 1840-1844, 2012. doi: 10.1109/NSSMIC.2012.6551429.
- [2] J. M. Pavia, et al., "A $1 \times 400$ Backside-Illuminated SPAD Sensor With 49.7 ps Resolution, 30 pJ/Sample TDCs Fabricated in 3D CMOS Technology for Near-Infrared Optical Tomography," IEEE Journal of Solid-State Circuits, vol. 50, no. 10, pp. 2406-2418, 2015. doi: 10.1109/JSSC.2015.2467170.
- [3] M.-J. Lee, et al., "High-Performance Back-Illuminated Three-Dimensional Stacked Single-Photon Avalanche Diode Implemented in 45-nm CMOS Technology," IEEE Journal of Selected Topics in Quantum Electronics, vol. 24, no. 6, pp. 1-9, Art no. 3801809, 2018. doi: 10.1109/JSTQE.2018.2827669.
- [4] T. Al Abbas, N.A.W. Dutton, O. Almer, S. Pellegrini, Y. Henrion, R.K.
Henderson, "Backside illuminated SPAD image sensor with 7.83 μm pitch in 3D-Stacked CMOS technology," IEEE International Electron Devices Meeting (IEDM), pp. 811-814, 2016. doi: 10.1109/IEDM.2016.7838372.
- [5] F. Gramuglia, et al., "CMOS 3D-Stacked FSI Multi-Channel Digital SiPM for Time-of-Flight PET Applications," IEEE Nuclear Science Symposium and Medical Imaging Conference Record (NSS/MIC), 2020.
- [6] F. Nolet, et al., "A 2D Proof of Principle Towards a 3D Digital SiPM in HV CMOS With Low Output Capacitance," IEEE Transactions on Nuclear Science, vol. 63, no. 4, pp. 2293-2299, 2016. doi: 10.1109/TNS.2016.2582686.
- [7] C. Veerappan and E. Charbon, "A Low Dark Count p-i-n Diode Based SPAD in CMOS Technology," IEEE Transactions on Electron Devices, vol. 63, no. 1, pp. 65-71, 2015. doi: 10.1109/TED.2015.2475355.
- [8] F. Gramuglia, M.-L. Wu, C. Bruschini, M.-J. Lee, and E. Charbon, "A Low-noise CMOS SPAD Pixel with 12.1 ps SPTR and 3 ns Dead Time," IEEE Journal of Selected Topics in Quantum Electronics. doi: 10.1109/JSTQE.2021.3088216.
- [9] S. Lindner, et al., "A High-PDE, Backside-Illuminated SPAD in 65/40-nm 3D IC CMOS Pixel With Cascoded Passive Quenching and Active Recharge," IEEE Electron Device Letters, vol. 38, no. 11, pp. 1547-1550, 2017. doi: 10.1109/LED.2017.2755989.
- [10] S. Gnecchi, et al., "Digital Silicon Photomultipliers With OR/XOR Pulse Combining Techniques," IEEE Transactions on Electron Devices, vol. 63, no. 3, pp. 1105-1110, 2016. doi: 10.1109/TED.2016.2518301.
- [11] L. H. C. Braga, et al., "A CMOS mini-SiPM detector with in-pixel data compression for PET applications," IEEE Nuclear Science Symposium Conference Record, Valencia, Spain, pp. 548-552, 2011. doi: 10.1109/NSSMIC.2011.6154110.
- [12] M. Z. Straayer and M. H. Perrott, "A Multi-Path Gated Ring Oscillator TDC With First-Order Noise Shaping," IEEE Journal of Solid-State Circuits, vol. 44, no. 4, pp. 1089-1098, 2009. doi: 10.1109/JSSC.2009.2014709.
- [13] A.
Muntean, et al., "Towards a fully digital state-of-the-art analog SiPM," IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), Atlanta, GA, USA, pp. 1-4, 2017. doi: 10.1109/NSSMIC.2017.8533036.

**Fig. 1.** *Top:* the top tier houses square SPADs with rounded corners and is 3D-stacked to the bottom-tier chip (*left*). The bonding with the bottom-tier chip is ensured through TSVs and micro-bump connections (*right*). *Bottom:* optical microscope image of the final implementation (*left*); SEM image of the cross section (*right*).

**Fig. 2.** *Top:* SEM images of the micropillars implemented on the back of the top tier. *Bottom:* detail of the micropillar structure.

**Fig. 3.** SEM image of the micropillars used for 3D stacking and for implementing the electrical connection between the top-tier TSV and the bottom-tier pixel front end.

**Fig. 4.** SEM image showing a detail of the bonding pads on the top tier, supported by micropillars beneath them to improve reliability and mechanical stability during wire bonding.

**Fig. 5.** MD-SiPM micrograph. The chip is partitioned into two independent segments of 4096 pixels each.

**Fig. 6.** Top-level architecture.

**Fig. 7.** Cluster architecture.

**Fig. 8.** *Left:* I-V characteristics measured on isolated SPAD samples. The measurements were performed both with and without illumination. The actual breakdown voltage is about 22 V, whereas the measurements without illumination tend to overestimate it. *Right:* PDP measurements for several excess bias voltages (from 1 V to 6 V) and wavelengths (from 320 nm to 960 nm).

**Fig. 9.** *Left:* simplified scheme of the multi-path ring oscillator. *Right:* TDC architecture scheme. The ring oscillator output is sampled by 4 sets of phase registers, triggered by a signal that is delayed by a controllable block $t_{d,i}$. The four values are then input to a processing unit that calculates the final TDC code.