# High-speed Time-Delay-Integration (TDI) Imaging with 2-D SPAD arrays

Daniel Van Blerkom, Steve Huang, Barmak Mansoorian SWIRLabs Corporation, Pasadena, California, USA dvb@swirlabs.com

*Abstract*— We describe the implementation of very high speed TDI imaging using 2-D SPAD arrays. We propose a fast frame read approach to optimize SNR and scalability and investigate its performance using simulation and measurements from a prototype 2-D SPAD sensor.

## I. INTRODUCTION

TDI imaging is a technique used when a scene is moving relative to the sensor. Multiple images of the scene are shifted to remove the relative movement and then summed to improve the SNR by effectively increasing the integration time. While the charge-packet accumulation and shifting of CCD technology fits naturally with TDI imaging, TDI sensors have recently been described using CMOS pixels, where the accumulation occurs digitally. However, both CCD and CMOS TDI sensors have speed and noise limitations.

## II. MOTIVATION

SPAD-based sensors can break the speed bottleneck for TDI while operating close to the shot noise limit. The combination of precise time resolution and single-photon detection makes SPADs natural candidates for TDI imaging. A digital "charge packet" can be accumulated over multiple lines with no excess noise, unlike a traditional CMOS image sensor, where read noise corrupts each readout. The advantage of applying this Quanta Image Sensor (QIS) approach to TDI is anticipated in [1]; here we examine the implementation of such a sensor with SPAD technology.

Table I compares the performance of recently published TDI sensors with the proposed SPAD TDI sensor. The SPAD TDI can reach line rates that are not possible with a CCD, with an extremely low noise floor. The SPAD TDI readout temporal noise is principally due to the SPAD dark count rate, which amounts to a fraction of an electron per line at these line rates. The current-generation SPAD PDE is lower than the equivalent peak QE of CCD implementations, and the SPAD pitch is larger than CCD TDI pixel sizes, limiting the number of columns that can be implemented in the sensor. However, recent research shows >80% PDP at visible wavelengths with SPAD pitches of 2.5 µm [2]. The power of the SPAD implementation is high due to the large bandwidth requirement of the readout, but it benefits from the continued reduction in digital supply voltages, whereas CCD transfer gates do not share this benefit. The SPAD TDI full-well is determined by the register width allocated to the digital accumulators; this can also scale to larger values with more advanced digital processes. In the sensor described here, the full-well can be traded against the line rate, from 255 e- at the maximum 12.5 MHz line rate to 4095 e- at a 780 kHz line rate.
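The trade-off between line rate, full-well, and dark counts can be checked with simple arithmetic. The Python sketch below is illustrative only. The function names are hypothetical, the 80 nsec binary frame time is the value simulated in Section III, and the full-well is assumed to be the largest value of a register just wide enough to hold one count per stage per binary sample.

```python
# Illustrative arithmetic only; names are hypothetical, values are from the text and Table I.
FRAME_TIME_S = 80e-9          # binary frame readout time (Section III)
N_STAGES = 256                # TDI stages
DCR_CPS = 200.0               # SPAD dark count rate at room temperature
PIXEL_PITCH_CM = 10e-4        # 10 um pixel pitch expressed in cm
ELECTRON_CHARGE_C = 1.602e-19


def operating_point(oversampling):
    """Return (line rate in Hz, full-well in counts, dark counts per output line)."""
    line_time_s = FRAME_TIME_S * oversampling
    line_rate_hz = 1.0 / line_time_s
    # Assumption: the register is sized for N_STAGES * oversampling binary samples,
    # so its largest representable value is one less than that product.
    full_well = N_STAGES * oversampling - 1
    # Each output line integrates dark counts over all stages for one line time each.
    dark_counts_per_line = DCR_CPS * N_STAGES * line_time_s
    return line_rate_hz, full_well, dark_counts_per_line


def dark_current_density_pa_per_cm2():
    """Convert the dark count rate to an equivalent current density in pA/cm^2."""
    return DCR_CPS * ELECTRON_CHARGE_C / PIXEL_PITCH_CM ** 2 * 1e12


if __name__ == "__main__":
    for m in (1, 16):
        rate, fw, dark = operating_point(m)
        print(f"M={m:2d}: {rate / 1e6:.3f} MHz, full-well {fw} e-, dark {dark:.4f} e-/line")
    print(f"dark current density: {dark_current_density_pa_per_cm2():.1f} pA/cm^2")
```

With these assumptions the dark contribution stays well below one count per output line at both operating points, consistent with the noise floor entry in Table I.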

| Parameter             | SPAD TDI                                 | [3]                    | [4]                    | [5]                                       |
|-----------------------|------------------------------------------|------------------------|------------------------|-------------------------------------------|
| TDI Stages            | 256                                      | 256                    | 256                    | 256                                       |
| Columns               | 512                                      | 4096                   | 4096                   | 9072                                      |
| Pixel Pitch           | 10 µm                                    | 5.4 µm                 | 5 µm                   | 5 µm                                      |
| Peak QE/PDE           | 55%                                      | 96%                    | 49%                    | 82.4%                                     |
| Full-well             | 255 e- / 4,095 e-                        | 23 ke-                 | 30 ke-                 | 15.8 ke-                                  |
| Noise floor           | << 1 e-                                  | 15 e-                  | 12 e-                  | 11.4 e-                                   |
| Dark current at RT    | 200 cps = 32 pA/cm²                      | 2.5 nA/cm²             | 3.7 nA/cm²             | 4 ke-/sec = 2.5 nA/cm²                    |
| Max Line Rate         | 12.5 MHz / 780 kHz                       | 1000 kHz               | 270 kHz                | 600 kHz                                   |
| Power                 | 8 W                                      | N/A                    | 2 W                    | 5.5 W                                     |





Fig 1. Layout of bottom circuit layer implementing an 8-bit in-pixel accumulator.

The most straightforward digital implementation of a SPAD-based TDI imager would be to mimic the operation of the CCD charge packet shifting in the digital domain. This approach is described in [6], where a ripple counter is dedicated to each SPAD to manage the count of detected photons. Fig. 1 shows a 3x3 layout for this approach in the 65nm/40nm STMicroelectronics stacked SPAD process under a 10 µm SPAD, with 8 counter bits. As SPAD device pitch decreases with improvements in the technology, the size of the ripple counter must also shrink, limiting the effective full-well and SNR. There is also a timing overhead to transferring the data and clocking the array, during which photons are not detected, and the simultaneous clocking during the transfer of the ripple counter values leads to a large peak current.

## III. SPAD TDI SENSOR

To improve the SNR, we propose to move the TDI accumulators outside of the SPAD array and implement a single-bit memory in each SPAD pixel, which can be read and cleared very quickly with a fast frame readout. The sensor architecture is shown in Fig. 2. For a 512x256 10 µm pitch stacked SPAD array, we simulate operation with a frame time of 80 nsec. This frame time supports a TDI line rate of 12.5 MHz. If a lower TDI line rate is needed, then multiple samples can be accumulated for each line, oversampling the array and increasing the SNR. Since the SPAD readout is essentially noiseless, readout noise does not accumulate over multiple reads as it would with a CMOS TDI sensor.
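As a rough behavioral illustration of this scheme, the following Python sketch models a single pixel column. It is not the sensor's implementation: the function `tdi_scan` and its parameters are hypothetical, each scene value is assumed to be the probability that a pixel registers a photon in one binary frame, and the scene is assumed to advance by exactly one stage per line time.

```python
import random


def tdi_scan(scene, n_stages, oversampling, seed=0):
    """Behavioral model of one TDI column built from 1-bit pixel reads.

    scene: per-line probability of a detection in one binary frame (assumed input).
    Returns one accumulated count per scene line.
    """
    rng = random.Random(seed)
    acc = [0] * n_stages              # accumulators sit outside the pixel array
    out = []
    for t in range(len(scene) + n_stages - 1):      # one iteration per line time
        for s in range(n_stages):
            j = t - s                 # scene line currently imaged by stage s
            if 0 <= j < len(scene):
                # 'oversampling' binary frames are read and added per line time
                acc[s] += sum(rng.random() < scene[j] for _ in range(oversampling))
        if t >= n_stages - 1:
            out.append(acc[-1])       # a finished line leaves the last stage
        acc = [0] + acc[:-1]          # shift accumulators to follow the scene
    return out


if __name__ == "__main__":
    print(tdi_scan([0.0, 0.1, 0.5, 0.1, 0.0], n_stages=256, oversampling=4))
```

The accumulator shift plays the role of the CCD charge transfer, while the inner sum models the oversampled 1-bit reads added to each accumulator.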



Fig 2. SPAD TDI sensor block diagram.

Each pixel consists of two 1-bit storage nodes for photon detection, as shown in Fig. 3. To avoid interrupting the imaging to reset the detection state, one bit can be read out and reset while the other captures the current frame's SPAD pulse. The entire array is read out in a row-wise fashion, and then the other bit is selected for readout. This ping-pong approach allows for global-shutter and integrate-while-read (IWR) operation. Similar ping-pong approaches have been proposed in, for example, [7] and [8], although not for TDI imaging.
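The ping-pong behavior can be summarized with a small behavioral model. This Python sketch is an assumption-laden simplification, not the pixel circuit of Fig. 3: the class `PingPongPixel` and its method names are hypothetical, and the readout and reset of the idle bit are modeled as instantaneous at the frame boundary rather than spread over the row-wise readout.

```python
class PingPongPixel:
    """Behavioral model of a pixel with two 1-bit storage nodes (hypothetical names)."""

    def __init__(self):
        self.bits = [0, 0]
        self.capture = 0              # index of the bit capturing the current frame

    def photon(self):
        # A detection sets the capturing bit; extra photons in the same frame are lost.
        self.bits[self.capture] = 1

    def end_frame(self):
        # Swap roles so capture continues without a gap, then read out and
        # reset the bit that has just finished capturing.
        done = self.capture
        self.capture ^= 1
        value = self.bits[done]
        self.bits[done] = 0
        return value


if __name__ == "__main__":
    pixel = PingPongPixel()
    pixel.photon()
    pixel.photon()                    # second photon in the same frame is not counted
    print(pixel.end_frame())          # frame with detections
    print(pixel.end_frame())          # empty frame
```

Because one bit is always available for capture, the model never has a blind interval between frames, which is the property that enables integrate-while-read operation.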



Fig 3. SPAD pixel schematic for fast TDI imaging.

The timing diagram for the ping-pong frame collection is shown in Fig. 4, and the row readout sequence is shown in Fig. 5. Following the row selection for readout, the pixel memory is reset to prepare it to capture the next frame. To support the fast row-select time for a large array, the row driver logic is generated and periodically buffered in the pixel, utilizing the remaining free layout area in each pixel.



Fig 4 & 5. Pixel timing and row readout timing.

Reading the entire array of memories in one TDI line time is a potential speed and power bottleneck. To reduce the bandwidth requirement, multiple rows are read out in each row time. For a 10 µm SPAD pixel pitch, 8 bit-lines per pixel column allow 8 rows to be read out simultaneously, with a row time of 2.5 nsec, so the 256 rows are read in 32 row times, or 80 nsec. Once the data is read out of the SPAD pixel array, the accumulators are not limited in size by the pixel pitch. For 256 TDI stages, a 12-bit final TDI register can accommodate 16x oversampling of the 1-bit pixel values per TDI stage. As the pixel data is read out of the array, it is added to the appropriate accumulators.
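The readout timing follows directly from these numbers. The short Python sketch below only reproduces that arithmetic; the function names are hypothetical and the per-column bandwidth figure is derived here rather than quoted from the text.

```python
# Readout timing arithmetic; function names are hypothetical.
ROWS = 256
BITLINES_PER_COLUMN = 8       # rows read simultaneously per pixel column
ROW_TIME_NS = 2.5


def frame_time_ns(rows=ROWS, rows_per_read=BITLINES_PER_COLUMN, row_time_ns=ROW_TIME_NS):
    """Time to read every 1-bit pixel memory in a column once."""
    row_reads = -(-rows // rows_per_read)         # ceiling division
    return row_reads * row_time_ns


def max_line_rate_mhz(**kwargs):
    """Highest TDI line rate: one full binary frame per line."""
    return 1e3 / frame_time_ns(**kwargs)


def column_bandwidth_gbps(rows_per_read=BITLINES_PER_COLUMN, row_time_ns=ROW_TIME_NS):
    """Raw bit rate leaving one pixel column (bits per nsec equals Gb/s)."""
    return rows_per_read / row_time_ns


if __name__ == "__main__":
    print(frame_time_ns(), "nsec per frame")
    print(max_line_rate_mhz(), "MHz maximum line rate")
    print(column_bandwidth_gbps(), "Gb/s per column")
    print(frame_time_ns(rows_per_read=1), "nsec per frame with a single bit-line")
```

Comparing the last line with the first shows why parallel bit-lines are needed: with a single bit-line per column, the same row time would give a frame time eight times longer.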

The block diagram and implementation of the accumulator digital logic are shown in Fig. 6. The TDI accumulators are arranged into blocks of 8 adjacent accumulator registers. The readout proceeds from the bottom to the top of the array by passing an enable bit from block to block. The accumulator values are shifted down in the array to make room for the next line, at an interval determined by the desired oversampling. To make the best use of the routing resources, the accumulator blocks are implemented on a two-column pitch, and four columns of accumulators share clock and reset buffering. Clock gating enables only the accumulator logic being addressed, reducing the power requirement significantly. The bit widths of the accumulators are progressively scaled to match the increasing full-well of the signal as it progresses from the first TDI stage to the last. This means the first accumulator blocks are significantly smaller than the last accumulator blocks.
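One way to picture the progressive width scaling is to size each stage for the largest count it could hold. The Python sketch below is a guess at such a rule, not the scaling used in the sensor: the function `stage_widths` is hypothetical, each stage is assumed to need enough bits for one count per binary sample accumulated so far, and the width is capped at the 12-bit final register mentioned in the text.

```python
def stage_widths(n_stages=256, oversampling=16, final_bits=12):
    """Assumed per-stage accumulator widths, in bits, from first to last stage."""
    widths = []
    for stage in range(1, n_stages + 1):
        max_count = stage * oversampling          # most counts possible after this stage
        widths.append(min(max_count.bit_length(), final_bits))
    return widths


if __name__ == "__main__":
    widths = stage_widths()
    scaled_bits = sum(widths)
    uniform_bits = len(widths) * max(widths)
    print("first stage:", widths[0], "bits; last stage:", widths[-1], "bits")
    print("register bits per column:", scaled_bits, "scaled vs", uniform_bits, "uniform")
```

Under this assumed rule the widths grow logarithmically with stage index, so the area saving is concentrated in the earliest accumulator blocks.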



Fig 6. SPAD digital accumulator block implementation.

## IV. POWER BREAKDOWN

The power consumption of the SPAD TDI sensor can be broken into four main circuit components: the row-selection drivers, the pixel readout and sense-amplifiers, the digital accumulators, and the output serializer and driver. In addition, there is power dissipated by the firing of the SPADs, which is scene dependent. The largest circuit power consumption is due to the digital accumulators, which require 10 mW per column when running at the highest speed. The overall circuit power of the 512x256 sensor is approximately 6 W. If the illumination causes every SPAD to fire for every frame time, the power due to the SPADs themselves is 2 W, giving a total worst-case power of 8 W. At the highest line rate of 12.5 MHz, the sensor outputs an aggregate of 51.2 Gbps of data.
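The figures in this section can be cross-checked with a short calculation. The Python sketch below uses only the numbers quoted above, except for `OUTPUT_BITS_PER_PIXEL`, which is an assumption chosen to match the 255 e- full-well at the maximum line rate. All variable names are illustrative.

```python
# Power and data-rate cross-check; variable names are illustrative.
COLUMNS = 512
ACCUMULATOR_MW_PER_COLUMN = 10.0      # at the highest speed
CIRCUIT_POWER_W = 6.0                 # approximate total circuit power
SPAD_POWER_WORST_CASE_W = 2.0         # every SPAD firing in every frame
MAX_LINE_RATE_HZ = 12.5e6
OUTPUT_BITS_PER_PIXEL = 8             # assumption: 8-bit output at the maximum line rate

accumulator_power_w = COLUMNS * ACCUMULATOR_MW_PER_COLUMN / 1e3
remaining_circuit_power_w = CIRCUIT_POWER_W - accumulator_power_w
worst_case_power_w = CIRCUIT_POWER_W + SPAD_POWER_WORST_CASE_W
output_rate_gbps = COLUMNS * OUTPUT_BITS_PER_PIXEL * MAX_LINE_RATE_HZ / 1e9

if __name__ == "__main__":
    print(f"accumulators: {accumulator_power_w:.2f} W")
    print(f"row drivers, readout, and serializer (approx.): {remaining_circuit_power_w:.2f} W")
    print(f"worst-case total: {worst_case_power_w:.1f} W")
    print(f"output data rate: {output_rate_gbps:.1f} Gbps")
```

Since the 6 W circuit total is quoted as approximate, the remainder attributed to the other three circuit blocks should be read as a rough figure only.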

## V. APPLICATIONS

Any photons that arrive during the SPAD dead time are missed. This leads to a soft saturation of the final accumulated count as the illumination increases. For the proposed TDI pixel, a similar situation occurs if more than one photon arrives during a frame readout time. Following [5], the final TDI count can then be approximated:

$$S = \frac{\eta \phi N_{TDI} T_{line}}{1 + \eta \phi \frac{T_{line}}{M}} \tag{1}$$

where $T_{line}$ is the TDI line time, $N_{TDI}$ is the number of TDI stages, $\phi$ is the photon arrival rate at the SPAD, $\eta$ is the SPAD PDP, and $M$ is the oversampling value. This soft saturation effectively extends the dynamic range of the sensor [1,8].
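Equation (1) is straightforward to evaluate numerically. The Python sketch below is a direct transcription of the formula; the function name `tdi_count` and the example flux values are illustrative, while the PDP, stage count, and oversampling are taken from the sensor described here.

```python
def tdi_count(phi, eta, n_tdi, t_line, m):
    """Approximate final TDI count from Eq. (1).

    phi: photon arrival rate at the SPAD (photons/s), eta: PDP,
    n_tdi: number of TDI stages, t_line: line time (s), m: oversampling.
    """
    signal = eta * phi * n_tdi * t_line           # count with no saturation
    return signal / (1.0 + eta * phi * t_line / m)


if __name__ == "__main__":
    eta, n_tdi, m = 0.55, 256, 16
    t_line = 16 * 80e-9                           # 16 binary frames of 80 nsec per line
    for phi in (1e4, 1e6, 1e8, 1e10):             # illustrative flux values
        print(f"phi = {phi:.0e} photons/s -> S = {tdi_count(phi, eta, n_tdi, t_line, m):.1f}")
```

At low flux the result tracks the linear term $\eta \phi N_{TDI} T_{line}$, and at high flux it approaches $N_{TDI} M$, one count per binary sample, which is the soft saturation described above.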

One potential candidate for high-speed TDI imaging is flow cytometry [11], where there is great interest in imaging the cell morphology and precise location of fluorescence markers on the cell. To show the potential performance of the SPAD TDI approach, we took multiple shifted images of a cell micrograph with the prototype 2-D SPAD sensor and performed the TDI operation off-chip. We illuminated with a flicker-free monitor and captured frames with 300 nsec integration time. Fig. 7 shows the ground truth image, one example captured frame, and the final TDI output. Also shown is an image captured statically (i.e. no TDI shifting) at 30 µsec integration. The cell shape and non-uniformity within the cell are clear in the final TDI output, showing the promise of SPAD TDI sensors.
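The off-chip TDI operation amounts to re-aligning each captured frame by the known scene shift and summing. The Python sketch below illustrates the idea on nested lists. It is not the processing code used for Fig. 7: the function `offchip_tdi` is hypothetical, and the scene is assumed to move by a whole number of rows between consecutive frames.

```python
def offchip_tdi(frames, shift_per_frame=1):
    """Shift-and-sum a sequence of frames, referenced to the first frame.

    frames: list of 2-D frames (lists of rows); frame k is assumed to show the
    scene displaced by k * shift_per_frame rows relative to frame 0.
    """
    n_rows = len(frames[0])
    n_cols = len(frames[0][0])
    out = [[0] * n_cols for _ in range(n_rows)]
    for k, frame in enumerate(frames):
        offset = k * shift_per_frame
        for r in range(n_rows - offset):          # rows shifted out of view are skipped
            src = frame[r + offset]               # undo the scene motion
            dst = out[r]
            for c in range(n_cols):
                dst[c] += src[c]
    return out


if __name__ == "__main__":
    # A single bright pixel moving down one row per frame.
    frames = [
        [[1], [0], [0]],
        [[0], [1], [0]],
        [[0], [0], [1]],
    ]
    print(offchip_tdi(frames))
```

Rows near the trailing edge receive fewer contributions in this simple version, so a practical pipeline would either crop them or normalize by the number of frames summed.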



Fig 7. Ground-truth image (a), one frame of TDI sequence (b), final TDI output (c), and a statically captured image for comparison (d).

## REFERENCES

- [1] E. Fossum, et al., "The Quanta Image Sensor: Every Photon Counts", Sensors, 2016, 16.
- [2] S. Shimada, et al., "A SPAD depth sensor robust against ambient light: the importance of pixel scaling and demonstration of a 2.5um pixel with 21.8% PDE at 940nm", IEDM 2023.
- [3] P. Boulenc, et al., "Multi-spectral high-speed backside illuminated TDI CCD-in-CMOS imager", IISW 2019.
- [4] H. J. Lee, et al., "Charge-coupled CMOS TDI imager", IISW 2017.
- [5] Gpixel GLT5009BSI BSI TDI line scan image sensor datasheet, available at gpixel.com/products/line-scan/got/glt5009bsi
- [6] X. Kong, et al., "Time-Delay-Integration Imaging Implemented With Single-Photon-Avalanche-Diode Linear Array", IEEE Sensors Journal, March 2021.
- [7] B. Park, et al., "A 400x200 600fps 117.7dB-DR SPAD X-ray detector with seamless global shutter and time-encoded extrapolation counter", ISSCC 2023.
- [8] A. Ingle, et al., "High-flux passive imaging with single-photon sensors", in Proc. IEEE/CVF CVPR, June 2019.
- [9] K. Morimoto, et al., "Megapixel time-gated SPAD image sensor for 2D and 3D imaging applications", Optica, April 2020.
- [10] G. Lepage, et al., "Time-Delay-Integration Architectures in CMOS Image Sensors", IEEE Trans. Elec. Devices, Nov 2009.
- [11] H. Mikami, et al., "Virtual-freezing fluorescence imaging flow cytometry", Nature Communications, 11, March 2020.