# A 3.5um Indirect Time-of-Flight Pixel with In-Pixel CDS and 4-Frame Voltage Domain Storage

Erez Tadmor<sup>1</sup>, Ben Dror<sup>1</sup>, Guy Likver<sup>1</sup>, Gal Fadida<sup>1</sup>, Zvika Veig<sup>1</sup>, Seiji Takeuchi<sup>2</sup>, Toshiki Rai<sup>2</sup>, Atsushi Noda<sup>2</sup>

onsemi, <sup>1</sup>Netiv Haor 1, Haifa, Israel, erez.tadmor@onsemi.com; <sup>2</sup>Gifu, Japan

This paper presents a new 3.5um indirect Time-of-Flight (iToF) pixel with an in-pixel CDS circuit, the ability to store 4 frames in voltage-domain storage capacitors, 38% QE at 940nm, >25ke- linear full well, and up to 200MHz modulation frequency with >95% modulation contrast. Storing all the raw data required for depth calculation inside the pixel enables depth sensing with low motion artifacts and simplifies on-chip depth calculation.

# Motivation

iToF cameras usually require 4-8 exposures with different phase shifts between the illumination and the pixel modulation in order to reconstruct a depth scene [1,2]. The data from these exposures normally has to be read out and stored outside the sensor. This makes iToF cameras prone to motion artifacts, because the total exposure time is coupled to the readout time, as shown in Fig.1. In addition, several external frame buffers are required at the system level to store the phase data until all of it is ready for calculation. These problems worsen with increasing resolution, as readout takes longer and the required frame buffers become larger. In this work we present a pixel that addresses these problems using in-pixel voltage-domain storage of up to 4 frames.

## Pixel Architecture - In-Pixel CDS & Storage

The pixel consists of a pinned photodiode with two symmetrical modulation gates and a global shutter gate. The modulation gates control the electrostatic potential in the photodiode to quickly sweep electrons into one of the two charge-domain storage gates, where they are stored for the duration of the exposure period. The global shutter enables operation in pulsed or hybrid modes, where the modulation is not continuous but is done in short bursts. When coupled with a high peak-power illuminator, this improves system performance under ambient light [3].

Post-exposure, charge can be read out from the storage gates in a correlated double sampling (CDS) operation. The readout can be performed in the standard way, row-wise and out of the sensor, or using the in-pixel CDS and into voltage-domain storage capacitors. The mechanism that allows the latter option is presented in Fig.2(A): each group of 4 pixels shares a sample-and-hold circuit that is connected to 16 MIM capacitors with high capacitive density. This way, the entire 1.2MP array can be sampled into the storage capacitors in under 400 microseconds.

This operation quickly moves the signal from the charge domain, where it is vulnerable to parasitic light, into the voltage domain. It reduces the effective exposure time by 65%-90% (depending on the ratio of exposure time to readout time), which means that the motion artifacts described previously are practically eliminated. Finally, all the data required for depth calculation is stored at the pixel level, so external frame buffers become redundant. This simplifies the depth reconstruction pipeline, as all the data from a specific pixel is transmitted out in the order needed for efficient calculation, and permits depth calculation to be done in the sensor using very simple and efficient pipelined logic.
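To illustrate why per-pixel depth calculation can be simple once all four phase samples are available together, the following is a minimal sketch of the conventional four-phase iToF formula. It is not the sensor's on-chip logic. The function names, the ideal sinusoidal correlation model, and the sign convention of the phase are assumptions made for illustration.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def depth_from_phases(a0, a90, a180, a270, f_mod_hz):
    """Estimate distance from four phase samples (conventional 4-phase formula)."""
    i = a0 - a180    # in-phase differential component
    q = a90 - a270   # quadrature differential component
    phase = math.atan2(q, i) % (2.0 * math.pi)  # wrapped to [0, 2*pi)
    return C_LIGHT * phase / (4.0 * math.pi * f_mod_hz)


def simulate_samples(distance_m, f_mod_hz, amplitude=1000.0, offset=2000.0):
    """Hypothetical ideal samples for a target at distance_m (assumed sinusoidal model)."""
    phase = 4.0 * math.pi * f_mod_hz * distance_m / C_LIGHT
    return [offset + amplitude * math.cos(phase - math.radians(k))
            for k in (0, 90, 180, 270)]


# Usage: a target at 0.5 m observed with 100 MHz modulation.
samples = simulate_samples(0.5, 100e6)
print(depth_from_phases(*samples, 100e6))
```

Only two subtractions, one arctangent, and one scaling are needed per pixel, which is consistent with the claim that a pipelined implementation can stay small when the four samples of a pixel arrive together.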



Figure 1(A) - Illustration of the effective exposure time (first-to-last photon time) in a hypothetical 1280x960 iToF sensor with 4 MIPI lanes outputting 12 bits per pixel at 2Gbps and an exposure time of 300us per phase, assuming only differential data is read from the sensor. (B) - Using in-pixel CDS and storage, the effective exposure time is reduced by 65%, which results in significantly lower motion artifacts.
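The scenario in Figure 1 can be approximated with a few lines of arithmetic. The sketch below is a rough estimate under stated assumptions, not the authors' timing model: it assumes the 2Gbps figure is per lane, that four exposures are taken, that one differential frame is read out after each exposure, that in-pixel storage takes 400us, and that there is no other overhead.

```python
def effective_exposure_us(n_exposures, exposure_us, gap_us):
    """First-to-last-photon time: exposures separated by a readout or storage gap."""
    return n_exposures * exposure_us + (n_exposures - 1) * gap_us


# Figures taken from the caption
width, height, bits_per_pixel = 1280, 960, 12
lanes = 4
exposure_us = 300.0

# Assumptions (not stated in the caption)
lane_rate_bps = 2e9   # assumed: 2 Gbps per lane
n_exposures = 4       # assumed: four exposures per depth frame
store_us = 400.0      # assumed: upper bound quoted for in-pixel sampling

readout_us = width * height * bits_per_pixel / (lanes * lane_rate_bps) * 1e6
conventional = effective_exposure_us(n_exposures, exposure_us, readout_us)
in_pixel = effective_exposure_us(n_exposures, exposure_us, store_us)
reduction = 1.0 - in_pixel / conventional

print(f"readout per frame: {readout_us:.0f} us")
print(f"conventional: {conventional:.0f} us, in-pixel: {in_pixel:.0f} us")
print(f"reduction: {reduction:.0%}")
```

Under these assumptions the reduction comes out at roughly 64%, close to the 65% quoted in the caption. Longer readout times relative to the exposure time push the figure higher.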

The basic operation of the in-pixel CDS mechanism is described in Figure 2(B). After one exposure cycle is finished, the signal is stored in the pixel storage gates (SG). The SG bias is kept at an intermediate level to achieve an optimal tradeoff between linear full well and dark current generation. First, the floating diffusion (FD) is reset and its value is sampled on the left side of the sample-and-hold capacitor, while the right side of the capacitor is being reset. In the next step, the right side of the sample-and-hold capacitor is disconnected from the supply by CDS\_RST, and charge from SG1 is transferred to the FD and sampled on the capacitor, which now holds the reset-subtracted signal. The subtracted signal is then sampled through the second source follower (SF) to storage capacitor C1. The operation is repeated for SG2, and the signal is stored on C3. This whole sequence is repeated four times, as every four pixels share a CDS circuit.
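The benefit of subtracting the reset level can be shown with a toy numerical model. This is a behavioral sketch only, with no circuit detail: the reset level, conversion gain, and reset-noise magnitude below are hypothetical values chosen for illustration.

```python
import random
from statistics import pstdev

FD_RESET_UV = 2_000_000.0    # hypothetical nominal FD reset level, in microvolts
CONV_GAIN_UV_PER_E = 100.0   # hypothetical conversion gain, in microvolts per electron


def sample_pixel(signal_electrons, reset_noise_uv=500.0, rng=random):
    """Return (single_sample, cds_sample) for one SG readout, both in microvolts."""
    # Step 1: the FD is reset; the actual level includes a random reset error.
    v_reset = FD_RESET_UV + rng.gauss(0.0, reset_noise_uv)
    # Step 2: charge is transferred from the SG to the FD, lowering its voltage.
    v_signal = v_reset - signal_electrons * CONV_GAIN_UV_PER_E
    single = FD_RESET_UV - v_signal   # referenced to the nominal level: keeps the reset error
    cds = v_reset - v_signal          # referenced to the sampled reset level: error cancels
    return single, cds


# Usage: repeat the readout and compare the spread of both estimates.
random.seed(0)
results = [sample_pixel(1000) for _ in range(200)]
print("std without CDS:", pstdev(r[0] for r in results))
print("std with CDS:   ", pstdev(r[1] for r in results))
```

In this model the CDS output depends only on the transferred charge, because the same reset level appears in both samples.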



Figure 2(A) - A schematic drawing of 4 pixels sharing a CDS stage and 4 analog memory banks with 4 capacitors each. Two additional capacitors are shared between the 4 pixels and are used for averaging (i.e. writing twice to the same memory capacitor) and binning operations. (B) - Waveform description of the basic in-pixel CDS and storage operation.

After the operation described above is completed (in under 400 µsec), storage capacitors C1 across the entire array contain 0° data sampled from SG1, and C3 capacitors contain 180° data sampled from SG2. The next step is to perform another exposure to sample 90°-270° and populate C2 and C4 with the corresponding data. Theoretically, at that point all the data required for calculating phase/depth exists. However, since each phase was collected from a specific storage gate, the data might not be perfectly symmetric: the two SGs may collect different amounts of parasitic light, generate different amounts of dark current, etc., which could lead to depth inaccuracy. In order to create symmetric data, we introduce the averaging mechanism: after storing phases 0° (from SG1), 90° (SG1), 180° (SG2), and 270° (SG2) in C1, C2, C3, and C4 respectively, we reverse the roles and modulate with the complementary phases: 180°-0° and 270°-90°.

Now, each storage gate stores the complementary phase to the one it stored previously, but we encounter a problem: how can this signal be stored in a storage capacitor that already holds previous data without erasing it? This is done by temporarily storing the new data on COUT1 and then connecting COUT1 to the relevant storage capacitor. The two capacitors are identical, so the new and previous signals are averaged through charge sharing. Eventually, each storage capacitor holds the signal of one phase (0°, 90°, 180°, or 270°), sampled and averaged equally from both storage gates, and therefore fully differential. Depth biases due to variations between storage gates are thereby eliminated from the data. The complete operation is summarized in Table 1.
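The averaging sequence can be replayed numerically. The sketch below is an idealized model, assuming lossless charge sharing between two equal capacitors and a purely additive error per storage gate. The function names and the numeric values are hypothetical.

```python
def charge_share(v_a, v_b, c_a=1.0, c_b=1.0):
    """Voltage after connecting two capacitors (total charge is conserved)."""
    return (c_a * v_a + c_b * v_b) / (c_a + c_b)


def run_sequence(true_signal, sg_offset):
    """Replay the four exposures of Table 1.

    true_signal: dict mapping phase in degrees to the ideal signal.
    sg_offset:   dict mapping 'SG1'/'SG2' to an additive per-gate error.
    """
    def sample(phase, sg):
        return true_signal[phase] + sg_offset[sg]

    caps = {}
    # Exposure 1: 0 deg collected in SG1, 180 deg in SG2.
    caps["C1"], caps["C3"] = sample(0, "SG1"), sample(180, "SG2")
    # Exposure 2: 90 deg collected in SG1, 270 deg in SG2.
    caps["C2"], caps["C4"] = sample(90, "SG1"), sample(270, "SG2")
    # Exposure 3: roles reversed; the new sample is held on COUT1, then shared.
    caps["C1"] = charge_share(caps["C1"], sample(0, "SG2"))
    caps["C3"] = charge_share(caps["C3"], sample(180, "SG1"))
    # Exposure 4: same for the 90/270 deg pair.
    caps["C2"] = charge_share(caps["C2"], sample(90, "SG2"))
    caps["C4"] = charge_share(caps["C4"], sample(270, "SG1"))
    return caps


# Usage: a 50-unit mismatch between the gates.
signal = {0: 3000.0, 90: 2500.0, 180: 1000.0, 270: 1500.0}
offsets = {"SG1": 40.0, "SG2": -10.0}
caps = run_sequence(signal, offsets)
print(caps["C1"] - caps["C3"], caps["C2"] - caps["C4"])
```

After the fourth exposure every capacitor carries the same mean offset, so it drops out of the differential terms C1 - C3 and C2 - C4. After only two exposures, the same mismatch would appear directly in those differences.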

Finally, the fact that every four vertically adjacent pixels share the same in-pixel CDS circuit and are connected to a set of 16 storage capacitors allows storing more than just 4 readouts, as long as the array is binned accordingly. This is very useful for resolving the ambiguity due to phase wraparound that is inherent in iToF technology, which requires data from more than one frequency. This coincides nicely with the fact that longer ranges usually require pixel binning for improved SNR.
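The wraparound ambiguity and its resolution with a second frequency can be sketched as follows. This is a generic brute-force illustration, not the method used in the sensor. The 160MHz second frequency, the function names, and the noise-free measurements are assumptions.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def unambiguous_range_m(f_mod_hz):
    """Distance at which the measured phase wraps around for one frequency."""
    return C_LIGHT / (2.0 * f_mod_hz)


def unwrap_two_frequencies(d1, d2, f1_hz, f2_hz, max_range_m):
    """Pick the wrap counts whose candidate distances agree best (brute force)."""
    r1, r2 = unambiguous_range_m(f1_hz), unambiguous_range_m(f2_hz)
    best_err, best_dist = None, None
    for n1 in range(int(max_range_m / r1) + 1):
        for n2 in range(int(max_range_m / r2) + 1):
            c1, c2 = d1 + n1 * r1, d2 + n2 * r2
            err = abs(c1 - c2)
            if best_err is None or err < best_err:
                best_err, best_dist = err, 0.5 * (c1 + c2)
    return best_dist


# Usage: a target at 3.2 m wraps at both frequencies, yet the pair is unique.
f1, f2 = 200e6, 160e6                    # f2 is a hypothetical second frequency
true_d = 3.2
d1 = true_d % unambiguous_range_m(f1)    # wrapped measurement at f1
d2 = true_d % unambiguous_range_m(f2)    # wrapped measurement at f2
print(unambiguous_range_m(f1), unambiguous_range_m(f2))
print(unwrap_two_frequencies(d1, d2, f1, f2, max_range_m=3.7))
```

Each frequency needs its own set of phase samples, which is where the additional storage capacitors made available by binning become useful.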

Table 1 - Summary of the data contained in each storage capacitor after each exposure and in-pixel CDS cycle

| Stored data after exposure # | C1 | C2 | C3 | C4 |
|------------------------------|----|----|----|----|
| 1 | $\phi(0^{\circ})_{SG1}$ | - | $\phi(180^{\circ})_{SG2}$ | - |
| 2 | $\phi(0^{\circ})_{SG1}$ | $\phi(90^{\circ})_{SG1}$ | $\phi(180^{\circ})_{SG2}$ | $\phi(270^{\circ})_{SG2}$ |
| 3 | $[\phi(0^{\circ})_{SG1} + \phi(0^{\circ})_{SG2}]/2$ | $\phi(90^{\circ})_{SG1}$ | $[\phi(180^{\circ})_{SG1} + \phi(180^{\circ})_{SG2}]/2$ | $\phi(270^{\circ})_{SG2}$ |
| 4 (Final) | $[\phi(0^{\circ})_{SG1} + \phi(0^{\circ})_{SG2}]/2$ | $[\phi(90^{\circ})_{SG1} + \phi(90^{\circ})_{SG2}]/2$ | $[\phi(180^{\circ})_{SG1} + \phi(180^{\circ})_{SG2}]/2$ | $[\phi(270^{\circ})_{SG1} + \phi(270^{\circ})_{SG2}]/2$ |

# Pinned Photodiode Optimization for High Modulation Contrast & Frequency

Achieving good depth quality in iToF sensors requires high modulation frequency, high modulation contrast, and high modulation uniformity between adjacent pixels and across the array. These are achieved by designing the electric field in the photodiode so that any photoelectron generated inside it quickly drifts to the correct storage area. Photoelectrons that get stuck in areas with a low electric field have a chance of being integrated into the wrong storage, thus lowering the modulation contrast. Furthermore, the time it takes an electron to cross a low-field area is in many cases sensitive to process variations, and might therefore cause a different modulation response in adjacent pixels that translates into patterns in the depth image. Optimizing the implant scheme and layout of the photodiode is required in order to eliminate areas with potential pockets or low electric fields.

In order to achieve maximal electron velocity, the photodiode was designed to guide the electron along the path described in Figure 3(A). After photogeneration, the electron drifts toward the center of the photodiode, then toward the surface, and is finally collected into the correct storage area under the effect of the modulation gates. A series of low-dose, high-energy implants was introduced into the photodiode in order to achieve a strong electric field and high electron drift velocity toward the surface. The implant scheme and photodiode layout were optimized using three-dimensional TCAD simulations to make sure that an electron generated anywhere in the photodiode is collected into the storage gate in under 1ns. Simulated cross-sections of the electrostatic potential, electric field, and electron drift velocity extracted from the center of the photodiode can be seen in Figure 3(B).



Figure 3(A) - A conceptual cross section of an iToF pixel photodiode, showing the path of the photoelectron. After photogeneration the electron quickly drifts towards the center of the photodiode (1), then towards the silicon surface (2), and is finally collected into the relevant storage gate (3). (B) - One-dimensional cross section generated using TCAD simulation, showing the electrostatic potential, electric field, and drift velocity in the center of the photodiode.

As part of the pixel development, an experiment was designed around both the layout and the implant recipe of the photodiode. The effect of the optimization on modulation performance is apparent in Figure 4, which presents experimental scans of the pixel optical response vs. delay time using a pulsed laser with a very short (tens of picoseconds) pulse width. Each data point in the plot represents the average pixel response for a given delay between the pixel modulation clock and the laser pulse. The dark shaded area (which is hardly visible) represents pixels within 1 standard deviation of the average modulation, and the light shaded area represents the 95<sup>th</sup> percentile of all pixels. Figure 4(A) shows the effect of layout optimization, specifically of increasing the modulation gate (MG) length, on the modulation uniformity. This measurement was performed at a modulation frequency of 100MHz and a modulation voltage of 1.2V. The plot shows that the modulation shape and pixel-to-pixel variation improved significantly. Modulation contrast was not affected and remained at 96%.



Figure 4 – Scans of the pixel response to fast modulation, measured with a short (<100 ps) pulse laser and a delay generator. Each data point represents the average pixel response per specific laser delay. (A) – scans with modulation frequency of 100MHz and modulation voltage of 1.2V. (B) – scans with modulation frequency of 200MHz and modulation voltage of 1.8V.

Figure 4(B) shows the result of implant dose optimization, specifically of the doses of the deep and shallow n-type implants that form the photodiode. This measurement was performed at a modulation frequency of 200MHz and a modulation voltage of 1.8V, and the results again show significant improvement in modulation shape and pixel-to-pixel variation.
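For readers who want to reproduce this kind of analysis on their own delay-scan data, the following is a minimal sketch. It assumes one common definition of modulation contrast for a two-tap pixel, (A - B) / (A + B), and uses population statistics per delay point. The paper does not state the exact definition or processing used, so the function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def modulation_contrast(tap_a, tap_b):
    """Contrast at each delay point for two complementary tap responses."""
    return [(a - b) / (a + b) for a, b in zip(tap_a, tap_b)]


def peak_contrast(tap_a, tap_b):
    """Largest contrast magnitude over the scanned delays."""
    return max(abs(c) for c in modulation_contrast(tap_a, tap_b))


def per_delay_stats(scans):
    """scans: one response list per pixel. Returns (mean, std) for each delay."""
    return [(mean(col), pstdev(col)) for col in zip(*scans)]


# Usage with made-up responses at three laser delays.
tap_a = [98.0, 50.0, 2.0]
tap_b = [2.0, 50.0, 98.0]
print(peak_contrast(tap_a, tap_b))
print(per_delay_stats([[1.0, 2.0], [3.0, 4.0]]))
```

The per-delay standard deviation corresponds to the kind of shaded band described for Figure 4. A narrower band indicates better pixel-to-pixel uniformity.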

# Summary

We have presented a 1280x960 iToF sensor with a 3.5um pixel, an in-pixel CDS circuit, the ability to store 4 frames in voltage-domain storage capacitors, 38% QE at 940nm, >25ke- full well capacity, and up to 200MHz modulation frequency with >95% modulation contrast. Point cloud output from the sensor is presented in Figure 5. Figure 5(A) shows a detailed depth scene captured in a single depth frame without temporal averaging. In Figure 5(B) the same point cloud is rotated about the Y-axis, showing the high accuracy of the geometry and the lack of flying pixels. Figure 6 presents a dynamic scene of a ball being thrown in the air with minimal motion artifacts, enabled by the in-pixel CDS and storage mechanism. A chip micrograph is shown in Figure 7, and a comparison to other recent iToF sensors is presented in Table 2.



Figure 5 - (A) Point cloud from a detailed scene captured from a single depth frame. (B) same point cloud rotated about the Y-Axis.



Figure 6 - A depth image of a ball being thrown in the air, showing valid depth readout from the moving ball and highlighting the contribution of the in-pixel storage architecture to reduced motion artifacts.

Figure 7 - Micrograph of the stacked die in CSP package. Chip dimensions are 6.1mm x 4.9mm.

Table 2 - Recent iToF sensor performance comparison

|                             | This Work                    | IEDM '20 [4] | ISSCC '21 [5]                | IISW '21 [6]                 | ESSDERC '21 [7] |
|-----------------------------|------------------------------|--------------|------------------------------|------------------------------|-----------------|
| Pixel Pitch                 | 3.5um                        | 3.5um        | 3.5um                        | 5um                          | 4.6um           |
| Process                     | 65nm / 65nm                  | 90nm / 65nm  | 65nm / 65nm                  | 65nm / 65nm                  | 65nm / 40nm     |
| Sensor Resolution           | 1280 x 960                   | 1280 x 960   | 1280 x 960                   | 640 x 480                    | 672 x 804       |
| Max. Mod. Frequency         | 200MHz                       | 120MHz       | 200MHz                       | 165MHz                       | 200MHz          |
| Modulation Contrast         | 96% @ 100MHz<br>95% @ 200MHz | 96% @ 100MHz | 96% @ 100MHz<br>80% @ 200MHz | 88% @ 100MHz<br>81% @ 165MHz | 88.5% @ 200MHz  |
| Read noise (direct readout) | 3.5e-                        | -            | 3.4e-                        | 3.5e-                        | 4.3e-           |
| Read noise (storage caps)   | 7.0e-                        | -            | -                            | -                            | -               |
| Linear Full Well per tap    | 25ke-                        | 18ke-        | -                            | 11ke-                        | -               |
| 940nm QE                    | 38%                          | 32%          | 38%                          | 21%                          | 18.5%           |

## References

[1] R. Lange, "3D time-of-flight distance measurement with custom solid state image sensors in CMOS/CCD-technology," Ph.D. dissertation, Univ. Siegen, Siegen, Germany, 2000.

[2] C. Bamji et al., "A Review of Indirect Time-of-Flight Technologies," IEEE Transactions on Electron Devices, 2022.

[3] K. Hatakeyama et al., "A Hybrid Indirect ToF Image Sensor for Long-Range 3D Depth Measurement under High Ambient Light Conditions," in 2022 IEEE Symposium on VLSI Technology and Circuits, 2022.

[4] Y. Ebiko et al., "Low power consumption and high resolution 1280×960 gate assisted photonic demodulator pixel for indirect time of flight," in IEDM Tech. Dig., Dec. 2020, p. 33.

[5] M.-S. Keel et al., "A 4-tap 3.5 μm 1.2 Mpixel indirect time-of-flight CMOS image sensor with peak current mitigation and multi-user interference cancellation," in IEEE ISSCC Dig. Tech. Papers, Feb. 2021.

[6] M. Tsutsui et al., "A 3-Tap Global Shutter 5.0um Pixel with Background Canceling for 165MHz Modulated Pulsed Indirect Time-of-Flight Image Sensor," in IISW, 2021.

[7] C. Tubert et al., "4.6µm Low Power Indirect Time-of-Flight Pixel Achieving 88.5% Demodulation Contrast at 200MHz for 0.54 MPix Depth Camera," in ESSDERC 2021 - IEEE 51st European Solid-State Device Research Conference, 2021.