# A 2.5pJ/b Readout Circuit for 1000fps Single-bit Quanta Image Sensors

Saleh Masoodian, Arun Rao, Jiaju Ma, Kofi Odame and Eric R. Fossum

Thayer School of Engineering at Dartmouth, Hanover, New Hampshire USA

E-mail: saleh.masoodian.th@dartmouth.edu

Abstract: A pathfinder 1376Hx768V (1Mpixel) 1000fps (1Gb/s output data rate) binary image sensor with pixel signal levels of 1mV dissipates 20mW, including I/O pads, and shows power-dissipation viability of 1000fps gigajot Quanta Image Sensors. Implemented in 0.18μm CMOS, the readout signal chain uses column-parallel fully differential charge-transfer amplifier (CTA) gain stages before a 1b-ADC, and pseudo-static clock gating units for row and column circuits to achieve an energy/bit FOM of 2.5pJ/b for gain+ADC.

#### I. INTRODUCTION

Quanta Image Sensors (QIS) are proposed as a paradigm shift in image capture to take advantage of shrinking pixel sizes [1]. The key aspects of the single-bit QIS involve counting individual photoelectrons using tiny, spatially oversampled binary photodetectors at high readout rates, representing this binary output as a bit cube (x,y,t), and finally processing the bit cubes to form high dynamic range images (Fig. 1). One of the challenges for the QIS is the design of internal high-speed and low-power addressing and readout circuitry. A QIS may contain over a billion specialized photodetectors, called jots, each producing just 1mV of signal, with a field readout rate 10-100 times faster than conventional CMOS image sensors.

This paper presents a pathfinder image sensor for exploring the low-power binary readout circuits needed for commercial implementation of gigajot single-bit QIS devices. Using a charge-transfer amplifier (CTA) design, and pseudo-static clock gating units for row addressing, the chip achieves 20mW total power consumption, including I/O pads, for 1Mpixels at 1000fps.
Comparison with lower frame-rate CMOS image sensors with high resolution (12b) ADCs is difficult, but using an energy per comparator strobe FOM, we demonstrate no less than 2.5x improvement, and likely much more, over the best of the state of the art (SOA), successfully paving the way for future gigajot QIS sensor designs.

Fig. 1. QIS concept.

Fig. 2. Architecture of the 1MP pathfinder image sensor.

#### II. SENSOR

#### Sensor Architecture

The 1376Hx768V pixel image sensor uses a conventional 3T pixel with a partially pinned photodiode, a 3.6µm pixel pitch, and a readout architecture implemented in a 0.18µm process, as shown in Fig. 2. 4T-PPD pixels were not yet available at the time of tapeout. True jot implementation requires a smaller technology node and is underway separately. Our focus is on the signal chain from pixel to digital output, though specialized implants were used to increase conversion gain to about 120µV/e-, and the sensor is operated in a single-row rolling shutter mode so that true CDS is utilized. However, this leads to extremely short integration times, useful only in the lab. Ideally, we would have used 4T pixels in this pathfinder sensor with integration times approaching the field readout time (1ms).

A column-parallel single-bit ADC using a CTA-based design detects a >0.5mV output swing ($\sim$4.2e-) from the pixel. The ADC is capable of sampling at speeds of 768kSa/s to produce the binary output while consuming 1.9$\mu$W per column. The sensor operates at 1000fps, which corresponds to a row time of 1.3$\mu$s, a signal integration time $T_{int}<1\mu$s, and an output data rate of 1Gb/s. The ADCs working in tandem with digital circuits consume an average power of 6.4mW.

#### Pixel

As mentioned, since a conventional pinned photodiode (PPD) was not available in this process at the time of tapeout, a 3T pixel with a partially pinned photodiode [2] was utilized. The schematic and the layout of the pixel are shown in Fig. 3.
To improve performance for this application, the implant conditions and the layout of this pixel were modified to realize smaller full-well capacity and higher conversion gain. We designed and simulated the pixel using Synopsys TCAD tools. This 3T pixel is front-side illuminated, with a pitch of 3.6µm, and the design fill factor is approximately 45%.

Fig. 3. Schematic, layout and simulated doping profile of the pixel.

#### ADC

Fig. 4 shows the 1-bit ADC circuit. The analog readout comprises 4 stages of fully differential CTA-based sense amplifiers, followed by a D-latch comparator. The cascade of CTAs provides a gain of 400, which reduces the ADC's input-referred offset, mainly due to transistor mismatch in the comparator, to less than 500µV. Offset due to the CTAs themselves is minimized by resetting and precharging them during each sample, without the need for explicit auto-zeroing. A detailed description of the CTA operation can be found in [3] and [4]. Compared to [4], use of a differential CTA and column-parallel ADC layout in an actual image sensor required more power dissipation. The gain of the CTA is approximately the ratio $C_t/C_o$, where $C_t$ is a drawn capacitor (see Fig. 4) and $C_o$ is the CTA's load capacitance. According to power consumption optimization calculations, the optimum gain of each CTA should be between 4 and 5 (V/V), which is why a cascade of 4 is needed to obtain the gain of 400.

Sensor readout is essentially rolling shutter with single-row integration time to allow CDS with 3T pixels. Following row selection, the pixels are reset; meanwhile the CTAs in the ADCs are reset and then precharged, while the D-latch comparator is in its latch and reset phases, respectively. After pixel reset, the integration period starts. During the integration period, the CTAs in the ADCs are in the amplify phase, observing and amplifying the voltage change on the column, while the D-latch comparator is in the transfer phase.
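
The signal-chain numbers quoted in this section can be cross-checked with a few lines of arithmetic. The Python sketch below is only a sanity check under stated assumptions: it assumes the total gain of 400 is split equally across the 4 CTA stages, and it attributes the whole 500µV input-referred offset bound to the comparator. Neither assumption is stated in the text, and the function names are illustrative.

```python
# Sanity check of the ADC signal-chain figures quoted in the text.
# Inputs are taken from the paper; the equal per-stage gain split and the
# "all offset comes from the comparator" reading are assumptions.

CONVERSION_GAIN_V_PER_E = 120e-6   # about 120 uV per electron
THRESHOLD_V = 0.5e-3               # minimum detectable column swing
TOTAL_GAIN = 400.0                 # gain of the CTA cascade
NUM_STAGES = 4                     # number of CTA stages
INPUT_OFFSET_BOUND_V = 500e-6      # quoted bound on input-referred offset


def per_stage_gain(total_gain: float, stages: int) -> float:
    """Gain per stage if the cascade gain is split equally (assumption)."""
    return total_gain ** (1.0 / stages)


def threshold_in_electrons(threshold_v: float, cg_v_per_e: float) -> float:
    """Column voltage threshold expressed in photoelectrons."""
    return threshold_v / cg_v_per_e


def comparator_offset_budget(input_offset_v: float, gain: float) -> float:
    """Comparator offset that maps back to the given input-referred offset."""
    return input_offset_v * gain


def cascade_gain_range(stages: int, g_min: float = 4.0, g_max: float = 5.0):
    """Cascade gain reachable when each stage gain lies in [g_min, g_max]."""
    return g_min ** stages, g_max ** stages


if __name__ == "__main__":
    g = per_stage_gain(TOTAL_GAIN, NUM_STAGES)
    lo, hi = cascade_gain_range(NUM_STAGES)
    print(f"per-stage gain: {g:.2f} V/V")
    print(f"cascade gain range for 4 stages: {lo:.0f} to {hi:.0f}")
    print(f"threshold: {threshold_in_electrons(THRESHOLD_V, CONVERSION_GAIN_V_PER_E):.1f} e-")
    print(f"comparator offset budget: "
          f"{comparator_offset_budget(INPUT_OFFSET_BOUND_V, TOTAL_GAIN) * 1e3:.0f} mV")
```

Under these assumptions the per-stage gain falls inside the 4 to 5 V/V window given above, and a gain of 400 lies within the range that four such stages can reach.
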
A DC-blocking capacitor is used between the column and the ADC to shield the ADC from differences in common mode due to threshold voltage mismatch in the pixels' source followers. Due to the structure of the CTA, no sample-and-hold circuits are needed to store the reset and signal levels. The output of the photodiode is sampled (integrated) on the CTAs' capacitors simultaneously. A current source at the bottom of each column biases the large parasitic column capacitance that comes from the row-select switches on the column, and provides the required settling time. At the end of the integration period (the amplify phase of the CTA), the D-latch comparator enters the latch phase and flips state depending on whether or not the column voltage has changed by more than 500µV. The final state of the comparator is saved in a dynamic flip-flop to be sent off-chip by column shift registers and multiplexers. The timing and signal waveforms for one column and ADC are shown in Fig. 5. Note that for 4T-type pixels, the same general timing would be used, with the integration period replaced by the signal transfer from PPD to FD.

Fig. 4. 1-bit ADC based on a cascade of sense amplifiers and a single D-latch comparator. Each sense amplifier is implemented as a charge transfer amplifier.

Fig. 5. Timing diagram and various phases of operation for each column and ADC.

#### Digital Circuits

Besides the readout signal chain and ADC, a second concern in a gigajot QIS is clock distribution power in the row selection circuits. To address this concern, a tree structure of clock gating units is used, whereby power is conserved by distributing the clock to only the active sections of the shift registers [5]. In Fig. 6, the M\_ON and M\_OFF transistors of the clock gating units are controlled by the outputs of the flip-flops in the shift registers.
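
To illustrate why gating the clock to only the active sections saves power, the following Python sketch models a one-hot row shift register behaviourally. It is a toy model and not the transistor-level tree of Fig. 6: the section size of 32 flip-flops and the hand-over rule (the next section is clocked only while the token sits in the last flip-flop of the current section) are hypothetical choices made for illustration.

```python
# Behavioural toy model of section-wise clock gating in a one-hot shift
# register. Section size and the hand-over rule are assumptions, not taken
# from the paper.


def clocked_flops_per_step(num_rows: int, section_size: int, token_row: int) -> int:
    """Number of flip-flops that receive a clock edge for a given token position.

    Assumed rule: the section holding the token is clocked. When the token
    is in the last flip-flop of its section, the following section is also
    clocked so that it can capture the token.
    """
    section = token_row // section_size
    start = section * section_size
    end = min(start + section_size, num_rows)
    clocked = end - start
    if token_row == end - 1 and end < num_rows:
        clocked += min(section_size, num_rows - end)
    return clocked


def average_clock_activity(num_rows: int, section_size: int) -> float:
    """Fraction of flip-flop clock events relative to an ungated register."""
    total = sum(
        clocked_flops_per_step(num_rows, section_size, row)
        for row in range(num_rows)
    )
    return total / (num_rows * num_rows)


if __name__ == "__main__":
    rows = 768            # number of rows in the pathfinder sensor
    section = 32          # hypothetical section size
    gated = average_clock_activity(rows, section)
    ungated = average_clock_activity(rows, rows)   # one section = no gating
    print(f"relative clock activity, gated:   {gated:.3f}")
    print(f"relative clock activity, ungated: {ungated:.3f}")
```

In this model the clocked load scales with the section size rather than with the full register length, which is the qualitative effect the clock gating tree is meant to achieve.
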
The flip-flops and clock gating units are implemented as pseudo-static circuits, which combine the low power consumption of a dynamic circuit with the robustness of a static one. As shown in the schematic of Fig. 6, the pseudo-static flip-flop is based on a dynamic flip-flop that has been modified with weak feedback transistors, MPW and MNW, to prevent destructive charge leakage. Fig. 6 also shows the clock gating unit along with a timing diagram. The column shift registers at the end of the analog signal chain transfer the binary output from the 1-bit ADCs serially, at 33Mb/s on each of 32 output pins. Every output pin corresponds to a group of 43 columns. The row addressing circuits, including the buffers, consume $0.73\mu W$ per row, whereas the column shift registers dissipate $2.3\mu W$ per column. The impact of clock power reduction is expected to become significant in gigajot QIS devices.

Fig. 6. Pseudo-static flip-flops and clock gating units used in the row addressing and column shift register circuits, with timing diagram.

#### III. MEASURED RESULTS

The final specifications of the sensor are shown in Table I, and Fig. 7 shows one frame of bits that was measured from the sensor. The zoomed image of Fig. 7 shows roughness in the edges of characters. This is likely caused by the resolution (600dpi) of the printer used to print the word "IEEE", leading to a less distinct edge. Dark current was not observed, due to the short integration time. The power consumption of the entire chip (including I/O pads) is 20mW. The die microphotograph is shown in Fig. 8, depicting various sections of the chip.

Few binary image sensors have been reported in recent years. Comparing to SOA 12b CMOS image sensors is hardly fair, but perhaps useful as a rough guide. We define an energy/bit figure of merit FOM = chip power/(# of pixels × fps × N), where N represents the ADC resolution in bits, which for algorithmic converters is the number of comparator strobes per conversion.
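
The FOM definition above can be evaluated directly from figures quoted elsewhere in the paper (array size, frame rate, 20mW chip power, 1.9µW per column for the gain+ADC chain, 32 output pins). The Python sketch below is a minimal cross-check; the function and variable names are illustrative, and treating the gain+ADC power as 1.9µW times the number of columns is an assumption based on the per-column figure.

```python
# Cross-check of the energy/bit FOM and readout rates from quoted figures.
# FOM = power / (number of pixels * fps * N), with N = 1 for a 1-bit ADC.

COLS, ROWS = 1376, 768
FPS = 1000
ADC_BITS = 1
CHIP_POWER_W = 20e-3           # whole chip, including I/O pads
ADC_POWER_PER_COL_W = 1.9e-6   # gain + ADC, per column
OUTPUT_PINS = 32


def energy_per_bit(power_w: float, num_pixels: int, fps: float, bits: int) -> float:
    """Energy per bit in joules, following the FOM definition in the text."""
    return power_w / (num_pixels * fps * bits)


num_pixels = COLS * ROWS
chip_fom = energy_per_bit(CHIP_POWER_W, num_pixels, FPS, ADC_BITS)
adc_fom = energy_per_bit(ADC_POWER_PER_COL_W * COLS, num_pixels, FPS, ADC_BITS)

data_rate_bps = num_pixels * FPS * ADC_BITS   # total output data rate
per_pin_bps = data_rate_bps / OUTPUT_PINS     # rate on each output pin
cols_per_pin = COLS // OUTPUT_PINS            # columns served by one pin
row_time_s = 1.0 / (ROWS * FPS)               # time available per row

if __name__ == "__main__":
    print(f"chip FOM:     {chip_fom * 1e12:.1f} pJ/b")
    print(f"gain+ADC FOM: {adc_fom * 1e12:.1f} pJ/b")
    print(f"data rate:    {data_rate_bps / 1e9:.2f} Gb/s")
    print(f"per pin:      {per_pin_bps / 1e6:.1f} Mb/s over {cols_per_pin} columns")
    print(f"row time:     {row_time_s * 1e6:.2f} us")
```

With these inputs the whole-chip figure comes out at about 19pJ/b and the gain+ADC figure at about 2.5pJ/b, consistent with the 20mW and 2.5pJ/b values given in the abstract.
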
The pathfinder sensor has an FOM of 19pJ/b. We calculate that the SOA 12b CMOS image sensors of [6] and [7] have FOMs of 53pJ/b and 120pJ/b, respectively. Substituting gain+ADC power only for chip power to remove pad I/O power, the FOM of the pathfinder chip becomes 2.5pJ/b, and that of [6] is estimated at perhaps 16.7pJ/b.

Table I. Specifications of the 1MP binary image sensor

| Parameter | Value |
|---|---|
| Process | XFAB, 0.18 µm, 6M1P (non-standard implants) |
| VDD | 1.3 V (Analog and Digital), 1.8 V (Array), 3 V (I/O pads) |
| Pixel type | 3T-APS |
| Pixel pitch | 3.6 µm |
| Photo-detector | Partially pinned photodiode |
| Conversion gain | 120 µV/e- |
| Array | 1376 (H) × 768 (V) |
| Column noise | 2 e- |
| Field rate | 1000 fps |
| ADC sampling rate | 768 kSa/s |
| ADC resolution | 1 bit (LSB = 1 mV) |
| Output data rate | 32 output pins × 33 Mb/s = 1 Gb/s |
| Package | PGA with 256 pins |
| Power, pixel array | 8.6 mW |
| Power, ADCs | 2.6 mW |
| Power, addressing | 3.8 mW |
| Power, I/O pads | 5 mW |
| Power, total | 20 mW |

Fig. 7. Single captured binary frame with blowup to show more details as described in text.

#### IV. CONCLUSIONS

The power reduction circuit strategies proven in the pathfinder chips allow us to proceed with confidence to gigajot single-bit QIS implementations in advanced processes. An energy FOM of 19pJ/b, scaled down by the smaller parasitic capacitances and rail voltages of advanced technology nodes, gives power dissipation in the sub-watt range for gigajot QIS devices, sufficient for commercial purposes.

Fig. 8. Micrograph of pathfinder sensor in 0.18 µm CMOS.

#### V. ACKNOWLEDGMENTS

The authors appreciate the sponsorship and collaboration of Rambus, and the in-kind support and collaboration of XFAB. The technical advice of Forza Silicon, particularly B. Mansoorian, D.
Van Blerkom, and Rami Yassine, in the design review of this sensor is especially appreciated.

#### VI. REFERENCES

- [1] E. R. Fossum, "The quanta image sensor (QIS): Concepts and challenges," Proc. OSA Top. Mtg. on Comp. Opt. Sensing and Imaging, Toronto, Canada, July 10-14, 2011.
- [2] T. H. Lee, et al., "Partially pinned photodiode for solid-state image sensors," U.S. Patent 5 903 021, Jan. 1997.
- [3] W. J. Marble, et al., "Analysis of the dynamic behavior of a charge-transfer amplifier," IEEE Trans. on Circ. & Sys.-I, Vol. 48, No. 7, July 2001.
- [4] S. Masoodian, et al., "Low-power readout circuit for quanta image sensors," Electronics Letters, Vol. 50, No. 8, pp. 589-591, April 2014.
- [5] C. Yeh, et al., "Low power readout control circuit for high resolution CMOS image sensor," Proc. IEEE ISCAS, pp. 1163-1166, 21-24 May, 2006.
- [6] T. Watabe, et al., "A 33Mpixel 120fps CMOS image sensor using 12b column-parallel pipelined cyclic ADCs," ISSCC Dig. Tech. Papers, pp. 388-390, Feb., 2012.
- [7] T. Toyama, et al., "A 17.7Mpixel 120fps CMOS image sensor with 34.8Gb/s readout," ISSCC Dig. Tech. Papers, pp. 420-422, Feb., 2011.