# A 1Mpixel, 80k fps Global Shutter CMOS Image Sensor for High Speed Imaging

Daniel Van Blerkom, Loc Truong, Jeff Rysinski\*\*, Radu Corlan\*, Karthik Venkatesan, Sam Bagwell, Liviu Oniciuc, Jonathan Bergey

Forza Silicon, 2947 Bradley St, Suite 130, Pasadena, California 91107

\*Vision Research, 100 Dey Rd, Wayne, New Jersey 07470

\*\*now at Gigajot, 3452 E Foothill Blvd, Suite 360, Pasadena, CA 91107

daniel.vanblerkom@ametek.com

## Abstract

We present a 1Mpixel, 80k fps global shutter BSI CMOS image sensor with 160 CML outputs each running at 6.25Gbit/sec. The sensor allows for wide field-of-view, high resolution imaging at high speed, which is crucial for research applications such as combustion imaging, crack propagation, and flow visualization using particle image velocimetry.

## Introduction

Achieving 80k frame-per-second (fps) imaging with commercial high-speed scientific cameras often requires significantly reducing the resolution in exchange for the high frame rate, which limits the ability to image a wide field of view. Here, we present a 1Mpixel (1280 x 832) global shutter BSI CMOS image sensor capable of 80k fps at full resolution. The sensor size is 23.7mm by 15.4mm, with a pixel pitch of 18.54µm, and it is fabricated in a 6 metal, 0.11µm BSI process.

Reaching this combination of resolution and frame rate requires attention to several architectural aspects in order to avoid bottlenecks in the design. The first is how much parallelism can be accommodated, and how the signal chain timing can be pipelined to overlap operations. With such a large amount of data to send off-chip, the clocking and data transmission choices are also critical. The supply distribution and power dissipation are important in determining the sensor packaging and camera cooling requirements.

The other aspect of the design is that the pinned photodiode charge transfer and correlated double sampling techniques used to reduce noise and non-uniformities take too long to fit in the readout time. Although this is perhaps not an inherent limitation when stacked sensors with pixel-wise connections are considered, for the process technology involved it meant keeping the signal chain operations as simple and fast as possible, and depending on calibration and dark-frame subtraction in the camera system to remove non-uniformities.
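As an illustration of the camera-side correction mentioned above, the following is a minimal sketch of dark-frame subtraction on a toy frame. The function names, frame values, and the choice to average several dark frames are assumptions made for this example; the actual calibration pipeline used in the camera is not described in this paper.

```python
def average_frames(frames):
    """Average equally sized frames (lists of rows) pixel by pixel."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def subtract_dark(raw, dark, floor=0):
    """Subtract a dark reference from a raw frame, clipping at `floor`."""
    return [[max(floor, p - d) for p, d in zip(raw_row, dark_row)]
            for raw_row, dark_row in zip(raw, dark)]


# Hypothetical 2x3 dark captures (shutter closed); values are made up.
dark_frames = [
    [[10, 12, 9], [11, 10, 13]],
    [[12, 12, 11], [9, 10, 13]],
]
dark = average_frames(dark_frames)

# Hypothetical raw frame: a uniform signal of 100 on top of the fixed offsets.
raw = [[111, 112, 110], [110, 110, 113]]
corrected = subtract_dark(raw, dark)
print(corrected)
```

Averaging several dark frames before subtracting reduces the temporal noise that the reference itself would otherwise add to every corrected frame.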

## Sensor Architecture

The pixel uses a partially pinned photodiode, with voltage mode global shutter storage. The photodiode uses tapering and multiple implants to improve the photogenerated carrier transit time to the unpinned region. 2x2 binning is also supported, through binning connections in the pixel.

Pixel sampling and conversion proceed 32 rows at a time, with 16 rows sampled at the top of the array and 16 at the bottom, in a row time of less than 500 nsec. Only the pixel signal voltage is sampled; there is not enough time to also sample a reset level to remove the pixel source follower voltage non-uniformity, so this must be removed at the camera level. The analog signal chain is shown in Fig. 3, and the timing operations are shown in Fig. 4. The timing is pipelined as much as possible, with overlapping pixel sampling, ADC conversion, and digital readout. The pixel signal is stored on one of the sampling capacitors while the other is being used for ADC conversion.

Prior to each conversion, the pre-amp is reset by connecting both inputs to a common reference, and the comparator is auto-zeroed. The gained offset of the pre-amp appears at its output and is stored on the coupling capacitor to achieve output offset cancellation. Output offset cancellation has the drawback of limiting the gain of the pre-amp, since the gained worst-case offset must stay within the output range of the pre-amp. However, it is faster than auto-zeroing onto the pre-amp input capacitors, and it allows for alternating between input sampling capacitors.
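The row time quoted above can be checked against the frame rate with a short calculation. This is a rough sketch that assumes the whole frame period is available for row readout, with no per-frame overhead; the function name is chosen for this example.

```python
ROWS = 832                # full vertical resolution
ROWS_PER_ROW_TIME = 32    # 16 rows read at the top + 16 at the bottom
FRAME_RATE_HZ = 80_000


def row_time_budget_ns(rows, rows_per_row_time, frame_rate_hz):
    """Return (row periods per frame, time available per row period in ns).

    Assumes the entire frame period can be spent on row readout.
    """
    row_periods = -(-rows // rows_per_row_time)  # ceiling division
    frame_period_ns = 1e9 / frame_rate_hz
    return row_periods, frame_period_ns / row_periods


periods, budget_ns = row_time_budget_ns(ROWS, ROWS_PER_ROW_TIME, FRAME_RATE_HZ)
print(periods, round(budget_ns, 1))  # 26 480.8
```

Under these assumptions, 26 row periods must fit in the 12.5 µsec frame period, which leaves about 481 nsec per row period, consistent with the sub-500 nsec row time.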

The design is arranged into self-contained "superblocks", where each superblock is dedicated to 64 columns of pixels and contains all of the biasing, ramp and counter generation, digital logic, CML clock input and output drivers, and the 1024 ADCs needed to read out that block of pixels (Fig. 1 & 2). Within the superblock, the single-slope ADCs are arranged in banks of 256 ADCs with associated memory to store the converted values. The 4 banks of 256 ADCs are arranged in two stripes in the superblock, one on top of the other. The sampling capacitors and ADCs are on a 2µm pitch, so as to provide a routing channel in the center of each superblock for power, biasing, and the ramp signals.
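The superblock numbers above can be cross-checked with some simple bookkeeping. The sketch below assumes that the two banks in a stripe sit side by side, so 512 ADCs span the width of a stripe, and that the superblocks are split evenly between the top and bottom of the array; both are inferences from the text rather than stated layout facts.

```python
COLUMNS = 1280
COLUMNS_PER_SUPERBLOCK = 64
ROWS_SAMPLED_PER_SIDE = 16     # rows converted in parallel at top (or bottom)
SUPERBLOCKS = 40
ADCS_PER_BANK = 256
BANKS_PER_SUPERBLOCK = 4
STRIPES_PER_SUPERBLOCK = 2
PIXEL_PITCH_UM = 18.54
ADC_PITCH_UM = 2.0

# One ADC per pixel being converted: 64 columns x 16 rows per superblock.
adcs_per_superblock = COLUMNS_PER_SUPERBLOCK * ROWS_SAMPLED_PER_SIDE
assert adcs_per_superblock == ADCS_PER_BANK * BANKS_PER_SUPERBLOCK

# Assumed even split: 20 superblocks along the top, 20 along the bottom.
superblocks_per_side = COLUMNS // COLUMNS_PER_SUPERBLOCK
assert 2 * superblocks_per_side == SUPERBLOCKS

total_adcs = SUPERBLOCKS * adcs_per_superblock

# Width available under 64 pixel columns versus width used by one ADC stripe
# (assumes all 512 ADCs of a stripe are placed in a single row).
superblock_width_um = COLUMNS_PER_SUPERBLOCK * PIXEL_PITCH_UM
adcs_per_stripe = adcs_per_superblock // STRIPES_PER_SUPERBLOCK
stripe_width_um = adcs_per_stripe * ADC_PITCH_UM
routing_channel_um = superblock_width_um - stripe_width_um

print(total_adcs, round(superblock_width_um, 2), round(routing_channel_um, 2))
```

With these assumptions, the ADC stripe occupies 1024µm of the roughly 1187µm superblock width, leaving on the order of 160µm for the central routing channel.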

The CML outputs are power hungry and require precious FPGA receiver resources. To maintain the flexibility of using the sensor in a smaller camera system with fewer receiver channels, a data concentration scheme was implemented to route the digital data from the exterior superblocks towards the center superblocks and to power down the unused CML outputs. Several output configurations are supported to match desired row times with available receiver channels.
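The trade-off between active outputs and row time can be estimated from the raw payload alone. The sketch below ignores any line coding or framing overhead, which this paper does not specify, and the 80- and 40-output cases are hypothetical examples rather than documented configurations of the sensor.

```python
LANE_RATE_BPS = 6.25e9     # per CML output
COLUMNS = 1280
ROWS_PER_ROW_TIME = 32     # rows converted and read out per row time


def min_row_time_ns(active_outputs, bits_per_pixel):
    """Shortest row time (ns) that the output bandwidth alone allows.

    Counts raw pixel bits only; encoding and framing overhead are ignored.
    """
    payload_bits = COLUMNS * ROWS_PER_ROW_TIME * bits_per_pixel
    return payload_bits / (active_outputs * LANE_RATE_BPS) * 1e9


# 160 is the full output count; 80 and 40 are illustrative reduced cases.
for outputs in (160, 80, 40):
    times = [round(min_row_time_ns(outputs, bits), 1) for bits in (10, 11, 12)]
    print(outputs, times)
```

With all 160 outputs active, 10-bit data needs about 410 nsec per row time and 12-bit data about 492 nsec on bandwidth alone, close to the sub-500 nsec row time of the full-speed mode. Halving the number of active outputs doubles the minimum row time.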

## Design Challenges

There are 40,960 converters in the sensor, in 40 total superblocks. The choice of 40 superblocks was determined by a trade-off between the number of high-speed FPGA input pins required in the camera and the maximum output data rate the process could support. In the end, we also hit a ceiling on the number of superblocks because the pad ring along the top and bottom of the die is pad limited.

The sensor requires a CML input clock for each superblock, for a total of 40 differential 3.125GHz input clocks. As opposed to distributing a high-speed clock internally across superblocks, or including multiple PLLs on chip, this choice obviously puts pressure on the camera system to manage and distribute many high-speed clocks. It did, however, limit the power dissipation and reduce the skew, jitter, and duty cycle concerns that would be limitations to distributing the high-speed clocks on-chip. As for using PLLs, the clock speed was high enough that we were concerned that the VCO in this process, which we would typically run at 2x the clock rate, would not oscillate at the frequency required over all corners.
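The clock figures above imply a few simple relationships, sketched below. The reading that two bits are sent per clock period (one per clock edge) is an inference from the ratio of the 6.25Gbit/sec output rate to the 3.125GHz clock, not something the text states, and the pin count simply assumes two pins per differential clock.

```python
LANE_RATE_BPS = 6.25e9      # per CML output
INPUT_CLOCK_HZ = 3.125e9    # per-superblock CML input clock
SUPERBLOCKS = 40

# Two bits per clock period suggests data launched on both clock edges
# (an inference from the ratio, not a statement from the paper).
bits_per_clock_period = LANE_RATE_BPS / INPUT_CLOCK_HZ

# An on-chip PLL would have needed a VCO at about twice the clock rate.
vco_target_hz = 2 * INPUT_CLOCK_HZ

# One differential clock per superblock, assuming two pins per pair.
clock_input_pins = 2 * SUPERBLOCKS

print(bits_per_clock_period, vco_target_hz / 1e9, clock_input_pins)
```

The 6.25GHz VCO target is the frequency the authors were concerned the process could not sustain over all corners.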

Since the sensor is segmented into superblocks, mismatch between superblocks can result in block-wise artifacts that are difficult to correct. One potential source of mismatch is a lack of synchronization of the timing between superblocks. The sensor was designed with a carefully balanced H-tree to distribute the global synchronization pulses to each superblock. Another source of mismatch is varying IR drops on the supply rails; this was addressed by limiting the variation in the current requirements during the different phases of operation, and by star connections in the layout to isolate the IR drops seen by each block.

Another major source of mismatch is the variation in the ramp generation, which is done independently in each superblock. After prototype evaluation, it was decided to short the ramp outputs across all of the superblocks, in essence creating one large, distributed ramp generator. This approach avoids sharp discontinuities in the ramp offset or slope from one column to the next, which would have made calibration difficult.
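The effect of shorting the ramp outputs can be illustrated with a highly idealized model. The sketch below treats each ramp generator as a voltage source with a series output resistance and assumes a perfect short with no wiring resistance, so every superblock sees the same conductance-weighted average. The offset spread, resistance value, and function names are made up for this example.

```python
import random


def shorted_ramp_offset(offsets_mv, output_res_ohm):
    """Voltage of a node driven by several sources through series resistors.

    Idealized: the result is the conductance-weighted average of the sources.
    """
    conductances = [1.0 / r for r in output_res_ohm]
    weighted = sum(v * g for v, g in zip(offsets_mv, conductances))
    return weighted / sum(conductances)


def max_boundary_step(values):
    """Largest jump between neighbouring superblocks."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))


random.seed(0)
n = 40                                                # superblocks
offsets = [random.gauss(0.0, 2.0) for _ in range(n)]  # hypothetical mV offsets

independent_step = max_boundary_step(offsets)         # separate ramps
common = shorted_ramp_offset(offsets, [100.0] * n)    # hypothetical 100 ohm
shorted_step = max_boundary_step([common] * n)        # ideal short

print(round(independent_step, 3), shorted_step)
```

In a real layout the finite resistance of the shorting wire would leave a smooth gradient across the array rather than a perfectly uniform value, but the step changes at superblock boundaries are still removed.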

## Results & Summary

The sensor demonstrates a dynamic range of 52dB at maximum speed and a linearity error of <1.75% from 10% to 90% of the full swing. The dynamic range is limited by the swing of the ramp at the maximum frame rate; better dynamic range can be achieved with longer row times. Sensor specifications are given in Table 1. Fig. 5 shows three frames extracted from a video of tempered glass cracking that was recorded with the sensor. This recording used a reduced vertical resolution and binning to achieve 1.7Mfps, with a 95 nsec exposure time (courtesy of Citius Imaging). Fig. 6 is a frame from a 100kfps, 1280x560 video of a cell-phone screen cracking.

## References

G. Meynants, et al., "700 frames/s 2 MPixel global shutter image sensor with 2 Me- full well charge and 12 µm pixel pitch", Proc. IISW 2015, Vaals, June 2015.

| Parameter      | Value              | Notes                                       |
|----------------|--------------------|---------------------------------------------|
| Process        | 0.11µm BSI         |                                             |
| Resolution     | 1280 x 832         |                                             |
| Pixel Size     | 18.54µm            |                                             |
| Frame Rate     | 80,000 fps         | At full resolution                          |
| Bit Resolution | 10/11/12           | Depending on row time                       |
| Output Ports   | 160 @ 6.25Gbit/s   |                                             |
| Dynamic Range  | 52dB               | Limited by ADC ramp range at max frame rate |
| PLS            | < 1:10000          |                                             |
| Linearity      | < 1.75%            | 10-90% of swing                             |
| FPN            | < 1%               |                                             |
| Supply         | 3.3V / 2.5V / 1.2V | Pixel / Analog / Digital & CML supply       |
| Power          | < 40W              |                                             |

Table 1 – Sensor Specifications

Figure 1 – Sensor block diagram & die plot

Figure 2 – Superblock floorplan

Figure 4 – Timing Operations

Figure 5 – Three (non-successive) frames from a 1.7Mfps video of glass cracking

Figure 6 – Frame from a 100kfps video of cell-phone screen cracking