

CMOS Imager with Embedded Analog Early Image Processor
Christophe Basset, Bedabrata Pain (JPL), Pietro Perona

Abstract. We are developing a computational CMOS imager with an integrated general-purpose filter for early image processing. The goal of this collaborative work with the Jet Propulsion Laboratory is to produce a single-chip camera that pre-processes the image in real time through a convolution filter chosen by the user, allowing an efficient implementation of a variety of computationally intensive applications such as autonomous navigation, object avoidance or interception, real-time target tracking, and recognition.

Motivation. A system capable of tracking any target in real time in an unknown environment has numerous applications: object avoidance or interception, autonomous navigation (machine vision, nano-rovers, robots, docking...), recognition (tracking of eyes, nose...), etc. Low-level processing of images often consists of repetitive and computationally intensive tasks that are based on convolution operators. A hardware implementation is very well suited to such real-time vision systems. A software-based approach would be hard to miniaturize (it needs a camera and a computer with a frame grabber) and would run too slowly (~1 frame/second) for most of these applications, which require a real-time flow of data.

The ongoing collaboration with the Jet Propulsion Laboratory has brought to this project expertise in Active Pixel Sensors. This CMOS technology for building imager chips allows on-focal-plane signal processing (unlike CCD imagers, which must serially output the flow of pixels to an external processing chip). The filtering can therefore be implemented as a fast, low-power analog circuit.

Research and Achievements. Convolution is achieved by matching a template to an image using a computation unit, allowing generic filters to be used as a kernel. The chip has an integrated imager array and a 9×9 pixel digital memory to store the kernel. When recognizing or tracking a target, the kernel represents the template, chosen off-chip through a separate learning process. This part will therefore not be discussed here. Filtering is performed through a column-parallel architecture of computing units, so real-time computation can be achieved.

The core of the filtering system relies on a 9×9 pixel cell whose task is to perform the convolution on the image. The image provided by the array reaches the convolution block one row at a time. Hence, the image can be processed by only 9 rows of convolution units operating in parallel. To reduce the number of cells, a small level of serialization was introduced by sharing the same convolution unit among 16 neighboring columns.
Each of these cells computes the matching between the analog image (I) and the template (T):

    eq. (1)
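
Written out, a matching of this kind is a double sum of products over the 9×9 window. A plausible form of eq. (1), inferred from the multiplier and accumulator headings below (Σ (Ij·Kj) and Σ(Σ (Ij·Kj))), is sketched here; the output name C, the indices x, y, i, j, and their ranges and offsets are assumptions rather than the authors' notation.

```latex
C(x, y) = \sum_{i=0}^{8} \sum_{j=0}^{8} I(x + i,\; y + j)\, T(i, j)
```

Here T is the template held in the on-chip digital memory, which the later sections also write as K.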

The chip can be divided into three main components. The imager comprises the pixel array, the control logic for both row and column addressing, and the readout circuitry, with correlated double sampling implemented to reduce the image artifacts due to fixed pattern noise (FPN). The convolution block comprises the other two components: an array of mixed-signal multipliers and an array of analog pipelined accumulators. Figure 1 below shows the block diagram of the chip.



Figure 1. System block diagram


The imager. A traditional voltage-mode pixel implementation was chosen for the pixel array of the chip. It offers the good linearity and noise robustness that current-mode pixels lack. However, for simplicity and compactness of implementation, the convolution calculations downstream are (as described below) performed in the current domain. A voltage-to-current converter was therefore designed and is combined with the double-sampling circuit that reduces the fixed-pattern noise of the imager.



Figure 2. Imager and pixel layout detail

When a row is selected, the voltage level read from each column modulates the gate of a PMOS transistor, which changes the current flowing through a resistor and thereby converts the voltage into a current, as shown in Figure 3. The generated current is then either mirrored into a current memory cell (the light-exposed value is saved at the beginning of the readout cycle) or subtracted from the previously saved value, so that the dark value (which contains most of the FPN artifacts) is removed from the picture. Images taken with and without the FPN reduction circuit show the importance of this capability, as seen in Figure 4.
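
The subtraction step can be illustrated numerically. This is a minimal model and not the chip's circuit: the function name `fpn_corrected_row` and the sample values are hypothetical, and it assumes that the fixed-pattern offset of a column adds equally to its exposed and dark readings.

```python
def fpn_corrected_row(exposed, dark):
    """Subtract each column's dark reading from its light-exposed reading.

    Both arguments are lists of per-column currents (arbitrary units).
    Assumption: a column's fixed-pattern offset is present in both
    readings, so taking the difference cancels it.
    """
    if len(exposed) != len(dark):
        raise ValueError("exposed and dark rows must have the same length")
    return [e - d for e, d in zip(exposed, dark)]


# Hypothetical example: a uniform signal of 10 units read through
# column offsets of 3, 1 and 4 units.
offsets = [3, 1, 4]
signal = [10, 10, 10]
exposed = [s + o for s, o in zip(signal, offsets)]  # reading after exposure
dark = list(offsets)                                # reading with no signal
corrected = fpn_corrected_row(exposed, dark)
```

In this idealized case the corrected row is uniform again, which is the effect Figure 4 shows on a real image.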



Figure 3. Pixel readout and FPN reduction simplified circuit



Figure 4. Effect of the FPN reduction on a 64×64 pixel image

The multipliers Σ (Ij·Kj). The equation for convolution, eq. (1) above, shows that it is a two-dimensional accumulation of products. Across rows, the accumulators described below handle the "one row at a time" addressing of the imager. Within a row, all the pixels are available at the same time, so the partial products are added simply by connecting the current outputs of the multipliers together (Kirchhoff's current law).
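
The split between the two summation directions can be checked with a short numerical sketch. It is an illustration only: `row_partial_product` and `window_sum` are hypothetical names, and zero-based indexing with the window anchored at its top-left corner is an assumed convention.

```python
def row_partial_product(image_row, kernel_row, col):
    """Sum of I_j * K_j along one row: the value obtained when the
    multiplier outputs for one template row share a common node."""
    return sum(image_row[col + j] * k for j, k in enumerate(kernel_row))


def window_sum(image, kernel, row, col):
    """Two-dimensional accumulation: add the row partial products of
    consecutive image rows (the accumulators' role on the chip)."""
    return sum(
        row_partial_product(image[row + i], kernel_row, col)
        for i, kernel_row in enumerate(kernel)
    )


# Hypothetical 3x3 example (the chip uses a 9x9 kernel).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]
result = window_sum(image, kernel, 0, 0)  # 1 + 5 + 9
```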

Since one of the operands is a digital, 8-bit signal, the multipliers are effectively Digital to Analog Converters (DAC) weighted by the second operand (an analog current). The input current acts as the reference current for the DAC and is mirrored into binary-scaled current mirrors, whose outputs are controlled through switches by the bits of the template. Achieving 8-bit depth directly would therefore require scaled current mirrors from ×1 to ×128, which is not practical. To circumvent this problem, the multiplier was divided into two halves corresponding to the 4 LSBs and the 4 MSBs of the template, each scaling from ×1 to ×8. The output of the MSB half is scaled up by a factor of 16 and added to the lower half. The full multiplication is thus performed using a much smaller area on chip.
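
The arithmetic behind the two-half multiplier can be sketched as follows. This is an idealized behavioral model with perfectly matched mirrors; the function name `split_dac_multiply` and its parameter names are hypothetical and are not taken from the design.

```python
def split_dac_multiply(i_in, template):
    """Multiply an input current by an 8-bit template value using two
    4-bit binary-weighted halves, with the MSB half scaled by 16."""
    if not 0 <= template <= 255:
        raise ValueError("template must fit in 8 bits")
    lsb_bits = template & 0x0F   # the 4 LSBs of the template
    msb_bits = template >> 4     # the 4 MSBs of the template
    weights = (1, 2, 4, 8)       # mirror ratios x1 to x8 in each half

    def half(bits):
        # A switch passes the mirrored current when its template bit is set.
        return sum(i_in * w for n, w in enumerate(weights) if (bits >> n) & 1)

    return half(lsb_bits) + 16 * half(msb_bits)
```

With ideal mirrors the result equals `i_in * template` for all 256 template values. Mismatch between the mirrors and in the ×16 stage is not modeled here.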



Figure 5. One-pixel multiplier cell circuit


The accumulators Σ(Σ (Ij·Kj)). The mixed-signal multipliers are combinatorial elements that cannot memorize the partial products computed when previous rows were available from the imager. Such a memory is, however, necessary to reconstruct the convolution as the sum of partial products from nine consecutive rows. As seen in Figure 6, the accumulators for one column have nine inputs coming from the multipliers for the nine template rows and the corresponding column neighborhood. When a new row is read out from the imager, the partial products move two steps through the current memories in the pipeline and a new set of partial products is generated. A new result, the convolution from 9 rows ago, is then ready to be read out of the chip.
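
The pipeline can be modeled in a few lines. This is a behavioral sketch under stated assumptions, not the circuit of Figure 6: the name `pipeline_accumulate` is hypothetical, each current memory is treated as an ideal storage element, and the transfer through the memories is collapsed into one logical shift per image row.

```python
def pipeline_accumulate(partials, n_rows=9):
    """Accumulate row partial products through an n_rows-stage pipeline.

    partials[r][i] is X_i for image row r, i.e. the partial product of
    template row i with the image row currently being read out.
    Returns one completed sum per window, in the order they finish.
    """
    stages = [0] * n_rows
    results = []
    for r, x in enumerate(partials):
        # Each stored sum advances one stage and absorbs the new X_i;
        # stage 0 starts a fresh window with X_0.
        stages = [x[0]] + [stages[i - 1] + x[i] for i in range(1, n_rows)]
        if r >= n_rows - 1:
            # The last stage now holds a sum spanning n_rows image rows.
            results.append(stages[-1])
    return results


# Hypothetical 3-stage example: window k sums partials[k + i][i].
demo = pipeline_accumulate(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], n_rows=3
)
```

The first output appears only once `n_rows` image rows have been read, after which one new result is produced per row.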



Figure 6. Pipeline accumulator - principle of operation.
Xi: input from multiplier for template row i; CMi: Current memory.


Tests and Results. This chip was fabricated and is currently in testing. A twin, simplified chip was also fabricated, containing the same elements so that each block can be tested and characterized independently of the others. Preliminary results show good performance of the imager (Figure 4(b)) and the multiplier. As an example of the linearity of the multiplier, Figure 7 below shows a linear fit of the multiplication of a saturated kernel (K=255) with a varying current in the single-pixel multiplying cell described above.



Figure 7. Multiplier linearity: fixed template; input current of varying intensity.

The project started with a first prototype of the convolution chip, which was built and tested. It consisted of a 64×64 pixel imager array, a 49-byte digital memory to store the kernel, and a single 7×7 pixel convolution cell tied to the center of the imager. Test results demonstrated a convolution was indeed performed and allowed identification of adjustments to be made for the following fabrication run.

The second generation of the convolution chip has been tested and was presented at the 2003 Workshop on CCDs and Advanced Image Sensors, May 2003 in Elmau, Germany. Relative to the first generation, this run incorporated design modifications (we demonstrated better matching in the computation cells and in the pixels) as well as added features (such as the sum of all pixels, which is needed for normalization so that targets can be tracked accurately).

The present circuit, described here, is being fully tested and characterized. A number of important design changes were included, either to address issues that could be improved on (for instance, we moved from a current-mode imager to a voltage-mode pixel coupled with a voltage-to-current converter), or to demonstrate a new way to approach the problem (the accumulators now operate in the current domain rather than the charge domain). By implementing a larger image array, we also demonstrated the scalability of the architecture in the spatial domain to an arbitrarily sized imager. Preliminary test results are encouraging and full characterization is expected in the near future.

References
C. Basset, et al., "CMOS Imager with Embedded Analog Early Image Processor", 2003 IEEE Workshop on CCDs and Advanced Image Sensors, Elmau, Germany, May 2003.

R.H. Nixon, et al., "256×256 CMOS Active Pixel Sensor Camera On A Chip", Proc. IEEE International Solid-State Circuits Conference, San Francisco, CA, pp. 178-179, Feb. 1996.

L.G. McIlrath, et al., "Design and analysis of a 512×768 current-mediated active pixel array image sensor", IEEE Trans. on Electron Devices, vol. 44, pp. 1706-1715, Oct. 1997.

C. Clark, et al., "Application of APS arrays to star and feature tracking systems", Proc. SPIE, vol. 2810, pp. 116-120, 1996.

T. Komuro, et al., "A digital vision chip specialized for high-speed target tracking", IEEE Trans. on Electron Devices, vol. 50, pp. 191-199, Jan. 2003.

V. Gruev, et al., "Implementation of steerable spatiotemporal image filters on the focal plane", IEEE Trans. on Circuits and Systems II, vol. 49, pp. 233-244, April 2002.

A. Graupner, et al., "CMOS image sensor with mixed-signal processor array", IEEE Journal of Solid-State Circuits, vol. 38, pp. 948-957, June 2003.

A.A. Biyabani, L.R. Carley, T. Kanade, "An analog CMOS IC for template matching", Proc. IEEE International Solid-State Circuits Conference, pp. 82-83, Feb. 1999.

