VIDEO-RATE ANALOG-TO-DIGITAL
CONVERSION USING PIPELINED
ARCHITECTURES

by

Stephen Henry Lewis

Memorandum No. UCB/ERL M87/90

18 November 1987

ELECTRONICS RESEARCH LABORATORY

College of Engineering
University of California, Berkeley
94720
Video-Rate Analog-to-Digital Conversion
Using Pipelined Architectures

Ph.D. Stephen H. Lewis E.E.C.S.

Abstract

A combination of steadily improving technology and new circuit techniques is resulting in a
dramatic reduction in the cost of analog-to-digital (A/D) converters for video-rate applications. By using
pipelined conversion architectures in scaled CMOS technologies, the silicon area required for such
conversion should soon be small enough that it can be economically incorporated within a larger digital
image-processing chip. An examination of the use of pipelined architectures in CMOS technologies is
presented in this thesis. The objective of this research is to determine whether pipelined architectures are
suitable and advantageous compared to other architectures for CMOS video-rate A/D conversion.

To verify the main ideas about pipelined converters investigated in this thesis, an experimental,
pipelined, 5-Ms/s, 9-bit A/D converter with digital correction has been designed and fabricated in a 3-
micron, CMOS technology. It requires 8500 mils$^2$, consumes 180 mW, and has an input capacitance of 3
pF. A fully differential architecture is used; only a 2-phase, nonoverlapping clock is required, and an
on-chip, sample-and-hold (S/H) amplifier is included.

This thesis arrives at four main conclusions. First, pipelined architectures and digital correction
techniques are of potential interest for high-speed CMOS A/D conversion applications because they
simultaneously provide high throughput rate, high tolerance to error sources, and low hardware cost.
Second, the main disadvantage of pipelined A/D converters is that they require the use of operational
amplifiers (op amps) to realize parasitic-insensitive S/H amplifiers. While the S/H amplifiers improve
many aspects of the converter performance, the op amps within the S/H amplifiers limit the speed of the
pipelined converters. Third, although the 3-micron prototype converters did not reach video conversion
rates, such rates should be attainable in 1.5-2-micron CMOS technologies. Finally, continued research
into the design of S/H amplifiers is needed to achieve both video conversion rates and more than about
12-bit linearity in CMOS technologies.
Acknowledgements

I appreciate the support and guidance given by Professor Paul Gray, my research advisor, throughout this study. His advice was always excellent. I am also grateful to Professors David Hodges and Bob Brodersen for their help in both research and academic matters.

From the last generation of graduate students, Rinaldo Castello, Paul Hurst, and Lee-Chung Yiu got me off to a good start by patiently explaining the operation of amplifiers and A/D converters. Of the current generation, Joey Doernberg, Robert Kavaler, Bosco Leung, Sehat Sutarja, and C.-k. Wang most strongly influenced this project through discussions on subjects ranging from the testing of the prototype converters to the general design of CMOS circuits. In addition, Joey Doernberg and Alisa Scherer proofread parts of this thesis, and Bill Baringer and Phil Schrupp showed me how to use the image-processing lab to take a digitized photograph of a prototype converter.

Jim Mayfield, my roommate for two years, helped not only in writing papers and practicing talks, but also in maintaining a good balance between work and play.

This research was sponsored by both DARPA under contract number N00039-C-0107 and NSF under contract number DCI-8603430. Kodak supported me during one year of the work. MOSIS made the prototype converters. I appreciate all these contributions.

Finally, I thank my wife, Robin, who proofread my papers and listened to so many practice talks that she almost memorized the introduction to one of them.
## Table of Contents

Chapter 1 - Introduction  
1.1. Background and Motivation  
1.2. Thesis Organization  
Chapter 2 - Applications, Characterization, and Specifications  
2.1. Introduction  
2.2. Applications  
2.3. Characterization  
2.3.1. Static Characterization  
2.3.2. Dynamic Characterization  
2.3.2.1. Bandwidth  
2.3.2.2. Beat Frequency Test  
2.3.2.3. Dynamic Linearity  
2.3.2.4. Signal-to-Noise Ratio  
2.3.2.5. Noise Power Ratio  
2.3.2.6. Differential Gain and Phase  
2.4. Specifications  
2.4.1. Encoding of Composite Video Signals  
2.4.2. Real-Time Image Processing  
Chapter 3 - Architecture Review and Comparison  
3.1. Introduction  
3.2. Flash Architecture  
3.2.1. Advantages  
3.2.2. Limitations  
3.3. Subranging Architecture  
3.3.1. Advantages  
3.3.2. Limitations  
3.4. Pipelined Architecture  
3.4.1. Advantages  
3.4.2. Limitations  
3.5. Oversampling Architecture  
3.5.1. Advantages  
3.5.2. Limitations  
3.6. Performance Comparisons of Real Prototypes  
3.6.1. Speed Comparison  
3.6.2. Area Comparison  
6.4. Dependence of Test Results on Input Frequency  
6.4.1. Code-Density Test  
6.4.2. Signal-to-Noise Ratio Test  
6.5. Dependence of Test Results on Power Supply Voltages  
6.5.1. Code-Density Test  
6.5.2. Signal-to-Noise Ratio Test  
6.6. Dependence of Code-Density Test Results on Digital Correction  

Chapter 7 - Conclusion  
7.1. Summary of Research Results  
7.1.1. Comparison of A/D Conversion Architectures  
7.1.2. The Design of Pipelined A/D Converters  
7.2. Projected Performance in Scaled Technologies  
7.3. Extensions to Increased Resolution and Linearity  
7.4. Extensions to Increased Conversion Rate, Resolution, and Linearity  

References
Chapter 1 - Introduction

1.1. Background and Motivation

Analog-to-digital (A/D) conversion transforms a continuous-amplitude input into a discrete-amplitude output. In combination with a sampling process, which transforms a continuous-time input into a discrete-time output, such conversion composes the front end of some signal processing systems and allows the advantages associated with digital techniques to benefit the system performance. Although A/D conversion architectures have been studied since the 1950’s, the subject is still an active area of research because new technologies in which the converters can be implemented bring associated characteristics that sometimes change the best conversion techniques for particular applications. Of the many architectures, technologies, and applications that have been studied, this thesis examines the use of pipelined architectures in CMOS technologies for video-rate applications. The objective of this research is to determine whether pipelined architectures are suitable and advantageous compared to other architectures under these conditions.

Video-rate A/D converters, as well as many other electronic circuits, have evolved from discrete to modular to monolithic form in an effort to reduce their cost. In monolithic form, the cost of such conversion continues to decline through both the steady evolution toward denser and faster technologies and recent progress in circuit implementation techniques. Further developments that would allow not only an A/D converter (ADC), but also a digital signal processor (DSP), or other digital circuits to be integrated on one chip are desirable.

As has traditionally been the case, the highest conversion rates in A/D interfaces are achieved with fully parallel architectures\textsuperscript{1,2,3,4,5,6,7,8,9,10,11,12,13} integrated in the fastest possible bipolar technologies. This class of A/D converter can now do 8-bit conversions at a rate of several hundred Msamples/second (Ms/s)\textsuperscript{12,13} but uses a large die area both because bipolar component densities are low and because flash architectures require the most hardware of the architectures available to do high-speed conversion. Compared to the conversion rates of these converters, the range of conversion rates required
for the encoding of traditional video-rate signals (10-20 Ms/s) is slow. Important objectives in
the high-speed A/D conversion field are to use CMOS technologies,\textsuperscript{7,8,9,10,14,15} with their inherently
high component densities and very large scale integration (VLSI) compatibility, and circuit approaches
other than fully parallel\textsuperscript{14,15,16,17,18,19} to reduce the required A/D converter area while maintaining
enough speed for video-rate applications. Both subranging architectures\textsuperscript{14} and pipelined architectures\textsuperscript{15}
have been investigated for this application. Progress in this area can dramatically reduce the cost of the
video-rate A/D conversion function, as well as allow higher levels of integration in DSP functions such
as image processing.

1.2. Thesis Organization

An examination of the use of pipelined architectures for video-rate, CMOS A/D conversion is
presented in this thesis. Chapter 2 reviews the applications, characterization, and specifications of
video-rate A/D converters. In Chapter 3, flash, subranging, and pipelined architectures for such
conversion are reviewed and compared. The effects of amplifier settling time, D/A converter
nonlinearity, and other error sources on the linearity of pipelined A/D conversion architectures
are investigated in Chapter 4. Chapter 5 describes the design of a prototype of a pipelined A/D
converter in a 3-micron CMOS technology. Design problems and circuit solutions are examined.
Experimental results from the 3-micron
prototype converters are presented in Chapter 6. Conclusions and a summary of the research results are
given in Chapter 7.
Chapter 2 - Applications, Characterization, and Specifications

2.1. Introduction

Before any further description of video-rate A/D converters is given, the subjects of applications, characterization, and specification of such converters are reviewed. This chapter forms a basis for the discourse that follows in later chapters.

2.2. Applications

Video-rate A/D conversion has many applications, the most significant of which is as the front end of digital video systems. The function of such systems is to take an image as input and do some digital processing on the image; these systems can be classified according to whether their output is an image or some other representation of the input image.

The output of some imaging systems is an image that may or may not have been deliberately changed to subjectively improve its characteristics. For instance, image enhancement is commonly done with powerful computers to modify satellite and X-ray photographs to ease their interpretation. Many applications require real-time image processing. Because real-time image processing is expensive, however, the application of digital techniques in real-time systems has been limited. In television systems, for example, digital image processing has been used mostly in broadcast studios. Figure 2.1 shows that video coder/decoder (codec) circuits surround the processing and are used first to digitize the video signal and then, after digital processing, to reconstruct an analog output.

---

Figure 2.1 - Block diagram of a video codec surrounding a DSP

Input → ADC → DSP → DAC → Output

This arrangement has been used for more than 10 years to provide time-base correction and frame storage and synchronization. Video codecs are still too expensive to be commonplace in television receivers, however, and digitally produced features such as picture in picture (PIP), in which simultaneous multichannel viewing is possible, are only found in expensive televisions. If the cost of video codecs is reduced, video-rate A/D converters have future application in many imaging systems including standard color television, extended-quality television (EQTV), and high-definition television (HDTV).

The output of other real-time image processing systems, such as recognition systems, is not an image but some interpretation of the input image. There are many applications for such systems, including robotics and automated manufacturing. Although the output of these systems may contain less information than the input, high-speed A/D conversion and digital processing are still required for real-time operation. Therefore, these systems are also sensitive to the cost of the A/D conversion.

A key goal in both kinds of digital video systems is the development of inexpensive video-rate A/D converters.

2.3. Characterization

In this section, the characterization of video-rate A/D converters is presented. The subject is divided into two parts, static and dynamic characterization, based on the frequency of the input signal to the A/D converter. For DC and low-frequency input signals, the characterization is considered static in the sense that the input signal changes little, if at all, between consecutive samples. While static requirements have been standardized through the design of audio-rate A/D converters, such requirements are not enough to guarantee adequate video-rate A/D conversion. For input signal frequencies approaching half the sampling rate, the characterization is considered dynamic. Although dynamic characterization of video-rate A/D converters is not standardized, it is important because it is a closer approximation to real use than is static characterization.

2.3.1. Static Characterization

Static characterization of A/D converters involves a description of the transfer curve measured with a DC input signal. The transfer curve is a plot of the digital output on the y axis versus analog input on the x axis; both ideal and nonideal curves are shown for a 3-bit example in Figure 2.2. Ideally, the transfer curve resembles a staircase whose steps are spread equally from negative to positive full scale. The input voltages at which the digital output changes are called transition voltages. The voltage difference between two consecutive transition voltages is the width of a step; the widths of all steps in the ideal case are equal to one least significant bit (LSB). Assume that the code is offset binary; a negative full-scale input produces code 0, and a positive full-scale input produces code $2^N - 1$, where $N$ is the resolution. Nonidealities are described as offset, gain, and nonlinearity errors by drawing a characteristic line through a real curve.

Figure 2.2 - Example transfer curves for N=3-bit A/D converter: (a) ideal, with offset = 0; (b) nonideal. Each panel plots the digital output (y axis, with codes $2^N - 1 = 7$ and $2^{N-1} = 4$ marked) versus the analog input (x axis) and shows the characteristic line, whose slope is the gain.

The characteristic line is defined by two points: the intersection of code 1 and the transition voltage between codes 0 and 1 and the intersection of code $2^N - 1$ and the transition voltage between codes $2^N - 2$ and $2^N - 1$. The region along the x axis between these two points is divided into $2^N - 2$ steps, where the width of each step is 1 LSB, and each step corresponds to one code. The characteristic line is given by the following equation.

$$x(y) = \frac{y-1}{m} + V(1); \text{ for } y = 0, \ldots, 2^N - 1$$ (2.1)

where $y$ is the code number,

$x(y)$ is the input for code $y$,

$m$ is the slope,

$$m = \frac{2^N - 2}{V(2^N - 1) - V(1)}$$

and $V(y)$ is the transition voltage between codes $y$ and $y - 1$.

The slope of the characteristic line is the gain of the A/D converter. Because the converter is assumed to use an offset-binary output code, the offset of the A/D converter is the input that produces code $2^{N-1}$. This is about equal to the input voltage at the intersection between the characteristic line and $y = 2^{N-1}$.

$$\text{gain} = m$$

$$\text{offset} = V(2^{N-1})$$

$$\text{offset} \approx \frac{2^{N-1} - 1}{m} + V(1)$$
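The gain and offset expressions above can be sketched in a few lines of Python. This is only an illustration: the function name `characteristic_line`, the list layout of the transition voltages, and the 3-bit example values are assumptions made here, not part of the thesis.

```python
def characteristic_line(transitions, n_bits):
    """Return (gain, offset) of the characteristic line.

    transitions[k - 1] holds V(k), the transition voltage between codes
    k - 1 and k, for k = 1 .. 2**n_bits - 1 (an assumed layout).
    """
    last_code = 2 ** n_bits - 1
    if len(transitions) != last_code:
        raise ValueError("expected 2**n_bits - 1 transition voltages")
    v_first = transitions[0]               # V(1)
    v_last = transitions[last_code - 1]    # V(2**N - 1)
    gain = (2 ** n_bits - 2) / (v_last - v_first)   # slope m, in codes per volt
    mid_code = 2 ** (n_bits - 1)
    offset = (mid_code - 1) / gain + v_first        # x(2**(N-1)) from Equation 2.1
    return gain, offset


# Hypothetical ideal 3-bit converter: full scale -1 V to +1 V, so 1 LSB = 0.25 V.
ideal = [-0.75 + 0.25 * k for k in range(7)]
gain, offset = characteristic_line(ideal, 3)
```

For this ideal example the gain is 4 codes per volt and the offset is zero, as expected for a transfer curve centered on the origin.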

If the x axis is measured in LSBs instead of in Volts, the slope of the characteristic line is $1 \frac{\text{code}}{\text{LSB}}$, and the equation of the line takes a simple form.
\[ x(y) = y - 1 + V(1) \] (2.3)

Nonlinearity in the real transfer curve is described in comparison to the characteristic line, and is divided into two categories, differential and integral nonlinearity. Differential nonlinearity (DNL) is defined for each code as the deviation of the corresponding step width from 1 LSB.

\[ DNL(y) = V(y+1) - V(y) - 1 \text{ LSB}; \quad \text{for } y = 1, \ldots, 2^N - 2 \] (2.4)

\[ DNL(y) = 0; \quad \text{for } y = 0 \text{ and } 2^N - 1 \]

where \( y \) is the code number, and

\( V(y) \) is the transition voltage between codes \( y \) and \( y - 1 \), in LSBs.

Integral nonlinearity (INL) is defined for each code as the transition voltage between this code and the next lower code minus the input voltage corresponding to this code determined from the characteristic line.

\[ INL(y) = V(y) - x(y); \quad \text{for } y = 0, \ldots, 2^N - 1 \] (2.5)

where \( y \) is the code number,

\( V(y) \) is the transition voltage between codes \( y \) and \( y - 1 \), in LSBs; for \( y = 1, \ldots, 2^N - 1 \),

\[ V(0) = V(1) - 1 \text{ LSB}, \] and

\( x(y) \) is defined in Equation 2.3.

Note that the INL for any code can also be computed as the sum of the DNL for all smaller codes.

\[ INL(y) = \sum_{i=1}^{y-1} DNL(i) \] (2.6)

To prove this, substitute Equation 2.3 into Equation 2.5.

\[ INL(y) = V(y) - V(1) - (y - 1) \]

\[ INL(y) = V(2) - V(1) + V(3) - V(2) + \cdots + V(y) - V(y - 1) - (y - 1) \]

\[ INL(y) = \sum_{i=1}^{y-1} \left[ V(i + 1) - V(i) \right] - \sum_{i=1}^{y-1} 1 \]

\[ INL(y) = \sum_{i=1}^{y-1} DNL(i) \]
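As a minimal sketch of Equations 2.3 through 2.6, the following Python computes the DNL and INL from a list of transition voltages and lets the running-sum identity be checked numerically. The function name `dnl_inl` and the 3-bit transition values are hypothetical; the values are chosen so that $V(2^N - 1) - V(1) = 2^N - 2$ LSB, which is what a slope of 1 code/LSB requires.

```python
def dnl_inl(transitions_lsb):
    """Compute DNL and INL (in LSBs) per Equations 2.3 to 2.5.

    transitions_lsb[k - 1] holds V(k) in LSBs for k = 1 .. 2**N - 1, already
    scaled so that the characteristic line has a slope of 1 code/LSB.
    Returns two lists indexed by code y = 0 .. 2**N - 1.
    """
    n_codes = len(transitions_lsb) + 1
    # v[y] = V(y), with V(0) = V(1) - 1 LSB as defined for Equation 2.5.
    v = [transitions_lsb[0] - 1.0] + list(transitions_lsb)
    dnl = [0.0] * n_codes                      # DNL of the two end codes is defined as 0
    for y in range(1, n_codes - 1):
        dnl[y] = v[y + 1] - v[y] - 1.0         # Equation 2.4
    inl = [v[y] - (y - 1 + v[1]) for y in range(n_codes)]   # Equation 2.5 with 2.3
    return dnl, inl


# Hypothetical nonideal 3-bit converter, transition voltages in LSBs.
transitions = [0.5, 1.7, 2.4, 3.5, 4.6, 5.4, 6.5]
dnl, inl = dnl_inl(transitions)

# Equation 2.6: the INL of code y is the sum of the DNL of codes 1 .. y - 1.
running_sums = [sum(dnl[1:y]) for y in range(1, len(inl))]
```

Comparing `running_sums` with `inl[1:]` reproduces the identity proved above for this example.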

If the gain and offset are constant, they do not limit converter performance because, if necessary, their effects are eliminated by common gain-control and offset-nulling circuits on the system level. Since
video systems do not correct linearity errors, however, A/D converter nonlinearity limits video system performance. The nonlinearity of the converter is usually summarized as the maximum absolute DNL and INL. Because the INL is just the sum of DNL (see Equation 2.6), the maximum absolute DNL is equal to the largest change in the INL from one code to the next. Therefore, since two consecutive INL values can differ by at most twice the maximum absolute INL, the maximum absolute DNL is less than or equal to twice the maximum absolute INL. For example, if the INL is bounded between ±0.5 LSB, the DNL must be bounded between ±1 LSB. The converse may not hold because if the DNL does not alternate between positive and negative values, the INL will accumulate in one direction or the other. However, if the DNL is bounded between ±1 LSB, every step width is greater than 0 LSB and less than 2 LSB, and the A/D converter has a desirable characteristic: each code exists for some distinct input range (i.e., no missing codes).

2.3.2. Dynamic Characterization

Dynamic characterization, involving input-signal frequencies approaching half the sampling rate, is also important to measure the performance of video-rate A/D converters. Because these requirements are not standardized, various dynamic tests, ranging from none to application-oriented tests such as differential gain and phase, are often used. Several dynamic tests are presented below. Because no single test is able to entirely characterize the conversion, some combination of tests will have to be done.

2.3.2.1. Bandwidth

A conceptually simple method of dynamic characterization is to measure the large-signal bandwidth. The bandwidth must be much larger than the maximum input frequency to ensure a flat passband and to limit the difference between the group delay of high- and low-frequency input signals. This method has been used by itself to dynamically characterize at least one high-speed A/D converter. Although necessary, a high-enough bandwidth is not adequate proof of dynamic performance because it does not guarantee high-enough linearity or low-enough noise.
2.3.2.2. Beat Frequency Test

Another method of characterization is to examine the reconstructed output of the A/D converter while the input is a spectrally pure sine wave. For low-enough-frequency input, the amplitude of the output will change by an amount less than or equal to 1 LSB per sample. Under this condition, missing codes and other gross nonlinearities should be readily apparent with an oscilloscope. For high-frequency input, however, the output will often change by more than 1 LSB per sample, and such examination will be impossible. To overcome this limitation, the input frequency can be made slightly greater (or less) than the sampling frequency so that the input completes just more (or less) than one cycle between consecutive samples. The output of the converter will then be a quantized beat signal whose frequency is the difference between the input frequency and the sampling rate. The beat signal frequency can be made as small as desired, and a visual check for distortion should again be easy. Because the converter is not expected to be able to accurately digitize input signals with frequency greater than half the sampling rate, however, such a test is severe. To reduce the severity of the test, the input frequency can be made slightly less than one-half the input sampling rate. If the A/D converter output is sampled at one-half its input sampling rate, the beat frequency is the difference between the frequency of the sine wave and one-half the sampling rate. This method of generating the beat signal uses an input frequency for which the converter should achieve satisfactory performance.

The beat signal can also be examined analytically. For example, as described in the next two sections, it can be used to determine the linearity and signal-to-noise ratio of the converter for the input that produced the beat signal. If the results of these measurements are compared against those from when the frequency of the input signal is equal to the beat signal frequency, a measure of the associated reduction in the performance of the input sample-and-hold circuit is obtained.
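The beat-frequency relationships described in this section can be illustrated with a short Python sketch. The function names and the numbers (a 5-Ms/s converter with an input 10 kHz below half the sampling rate) are assumptions chosen for illustration, and the converter is treated as an ideal sampler with no quantization.

```python
import math


def beat_frequency(f_in, f_s, decimate_by_two=False):
    """Frequency of the beat signal seen in the reconstructed output.

    With decimate_by_two=False the input is near the sampling rate f_s;
    with decimate_by_two=True the input is near f_s / 2 and only every
    second output sample is kept.
    """
    reference = f_s / 2 if decimate_by_two else f_s
    return abs(f_in - reference)


def sampled_sine(freq, sample_rate, n_samples):
    """Samples of a unit sine wave of frequency freq taken at sample_rate."""
    return [math.sin(2 * math.pi * freq * k / sample_rate) for k in range(n_samples)]


# Hypothetical numbers: 5-Ms/s converter, input 10 kHz below half the sampling rate.
F_S = 5.0e6
F_IN = 2.49e6
f_beat = beat_frequency(F_IN, F_S, decimate_by_two=True)

# Keeping every second output sample is the same as sampling at F_S / 2.
decimated = sampled_sine(F_IN, F_S, 64)[::2]
# A sine at the beat frequency, sampled at F_S / 2, gives the same points with
# inverted sign because F_IN lies below F_S / 2.
beat = sampled_sine(f_beat, F_S / 2, 32)
```

In this sketch the decimated samples trace out a slow 10-kHz sine wave even though the input is at 2.49 MHz, which is what makes a visual check for distortion practical.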

2.3.2.3. Dynamic Linearity

The most obvious method of dynamic characterization of an A/D converter is the extension of the static linearity characterization to input frequencies up to half the sampling rate. A code-density test can be used for this purpose. Such a test involves making a histogram of the output codes obtained while
an input with a known probability density function is applied. The histogram is normalized by the density function, and the DNL and INL are calculated. Because the test is statistical in nature, all output codes do not have to be collected. Instead, a random selection of output codes can be used; therefore, the histogram hardware need not operate in real time. However, because the test is statistical, it cannot be used to check for monotonicity, noise, or linearity dependence on the direction in which the input signal changes. Furthermore, although the test measures the DNL well, it has difficulty in measuring the INL because so many samples are required for accurate measurement that, by the time the measurement is complete, test conditions that affect the INL (such as the reference voltage) may have changed.
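A code-density test with a sine-wave input can be sketched as follows. This is one possible formulation, not necessarily the procedure used for the prototype: it assumes a centered sine wave that slightly overdrives the converter, so that the fraction of samples below a voltage $V$ is $\frac{1}{2} + \frac{1}{\pi}\arcsin(V/A)$ and each transition voltage can be recovered from the cumulative histogram. The converter model, the amplitude, and all names are hypothetical.

```python
import bisect
import math


def code_density_dnl(codes, n_bits):
    """Estimate DNL (in LSBs) from output codes for a sine-wave input.

    Assumes a centered sine wave that slightly exceeds full scale, so the
    cumulative histogram maps to transition voltages through
    V / A = -cos(pi * fraction of samples below V).
    """
    n_codes = 2 ** n_bits
    hist = [0] * n_codes
    for c in codes:
        hist[c] += 1
    total = len(codes)
    # Estimated V(1) .. V(2**N - 1), in units of the sine amplitude A.
    transitions = []
    below = 0
    for k in range(1, n_codes):
        below += hist[k - 1]
        transitions.append(-math.cos(math.pi * below / total))
    # 1 LSB as defined by the characteristic line through V(1) and V(2**N - 1).
    lsb = (transitions[-1] - transitions[0]) / (n_codes - 2)
    dnl = [0.0] * n_codes
    for y in range(1, n_codes - 1):
        dnl[y] = (transitions[y] - transitions[y - 1]) / lsb - 1.0
    return dnl


# Hypothetical 4-bit converter, full scale -1 V to +1 V (1 LSB = 0.125 V),
# with the transition V(8) shifted up by 0.2 LSB.
N_BITS = 4
true_transitions = [-0.875 + 0.125 * k for k in range(15)]
true_transitions[7] += 0.2 * 0.125

# Sine wave that slightly overdrives the converter, sampled at evenly spread phases.
AMPLITUDE = 1.05
M = 100_000
codes = [
    bisect.bisect_right(true_transitions,
                        AMPLITUDE * math.sin(2 * math.pi * (k + 0.5) / M))
    for k in range(M)
]
dnl = code_density_dnl(codes, N_BITS)
```

With the shifted transition, code 7 becomes about 0.2 LSB too wide and code 8 about 0.2 LSB too narrow, and the histogram estimate should recover both values closely. Evenly spread phases are used here only to make the sketch deterministic; a real test would rely on a large number of samples instead.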

2.3.2.4. Signal-to-Noise Ratio

Signal-to-noise ratio (SNR) characterization involves testing the output that the A/D converter produces while the input is a spectrally pure sine wave. The ratio, in decibels, of the output power at the input signal frequency to the output power for all other frequencies between DC and half the sampling rate is the SNR. When greater than about 10 dB, the SNR is about equal to the ratio of the measured output power before the bandstop filter in Figure 2.3 to that after the filter.

---

**Figure 2.3 - SNR test configuration**

---

If the A/D converter is perfectly linear and if the changes in input amplitude are much larger than the quantization step width, it can be shown that the quantization noise power is \( \frac{q^2}{12} \), where \( q \) is the quantization step width and is equal to 1 LSB. Such noise is ideally independent of both the level and frequency of the input as long as the level does not overload the converter but is big enough to randomize the output codes. Note that inputs whose amplitude is similar to \( q \) can also be encoded with a noise power of \( \frac{q^2}{12} \) by superimposing on them a dither signal, a large-amplitude signal that is removed after quantization. For a fixed input frequency, the SNR thus increases directly with the input level until the overload point is reached. The maximum SNR occurs for a full-scale sine wave input. Let the peak-to-peak amplitude of a full-scale sine wave be \( 2A \); its mean square amplitude is then \( \frac{A^2}{2} \). For N-bit resolution, the magnitude of the quantization step width is: \( q = \frac{2A}{2^N} \). The familiar form of the maximum SNR is now derived for an ideal A/D converter with N-bit resolution.

\[
\text{maximum SNR} = \frac{\frac{A^2}{2}}{\frac{q^2}{12}}
\]

\[
= \frac{\frac{12A^2}{2}}{\frac{(2A)^2}{2^{2N}}}
\]

\[
= 1.5 \left( 2^{2N} \right)
\]

After taking the log of the above expression and multiplying by 10, it is seen that the maximum SNR, in dB, equals:

\[ \text{maximum SNR} = 6.02N + 1.76 \text{ dB} \] (2.7)
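Equation 2.7 can be checked numerically with a short Python sketch. The first function evaluates the closed-form ratio derived above; the second, which is an assumed simulation set-up rather than anything taken from the thesis, quantizes a full-scale sine wave with an ideal mid-rise quantizer and compares the signal power with the power of the quantization error.

```python
import math


def max_snr_db(n_bits):
    """Maximum SNR of an ideal N-bit converter for a full-scale sine wave."""
    return 10 * math.log10(1.5 * 2 ** (2 * n_bits))


def simulated_snr_db(n_bits, n_samples=50_000):
    """Quantize a full-scale sine wave and compare signal and error power."""
    amplitude = 1.0
    q = 2 * amplitude / 2 ** n_bits           # step width, 1 LSB
    signal_power = 0.0
    noise_power = 0.0
    for k in range(n_samples):
        # An irrational frequency ratio spreads the samples over all phases.
        x = amplitude * math.sin(2 * math.pi * k * (math.sqrt(2) - 1))
        # Mid-rise quantizer: output levels sit at the centers of the steps.
        level = math.floor(x / q) * q + q / 2
        level = max(-amplitude + q / 2, min(amplitude - q / 2, level))
        signal_power += x * x
        noise_power += (level - x) ** 2
    return 10 * math.log10(signal_power / noise_power)


ideal_8bit = max_snr_db(8)
simulated_8bit = simulated_snr_db(8)
```

For 8 bits the closed form gives about 49.9 dB, and the simulated value should lie close to it; small differences are expected because the quantization error of a sine wave is only approximately uniform.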

In addition to measuring the maximum SNR, the SNR can be measured as a function of both the root-mean-squared (RMS) input level and frequency, and the results can be summarized on a graph of SNR on the y axis versus RMS input level on the x axis with input signal frequency as a parameter. Figure 2.4 shows such a graph for an ideal 8-bit A/D converter. Ideally, for constant input frequency, the SNR decreases by 6 dB per octave for decreasing RMS input signal level until the input reaches the noise floor. The ratio, in dB, of the level of a full-scale input to the level of the noise floor is the dynamic
range. In real A/D converters, however, owing to nonlinearity effects, the slope of the curve may be less than 6 dB per octave. Also, although the SNR is ideally independent of the input frequency, in real A/D converters, the maximum SNR decreases with increasing input frequency because of limitations in the sample-and-hold circuit.

Figure 2.3 shows that a digital-to-analog (D/A) converter can be used to test the A/D converter. Here, so that the A/D converter limits the test results, the static and dynamic characteristics of the D/A converter (DAC) must be at least four times better than those of the A/D converter. To eliminate the effect of the D/A converter, the output of the A/D converter can be analyzed digitally. Such analysis
involves storing the output codes obtained while a spectrally pure sinusoidal input is applied. The output codes are transformed from the time domain into the frequency domain by a discrete Fourier transform (DFT). The power at the fundamental frequency of the sine wave is considered signal power, and the power at all other frequencies above DC and up to and including half the sampling frequency is considered noise power. Since only a finite number of output codes are stored, this procedure is equivalent to taking the transform of a sequence of output codes first multiplied by a rectangular window function equal to one for the sampled codes and zero otherwise. The transform of such a rectangular window function is a frequency-sampled $\frac{\sin \pi fT}{\pi fT}$ function, where $f$ is the frequency, and has non-zero side lobes at frequencies other than multiples of the sampling frequency; see Figure 2.5.

![Figure 2.5 - Rectangular window and its spectrum](image)

Therefore, unless the window contains an integral number of periods of the input sine wave, the side lobes will distort the sampled output spectrum. To reduce this side-lobe distortion, the output samples
can be multiplied by another window function before the transformation. A Hanning window is usually selected for this purpose. Because such a window widens the main lobe in the output spectrum slightly, the spectral power at a few frequencies on both sides of the input frequency must be included in the signal power.
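
The DFT-based SNR measurement and the effect of the window can be illustrated with a short numerical sketch. It is not the test software used in this work: the ideal quantizer, the 512-point record, the choice of 67 and 67.5 input periods, and the six signal bins are all assumptions made for illustration, and the function names are hypothetical.

```python
import cmath
import math


def quantize(x, bits):
    """Ideal quantizer for inputs in [-1, 1]; returns the reconstructed level."""
    levels = 2 ** bits
    code = min(max(math.floor((x + 1.0) / 2.0 * levels), 0), levels - 1)
    return (code + 0.5) * 2.0 / levels - 1.0


def one_sided_power(samples):
    """Power in DFT bins 0..n/2 (n even), computed with a direct DFT."""
    n = len(samples)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        weight = 1.0 if k in (0, n // 2) else 2.0  # fold in negative frequencies
        power.append(weight * abs(acc) ** 2)
    return power


def snr_db(samples, signal_bins):
    """Signal power is the power in signal_bins; noise is every other bin above DC."""
    power = one_sided_power(samples)
    signal = sum(power[k] for k in signal_bins)
    noise = sum(power[1:]) - signal
    return 10.0 * math.log10(signal / noise)


def sampled_sine(cycles, n, bits):
    """Quantized full-scale sine with `cycles` periods in an n-point record."""
    return [quantize(math.sin(2.0 * math.pi * cycles * i / n), bits)
            for i in range(n)]


def hanning(samples):
    """Multiply the record by a Hanning window."""
    n = len(samples)
    return [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
            for i, s in enumerate(samples)]


N_POINTS, BITS = 512, 8

# Integral number of periods in the window: the rectangular window is harmless.
coherent = snr_db(sampled_sine(67, N_POINTS, BITS), [67])

# Non-integral number of periods (67.5): compare windows over the same signal bins.
leaky = sampled_sine(67.5, N_POINTS, BITS)
bins = range(65, 71)
rectangular = snr_db(leaky, bins)
windowed = snr_db(hanning(leaky), bins)

print(f"coherent: {coherent:.1f} dB, rectangular: {rectangular:.1f} dB, "
      f"Hanning: {windowed:.1f} dB")
```

With an integral number of periods, the result should lie near the ideal 8-bit value of about 50 dB. With 67.5 periods, the rectangular window leaks a large fraction of the signal power into the noise bins, and the Hanning window recovers most of the lost SNR once the bins next to the input frequency are counted as signal.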

The measured A/D converter noise consists of both intrinsic quantization error and extrinsic harmonic distortion, where the distortion arises from nonlinearity. The total harmonic distortion (THD) can be computed from the DFT output for a full-scale sinusoidal input by dividing the output power at the fundamental frequency by the sum of the output power at all other harmonics of the input.

\[
\text{THD} = 10 \log_{10} \frac{p_1}{\sum_{k=2}^{\infty} p_k}
\]

where \( p_k \) is the output power at the \( k \)th harmonic.

In practice, harmonics with frequency greater than half the sampling frequency are aliased back into the base band, and all harmonics with amplitude greater than the noise floor should be included in the summation. If nonlinearity limits the maximum SNR, the THD equals the maximum SNR.
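
A minimal sketch of the THD computation, including the folding of aliased harmonics into the base band, is given here. The cubic nonlinearity, its coefficient, the 64-point record, and the limit of five harmonics are hypothetical choices, and the helper names are not taken from any test program described in this thesis.

```python
import cmath
import math


def bin_power(samples, k):
    """Power of a single DFT bin k."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
              for i, s in enumerate(samples))
    return abs(acc) ** 2


def aliased_bin(k, n):
    """Fold DFT bin k into the base band 0..n/2."""
    k %= n
    return n - k if k > n // 2 else k


def thd_db(samples, fundamental_bin, harmonics=5):
    """Fundamental power divided by the summed power of harmonics 2..harmonics."""
    n = len(samples)
    p1 = bin_power(samples, fundamental_bin)
    distortion = sum(bin_power(samples, aliased_bin(h * fundamental_bin, n))
                     for h in range(2, harmonics + 1))
    return 10.0 * math.log10(p1 / distortion)


# Hypothetical test case: cubic nonlinearity y = x + a3 * x**3 with a3 = 0.01.
n, cycles, a3 = 64, 27, 0.01
x = [math.sin(2.0 * math.pi * cycles * i / n) for i in range(n)]
y = [v + a3 * v ** 3 for v in x]
print(f"THD = {thd_db(y, cycles):.1f} dB")
```

Here the third harmonic falls at bin 81, beyond half the sampling frequency, and is found at bin 17 after folding. Because $\sin^3\theta = (3\sin\theta - \sin 3\theta)/4$, the expected result is $20 \log_{10}(1.0075/0.0025)$, or about 52.1 dB.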

SNR characterization overcomes a limitation that occurs in measuring INL with a code-density test; that is, because it takes many fewer samples to measure the SNR than the INL, the SNR test is not as sensitive to changes in test conditions as is the code-density test. Through the THD, the SNR test can be used to estimate the maximum absolute INL.

\[
\text{maximum } |\text{INL}| \approx 2^{\,N - \frac{\text{THD}}{6}} \text{ LSB}
\]

The SNR test, however, gives little information about DNL and missing codes because the SNR is not measured as a function of the DC offset of low-level sinusoidal inputs. While this connection could be made, the DNL is efficiently measured using the code-density test.

2.3.2.5. Noise Power Ratio

Like the SNR test, the noise power ratio (NPR) is a measure of the amount of noise added to a signal by the A/D converter under specified conditions. Instead of applying a deterministic input to the converter as in the SNR test, the NPR test requires a random input whose amplitude for all time follows a Gaussian probability density function. Figure 2.6 shows the NPR test configuration; the NPR is the ratio, in dB, of two measures of the total output power in one given frequency band, the slot power.

First, the slot power is measured while the Gaussian input contains this band; second, the slot power is measured with this band filtered out of the input. The slot power in the second case is caused by quantization error and nonideality in the A/D converter.

The NPR can be measured as a function of the RMS input noise level with slot frequency as a parameter, where the RMS input noise level is the standard deviation, $\sigma$, of the Gaussian input; see Figure 2.7. For small $\sigma$, the probability that the input exceeds full-scale at any time is negligible. Here, the slot power consists mostly of quantization error power and is nearly independent of the input noise level; therefore, the slope of a NPR versus RMS input noise level curve is about 6 dB per octave in this region. As $\sigma$ increases, however, the probability of clipping increases. When clipping occurs, it produces intermodulation products that, along with the quantization noise, contribute to the noise floor. Such distortion becomes significant when the probability of clipping becomes greater than about $10^{-5}$, corresponding to $\sigma = \frac{1}{4.42} V_{FS}$, where the input range is $\pm V_{FS}$. The noise power ratio curves thus reach their peak values for an input level of about -12.9 dB. Although the peak NPR occurs for such a small input level, NPR testing places dynamic stress on the A/D converter by moving the input both randomly and rapidly. Because Gaussian noise with suitable bandwidth models the characteristics of a frequency-division-multiplexed (FDM) signal, NPR testing is widely used to characterize the transmission of such signals.\textsuperscript{26}

In this sense, it is an application-oriented specification and is not considered further in this thesis.
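
The clipping figures quoted for the NPR test can be checked numerically. The sketch assumes a zero-mean Gaussian input and expresses the RMS level relative to $V_{FS}$; both function names are hypothetical.

```python
import math


def clipping_probability(sigma, v_fs=1.0):
    """Probability that a zero-mean Gaussian input lies outside +/- v_fs."""
    return math.erfc(v_fs / (sigma * math.sqrt(2.0)))


def rms_level_db(sigma, v_fs=1.0):
    """RMS input noise level in dB relative to v_fs."""
    return 20.0 * math.log10(sigma / v_fs)


sigma = 1.0 / 4.42
print(f"P(clip) = {clipping_probability(sigma):.2e}, "
      f"level = {rms_level_db(sigma):.1f} dB")
```

For $\sigma = V_{FS}/4.42$, the probability evaluates to roughly $10^{-5}$ and the level to about -12.9 dB, in agreement with the values given above.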

2.3.2.6. Differential Gain and Phase

Differential gain and phase are also application-oriented measurements used for testing systems processing standard color television pictures.

A color image consists of three components: brightness, hue, and saturation. The brightness is the apparent intensity; the hue is the observed color; the saturation is the degree to which the observed color is pure and not diluted by white. The brightness determines the variation of gray from black to white and is referred to in television applications as the luminance. Together, the hue and saturation determine the
degree to which a color differs from a gray with the same brightness and are referred to in television applications as the chrominance. The composite video signal consists of both the luminance and the chrominance.

According to the National Television System Committee (NTSC) standards, the part of the frequency spectrum assigned to television broadcasting is allocated in 6-MHz increments to each broadcasting station. Figure 2.8 shows the envelope of the spectrum of a NTSC channel.

![Figure 2.8 - Frequency spectrum of a NTSC channel](image)

First, the saturation and hue respectively modulate the amplitude and the phase of a single color subcarrier within the composite video signal at 3.579545 MHz. Then the composite video signal modulates the amplitude of a picture carrier at 1.25 MHz, and the sound modulates the frequency of a sound carrier whose frequency is 4.5 MHz above that of the picture carrier.\textsuperscript{28,29}

Figure 2.9 shows the detailed spectrum of the demodulated luminance and chrominance signals; the spectrum of the composite video signal is the superposition of the luminance and chrominance spectrums. As a result of horizontal sweep sampling, the luminance consists of discrete energy components at harmonics of the line-scanning frequency, 15.734 kHz, or equivalently, at even harmonics of one-half the line-scanning frequency. For adequate color bandwidth and to make color television transmission compatible with black-and-white television reception, the color subcarrier frequency was chosen to be 455
times one-half the line-scanning frequency; therefore, the chrominance consists of discrete energy components at odd harmonics of one-half the line-scanning frequency. If the television image is stationary, owing to vertical sweep sampling, discrete sidebands separated by the frame rate (60 Hz) occur in bunches around both luminance and chrominance components.

Because the saturation and hue simultaneously modulate the amplitude and phase of the color subcarrier, the color quality in the NTSC system is sensitive to both amplitude and phase distortions of this carrier. In particular, since the chrominance rides on top of the luminance, it is important that both the amplitude and phase of the chrominance signal have no more than a weak dependence on the amplitude of the luminance. Differential gain is the percentage change in the chrominance amplitude, and differential phase is the difference in the chrominance phase, both as a function of specified large changes in the luminance.\textsuperscript{30}

As described above, the luminance and chrominance are separated from each other in frequency space. In a perfectly linear system, they will be independent of each other. Nonlinearity anywhere in the system, however, causes intermodulation and allows interaction between the luminance and
chrominance. While video-rate A/D converters are often characterized by differential gain and phase measurements, dynamic linearity measurements are completely adequate and more basic in their revelations about the characteristics of the converter; therefore, differential gain and phase are not considered further in this thesis.

2.4. Specifications

The requirements on the A/D converters for standard color television systems and real-time video processing systems are presented below.

2.4.1. Encoding of Composite Video Signals

Standard practice in digitizing a NTSC color signal is to encode the composite video signal into pulse-code modulation (PCM) form, in which the signal is sampled periodically, quantized into discrete values, and represented as a series of pulses in a binary code. A PCM NTSC picture consists of about 220,000 picture elements (pixels) with about 450 pixels per line and about 485 unblanked lines.

The bandwidth of the composite video signal is about 4.2 MHz. Owing to the discrete nature of the spectrum of the composite video signal, with comb filters, sub-Nyquist encoding of NTSC signals is possible. To avoid the complicated filtering operation needed to overcome aliasing caused by sub-Nyquist encoding and to cope with the frequency responses of real anti-alias filters, however, a conversion rate of at least 10 Ms/s is needed.

It turns out, furthermore, that the visibility of beat patterns between the quantizing noise and the color signal on the television screen is reduced if the signal is sampled at an integer multiple of the color subcarrier frequency. To avoid aliasing, the multiple is greater than 2. Because the color subcarrier frequency, $f_c$, is an odd multiple of one-half the line frequency, if the video signal is sampled at 3 times the subcarrier frequency, the sampling rate, $f_s$, becomes an odd multiple of one-half the line frequency. The number of samples in each group of 2 sequential lines of the television picture is therefore odd, and the number of samples per line is not constant; instead it alternately increases and decreases by one sample per line. Figure 2.10a shows that, as a result, samples in sequential lines are misaligned. Here, unless the
samples are somehow vertically aligned, comb filters that separate the luminance and chrominance are difficult to build.\textsuperscript{33} One way to vertically align the samples is to use phase alternating line encoding (PALE),\textsuperscript{34} in which the encoding phase is reversed from line to line. On the other hand, if the video signal is sampled at 4 times the color subcarrier frequency, the sampling rate becomes an even multiple of the line frequency, and samples in all lines of the television image are vertically aligned; see Figure 2.10b. Here, comb filters that separate the luminance and chrominance signals are easy to build. In summary, while sampling at 3 times the color subcarrier frequency requires the lowest bit rate, sampling at 4 times the color subcarrier frequency requires the simplest comb filter. Both sampling rates are used in practice;\textsuperscript{35} therefore, the conversion rate is either 10.7 or 14.3 Ms/s for NTSC systems. For Phase Alternating Line (PAL) and Séquentiel Couleur à Mémoire (SECAM) systems, for which the color subcarrier frequency is about 4.43 MHz,\textsuperscript{36} the corresponding conversion rate is either 13.3 or 17.7 Ms/s.
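
The arithmetic behind these sampling rates can be sketched as follows. The number of samples per line uses only the NTSC relationship stated earlier, in which the color subcarrier is 455 times one-half the line-scanning frequency; the function names are hypothetical, and the PAL figure is the approximate 4.43 MHz quoted above.

```python
from fractions import Fraction

HALF_LINE_MULTIPLE = 455  # NTSC subcarrier = 455 x (line frequency / 2)


def samples_per_line(multiple):
    """NTSC samples per line when sampling at `multiple` times the subcarrier."""
    return Fraction(multiple * HALF_LINE_MULTIPLE, 2)


def conversion_rate_ms(subcarrier_hz, multiple):
    """Conversion rate in Ms/s."""
    return multiple * subcarrier_hz / 1e6


NTSC_SUBCARRIER_HZ = 3.579545e6
PAL_SUBCARRIER_HZ = 4.43e6  # approximate value

for multiple in (3, 4):
    per_line = samples_per_line(multiple)
    aligned = per_line.denominator == 1  # whole number of samples in every line
    print(multiple, float(per_line), "aligned" if aligned else "misaligned",
          round(conversion_rate_ms(NTSC_SUBCARRIER_HZ, multiple), 1),
          round(conversion_rate_ms(PAL_SUBCARRIER_HZ, multiple), 1))
```

Sampling at 3 times the subcarrier gives 682.5 samples per line, so two sequential lines share an odd total of 1365 samples and the samples cannot line up vertically. Sampling at 4 times the subcarrier gives exactly 910 samples in every line.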

Although 8-bit resolution is usually used to quantize a composite video signal into one of 256 levels, subjective testing has shown that this application requires about 7-bit resolution.\textsuperscript{37,38} About 7-bit linearity, 40 dB of signal-to-noise ratio (SNR) for full-scale input signals, and 40 dB of dynamic range (DR) are also required. It is important that the A/D converter attain this performance for input signals of any magnitude up to and including full-scale over the entire video band, DC to 4.2 MHz; therefore, the input sample-and-hold operation must also achieve this level of performance.

2.4.2. Real-Time Image Processing

Requirements for other real-time image processing systems (such as automated manufacturing, EQTV, HDTV, medical imaging, motion recognition, pattern recognition, robotics, special effects, and teleconferencing) are much more varied than for standard television, but have as a common factor conversion rates in the 5-100-Ms/s range. For example, in the image-processing lab at Berkeley, a pattern recognition system uses conversion rates of both 5 and 10 Ms/s.\textsuperscript{22} Also, according to the Advanced Television Systems Committee (ATSC) standards, HDTV systems with 1125 lines/frame will use a sampling rate of 74.25 MHz.\textsuperscript{39} In some applications, resolution greater than 8 bits is required. For example, in digital video studios, where high picture fidelity is required, 9-bit or 10-bit resolution is required to allow extensive digital signal processing of the video image without SNR degradation.\textsuperscript{40} Also, in pattern recognition systems searching for details in the presence of widely different amounts of ambient light, more than 8-bit resolution is required to achieve wide dynamic range without analog automatic gain control (AGC). The key point is that 8-bit resolution and 15-MHz conversion rate are not the ultimate specifications for video-rate A/D conversion.

Chapter 3 - Architecture Review and Comparison

3.1. Introduction

Currently, there are three conversion architectures that have been used in CMOS technologies to try to meet the video-rate objectives. In chronological order, they are: flash, subranging, and pipelined architectures. In this chapter, these architectures are reviewed and their theoretical and real performances are compared against each other and against the video-rate specifications. Also, oversampling architectures are considered for this application but shown to require too high sampling rates for current CMOS technologies.

3.2. Flash Architecture

One architecture for a N-bit A/D conversion compares the input simultaneously to $2^N - 1$ linearly graded references. Because the comparisons are made at evenly spaced references in the same interval of time, such a conversion is said to use a parallel or flash architecture. Figure 3.1 shows a block diagram of a N-bit flash A/D converter. It consists of a resistor string, a bank of latched comparators, and a digital encoding circuit. The resistor string divides the reference into $2^N$ regions of equal potential difference, and provides the voltages at the boundaries of these regions as reference voltages to the bank of comparators. The analog input is also connected to every comparator. With their outputs unlatched, the comparators track the difference between the input and their references. All the comparators are then latched simultaneously to digitize the input. Because the comparator outputs are only high for comparators whose references are less than the input, the comparator bank outputs form a thermometer code in which the high outputs rise only to the reference nearest to but less than the input. To complete the conversion, the digital outputs of the comparators are decoded from thermometer code to binary.
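
The operation of the flash architecture can be summarized in a behavioral sketch. It models ideal comparators and an ideal resistor string; decoding the thermometer code by counting the high outputs is an assumption made for brevity and is not necessarily how an actual encoder is built. The function name and the 3-bit example are hypothetical.

```python
def flash_convert(vin, bits, vref=1.0):
    """Behavioral N-bit flash conversion of an input in [0, vref)."""
    # Resistor string: 2**N - 1 evenly spaced reference voltages.
    taps = [k * vref / 2 ** bits for k in range(1, 2 ** bits)]
    # All comparators are latched at once; high where the reference is below the input.
    thermometer = [1 if vin > tap else 0 for tap in taps]
    # Thermometer-to-binary decoding, modeled here as a count of the high outputs.
    return thermometer, sum(thermometer)


thermometer, code = flash_convert(0.3, bits=3)
print(thermometer, code)
```

For an input of 0.3 with a unit reference, only the comparators referenced to 0.125 and 0.25 are high, so the 3-bit output code is 2.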

3.2.1. Advantages

The primary advantage of the flash conversion architecture is its high conversion rate. By pipelining the digital decoding operation, it occurs while another input is sampled and digitized; therefore, only
2 clock phases are required per conversion, corresponding to the latched and unlatched states of the comparators. Furthermore, because the pipelined information is entirely digital, it can be rapidly transferred to 1-bit accuracy between registers. The speed of this architecture is therefore only limited by the speed of the comparators and logic.

If the reference is divided by a resistor string, the divided reference exhibits inherent monotonicity; that is, the reference voltage between any point on the string and the end with the lowest voltage is a non-decreasing function of increasing distance between the two points. The transfer curves of resistor string based flash converters can therefore be made monotonic. This is an important advantage in control
systems in which nonmonotonic behavior can cause oscillations but unimportant in most video-rate applications.

3.2.2. Limitations

Although flash architectures usually yield the highest conversion rates, they tend to have large area, power dissipation, and input capacitance. Because flash converters require one comparator per LSB, all three quantities are proportional to $2^N$, where $N$ is the resolution. This exponential relationship between its key parameters and resolution limits the resolution of flash converters.\textsuperscript{41} For example, in discrete and modular form, flash converters have been used for a long time with 4-bit or less resolution. While monolithic techniques have extended the limit on resolution to 10 bits,\textsuperscript{3} in CMOS technologies, the resolution is currently limited to no more than 8 bits.\textsuperscript{42}

In a flash converter, the offsets of the comparators pose limits to the performance of the converter. The offset is a random variable whose mean represents the systematic component of the offset present in all the comparators and whose standard deviation represents the random component of the offset and accounts for variations in the offsets between different comparators. While nonzero mean causes only an input-referred converter offset, nonzero standard deviation causes nonlinearity. For maximum $DNL < 0.5 \, \text{LSB}$ and maximum $INL < 0.5 \, \text{LSB}$, the variations in the offsets of all the comparators in a flash converter must be between $\pm 1/2 \, \text{LSB}$. To meet this requirement in CMOS technologies, the offsets on the comparators are usually canceled by closing a feedback loop around a gain stage in the comparator and storing a voltage about equal to the offset on a capacitor. Figure 3.2 shows such a circuit and a transfer curve of a gain stage with nonzero offset and finite gain. With the feedback loop closed, the input of the gain stage must equal the output, and a line through the origin of the transfer curve with a slope of 1 is drawn to represent this equality; therefore, the voltage stored on the capacitor, $V_c$, is determined by the intersection of this line and the transfer curve.

Figure 3.2 - Offset cancellation technique

\[ V_c = \left( \frac{A}{1+A} \right) V_{\text{off}} \] (3.1)

where \( A \) is the open loop gain and \( V_{\text{off}} \) is the offset.

When the feedback loop opens, if the left side of the capacitor is used as the input to the offset-canceled gain stage, the capacitor is in series with the gain stage; therefore, if other effects are ignored, the amount of uncanceled offset, \( V_{\text{uncanceled}} \), is:

\[ V_{\text{uncanceled}} = \frac{V_{\text{off}}}{1+A} \] (3.2)

The offset is canceled to the extent that it is small and the gain is large.

In high-speed comparators, the random offset tends to be large. The random portion of the offset has a component that is proportional to the overdrive, $V_{gs} - V_t$, of the input transistors in the comparator. The transition frequency, $f_T$, of any transistor in a CMOS process is also proportional to the overdrive. Because high-speed circuits need to use transistors with large $f_T$, they have a random offset component that is inherently larger than that of corresponding low-speed circuits.

Also, in high-speed comparators, the gain tends to be small. To prevent oscillation while the feedback loop is closed, the magnitude of the gain is usually decreased as a function of increasing frequency at 6 dB/octave; therefore, the gain stages within high-speed offset-canceled comparators usually each have one pole within the feedback loop. While the gain of each stage is limited by the one-pole frequency response, because the maximum gain-bandwidth product (GBW) is constant for a gain stage with one pole, the gain needed for adequate offset cancellation can be obtained at the expense of bandwidth to some extent; however, high bandwidth is also needed for video comparators. In flash converters that do not use an input S/H amplifier, the comparators must track the input accurately before they are latched. Because the bandwidth of a gain stage in a comparator is limited, the outputs of the gain stages are shifted in phase. If the bandwidth is greater than about 50 MHz, the change in the phase will be less than about 5°; this is a necessary condition for acceptable NTSC color television viewing. The GBW is at most limited by $f_T$; in the MOSIS 3-micron CMOS process, $f_T = 800\,\text{MHz}$ for a n-channel transistor with 0.5 V of overdrive and $f_T = 400\,\text{MHz}$ for a corresponding p-channel transistor. With a bandwidth of 50 MHz, the gain will therefore be less than about 10, and no more than 90% of the inherently high offset can be canceled by a circuit such as the one in Figure 3.2.
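
The numbers in this argument follow directly from Equations 3.1 and 3.2 and can be reproduced with a few lines. The sketch assumes the limiting case in which the GBW equals $f_T$; the 10-mV offset and the function names are hypothetical.

```python
def stored_voltage(v_off, gain):
    """Equation 3.1: voltage stored on the capacitor with the loop closed."""
    return gain / (1.0 + gain) * v_off


def uncanceled_offset(v_off, gain):
    """Equation 3.2: offset remaining after the loop opens."""
    return v_off / (1.0 + gain)


def max_gain(f_t_hz, bandwidth_hz):
    """Upper bound on the gain of a one-pole stage, assuming GBW = f_T."""
    return f_t_hz / bandwidth_hz


for name, f_t in (("n-channel", 800e6), ("p-channel", 400e6)):
    print(name, max_gain(f_t, 50e6))

v_off = 10e-3  # hypothetical 10-mV offset
print(stored_voltage(v_off, 10.0) / v_off)  # fraction canceled at a gain of 10
```

At a bandwidth of 50 MHz the bound on the gain is 16 for the n-channel device and 8 for the p-channel device, which brackets the figure of about 10 used above; a gain of 10 cancels about 91% of the offset.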

For all these reasons, offset cancellation at video rates is difficult. Although CMOS offset cancellation techniques have overcome these problems at an 8-bit level, the accuracy with which the offset can be canceled at video rates may cause a limitation at higher resolutions. The key point is that, for video-rate conversion, an important limitation of the flash architecture is that it requires offset cancellation.

Because of the difficulty of realizing an operational amplifier (op amp) in CMOS technologies that is fast enough to drive the inherently large input load, flash converters usually do not use an input S/H
amplifier, causing another limitation. When each comparator must do its own S/H operation, differences in the instantaneous inputs of each comparator at the sampling moment or in the sampling moments themselves introduce additional distortion into the conversion. The equation of a sine wave input, \( V_i \), with amplitude, \( A \), and angular frequency, \( \omega_i \), is:

\[
V_i = A \sin (\omega_i t). 
\] (3.3)

For small amounts of time skew, the extra error power, \( E_p \), is:

\[
E_p = \frac{A^2 \omega_i^2 \sigma_t^2}{2}, 
\] (3.4)

where \( \sigma_t \) is the standard deviation of the time skew in the sampling instants.

The average power in a sinusoidal input is \( \frac{A^2}{2} \). If all other noise sources are ignored, using Equation 3.4, the allowable \( \sigma_t \) for a SNR of 40 dB is:

\[
\sigma_t \leq \frac{0.01}{\omega_i}. 
\] (3.5)

For the highest frequency video signal, 4.2 MHz, \( \sigma_t \leq 378 \text{ps} \).
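
As a check on this bound, the allowable time skew can be evaluated for any SNR target and input frequency. The function name is hypothetical; the expression is Equation 3.4 divided into the signal power $A^2/2$.

```python
import math


def max_time_skew(snr_db, f_in_hz):
    """Largest sigma_t (seconds) allowed by Equation 3.4 for a given SNR."""
    omega = 2.0 * math.pi * f_in_hz
    # SNR = (A**2 / 2) / E_p = 1 / (omega * sigma_t)**2
    return 10.0 ** (-snr_db / 20.0) / omega


print(f"{max_time_skew(40.0, 4.2e6) * 1e12:.1f} ps")
```

For 40 dB at 4.2 MHz this evaluates to just under 379 ps, which matches the figure quoted above to within rounding.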

A similar result follows from a more qualitative calculation. The maximum slope of a full-scale sinusoidal input at any time is:

\[
\frac{d(V_i)}{dt} \bigg|_{\text{max}} = A \omega_i. 
\] (3.6)

If the input is full scale,

\[
1 \text{ LSB} = \frac{2A}{2^N}, 
\] (3.7)

where \( N \) is the resolution.

The minimum time that the input takes to change in amplitude by 1 LSB, \( t_1 \), is:

\[
t_1 = \frac{1}{2^{N-1} \omega_i} \text{ seconds}. 
\] (3.8)

For 8-bit resolution and a 4.2-MHz input, \( t_1 \geq 296 \text{ps} \). Thus if the timing jitter is less than 296 ps in an
otherwise ideal 8-bit A/D converter, the amplitude error in sampling a sinusoidal input with any frequency up to 4.2 MHz and any amplitude up to full scale is less than 1 LSB. The conversion is then accurate to at least 7 bits, and the corresponding maximum SNR is about 42 dB.
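
Equation 3.8 is simple enough to evaluate directly; the sketch below does so for the 8-bit, 4.2-MHz case. The function name is hypothetical.

```python
import math


def lsb_crossing_time(bits, f_in_hz):
    """Equation 3.8: minimum time for a full-scale sine to move by 1 LSB."""
    omega = 2.0 * math.pi * f_in_hz
    return 1.0 / (2 ** (bits - 1) * omega)


print(f"{lsb_crossing_time(8, 4.2e6) * 1e12:.0f} ps")
```

Each additional bit of resolution halves this time, so the timing requirement tightens exponentially with resolution at a fixed input frequency.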

Although it is possible to obtain 40 dB of SNR for a full-scale sine wave input at 4.2 MHz without a S/H amplifier in a CMOS flash converter, the key point is that the difficulty of using a dedicated S/H amplifier is an important disadvantage of the flash architecture. With a dedicated S/H amplifier, the entire issue of timing jitter between the sampling instants of the many comparators is eliminated.

Although flash architectures provide the highest conversion rates of all the architectures under consideration, they are often limited by reduced performance at high input signal frequencies.

3.3. Subranging Architecture

The limitations of the flash architecture are caused by the exponential relationship between its key parameters and resolution. For resolutions less than about 4 bits, the area, power dissipation, and input capacitance are at manageable levels; therefore, multistage architectures, such as the subranging architecture, that use low-resolution flash converters as building blocks, can overcome these limitations.

Subranging A/D converters are a subclass of successive approximation converters in the sense that the conversion takes place in a sequence of comparison operations each of which results in the elimination of a group of subregions of the conversion range from further examination about whether they contain the input. In successive approximation converters, the search is usually binary, so that after each comparator decision, the number of remaining subregions that must be tested is reduced by a factor of two. Consequently, the successive approximation technique usually requires N intervals of time where each interval involves one comparison operation, and N is the resolution of the conversion. As a result, successive approximation architectures are not usually fast enough for video-rate applications. In subranging A/D converters, on the other hand, the number of bits determined on each cycle of operation is greater than 1 but less than N; therefore, subranging architectures represent a compromise between the flash and successive approximation techniques.
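
For comparison, the binary search used in a successive approximation converter can be sketched as follows. The model is behavioral and assumes an ideal comparator and D/A converter; the function name is hypothetical.

```python
def sar_convert(vin, bits, vref=1.0):
    """Behavioral successive approximation conversion of an input in [0, vref)."""
    code = 0
    comparisons = 0
    for bit in reversed(range(bits)):      # most significant bit first
        trial = code | (1 << bit)
        comparisons += 1                   # one comparison per interval of time
        if vin >= trial * vref / 2 ** bits:
            code = trial                   # keep the bit; upper half contains the input
    return code, comparisons


print(sar_convert(0.3, 8))
```

Each pass through the loop halves the region that can contain the input, so an N-bit conversion takes N comparison intervals.
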
Figure 3.3 shows a conceptual block diagram of a 2-step, N-bit subranging A/D converter.

![Block Diagram of a 2-step, N-bit subranging A/D converter](image)

It consists of a sample-and-hold block, two N/2-bit A/D subconverters, a N/2-bit D/A converter, and a subtractor. To begin a conversion, the input is sampled and held. The held input is then converted into a digital code by the coarse A/D subconverter and back into an analog signal by a D/A converter. The D/A converter output is subtracted from the held input, producing a residue that is digitized by the fine A/D converter. After the digital outputs from both subconverters are decoded into binary, the decoded outputs from the second stage are appended to those from the first stage to produce a N-bit binary output.
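
The sequence of operations in the 2-step converter can be written as a behavioral sketch. All blocks are assumed ideal, the resolution is assumed even, and the function name is hypothetical.

```python
def subranging_convert(vin, bits, vref=1.0):
    """Behavioral 2-step conversion of an input in [0, vref); bits must be even."""
    half = bits // 2
    # Coarse N/2-bit A/D subconversion of the held input.
    coarse = min(int(vin / vref * 2 ** half), 2 ** half - 1)
    # N/2-bit D/A conversion of the coarse code.
    dac_out = coarse * vref / 2 ** half
    # Subtraction produces the residue, which spans one coarse step.
    residue = vin - dac_out
    # Fine N/2-bit A/D subconversion of the residue, with a step of 1 LSB.
    fine = min(int(residue / vref * 2 ** bits), 2 ** half - 1)
    # The fine bits are appended to the coarse bits.
    return (coarse << half) | fine


print(subranging_convert(0.3, 8))
```

Because the residue is not amplified here, the fine subconverter must resolve steps of one LSB of the full converter.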

3.3.1. Advantages

The primary advantage of the subranging conversion architecture is that it requires less hardware than a flash architecture. Although Figure 3.3 shows two separate A/D subconverters, in CMOS technologies, these functions can be merged together in the actual implementation. For example, one resistor string with $2^N$ elements may be used to divide the reference for both of the A/D subconversions and the D/A conversion. An array of $2^{N/2} - 1$ voltage comparators may be time shared between the two A/D functions. Figure 3.4 shows how this may be done for one offset-canceled comparator with a 4-phase, nonoverlapping clock. After the input is sampled at the end of $\phi_1$, the comparators are referenced through transmission gates to $2^{N/2} - 1$ coarsely spaced taps on the resistor string during $\phi_2$ to decide the $N/2$ most significant bits at the end of this phase. During $\phi_3$, the appropriate subset of finely spaced D/A converter output levels is selected; this operation is not shown in the figure. Finally, the inputs of the comparators are referenced to these D/A output levels through transmission gates during $\phi_4$ to encode the remaining bits at the end of this phase. For 8-bit resolution, for example, this corresponds to 16 comparators (one for overflow) as opposed to one comparator in the successive approximation case. This number is much smaller, however, than the 256 comparators required in the flash case. Even if two separate comparator arrays are used, only $2^{N/2+1} - 1$ comparators are required, which is equal to 31 for 8-bit resolution. In either case, the decrease in the number of comparators from that required in a flash converter causes corresponding reductions in the area and input capacitance of subranging converters.

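
The comparator counts quoted in this comparison can be tabulated for any even resolution. The breakdown is an assumption chosen to reproduce the 8-bit figures in the text: each count includes one overflow comparator, and the case with two separate arrays is taken as two arrays of $2^{N/2} - 1$ comparators plus one for overflow. The function name is hypothetical.

```python
def comparator_counts(bits):
    """Comparators for flash, time-shared subranging, and two-array subranging."""
    flash = (2 ** bits - 1) + 1                 # one per reference, plus overflow
    shared = (2 ** (bits // 2) - 1) + 1         # one time-shared array, plus overflow
    separate = 2 * (2 ** (bits // 2) - 1) + 1   # two arrays, plus overflow
    return flash, shared, separate


for n in (8, 10):
    print(n, comparator_counts(n))
```

For 8 bits this gives 256, 16, and 31 comparators respectively.
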
Another advantage of the subranging architecture is that it does not require the use of an op amp. Because high-speed op amps are difficult to realize, a common goal in the design of subranging A/D converters is to avoid using op amps. To meet this goal, the S/H function is embedded into the A/D subconversions as in a flash converter; see Figure 3.4, in which the input is sampled at the end of $\phi_1$. Since a subranging architecture requires fewer comparators than a flash architecture, the problem of sampling jitter between comparators is correspondingly reduced. Furthermore, because subranging converters use so few comparators and need no op amps, their power dissipation can be the smallest of the architectures under consideration.

3.3.2. Limitations

Although the ability to operate without an op amp is a key advantage of subranging architectures, the absence of op amps causes some limitations. Without op amps, it is impossible to build parasitic-insensitive S/H amplifiers. Without interstage S/H amplifiers, the current sample must propagate through the entire converter before a new conversion can begin. In a 2-stage subranging converter, 4 clock phases per conversion are required, one each for sampling, coarse A/D subconversion, D/A conversion, and fine A/D subconversion. Two-step subranging converters are thus about half as fast as flash converters. Furthermore, such subranging converters require 2 additional clock phases per additional stage. One extra phase is for D/A conversion and the other is for another A/D subconversion. Extra stages therefore reduce the maximum conversion rate, and in currently available CMOS technologies, video-rate subranging converters are limited to only two stages. This means that their area, power dissipation and input capacitance are proportional to $2^{N/2}$ and still exponentially related to the resolution.

Not only does the absence of op amps in a subranging converter mean that there are no interstage S/H amplifiers, but it also means that there is no input S/H amplifier. As mentioned in the last section, the S/H function must then be merged into the A/D subconversions. Although the problem of sampling jitter between comparators is diminished from that in a flash converter because of the reduction in the number of comparators, the problem is not eliminated. Also, for accurate conversions over the entire input range, all the S/H circuits must work accurately. Because accurate high-frequency sampling is a difficult
operation to do even once per sample, the probability of multiple S/H circuits working well enough is less than that of just one dedicated S/H circuit working well enough. Therefore, without op amps, subranging converters still have difficulty in sampling high-frequency input signals.

If there is no op amp in a subranging converter, the interstage gain is equal to one, and the effect of nonlinearity in the second stage is unattenuated on the linearity of the entire conversion. Therefore, as in a flash converter, the variations in the offsets of all the comparators in a subranging converter without an op amp must be between ±1/2 LSB. The need for offset cancellation here is unchanged from that in a flash converter, and the associated speed tradeoffs are still problematic.

3.4. Pipelined Architecture

It has been shown above that multistage conversion architectures lessen some limitations of flash architectures by reducing the total number of comparators. An enhanced multistage approach iterates the conversion in hardware with interstage S/H amplifiers so that the stages can work concurrently on samples and residues of different inputs; this approach is called pipelining. Pipelining eliminates the need for the converter to wait for all the stages to finish their part in processing the present input before another input can be sampled. Using a pipelined mode of operation in multistage architectures makes the maximum conversion rate almost independent of the number of stages. Pipelined configurations have been previously applied in high-performance, board-level converters, but had not been applied to monolithic, CMOS A/D converters because of the difficulty of realizing high-speed, S/H gain functions in CMOS technologies.

Figure 3.5 shows a block diagram of a general pipelined A/D converter with k stages. Here the total resolution, N, is the sum of the stage resolutions:

\[
N = \sum_{i=1}^{k} n_i,
\]

where \( n_i \) is the resolution of stage \( i \).

As in a subranging converter, each stage contains a low-resolution A/D subconverter, a low-resolution D/A converter, and a subtractor. The new elements in a pipelined architecture are the S/H amplifiers that partition the pipeline into stages. Although each S/H amplifier consists of one circuit that provides sampling, holding, differencing, and amplifying functions, to simplify the explanation of the operation of pipelined converters, Figure 3.5 shows three separate blocks to do these operations: a S/H block, a subtractor, and an amplifier.

To begin a conversion, the input is sampled and held. As in a subranging converter, the held input is then converted into a digital code by the first-stage A/D subconverter and back into an analog signal by the first-stage D/A converter. The D/A converter output is subtracted from the held input, producing a residue that is amplified and sent to the next stage, where this process is repeated. At this point, another input is sampled, and while the first stage processes the new input, the second stage processes the amplified residue from the first stage. Once the pipeline is full, each stage processes the amplified residue from the previous stage, and the residues correspond to successively sampled inputs. Because sequential stages simultaneously work on residues from successively sampled inputs, the digital outputs of each stage shown in Figure 3.5 correspond to inputs sampled at different times. To align the digital outputs in time so that all the outputs correspond to one sample of the input, extra digital registers are required. For example, Figure 3.6 shows a block diagram of a 2-stage, pipelined A/D converter with these registers.

![Block diagram of a 2-stage pipelined A/D converter](image)

After the digital outputs are aligned in time and converted to binary form, the outputs are appended together to produce an N-bit binary output.
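The concurrent operation and time alignment described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: every stage is ideal, inputs are normalized to the range [-1, 1), each interstage gain equals 2 raised to the stage resolution, and no digital correction is used. The names `convert_stage` and `pipelined_adc` are hypothetical and do not come from the thesis.

```python
from collections import deque


def convert_stage(held, n_bits):
    """One ideal stage: flash subconversion, D/A conversion, subtraction, gain.

    Assumes the held input lies in the normalized range [-1, 1).
    """
    levels = 2 ** n_bits
    code = min(int((held + 1.0) * levels / 2), levels - 1)
    dac = (code + 0.5) * 2.0 / levels - 1.0   # D/A level centered in the code bin
    residue = (held - dac) * levels           # assumed interstage gain of 2**n_bits
    return code, residue


def pipelined_adc(samples, stage_bits):
    """Clock an ideal pipeline and return one N-bit code per input sample."""
    k = len(stage_bits)
    held = [None] * k                         # value held at each stage input
    # Stage i gets k-1-i alignment registers so all codes refer to one sample.
    registers = [deque([None] * (k - 1 - i)) for i in range(k)]
    outputs = []
    for x in list(samples) + [None] * (k - 1):    # extra clocks flush the pipeline
        held[0] = x
        residues = [None] * k
        aligned = []
        for i, n_bits in enumerate(stage_bits):
            code = None
            if held[i] is not None:
                code, residues[i] = convert_stage(held[i], n_bits)
            registers[i].append(code)
            aligned.append(registers[i].popleft())
        if all(c is not None for c in aligned):
            word = 0
            for code, n_bits in zip(aligned, stage_bits):
                word = (word << n_bits) | code    # append the stage codes
            outputs.append(word)
        held = [None] + residues[:-1]             # pass residues to the next stage
    return outputs
```

In this model, a new sample enters on every clock regardless of the number of stages, and the first aligned output appears k-1 clocks after the first sample, which mirrors the latency-for-throughput trade described in the text.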

3.4.1. Advantages

A primary advantage of the pipelined architecture is its high throughput rate. The high throughput rate of the pipelined architecture stems from concurrent operation of the stages. At any time, the first stage operates on the most recent sample, while the next stage operates on the residue from the previous sample, and so forth. If the A/D subconversions and D/A conversions are done with flash converters, a pipelined architecture only needs two clock phases per conversion, the same as a flash architecture; however, the pipelined information here is analog and more time is required to generate and transfer the
analog residue with enough accuracy than to transfer the pipelined digital information with 1-bit accuracy in the flash converter. Although flash architectures therefore make the fastest converters, pipelining a multistage architecture makes the maximum conversion rate almost independent of the number of stages. This allows a degree of freedom not permitted in the design of video-rate subranging converters; that is, the number of stages in a pipelined, multistage converter can be greater than two without reducing the maximum conversion rate.

Another primary advantage of the pipelined architecture is its small required area and consequent manufacturing cost. The area of pipelined converters is small compared to those of flash converters because pipelined converters require fewer comparators than flash converters. For example, the 9-bit prototype pipelined converter described in Chapter 5 uses 28 comparators and requires a core area of 8500 square mils in a 3-micron, CMOS technology. If a 9-bit flash converter could be built in the same technology, it would use 512 comparators, and based on the area required by existing flash converters, it would be more than ten times larger than the pipelined prototype. Not only is the area small for pipelined converters but also it is linearly related to the resolution because, if the necessary accuracy can be achieved through calibration or trimming, the resolution can be increased by adding stages to the end of the pipeline without increasing the number of clock phases required per conversion. In contrast, flash and subranging architectures need exponential, rather than linear, increases in area to increase their resolution and also require trimming or calibration for greater than 8-bit or 9-bit linearity; therefore, while the areas of pipelined and subranging converters are similar for about 8-bit or 9-bit resolution, pipelined architectures will require much less area than subranging architectures for increased resolution.
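The comparator counts quoted above can be checked with a short calculation. This is a rough sketch: it assumes every subconverter is a flash converter with one comparator per decision level, and it assumes that the 28-comparator prototype is split into four 3-bit stages. That stage split is an assumption made here because it reproduces the quoted count; it is not stated in this section.

```python
def flash_comparators(n_bits):
    """Comparators in an n-bit flash converter: one per decision level."""
    return 2 ** n_bits - 1


def multistage_comparators(stage_bits):
    """Total comparators when each stage uses a flash subconverter."""
    return sum(flash_comparators(n) for n in stage_bits)


# Assumed stage split (hypothetical): four 3-bit stages.
prototype = multistage_comparators([3, 3, 3, 3])
flash_9_bit = flash_comparators(9)
print(prototype, flash_9_bit)
```

A 9-bit flash converter needs 511 decision levels in this count, which is approximately the 512 comparators quoted in the text. Adding one more 3-bit stage adds only 7 comparators, whereas each extra bit in a flash converter doubles the count.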

Other advantages of the pipelined architecture stem from the use of S/H amplifiers to isolate the stages. First, because a S/H amplifier can also be used on the input of the A/D converter, pipelined architectures can accurately sample high-frequency input signals. Second, the interstage gains from these amplifiers diminish the effects of nonidealities in all stages after the first stage on the linearity of the entire conversion; furthermore, this allows the converter to use a digital correction technique with which
nonlinearities in the A/D subconversions and offsets in both comparators and op amps have little or no effect on the overall linearity. The effects of interstage gain and digital correction on reducing the sensitivity of pipelined converters to error sources are examined in Chapter 4.

3.4.2. Limitations

Because interstage gain diminishes the effects of error sources in all stages after the first, and because with ideal D/A converters, digital correction eliminates the effect of A/D subconverter nonlinearity, the D/A converter in the first stage determines the linearity of the entire A/D converter. Such D/A converters can be implemented with resistor strings for linearity in the 8-bit to 9-bit range. For integral linearity greater than 9 bits, the design of such a D/A converter is not trivial and requires either calibration or trimming.

The main disadvantage of pipelined A/D converters is that they require the use of op amps to realize parasitic-insensitive, S/H amplifiers to do analog subtraction and amplification at the sampling rate of the A/D converter. Although the S/H amplifiers improve many aspects of the converter performance, the op amps within the S/H amplifiers limit the speed of the pipelined converters. The 3-micron, CMOS prototype described in Chapter 5 is able to do these functions at 5 Ms/s. The maximum speed of such processing increases in scaled technologies, and video conversion rates should be achievable in 1.5-2-micron CMOS technologies.

3.5. Oversampling Architecture

In the conversion architectures presented so far, the input is sampled and quantized with N-bit resolution at a rate greater than but close to the Nyquist frequency. In oversampling A/D architectures, on the other hand, the input is sampled and quantized with less than N-bit resolution at a rate that is many times larger than the Nyquist frequency. The ratio of the sampling frequency of the oversampled architecture to that of the other architectures is called the oversampling ratio. Because averaging the samples reduces the error made per sample, the resolution required to obtain about 6N dB of SNR is less than N bits. Figure 3.7 shows a PCM coder and the corresponding output noise power spectrum.

The output noise spectral density, \( N(f) \), is constant between \( \pm \frac{f_s}{2} \):

\[
N(f) = \frac{q^2}{12f_s} \tag{3.10}
\]

where \( q \) is the magnitude of the quantization step and \( f_s \) is the sampling frequency.

For Nyquist sampling, the total in-band noise, \( N_T \), is:

\[
N_T = \int_{-f_s/2}^{f_s/2} N(f) \, df = \frac{q^2}{12}. \tag{3.11}
\]

From Equation 3.10, it can be seen that increasing the sampling frequency spreads the total quantization noise, \( N_T \), out over an increased bandwidth. Using a low-pass filter whose bandwidth is \( f_{BW} \), the out-of-band noise is rejected, and the corresponding in-band noise, \( N_{BW} \), becomes:

\[
N_{BW} = \int_{-f_{BW}}^{f_{BW}} N(f) \, df = \frac{q^2}{12} \frac{2f_{BW}}{f_s}. \tag{3.12}
\]

Although simple oversampling and filtering of the quantization process therefore yields only 3 dB of improvement in SNR for each doubling of the oversampling ratio, using feedback and integration around the quantization process increases the effect of oversampling by making the noise spectral density an increasing function of frequency so that the in-band noise component is further reduced.

Figure 3.8 shows a block diagram of a first-order, sigma-delta modulator. It consists of an S/H block, an integrator, a 1-bit A/D converter, a 1-bit D/A converter, and a subtractor. As a result of the negative feedback, the digital output oscillates in such a way that its average is equal to the average input. If the quantization noise is uncorrelated with the input, the A/D converter can be modeled by a linear gain, A, and an added noise error, \( e \). This replacement is made in Figure 3.9, which shows a sampled-data model of the modulator at discrete time, \( n \). The difference equation that describes the output, \( o_n \), is:
\[ o_n = i_n + e_n - e_{n-1} \tag{3.13} \]

where \( i_n \) is the input at time, \( n \).
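The behavior of the modulator can be illustrated with a short behavioral simulation. This is a sketch under assumptions that are not in the text: the quantizer gain is taken as 1, the two D/A levels are +1 and -1, the input magnitude is less than 1, and the integrator starts at zero. The function name `first_order_sigma_delta` is hypothetical.

```python
def first_order_sigma_delta(inputs):
    """Simulate an assumed first-order loop with a 1-bit quantizer.

    Returns the output sequence and the quantization error sequence,
    where error = output - integrator value (quantizer gain assumed 1).
    """
    integrator = 0.0
    feedback = 0.0            # previous D/A output, zero before the first sample
    outputs, errors = [], []
    for x in inputs:
        integrator += x - feedback                  # integrate input minus fed-back output
        out = 1.0 if integrator >= 0.0 else -1.0    # 1-bit A/D followed by 1-bit D/A
        errors.append(out - integrator)
        outputs.append(out)
        feedback = out
    return outputs, errors


# A constant input of 0.3: the average of the output bit stream approaches 0.3.
outputs, errors = first_order_sigma_delta([0.3] * 4000)
average = sum(outputs) / len(outputs)
```

With this loop structure, each output sample equals the input plus the first difference of the quantization error, which is the relation expressed by Equation 3.13.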

The z-transform of the noise transfer function, \( \frac{O(z)}{E(z)} \), is:

\[ \frac{O(z)}{E(z)} = 1 - z^{-1}. \tag{3.14} \]

In the z-plane, the noise transfer function therefore has a pole at the origin and a zero at \( z = 1 \). To find the frequency response, let \( z = e^{j\omega T} \), where \( T \) is the sample period and \( \omega = 2\pi f \) is the angular frequency. The squared magnitude of the frequency response, \( \left| \frac{O^2(f)}{E^2(f)} \right| \), is:

\[ \left| \frac{O^2(f)}{E^2(f)} \right| = 4 \sin^2 \left( \frac{\omega T}{2} \right) = 4 \sin^2(\pi f T). \tag{3.15} \]

The sigma-delta modulator changes the frequency response of the quantization noise so that most of the noise falls out of the baseband. If the baseband is much smaller than the sampling frequency, after integration as shown in Figure 3.10, the average output noise power in the baseband turns out to be proportional to \( \left( \frac{f_{BW}}{f_s} \right)^3 \). For a first-order modulator, the noise in the baseband is therefore reduced by 9 dB per octave increase in the sampling frequency instead of by 3 dB per octave when only oversampling is applied. Although this technique of adding an integrator and a feedback loop increases the noise reduction by 6 dB per octave each time it is applied, more than two loops can cause instability problems and are not considered further here.

Figure 3.10 - Output noise power spectrum of first-order sigma-delta modulator
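The 3 dB and 9 dB per octave figures can be checked numerically by integrating the noise spectral density over the baseband. The sketch below is only illustrative: it normalizes the sampling frequency and the white-noise density to 1, uses a simple midpoint rule, and picks an oversampling ratio of 64 as an arbitrary example. The function names are hypothetical.

```python
import cmath
import math


def ntf_power(f_over_fs):
    """Squared magnitude of 1 - z**-1 evaluated on the unit circle."""
    z = cmath.exp(2j * math.pi * f_over_fs)
    return abs(1 - 1 / z) ** 2


def inband_noise(fbw_over_fs, shaped, steps=2000):
    """Midpoint-rule integral of the noise density from -f_BW to +f_BW."""
    df = 2.0 * fbw_over_fs / steps
    total = 0.0
    for k in range(steps):
        f = -fbw_over_fs + (k + 0.5) * df
        density = ntf_power(f) if shaped else 1.0
        total += density * df
    return total


def db_per_octave(shaped, oversampling_ratio=64):
    """Reduction of in-band noise when the oversampling ratio is doubled."""
    before = inband_noise(0.5 / oversampling_ratio, shaped)
    after = inband_noise(0.5 / (2 * oversampling_ratio), shaped)
    return 10.0 * math.log10(before / after)


plain = db_per_octave(shaped=False)    # close to 3 dB per octave
shaped = db_per_octave(shaped=True)    # close to 9 dB per octave
```

The shaped result approaches 9 dB per octave only when the baseband is much smaller than the sampling frequency, which is the same condition stated in the text.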

The modulator generates a bit stream of ones and zeroes at the oversampling rate and pushes most of the quantization noise out of the baseband. To convert into PCM form, the output of the modulator is passed through a digital low-pass filter, from which the output is resampled at the down-sampled frequency, $f_{ds}$. Figure 3.11 shows such an arrangement.

Figure 3.11 - Block diagram of an oversampled A/D converter

While keeping the passband ripple acceptably low, the filter smooths the modulator output, attenuates the out-of-band quantization noise, and decimates the sample rate. Owing to these many requirements, the filter is often composed of at least two stages, where the first stage is a finite impulse response (FIR) filter and does most of the decimation and the second stage is an infinite impulse response (IIR) filter and compensates for droop in the passband introduced by the FIR filter.

3.5.1. Advantages

Oversampling A/D conversion requires simple analog circuitry and is naturally compatible with digital circuit technology and with the associated advantages of scaling. For a 1-bit quantizer in a first-order modulator, the analog circuits consist of one switched-capacitor integrator and one regenerative comparator. In a single-ended version, of the two capacitors required in the integrator, only the sampling capacitor must be as linear as the conversion; there is no linearity requirement on the integrating capacitor. Furthermore, because any two points can represent a straight line, the 2-level D/A converter is inherently linear. Another important advantage is that the anti-alias requirement is reduced by the oversampling ratio.

3.5.2. Limitations

While the advantages of oversampling stem from the high sampling rate, the main limitation in using oversampling for video-rate applications is that the required oversampling rates are prohibitively high. Assume that a 1-bit quantizer provides a peak SNR of about 6 dB. To reach a SNR of 40 dB with 3 dB of noise reduction per octave from straightforward oversampling, about 12 octaves of oversampling are required. For a down-sampled video frequency of 10 MHz, the input sampling rate must be about 40 GHz and is too high to be useful. With one feedback loop and the associated 9 dB per octave noise reduction, about 4 octaves of oversampling are required. This corresponds to an input sampling rate of about 160 MHz and a sampling period of about 6 ns and is still too fast for present CMOS technologies. Although second-order modulation increases the noise reduction to 15 dB per octave of oversampling, because the associated SNR peaks for input magnitude about 6 dB less than full scale, and owing to an
offset of about 10 dB in the SNR curve introduced here, the effect of oversampling on the SNR is reduced by about 16 dB; therefore, about 3.5 octaves of oversampling and a sampling rate of about 110 MHz are required, which is still too fast for present CMOS technologies. Furthermore, such second order modulators require more complicated analog front ends and digital filters than their first order counterparts, and since they reduce the required sampling rate by only a small amount for low SNR applications, interest in their use in many video-rate applications is diminished.
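The sampling-rate estimates in this discussion follow from simple arithmetic, sketched below. The helper names are hypothetical. The functions return unrounded values, whereas the text rounds the octave counts up (to about 12, 4, and 3.5 octaves).

```python
def octaves_required(target_snr_db, start_snr_db, db_per_octave, penalty_db=0.0):
    """Octaves of oversampling needed to raise the SNR to the target.

    penalty_db models an SNR loss that must also be recovered, such as the
    roughly 16 dB quoted for the second-order modulator.
    """
    return (target_snr_db - start_snr_db + penalty_db) / db_per_octave


def input_sampling_rate(down_sampled_hz, octaves):
    """Each octave of oversampling doubles the input sampling rate."""
    return down_sampled_hz * 2.0 ** octaves


plain = octaves_required(40, 6, 3)               # about 11.3 octaves
first_order = octaves_required(40, 6, 9)         # about 3.8 octaves
second_order = octaves_required(40, 6, 15, 16)   # about 3.3 octaves

rate_plain = input_sampling_rate(10e6, 12)       # about 40 GHz
rate_first = input_sampling_rate(10e6, 4)        # 160 MHz
rate_second = input_sampling_rate(10e6, 3.5)     # about 110 MHz
```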

If instead of a 1-bit A/D converter, a multi-bit flash quantizer is used, the magnitude of the quantization noise, \( q \), is reduced before noise shaping by the increased resolution. While using a multi-bit A/D converter in Figure 3.8 means that the D/A converter must have the same resolution as the quantizer and linearity at least as high as the corresponding peak SNR, a resistor-string-based D/A converter can meet these requirements for about 8-bit or 9-bit linearity, as mentioned in the discussion of pipelined A/D converters; therefore, a multi-bit quantizer can be a useful element in reducing the required oversampling ratio for many video-rate applications. The use of multiple feedback loops or multi-bit quantizers, however, increases the accuracy requirements on the analog blocks within the oversampled modulators and diminishes their compatibility with digital technologies. For example, with a 4-bit quantizer, the magnitude of the quantization noise is reduced by 18 dB from that of a 1-bit quantizer, and for one feedback loop, the required input sampling rate decreases by 2 octaves to about 40 MHz. Although this rate is too high for 3-micron CMOS technologies, because it is probably within range of scaled technologies, oversampling architectures are of potential interest in future video-rate applications.

3.6. Performance Comparisons of Real Prototypes

In the remainder of this chapter, the performances of published CMOS A/D converters with conversion rates of at least 5 Ms/s and resolutions of at least 7 bits are reviewed. Seven converters are found to meet these requirements. Their performances are compared against each other and against the video-rate specifications.

3.6.1. Speed Comparison

Figure 3.12 shows a scatter plot of the conversion rate on the y axis versus the resolution on the x axis.

![Figure 3.12 - Scatter plot of conversion rate vs. resolution](image)

Seven points are plotted according to published test data; each point represents one converter and is identified by authors, institution, and the minimum feature size of the technology.

All seven converters have enough resolution to encode composite video signals. There are two groups according to conversion rate, one around 20 Ms/s and the other around 5 Ms/s. All the points in the higher rate group represent flash converters, and both points in the lower rate group represent multi-stage converters. In the 5-Ms/s group, the 8-bit point represents a subranging converter, and the 9-bit point represents a pipelined converter. The conversion rates of both multi-stage converters are inadequate for most video-rate applications; however, both are implemented in 3-micron CMOS technologies, and they should be able to achieve video rates in scaled technologies. Furthermore, the pipelined converter demonstrates the ability to attain more than 8-bit resolution at a high conversion rate. While the conversion rates of the flash converters are all high enough for video-rate applications, none has more than 8-bit resolution and only those reported by Hitachi, NEC, and Plessey are able to achieve at least 40
dB of SNR or equivalent over the entire 4.2-MHz NTSC video band. Because even these flash converters only barely attained this specification, the difficulty that flash architectures have in accurately sampling high frequency input signals is confirmed. In contrast, as will be shown in Chapter 6, the pipelined prototype achieved about 50 dB of SNR over the entire video band.

3.6.2. Area Comparison

To attain the high conversion rates, the flash converters use large areas. Figure 3.13 shows another scatter plot for the same chips with area on the y axis and resolution on the x axis.

---

Figure 3.13 - Scatter plot of area vs. resolution

To predict required areas for different resolutions, area proportional to $2^N$ and normalized to the data is drawn for each of the 3 different minimum feature sizes used for the flash converters. Area proportional to $2^{N/2}$ is drawn through the subranging point and area proportional to $N$ is drawn through the pipelined point.

Although the multi-stage architectures have 8-to-9-bit resolution instead of 7-to-8-bit resolution and use 3-micron CMOS technologies instead of scaled technologies, the multi-stage architectures require much less area than the flash architectures. For the same resolution and technology, it is this large difference in area that generates interest in multi-stage circuit techniques for video-rate A/D conversion; furthermore, for increasing resolution, the area gap between pipelines and other architectures widens exponentially.

3.6.3. Summary

Although flash A/D conversion architectures must be used in 3-micron CMOS technologies to attain video conversion rates with about 8-bit resolution, multi-stage architectures should be able to achieve video conversion rates in scaled CMOS technologies. This is important because multi-stage architectures have many desirable characteristics, such as small area, power dissipation, and input capacitance. No data have been published about subranging or pipelined converters with about 8-bit resolution in scaled CMOS technologies; however, the maximum conversion rates and areas of the two architectures should be similar. The difference between the two architectures boils down to the op amps. Op amps are required in pipelined converters to implement parasitic-insensitive S/H amplifiers, which while limiting the maximum conversion rate, give pipelined architectures concurrent stage operation, high-quality sampling characteristics, and high error tolerance. Because op amps are not required in subranging converters, they are often not used to avoid any associated conversion-rate limitation. When op amps are not used, subranging architectures consume the least power of the architectures under consideration but have sequential stage operation and lower-quality sampling characteristics and lower error tolerance than pipelined architectures. If op amps are used in a subranging converter to implement an input S/H amplifier or an interstage amplifier for example, the conversion might as well be pipelined by using identical op amps to realize interstage S/H amplifiers. Finally, for greater than about 8-bit resolution, pipelined multi-stage architectures are of potential interest because their characteristics degrade linearly instead of exponentially with increasing resolution as in flash and subranging architectures.

Chapter 4 - Effects of Error Sources in Pipelined ADC Architectures

4.1. Introduction

The primary error sources present in a pipelined A/D converter are gain errors in the S/H circuits and amplifiers, offset errors in the S/H circuits and amplifiers, A/D subconverter nonlinearity, D/A subconverter nonlinearity, and op-amp settling-time errors. As shown below, not only do the gain errors have little effect on linearity, but also with interstage gain greater than 1 and digital correction, the effects of offset and A/D subconverter nonlinearity are reduced or eliminated; therefore, the D/A subconverter nonlinearity and op-amp settling-time errors limit the performance of pipelined A/D converters.

4.2. Gain Errors

A block diagram of one stage in a pipelined A/D converter with gain error in the S/H amplifier is shown in Figure 4.1.

---

Figure 4.1 - Block diagram of one stage in a pipelined ADC with gain error

Here, the S/H amplifier replaces the S/H and interstage gain functions shown in Figure 3.5. To model a gain error in the S/H amplifier, the nonideal amplifier is replaced by an ideal amplifier in series with the gain error; the replacement is surrounded by a dotted line. A pipelined A/D converter consists of an input S/H amplifier followed by a concatenation of such stages. A gain error in the input S/H amplifier changes the conversion range of the A/D converter and does not affect the linearity of the conversion. The gain errors in the interstage S/H amplifiers, however, do affect the linearity of the conversion; the magnitude of this effect is now calculated.

For a transfer curve such as in Figure 2.2a, the digital code output from stage $i$, $code_i$, is:

$$code_i = \text{integer} \left[ \left( \frac{In_i}{ref_i} + 1 \right) 2^{n_i-1} \right] \tag{4.1}$$

where $-ref_i < In_i < ref_i$.  

Here, the "integer" function denotes that the code must be an integer; the conversion from decimal to integer is made by truncating the fractional part of the argument. Although the integer function is not written in the equations below, all digital codes refer to integers. The residue out of stage $i$, $residue_i$, is:

$$residue_i = In_i - \left( \frac{code_i + 0.5}{2^{n_i-1}} - 1 \right) ref_i. \tag{4.2}$$

Figure 4.2 shows a plot of the residue versus input for stage $i$ with $n_i = 2$ bits; the residue is positive for half the input range and negative for the other half.

The reference for stage $i$, $ref_i$, is:

$$ref_i = \frac{ref_{i-1}}{2^{n_{i-1}}} G_{i-1}, \tag{4.3}$$

where $G_i$ is the ideal value of the interstage gain between stages $i$ and $i+1$, and $ref_i$ is the full-scale positive input.
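The stage relations for the code, the residue, and the reference can be written as three small functions. This is a sketch of one reading of those relations: the D/A level is placed at the center of each code bin so that the residue is centered around zero as in Figure 4.2, and the stage reference scales as the previous reference times the previous interstage gain divided by 2 raised to the previous stage resolution. That reading is an assumption, and the function names are hypothetical.

```python
import math


def stage_code(v_in, ref, n_bits):
    """Digital code of one stage: truncate (In/ref + 1) * 2**(n-1)."""
    return math.floor((v_in / ref + 1.0) * 2 ** (n_bits - 1))


def stage_residue(v_in, ref, n_bits):
    """Residue of one stage: input minus the centered D/A level."""
    code = stage_code(v_in, ref, n_bits)
    dac = ((code + 0.5) / 2 ** (n_bits - 1) - 1.0) * ref
    return v_in - dac


def next_reference(ref_prev, n_prev, gain_prev):
    """Reference of the next stage from the previous reference and gain."""
    return ref_prev * gain_prev / 2 ** n_prev


# A 2-bit stage with ref = 1: the residue stays within +/- ref / 2**n = 0.25.
residues = [stage_residue(-1.0 + k / 64, 1.0, 2) for k in range(128)]
```

With an interstage gain of 4 after a 2-bit stage, `next_reference(1.0, 2, 4)` returns 1.0, so the amplified residue exactly fills the conversion range of the next stage.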

Consider a 2-stage pipelined A/D converter whose elements are ideal except for the value of the interstage gain. Therefore, while $code_1$ and $residue_1$ are ideal, the amplified residue is incorrect and may cause $code_2$ to be in error. From Equations 4.1 and 4.2,

$$code_2 = \left[ (1 + \varepsilon_1) G_1 \frac{residue_1}{ref_2} + 1 \right] 2^{n_2-1}. \tag{4.4}$$

Substituting Equation 4.3 into Equation 4.4 yields:

$$code_2 = \left[ \frac{(1 + \varepsilon_1) residue_1}{ref_1 / 2^{n_1}} + 1 \right] 2^{n_2-1}. \tag{4.5}$$

Because

$$-1 \leq \frac{residue_1}{ref_1 / 2^{n_1}} \leq 1, \tag{4.6}$$

the smallest $\varepsilon_1$ that can change $code_2$ by ±1/2 LSB occurs when

$$\frac{residue_1}{ref_1 / 2^{n_1}} = \pm 1. \tag{4.7}$$

Then

$$code_2 = (1 \pm 1) 2^{n_2-1} \pm \varepsilon_1 2^{n_2-1}. \tag{4.8}$$

The first term on the right side of Equation 4.8 is independent of \(\varepsilon_1\). If \(|\varepsilon_1| \leq \frac{1}{2^{n_2}}\), \(\text{code}_2\) and the entire conversion are accurate within \(\pm 1/2\) LSB. This means that, to attain a conversion accuracy of \(\pm 1/2\) LSB, the interstage gain must only be accurate within 1 part in \(2^{n_2}\) rather than 1 part in \(2^{n_2+1}\) as may have been expected. This improvement results from centering the residue around zero as shown in Figure 4.2 instead of making it either entirely non-negative or non-positive. In the centered case, the maximum magnitude of the residue is half that of the non-negative or non-positive cases. Because the effect of the gain error is minimum when the residue is zero and maximum when the absolute value of the residue is maximum, the allowable gain error with a centered residue is two times larger than that with either a non-negative or non-positive residue. For example, consider a two-stage pipelined A/D converter in which both stages have 4-bit resolution and in which the only error is in the gain of the interstage amplifier. If the residue is centered around zero, Equation 4.8 shows that the gain must be accurate within \(\pm 6\%\) to attain \(\pm 1/2\)-LSB accuracy. Under the same conditions except with a non-negative or non-positive residue, the interstage gain must be accurate within \(\pm 3\%\). In general, because the interstage gain must only be accurate enough to preserve the combined linearity of the stages after the amplifier, the effect of interstage gain errors on linearity is small.
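The 6% and 3% figures in the example can be reproduced with a short calculation. This sketch assumes that the allowable gain error is 1 part in 2 raised to the second-stage resolution for a centered residue, and half of that for a residue of one polarity, because the maximum normalized residue is then twice as large. The function names are hypothetical.

```python
def code2_shift(gain_error, n2, normalized_residue=1.0):
    """Shift of the second-stage code, in second-stage LSBs, due to a gain error.

    normalized_residue is the first-stage residue divided by ref_1 / 2**n_1:
    at most 1 in magnitude for a centered residue, up to 2 for a residue
    that is entirely non-negative or non-positive.
    """
    return gain_error * normalized_residue * 2 ** (n2 - 1)


def max_gain_error(n2, centered=True):
    """Largest gain error that keeps the code shift within 1/2 LSB."""
    limit = 1.0 / 2 ** n2
    return limit if centered else limit / 2


centered = max_gain_error(4)                    # 0.0625, about 6%
one_sided = max_gain_error(4, centered=False)   # 0.03125, about 3%
```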
4.3. Offset Errors

A block diagram of one stage in a pipelined A/D converter with offset error in the S/H amplifier is shown in Figure 4.3.

To model an offset error in the S/H amplifier, the nonideal amplifier is replaced by an ideal amplifier in series with the offset error; the replacement is surrounded by a dotted line. A pipelined A/D converter consists of an input S/H amplifier followed by a concatenation of such stages. An offset error in the input S/H circuit causes an input-referred converter offset but does not affect the linearity of the conversion. Without digital correction, the offset errors in the interstage S/H amplifiers, however, do affect the linearity of the conversion; the magnitude of this effect is now calculated.

Consider a 2-stage pipelined A/D converter whose elements are ideal except for an offset that occurs after the interstage gain. Although code$_1$ and residue$_1$ are ideal, the second stage input is incorrect and may cause code$_2$ to be in error. From Equations 4.1 and 4.2,

\[ \text{code}_2 = \left[ \frac{G_1 \text{residue}_1 + \text{offset}}{\text{ref}_2} + 1 \right] 2^{n_2 - 1}. \tag{4.9} \]

Substituting Equation 4.3 into Equation 4.9 yields:

\[ \text{code}_2 = \left[ \frac{\text{residue}_1}{\text{ref}_1 / 2^{n_1}} + \frac{\text{offset}}{G_1 \text{ref}_1 / 2^{n_1}} + 1 \right] 2^{n_2 - 1}. \tag{4.10} \]

After simplification, with \( N = n_1 + n_2 \),

\[ \text{code}_2 = \frac{\text{residue}_1}{2(\text{ref}_1) / 2^N} + \frac{\text{offset} / G_1}{2(\text{ref}_1) / 2^N} + 2^{n_2 - 1}. \tag{4.11} \]

The first and third terms on the right side of Equation 4.11 are independent of the offset. Because \( \frac{2(\text{ref}_1)}{2^N} \) is 1 LSB, if \( \left| \frac{\text{offset}}{G_1} \right| < 1/2 \text{ LSB} \), \(\text{code}_2\) and the entire conversion are accurate within ±1/2 LSB.

In this 2-stage example, the effect of offset in the second stage on the linearity of the conversion is divided by the interstage gain, and the general effect of offset in any stage on the linearity is divided by all the interstage gains before the offset. To the extent that the interstage gain is large, the effect of offset in stages after the first on linearity is therefore small.
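The division of a later-stage offset by the preceding interstage gains can be expressed as a small helper. This is a sketch with hypothetical names, and the numerical values in the example (a unit reference, 8-bit total resolution, and a 10-mV-scale offset) are illustrative assumptions, not data from the prototype.

```python
import math


def input_referred_offset(offset, gains_before):
    """Refer an offset back to the converter input through the preceding gains."""
    return offset / math.prod(gains_before)


def offset_is_tolerable(offset, gains_before, ref1, total_bits):
    """True if the input-referred offset is below 1/2 LSB of the conversion."""
    lsb = 2.0 * ref1 / 2 ** total_bits
    return abs(input_referred_offset(offset, gains_before)) < lsb / 2


# Illustrative numbers: ref1 = 1, N = 8, so 1/2 LSB is about 0.0039.
after_gain_of_4 = offset_is_tolerable(0.010, [4], 1.0, 8)   # 0.0025 < 1/2 LSB
with_no_gain = offset_is_tolerable(0.010, [], 1.0, 8)       # 0.010 > 1/2 LSB
```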

4.4. Nonlinearity in A/D Subconversions

The effects of nonlinearity in the A/D subconversions depend on whether digital correction is used. If digital correction is not used, while the first-stage A/D subconversion must be as linear as the resolution of the entire converter, the linearity requirements on later-stage A/D subconversions are divided by the combined interstage gain before the stage. With digital correction, on the other hand, the linearity of each A/D subconversion must only be commensurate with its resolution.

4.4.1. Nonlinearity in A/D Subconversions without Digital Correction

A block diagram of one stage in a pipelined A/D converter with nonlinearity in the A/D subconverter is shown in Figure 4.4. A 2-bit stage is used as a representative example. Nonlinearity in the A/D subconverter is modeled as an input-referred linearity error. The effect of this nonlinearity is studied by examining plots of the residue versus the input. Two such plots are shown in Figures 4.5 and 4.6.

In Figure 4.5, both the A/D subconverter and the D/A subconverter are assumed to be ideal. When the input is between the decision levels of the A/D subconverter, the A/D subconverter and D/A subconverter outputs are constant; therefore, the residue rises with the input. When the input crosses a decision level, the A/D subconverter and D/A subconverter outputs increase by 1 LSB at a 2-bit level, so the residue decreases by 1 LSB. Here, the residue is always between ±1/2 LSB and consists only of the part of the input that is not quantized by the first stage. With the interstage gain equal to 4, the maximum residue is amplified into a full-scale input to the next stage; therefore, the conversion range of the next stage is equal to the maximum residue out of the first stage.

A similar curve is shown in Figure 4.6 for a case when the A/D subconverter has some nonlinearity, but the D/A subconverter is still ideal.

![Residue vs. input for nonideal A/D subconverter and ideal D/A subconverter](Image)

In this example, two of the A/D subconverter decision levels are shifted, one by -1/2 LSB and the other by +1/2 LSB, both at a 2-bit level. When the input crosses a shifted decision level, the residue decreases by 1 LSB. If the decision levels are shifted by less than 1/2 LSB, the residue is always between ±1 LSB. Here, the residue consists of both the unquantized part of the input and the error caused by the A/D subconverter nonlinearity. Even if the nonlinearity is small, the digital output code of this stage is incorrect if the stage input is close enough to a decision level; moreover, without digital correction, an error in the
digital output code of any stage in the conversion causes an error in the conversion. Because the input to
the first stage is the input to the converter, to obtain ±1/2-LSB accuracy in an N-bit multistage conversion,
the first-stage code must be correct whenever the converter input is more than 1/2 LSB away from an
ideal decision level at an N-bit level. In other words, if digital correction is not used, the first-stage A/D
subconversion must be as linear as the entire conversion. In later stages, the input is a multiplied residue
from a previous stage. The corresponding linearity requirement in later stages is therefore divided by the
combined interstage gain before the stage.

4.4.2. Nonlinearity in A/D Subconversions with Digital Correction

Because the D/A subconverter in Figure 4.4 is assumed to be ideal, the increased residues in Figure
4.6 are accurate for the codes to which they correspond; therefore, at this point, no information is lost. If
the interstage gain is still 4, however, information is lost when the larger residues saturate the next stage
and produce missing codes in the conversion. Therefore, if the conversion range of the next stage is
increased to handle the larger residues, they can be encoded and the errors corrected. This process is
called digital correction and is described next.

A block diagram of an example of a 2-stage pipelined A/D converter with digital correction is
shown in Figure 4.7. The new elements in this diagram are the pipelined latches, the digital correction
logic circuit, and the amplifier with a gain of 0.5. The amplifier with a gain of 0.5 is conceptual only and
is drawn to show that the interstage gain is reduced by a factor of 2 so that nonlinearity error in an
amount between ±1/2 LSB at an n1-bit level in the first-stage A/D subconversion does not produce residues
that saturate the second stage. If the first stage is perfectly linear, only half the conversion range of
the second stage is used. Therefore, 1 bit from the second stage is saved to digitally correct the outputs
from the first stage; the other n2-1 bits from the second stage are added to the total resolution. After the
pipelined registers align the outputs in time so that they all correspond to one input, the digital correction
block detects overrange in the outputs of the second stage and changes the output of the first stage by 1
LSB at a n1-bit level if overrange occurs. Because the ideal residue is always between ±1/2 LSB, overrange
occurs when the residue is greater than 1/2 LSB or less than -1/2 LSB at a n1-bit level. For the
block diagram of Figure 4.4 and the corresponding ideal residue plot of Figure 4.5, when the residue reaches 1/2 LSB, the first-stage digital output should increase by 1 code, and the corresponding residue should decrease to -1/2 LSB. If the real residue exceeds 1/2 LSB, the first-stage digital output has not changed state and is 1 code too small; therefore, the digital correction logic adds 1 to the first-stage digital output. This happens when the second-stage code is greater than $\frac{3}{4} 2^n$. Conversely, if the real residue is less than -1/2 LSB, the second-stage code is less than $\frac{1}{4} 2^n$; the first-stage digital output is 1 code too large, and the correction logic subtracts 1 from the first-stage digital output. When the real residue is between ±1/2 LSB, no correction is made to the first-stage digital output. Figure 4.8 shows a block diagram of such a digital correction circuit.
Since subtraction is equivalent to addition after an appropriate offset, the required correction logic can be simplified through the introduction of offsets that eliminate the need for the correction logic to do subtraction. Figure 4.9 shows a block diagram of one 2-bit stage in a pipelined A/D converter with an offset in the A/D subconverter.
With this offset, the ideal residue versus input plot changes from that shown in Figure 4.5 to that shown in Figure 4.10. It can be seen that the A/D subconverter offset uniformly changes the location of the decision levels. If nonlinearity could move the decision levels in Figure 4.5 by as much as ±1/2 LSB at a 2-bit level, the combination of the -1/2-LSB A/D subconverter offset introduced in Figure 4.9 and the same nonlinearity can move the decision levels by 0-1 LSB from their original positions shown in Figure 4.5. The first-stage digital output is therefore always less than or equal to its ideal value, and correction requires either no change or addition; however, if the shift in a decision level from its original position is more than 1/2 LSB, the amplified residue will exceed the conversion range of the second stage, and the correction will fail. To overcome this limitation, an offset can be introduced to the D/A subconverter output; see Figure 4.11.
Figure 4.10 - Ideal residue vs. input with an A/D subconverter offset

Figure 4.11 - Block diagram of one stage with both A/D and D/A subconverter offsets
Figure 4.12 shows that such an offset shifts the x axis of the residue versus input plot up by 1/2 LSB.

Now if the decision levels in Figure 4.12 move by ±1/2 LSB, the amplified residue will remain within the conversion range of the second stage, and the correction will work. Here, the correction circuit adds 1 code to the first-stage output when the residue is greater than 0; therefore, the second-stage MSB directly determines whether to add 1 code to the first-stage output. Figure 4.13 shows a block diagram of such a simple digital correction circuit. In practice, such offsets are easily introduced in resistor-string based A/D and D/A subconverters by moving the tap locations on the resistor strings. If the last stage in the pipelined A/D converter uses an A/D subconverter identical to those in the previous stages (that is, with the offset), the offset in the last stage shifts the entire transfer curve by 1/2 LSB and produces a mid-tread rather than a mid-rise characteristic; see Figure 4.14. A mid-tread characteristic reduces the converter output noise when the input is less than ±1/2 LSB and biased at zero.
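
The MSB-based correction described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the circuit of Figure 4.13: the input is normalized to the unipolar range 0 to 1, the D/A subconverter and second stage are ideal, and the function names (`first_stage_thresholds`, `convert_two_stage`) are hypothetical. The first-stage decision levels carry the 1/2-LSB offset plus a random error of up to ±1/2 LSB, and the second-stage MSB decides whether 1 code is added to the first-stage output.

```python
import random


def first_stage_thresholds(n1, max_err_lsb, rng):
    """Decision levels of the offset first-stage flash A/D, full scale = 1.0.

    Assumed model: nominal levels sit at (k + 1/2) first-stage LSBs (the
    1/2-LSB offset), each perturbed by up to +/- max_err_lsb LSBs.
    """
    lsb1 = 2.0 ** -n1
    return [(k + 0.5 + rng.uniform(-max_err_lsb, max_err_lsb)) * lsb1
            for k in range(1, 2 ** n1)]


def convert_two_stage(vin, thresholds, n1=2, n2=3):
    """Return the corrected (n1 + n2 - 1)-bit code for 0 <= vin < 1."""
    lsb1 = 2.0 ** -n1
    # First-stage flash decision: count the decision levels at or below the input.
    code1 = sum(1 for t in thresholds if t <= vin)
    # Ideal D/A subconversion and subtraction produce the residue.
    residue = vin - code1 * lsb1
    # The second stage spans 2 first-stage LSBs because the interstage gain is halved.
    code2 = int(residue / (2 * lsb1) * 2 ** n2)
    code2 = max(0, min(2 ** n2 - 1, code2))  # saturate like a real flash stage
    # The second-stage MSB indicates that the first-stage code is 1 too small.
    add_one = code2 >> (n2 - 1)
    low_bits = code2 & (2 ** (n2 - 1) - 1)
    return ((code1 + add_one) << (n2 - 1)) | low_bits


rng = random.Random(0)
levels = first_stage_thresholds(n1=2, max_err_lsb=0.5, rng=rng)
inputs = [(i + 0.5) / 4096 for i in range(4096)]
errors = sum(convert_two_stage(v, levels) != int(v * 16) for v in inputs)
print("miscoded inputs:", errors)
```

In this model the count of miscoded inputs stays at zero as long as `max_err_lsb` does not exceed 0.5; larger errors saturate the second stage, which is the failure mode described in the text.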

Digital correction improves linearity by allowing the converter to postpone decisions on inputs that are near the first-stage A/D subconverter decision levels until the residues from these inputs are amplified to the point where similar nonlinearity in the second-stage A/D subconverter is insignificant. The capability of the correction to overcome first-stage A/D subconverter nonlinearity therefore relies on the second stage to contribute less nonlinearity than the first to the entire conversion. In a 2-stage converter with an interstage amplifier, if the interstage gain is greater than 1, the input-referred contribution of nonlinearity in the second stage is divided by the interstage gain. With an interstage gain equal to 1, however, nonlinearity in the second-stage A/D subconversion refers directly to the input and must be made less than or equal to the complete nonlinearity specification. Because subranging converters that do not use interstage amplifiers have an interstage gain no larger than 1, nonlinearity in the fine A/D subconversion of such converters contributes unattenuated nonlinearity to the entire A/D conversion; therefore, digital correction is not useful in such converters.

Although a 2-stage example has been used to present digital correction, the correction can be applied to all stages except the last in a pipelined A/D converter with more than 2 stages. If the interstage gains in such a converter are all greater than 1, nonlinearity in the last stage has the least effect on the linearity of the entire conversion; therefore, the correction algorithm should first allow the outputs of the last stage to correct those of the next-to-last stage. Then the correction should continue backwards with the outputs of each stage used to correct those of the previous stage until the first stage outputs are corrected. In this way, for example, instead of using the raw second-stage outputs, the corrected second-stage outputs are used to correct the first-stage outputs.
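
The backward ordering of the correction can be sketched on raw stage codes alone. The function below is an illustrative assumption (the name `correct_backwards` and the bit packing are not from the thesis): every stage produces an n-bit code, the MSB of each already-corrected later code is added to the raw code of the stage before it, and the remaining n-1 bits become output bits.

```python
def correct_backwards(codes, n):
    """Combine raw n-bit stage codes (first stage first) into one output word.

    The last stage is used as is; each earlier stage is corrected by the MSB
    of the corrected code of the stage that follows it.
    """
    low_mask = (1 << (n - 1)) - 1
    corrected = codes[-1]
    low_fields = []  # (n-1)-bit fields, collected from the last stage backwards
    for raw in reversed(codes[:-1]):
        low_fields.append(corrected & low_mask)   # lower bits go to the output
        corrected = raw + (corrected >> (n - 1))  # MSB corrects the earlier stage
    out = corrected
    for field in reversed(low_fields):
        out = (out << (n - 1)) | field
    return out


# Four 3-bit stages with 1 bit of overlap each give a 9-bit word.
print(correct_backwards([1, 5, 1, 5], n=3))  # -> 153
```

For codes that stay within the correction range, this is equivalent to adding the stage codes with a 1-bit overlap: 1·64 + 5·16 + 1·4 + 5 = 153.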

To do the digital correction, besides the simple correction logic circuit, extra comparators are required. If flash subconverters are used in all stages, all stages after the first require twice as many comparators as without digital correction. None of the comparators need to be offset canceled, however, because the digital correction algorithm eliminates the effect of offsets in all comparators except those in the last stage, and the combined interstage gain before the last stage reduces the effect of offsets in the last-stage comparators to the point where it is unimportant. Furthermore, the offset errors in the interstage S/H amplifiers do not affect linearity if digital correction is used. In Figure 4.3, for example, the offset can be referred to the input of the amplifier by dividing the offset by the closed-loop gain of the amplifier. Because addition is commutative, the divided offset can be pushed to the left of the first-stage subtractor. To move the divided offset to the input branch, where it causes an input-referred offset to the stage, an equal but opposite offset must be inserted in the first-stage A/D subconverter branch; see Figure 4.15. If the correction range is not exceeded, the effect of the offset in the A/D subconverter branch is eliminated by the digital correction. Using this process repetitively, the offsets in all the interstage
amplifiers in a pipelined A/D converter with digital correction can be referred to the input of the converter. A pipelined A/D converter with digital correction therefore does not need offset cancellation on any comparator or op amp, and the speed versus accuracy tradeoffs associated with offset cancellation at video rates (presented in Chapter 3) are eliminated.

4.5. Nonlinearity in D/A Subconversions

It is shown above that with digital correction, the effect of nonlinearity in all the A/D subconverters except the last can be eliminated if the D/A subconverters are ideal. As a result, the A/D subconverters must only be about as linear as their resolution; however, the D/A subconverters must be more linear than their A/D counterparts for the correction algorithm to work. This requirement on the D/A subconverters is not caused by the correction algorithm; instead the correction lessens the linearity requirement on the A/D subconverters and does not change that of the D/A subconverters. However, the digital correction algorithm does increase the accuracy requirement on the gain of the amplifier following each D/A subconverter. Without digital correction, the interstage gain must only be accurate enough to preserve the
combined linearity of the stages after the amplifier. With digital correction, the gain must be accurate enough to preserve the combined linearity of the stages including and after the amplifier.

A block diagram of one 2-bit stage in a pipelined A/D converter with nonlinearity in the D/A subconverter is shown in Figure 4.16.

Nonlinearity in the D/A subconverter is modeled as an output-referred linearity error. The effect of this nonlinearity is studied by examining a plot of the residue versus the input. With an ideal D/A subconverter, the D/A subconverter output increases by 1 LSB when the input crosses an A/D subconverter decision level, and the residue decreases by exactly 1 LSB at each decision level crossing; see Figures 4.5 and 4.6. With a nonideal D/A subconverter, on the other hand, the amount that the residue jumps is not necessarily equal to 1 LSB. Figure 4.17 shows such a plot with a nonlinear D/A subconverter and an ideal A/D subconverter. In this example, two of the D/A subconverter output levels are changed, one by -1/2 LSB and the other by +1/2 LSB. Although the digital outputs of this stage are correct, the residue output of the stage corresponding to two of the four output codes is incorrect by exactly the amount of D/A subconverter nonlinearity. Not only will the digital outputs of the remaining stages be incorrect, but also, if digital correction is used, the digital outputs of this stage will be changed so that they too are incorrect. Therefore, with or without digital correction, the D/A subconverter in any stage must be at least as linear as the combined resolution of this and the later stages. If the D/A subconverters are identical, and if the interstage gains are all greater than 1, the D/A subconverter in the first stage determines the linearity of the entire A/D converter. Such D/A subconverters can be realized with resistor strings for linearities in the 8-bit to 9-bit range. For integral linearity greater than 9 bits, the design of such a D/A subconverter is not trivial and requires either calibration or trimming.

4.6. Operational Amplifier Settling-Time Errors

The main disadvantage of pipelined A/D converters is that they require the use of op amps to realize parasitic-insensitive S/H amplifiers to do analog subtraction and amplification at the sampling rate of the A/D converter. Although the S/H amplifiers improve many aspects of the converter performance, the op amps within the S/H amplifiers limit the speed of the pipelined converters and pose the main obstacle to attaining video rates in pipelined A/D conversion.

For video-rate A/D conversion at 3 times the color subcarrier frequency, the conversion time is \( \frac{1}{10.7 \text{ MHz}} = 93 \text{ ns} \). As stated in Chapter 3, to pipeline a multi-stage conversion, a 2-phase nonoverlapping clock is required. Figure 4.18 shows both phases of such a clock.

Allowing about 10 ns to make sure that the 2 phases are not high at the same time, the duration of each phase is less than about 40 ns, and all the nodes in the converter must have settled to enough accuracy in less than this amount of time. The most critical node is the output of the input S/H amplifier; the accuracy requirements of the outputs of the other S/H amplifiers are divided by the combined interstage gain up to and including the amplifier under consideration. In Chapter 5, the influence of the settling-time requirement on the design of the amplifiers is considered.
5.1. Introduction

The goals in the design of the prototype were to obtain at least 8-bit resolution and linearity at at least 10 Ms/s conversion rate in the minimum area with a standard CMOS process so that the prototype would be small enough to integrate on the same chip as a video-rate digital signal processor. Although the prototype does occupy a small area (8500 mils²), it has 9-bit resolution, 9-bit differential linearity, and 8-bit integral linearity at a conversion rate of 5 Ms/s. In this chapter, the design of the prototype is described from both high-level and detailed perspectives.

5.2. High-Level Design Considerations

Several important design considerations for the prototype converter are now presented. To minimize design time, assume that all stages are identical. Fast op amps and flash subconverters are used to do the conversion at as high a speed as possible. The most basic architectural decision is to choose the resolution per stage; for efficient use of the conversion range of each stage, this choice determines the corresponding value of interstage gain. To attain maximum throughput rate, the resolution per stage should be small so that the interstage gain is small and the corresponding closed-loop bandwidth of the gain block is large. Conversely, large resolution and corresponding gain per stage are desirable to achieve high linearity because the contributions of nonidealities in all stages after the first are reduced by the combined interstage gain preceding the nonideality. Thus the speed and linearity requirements conflict in determining the optimum resolution per stage. It is shown below under certain simplifying assumptions that to minimize the amount of required hardware, the optimum resolution per stage is about 3 or 4 bits per stage, which is about midway between the high and low ends. This compromise in the resolution per stage keeps both the number of op amps and the number of comparators small. Also, because a goal of this project was to realize an A/D converter small enough that it could be incorporated within a primarily digital chip, the A/D converter must be able to operate in the presence of large power supply noise caused by the digital circuits. To reduce the sensitivity of the converter to this noise, all
analog signal paths in the prototype are fully differential. Since the prototype has not been integrated on such a digital chip, however, the degree to which it would work in such an environment has not been determined. Finally, to minimize errors caused by variations in the timing, the simplest possible clocking sequence is desired. A 2-phase, nonoverlapping clock is required to do the pipelining. As described later in this chapter, a change in the clock sequence is required to cancel the sample-to-hold-mode transition error; however, this change can be implemented with no changes to the clock generator, keeping the timing generation circuit simple.

5.3. Optimum Architecture

The resolution per stage that minimizes the area required for the entire conversion is now calculated. Denote the resolution per stage by \( n \), the total resolution by \( N \), the total area by \( A_T \), and the area of one conversion stage by \( A_S \). Assume that all stages are identical. Because the number of stages is then \( \frac{N}{n} \),

\[
A_T = \left( \frac{N}{n} \right) A_S. \tag{5.1}
\]

Each stage consists of a S/H amplifier, an A/D subconverter, and a D/A subconverter. If the subconverters share a common resistor string, the total area occupied by both subconverters in one stage is about equal to the number of comparators times the area of one comparator. Let \( A_C \) denote the area of one comparator and all associated switches, resistors, capacitors, and decoding elements. Also let \( A_{S/H} \) denote the area of the S/H amplifier and its switches and capacitors. Then the area of one stage is:

\[
A_S = A_{S/H} + A_C (2^n - 1). \tag{5.2}
\]

Let \( R \) denote the ratio of \( A_{S/H} \) to \( A_C \); assume that \( R \) is constant. Rewriting Equation 5.2 to include \( R \) gives:

\[
A_S = R A_C + A_C (2^n - 1). \tag{5.3}
\]

Substituting Equation 5.3 into Equation 5.1 yields:

\[
A_T = A_C \left( \frac{N}{n} \right) \left( R - 1 + 2^n \right). \tag{5.4}
\]

Equation 5.4 shows that if \( n \) is small, \( A_T \) is large because \( \frac{N}{n} \) is large, and if \( n \) is large, \( A_T \) is large because \( 2^n \) is large. Thus, the total area is minimized for medium values of the resolution per stage. The value of \( n \) that produces minimum area depends only on \( R \). If \( R = 10 \), a few simple calculations show that \( n = 3 \frac{\text{bits}}{\text{stage}} \) gives the minimum area. Figure 5.1 shows a plot of total area versus resolution per stage for an example with \( N = 12, R = 10, \) and \( A_C = 100 \text{ mils}^2 \).
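
The "few simple calculations" can be reproduced with a short script. This is a sketch that evaluates Equation 5.4 directly with the example values quoted in the text (N = 12, R = 10, A_C = 100 mils²); the function name `total_area` is an assumption made for illustration.

```python
def total_area(n, N=12, R=10, A_C=100.0):
    """Equation 5.4: total area in mils^2 for n bits per stage."""
    return A_C * (N / n) * (R - 1 + 2 ** n)


# Only values of n that divide N give a whole number of stages.
areas = {n: total_area(n) for n in (1, 2, 3, 4, 6)}
best_n = min(areas, key=areas.get)

for n, area in sorted(areas.items()):
    print(f"n = {n} bits/stage: {area:7.0f} mils^2")
print("minimum at n =", best_n)
```

With these values the areas are 13200, 7800, 6800, 7500, and 14600 mils² for n = 1, 2, 3, 4, and 6, so the minimum falls at 3 bits per stage and the neighboring choices cost only about 10% to 15% more.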

As predicted, the minimum area occurs when \( n = 3 \frac{\text{bits}}{\text{stage}} \). Because the slope of the curve in the region near the minimum is small, both \( n = 2 \frac{\text{bits}}{\text{stage}} \) and \( n = 4 \frac{\text{bits}}{\text{stage}} \) require little extra area. If one bit of digital correction is always applied per stage, however, one bit of resolution is lost per stage, and the total resolution with digital correction, \( N_{DC} \), becomes:

\[
N_{DC} = N - \frac{N}{n} + 1. \tag{5.5}
\]

Therefore, with digital correction, while \( n = 3 \) \( \frac{\text{bits}}{\text{stage}} \) yields 9-bit resolution, \( n = 2 \) \( \frac{\text{bits}}{\text{stage}} \) produces only 7 bits and \( n = 4 \) \( \frac{\text{bits}}{\text{stage}} \) gives 10 bits. These differences in the real resolution make it necessary to adjust the associated areas in Figure 5.1 to reach a correct conclusion. As a basis of comparison, the area required for 9-bit resolution is considered. To achieve 9-bit resolution with \( n = 2 \) \( \frac{\text{bits}}{\text{stage}} \), instead of 6 stages of conversion, 8 stages are required, increasing the required area by 33% from that shown in Figure 5.1. With \( n = 4 \) \( \frac{\text{bits}}{\text{stage}} \), the resolution of the third stage can be reduced by 1 bit to produce 9-bit total resolution. The resulting area is reduced by 17% from that shown in Figure 5.1, and the area is less than but close to that required for \( n = 3 \) \( \frac{\text{bits}}{\text{stage}} \). Therefore, when the ratio of op-amp area to comparator area is about 10, the resolution per stage that requires the least area is about 3 or 4 \( \frac{\text{bits}}{\text{stage}} \).
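
The resolution bookkeeping in this comparison can be checked numerically. The sketch below applies Equation 5.5 and, as an assumption consistent with the text, counts n bits from the first stage and n-1 bits from every later stage; the helper names are hypothetical.

```python
import math


def resolution_with_correction(N, n):
    """Equation 5.5: total resolution with 1 bit of correction per stage."""
    if N % n:
        raise ValueError("N must be a multiple of n")
    return N - N // n + 1


def stages_for_resolution(target_bits, n):
    """Stages needed if the first stage yields n bits and each later one n - 1 (n >= 2)."""
    return 1 + math.ceil((target_bits - n) / (n - 1))


for n in (2, 3, 4):
    print(n, resolution_with_correction(12, n), stages_for_resolution(9, n))
```

The script prints 7, 9, and 10 bits for n = 2, 3, and 4 with N = 12, and 8, 4, and 3 stages for a 9-bit target, which matches the stage counts used above (8 stages instead of 6 is the quoted 33% increase).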

To simultaneously meet the speed, linearity, and area requirements, the prototype is divided into 4 stages with 3 bits produced per stage. A block diagram of one stage is shown in Figure 5.2. The A/D subconversions are done with flash converters, so each stage needs 7 comparators. The S/H amplifier block replaces both the S/H circuit and interstage amplifier shown in some earlier figures. Because the interstage gain is 4 instead of 8, half the range and one bit from each of the last three stages are saved to digitally correct the outputs of the previous stages. Thus, instead of obtaining 3 bits of resolution from each of these stages, only 2 bits of resolution are obtained from each. In total, 9 bits of resolution are produced, using 28 comparators and 4 op amps.

5.4. Sample-and-Hold Amplifier

The design of the S/H amplifier is described below. The topic is divided into two major sections. The first section considers the design of a CMOS S/H circuit not including the amplifier, and the second section considers the design of the CMOS amplifier.

5.4.1. S/H Circuit

Until recently, S/H circuits were considered to be complex, stand-alone functional units, yet monolithic versions are now available. With CMOS technology, zero-offset switches and zero-gate-current transistors eliminate sample-mode offset and hold-mode droop as sources of error and therefore reduce the complexity of a S/H circuit from that needed in other technologies. However, the same switches that lessen complexity also cause errors that limit the speed and accuracy of CMOS S/H circuits. These errors are described below along with circuit approaches that reduce the magnitudes of their effects.

5.4.1.1. Sources of Error

The major sources of error in CMOS S/H circuits are nonzero acquisition time, finite bandwidth in the sample mode, aperture jitter, thermal noise, and sample-to-hold-mode transition error. Figure 5.3 shows an example input and the effects of some of these errors on this input. Acquisition time is the time
required for the transient error in the S/H circuit associated with the tracking of a switched input to become negligible. The bandwidth in the sample mode is the frequency at which the magnitude of the steady-state output is 3 dB less than the input. Finite bandwidth causes a steady-state tracking error.

Aperture delay is the time difference between the sample-to-hold-mode transition and the real sampling instant. Aperture delay is not in itself a problem. Variation in the aperture delay, however, is a problem and is called aperture jitter. Thermal noise is the unwanted random signal resulting from the channel resistance of the sampling switch; it is not illustrated in Figure 5.3. The sample-to-hold-mode transition error is the amount that the output changes as a result of the transition from the sample mode to the hold mode. The effects of these sources of error are considered below.

5.4.1.1.1. Acquisition Time

Nonzero acquisition time and finite bandwidth in the sample mode may prevent the S/H circuit from accurately tracking the input before the hold command occurs. To limit the effect of these errors, the time that the circuit spends in the sample mode must be greater than or equal to the acquisition time, and the bandwidth of the circuit in the sample mode must be much greater than the maximum frequency of the input. Figure 5.4 shows a circuit diagram of a simple S/H circuit.

Figure 5.4 - Circuit diagram of a simple S/H circuit

Assume that the switch is accurately modeled as a clock-voltage-dependent resistor. When $\phi$ is high, the switch is considered to be on, and its resistance is $r_{on}$. When $\phi$ is low, the switch is considered to be off and to have infinite resistance. Assume that the input as a function of time, $v_{in}(t)$, is sinusoidal and starts at its maximum value:

$$v_{in}(t) = A \cos(\omega t), \tag{5.6}$$

where $A$ is the amplitude of the input, and $\omega$ is the angular frequency of the input.

When $\phi$ changes from low to high, the circuit enters the sample mode, and the input is connected across the capacitor through the switch. Assume that this transition happens at $t = 0$. When $\phi$ changes from
high to low, the circuit enters the hold mode, and the voltage on the capacitor is held constant at its final sample-mode value, which is called the "sample". The sample-to-hold-mode transition must not occur until the S/H circuit has acquired the input. To find the acquisition time, \( t_a \), the circuit can be modeled as a low-pass filter whose input turns on at \( t = 0 \):

\[
v_{in}(t) = \begin{cases} 
0 & \text{if } t < 0 \\
A \cos(\omega t) & \text{if } t \geq 0 
\end{cases} \tag{5.7}
\]

Here it can be shown that the output, \( v_o(t) \), is:

\[
v_o(t) = -A \left( \frac{\alpha^2}{\alpha^2 + \omega^2} \right) e^{-\alpha t} + A \left( \frac{\alpha^2}{\alpha^2 + \omega^2} \right) \cos(\omega t) + A \left( \frac{\alpha \omega}{\alpha^2 + \omega^2} \right) \sin(\omega t), \quad (5.8)
\]

where \( \alpha = \frac{1}{r_{on} C} \).
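
As a sanity check on Equation 5.8, the closed form can be tested numerically against the first-order low-pass model it is derived from, \( dv_o/dt = \alpha (v_{in} - v_o) \) with \( v_o(0) = 0 \). This is a sketch only; the numerical values of the bandwidth and input frequency below are assumptions chosen to resemble the video-rate case, and the function names are hypothetical.

```python
import math


def v_out(t, A, alpha, omega):
    """Closed-form S/H output of Equation 5.8 for t >= 0."""
    d = alpha ** 2 + omega ** 2
    return (-A * alpha ** 2 / d * math.exp(-alpha * t)
            + A * alpha ** 2 / d * math.cos(omega * t)
            + A * alpha * omega / d * math.sin(omega * t))


def ode_residual(t, A, alpha, omega, h=1e-12):
    """dv_o/dt - alpha * (v_in - v_o), with the derivative taken by central difference."""
    dv = (v_out(t + h, A, alpha, omega) - v_out(t - h, A, alpha, omega)) / (2 * h)
    return dv - alpha * (A * math.cos(omega * t) - v_out(t, A, alpha, omega))


A = 1.0                      # input amplitude (arbitrary units)
alpha = 2 * math.pi * 50e6   # assumed: 50-MHz sample-mode bandwidth
omega = 2 * math.pi * 4.2e6  # assumed: 4.2-MHz input
worst = max(abs(ode_residual(k * 1e-9, A, alpha, omega)) for k in range(1, 50))

print("output at t = 0:", v_out(0.0, A, alpha, omega))
print("worst ODE residual relative to alpha*A:", worst / (alpha * A))
```

The output starts at zero and the residual stays at the level of the finite-difference error, which is what is expected if the three terms (decaying transient plus in-phase and quadrature steady-state components) solve the filter equation.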

The first term in Equation 5.8 represents the error made in sampling because of nonzero acquisition time. This error decays exponentially with increasing time; the time constant is equal to \( \frac{1}{\alpha} = r_{on} C \). For inputs of any frequency, the magnitude of the coefficient of the first term is:

\[
A \left( \frac{\alpha^2}{\alpha^2 + \omega^2} \right) \leq A. \quad (5.9)
\]

For a full-scale input, to limit the magnitude of the acquisition-time error to \( \pm 1/2 \) LSB at a N-bit level,

\[
A e^{-\alpha t} \leq \frac{2A}{2^{N+1}}. \quad (5.10)
\]

This condition can be rewritten as:

\[
t \geq t_a, \quad (5.11)
\]

where \( t_a = \frac{N}{\alpha} \ln(2) = N (r_{on} C) \ln(2) \).

For example, about 6.2 time constants are required to acquire the input to 9-bit accuracy.
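
The acquisition-time expression is easy to evaluate. In the sketch below, the 1-pF capacitor is the prototype value quoted later in this chapter, whereas the 1-kΩ switch resistance is purely a hypothetical number chosen for illustration.

```python
import math


def acquisition_time_constants(N):
    """Number of time constants needed to settle within 1/2 LSB at an N-bit level."""
    return N * math.log(2)


def acquisition_time(N, r_on, C):
    """t_a = N * r_on * C * ln(2), in seconds."""
    return N * r_on * C * math.log(2)


print(round(acquisition_time_constants(9), 2))       # time constants for 9 bits
print(acquisition_time(9, r_on=1e3, C=1e-12) * 1e9)  # ns, with a hypothetical 1-kOhm switch
```

For 9 bits this gives about 6.24 time constants, or roughly 6.2 ns with the assumed 1-ns time constant.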

5.4.1.1.2. Finite Bandwidth in the Sample Mode

Equation 5.8 can be rewritten as:

\[
v_o(t) = -A \left( \frac{\alpha^2}{\alpha^2 + \omega^2} \right) e^{-\alpha t} + A \left( \frac{\alpha}{\sqrt{\alpha^2 + \omega^2}} \right) \cos \left( \omega t - \tan^{-1} \left( \frac{\omega}{\alpha} \right) \right) \tag{5.12}
\]

The second term in Equation 5.12 is the steady-state output and is usually obtained through phasor analysis. Because the bandwidth of the S/H circuit is limited, the steady-state output differs in both magnitude and phase from the input. If the phase shift is made small enough for NTSC systems, the amplitude error also happens to be acceptable, as shown below. From Equation 5.12, the phase shift, \( \phi \), is:

\[ \phi = -\tan^{-1} \left( \frac{\omega}{\alpha} \right) = -\tan^{-1} \left( \frac{f}{f_{-3\,\text{dB}}} \right), \tag{5.13} \]

where \( f = \frac{\omega}{2\pi} \)

and \( f_{-3\,\text{dB}} = \frac{1}{2\pi r_{on} C} \).

To digitally decode the color information in a PCM NTSC television signal, the precise phase relationship between the sampling frequency and the color subcarrier frequency must be known. Since phase shifts less than 4° or 5° are not discernible to the eye, if the bandwidth of the S/H circuit is greater than about 50 MHz, the change in the phase is acceptable for such signals.
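
The bandwidth figure can be checked from Equation 5.13. The sketch below assumes that the phase shift of interest is the one at the color subcarrier, and it uses the standard NTSC subcarrier frequency of 3.579545 MHz, which is not stated explicitly in the text; the function names are hypothetical.

```python
import math

F_SC = 3.579545e6  # NTSC color subcarrier frequency in Hz (standard value, assumed here)


def phase_shift_deg(f, f_3db):
    """Equation 5.13: sample-mode phase shift in degrees."""
    return -math.degrees(math.atan(f / f_3db))


def min_bandwidth(f, max_phase_deg):
    """Smallest -3-dB bandwidth that keeps the phase shift within max_phase_deg."""
    return f / math.tan(math.radians(max_phase_deg))


print(phase_shift_deg(F_SC, 50e6))    # phase shift with a 50-MHz bandwidth
print(min_bandwidth(F_SC, 4.0) / 1e6) # MHz needed for a 4-degree limit
print(min_bandwidth(F_SC, 5.0) / 1e6) # MHz needed for a 5-degree limit
```

Under these assumptions a 50-MHz bandwidth gives a shift of about 4.1°, and the 4° and 5° limits correspond to bandwidths of roughly 51 MHz and 41 MHz, in line with the "about 50 MHz" figure.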

Not only is the phase of the input shifted because of the limited bandwidth, but also because the phase shift is a nonlinear function of the frequency, the time delay, \( t_d \), is frequency dependent:

\[ t_d(f) = \frac{\phi}{2\pi f} = -\frac{\tan^{-1} \left( \frac{f}{f_{-3\,\text{dB}}} \right)}{2\pi f} \tag{5.14} \]

The maximum phase-induced time jitter, \( t_j \), is the difference between the time delays of the highest and lowest frequencies of interest.

\[ t_j = t_d(f = \text{minimum}) - t_d(f = \text{maximum}) \tag{5.15} \]

When the frequency of the input is much less than the 3-dB bandwidth,

\[ \tan^{-1} \left( \frac{f}{f_{-3\,\text{dB}}} \right) \approx \frac{f}{f_{-3\,\text{dB}}} \tag{5.16} \]

and

\[ t_d \approx -\frac{1}{2\pi f_{-3\,\text{dB}}}. \tag{5.17} \]

Therefore, if the -3-dB frequency is much larger than the maximum input frequency, the phase shift is almost a linear function of frequency, and the time delay is almost independent of the input frequency. Substituting Equation 5.17 into Equation 5.15 for the time delay of low input frequencies produces:

\[ t_j = -\frac{1}{2\pi f_{-3\,\text{dB}}} - t_d(f = \text{maximum}). \tag{5.18} \]

It turns out that if the bandwidth is chosen to keep the absolute phase shift small enough, the phase-induced time jitter is negligible. For example, with a -3-dB bandwidth of 50 MHz, the phase-induced time jitter is about 8 ps.

Finally, under the same conditions, Equation 5.12 shows that the droop in the passband at 4.2 MHz is about 0.03 dB, which is acceptable for video-rate applications. Therefore, to the extent that the input frequency is small compared to the bandwidth, the error caused by the finite bandwidth of the S/H circuit in the sample mode is small.
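
The jitter and droop figures quoted above can be reproduced approximately from Equations 5.12 and 5.14. The sketch below works with magnitudes only and assumes a 4.2-MHz maximum input frequency with a 50-MHz bandwidth; the function names are hypothetical.

```python
import math


def delay_magnitude(f, f_3db):
    """Magnitude of the time delay of Equation 5.14, in seconds."""
    return math.atan(f / f_3db) / (2 * math.pi * f)


def phase_jitter_magnitude(f_max, f_3db):
    """Difference between the low-frequency delay limit and the delay at f_max."""
    return 1.0 / (2 * math.pi * f_3db) - delay_magnitude(f_max, f_3db)


def droop_db(f, f_3db):
    """Steady-state amplitude loss of the second term of Equation 5.12, in dB."""
    return 10 * math.log10(1 + (f / f_3db) ** 2)


print(phase_jitter_magnitude(4.2e6, 50e6) * 1e12)  # ps
print(droop_db(4.2e6, 50e6))                       # dB
```

With these inputs the delay difference comes out between 7 and 8 ps and the droop at about 0.03 dB, consistent with the values in the text.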

5.4.1.1.3. Aperture Jitter

Although the dedicated S/H circuit eliminates the effect of skew between the sampling moments of multiple S/H circuits, aperture jitter in the one S/H circuit still corrupts the sampled version of the input. Like nonlinear phase-induced error, aperture jitter causes variations in the sampling instant that, for time varying inputs, may result in amplitude errors. From Equation 3.8, for 9-bit resolution and a 4.2-MHz input, the minimum time that the input takes to change in amplitude by 1 LSB is 148 ps. Thus if the jitter is less than 148 ps in an otherwise ideal 9-bit A/D converter, the amplitude error in sampling a sinusoidal input with any frequency up to 4.2 MHz and any amplitude up to full scale is less than 1 LSB. The conversion is then accurate to at least 8 bits, and the corresponding maximum SNR is about 48 dB.
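
The 148-ps figure can be reproduced from the peak slope of a full-scale sinusoid. Equation 3.8 is not reproduced in this section, so the expression below is an assumption: a sinusoid of amplitude A spans 2A, 1 LSB is 2A/2^N, and the maximum slope is 2πfA.

```python
import math


def min_time_per_lsb(N, f_max):
    """Shortest time for a full-scale sinusoid at f_max to change by 1 LSB at N bits."""
    return 2.0 / (2 ** N * 2 * math.pi * f_max)


print(min_time_per_lsb(9, 4.2e6) * 1e12)  # ps
```

For 9 bits and a 4.2-MHz input this evaluates to about 148 ps.
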
5.4.1.1.4. Thermal Noise

The nonzero channel resistance of the switch in the S/H circuit of Figure 5.4 produces broad-band thermal noise that is sampled onto the sampling capacitor along with the input. Because this broad-band thermal noise is band limited by the low-pass filter formed by the switch resistance and sampling capacitor inherent in the S/H circuit, the standard deviation of the noise voltage sampled onto the capacitor is \( \sqrt{\frac{kT}{C}} \), where \( k \) is Boltzmann's constant and \( T \) is the absolute temperature. If \( C = 1 \) pF as in the prototype, \( \sqrt{\frac{kT}{C}} = 64 \ \mu\text{V} \). For 9-bit resolution and a 5-V differential reference, 1 LSB = 19.5 mV. Therefore, although the thermal noise poses a fundamental limit to the accuracy of the S/H circuit, such noise is negligible in the prototype.
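
The noise and LSB figures can be compared directly. Two assumptions are made in this sketch and are not stated in the text: the temperature is taken as 300 K, and the 5-V differential reference is interpreted as a ±5-V (10-V total) conversion range, which is the interpretation that reproduces the quoted 19.5-mV LSB.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K


def ktc_noise_rms(C, T=300.0):
    """RMS thermal noise voltage sampled onto a capacitor C, in volts."""
    return math.sqrt(K_B * T / C)


def lsb_size(full_scale, N):
    """Size of 1 LSB for an N-bit converter with the given full-scale range."""
    return full_scale / 2 ** N


noise = ktc_noise_rms(1e-12)      # C = 1 pF, T = 300 K (assumed)
lsb = lsb_size(10.0, 9)           # assumed 10-V total differential range
print(noise * 1e6, "uV rms")
print(lsb * 1e3, "mV per LSB")
print("LSB / noise:", lsb / noise)
```

Under these assumptions the sampled noise is about 64 µV rms, roughly 300 times smaller than 1 LSB.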

5.4.1.1.5. Sample-to-Hold-Mode Transition Error

Sample-to-hold-mode transition error is caused by both charge injection and clock feed-through. The amount of charge that moves from the channel of the sampling switch to the sampling capacitor on the sample-to-hold-mode transition depends on the source impedance, the size of the sampling capacitor, and the clock fall time; moreover, the induced error consists of both an offset component that is independent of the magnitude of the input and a gain component that is proportional to the magnitude of the input. Since high-speed circuits use short clock fall times, small sampling capacitors, and large sampling switches, the magnitude of the transition error is greater in high-speed circuits than in low-speed circuits. Differential circuits are used to cancel the offset component of the error. Bottom-plate switching is used to cancel the gain component of the error. Figure 5.5 shows a circuit diagram of a fully-differential S/H circuit that uses bottom-plate switching. Transistors M1 and M2 switch the differential input to the top plates of the sampling capacitors, \( C_{S1} \) and \( C_{S2} \). Transistors M3 and M4 are used to ground the bottom plates of the sampling capacitors, which are connected to the inputs of the op amp. To acquire the inputs, M1-M4 are closed. Instead of turning all switches off at the same time, M3 and M4 are opened before M1 and M2; see Figure 5.6. Here \( \phi_1 \) is shown as a delayed version of \( \phi'_1 \). When \( \phi'_1 \) goes low, M3 and M4 turn off, and the charge injection and clock feed-through errors from M3 and M4 are stored on \( C_{S1} \) and \( C_{S2} \), respectively. After M3 and M4 are opened, however, the bottom plates of both capacitors are only connected to parasitic capacitances, \( C_{P1} \) and \( C_{P2} \), respectively. Therefore, when M1 and M2 are turned off, only a fraction of the charge injection and clock feed-through errors from M1 and M2 are stored on \( C_{S1} \) and \( C_{S2} \). Assume that the circuit is perfectly balanced so that transistors M1 and M2 are identical, M3 and M4 are identical, \( C_{S1} = C_{S2} = C_S \), and \( C_{P1} = C_{P2} = C_P \). Then the fraction of charge injection and clock feed-through errors from M1 and M2 that is stored on the sampling capacitors is \( \frac{C_P}{C_P + C_S} \). To the extent that the parasitic capacitance is less than the sampling capacitance, M1 and M2 have little effect on the transition error. Because M3 and M4 are both referenced to ground instead of to the differential input, they make no contribution to the component of the transition error that is proportional to the input; instead they contribute only a common-mode offset error. If the transition error is dominated by M3 and M4, the differential circuit cancels the transition error. The degree of cancellation depends on the extent to which supposedly identical components match each other.

5.4.1.2. Limitations

Several variables directly affect both the speed and the accuracy of the S/H circuit. On the one hand, the speed is proportional to the width of the sampling switch and inversely proportional to both the length of the sampling switch and to the size of the sampling capacitor. On the other hand, the key source of error, the sample-to-hold-mode transition, is proportional to both the width and length of the sampling switch but inversely proportional to the size of the sampling capacitor. Therefore, when the charge injection limits the S/H circuit accuracy, the width of the sampling switch and the size of the sampling capacitor can be used to trade accuracy for speed, whereas a reduction in the length of the sampling switch simultaneously improves both the speed and the accuracy of the S/H circuit. Because fully differential S/H circuits reduce the charge injection and clock feed-through errors to matching errors, and because even small sampling capacitors make the magnitude of the thermal noise negligible at a 9-bit level, the limitation on S/H circuit accuracy at a given speed depends on the degree of component matching and the minimum usable channel length of the process technology.

5.4.2. Amplifier

Once the input is sampled, an op amp is required with the S/H circuit to reproduce the held input elsewhere in the converter so that the magnitude of the reproduction is not sensitive to the size of the parasitics. The combination is called an S/H amplifier and is expanded in Figure 5.7.

![Figure 5.7 - Block diagram of the S/H amplifier](image)

Figure 5.8 shows that the clock is divided into two nonoverlapping phases and two additional phases that are used to cancel sample-to-hold-mode transition error. The timing diagram for $\phi_1$ and $\phi'_1$ is unchanged from that in Figure 5.6. While both $\phi_1$ and $\phi'_1$ are high, the sampling capacitors ($4C_I$) acquire the input, and the integrating ($C_I$) and common-mode feedback ($C_{CM}$) capacitors are reset. When $\phi'_1$ goes low, the input is sampled. If the parasitic capacitances on the inputs of the op amp are much less than the sampling capacitances, the sampled input is unchanged when $\phi_1$ goes low. During $\phi_2$, the left sides of the sampling capacitors are connected together so that the difference between the two sampled inputs is amplified by the ratio of the sampling to integrating capacitors. To the extent that the op amp in a closed-loop configuration drives its differential input to zero, the gain is insensitive to parasitic capacitances on either the top or bottom plates of any of these capacitors. Meanwhile, the common-mode feedback (CMFB) capacitors are connected to the outputs of the op amp to start the CMFB circuit. Switched-capacitor CMFB$^{54,55}$ is useful in pipelined A/D converters because pipelined converters inherently allow a clock phase needed to reset the capacitor bias.

Figure 5.8 - Timing diagram of the clock signals

5.4.2.1. Sources of Error

Once the input to a conversion stage is sampled onto a sampling capacitor, the amplifier associated with the stage reproduces the sampled input on the comparator inputs of this stage and the sampling capacitor of the next stage. Because the offset of the amplifier is not canceled by closing a feedback loop around the amplifier while the input is sampled onto the sampling capacitor, the amplifier is never required to track a high-frequency input signal. Therefore, there are no dynamic error sources in the op amp. There are two important types of static errors, however. They are nonideal values of interstage gain and nonzero settling time.

5.4.2.1.1. Finite Open-Loop Gain

Interstage gain errors arise from two sources: finite open-loop gain of the amplifier and component mismatch. Figure 5.9 shows a representative single-ended switched-capacitor amplifier during its amplifying phase.

![Block diagram of a switched-capacitor amplifier in its amplifying phase](image)

Figure 5.9 - Block diagram of a switched-capacitor amplifier in its amplifying phase

$C_S$ is the sampling capacitor, and $C_I$ is the integrating capacitor. $C_P$ represents the parasitic capacitance on the inverting op-amp input, and $C_L$ represents the capacitive load. Assume that the input to the amplifier is properly biased and that the amplifier is ideal except for its limited open-loop gain, $A$. The DC closed-loop gain, $G(s=0)$, is:

$$G(s=0) = \frac{v_o}{v_i} = -\left( \frac{C_S}{C_I + \left( \frac{C_S + C_I + C_P}{A} \right)} \right)$$  \hspace{1cm} (5.19)

Note that $C_L$ has no effect on the closed-loop gain. If the open-loop gain is infinite, the closed-loop gain equals $-\left( \frac{C_S}{C_I} \right)$ and is entirely determined by the ratio of the two capacitors. Depending on the technology and the geometries of the capacitors, the error in this ratio can be as little as about 0.1%.\textsuperscript{56,57} If the open-loop gain is limited, another error is introduced into the closed-loop gain. Equation 5.19 can be rewritten as:

$$G(s=0) = -\left( \frac{C_S}{C_I} \right) \left[ \frac{1}{1 + \frac{1}{A \left( \frac{C_I}{C_S + C_I + C_P} \right)}} \right]$$  \hspace{1cm} (5.20)

The closed-loop gain is determined by the ratio of the sampling to the integrating capacitors to the extent that \( A \left( \frac{C_I}{C_S + C_I + C_P} \right) \) is greater than 1; furthermore, finite open-loop gain reduces the closed-loop gain by 1 part in \( A \left( \frac{C_I}{C_S + C_I + C_P} \right) \). Note that parasitic capacitance at the input to the op amp increases the open-loop gain requirement. For example, to produce a closed-loop gain of 16 within ±6% using ideal capacitors in Figure 5.9, if \( C_P = C_S \), the open-loop gain must be greater than about 550.
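The numerical example above can be checked with a short calculation. The sketch below is illustrative only: it takes the DC closed-loop gain as \( -C_S / (C_I + (C_S + C_I + C_P)/A) \) and uses the "1 part in \( A \, C_I / (C_S + C_I + C_P) \)" approximation for the gain error. The function names and the choice to express capacitances in units of \( C_I \) are assumptions made for illustration.

```python
def closed_loop_gain(c_s, c_i, c_p, a):
    """DC closed-loop gain, taken as -C_S / (C_I + (C_S + C_I + C_P) / A)."""
    return -c_s / (c_i + (c_s + c_i + c_p) / a)


def min_open_loop_gain(c_s, c_i, c_p, rel_error):
    """Open-loop gain at which 1 / (A * C_I / (C_S + C_I + C_P)) equals rel_error."""
    return (c_s + c_i + c_p) / (c_i * rel_error)


# Capacitances in units of C_I: C_S = 16 for a gain of 16, and C_P = C_S.
a_min = min_open_loop_gain(16.0, 1.0, 16.0, 0.06)
gain = closed_loop_gain(16.0, 1.0, 16.0, a_min)
print(f"minimum open-loop gain: {a_min:.0f}")
print(f"closed-loop gain at that open-loop gain: {gain:.3f}")
```

Because the approximation slightly overestimates the error, the exact gain at an open-loop gain of 550 lies a little inside the 6% band.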

5.4.2.1.2. Non-zero Settling-Time

Assume that the open-loop gain of the amplifier, \( A(s) \), in Figure 5.9 has a 1-pole frequency response given by:

\[
A(s) = \frac{A}{1 + s\tau}
\]  (5.21)

where \( A \) is the DC open-loop gain, 
\( s \) is the complex frequency, 
and \( \tau \) is the output time constant.

The output time constant, \( \tau \), is:

\[
\tau = r_{out} (C_L + C_I),
\]  (5.22)

where \( r_{out} \) is the open-loop output resistance of the op amp.

The closed-loop gain as a function of complex frequency, \( G(s) \), is:

\[
G(s) = \frac{v_o}{v_i} = \frac{G(s=0)}{1 + s\tau \left[ \frac{C_S + C_P + C_I}{C_S + C_P + C_I + AC_I} \right]}
\]  (5.23)

where \( G(s=0) \) is given by Equation 5.19.

Equation 5.23 can be rewritten as:

\[ G(s) = \frac{G(s=0)}{1 + \frac{s \tau}{1 + A \left( \frac{C_I}{C_S + C_P + C_I} \right)}}. \]  (5.24)

Therefore, the output time constant with feedback, \( \tau_{FB} \), is:

\[ \tau_{FB} = \frac{\tau}{1 + A \left( \frac{C_I}{C_S + C_P + C_I} \right)}, \]  (5.25)

where \( \tau \) is given by Equation 5.22.

If \( v_i \) is a unit step, and if the output of the amplifier is always a linear function of its input, the output as a function of time, \( v_o(t) \), is:

\[ v_o(t) = G(s=0) \left( 1 - e^{-\frac{t}{\tau_{FB}}} \right). \]  (5.26)

Assume that the magnitude of the input step is big enough to cause the final value of the output to be equal to the positive-full-scale-reference voltage, \( V_{ref} \). Because the full-scale range is \( 2V_{ref} \), 1/2 LSB at an N-bit level equals \( \frac{2V_{ref}}{2^{N+1}} \). For the output to have settled within ±1/2 LSB at an N-bit level of its final value,

\[ V_{ref} \left( 1 - e^{-\frac{t}{\tau_{FB}}} \right) \geq V_{ref} \left( 1 - \frac{2}{2^{N+1}} \right). \]  (5.27)

This condition can be rewritten as:

\[ t \geq N \tau_{FB} \ln(2). \]  (5.28)

For example, to settle within ±1/2 LSB at a 9-bit level, about 6.2 time constants are required. If the settling should be accurate enough after 40 ns for video-rate conversion, \( \tau_{FB} \leq 6.4 \text{ ns} \), corresponding to a closed-loop -3-dB bandwidth of about 25 MHz.
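The settling numbers quoted above follow directly from the condition \( t \geq N \tau_{FB} \ln(2) \). The sketch below reproduces them for a single-pole response; the function names are illustrative and not taken from the thesis.

```python
import math


def settling_time_constants(n_bits):
    """Number of time constants needed to settle within 1/2 LSB at n_bits."""
    return n_bits * math.log(2)


def max_time_constant(t_settle, n_bits):
    """Largest closed-loop time constant that settles within t_settle seconds."""
    return t_settle / settling_time_constants(n_bits)


def bandwidth_hz(tau):
    """-3-dB bandwidth of a single-pole response with time constant tau."""
    return 1.0 / (2.0 * math.pi * tau)


n_tau = settling_time_constants(9)
tau_fb = max_time_constant(40e-9, 9)
f_3db = bandwidth_hz(tau_fb)
print(f"time constants: {n_tau:.2f}")
print(f"tau_FB limit: {tau_fb * 1e9:.2f} ns")
print(f"closed-loop bandwidth: {f_3db / 1e6:.1f} MHz")
```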

In practice, the frequency response of the amplifier will be more complicated than that produced by just one pole. Also, the output of a real amplifier may not always be a linear function of its input, and the real input to the amplifier will not be a step function but instead an exponential function of time. These considerations all tend to increase the settling time; computer simulations are used to simultaneously model all these effects.

5.4.2.2. Operational Amplifier

As a result of the use of digital correction, the offsets of all the op amps are simply referred to the input of the A/D converter, each in an amount diminished by the combined interstage amplifier gain preceding the offset; therefore, the op amps do not have to be offset canceled and do not have to be placed in a unity-gain feedback configuration. Since the op amps do not have to be unity-gain stable, their speed can be optimized for a closed-loop gain of 4. Of the 40 ns allowed for total settling, 20 ns each is allocated for slewing and linear settling. For ±2.5-V references, the maximum change in the differential output is 5 V. Because the total load, \( C_L + C_I \), is about 4 pF, the required slew rate is at least 250 \( \frac{V}{\mu s} \), making the required differential output current equal to 1 mA. After slewing, if the output is within 10% of its final value, about 5 time constants of about 4 ns each are required to complete the settling to within 0.1% of the final value. Since the closed-loop output resistance, \( r_{out} \), is about \( \frac{4}{G_m} \), where \( G_m \) is the transconductance of the op amp, the output time constant is \( \frac{4(C_L + C_I)}{G_m} \); therefore, the required transconductance is about \( \frac{1}{250 \Omega} \).
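The op-amp requirements in this paragraph come from a short chain of arithmetic, which the following sketch retraces. The constant names are illustrative; the values are the ones quoted in the text (a 5-V differential step, 20 ns each for slewing and linear settling, a total load of about 4 pF, and a closed-loop gain of 4).

```python
import math

V_STEP = 5.0       # maximum change in the differential output, volts
T_SLEW = 20e-9     # time allocated to slewing, seconds
T_LINEAR = 20e-9   # time allocated to linear settling, seconds
C_TOTAL = 4e-12    # total load capacitance, farads
GAIN = 4.0         # closed-loop gain

slew_rate = V_STEP / T_SLEW                # volts per second
i_out = C_TOTAL * slew_rate                # amperes, from I = C dV/dt
n_tau_exact = math.log(0.10 / 0.001)       # time constants from 10% to 0.1% error
n_tau = math.ceil(n_tau_exact)             # rounded up to a whole number
tau = T_LINEAR / n_tau                     # allowed time constant, seconds
g_m = GAIN * C_TOTAL / tau                 # siemens, from tau = 4 C / G_m

print(f"slew rate: {slew_rate / 1e6:.0f} V/us")
print(f"output current: {i_out * 1e3:.2f} mA")
print(f"time constants: {n_tau_exact:.2f} (use {n_tau})")
print(f"tau: {tau * 1e9:.1f} ns, 1/G_m: {1.0 / g_m:.0f} ohms")
```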

The op amp, shown in Figure 5.10, uses a fully differential, class A/B configuration with dynamic bias. The class A/B structure gives both high slew rate and high gain after slewing. According to simulation, the amplifier dissipates 20 mW and settles in 50 ns to an accuracy of 0.1% with a 5-V differential step into a 4-pF load.

The op amp is similar to one reported by Castello and Gray,\(^{58}\) and its operation is now described. Transistors M1-M4 form the input stage and generate the class A/B action. Source followers M5-M8 are used to bias the input stage so that it conducts some current even for zero differential input. For an increase in the voltage on the positive input and a corresponding decrease on the negative input, the gate-to-source voltages of both M1 and M4 increase while those of M2 and M3 decrease; therefore, the current in M1 and M4 increases and that in M2 and M3 decreases from their standby values. Transistors M9 and M13, M10 and M14, M11 and M15, and M12 and M16 form current mirrors that reflect and amplify current from the input branches to the output branches. Cascode transistors M17-M20 increase the gain of the op amp by increasing the output resistance of the output nodes to ground. A high-swing, dynamic bias circuit composed of transistors M31-M38 adjusts the gate bias on the cascode transistors so that the output branches can conduct large currents during slewing and have high swings during settling. Transistors M41-M44, together with the $C_{CM}$ capacitors and associated switches in Figure 5.7, form the CMFB circuit. Because the gates of M41 and M42 are tied to a constant bias voltage, PCMB, these transistors are constant-current sources. The gates of M43 and M44 are connected to the CMBIAS terminal shown in Figure 5.7. This point is alternately switched from a bias voltage, BIAS, on $\phi_1$ to a capacitively coupled version of the output on $\phi_2$. During $\phi_2$, the CMBIAS line rises and falls with changes in the common-mode output voltage, adjusting the current drawn through M43 and M44 so that the common-mode output voltage is held constant near zero volts. Note that if the two halves of the differential circuit match perfectly, changes in the differential output voltage do not change the CMBIAS point.

Because the speed of this op amp is limited by the speed of its current mirrors, wide-band current mirrors are used to increase the speed. To this end, transistors M9-M12 are not simply diode connected, but instead are buffered by source followers MB1-MB4. Because of this change, the currents needed to supply the parasitic capacitance between the gates and sources of the current mirrors at high frequencies come from the power supplies instead of from the input stage of the op amp. The drawback to this approach is that the drain-to-source voltages of transistors M9-M12 are increased by the gate-to-source voltages of transistors MB1-MB4, respectively. Therefore, input stage transistors M1-M4 operate with less drain-to-source voltage than if M9-M12 were diode connected. As a result, M1-M4 enter the triode region for smaller differential inputs than with diode-connected loads, and the amount of current that the input stage can produce while slewing is limited. Because a high-swing, dynamic bias circuit is used, this is not a problem for ±5-V operation; however, for +5-V operation, these wide-band current mirrors probably would limit the slew rate of the op amp.

5.5. A/D and D/A Subconverters

Figure 5.11 shows a block diagram of an n-bit A/D, D/A subsection. It consists of a reference divider, a comparator bank, an encoder, and a multiplexer. The reference divider divides the reference into $2^n$ regions of equal width and provides the $2^n - 1$ boundaries between these regions as thresholds for a bank of comparators that do the A/D subconversion. The comparator outputs are encoded into the binary outputs of the A/D subconverter and into control signals that cause the multiplexer to select the appropriate reference level as the D/A output. To save area, both the A/D and the D/A functions share the reference divider. In the prototype, the reference dividers are composed of resistor strings. Because each stage does a 3-bit subconversion, the resistor string must contain at least 8 resistors.

Figure 5.11 - Block diagram of an n-bit A/D, D/A subsection

Figure 5.12 shows a single-ended schematic of one 3-bit A/D, D/A subsection used in the prototype. Because the resistor string is shared for both the A/D and D/A operations, it produces two sets of reference voltages, one for each operation. To center the residue versus input plot around the x axis as shown in Figures 4.4 and 4.11, the D/A outputs must be offset from the A/D decision levels by 1/2 LSB at a 3-bit level. This requires that the number of resistors be doubled to 16. To do the subconversions, clock signals such as those shown in Figure 5.8 can be used. The comparators are clocked at the end of \( \phi_1 \). On \( \phi_1 \), eight D/A converter outputs are enabled and one is selected based on control signals generated from the comparator outputs \( (y_1, \ldots, y_8) \). Although Figure 5.12 shows a single-ended representation of both the A/D subconverter and D/A converter functions, on the prototype, both functions are fully differential. Therefore, instead of just one D/A converter output, equal and opposite D/A converter outputs are used. Also, each comparator compares a differential input to a differential reference instead of a single-ended input to a single-ended reference. The resistor strings and the comparators are described in the next two sections.

5.5.1. Resistor String

The primary considerations involved in the design of the resistor string are the total resistance and the matching of the resistors. Figure 5.13 shows a schematic diagram of a resistor string that produces $2^n - 1$ A/D decision levels and $2^n$ D/A output levels.

Figure 5.12 - Circuit diagram of a 3-bit A/D, D/A subsection

The string consists of $2^{n+1}$ matched resistors. The total resistance, $R_T$, is:

\[
R_T = \sum_{i=1}^{2^{n+1}} R_i
\]

(5.29)

where $R_i$ is the resistance of the $i^{th}$ resistor.

$R_T$ is important because it affects the time required to charge capacitance through the string. The D/A converter output path is from the resistor string through two series switches (see Figure 5.12), and must charge about 2 pF of capacitance. To allow the op-amp outputs to settle to the multiplied residue in about 40 ns, the time constant associated with this path must be less than about 4 ns; therefore, the path resistance must be less than about 2 kΩ. When the D/A output is near the center of the resistor string, the contribution of the string to the path resistance is maximum and about equal to $\frac{R_T}{4}$. In the prototype, $R_T = 2 \text{k} \Omega$, and the switch resistances are each about 600 Ω.
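The path-resistance budget above can be verified numerically. The sketch below assumes, for illustration only, that both ends of the string are driven by ideal references, so that the resistance seen at a tap a fraction \( x \) of the way along the string is \( R_T \, x (1 - x) \), which peaks at \( R_T / 4 \) at the center. The function names are hypothetical.

```python
def string_resistance_at_tap(r_total, fraction):
    """Resistance seen at a tap, with both string ends held by ideal references."""
    return r_total * fraction * (1.0 - fraction)


def path_time_constant(r_total, r_switch, n_switches, c_load, fraction=0.5):
    """Return (path resistance, time constant) for the D/A converter output path."""
    r_path = string_resistance_at_tap(r_total, fraction) + n_switches * r_switch
    return r_path, r_path * c_load


# Prototype values from the text: R_T = 2 kohm, two 600-ohm switches, 2 pF load.
r_path, tau = path_time_constant(2e3, 600.0, 2, 2e-12)
print(f"worst-case path resistance: {r_path:.0f} ohms")
print(f"time constant: {tau * 1e9:.1f} ns")
```

With these values the worst-case path resistance stays below the 2-kΩ limit, and the time constant stays below 4 ns.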

While the D/A subconverter must be as linear as the entire conversion, the A/D subconversion must only be as linear as its resolution. Therefore, the degree to which the resistors in the resistor strings match each other is most important in determining the linearity of the corresponding D/A subconversion; furthermore, the linearity of the first-stage D/A subconversion limits the linearity of the entire A/D conversion. If each first-stage D/A output is accurate within ±½ LSB at an N-bit level, and if there are no other errors in the A/D conversion, the entire conversion is linear to an N-bit level. To produce multiple D/A converter outputs that each have an absolute accuracy of ±½ LSB, the required resistor matching depends on the positions of the outputs in the resistor string relative to the nearest end of the string. Assume that the total reference voltage, \( V_{tot} \), is the voltage applied between the top and bottom of the resistor string:

\[
V_{tot} = (+V_{ref}) - (-V_{ref}).
\]  

(5.30)

Then the absolute value of the voltage difference between any output and the nearest end of the string, \( V_{output} \), is a fraction, \( f(output) \), that is no greater than 1/2 of the total reference voltage.

\[
V_{output} = f(output)V_{tot}
\]

(5.31)

For \( V_{output} \) to be accurate within ±½ LSB at an N-bit level,

\[
V_{output} = f(output)V_{tot} \pm \frac{V_{tot}}{2^{N+1}}
\]

(5.32)

Equation 5.32 can be rewritten to express the ratio of a D/A subconverter output to the total reference voltage as a function of the resistor-matching dependent fraction.

\[
\frac{V_{output}}{V_{tot}} = f(output) \left[ 1 \pm \frac{1}{2^{N+1} f(output)} \right]
\]

(5.33)

The second term in Equation 5.33 represents the relative accuracy requirement of a D/A output as a function of its position in the resistor string. Because \( f(output) \leq \frac{1}{2} \), the maximum relative accuracy requirement is \( \frac{1}{2^N} \) and occurs at the center of the string. Here the ratio, \( \frac{V_{output}}{V_{tot}} \), is:

\[
\frac{V_{output}}{V_{tot}} = \frac{\sum_{i=1}^{2^n} R_i}{\sum_{i=1}^{2^n} R_i + \sum_{i=2^n+1}^{2^{n+1}} R_i} = \frac{1}{2} \left( 1 \pm \frac{1}{2^N} \right)
\]

(5.34)

If the resistance values are normally distributed with standard deviation, \( \sigma_R \), about their mean, \( R \), a sum of the resistances is also normally distributed and has a mean equal to the sum of the individual means and a standard deviation equal to the square root of the sum of the squares of the individual standard deviations. Then Equation 5.34 can be rewritten as:

\[
\frac{2^n R \pm 2^{\frac{n}{2}} \sigma_R}{2^n R \pm 2^{\frac{n}{2}} \sigma_R + 2^n R \pm 2^{\frac{n}{2}} \sigma_R} = \frac{1}{2} \left( 1 \pm \frac{1}{2^N} \right)
\]  

(5.35)

The signs of the standard deviation terms should be selected to calculate the maximum required matching. Because the numerator of the left-hand side of Equation 5.35 and the first term in the denominator represent the sum of the same resistors, their standard deviations must have the same signs. If the signs are positive, the error in the resistor ratio caused by mismatch is maximum when the sign of the standard deviation of the second term in the denominator is negative. Then Equation 5.35 can be rewritten as:

\[
\frac{1}{2} \left( 1 + \frac{1}{2^{\frac{n}{2}}} \frac{\sigma_R}{R} \right) = \frac{1}{2} \left( 1 + \frac{1}{2^N} \right)
\]  

(5.36)

After simplification, the maximum allowable resistor mismatch, \( \frac{\sigma_R}{R} \), is:

\[
\frac{\sigma_R}{R} = 2^{\frac{n}{2} - N}
\]  

(5.37)

where \( n \) is the resolution and \( N \) is the linearity.

For example, to obtain 7-bit linearity from a 7-bit conversion, Equation 5.37 predicts that \( \frac{\sigma_R}{R} \) must be \( \leq 8.8\% \). Here, the resistor string uses 256 resistors and produces 255 total levels of which 127 are A/D decision levels and 128 are D/A output levels. If the resistor string is only used for either the A/D or D/A subconversion, only 128 resistors are required, and \( \frac{\sigma_R}{R} \) must be \( \leq 6.25\% \). As a point of comparison, by simulation more accurate than the calculation given above, Black has shown that under the latter conditions, \( \frac{\sigma_R}{R} \) must be \( \leq 5\% \).

Equation 5.37 shows that the allowable resistor mismatch decreases exponentially with increasing linearity. This happens because the magnitude of 1 LSB decreases in the same manner. Equation 5.37 also shows that the allowable resistor mismatch increases exponentially with increasing resolution. This happens because when the resolution, \( n \), is increased, the number of required resistors increases in proportion to \( 2^n \), but the standard deviation of the sum of the resistors only increases in proportion to \( 2^{n/2} \). Therefore, the ratio of the standard deviation of the sum of the resistors to the sum of the resistors decreases in proportion to \( 2^{-n/2} \), providing an averaging effect that compensates for the allowed increase in the error per resistor. For example, to obtain 9-bit linearity from a 3-bit subconversion as in the prototype, Equation 5.37 predicts that \( \frac{\sigma_R}{R} \) must be \( \leq 0.5\% \). This is less than the tolerance allowed in the previous example both because the linearity requirement increased from 7 to 9 bits and because the resolution requirement decreased from 7 to 3 bits. Since polysilicon resistors have smaller voltage coefficients than diffused resistors, polysilicon resistors are used in the prototype. Although this matching is attainable in CMOS technologies, the matching required to reach higher linearity with the same resolution per stage is not commonly attainable. For instance, to obtain 12-bit linearity from a 3-bit subconversion, \( \frac{\sigma_R}{R} \) must be \( \leq 0.07\% \), and calibration or trimming is probably required.
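The mismatch figures quoted in the examples can be tabulated with a few lines of code. The sketch below takes Equation 5.37 in the form \( \sigma_R / R = 2^{n/2 - N} \), which is the form consistent with the numerical examples in the text. The second helper, written in terms of the total number of resistors in the string, is an illustrative restatement that also covers the 128-resistor case; both function names are assumptions.

```python
import math


def max_resistor_mismatch(n, big_n):
    """Allowable sigma_R / R for an n-bit string and N-bit linearity (2**(n/2 - N))."""
    return 2.0 ** (n / 2.0 - big_n)


def max_mismatch_from_count(num_resistors, big_n):
    """Same limit in terms of the total resistor count (half the string is summed)."""
    return math.sqrt(num_resistors / 2.0) / 2.0 ** big_n


for n, big_n in [(7, 7), (3, 9), (3, 12)]:
    limit = max_resistor_mismatch(n, big_n)
    print(f"n = {n}, N = {big_n}: sigma_R/R <= {100.0 * limit:.3f}%")

shared = max_mismatch_from_count(256, 7)   # string shared by A/D and D/A
single = max_mismatch_from_count(128, 7)   # string used for one function only
print(f"256 resistors: {100.0 * shared:.2f}%, 128 resistors: {100.0 * single:.2f}%")
```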

Because the subconversions are fully differential, the resistor string must provide complementary references to every comparator. If the resistor string is laid out in a straight line it is said to be "unfolded". Figure 5.14 shows unfolded and once-folded 3-bit resistor strings that are used to generate differential A/D decision levels only. With an unfolded resistor string, providing complementary references to every comparator poses a routing problem. Furthermore, linear processing gradients have a strong effect on the matching of the resistors in an unfolded string. Folding the string once not only lessens the routing problem as shown in Figure 5.14, but also improves the matching of the resistors in the presence of a linear resistance gradient as shown by Black. For these reasons, the resistor strings in the prototype are each folded once.

5.5.2. Comparator

The connection of a comparator within an A/D, D/A subsection is shown in Figure 5.15. The points labeled VR+ and VR- are connected to taps on the resistor string that depend on which comparator in the bank is under consideration. For example, for the top comparator, VR+ is connected to the most positive A/D subconverter tap, and VR- is connected to the most negative A/D subconverter tap. On clock $\phi_1$, the comparator inputs are grounded, and the capacitors sample the differential reference. On $\phi_2$, the left sides of the capacitors are connected to the differential input. Ignoring parasitic capacitance, the input to the comparator is then the difference between the differential input and the differential reference. The parasitic capacitances on the inputs to the comparator attenuate the input slightly, but the decision is not affected if the comparator has enough gain. As mentioned in Chapter 4, because of digital correction, no offset cancellation on the comparator is required. Therefore, the comparator is never placed in a feedback loop and does not have to be stable in a closed-loop configuration.

The comparator, shown in Figure 5.16, uses a conventional, latched-differential-amplifier configuration. Transistors M1 and M2 are source followers with current-source loads MB1 and MB2, which are biased by the current-source bias line (CSB). Transistors M3-M8 form a differential amplifier, and ML1 and ML2 form a latch. Transistors MCS1 and MCS2 form a current switch that allows the bias current from MB2 to flow through either the differential amplifier or the latch. With the latch signal low, the inputs are amplified. Because M7 and M8 are biased in the triode region by the triode bias line (TB), the gain of the amplifier is only about 20 dB. To reduce the Miller effect and increase the speed of the amplification, the loads are separated from the differential pair by cascode transistors M5 and M6, which are biased by the cascode bias line (CB). When the latch signal is raised, the bias current is switched from the amplifier to the latch. During the transition, the parasitic capacitances on the inputs to the latch hold the amplified input. Finally, the positive feedback in the latch drives its output in one direction or the other, and the comparison is completed.

5.6. Clock Generator

Figure 5.17 shows the schematic for the clock generator and the corresponding timing diagram. The circuit consists of a master-slave flip flop, 2 nand gates, and 4 inverters. Assume that all logic elements have identical delay from input to output and that the low-to-high and high-to-low transition times are zero. When the flip flop outputs are $Q = 0$ and $\overline{Q} = 1$, the clock generator outputs settle to the following values: $X = 1$, $Y = 0$, $\phi_1 = 0$, $\overline{\phi}_1 = 1$, $\phi_2 = 1$, and $\overline{\phi}_2 = 0$. At the moment that the flip flop outputs change state, the inputs to the top nand gate become $Q = 1$ and $\overline{\phi}_2 = 0$ instead of $Q = 0$ and $\overline{\phi}_2 = 0$; therefore, after one gate delay, $X$ remains equal to 1. Meanwhile the inputs to the bottom nand gate become $\overline{Q} = 0$ and $\overline{\phi}_1 = 1$ instead of $\overline{Q} = 1$ and $\overline{\phi}_1 = 1$; therefore, after one gate delay, $Y$ becomes 1. Because $Y$ changes state, $\phi_2$, $\overline{\phi}_2$, $X$, $\phi_1$, and $\overline{\phi}_1$ each change state in succession. After the flip flop outputs change state again, $X$ is the first signal to change in response, and $\phi_1$, $\overline{\phi}_1$, $Y$, $\phi_2$, and $\overline{\phi}_2$ each follow in turn. As a result of this signal flow, nonoverlapping clocks are generated; that is, before $\phi_1$ goes high or $\overline{\phi}_1$ goes low, $\phi_2$ goes low and $\overline{\phi}_2$ goes high to turn off all n-channel and p-channel switches that were on while $\phi_2$ was high and $\overline{\phi}_2$ was low. Similarly, before $\phi_2$ goes high or $\overline{\phi}_2$ goes low, $\phi_1$ goes low and $\overline{\phi}_1$ goes high.
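The signal flow described above can be reproduced with a unit-delay logic simulation. The sketch below is a simplified model, not the prototype circuit: it assumes that every gate has exactly one unit of delay, that $Q$ and $\overline{Q}$ change at the same instant, and that the gates are wired as the text implies (top nand driven by $Q$ and $\overline{\phi}_2$, bottom nand driven by $\overline{Q}$ and $\overline{\phi}_1$, and each clock phase and its complement produced by an inverter chain). The dictionary keys `p1`, `p1b`, `p2`, and `p2b` are illustrative names for $\phi_1$, $\overline{\phi}_1$, $\phi_2$, and $\overline{\phi}_2$.

```python
def step(state, q):
    """Advance every gate by one unit delay using the previous signal values."""
    qb = 1 - q
    return {
        "X": 1 - (q & state["p2b"]),    # top nand: inputs Q and phi2-bar
        "Y": 1 - (qb & state["p1b"]),   # bottom nand: inputs Q-bar and phi1-bar
        "p1": 1 - state["X"],           # inverter: phi1 from X
        "p1b": 1 - state["p1"],         # inverter: phi1-bar from phi1
        "p2": 1 - state["Y"],           # inverter: phi2 from Y
        "p2b": 1 - state["p2"],         # inverter: phi2-bar from phi2
    }


def simulate(q_values, start):
    """Return the list of states, beginning with the settled starting state."""
    trace = [dict(start)]
    for q in q_values:
        trace.append(step(trace[-1], q))
    return trace


def phase_on(state, name):
    """A phase is on if its n-channel or its p-channel switches conduct."""
    return state[name] == 1 or state[name + "b"] == 0


# Settled values for Q = 0, as listed in the text.
START = {"X": 1, "Y": 0, "p1": 0, "p1b": 1, "p2": 1, "p2b": 0}
Q_VALUES = [1] * 10 + [0] * 10   # Q goes high for 10 gate delays, then low

trace = simulate(Q_VALUES, START)
overlap = any(phase_on(s, "p1") and phase_on(s, "p2") for s in trace)
print("phases overlap:", overlap)
for t, s in enumerate(trace):
    print(t, s)
```

In this model, each phase turns fully off (including its complement) before the other phase turns on, which is the nonoverlap property described in the text.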

Although delayed versions of both $\phi_1$ and $\phi_2$ are needed to cancel signal-dependent sample-to-hold-mode transition error, these delayed signals are identical to the undelayed signals in the clock generator. The delays are established on the prototype by placing delayed and undelayed versions of clock signals on separate lines with different capacitive loading; the load on the lines of delayed signals is greater than that of undelayed signals.

5.7. Digital Correction

The digital correction required by the prototype is done off chip. A logic circuit that implements the digital correction algorithm shown in Figure 4.13 is described below. Figure 5.18 shows the corresponding block diagram of the required A/D and D/A subconverters. By moving one resistor from the top to the bottom of the resistor string, all the A/D subconverter decision levels and all the D/A subconverter outputs are shifted up by 1/2 LSB at a 3-bit level from those shown in Figure 5.12. To do the correction, if the stage residue is positive, the algorithm adds 1 count to the digital output of a stage. If the residue is negative, no correction is necessary as noted in Chapter 4.

The digital correction logic circuit consists of one stage of logic for each stage of subconversion in the pipelined converter; see Figure 5.19. Because the interstage gains in the prototype are all greater than 1, nonlinearity in the LSB stage of conversion has the least effect on the linearity of the entire conversion; therefore, the correction algorithm first uses the outputs of the LSB stage to correct those of the third stage. Then the correction continues backwards with the outputs of each stage used to correct those of the previous stage until the MSB stage outputs are corrected.

A block diagram and truth table of the LSB stage of correction logic are shown in Figure 5.20. The uncorrected input bits are denoted in order of increasing significance as \( I_0, I_1, \) and \( I_2 \). Because half the conversion range of the LSB stage is saved for the correction, there are only 2 output bits; that is, the LSB stage of conversion adds 2 bits of resolution to the entire conversion. The corrected output bits are denoted in the same order as \( O_0 \) and \( O_1 \). \( C_{out} \) denotes the carry output and is connected to the carry input of the third stage of correction. When \( I_2 \) is high, the amplified residue input to the LSB stage of conversion is positive, and \( C_{out} \) is made high to add 1 count to the digital code produced by the third stage of conversion; therefore, \( C_{out} = I_2 \). Because the interstage gain is 4, if the third stage digital outputs had been increased by 1 count before the LSB stage of conversion was done, the amplified residue input to the LSB stage would have been decreased by 4 LSB at a 3-bit level from the actual input. To compensate for the addition of 1 count to the third stage, 4 LSB should therefore be subtracted from the last stage digital output. Such a subtraction is done by dropping the MSB of the last stage and setting $O_1 = I_1$ and $O_0 = I_0$. Therefore, no logic gates are needed in the LSB stage of the digital correction circuit (because no correction is made on these bits).

Figure 5.18 - Block diagram of an A/D, D/A subconverter for simple digital correction algorithm.

A block diagram of the second and third stages of correction logic is shown in Figure 5.21. The uncorrected input bits are denoted as in Figure 5.20. In addition to these inputs, $C_{in}$ denotes the carry input; $C_{in}$ for the stage under consideration is connected to $C_{out}$ from the previous stage of correction. If $C_{in}$ for a particular stage is high, the previous stage of correction determined that 1 count should be added to the digital code of this stage. Therefore, the least significant output bit, $O_0$, is equal to the binary addition of $I_0$ and $C_{in}$. To fully represent the result of this binary addition, $C_0$, a carry signal internal to this stage of correction, is also generated.

Figure 5.21 - Block diagram of the second and third stages in the correction circuit

\[ O_0 = I_0 \oplus C_{in} \]
\[ C_0 = I_0 \cdot C_{in} \]  (5.38)

where \( \oplus \) denotes the logical-exclusive-or function,
and \( \cdot \) denotes the logical-and function.

Similarly,

$O_1 = I_1 \oplus C_0$  \hspace{1cm} (5.39)

$C_1 = I_1 \cdot C_0$

where $C_1$ denotes another internal carry signal.

Finally, $C_{out}$ is high when either $I_2$ or $C_1$ is high; that is:

$$C_{out} = I_2 + C_1$$  \hspace{1cm} (5.40)

where $+$ denotes the logical-or function.

Digital correction in the second and third stages therefore requires 2 logical-exclusive-or gates, 2 logical-and gates, and 1 logical-or gate.

A block diagram of the MSB stage of correction logic is shown in Figure 5.22. Although the inputs to the MSB correction stage are the same as those to the second and third stages, some of the outputs of the MSB correction stage differ from those of the other stages. The functions that give $O_0$ and $O_1$ are unchanged. The MSB output, $O_2$, is new to this stage.

\[ O_2 = I_2 \oplus C_1 \] (5.41)

Instead of a carry output for the MSB stage, there is an overflow bit, $OF$, that is high if and only if all the stage inputs are high.

\[ OF = I_2 \cdot I_1 \cdot I_0 \cdot C_{in}. \] (5.42)

Digital correction in the MSB stage therefore requires 3 logical-exclusive-or gates and 3 logical-and gates.

Although the above description of the digital correction logic is easy to understand, it requires far too many gate delays to be practical. A practical implementation of the digital correction logic, which requires fewer gate delays, is described by Burstein.\(^{60}\)
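The ripple-style gate equations of this section can be checked with a short behavioral sketch. The following Python fragment is illustrative only: the function names are hypothetical, and the expression used for $O_2$ (the exclusive-or of $I_2$ and $C_1$) is an assumption chosen to be consistent with the three exclusive-or gates counted for the MSB stage. Each stage is compared against plain integer addition of the 3-bit stage code and the carry input.

```python
from itertools import product


def correct_middle_stage(i2, i1, i0, c_in):
    """Second/third-stage correction logic; returns (o1, o0, c_out)."""
    o0 = i0 ^ c_in          # O_0 = I_0 xor C_in
    c0 = i0 & c_in          # C_0 = I_0 and C_in
    o1 = i1 ^ c0            # O_1 = I_1 xor C_0
    c1 = i1 & c0            # C_1 = I_1 and C_0
    c_out = i2 | c1         # C_out = I_2 or C_1
    return o1, o0, c_out


def correct_msb_stage(i2, i1, i0, c_in):
    """MSB-stage correction logic; returns (o2, o1, o0, overflow)."""
    o0 = i0 ^ c_in
    c0 = i0 & c_in
    o1 = i1 ^ c0
    c1 = i1 & c0
    o2 = i2 ^ c1            # ASSUMPTION: ordinary ripple-carry sum bit
    overflow = i2 & i1 & i0 & c_in   # OF is high only if all inputs are high
    return o2, o1, o0, overflow


def check_against_integer_arithmetic():
    """Compare the gate equations with integer addition for all 16 inputs."""
    for i2, i1, i0, c_in in product((0, 1), repeat=4):
        total = 4 * i2 + 2 * i1 + i0 + c_in
        o1, o0, c_out = correct_middle_stage(i2, i1, i0, c_in)
        assert 2 * o1 + o0 == total % 4       # two corrected output bits
        assert c_out == int(total >= 4)       # carry passed to the next stage
        o2, m1, m0, overflow = correct_msb_stage(i2, i1, i0, c_in)
        assert 4 * o2 + 2 * m1 + m0 == total % 8
        assert overflow == int(total == 8)
    return True


if __name__ == "__main__":
    print(check_against_integer_arithmetic())
```

Because each carry ripples through the preceding gates, the sketch also makes the gate-delay concern visible: every output depends on the full chain of internal carries.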

5.8. Photograph of Prototype

A photograph of the core of a prototype chip is shown in Figure 5.23.

The core is about 50 mils wide by 150 mils long. The stages follow one after another and are identical except that the first stage does not have a subtractor, the fourth stage does not have a D/A converter, and the 2-phase, nonoverlapping clock alternates from stage to stage. A test op amp and a test comparator are at the rightmost end. The prototype was made by MOSIS\(^{61}\) in a 3-micron, double-polysilicon, p-well, CMOS process.
Chapter 6 - Experimental Results

6.1. Introduction

The prototype has been tested primarily in two ways: first with a code-density test, and second with a signal-to-noise ratio (SNR) test. The results of these tests for a prototype converter under several different operating conditions are described below. Although the results vary little as a function of the reference voltages for reference voltages between ± 2.0 V and ± 3.0 V, the reference voltages are about equal to ± 2.5 V for all the tests described in this chapter.

6.2. Test Results under Normal Conditions

Under normal conditions, the conversion rate is 5 Ms/s, the input signal frequency is about 2 MHz, the power supply voltages are ± 5 V, and digital correction is applied to all the stages except the last. The test results under these conditions are presented in this section.

6.2.1. Code-Density Test

As described in Chapter 2, the code-density test involves making a histogram of the output codes obtained while an input with a known probability density function is applied to the A/D converter. To characterize the linearity of the converter, the input must have much less distortion than that produced by the converter. Because low-distortion sinusoidal inputs can be generated and easily characterized with a spectrum analyzer, the code-density test uses such an input. The histogram is normalized by the density function, and the differential nonlinearity (DNL) and integral nonlinearity (INL) are calculated. Because the test is statistical in nature, not all output codes have to be collected; therefore, the histogram hardware need not run in real time. Since the histogram hardware is capable of collecting data at 10 ks/s, only about 1 out of every 500 output codes is included in the histogram. To compute the DNL accurately within 0.1 LSB at a 9-bit level with 99% confidence, about half a million samples are required. To compute the INL with the same accuracy and confidence, about a quarter of a billion samples are required. At a down-sampled rate of 10 ks/s, the collection of so many samples would take about 8
hours. Not only would this be too long to wait but also by the time the measurement is complete, test conditions that affect the INL (such as the reference voltage) may have changed. As a compromise, 1 million samples were collected for all the code-density test results given below, which is more than enough for the DNL but not enough for the INL measurement to attain an accuracy of 0.1 LSB with 99% confidence. Since the accuracy of the linearity computations is inversely proportional to the square root of the number of samples, the computed INL will only be accurate within about 2 LSB at a 9-bit level with 99% confidence.
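The sample-budget arithmetic in the preceding paragraph can be reproduced with a few lines of Python. This is a minimal sketch that uses only the rounded figures quoted in the text; the function names are hypothetical, and the square-root scaling is the relationship stated above rather than a full confidence-interval derivation.

```python
import math

CONVERSION_RATE_SPS = 5e6        # converter output rate quoted in the text
HISTOGRAM_RATE_SPS = 10e3        # capture rate of the histogram hardware
INL_REFERENCE_SAMPLES = 0.25e9   # "about a quarter of a billion" samples
COLLECTED_SAMPLES = 1e6          # samples actually collected per test
REFERENCE_ACCURACY_LSB = 0.1     # target accuracy at 99% confidence


def decimation_factor(conversion_rate, capture_rate):
    """Number of converter outputs per captured output."""
    return conversion_rate / capture_rate


def collection_time_s(num_samples, capture_rate):
    """Time needed to capture num_samples at capture_rate."""
    return num_samples / capture_rate


def scaled_accuracy_lsb(ref_accuracy_lsb, ref_samples, actual_samples):
    """Accuracy scales inversely with the square root of the sample count."""
    return ref_accuracy_lsb * math.sqrt(ref_samples / actual_samples)


if __name__ == "__main__":
    print(decimation_factor(CONVERSION_RATE_SPS, HISTOGRAM_RATE_SPS))
    print(collection_time_s(INL_REFERENCE_SAMPLES, HISTOGRAM_RATE_SPS) / 3600)
    print(collection_time_s(COLLECTED_SAMPLES, HISTOGRAM_RATE_SPS))
    print(scaled_accuracy_lsb(REFERENCE_ACCURACY_LSB,
                              INL_REFERENCE_SAMPLES, COLLECTED_SAMPLES))
```

With these rounded inputs, the full INL measurement comes out at several hours of collection time, the 1-million-sample compromise takes 100 s, and the resulting INL accuracy is between 1 and 2 LSB, in line with the approximate values quoted above.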

In Figure 6.1, DNL is plotted on the y axis versus code on the x axis for all 512 codes.

Because the DNL never reaches -1 LSB, there are no missing codes. The maximum DNL is less than 0.6 LSB. In Figure 6.2, INL is plotted on the y axis versus code on the x axis. Again, the conversion rate is
5 Ms/s and the input frequency is about 2 MHz. The maximum INL is 1.1 LSB.

6.2.2. Signal-to-Noise Ratio Test

SNR characterization overcomes a limitation that occurs in measuring INL with a code-density test; that is, because many fewer samples are needed to measure the SNR than the INL, the SNR test is not as sensitive to changes in test conditions during the test as is the code-density test. As noted in Chapter 2, the SNR test can be used to estimate the maximum absolute INL through the total harmonic distortion (THD). Repeating Equation 2.9:

\[ \text{maximum } |\text{INL}| = N - \frac{\text{THD}}{6} \]  (6.1)
Each SNR measurement is made by taking a discrete Fourier transform (DFT) on \( N_s \) samples from the converter collected at the down-sampled rate of \( f_{DS} \). The resulting frequency resolution of the DFT is \( \frac{f_{DS}}{N_s} \). Since the SNR hardware is capable of collecting data at no more than about 20 ks/s, the downsampled rate is selected to be 20 ks/s. To determine the required number of samples, the SNR is measured as a function of the number of samples. \( N_s = 1024 \) and the corresponding frequency resolution of 20 Hz/sample are found to be adequate. Since the converter samples the input at 5 Ms/s while the SNR hardware samples the converter output at 20 ks/s, only about 1 output out of every 250 outputs is included in the SNR computation, and signal components at frequencies greater than 10 kHz are aliased into the base band. Under normal conditions, the input signal frequency is 2.002011 MHz. Therefore, the fundamental component appears in the base band at 2011 Hz, and harmonics appear at multiples of 2011 Hz; see Figure 6.3. The levels of the harmonics in Figure 6.3 are at least 60 dB smaller than that of the fundamental. The power at the fundamental frequency of the sine wave is considered signal power, and the power at all other frequencies above DC and up to and including half the sampling frequency is considered noise power. With such definitions, the SNR is a signal-to-noise-plus-distortion ratio. Here, the \( SNR = 49.4 \text{ dB} \).
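The position of the aliased fundamental and its low-order harmonics in the base band can be verified numerically. The sketch below assumes ideal uniform down-sampling and integer frequencies in hertz; the function names are hypothetical.

```python
def aliased_frequency_hz(f_in_hz, f_sample_hz):
    """Fold an input frequency into the band [0, f_sample/2] (integer Hz)."""
    f = f_in_hz % f_sample_hz
    # Frequencies in the upper half of the band fold back toward DC.
    return f if 2 * f <= f_sample_hz else f_sample_hz - f


def dft_bin_spacing_hz(f_sample_hz, num_samples):
    """Frequency resolution of a DFT taken on num_samples points."""
    return f_sample_hz / num_samples


F_INPUT_HZ = 2_002_011      # input frequency under normal conditions
F_DOWNSAMPLED_HZ = 20_000   # rate at which the SNR hardware collects data
N_SAMPLES = 1024

if __name__ == "__main__":
    for k in range(1, 5):
        print(k, aliased_frequency_hz(k * F_INPUT_HZ, F_DOWNSAMPLED_HZ))
    print(dft_bin_spacing_hz(F_DOWNSAMPLED_HZ, N_SAMPLES))
```

The first four harmonics land at 2011, 4022, 6033, and 8044 Hz; higher harmonics exceed 10 kHz after the first fold and are reflected back into the base band.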

In Figure 6.4, SNR is plotted on the y axis versus input level on the x axis. An ideal 9-bit curve is also shown; the SNR of the ideal curve is limited by quantization noise. The peak SNR is about 50 dB instead of about 56 dB, as would be expected with a 9-bit converter; this difference is accounted for by distortion generated from the INL for large input signal levels. For such signals, if the noise power is dominated by distortion, the \( SNR = THD \), and Equation 6.1 can be used to show that the maximum INL is about 1 LSB here, which agrees with the code-density test results. Because the INL measurement obtained from the code-density test is only accurate within about 2 LSB, this confirmation is important; furthermore, the SNR test only requires about 50 ms to collect the data instead of about 100 s for the code-density test, allowing less opportunity for test conditions that affect the INL to change during the test. When the input signal is reduced in amplitude, the distortion is reduced and the real curve approaches the ideal 9-bit curve, suggesting that the noise associated with the conversion of low-level
input signals is dominated by quantization noise. Although the SNR test provides an easy way to check the maximum INL, it gives little information about DNL and missing codes because the SNR is not measured as a function of the DC offset of low-level sinusoidal inputs. While this connection could be made, the DNL is efficiently measured using the code-density test.
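The ideal 9-bit peak SNR of about 56 dB follows from the standard quantization-noise result for a full-scale sinusoid, in which the signal-to-noise power ratio is $1.5 \cdot 2^{2N}$ (roughly $6.02N + 1.76$ dB). The following sketch, with hypothetical function names, evaluates this expression and the shortfall of the measured 49.4 dB relative to it.

```python
import math


def ideal_snr_db(bits):
    """Quantization-limited SNR of an ideal converter, full-scale sine input."""
    return 10 * math.log10(1.5 * 4 ** bits)


def snr_shortfall_db(bits, measured_snr_db):
    """Difference between the ideal and the measured peak SNR."""
    return ideal_snr_db(bits) - measured_snr_db


if __name__ == "__main__":
    print(round(ideal_snr_db(9), 2))
    print(round(snr_shortfall_db(9, 49.4), 2))
```

The shortfall of roughly 6 dB at full scale is the portion attributed in the text to distortion generated from the INL.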

6.2.3. Summary of Tests under Normal Conditions

The prototype A/D converter has typical characteristics summarized in Table 6.1.
Table 6.1

Typical Performance: 25°C

<table>
<thead>
<tr>
<th>Technology</th>
<th>3-µm CMOS</th>
</tr>
</thead>
<tbody>
<tr>
<td>Resolution</td>
<td>9 bits</td>
</tr>
<tr>
<td>Conversion Rate</td>
<td>5 Ms/s</td>
</tr>
<tr>
<td>Area*</td>
<td>8500 mils²</td>
</tr>
<tr>
<td>Power Supplies</td>
<td>±5 V</td>
</tr>
<tr>
<td>Power Dissipation</td>
<td>180 mW</td>
</tr>
<tr>
<td>Input Capacitance</td>
<td>3 pF</td>
</tr>
<tr>
<td>Input Offset</td>
<td>&lt; 1 LSB</td>
</tr>
<tr>
<td>CM Input Range</td>
<td>±5 V</td>
</tr>
<tr>
<td>DC PSRR</td>
<td>50 dB</td>
</tr>
</tbody>
</table>

*Does not include clock generator, bias generator, reference generator, digital error correction logic, and pads
6.3. Dependence of Code-Density Test Results on Conversion Rate

To determine the source of the nonlinearity in the prototype, the plot of INL versus code shown in Figure 6.2 is examined; however, no pattern is evident. Because a pattern may be present but hidden by a conversion-rate limitation, the experiment is repeated at half the conversion rate (2.5 Ms/s). Figure 6.5 shows the resulting plot of INL versus code.

At half the conversion rate, not only is the maximum INL reduced but also the entire plot is more orderly than at 5 Ms/s. Examination of Figure 6.5 reveals that the plot is divided into 8 segments of equal width (64 codes each) and varying slope, corresponding to the 8 resistors that compose the D/A converter in the first stage. The nonideality in the 5-Ms/s INL curve is therefore caused both by nonlinearity in the first-stage D/A converter and by some conversion-rate limitation. Although the source of the conversion-rate
limitation is not known, the most likely source is incomplete settling of the first-stage op-amp outputs.

6.4. Dependence of Test Results on Input Frequency

Since the linearity and SNR of the prototype may depend on the input signal frequency, both the code-density and SNR tests are repeated for different input signal frequencies. The results of these tests are summarized below. While the input frequency is less than the Nyquist frequency, decreases in the linearity or SNR caused by increases in the input signal frequency can be attributed to dynamic errors in the input S/H amplifier.

6.4.1. Code-Density Test

Figure 6.6 shows a plot of the maximum DNL as a function of the input signal frequency.

Figure 6.6 - Peak DNL vs. input signal frequency

When the input frequency is increased from 2 kHz to 5 MHz, the maximum DNL increases by about 0.1 LSB.

Figure 6.7 shows a plot of the maximum INL as a function of the input signal frequency.

When the input frequency is increased from 2 kHz to 5 MHz, the maximum INL increases by about 0.2 LSB. Because the changes in both the maximum DNL and maximum INL are small for large changes in the input frequency, the dynamic errors in the input S/H amplifier must be small.

6.4.2. Signal-to-Noise Ratio Test

In Figure 6.8, SNR is plotted on the y axis versus input level on the x axis for five input frequencies: 2 kHz, 22 kHz, 202 kHz, 2.002 MHz, and 5.002 MHz. An ideal 9-bit curve is also shown. The curve for a 5.002-MHz input represents a beat frequency test on the converter when compared to the curve for a 2-kHz input because the converter is running at the difference between these two frequencies or 5 Ms/s. Notice that the real curves for different input frequencies are almost identical. Figure 6.9
shows a plot of the maximum SNR as a function of the input signal frequency. When the input frequency is increased from 2 kHz to 5 MHz, the maximum SNR decreases by about 2 dB. Although this change is small, the cause of the change is of interest. To determine the cause of the change, the DFT outputs for various input frequencies are examined. Figure 6.10 shows such a plot for which the input frequency is 2 kHz. The most apparent difference between Figures 6.3 and 6.10 is the amplitude of the second harmonic; in Figure 6.3, the amplitude of the second harmonic is about 33 dB, and in Figure 6.10, it is about 22 dB. Figure 6.11 shows a plot of the amplitude of the second harmonic versus input frequency. For increasing input frequency, the amplitude of the second harmonic increases, suggesting that the distortion generated by the input S/H amplifier increases. To describe the source of the distortion, Figure 5.5 is redrawn here as Figure 6.12. Because the impedances of the sampling capacitors, $C_{S1}$ and $C_{S2}$, are inversely proportional to the input frequency, the fraction of the input voltage that is dropped across the sampling
switches, M1-M4, increases in proportion to the input frequency. When on with a small drain-to-source voltage, a MOS transistor behaves as though it were a linear resistor; however, when the drain-to-source voltage is not small, the on resistance of the transistor is a nonlinear function of the drain-to-source voltage. Therefore, as the input frequency increases, the linearity of the on resistances of the transistors in the S/H circuit decreases and causes the observed distortion in the A/D converter output. To reduce this distortion, the aspect ratio of the sampling switches can be increased; however, such a change causes a corresponding increase in the sample-to-hold-mode transition error. Although this distortion is small at a 9-bit level, it could limit the performance of video-rate A/D conversion at higher resolution.
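The frequency dependence of the voltage dropped across a sampling switch can be illustrated with a first-order model in which the switch is a linear resistor in series with the sampling capacitor. This is only a sketch: the on resistance and capacitance values below are hypothetical and are not taken from the prototype, and the model deliberately ignores the nonlinearity of the on resistance that actually produces the distortion.

```python
import math


def switch_voltage_fraction(freq_hz, r_on_ohm, c_sample_f):
    """Magnitude of V_switch / V_in for a series R-C sampling network."""
    x = 2 * math.pi * freq_hz * r_on_ohm * c_sample_f   # omega * R * C
    return x / math.sqrt(1 + x * x)


R_ON_OHM = 1e3        # HYPOTHETICAL switch on resistance
C_SAMPLE_F = 3e-12    # HYPOTHETICAL sampling capacitance

if __name__ == "__main__":
    for f in (2e3, 22e3, 202e3, 2.002e6, 5.002e6):
        print(f, switch_voltage_fraction(f, R_ON_OHM, C_SAMPLE_F))
```

While the product of angular frequency, resistance, and capacitance is much less than 1, the fraction grows almost exactly in proportion to the input frequency, so a 1000-fold increase in frequency gives close to a 1000-fold increase in the voltage across the switch.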

The results of the code-density and SNR tests for variations in the input frequency are summarized in Table 6.2.

Figure 6.10 - Amplitude of DFT output vs. input frequency (Conversion Rate = 5 Ms/s, Down-Sampled Rate = 20 ks/s, Input Frequency = 2 kHz)

Table 6.2

Data Summary over Input Frequency Variation

<table>
<thead>
<tr>
<th>Input Frequency</th>
<th>2 kHz</th>
<th>2.002 MHz</th>
<th>5.002 MHz</th>
</tr>
</thead>
<tbody>
<tr>
<td>Peak DNL (LSB)</td>
<td>0.5</td>
<td>0.6</td>
<td>0.5</td>
</tr>
<tr>
<td>Peak INL (LSB)</td>
<td>1.0</td>
<td>1.1</td>
<td>1.2</td>
</tr>
<tr>
<td>Peak SNR (dB)</td>
<td>50</td>
<td>50</td>
<td>49</td>
</tr>
</tbody>
</table>

Table 6.2 shows the peak DNL, INL and SNR for three input frequencies, and the performance is almost constant. This is important because it means that the first-stage S/H amplifier is able to accurately sample high-frequency input signals.

6.5. Dependence of Test Results on Power Supply Voltages

Both the code-density and SNR tests are repeated for different power supply voltages. The results of these tests are summarized below.

6.5.1. Code-Density Test

Figure 6.13 shows a plot of the maximum DNL as a function of the input signal frequency with power supply voltages as a parameter. Over the entire range of input frequencies, when the power supply voltages change by ±10%, the maximum DNL changes by about 0.2 LSB. Figure 6.14 shows a plot of the maximum INL as a function of the input signal frequency with power supply voltages as a parameter. Over the entire range of input frequencies, when the power supply voltages change by ±10%, the maximum INL changes by about 0.7 LSB.

Figure 6.12 - Circuit diagram of a fully differential S/H circuit with bottom-plate switching

6.5.2. Signal-to-Noise Ratio Test

Figure 6.15 shows a plot of the maximum SNR as a function of the input signal frequency with power supply voltages as a parameter. The results of the code-density and SNR tests for variations in the power supply voltages are summarized in Table 6.3.

Table 6.3

Data Summary over Supply Variation

<table>
<thead>
<tr>
<th>Supply Voltage (V)</th>
<th>±4.5</th>
<th>±5</th>
<th>±5.5</th>
</tr>
</thead>
<tbody>
<tr>
<td>Peak DNL (LSB)</td>
<td>0.7</td>
<td>0.6</td>
<td>0.5</td>
</tr>
<tr>
<td>Peak INL (LSB)</td>
<td>1.5</td>
<td>1.1</td>
<td>1.0</td>
</tr>
<tr>
<td>Peak SNR (dB)</td>
<td>49</td>
<td>50</td>
<td>50</td>
</tr>
</tbody>
</table>

Except for an increase in the maximum INL at low power supply voltages, the performance is nearly constant. The increased INL at low supply voltages may be caused by reduced open-loop gain of the op amps at high output swings.
6.6. Dependence of Code-Density Test Results on Digital Correction

As mentioned in Chapter 5, the digital correction is done off chip, allowing the need for the correction to be evaluated. All results described above are obtained using the full correction; that is, digital correction is applied to the first 3 stages for these results.

Under the same conditions as in Figures 6.1 and 6.2 but with the digital correction completely disabled, the maximum DNL and INL are about 10 LSB at a 9-bit level, owing to comparator offsets. If the correction is applied only on the first stage, the maximum DNL and INL drop to about 3 LSB. When digital correction is applied on the first two stages, the maximum DNL is about 0.9 LSB and the maximum INL is about 1.5 LSB; therefore, there are no missing codes in this case. However, when digital correction is applied on the first three stages, the maximum DNL is about 0.6 LSB and the maximum INL is about 1.1 LSB. Because the nonlinearities are minimum in the last case, digital correction should be
fully applied for best performance. Also, the uncorrected histogram data from the code-density test show that there are no codes for which any residue is greater than the reference level for comparator C1 or less than the reference level for comparator C7 as labeled in Figure 5.12. This means that the maximum absolute value of nonlinearity in an A/D subconversion is less than or equal to 1/4 LSB at a 3-bit level, and the full digital correction range (±1/2 LSB) is not used. Therefore, comparators C1 and C7 are not needed in the last 3 stages.
Chapter 7 - Conclusion

7.1. Summary of Research Results

This research shows that pipelined architectures and digital correction techniques are of potential interest for video-rate CMOS A/D conversion applications. The research results are summarized in two parts, those involving the comparison of different conversion architectures and those involving the design of pipelined A/D converters.

7.1.1. Comparison of A/D Conversion Architectures

- High throughput rate and low hardware cost are the primary potential advantages of the pipelined architecture.

- If the A/D subconversions in each stage of a pipelined converter are done with flash converters, a pipelined architecture only needs two clock phases per conversion, the same as a flash architecture.

- Flash architectures use pipelining to do the digital decoding operation. Their throughput rate is maximized because their pipelined information is entirely digital and can be transferred to 1-bit accuracy in less time than it takes to generate and transfer the analog residue in pipelined multistage architectures.

- The area and consequent manufacturing cost of pipelined converters is small compared to those of flash converters because pipelined converters require fewer comparators than flash converters.

- Once interstage S/H amplifiers are designed to isolate the stages of a pipelined A/D converter, a similar S/H amplifier can be used at the input of the A/D converter, providing accurate sampling and allowing accurate conversion of high-frequency input signals.

- The interstage gains from the S/H amplifiers diminish the effects of nonidealities in all stages after
the first stage on the linearity of the entire conversion. This allows pipelined converters to use a
digital correction technique with which nonlinearity in the A/D subconversions has little effect on
the overall linearity.

- The main disadvantage of pipelined A/D converters is that they require the use of op amps to realize parasitic-insensitive S/H amplifiers. Although the S/H amplifiers improve many aspects of the converter performance, the op amps within the S/H amplifiers limit the speed of the pipelined converters.

- Op amps are not required in subranging architectures. Because high-speed op amps are difficult to realize, a common goal in the design of subranging A/D converters is to avoid using op amps. If op amps are not used, however, parasitic-insensitive S/H amplifiers cannot be realized. Consequently, high-frequency input sampling is poor, stage operation is sequential, and tolerance to error sources is less than that in pipelined architectures.

- Flash converters also do not usually use an input S/H amplifier because of the difficulty in realizing an op amp in CMOS technologies that is fast enough to drive the inherently large input load; therefore, flash converters often suffer reduced performance with high-frequency inputs.
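The comparator-count advantage noted in the list above can be made concrete with a small calculation. The sketch assumes the usual $2^N - 1$ comparators for an $N$-bit flash converter and 7 comparators (C1 through C7) per 3-bit stage of the prototype; the function names and the `trimmed_stages` parameter are hypothetical.

```python
def flash_comparators(bits):
    """Comparators in a conventional N-bit flash converter."""
    return 2 ** bits - 1


def pipelined_comparators(stages, bits_per_stage, trimmed_stages=0):
    """Comparators in a pipeline of flash subconverters.

    trimmed_stages is the number of stages whose top and bottom
    comparators are omitted (two comparators saved per such stage).
    """
    per_stage = 2 ** bits_per_stage - 1
    return stages * per_stage - 2 * trimmed_stages


if __name__ == "__main__":
    print(flash_comparators(9))                           # 9-bit flash
    print(pipelined_comparators(4, 3))                    # 4 stages, 3 bits each
    print(pipelined_comparators(4, 3, trimmed_stages=3))  # C1, C7 dropped in last 3
```

For a 9-bit conversion this gives 511 comparators for the flash architecture against 28 for the four-stage pipeline, or 22 if the top and bottom comparators of the last three stages are removed.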

7.1.2. The Design of Pipelined A/D Converters

- Although the prototype uses identical stages to reduce the design time, nonidentical stages can be
used to optimize the performance.

- Digital correction should be used on all stages except the last, on which digital correction cannot be
applied.

- To use digital correction effectively, the interstage gain must be greater than 1 so that nonlinearity
in the last stage, which cannot be corrected, has little effect on the linearity of the entire
conversion. If the gain is integer valued, it must be greater than or equal to 2. Therefore, to use one bit of digital correction per stage, the resolution per stage must be greater than or equal to 2 bits. At this level, the conversion rate is maximized because the bandwidth of the interstage gain is largest; the linearity, however, increases with increasing resolution per stage.

- To use the minimum area, about 3-bit resolution per stage should be used, providing a compromise between the conversion rate and linearity requirements.

- The prototypes used no more than half the correction range. The top and bottom comparators in the comparator arrays of the last 3 stages are therefore unnecessary.

- Since digital correction refers op-amp offset errors to the input of the converter, cancellation of op-amp offset is not required to obtain high linearity. As a result, the op amp is not placed in a unity-gain-feedback configuration to cancel the offset; therefore, the op amp does not have to be unity-gain stable, and its speed can be optimized for its closed-loop gain.

- Dynamic common-mode feedback for the fully differential op amp is found to be well suited for op amps in pipelined A/D converters because such converters inherently allow a clock phase needed to reset the bias on the common-mode-feedback capacitors.

7.2. Projected Performance in Scaled Technologies

The prototype is implemented in a 3-micron CMOS technology and uses ±5-V power supplies. If the prototype were implemented in a 1.5-micron CMOS technology using only a 5-V power supply, the horizontal and vertical electric fields present in the transistors would be unchanged from those in the 3-micron version. In general, with constant-field scaling, while the horizontal dimensions, vertical dimensions, and power-supply voltages are divided by a constant factor, $k$, the substrate doping is multiplied by the same factor. The horizontal dimensions include the width, $W$, and length, $L$, of the channels of the transistors, and the vertical dimensions include the thicknesses of the layers such as the gate oxide. As a result of such scaling, the threshold voltages, $V_t$, and depletion layer widths, $x_d$, are divided by the factor,
and the oxide capacitance per unit area, \( C_{ox} \), is multiplied by the same factor. Under these conditions, the gain of MOS amplifiers does not change as a function of the scaling factor, \( k \).\(^{63,64}\) In the presence of an extra punch-through-prevention implant, moreover, such scaling increases the gain. Furthermore, constant-field scaling should increase the unity-current-gain frequency of the transistors by the scaling factor, \( k \).\(^{63}\) With the same gain and increased bandwidth, the speed of the S/H amplifiers should be increased. If a punch-through implant is used and increases the gain of the amplifiers, the extra gain can be traded for a further increase in amplifier speed. Because the maximum conversion rate is limited by the speed of the amplifiers, scaling should increase the maximum conversion rate. Video conversion rates should be attainable in 1.5-2-micron CMOS technologies.
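The constant-field scaling rules described above can be summarized in a short sketch. The class and field names are hypothetical; quantities other than the feature size and supply span are tracked as values relative to the 3-micron prototype, and the ±5-V supplies are represented as a total span of 10 V.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessPoint:
    feature_size_um: float                 # horizontal dimension (W, L scale alike)
    supply_span_v: float                   # total supply span
    substrate_doping_rel: float = 1.0
    threshold_voltage_rel: float = 1.0
    oxide_cap_per_area_rel: float = 1.0
    unity_gain_freq_rel: float = 1.0
    amplifier_gain_rel: float = 1.0


def constant_field_scale(p, k):
    """Apply constant-field scaling by the factor k."""
    return ProcessPoint(
        feature_size_um=p.feature_size_um / k,               # dimensions / k
        supply_span_v=p.supply_span_v / k,                   # voltages / k
        substrate_doping_rel=p.substrate_doping_rel * k,     # doping * k
        threshold_voltage_rel=p.threshold_voltage_rel / k,   # V_t / k
        oxide_cap_per_area_rel=p.oxide_cap_per_area_rel * k, # C_ox * k
        unity_gain_freq_rel=p.unity_gain_freq_rel * k,       # f_T * k
        amplifier_gain_rel=p.amplifier_gain_rel,             # gain unchanged
    )


PROTOTYPE = ProcessPoint(feature_size_um=3.0, supply_span_v=10.0)

if __name__ == "__main__":
    print(constant_field_scale(PROTOTYPE, 2.0))
```

Scaling the prototype by a factor of 2 yields a 1.5-micron feature size and a 5-V supply, with unchanged amplifier gain and a doubled unity-current-gain frequency, matching the case considered in the text.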

7.3. Extensions to Increased Resolution and Linearity

Not only is the area small for pipelined converters, but also while maintaining a constant number of required clock phases per conversion, the area is linearly related to the resolution. For example, to increase the resolution of the prototype from 9 bits to 11 bits, one extra stage of conversion is needed, and the required area is increased by about 25%. Although the number of clock phases per conversion is constant, the duration of each phase may have to be increased to allow more accurate sampling and settling, reducing the conversion rate. Because the linearity of the prototype is limited by the linearity of the D/A converter in the first stage, and because the raw matching of MOS components produces about 8-9-bit linearity, the first-stage D/A converter must be calibrated or trimmed to increase the linearity of the conversion. Self-calibration of high-resolution successive approximation A/D conversion to achieve high linearity has been demonstrated.\(^{65}\) The application of this principle to pipelined conversion is a current research topic.
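The linear relationship between area and resolution can be illustrated with a small calculation. The formula below is inferred from the structure of the prototype (3-bit stages with one bit of overlap between adjacent stages for digital correction) and assumes identical stages of equal area; the function names are hypothetical.

```python
def total_resolution_bits(stages, bits_per_stage, overlap_bits=1):
    """Resolution of a pipeline whose adjacent stages overlap by overlap_bits."""
    return stages * bits_per_stage - (stages - 1) * overlap_bits


def relative_area(stages, baseline_stages):
    """Area relative to a baseline pipeline, assuming identical stages."""
    return stages / baseline_stages


if __name__ == "__main__":
    print(total_resolution_bits(4, 3))   # prototype
    print(total_resolution_bits(5, 3))   # one extra stage
    print(relative_area(5, 4))           # area relative to the prototype
```

Four stages give 9 bits, five stages give 11 bits, and the fifth stage increases the area by a factor of 1.25, consistent with the 25% figure quoted above.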

7.4. Extensions to Increased Conversion Rate, Resolution, and Linearity

Although scaling should increase the conversion rate with constant resolution and linearity, and although calibration should increase the linearity at increased resolution but reduced conversion rate, the combination of scaling and calibration may not be enough to simultaneously achieve high speed and high
linearity. As described in Chapter 6, a limitation in the accuracy of a S/H circuit exists owing to a trade-off between distortion and sample-to-hold-mode transition error that becomes increasingly important at increasing input frequencies. Overcoming this speed versus accuracy tradeoff in the S/H circuit is a future research topic.
References


34. *U. S. Patent Number 3,946,432.*


55. R. Castello, Low-Voltage Low-Power MOS Switched-Capacitor Signal-Processing Techniques,


