# Copyright © 1985, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

# TIMING RECOVERY IN DIGITAL SUBSCRIBER LOOPS

by

C-P. Tzeng

Memorandum No. UCB/ERL M85/29

22 April 1985

**ELECTRONICS RESEARCH LABORATORY** 

College of Engineering University of California, Berkeley 94720

#### TIMING RECOVERY IN DIGITAL SUBSCRIBER LOOPS

### Chin-Pyng Jeremy Tzeng

Ph.D.

Department of Electrical Engineering and Computer Sciences

Sponsors:

Advanced Micro Devices, Fairchild Semiconductor, Harris Corp., National Semiconductor, and Racal-Vadic, with a matching grant from the University of California's MICRO (Microelectronics and Computer Research Opportunities) program.

Committee Chairman

#### **ABSTRACT**

Timing recovery is a critical function that must be provided to synchronize a digital subscriber loop communication link. Low-cost electronics for digital subscriber loop systems will be achieved with VLSI technology. Sampled-data techniques are the most practical means of obtaining the necessary signal processing functions in VLSI form.

This thesis reports on studies of two sampled-data timing recovery techniques: the wave difference method and the baudrate sampling technique. The major goal is the integrated circuit implementation of these techniques. The design issues concerning the interrelated problems of echo cancellation, equalization, and timing recovery are addressed. The roles played by the line code and pulse shape in timing recovery are reported. The impact of these timing recovery methods on the implementation of the echo canceller and on the performance of the digital subscriber loop system is also presented.

Computer simulations indicate that the performance of both timing recovery approaches is satisfactory. Experimental results for the baudrate sampling technique agree well with the simulation results. A fully-integrated digital subscriber loop transceiver is concluded to be feasible.

I would like to express my deepest thanks to Professors David A. Hodges and David G. Messerschmitt for their guidance and support throughout the course of my research work. I am also grateful to Dr. Oscar Agazzi, who pioneered the work on digital subscriber loops at Berkeley, for his technical assistance.

I thank Professor Robert W. Brodersen for letting me use the silicon compiler developed in his group to build an experimental chip, and Jan Rabaey for helping me code the silicon compiler. It is unfortunate that the chip is still undergoing testing and thus is not mentioned in this thesis. The help from Peter Ruetz on designing the digital VCO chip and from Ho-Ping Tseng on preparing the figures, and the valuable discussions with my colleagues Graham Brand and Nan-Sheng Lin and with many other graduate students here at Berkeley, have been very useful.

I also wish to thank Peter O'Riordan and Peter Winship for constructing the experimental breadboard systems. Without the cables provided by Pacific Telephone, the experiment would not have been complete.

This work was supported by Advanced Micro Devices, Fairchild Semiconductor, Harris Corp., National Semiconductor, and Racal-Vadic, with a matching grant from the University of California's MICRO program.

# TABLE OF CONTENTS

| Section   | Title                                                                      |
|-----------|----------------------------------------------------------------------------|
| Chapter 1 | Introduction                                                               |
| 1.1       | Digital Subscriber Loops                                                   |
| 1.1.1     | Comparison Between TCM and Echo Cancellation Systems                       |
| 1.2       | Line Coding and Pulse Shaping                                              |
| 1.2.1     | Digital Line Coding                                                        |
| 1.2.2     | Pulse Shaping                                                              |
| 1.2.3     | Combining Line Coding and Pulse Shaping                                    |
| 1.3       | Echo Cancellation                                                          |
| 1.3.1     | Echo Canceller Structures                                                  |
| 1.3.2     | Fully Analog Approach                                                      |
| 1.3.3     | Implementation Alternatives for the Fully Analog Approach                  |
| 1.3.4     | Other Transversal Filter Structures                                        |
| 1.4       | Summary                                                                    |
| Chapter 2 | System Design                                                              |
| 2.1       | Design Parameters and Techniques                                           |
| 2.1.1     | Error Rate Objectives                                                      |
| 2.1.2     | Echo Cancellation                                                          |
| 2.1.2.1   | Interpolation Echo Cancellers                                              |
| 2.1.3     | Echo Cancellation Requirement                                              |
| 2.1.4     | Equalization                                                               |
| 2.1.5     | Timing Recovery                                                            |
| 2.1.6     | Synchronization in Subscriber and Central Office Sets                      |
| 2.2       | System Description                                                         |
| 2.2.1     | Scrambler and Descrambler                                                  |
| 2.2.2     | Line Coder                                                                 |
| 2.2.3     | Transmit and Receive Filters                                               |
| 2.2.4     | Equalizer                                                                  |
| 2.2.5     | Echo Canceller                                                             |
| 2.2.6     | Detector                                                                   |
| 2.2.7     | Timing Recovery                                                            |
| 2.2.7.1   | Timing Function Generator                                                  |
| 2.2.7.2   | Loop Filter and VCO                                                        |
| Chapter 3 | Timing Recovery                                                            |
| 3.1       | Conventional Timing Recovery Methods                                       |
| 3.2       | Optimum Timing Phase                                                       |
| 3.3       | On the Timing Phase Recovered by the Spectral Line Method                  |
| 3.3.1     | The Importance of Timing Phase                                             |
| 3.3.2     | Timing Phase Recovered by the Spectral Line Method                         |
| 3.3.3     | Prefiltering                                                               |
| 3.3.4     | Summary                                                                    |
| 3.4       | Timing Recovery in DSL                                                     |
| 3.4.1     | Practical Considerations                                                   |
| 3.4.1.1   | Echo Canceller                                                             |
| 3.4.1.2   | Line Impairments                                                           |
| 3.4.1.3   | Equalization Techniques                                                    |
| 3.4.2     | Objectives of Timing Recovery                                              |
| 3.4.3     | Summary                                                                    |
| Chapter 4 | Phase-Locked Loops                                                         |
| 4.1       | Timing Jitter Effect on Performance of Echo Canceller                      |
| 4.2       | Residual Error in Echo Canceller due to Timing Jitter                      |
| 4.3       | PLL                                                                        |
| 4.3.1     | Analog PLL                                                                 |
| 4.3.2     | DPLL                                                                       |
| 4.4       | Voltage Controlled Oscillator (VCO)                                        |
| 4.4.1     | Analog VCO                                                                 |
| 4.4.1.1   | Multivibrator Analog VCO                                                   |
| 4.4.1.2   | Crystal VCO (VCXO)                                                         |
| 4.4.2     | Digital VCO (DVCO)                                                         |
| Chapter 5 | Equalization                                                               |
| 5.1       | Performance Bounds                                                         |
| 5.1.1     | Matched-Filter Bound in an Ideal Channel                                   |
| 5.1.2     | Matched-Filter Bound in a Nonideal Channel                                 |
| 5.1.3     | Optimum Linear PAM Receiver                                                |
| 5.1.4     | Optimum PAM Receiver with DFE                                              |
| 5.2       | Linear Transversal Filter                                                  |
| 5.3       | Decision Feedback Equalizer (DFE)                                          |
| 5.4       | Summary                                                                    |
| Chapter 6 | Wave Difference Method (WDM)                                               |
| 6.1       | Analysis of the Wave Difference Timing Recovery Technique                  |
| 6.1.1     | The Wave Difference Method                                                 |
| 6.1.2     | WDM Frequency Detector                                                     |
| 6.1.3     | Comparison of the WDM with the Spectral Line Method                        |
| 6.2       | Performance of the WDM                                                     |
| 6.3       | An Example of Timing Recovery Design                                       |
| 6.4       | Equalization                                                               |
| 6.5       | Summary                                                                    |
| Chapter 7 | Baudrate Sampling Technique (BST)                                          |
| 7.1       | Line Coding and Pulse Shaping                                              |
| 7.1.1     | Line Coding                                                                |
| 7.1.2     | Pulse-shaping                                                              |
| 7.1.2.1   | Pulse-shaping by digital coding                                            |
| 7.1.2.2   | Pulse-shaping by analog filtering                                          |
| 7.2       | Timing Function                                                            |
| 7.3       | Timing Jitter Analysis                                                     |
| 7.3.1     | Timing Jitter Model                                                        |
| 7.3.2     | Jitter Spectrum                                                            |
| 7.4       | Equalization/Data Detection                                                |
| 7.5       | Subscriber End Transceiver                                                 |
| 7.6       | Central Office End Transceiver                                             |
| 7.7       | Integrated Circuit Implementation                                          |
| 7.7.1     | Digital Timing Recovery and DFE                                            |
| 7.7.2     | Analog Timing Recovery and Digital DFE Adaptation with Analog Cancellation |
| 7.7.3     | Fully Analog DFE and Timing Recovery                                       |
| 7.8       | Simulation Results                                                         |
| 7.9       | Experimental Results                                                       |
| 7.10      | Summary                                                                    |
| Chapter 8 | Conclusions                                                                |
| 8.1       | Comparison of WDM and BST                                                  |
| 8.2       | Computer Simulation of Echo Canceller Alternatives                         |
| 8.3       | 2-stage echo canceller                                                     |


## CHAPTER 1

#### Introduction

Under evolving plans, the Integrated Services Digital Network (ISDN) will be introduced into use over the coming years. At present, public-switched telephone networks (PSTNs) are generally becoming more digital in nature, with the increasing use of 64 kb/s Pulse Code Modulation (PCM) switching and transmission. This trend will gradually lead to a multi-service ISDN in which end-to-end digital connectivity exists and through which a multiplicity of services (voice and non-voice) are accessible by the customer.

An ISDN can be regarded as a general-purpose digital network capable of supporting a wide range of services (such as voice, data, text, and image) using a small set of standard multipurpose user-network interfaces. The work of setting standards on various aspects of ISDNs, to prepare an orderly evolution into ISDN, has been taken up by the International Telegraph and Telephone Consultative Committee (CCITT).

#### 1.1. Digital Subscriber Loops

The digital subscriber loop (DSL) is an important element of the ISDN. Integrated voice and data services will be provided to the customer over a common facility based upon existing twisted-pair cables. The data rate under consideration for the digital subscriber loop is 144 kb/s, which includes provision for two voice/data channels at 64 kb/s each plus a data channel at 16 kb/s. As shown in Figure 1.1, representing a typical application of DSL, voice transmission requires a codec and filter to perform the analog-to-digital and digital-to-analog conversions on the customer premises, together with a modem for transmitting the full-duplex data stream over the two-wire subscriber loop. The central office end of the loop has another full-duplex modem, with connections to the digital central office switch for voice or circuit-switched data transmission, and to data networks for packet-switched data transport capability. Throughout this thesis, the term "DSL modem", representing a combination of modulator and demodulator, will be used interchangeably with the term "DSL transceiver", representing a combination of transmitter and receiver.

Figure 1.1. A typical application of DSL, where two voice channels at 64 kb/s each and one data channel at 16 kb/s can be transmitted simultaneously in full-duplex fashion.

Full-duplex transmission on a single pair of wires requires a means of separating the signals in the two transmission directions. Two competing methods for the baseband full-duplex data modems are time-compression multiplex (TCM), also known as the burst mode or ping-pong technique, and the echo-cancellation (or hybrid mode) technique.

A TCM system divides the common medium in time, transmitting first in one direction and then in the other. The two ends transmit bursts alternately, as shown in Fig. 1.2. A guard time is necessary between the bursts to ensure that the residual reflections from a burst are not detrimental to the reception of far-end data. These reflections die down rapidly after the end of a burst, but lingering effects remain because of mismatch and bridged taps. The required line data rate is, therefore, more than twice the data bit rate in each direction.
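The guard-time argument can be made concrete with a small calculation. The sketch below is illustrative only: the frame model (each direction gets half a frame, less one guard time) and the numerical values are assumptions made here, not parameters taken from this thesis.

```python
def tcm_line_rate(data_rate_bps, frame_s, guard_s):
    """Line rate needed when each direction gets half a frame minus a guard time.

    Assumed frame model: frame = 2 * (burst + guard).
    """
    burst_s = frame_s / 2.0 - guard_s           # time actually available for one burst
    if burst_s <= 0:
        raise ValueError("guard time leaves no room for a burst")
    bits_per_frame = data_rate_bps * frame_s    # bits one direction must carry per frame
    return bits_per_frame / burst_s


# Hypothetical numbers: 144 kb/s each way, 1 ms frame, 50 us guard time per burst.
rate = tcm_line_rate(144e3, 1e-3, 50e-6)
print(f"line rate = {rate / 1e3:.1f} kb/s, ratio to data rate = {rate / 144e3:.3f}")
```

With zero guard time the ratio is exactly two; any nonzero guard time pushes it above two, which is the point made in the text.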

The alternative echo-cancellation mode system is shown in Figure 1.3. Both transmitters operate continuously. The signal received from the far end is corrupted by a large component of near-end signal, resulting from near-end hybrid unbalance, reflection from the far-end hybrid, or reflection from possible bridged taps. A self-adapting echo canceller, one in each modem, synthesizes a replica of the echo using the transmitted data. This replica is subtracted from the total received signal, so that only the desired far-end signal enters each receiver. The major advantage associated with this method is that the line rate is equal to the data bit rate (assuming binary data transmission). Overall, the echo canceller and subtracter must reduce the unwanted signal by 50-60 dB. This is a challenging requirement.

Figure 1.2. A time compression multiplexing (TCM) mode DSL system, where time is divided into slots for the two sides to transmit one burst at a time.

Figure 1.3. A block diagram of an echo cancellation DSL system, also called a hybrid mode system. An echo canceller is used to separate the two-way signals on the same pair of wires.

There has been considerable work on techniques for and realization of the echo canceller for DSL systems [1-6]. Large scale integration of the echo canceller for hybrid-mode modems has been shown to be quite feasible [1]. Nonlinearity introduced by asymmetry of the transmitted pulse, by A-D or D-A conversions, by nonlinear transformers, and by similar sources can also be compensated with a nonlinear echo cancellation technique [2].

This thesis focuses on timing recovery in DSL. The detailed considerations and requirements for timing recovery in DSL are described in Chapter 3. The emphasis is on timing recovery techniques that are suitable for integrated circuit realization. Sampled-data techniques are the most practical means of obtaining the necessary signal processing functions in VLSI form. These functions include transmit and receive filtering, line coding, echo cancellation, equalization, and data detection. The two sampled-data timing recovery methods to be described in later chapters, the wave difference method and the baudrate sampling technique, are applicable to both the burst mode and the echo cancellation systems. They also apply to a variety of line codes.

#### 1.1.1. Comparison Between TCM and Echo Cancellation Systems

The echo cancellation DSL system is more complex to implement because of the echo canceller and the stringent requirements placed on it. However, the following points also enter into the performance comparison of the two systems.

1) Line attenuation. The line rate in a TCM system is greater than twice the line rate of an echo-canceller system with the same data rate, because of the need for a guard time and the time-domain multiplexing in the burst-mode system. The line attenuation is thus higher in the burst-mode system.

2) Penetration range. Since the maximum transmitted level is limited by compatibility with other services, and the minimum received level is limited by impulse noise and/or crosstalk from other services, the maximum range of penetration is lower for burst-mode than for echo-cancellation.

3) Crosstalk limitation. Reference [7] gives the results of a very detailed analysis of crosstalk and impulse noise limitations for both systems. Fig. 1.4, reproduced from that reference, shows that the range of an echo-cancellation system is typically about 7 dB larger than that of a burst-mode system. It is also deduced from this reference that crosstalk may be the most crucial limiting factor in terms of the range and data rate of a DSL system.

#### 1.2. Line Coding and Pulse Shaping

Line coding techniques have been extensively studied since the emergence of local networks and the ongoing evolution of the public telecommunication network to digital service. Line coding, also referred to as digital line coding or digital signaling, is usually used in broadband communication systems where the bandwidth of the channel is much wider than the spectrum of the signals. Examples are baseband local networks, including low bit rate data transmission on twisted-pairs and Ethernet on coaxial cables, and short loop digital PBX connections for terminals, hosts, and digital phones. In such systems, the transmission media cause little distortion of the pulse due to their wide bandwidth, and digital waveforms can be transmitted and received with little or no distortion.

However, in transmission systems where the channel bandwidth is limited, digital transmission with analog waveforms is more reliable. In such cases, pulse shaping filtering provides the analog waveform. Nyquist waveforms, with limited spectrum and zero intersymbol interference at the symbol intervals, are typical examples of shaped pulses. The filtering involved is generally low-pass filtering. In addition to limiting the spectrum of the pulses, it can also filter out the high frequency noise picked up along the transmission channel. The filtering, if placed at the transmitter, limits the high frequency components of the pulses, which can interfere with other transmission systems. If placed at the receiver, it can eliminate out-of-band noise, which could come from other transmission systems. A split form of filtering, half at the transmitter and the complex conjugate at the receiver, can also provide matched filtering for optimizing the signal-to-noise ratio.

Figure 1.4. Range of TCM and echo cancellation (hybrid mode) systems, as limited by the far end crosstalk (FEXT) and the near end crosstalk (NEXT). Reproduced from reference [7].

A proper combination of digital line coding technique with a pulse shaping function can preserve the advantages of the two, and is the preferred system configuration.

#### 1.2.1. Digital Line Coding

A digital signal is a sequence of discrete, discontinuous voltage pulses. Each pulse is a "symbol". The repetition rate of these pulses is called the symbol rate, or more commonly, the baud rate. The two important characteristics of a digital signal to be determined at the receiving end are the timing information and the data sequence. The associated properties which affect the determination of timing and data at the receiving end are spectrum bandwidth, error-detection capability, signal synchronization capability, and immunity to the channel impairments.

The line coding schemes which are of interest to us fall into the following categories: Non-Return-to-Zero (NRZ), Return-to-Zero (RZ), Biphase (or Manchester), Bipolar (or AMI), Partial-Response (PR), and Wal-2 code. NRZ and RZ codes, because of their binary nature, are also classified as binary coding. Each category has its own variations. The details of these line codes are illustrated in Fig. 1.5. Some aspects of these line codes are discussed below.



Figure 1.5. Summary of various line codes.

Even in broadband transmission, it is desirable to keep the signal bandwidth as low as possible. Bandwidth is an expensive asset, because noise and interference increase with the bandwidth required for the signal. The elimination of the DC component in the signal spectrum is also an important factor if AC coupling is to be used. NRZ is the simplest code, and it also uses the bandwidth efficiently. The main limitations of NRZ are the presence of the DC component and the lack of synchronization capability. RZ is basically the same as NRZ except that it takes more bandwidth. Bipolar code is a nonlinearly processed version of NRZ that removes the DC component by ensuring that successive pulses alternate between positive and negative polarity. Note the 3-level property of bipolar coding and the correlation between the various levels. Manchester and Wal-2 codes are DC free but require more bandwidth. The various partial-response codes have different properties. Generally, they require less bandwidth and 3-level transmission. Some PR codes also have no DC component.

Manchester code uses more bandwidth to ensure a predictable transition in the middle of each interval. The receiver can use this information for timing recovery and error detection. However, if the channel introduces substantial pulse distortion, the property will be lost.
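The DC and transition properties described above can be checked on a short bit pattern. The following is a minimal sketch, not a reproduction of Fig. 1.5: each bit interval is represented by two half-interval levels, and the polarity conventions (in particular which Manchester half is high for a 1) are assumptions chosen here for illustration.

```python
def nrz(bits):
    """Polar NRZ: +1 for a 1 and -1 for a 0, held for the whole bit interval."""
    return [lvl for b in bits for lvl in ((1, 1) if b else (-1, -1))]


def manchester(bits):
    """Biphase: assumed convention 1 -> high then low, 0 -> low then high."""
    return [lvl for b in bits for lvl in ((1, -1) if b else (-1, 1))]


def ami(bits):
    """Bipolar (AMI): 0 -> zero level, successive 1s alternate in polarity."""
    out, polarity = [], 1
    for b in bits:
        if b:
            out += [polarity, polarity]
            polarity = -polarity
        else:
            out += [0, 0]
    return out


def dc_level(wave):
    """Average value of the waveform over the pattern."""
    return sum(wave) / len(wave)


bits = [1, 1, 0, 1, 1, 1, 0, 1]
for name, coder in (("NRZ", nrz), ("Manchester", manchester), ("AMI", ami)):
    print(name, "DC level:", dc_level(coder(bits)))
```

For this pattern, which has more 1s than 0s, NRZ carries a nonzero average, while the Manchester and AMI waveforms average to zero. The Manchester waveform also changes level in the middle of every bit interval, which is the property a receiver can use for timing.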

For the same dynamic range, the codes with 3 levels lose some signal-to-noise ratio if the noise level is constant. This results because the distance between any two adjacent levels in a 3-level transmission is only half that of the 2-level transmission if the total dynamic range is fixed. Considering the bandwidth, the less bandwidth a code takes, the more immune to noise it is.
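The size of this loss follows directly from the level spacing. The sketch below assumes equally spaced levels over a fixed peak-to-peak range and a constant noise level, as in the argument above; the function names are introduced here for illustration only.

```python
import math


def level_spacing(num_levels, dynamic_range=2.0):
    """Distance between adjacent, equally spaced levels over a fixed range."""
    return dynamic_range / (num_levels - 1)


def margin_loss_db(num_levels):
    """Loss in noise margin relative to 2-level transmission, in dB."""
    return 20.0 * math.log10(level_spacing(2) / level_spacing(num_levels))


print(f"3-level loss: {margin_loss_db(3):.2f} dB")
```

Halving the distance between adjacent levels corresponds to a loss of about 6 dB under these assumptions.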

Although digital line coding is more frequently used in the broadband case where equalization is generally not necessary, some codes provide advantages with respect to equalization. This property becomes more important when the line coding is used with subsequent pulse shaping for a channel with limited bandwidth. *Intersymbol interference (ISI)* caused by the dispersion of pulses is one of the major obstacles to error-free transmission. Pulses normally experience trailing (postcursor) dispersion in most channels. If a symbol has equal positive and negative pulses, the trailing tails tend to cancel, so the resulting effective channel impulse response is shorter and the ISI is reduced. This property will be referred to as "self-equalizing", and will be exploited further in later chapters. Manchester and Wal-2 codes have this property, as do some forms of partial-response.
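The tail cancellation can be illustrated numerically. The sketch below is only a toy model: the channel is assumed to have a single exponential tail with a time constant of two symbol intervals, which is an assumption made here and not the cable model used in this thesis. It compares the baud-spaced postcursor samples of a full-width (NRZ) pulse with those of a biphase pulse.

```python
import math

M = 8            # samples per symbol interval (assumption)
TAU = 2.0 * M    # channel time constant: two symbol intervals (assumption)
SPAN = 12 * M    # length of the simulated channel impulse response

channel = [math.exp(-n / TAU) for n in range(SPAN)]


def convolve(x, h):
    """Plain discrete convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def postcursor_ratio(pulse):
    """Sum of |baud-spaced samples| after the peak, relative to the peak."""
    rx = convolve(pulse, channel)
    peak = max(range(len(rx)), key=lambda n: abs(rx[n]))
    tail = sum(abs(rx[n]) for n in range(peak + M, len(rx), M))
    return tail / abs(rx[peak])


nrz_pulse = [1.0] * M
biphase_pulse = [1.0] * (M // 2) + [-1.0] * (M // 2)

print("NRZ postcursor ratio:    ", round(postcursor_ratio(nrz_pulse), 3))
print("biphase postcursor ratio:", round(postcursor_ratio(biphase_pulse), 3))
```

In this toy channel the positive and negative halves of the biphase pulse produce tails of opposite sign that largely cancel, so its relative postcursor content is markedly smaller than that of the full-width pulse.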

Some of the characteristics mentioned above change if pulse shaping follows the line coder. For example, the signal spectrum will be the overall spectrum of the line code and the filtering.

Convolutional coding, or trellis coding, inserts redundancy into the data sequence. In conjunction with maximum likelihood detection, which can easily be realized with a Viterbi decoder, trellis coding offers superior performance to uncoded signals in a noisy environment [54].

If adaptive filtering is used, whether in the equalization or in the echo cancellation, the convergence characteristic is a function of the signal spectrum and the correlation within the data sequence. It will be shown later that AMI coding, although it removes the DC component, is inferior to binary or partial-response coding in convergence speed.

#### 1.2.2. Pulse Shaping

Overlapping of pulses may cause intersymbol interference. Nonoverlapping requires that a pulse be confined to one time interval. A signal limited in time requires infinite bandwidth. However, symbols can overlap and yet still be free of intersymbol interference, because in a sampled-data system intersymbol interference is only defined at certain time instants. Decisions in data communication are made in discrete time. It turns out that pulses satisfying the Nyquist criterion have zero crossings at multiples of a certain time interval and have limited bandwidth. Nyquist pulses have been widely used in data communication due to these properties. If the timing can be properly recovered and if there is no distortion in transmission, data can be received without intersymbol interference. In the presence of channel distortion, some means of equalization must be used.
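As an illustration of a pulse that overlaps its neighbours yet causes no interference at the decision instants, the sketch below evaluates a raised-cosine pulse. The raised cosine is chosen here as a familiar example of a Nyquist pulse; the roll-off value of 0.35 is an arbitrary assumption, and the text above does not single out this particular pulse.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def raised_cosine(t, T=1.0, beta=0.35):
    """Raised-cosine pulse with symbol interval T and roll-off beta."""
    x = t / T
    denom = 1.0 - (2.0 * beta * x) ** 2
    if abs(denom) < 1e-12:                  # removable singularity at |t| = T / (2 beta)
        return (math.pi / 4.0) * sinc(1.0 / (2.0 * beta))
    return sinc(x) * math.cos(math.pi * beta * x) / denom


# Samples at the symbol instants: 1 at t = 0, (numerically) 0 at every other multiple of T.
for k in range(-3, 4):
    print(k, round(raised_cosine(float(k)), 12))

# Between symbol instants the pulse is not zero, so timing errors reintroduce ISI.
print("t = T/2:", round(raised_cosine(0.5), 4))
```

The nonzero value between symbol instants is what makes proper timing recovery a precondition for ISI-free reception.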

The interference preceding the main pulse is referred to as precursor intersymbol interference, and that trailing the main pulse is called postcursor intersymbol interference. Special pulse shapes can be designed such that the precursor intersymbol interference is zero.

#### 1.2.3. Combining Line Coding and Pulse Shaping

By combining digital line coding with proper pulse shaping, we can fully utilize the existing knowledge on line coding, developed for broadband systems, in systems with less bandwidth. In fact, there is no clear boundary between the two, either conceptually or in implementation. Line coding, if linear, can be viewed as passing a sequence of binary weighted delta functions through a filter with the impulse response equal to the pulse shape. Similarly, nonlinear filtering applies to nonlinear line coding. However, an artificial boundary is useful from the implementation point of view. Line coding can normally be implemented in digital circuitry. Pulse shaping, on the other hand, is normally an analog operation and is considerably more complicated. Although digital filters can be used to perform the equivalent analog filtering function, this is generally not advantageous. Still, in some cases it is difficult to distinguish the two when they are combined. The location of the function also plays a role. If the operation is done before the transmit filter, where the signal is in digital form, it can be viewed as precoding. If, however, the same function is placed at the receiver, where the signal is in analog form, it is more reasonably viewed as equalization. Later, in the application to digital subscriber loops, we will exploit different forms of line coding and pulse shaping, both at the transmitter and in the receiver.
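The view of a linear line code as a weighted impulse train passed through a pulse-shaping filter can be sketched as follows. The two-sample pulses and the bit pattern are illustrative assumptions; the point is only that the same superposition produces different line codes when the pulse shape is changed.

```python
def line_code_waveform(symbols, pulse, samples_per_symbol):
    """Superpose one scaled, delayed copy of `pulse` per symbol.

    Equivalent to filtering a weighted impulse train with impulse response `pulse`.
    """
    length = (len(symbols) - 1) * samples_per_symbol + len(pulse)
    out = [0.0] * length
    for k, a in enumerate(symbols):
        start = k * samples_per_symbol
        for n, p in enumerate(pulse):
            out[start + n] += a * p
    return out


bits = [1, 0, 1, 1]
symbols = [1.0 if b else -1.0 for b in bits]   # binary weights on the impulses

nrz_pulse = [1.0, 1.0]          # full-width rectangle
manchester_pulse = [1.0, -1.0]  # high half, then low half (assumed polarity)

print(line_code_waveform(symbols, nrz_pulse, 2))
print(line_code_waveform(symbols, manchester_pulse, 2))
```

A nonlinear code such as AMI does not fit this form directly, because its output polarity depends on previously transmitted symbols.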

#### 1.3. Echo Cancellation

The echo canceller is one of the most critical components in a digital subscriber loop that uses the echo cancellation method. Although it is not the focus of this thesis, it deserves special attention, because it affects the requirements of the timing recovery. A practical system requires the echo canceller to achieve 50-60 dB of echo cancellation. In other words, the average power of the residual echo after the echo canceller must be 50-60 dB below the echo power before cancellation. This stringent requirement places constraints on the nonlinearity allowed in the echo channel or in data conversions. It also limits the allowable jitter of the clock used to sample the received signal. Note that this clock is also used in the transmitter in synchronous fashion, as required in digital subscriber loops. The actual constraints on the nonlinearity depend on the method of implementation. A 60 dB echo cancellation requires that the timing jitter be limited to $\frac{1}{1000}$ of a data period. This condition basically prohibits the use of digital phase-locked loops (DPLL) unless some interpolation technique is used [8].
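The correspondence between 60 dB and a jitter of 1/1000 of a data period can be seen from a first-order estimate. The sketch below assumes that the echo waveform changes by roughly its own amplitude over one data period, so that a sampling offset of a fraction of a period produces an uncancelled error of about the same fraction of the echo amplitude. The `slope_factor` parameter is a hypothetical knob introduced here for that assumption; the actual analysis is the subject of Chapter 4.

```python
import math


def residual_echo_db(jitter_fraction, slope_factor=1.0):
    """First-order estimate of residual echo relative to the echo, in dB.

    jitter_fraction: timing error as a fraction of the data period.
    slope_factor: assumed echo slope, in echo amplitudes per data period.
    """
    return 20.0 * math.log10(slope_factor * jitter_fraction)


def allowed_jitter_fraction(cancellation_db, slope_factor=1.0):
    """Largest jitter fraction compatible with the wanted cancellation."""
    return 10.0 ** (-cancellation_db / 20.0) / slope_factor


print("jitter of T/1000 ->", round(residual_echo_db(1e-3), 1), "dB")
print("60 dB target     ->", allowed_jitter_fraction(60.0), "of a data period")
print("50 dB target     ->", round(allowed_jitter_fraction(50.0), 5), "of a data period")
```

Under this assumption a 60 dB amplitude ratio is a factor of 1000, which matches the jitter limit quoted above; a steeper echo waveform would tighten the limit proportionally.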

The complexity of the echo canceller, no matter how it is implemented, depends heavily on the subsequent processing. Data detection, post-equalization, and timing recovery must all be performed after the echo cancellation. A data detector together with a T-spaced transversal filter type linear equalizer operates at the baudrate and, as a consequence, can work with an echo canceller operated at the baudrate. Most timing recovery techniques require at least two samples per baud to retrieve the timing information from the data signal. As a result, the echo canceller must be operated at twice the baudrate to provide those samples to the timing recovery block. The complexity of the echo canceller increases linearly with the sampling rate. The conversion speed of the data conversion circuits, if any, must also be increased accordingly.

#### 1.3.1. Echo Canceller Structures

The echo canceller uses the digital data from the transmit section to reproduce the echo by synthesizing the echo path using adaptation algorithms. The replica of the echo is then subtracted from the received signal in the receiver section to complete the echo cancellation function. There are many adaptive filter structures and adaptation algorithms to choose from. Two structures being considered for IC realization are the transversal filter structure and the look-up table structure. In the look-up table structure, the adaptation is normally done in the digital domain. The echo replica is stored in a memory, usually a RAM, which is addressed by the transmit data sequence. The addressed data, the replica of the echo, are read out, subtracted from the received signal, and then updated before being written back into the same location. The actual cancellation may take place in either the analog or the digital domain. A digital-to-analog converter is used to convert the memory readout into an analog voltage for analog cancellation. Likewise, an analog-to-digital converter is used in the digital approach to convert the received signal into digital form for digital subtraction. For the digital subscriber loop application, where the data rate is in the range of 144 kb/s, A-to-D converters with the required linearity are not yet feasible in VLSI technology. Analog subtraction is, therefore, the dominant approach in the look-up table structure.
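The read, subtract, update, and write-back cycle of the look-up table structure can be sketched in a few lines. This is a simplified illustration under assumptions made here: the table is addressed by the last four transmitted bits, the update is a plain proportional correction with a hypothetical step size, the echo path is an invented four-tap response, and no far-end signal is present.

```python
import random


def lut_echo_canceller(tx_bits, received, span=4, step=0.1):
    """Look-up-table canceller: the last `span` transmitted bits address one entry."""
    table = [0.0] * (1 << span)
    mask = (1 << span) - 1
    address = 0
    residuals = []
    for bit, rx in zip(tx_bits, received):
        address = ((address << 1) | bit) & mask    # shift the newest bit into the address
        replica = table[address]                   # read the stored echo replica
        error = rx - replica                       # subtract it from the received signal
        table[address] = replica + step * error    # update, then write back
        residuals.append(error)
    return residuals


random.seed(0)
bits = [random.randint(0, 1) for _ in range(5000)]
symbols = [2 * b - 1 for b in bits]
echo_path = [0.5, -0.3, 0.2, 0.1]                  # hypothetical echo impulse response
echo = [sum(h * symbols[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(symbols))]

residual = lut_echo_canceller(bits, echo)
print("largest residual over the last 100 samples:",
      max(abs(e) for e in residual[-100:]))
```

Because each table entry stores the echo for one particular data pattern, the structure does not rely on the echo path being linear, at the cost of a memory that doubles in size with every additional bit of span.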

There are more varieties of the transversal filter structure. It is reasonable to pursue digital means for adaptation and filtering because the echo canceller is driven by digital data. Nevertheless, these functions can also be done easily in the analog domain. With the options of analog or digital subtraction, analog or digital adaptation, and analog or digital filtering, there are 4 combinations considered as candidates for VLSI integration [1,9]. The fully analog implementation performs all the required functions in the analog domain without any data conversion. This solves the problem of nonlinearity in the data conversion. In view of currently known analog capability in CMOS technology, it should be an attractive approach for VLSI realization. It was, however, determined to be infeasible because of the small step size required, which in turn calls for large capacitor ratios in the switched-capacitor circuitry [9]. The problems associated with the large capacitor ratios are severe enough to rule out this approach. The small step size came about because there was no way to cancel the far-end signal from the error terms used for adaptation [9]. The situation changed with the successful development of the baudrate sampling technique, described in full detail in Chapter 7.

#### 1.3.2. Fully Analog Approach

Two recent developments provided methods to cancel the far-end signal from the error terms. The adaptive reference echo cancellation (AREC) uses a reference former to reproduce the far-end signal, which is then subtracted from the error term [10]. The reference former, being a transversal filter with the received data as input, can be operated in a time-interleaved fashion to re-form the far-end signal at sampling rates higher than the data rate. The inevitable delay in the feedback was shown to be no problem if the step size is small.

The other development is a new technique to recover the timing with a sampling rate equal to the data rate, and is thus termed the baudrate sampling technique throughout this thesis. Employing a specific pulse shaping method, this timing recovery technique resolves a timing phase with nearly no precursor intersymbol interference. The far-end signal can then be reproduced using past decisions in an adaptive decision feedback equalizer (DFE) with no delay. A structure with a combination of echo canceller and DFE was proposed before [11]. It has never been reduced to practice, however, for two reasons. First, any precursor intersymbol interference of the far-end signal cannot be effectively eliminated. Secondly, no previously known method to recover timing using baudrate sampling was proven reliable. The baudrate sampling timing recovery technique, which will be discussed in much more detail in Chapter 7, brought new life to the combined structure of echo canceller and DFE. This is a better approach than AREC because, in addition to offering cancellation of the far-end signal with no delay, it allows a simpler echo canceller. More importantly, the capability to eliminate the far-end signal completely and to recover timing at the baudrate allows a much larger step size for the same degree of echo cancellation. Therefore, the fully analog combination of echo canceller and DFE becomes a viable approach.

The all-analog approach eliminates the need for data conversions. This removes an important source of nonlinearity, and its simplicity leads to a small area in VLSI implementation. Other nonlinearities in the echo path (including the hybrid or transmitted asymmetry of pulses) can be removed with the nonlinear echo cancellation technique developed by O. Agazzi [2].

#### 1.3.3. Implementation Alternatives for the Fully Analog Approach

A straightforward implementation of the all-analog echo canceller was described in great detail in [9]. Figures 1.6-1.8, reproduced from [9], illustrate in block diagram form the all-analog approach of the combined echo canceller and DFE in Fig. 1.6, the details of the echo canceller taps in Fig. 1.7, and the DFE taps in Fig. 1.8. The combination of echo cancellation and DFE together with the baudrate sampling timing recovery makes possible a larger step size and, in turn, the switched-capacitor approach shown in these figures. Experiments showed that a step size of 0.3 is sufficient. A step size of 0.5 was also shown to perform well in a computer simulation.

A more exotic implementation alternative is to use capacitors to store the tap coefficients and source followers to buffer the storage from the summing circuitry, rather than using a sample-and-hold amplifier for each tap coefficient. This results in a great saving of area and components. This structure was used in [9] in the analog-digital echo canceller to implement the transversal filter. The block diagram and the circuit schematic of the taps are shown in Figs. 1.9 and 1.10 respectively.



Figure 1.6. A simplified circuit schematic of a fully analog implementation of the combination of echo canceller and decision feedback equalizer (DFE). Details of the taps are in Figs. 1.7 and 1.8.



Figure 1.7. Detail of an echo canceller tap as used in Fig. 1.6.



Figure 1.8. Detail of a DFE tap as used in Fig. 1.6.



Figure 1.9. Circuit schematic of an analog implementation of a transversal filter echo canceller. The correction term for the tap updating comes from digital update logic via a DAC. The taps are shown in Fig. 1.10.



Figure 1.10. Circuit schematic of an echo canceller tap, using a capacitor for tap coefficient storage and a source follower as an output buffer for summation.

In the analog-digital echo canceller, the adaptation was performed with digital circuits in the digital domain and then converted via a D-to-A converter for the analog transversal filter. Here we propose to replace the adaptation and D-to-A circuits with a single echo canceller tap as described in the last section and shown in Fig. 1.7. The analog filter design shown in Figs. 1.9 and 1.10 had been fabricated and proven experimentally to be satisfactory [9]. Note that the adaptation is updated periodically rather than continuously. The speed of convergence is slower, but the final resolution is not affected.

A unique feature offered by both these analog implementations is that the true stochastic-gradient algorithm can be employed for tap adaptation, whereas in most other approaches the sign algorithm is more commonly used for implementation reasons. The benefit of this feature is a smaller residual error and faster convergence.

#### 1.3.4. Other Transversal Filter Structures

Other implementation alternatives have been studied extensively elsewhere [9]. They are shown in Figs. 1.11-1.13, again reproduced from [9]. The operating rate of these echo cancellers was originally at twice the data rate. With the new baudrate sampling technique, the number of taps can be reduced by a factor of 2, and the required speed of the data conversions is also reduced by the same factor.

#### 1.4. Summary

In this chapter, the integrated services digital network (ISDN), the two possible implementations of digital subscriber loop systems, the comparison of the two, and the limiting factors were introduced. Line coding and pulse shaping, which are important design issues of the system, also have considerable impact on the equalization and timing recovery. There is a need to understand the complexity of implementation for the echo canceller, which affects and is affected by the timing recovery design. A description of the components in a DSL transceiver is given in Chapter 2. The detailed description and requirements of the timing recovery are provided in Chapter 3, where the interrelated design issues of the timing recovery, the echo canceller, and the equalization will also be addressed. The two most promising timing recovery methods will be described in Chapters 6 and 7, following the chapters on phase-locked loops and equalization.

Figure 1.11. Digital echo canceller, where the incoming signal is digitized, and all the signal processing and cancellation is done in the digital domain.

Figure 1.12. Digital transversal filter with analog cancellation. A D/A is used to convert the echo replica to the analog domain for cancellation, and an A/D is used to convert the echo-free signal for adaptation and possibly further signal processing, e.g., equalization and timing recovery.

Figure 1.13. Analog/digital echo canceller. The transversal filter is in the analog domain with digital adaptation.

#### CHAPTER 2

## System Design

In Section 2.1, the analysis of several system parameters and of certain important building blocks will be summarized. A description of the components involved in a full-duplex modem will be given in Section 2.2. This chapter draws heavily on Chapter 2 of the thesis by Agazzi [9].

#### 2.1. Design Parameters and Techniques

Basic techniques like echo cancellation, equalization and timing recovery will be introduced in this section. The specifications associated with these functions will also be given.

#### 2.1.1. Error Rate Objectives

The error rate, which can be translated to the signal-to-noise ratio requirement, for a given communications system depends on the statistical distribution of the disturbance. The main sources of disturbance in the context of a digital subscriber loop (DSL) are echo, crosstalk, and impulse noise. Thermal noise, on the other hand, can be ignored because it is much smaller than the other sources mentioned. The crosstalk noise can be adequately modelled by a Gaussian distribution when the number of crosstalk interferers is large. The residual echo error after echo cancellation is random and bounded. Its distribution has been modelled as "approximately Gaussian" [12] and as "uniform" [9] by different authors. In either case, if the variance of the crosstalk noise is much higher than that of the residual echo noise, the composite noise distribution will be dominated by that of the crosstalk noise, and vice versa. Impulse noise is generally characterized in terms of the number of events per unit time, instead of its statistical distribution. It is much more difficult to deal with, and so it will not be considered. Generally speaking, a signal-to-noise ratio of 20 dB gives us enough margin for a comfortable error rate.

#### 2.1.2. Echo Cancellation

The two basic structures considered for the echo canceller were introduced in Chapter 1: the transversal structure and the look-up table or memory structure. In this section, we will present the adaptation algorithm and the associated parameters that govern the convergence speed and the residual error [6].

Although other algorithms have been proposed in the literature [13,14], the least mean square (LMS) algorithm has gained widespread acceptance in adaptive transversal filters because of its simplicity and efficiency. A small variation of the LMS algorithm leads to the stochastic gradient algorithm, where a simple iterative update equation can be formulated. For a transversal filter echo canceller which covers N baud intervals, the N coefficients  $c_i(k)$  are iteratively updated according to

$$c_i(k+1) = c_i(k) + \alpha r(k) a(k-i) \tag{2.1}$$

where r(k) is the control signal at time kT,  $\alpha$  is the *step size*, and a(k-i) is the data symbol being transmitted from the local transmitter at time (k-i)T. The output of the transversal filter, the echo replica, is given by

$$\hat{e}(k) = \sum_{i=0}^{N-1} a(k-i)c_i(k) \tag{2.2}$$

and is subtracted from the received signal s(k) to yield the control signal  $r(k) = s(k) - \hat{e}(k)$ . If s(k) = e(k) + u(k), where e(k) is the echo signal, which is linearly dependent on the local data a(k), and u(k) is the far-end signal, which is independent of a(k), the adaptive filter will continuously try to minimize the root-mean-square (rms) value of the residual error  $\epsilon(k) = e(k) - \hat{e}(k)$ . Note that in general the control signal r(k) used for adaptation includes both the residual error  $\epsilon(k)$  and the far-end signal u(k), where u(k) can be comparable to or small relative to the echo signal e(k), depending on the length and attenuation of the channel. The residual error  $\epsilon(k)$  is the driving force in the adaptation, and the far-end signal u(k) is a disturbing signal in the control loop. The performance of the adaptive filter is represented by its convergence characteristics, in which the ratio  $R^2(k) = P(\hat{e}-e)/P(u)$  is given as a function of the time index k. Here  $P(\hat{e}-e)$  is the (short-time) average power of  $e(k)-\hat{e}(k)$  and P(u) is the average power of u(k) [6]. The convergence can be adequately characterized by only two parameters,  $\delta$  and  $\nu_{20}$ , where

$$\delta = \lim_{k \to \infty} R^2(k) \tag{2.3}$$

and

$$\nu_{20} = 1/\log_{10}(R(k)/R(k+1)), \quad R^2(k) \gg \delta \tag{2.4}$$

The parameter  $\delta$  characterizes the converged state and  $\nu_{20}$  is a measure of the initial speed of convergence for large values of  $R^2(0)$ . Here  $\nu_{20}$  represents the number of iterations required for a 20 dB reduction of R(k). In practical circumstances it will always hold that  $\alpha N \ll 1$ , and then

$$\delta \approx \alpha N / 2 \tag{2.5}$$

and

$$\nu_{20} \approx 2.30/\alpha \tag{2.6}$$

It should be clear from Eqs. (2.5) and (2.6) that the choice of  $\alpha$  allows a tradeoff between initial speed of convergence and final value. Note that this tradeoff can be expressed independent of  $\alpha$  as

$$\delta \cdot \nu_{20} \approx 1.15N \tag{2.7}$$

It is also clear from Eq. (2.1) that there are N updates to be performed per iteration, and each update requires one multiplication. The number of taps is equal to N.
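As an illustration of Eqs. (2.1) and (2.2), the following is a minimal Python sketch of a stochastic-gradient transversal echo canceller. The function names, the echo path `h`, the step size, and the noise-free test condition (no far-end signal) are assumptions made for illustration only; they are not taken from the experimental system.

```python
import random

def transversal_echo_canceller(a, s, n_taps, alpha):
    """Stochastic-gradient transversal filter following Eqs. (2.1)-(2.2).

    a: local transmit symbols a(k); s: received samples s(k).
    Returns the final coefficients c_i and the control signal r(k).
    """
    c = [0.0] * n_taps
    r_hist = []
    for k in range(len(a)):
        # Data vector a(k-i), i = 0..N-1 (zero before the start of transmission)
        x = [a[k - i] if k - i >= 0 else 0.0 for i in range(n_taps)]
        e_hat = sum(ci * xi for ci, xi in zip(c, x))        # echo replica, Eq. (2.2)
        r = s[k] - e_hat                                    # control signal r(k)
        c = [ci + alpha * r * xi for ci, xi in zip(c, x)]   # tap update, Eq. (2.1)
        r_hist.append(r)
    return c, r_hist

def simulate_echo(a, h):
    """Hypothetical linear echo path: e(k) = sum_i h_i a(k-i)."""
    return [sum(h[i] * a[k - i] for i in range(len(h)) if k - i >= 0)
            for k in range(len(a))]

random.seed(1)
h = [0.5, -0.3, 0.2, 0.1, -0.05]            # assumed echo path, N = 5
a = [random.choice((-1.0, 1.0)) for _ in range(3000)]
s = simulate_echo(a, h)                      # far-end signal u(k) omitted here
c, r = transversal_echo_canceller(a, s, n_taps=len(h), alpha=0.05)
```

With no far-end signal in `s`, the coefficients settle on the assumed echo path. Adding a far-end term to `s` would show the disturbance in the control loop that forces the small step sizes discussed in the text.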

In the look-up table structure, the combination of N near-end data symbols is used as the address for a random access memory (RAM). The content of the RAM at this address is the echo replica  $\hat{e}^k[a(k),a(k-1),...,a(k-N+1)]$ , corresponding to the combination of data symbols a(k),a(k-1),...,a(k-N+1) at time t=kT. If a certain address is read at time t=kT, its content is used first for echo subtraction and is subsequently updated with a correction term, while all the other contents remain unchanged. For the stochastic gradient algorithm, the updating is formulated as

$$\hat{e}^{k+1}[a(k),a(k-1),...,a(k-N+1)] = \hat{e}^{k}[a(k),a(k-1),...,a(k-N+1)] + \alpha r(k) \tag{2.8}$$

where  $r(k) = s(k) - \hat{e}(k)$  as before. The parameters  $\nu_{20}$  and  $\delta$  corresponding to this method are given by

$$\delta = \alpha/2 \tag{2.9}$$

and

$$\nu_{20} \approx 2.30 \cdot 2^N / \alpha \tag{2.10}$$

which yields

$$\delta \cdot \nu_{20} \approx 1.15 \cdot 2^N \,. \tag{2.11}$$

Note that there is only one update per iteration, and only one multiplication is needed. However, the size of the memory grows exponentially with N. For binary data, the memory size is  $2^N$ .
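The look-up table update of Eq. (2.8) can be sketched in the same way. In this minimal Python example the RAM is modelled as a list of $2^N$ entries; the function name, the address bit ordering, the echo path `h`, and the step size are hypothetical choices made for illustration.

```python
import random

def lookup_table_echo_canceller(a, s, n_symbols, alpha):
    """Look-up table (memory) echo canceller using the update of Eq. (2.8).

    a: binary transmit symbols (+1/-1); s: received samples s(k).
    The RAM is a list of 2**N entries addressed by the last N symbols.
    """
    table = [0.0] * (2 ** n_symbols)
    r_hist = []
    for k in range(n_symbols - 1, len(a)):
        # Build the address from a(k), a(k-1), ..., a(k-N+1)
        addr = 0
        for i in range(n_symbols):
            addr = (addr << 1) | (1 if a[k - i] > 0 else 0)
        e_hat = table[addr]                 # read the echo replica
        r = s[k] - e_hat                    # control signal r(k)
        table[addr] = e_hat + alpha * r     # only the addressed entry is updated
        r_hist.append(r)
    return table, r_hist

random.seed(2)
h = [0.5, -0.3, 0.2]                        # assumed echo path, N = 3
a = [random.choice((-1, 1)) for _ in range(4000)]
s = [sum(h[i] * a[k - i] for i in range(len(h)) if k - i >= 0)
     for k in range(len(a))]
table, r = lookup_table_echo_canceller(a, s, n_symbols=3, alpha=0.2)
```

Each entry is corrected only when its own data pattern occurs, which is the reason for the factor $2^N$ in Eq. (2.10).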

For 20 dB echo cancellation, where  $\delta = 0.01$ , and assuming N = 5, Eqs. (2.5) and (2.6) give  $\alpha = 0.004$  and  $\nu_{20} \approx 600$  for the transversal filter, while Eqs. (2.9) and (2.10) give  $\alpha = 0.02$  and  $\nu_{20} \approx 3700$  for the look-up table. The small values of the step size  $\alpha$  come from the disturbing components in the control term due to the far-end signal.
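The numbers quoted above follow directly from Eqs. (2.5), (2.6), (2.9), and (2.10). A short Python sketch of the arithmetic is given below; the function names are hypothetical. The unrounded results are 575 and 3680 iterations, which the text rounds to 600 and 3700.

```python
def transversal_parameters(delta, n_taps):
    """Step size and nu_20 for the transversal filter."""
    alpha = 2.0 * delta / n_taps            # solved from Eq. (2.5)
    nu20 = 2.30 / alpha                     # Eq. (2.6)
    return alpha, nu20

def lookup_table_parameters(delta, n_symbols):
    """Step size and nu_20 for the look-up table structure."""
    alpha = 2.0 * delta                     # solved from Eq. (2.9)
    nu20 = 2.30 * 2 ** n_symbols / alpha    # Eq. (2.10)
    return alpha, nu20

delta = 0.01                                # 20 dB below the far-end signal power
alpha_t, nu_t = transversal_parameters(delta, 5)
alpha_l, nu_l = lookup_table_parameters(delta, 5)
print(alpha_t, nu_t, alpha_l, nu_l)
```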

There may be variations as to the updating methods, e.g., sign algorithm vs. stochastic algorithm; level of transmission, e.g., binary vs. ternary; and inclusion or exclusion of far-end signal in the control term. Some simulation results comparing those alternatives will be covered in Chapter 8.

#### 2.1.2.1. Interpolation Echo Cancellers

The echo canceller described in the last section normally puts out one echo-free sample per baud, which is generally sufficient for equalization and data detection. Timing recovery, another function that must be implemented in a DSL transceiver, needs an echo-free signal to obtain a clock that is locked to the far-end transmitter. More than one sample per baud may be required to perform timing recovery if a frequency domain approach is used, such as the spectral line method [20,24,26,30]. In order to preserve the spectrum of the received signal without aliasing distortion, the sampling rate must be at least twice the bandwidth of the signal. Typically the spectrum of the data signal extends beyond  $\frac{1}{2}f_b$ , often as high as  $f_b$ , where  $f_b$  is the data rate.

In the transversal filter echo canceller, an oversampling factor of R can be implemented with RN taps. Unavoidable hardware complexity is involved if a transversal filter with RN taps is used. However, because the input is a data sequence at a rate of  $f_b$ , a time-interleaved structure as shown in Fig. 2.1 can be used instead. This structure consists of R N-tap subfilters, each sampling at rate  $f_b$ , with a phase difference  $\frac{T}{R}$ , where T is the baud interval. At the output the samples of the R filters are then combined sequentially again with a phase difference of  $\frac{T}{R}$ . The total number of taps is still RN, and the number of updates also grows to RN per iteration. The convergence time remains the same as a baudrate sampling transversal filter because each subfilter operates independently of one another.



Figure 2.1. Time interleaved echo canceller with R subfilters.

In the look-up table approach, R independent memory blocks, each assigned to a specific phase, can be used to accomplish an interpolating echo canceller. The memory size is  $R \cdot 2^N$ , and the number of updates also grows to R per iteration.

#### 2.1.3. Echo Cancellation Requirement

The performance measure of an echo canceller is defined as the ratio of the rms echo at its input to the rms echo at its output. When measured in dB, it is expressed as

$$E_{c} = 20 \log_{10} \left[ \frac{\text{rms input echo}}{\text{rms output echo}} \right] \tag{2.12}$$

If a transhybrid loss of 10 dB and a line attenuation of 45 dB are assumed, a 20 dB signal-to-echo ratio requires an echo cancellation of 55 dB.
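The budget behind this figure can be written out explicitly. The symbols $\mathrm{SER}$, $A_{\mathrm{line}}$, and $L_{\mathrm{hybrid}}$ are introduced here only for illustration and denote the required signal-to-echo ratio, the line attenuation, and the transhybrid loss, all in dB. Assuming equal transmit levels at both ends, the uncancelled echo lies $L_{\mathrm{hybrid}}$ below the transmit level and the far-end signal lies $A_{\mathrm{line}}$ below it, so that

$$E_c = \mathrm{SER} + A_{\mathrm{line}} - L_{\mathrm{hybrid}} = 20 + 45 - 10 = 55\ \mathrm{dB}.$$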

#### 2.1.4. Equalization

The pulse-shaping filters, including transmit and receive filters, of a data transmission system are generally designed either assuming a flat frequency response in the channel, or a particular compromise response, which is chosen to approximate as closely as possible the response most likely to be encountered in practice. However the actual line response will generally depart from the one used in the design, because the system is required to operate with various line configurations. The distortion will generate intersymbol interference (ISI).

Equalizers are linear or nonlinear networks designed to correct ISI arising from the nonideality of the channel. They may be manually adjustable or automatic, depending on how their impulse responses are tailored to the desired responses. Chapter 5 is dedicated to the equalization techniques that are considered as candidates or that are related to the application of DSL.

#### 2.1.5. Timing Recovery

The receiver is required to synchronize to the far-end transmitter in a data transmission system to ensure the correct detection of far-end data. It is necessary to sample the incoming data signal either at the maximum eye opening for direct data detection, or at instants appropriate for the equalizer used. The transmitter is also required to be synchronized to the received signal to ensure that the switch with which the central office transceiver interfaces transmits and receives data at the same rate.

#### 2.1.6. Synchronization in Subscriber and Central Office Sets

The central office transmitter receives its clock from the switch. The receiver at the subscriber end synchronizes to the received signal and thus is also slaved to the switch clock. The transmitter of the subscriber transceiver is then synchronized to the received clock to ensure that the central office transceiver transmits and receives at the same rate, except at the beginning of the transmission when the echo canceller and the timing recovery of the subscriber transceiver have not yet converged. The central office receiver then uses its own timing recovery circuitry to adjust the clock phase with which it samples its received signal. Note that although the frequency at the central office is always correct for the received signal, the correct sampling phase depends on the round-trip delay of the line, and consequently it must be determined from the received signal. Overall, the timing information follows an open loop: from the central office transmitter to the subscriber receiver, from there to the subscriber transmitter, and finally back to the central office receiver, as shown in Fig. 2.2.

#### 2.2. System Description

In this section, all the functions to be implemented in a DSL transceiver equipped with the baudrate sampling timing recovery technique of Chapter 7 are described. The experimental breadboard implementations cover all the required functions except the echo canceller; the details of the breadboard design can be found in references [15,16]. All the functions except the echo canceller are applicable to both the time compression multiplexing (TCM) and the echo-cancellation systems. Since the echo-cancellation system has much more restrictive requirements on the sampling rate and the timing jitter, the implementation will focus on this system. Fig. 1.3 in Section 1.1 shows a block diagram of the full-duplex echo-cancellation system under consideration. Two baseband DSL transceivers communicate at 160 kb/s over a single twisted pair. One of them is located at the central office, the other at the subscriber end. Typically, a transhybrid loss of 10 dB is expected of the hybrid, which can be either a hybrid transformer or an electronic balance circuit. A line attenuation of 45-50 dB is measured at half the data rate for a 5 km subscriber loop. An echo cancellation of 55-60 dB is thus required for a 20 dB signal-to-echo ratio.

Figure 2.2. Timing information flow in the digital subscriber loop. Timing originates at the central office, is recovered at the subscriber receiver and used by the subscriber transmitter. The central office receiver must lock onto the subscriber transmitter because the correct sampling phase depends on the round-trip delay.

The system configuration chosen for the breadboard implementation, which is also considered the preferred configuration for VLSI implementation, is shown in detail in Fig. 2.3. Its blocks are described in more detail in the following subsections.

#### 2.2.1. Scrambler and Descrambler

The scrambler randomizes the data sequence to ensure a pseudo-random sequence even during idle or repetitive data patterns [17]. A random data sequence is required for the convergence of the echo canceller and the equalizer, to avoid discrete spectral components in the signal spectrum, and to aid timing recovery in the receiver. The scrambler chosen is recommended by CCITT and performs a modulo 2 division by the polynomial  $1+x^3+x^{20}$  [18], where x is the input to the scrambler and assumes values of 1 and 0. When a constant binary value is placed at the input of the scrambler, a periodic sequence with period  $2^{20}-1$  is generated. For all practical purposes, this can be considered random because the period is very long.



Figure 2.3. Detailed block diagram of the DSL transceiver.

The descrambler, which is self-synchronizing, performs a modulo 2 multiplication by the same polynomial.
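The division and multiplication by $1+x^3+x^{20}$ can be sketched with two shift registers. The following Python example is a minimal illustration, not the breadboard circuit: the function names and the all-zero initial register state are assumptions. The scrambler feeds back its own output delayed by 3 and 20 bits, while the descrambler combines the received bits delayed by the same amounts.

```python
import random

def scramble(bits, taps=(3, 20)):
    """Modulo-2 division by 1 + x^3 + x^20 (feedback shift register)."""
    state = [0] * max(taps)          # state[j-1] holds the output delayed by j bits
    out = []
    for b in bits:
        y = b
        for t in taps:
            y ^= state[t - 1]
        out.append(y)
        state = [y] + state[:-1]     # shift in the scrambled bit
    return out

def descramble(bits, taps=(3, 20)):
    """Modulo-2 multiplication by the same polynomial (feedforward register)."""
    state = [0] * max(taps)          # state[j-1] holds the received bit delayed by j
    out = []
    for y in bits:
        x = y
        for t in taps:
            x ^= state[t - 1]
        out.append(x)
        state = [y] + state[:-1]     # shift in the received bit
    return out

random.seed(5)
data = [random.randint(0, 1) for _ in range(500)]
scrambled = scramble(data)
recovered = descramble(scrambled)
# Start the descrambler 7 bits late: its register holds wrong values at first
late = descramble(scrambled[7:])
```

Because the descrambler register is filled only with received bits, a descrambler started in the wrong state produces correct output once 20 bits have been shifted in, which is the self-synchronizing property mentioned above.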

#### 2.2.2. Line Coder

A line coder at the transmitter converts the data sequence into another sequence according to a set of special rules. It is normally implemented in the digital domain, although it may also serve the purpose of pulse shaping if the sampling rate is higher than the data rate. The code chosen for the experiment and proposed for the DSL application is the  $1-z^{-1}$  partial response. Other partial-response codes with similar properties, such as the modified duobinary, can also be used, although they do not ensure alternate mark inversion. The  $1-z^{-1}$  partial response is a 3-level code. However, because it is generated from a 2-level code through a linear process, it can be treated as a 2-level code going through a channel with an impulse response equal to  $1-z^{-1}$ . It is desirable to avoid a 3-level code in the echo canceller, because of not only the additional hardware complexity but also the slower adaptation speed. The fact that this 3-level code can essentially be treated as a 2-level code presents a great advantage.
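The coding rule itself is a one-line difference equation. The Python sketch below illustrates it under the assumption that the 2-level data take the values +1 and -1 (so the three line levels are -2, 0, and +2); the function name, the level normalization, and the initial symbol are hypothetical.

```python
def partial_response_1_minus_d(a, a_prev=-1):
    """Apply the 1 - z^-1 partial response: b(k) = a(k) - a(k-1)."""
    out = []
    for ak in a:
        out.append(ak - a_prev)      # linear combination of two 2-level symbols
        a_prev = ak
    return out

a = [1, 1, -1, -1, 1, -1, 1, 1]      # 2-level data
b = partial_response_1_minus_d(a)    # 3-level line symbols
marks = [v for v in b if v != 0]     # nonzero symbols ("marks")
print(b)
print(marks)
```

For this input the line symbols are `[2, 0, -2, 0, 2, -2, 2, 0]`. A nonzero symbol occurs only at a data transition, and successive transitions necessarily have opposite directions, so the marks alternate in polarity.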

A pulse shaping function is also accomplished by the line coder by sampling the line coder at  $4f_b$ , where  $f_b$  is the data rate. The shaped pulse has a zero crossing at the first precursor, which provides a simple target point for timing recovery. The line coder as used in the experimental setup is shown in Fig. 2.4. It should be noted that this pulse shaping function can alternatively be performed at the receiver using analog filtering techniques.

#### 2.2.3. Transmit and Receive Filters

Minimal intersymbol interference filters [19] are used for the transmit and receive filters.



Figure 2.4. The digital line coder that performs both the  $1-z^{-1}$  partial response coding and the pulse shaping.

The alternative approach of using minimal ISI matched filters was also considered. It offers the best protection against white Gaussian noise in the ideal channel, but at the cost of a significant increase in complexity. The small improvement in a nonideal channel may not justify the excessive complexity required by a matched filter.

Furthermore, the duration of the impulse response must be kept short to reduce the number of taps required by the echo canceller and the equalizer. This requirement applies also to the *effective channel impulse response*, which is the composite impulse response including the effect of the line coder, the transmit and receive filters, and the channel characteristics. The self-equalization property of the  $1-z^{-1}$  partial response helps to reduce the length of the effective channel impulse response.

The transmit filter has a low-pass property to avoid unnecessary crosstalk and radio frequency interference (RFI), which increase with frequency. The nonlinearity which may appear in the filtering must be kept to a minimum. The symmetry of the pulses must meet the same requirement [9]. The receive filter must also have a low-pass property to reject the out-of-band noise from the channel. The pole location for the transmit filter is:

$$p = -8.168 + j 0.0$$

and the pole locations of the receive filter are:

$$p_1 = -1.313 + j 2.97$$

$$p_2 = -1.313 - j 2.97$$

$$p_3 = -2.141 + j 1.154$$

$$p_4 = -2.141 - j 1.154$$

These values are in radian frequency and are normalized to the data rate.
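As a rough way to visualize what these pole locations imply, the Python sketch below evaluates the magnitude response of each filter. It rests on assumptions that the text does not state: the filters are taken to be all-pole (no finite zeros) with unity DC gain, and the normalization is read as $s = j 2\pi f / f_b$. The function names are hypothetical.

```python
import math

TX_POLES = [complex(-8.168, 0.0)]
RX_POLES = [complex(-1.313, 2.97), complex(-1.313, -2.97),
            complex(-2.141, 1.154), complex(-2.141, -1.154)]

def all_pole_gain(poles, f_norm):
    """|H(s)| at s = j*2*pi*f_norm, assuming an all-pole filter with H(0) = 1.

    f_norm is the frequency divided by the data rate (assumed normalization).
    """
    s = complex(0.0, 2.0 * math.pi * f_norm)
    h = complex(1.0, 0.0)
    for p in poles:
        h *= (-p) / (s - p)          # each factor equals 1 at s = 0
    return abs(h)

def gain_db(poles, f_norm):
    """Gain in dB under the same assumptions."""
    return 20.0 * math.log10(all_pole_gain(poles, f_norm))

for f in (0.25, 0.5, 1.0, 2.0):
    print(f, round(gain_db(TX_POLES, f), 1), round(gain_db(RX_POLES, f), 1))
```

Under these assumptions the single transmit pole gives a gentle roll-off, while the four receive poles attenuate strongly above the data rate, consistent with the low-pass roles described above.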

#### 2.2.4. Equalizer

No  $\sqrt{f}$  equalizer was used in the experimental setup because a self-equalizing line code was used. A digital decision feedback equalizer (DFE) is used to mitigate the effect of ISI caused by the partial-response portion of the pulse and that caused by the bridged taps. The required number of taps was found to be 8 from extensive computer simulation. The relatively large number was simply due to the severe ISI caused by multiple bridged taps. The number can be reduced if more restricted line configurations are used. The tap weights of the DFE were stored in a bank of up-down counters, each of which has 12 bits and can be incremented or decremented by one least significant bit (LSB) per update. In other words, the sign algorithm was used for adaptation. However, only 8 bits were converted to analog form for cancellation.
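The counter-based adaptation described above can be sketched as follows. This Python example is an illustration under stated assumptions rather than the breadboard design: the full-scale range of the taps is taken as ±1, the channel has two hypothetical postcursors (0.5 and -0.25), only 3 taps are used to keep the example short, and all names are invented for the sketch.

```python
import random

COUNTER_BITS = 12    # width of each up-down counter
DAC_BITS = 8         # only the most significant bits are used for cancellation

def sign(v):
    return (v > 0) - (v < 0)

def tap_value(counter):
    """Tap weight used for cancellation: the 8 MSBs of the 12-bit counter."""
    msbs = counter >> (COUNTER_BITS - DAC_BITS)      # arithmetic shift keeps the sign
    return msbs / float(1 << (DAC_BITS - 1))         # assumed full scale of +/-1

def dfe_sign_algorithm(x, n_taps):
    """Decision feedback equalizer adapted with the sign algorithm."""
    counters = [0] * n_taps
    past = [0] * n_taps                              # decisions d(k-1)..d(k-n)
    lo = -(1 << (COUNTER_BITS - 1))
    hi = (1 << (COUNTER_BITS - 1)) - 1
    decisions = []
    for xk in x:
        isi = sum(tap_value(c) * d for c, d in zip(counters, past))
        y = xk - isi                                 # ISI-cancelled sample
        d = 1 if y >= 0 else -1                      # 2-level decision
        err = y - d
        for i in range(n_taps):
            step = sign(err) * past[i]               # one LSB up or down
            counters[i] = min(hi, max(lo, counters[i] + step))
        past = [d] + past[:-1]
        decisions.append(d)
    return counters, decisions

random.seed(7)
post = [0.5, -0.25]                                  # assumed postcursor ISI
a = [random.choice((-1, 1)) for _ in range(20000)]
x = [a[k] + sum(post[i] * a[k - 1 - i] for i in range(len(post)) if k - 1 - i >= 0)
     for k in range(len(a))]
counters, decisions = dfe_sign_algorithm(x, n_taps=3)
weights = [tap_value(c) for c in counters]
```

Because each counter moves by at most one LSB per symbol, the 12-bit counters act as a slow integrator while the 8-bit readout sets the cancellation resolution.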

#### 2.2.5. Echo Canceller

The echo canceller was not included in the experimental setup, because its feasibility had already been demonstrated and a large number of variations are available for implementation. The effect of residual echo on the timing recovery and equalization was tested by injecting an attenuated echo from a local transmitter. The degree of attenuation was adjustable. The alternatives and design issues of the echo canceller were presented in Sections 1.3 and 2.1.2.

#### 2.2.6. Detector

Since a DFE was used to remove the partial-response portion of the pulses, the 3-level pulses were detected using a 2-level detector as 2-level codes. The threshold of the 2-level detector is well defined at the center of the signal and independent of the line attenuation. The detector can start making correct decisions even before the AGC converges. It offers a great improvement in the start-up convergence.

#### 2.2.7. Timing Recovery

In the experimental system, the baudrate sampling technique was used for timing recovery. The details of this timing recovery technique are given in Chapter 7. A block diagram of the timing recovery is shown in Fig. 2.5, where 3 basic blocks are shown, namely, the timing function generator, the loop filter, and the voltage controlled oscillator (VCO). It must be noted that the input to the timing recovery is the equalized signal, where the ISI has been removed by the DFE. These three blocks constitute a phase-locked loop (PLL), where the timing function generator serves the purpose of a phase detector in a PLL.

#### 2.2.7.1. Timing Function Generator

Two versions of the timing function generator were built, corresponding to the digital and the analog approaches described in Chapter 7. The digital approach assumes that the echo canceller outputs a digital signal and that the DFE cancellation is also performed in the digital domain. A precision of 8 bits was assumed. The analog implementation was built to simulate the switched-capacitor realization of the analog timing function generator. Because of the fundamental difference between integrated circuits and off-the-shelf analog components, multiplications were achieved with MDACs.

#### 2.2.7.2. Loop Filter and VCO

Two versions of the loop filter and VCO were also built in breadboard form. In the digital approach, an off-the-shelf part, the SN74LS297 digital phase-locked filter, was used for the loop filter. A divide-by-40 counter was also built from off-the-shelf TTL parts. The timing jitter in this configuration could be as high as  $\frac{1}{40}T$ . This value can cause performance degradation in the echo cancellation unless an interpolation technique is used.
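To put this figure in perspective, a rough worked number can be given under the assumption that the baud rate equals the 160 kb/s line rate quoted in Section 2.2, with $f_b$ denoting that rate as elsewhere in this chapter:

$$T = \frac{1}{f_b} = \frac{1}{160\ \mathrm{kHz}} = 6.25\ \mu\mathrm{s}, \qquad \frac{T}{40} \approx 156\ \mathrm{ns}.$$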



Figure 2.5. Block diagram of the timing recovery used in the experiment setup.

An analog version of the loop filter and VCO was also constructed. The VCO part used was the SN74LS124. The acquisition time was long and timing jitter high as expected because of the large VCO gain and the low precision of the free running frequency.

The most promising and reasonable implementation would be a *crystal VCO* (VCXO), which, although not implemented in the experimental setup, has been successfully used before in a similar application [9].

#### CHAPTER 3

## Timing Recovery

A hierarchy of synchronization problems needs to be considered in digital data communication. Carrier synchronization, which is required only in a passband system, concerns the regeneration of both the frequency and the phase of the carrier. Bit synchronization, or timing recovery, is concerned with synchronizing the receiver clock with the baseband data-symbol sequence, whether it is a demodulated passband signal in a carrier system or originally a baseband signal [20]. Again, timing recovery, also referred to as symbol synchronization, involves both the frequency and the phase. Word, frame, and packet synchronization, involving a specially designed message format, belong to another level in the hierarchy, which is not considered here. A pilot tone transmitted along with the data signal is another means of providing timing information to the receiver, which will not be discussed here either. In other words, self-timing, where the timing frequency and phase are derived from the information-bearing signal without any pilot tone, is the approach of interest to us.

The recovered timing clock, in the simplest situation, is used to sample the data signal for decision. A small timing jitter, which is the fluctuation of the sampling instants about a uniformly time-spaced clock, is allowable. The recovered timing is also used to retransmit data if the timing recovery is in a repeater. Timing jitter can accumulate along the repeater chain. If a sampled-data transversal filter is used for equalization, the recovered timing clock may also be used to clock this filter. In a synchronous system, where the *local transmitter* clock is slaved to the recovered timing in the transceiver, the recovered timing may be the only clock in the transceiver. In the latter three cases, the jitter specification is much more restrictive.

# 3.1. Conventional Timing Recovery Methods

Since the classic paper on the statistics of regenerative digital transmission in 1958 by W. R. Bennett [21], timing recovery has received considerable attention. A survey of the literature on timing recovery shows that a variety of methods have been studied or developed. The spectral line method, most commonly realized and analyzed in the square-loop configuration, is the most straightforward approach [20,24,26,30]. In the spectral line method, the data signal is passed through a nonlinear device before being processed by a narrow bandpass filter or its equivalent, a phase-locked loop (PLL), as shown in Fig. 3.1. It can easily be shown that the expected value of the nonlinearly processed signal has a periodic term with frequency equal to the data rate. It is often stated that a spectral line component is created at the data rate in the spectrum, although strictly speaking the signal is cyclo-stationary and its power spectrum is therefore not defined. The discrete spectral line component is then extracted with a narrow bandpass filter. The phase and the magnitude of this periodic signal in relation to the data signal depend on the nonlinear function and the pulse shape. However, because of the complexity introduced by the nonlinearity, the square-law nonlinearity is the only one that is analytically tractable. More details about the timing phase obtained by the spectral line method are given in Section 3.3.

The threshold crossing and the sampled-derivative methods were analyzed in [22]. The block diagrams of these two methods are shown in Fig. 3.2. In the threshold crossing method, an error pulse with amplitude equal to the difference between the time of the zero crossing and the time of the nearest sampling pulse is generated each time the signal crosses zero. In the sampled-derivative method, the error signal is the sampled value of the derivative of the data signal multiplied by the polarity of the signal at that time. Note both methods have inherent nonlinear functions in the error signal generation. It is shown in [22] that although the timing jitter performances are very





Figure 3.1. The conventional spectral line method for timing recovery, where  $f_b$  is the data rate, and is equal to  $\frac{1}{T}$ .



Figure 3.2. Threshold crossing method (a) and sampled-derivative method (b) [22].

similar, the timing phases recovered by the two methods are different if the pulse is not symmetrical. The phase recovered by the threshold crossing method is the time instant,  $\tau$ , such that  $f(\tau-T/2)=f(\tau+T/2)$ , and that recovered by the sampled-derivative method is the peak of f(t), where f(t) is the effective channel impulse response representing a symbol, and T is the time interval between pulses. The maximum eye opening as observed in the eye diagram is not necessarily at either of the two timing phases recovered by these two methods. However, if the impulse response is symmetrical around its peak, the two phases coincide at the maximum eye opening.
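The difference between the two recovered phases can be checked numerically. The sketch below is illustrative only: the two pulse shapes (`f_sym`, `f_asym`) and the search ranges are assumptions chosen for the example, not pulses taken from [22]. It solves  $f(\tau-T/2)=f(\tau+T/2)$  by bisection and locates the peak of f(t) by a grid search.

```python
import math

T = 1.0  # symbol interval, normalized


def f_sym(t):
    """Hypothetical symmetric pulse: raised-cosine hump with its peak at t = 0."""
    return 0.5 * (1.0 + math.cos(math.pi * t / T)) if abs(t) < T else 0.0


def f_asym(t):
    """Hypothetical asymmetric pulse: rises over 0.5T, decays over 1.5T, peak at t = 0."""
    if -0.5 * T < t <= 0.0:
        return 1.0 + t / (0.5 * T)
    if 0.0 < t < 1.5 * T:
        return 1.0 - t / (1.5 * T)
    return 0.0


def threshold_crossing_phase(f, lo=-0.4 * T, hi=0.4 * T, iterations=60):
    """Bisection for the tau that satisfies f(tau - T/2) = f(tau + T/2)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid - T / 2) - f(mid + T / 2) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def peak_phase(f, points=2000):
    """Grid search over [-T, T] for the instant where f(t) is largest."""
    grid = [-T + 2.0 * T * i / points for i in range(points + 1)]
    return max(grid, key=f)


for name, pulse in (("symmetric", f_sym), ("asymmetric", f_asym)):
    print(name, threshold_crossing_phase(pulse), peak_phase(pulse))
```

For the symmetric pulse both phases fall at the peak, while for this particular asymmetric pulse the threshold crossing phase moves to a quarter of a symbol interval after the peak.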

In [23], another method, namely the early-late gate method, was analyzed. The block diagram of the early-late gate method is shown in Fig. 3.3. The optimum timing phase in the sense of maximum likelihood is also derived. An important result of the study is that both the sampled-derivative and the early-late gate methods are good approximations to an implementation of the maximum likelihood strategy. Note again that a nonlinear operation, the full-wave rectification, is included in the early-late gate implementation.

The timing recovery methods described in this section are either continuous-time or sampled-data with sampling period much less than the data interval. Timing recovery techniques with lower sampling rates are the preferred approach as we will see after taking practical considerations into account.

#### 3.2. Optimum Timing Phase

The implementation of the maximum likelihood strategy is shown in Fig. 3.4, where a matched filter is inserted in the front end to generate an effective channel impulse response which is symmetrical around its center. The optimum timing phase for maximum likelihood detection in this configuration is the center of symmetry. All the timing recovery methods mentioned in the last section converge to this timing phase if there is a filter matched to whatever the channel



Figure 3.3. Early-late gate method [23]. The two sampling phases are either advanced by  $\delta$  or retarded by  $\delta$  with respect to the desired phase.



Figure 3.4. Implementation of maximum likelihood strategy [23].

characteristics happen to be. Because a matched filter is difficult to implement, more realistic data transmission systems use lowpass filters instead. What is then the optimum timing phase? It has been assumed in some studies [12,23] that the desired sampling instant is that at which the unequalized impulse response is a maximum. It was later shown by Lyon [24] that such a timing phase can result in a spectral null if a T-spaced transversal filter is used for equalization. In general, the best sampling phase depends on the subsequent equalization scheme. For instance, if no equalizer is used other than a slicing device for data decision, the phase with maximum eye opening in the eye diagram is the best sampling instant. It has also been shown that the timing at the peak of a periodic alternating data sequence is the optimum phase for a T-spaced transversal equalizer [25]. Since the timing phase plays an important role in equalization, it is useful to find out exactly what timing phase is recovered. In the next section, a technique to locate the recovered timing phase in the spectral line method is described.

# 3.3. On the Timing Phase Recovered by the Spectral Line Method

Let's consider the timing recovery in a baseband data transmission. The input signal assumes the form

$$x(t) = \sum_{k=-\infty}^{\infty} a_k \ g(t-kT)$$
 (3.1)

where  $\{a_k\}$  is the data message sequence and g(t) is the signaling pulse, sometimes also referred to as the *impulse response* of the effective channel, which includes the pulse shaping filter, transmit and receive filters, and the transmission medium. T is the symbol interval, and is equal to  $\frac{1}{f_b}$ , where  $f_b$  is the data rate (baud-rate). In a binary data system,  $a_k$  can only assume the values +1 and -1.

A nonlinear operation  $f(\cdot)$  is performed on x(t) and results in signal y(t)

$$y(t) = f(x(t))$$
(3.2)

A discrete line at the data rate is generated in the signal spectrum of y(t) after the non-linear operation  $f(\cdot)$  is performed on the input signal x(t). This frequency component is separated from the residual continuous spectral components by using a narrow bandpass filter or its equivalent, a phase-locked loop.

This technique is widely used in the passband systems where the baseband spectrum is relatively narrow compared to the carrier frequency and the channel has a relatively flat spectrum around the carrier frequency. The spectral line method can resolve satisfactory timing phase under the condition that g(t) is symmetrical about the peak. In most systems, Nyquist pulses are used for g(t) to minimize the intersymbol interference. Nyquist pulses are symmetrical around the peak, and therefore, spectral line method recovers the phase at the peak of the pulses.

However, problems occur for high baud rate transmission. In high speed passband modems, where the symbol rate can be 2400 baud or higher, the channel spectrum is no longer flat. In high speed baseband systems, where the baseband spectrum occupies the whole channel spectrum, the channel can cause large distortion of the pulse shape. The recovered timing phase for an asymmetrical pulse using the spectral line method is no longer at the peak. The recovered timing phase under such conditions is the topic of this section.

Because of the difficulties in dealing with the nonlinearity, only the square-law nonlinearity has been analyzed [26]. The analysis was done not only for the recovered timing phase but also to compare the timing jitter generated by the continuous spectrum. The jitter analysis turned out to be pessimistic because of the fact that the signal is cyclostationary. However, the power and the phase of the *timing tone* are still valid.

Assuming that G(f) is band limited to  $\frac{1}{T}$ , the magnitude and phase of the timing tone generated by the square-law nonlinearity are given by the complex amplitude

$$A_1 = \int_{-\infty}^{\infty} G\left(\frac{1}{T} - f\right)G(f)df \tag{3.3}$$

where G(f) is the Fourier transform of g(t).

It is obvious from Eq.(3.3) that the recovered timing phase for a symmetrical g(t) is at the center of symmetry. Since G(f) is real for a symmetrical g(t), so is  $A_1$  real, and the phase is zero. The phase for an arbitrary g(t) can also be calculated from Eq.(3.3).
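Eq. (3.3) can be evaluated in the time domain, because the Fourier transform of  $g^2(t)$  is the convolution of G(f) with itself, so that  $A_1 = \int g^2(t)\, e^{-j 2\pi t/T}\, dt$ . The sketch below uses this identity; the two pulses and the integration limits are assumptions made for illustration, and the midpoint rule is only one of several reasonable ways to approximate the integral.

```python
import cmath
import math

T = 1.0  # symbol interval, normalized


def g_sym(t):
    """Hypothetical symmetric pulse: raised-cosine hump centered at t = 0."""
    return 0.5 * (1.0 + math.cos(math.pi * t / T)) if abs(t) < T else 0.0


def g_asym(t):
    """Hypothetical asymmetric pulse: rises over 0.5T, decays over 1.5T, peak at t = 0."""
    if -0.5 * T < t <= 0.0:
        return 1.0 + t / (0.5 * T)
    if 0.0 < t < 1.5 * T:
        return 1.0 - t / (1.5 * T)
    return 0.0


def timing_tone(g, t_start=-2.0 * T, t_stop=3.0 * T, steps=20000):
    """Midpoint-rule estimate of A_1 = integral of g(t)^2 exp(-j 2 pi t / T) dt."""
    dt = (t_stop - t_start) / steps
    total = 0j
    for i in range(steps):
        t = t_start + (i + 0.5) * dt
        total += g(t) ** 2 * cmath.exp(-2j * math.pi * t / T)
    return total * dt


def tone_peak_offset(g):
    """Instant, relative to t = 0, at which the recovered timing tone peaks."""
    return -cmath.phase(timing_tone(g)) * T / (2.0 * math.pi)


print("symmetric pulse offset :", tone_peak_offset(g_sym))
print("asymmetric pulse offset:", tone_peak_offset(g_asym))
```

For the symmetric pulse the tone peaks at the center of symmetry. For this asymmetric pulse the tone peaks a few percent of a symbol interval after the pulse peak, which illustrates that the recovered phase is no longer at the peak.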

#### 3.3.1. The Importance of Timing Phase

If Nyquist pulses are received, such as in the case of narrow baseband transmission and flat channel spectrum, the optimum sampling time is at the center of symmetry, which is also the peak of the pulse. If the recovered timing is used for clocking the equalizer as well as for the detection, the definition of the optimum timing phase depends on the equalization scheme that follows. A commonly used equalizer is the transversal filter type linear equalizer. As shown in Figure 3.5, a tapped delay line transversal filter with spacing T ( $T = \frac{1}{f_b}$ ) takes samples  $\{x_k\}$  from the received signal x(t) and performs a filtering function on  $\{x_k\}$ . The tap coefficients are usually adaptive and are updated to minimize the mean squared detection error. Because the  $x_k$ 's are sampled from x(t) at rate  $\frac{1}{T}$ , the power spectrum of  $\{x_k\}$  is periodic with period  $\frac{1}{T}$ . The relationship between the power spectra of x(t) and  $\{x_k\}$ , denoted by S(f) and  $S_s(f)$  respectively, is shown in Figure 3.6. The spectra shown in Figures 3.6(b) and 3.6(c) are called the folded spectra and are functions of the sampling phase. Normally, S(f) and  $S_s(f)$  are complex.

The effect of the transversal filter is to generate a compensating transfer function proportional to  $\frac{1}{S_s(f)}$  and the output will have a flat folded spectrum identical to the folded spectrum of a baseband PAM signal with Nyquist pulses. A major drawback of



Figure 3.5. Structure of a tapped delay line transversal filter with spacing T.



Figure 3.6. (a). Magnitude of the power spectrum of a PAM data signal. (b). Magnitude of the power spectrum of a sampled PAM data signal, sampling at phase 1. (c). Magnitude of the power spectrum of a sampled PAM data signal, sampling at phase 2.

this equalizer is that for a power spectrum  $S_s(f)$  with nulls at some frequencies, the transfer function of the equalizer will have a very large gain at these frequencies, and consequently, enhance the noise. As noted above, the shape of  $S_s(f)$  depends on the sampling phase, and therefore, the timing recovery phase becomes important in this application.

Lyon [24] has shown that for normal telephone channel, where the frequency response decreases with frequency and where phase is a nonlinear function of frequency, a sampling phase at the peak of the pulse can create a null at the band-edge, at  $f = \frac{1}{2T}$ . Obviously, if the linear equalizer is to be used, sampling at the peak is not optimum. Lyon has proposed to select a sampling phase which maximizes the band-edge component. It is interesting to note that if the pulse is symmetrical, the phase which maximizes the band edge component is also at the center of symmetry.

Bridged taps, which are open-ended wire pairs attached to the main subscriber loops, can alter the characteristics of the transmission channel dramatically. Figure 3.7 shows the frequency response of a subscriber loop with three bridged taps intact. There are periodical valleys in the frequency response. In this case, nulls can exist within the band and make it more difficult to define the optimum timing phase.

The validity of the spectral line method for timing recovery can be evaluated if the timing phase recovered by this method for an arbitrary nonlinear function can be determined analytically. In the next section we present two approaches to calculating the timing phase obtained by the spectral line method.

## 3.3.2. Timing Phase Recovered by the Spectral Line Method

A brute force approach to calculating the timing phase recovered by the spectral line method is to generate a large number of random data symbols and the corresponding baseband pulse amplitude modulation (PAM) signal x(t) and also the corresponding nonlinearly processed signal y(t). Then periodically average y(t) for





Figure 3.7. (a). Frequency response of a twisted pair with 3 bridged taps intact. (b). Composite frequency response of the transmit filter, channel as shown in (a), and the receive filter.

0 < t < T with period T. A Fourier series expansion can then be performed on this averaged signal, and the phase of the Fourier component at frequency  $f = f_b$  is the recovered phase.
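The brute force procedure can be sketched as follows. The pulse `g`, the number of phases `R`, the number of symbols `K`, and the random seed are all assumptions chosen for the example. The pulse spans only two symbol intervals, so two pulses overlap at any instant.

```python
import cmath
import math
import random

T = 1.0   # symbol interval, normalized
R = 32    # phases per symbol interval (assumed resolution)
K = 2000  # number of random symbols (assumed)


def g(t):
    """Hypothetical pulse: sin^2 hump on [0, 2T), symmetric about t = T."""
    return math.sin(math.pi * t / (2.0 * T)) ** 2 if 0.0 <= t < 2.0 * T else 0.0


def recovered_tone(f, num_symbols=K, seed=1):
    """Average y(t) = f(x(t)) periodically and return its Fourier component at f_b."""
    rng = random.Random(seed)
    a = [rng.choice((-1, 1)) for _ in range(num_symbols)]
    avg = [0.0] * R
    for k in range(1, num_symbols):
        for r in range(R):
            t = r * T / R
            # Current pulse plus the tail of the previous pulse.
            x = a[k] * g(t) + a[k - 1] * g(t + T)
            avg[r] += f(x)
    avg = [v / (num_symbols - 1) for v in avg]
    # Fourier-series coefficient of the averaged waveform at f = f_b.
    return sum(v * cmath.exp(-2j * math.pi * r / R) for r, v in enumerate(avg)) / R


tone = recovered_tone(lambda x: x * x)  # square-law nonlinearity
print("magnitude:", abs(tone), "phase (rad):", cmath.phase(tone))
```

With this symmetric pulse the phase of the component at  $f_b$  is zero, meaning that the averaged waveform peaks at the instant where the previous pulse passes through its center.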

Another approach is to utilize the binary series expansion [2] on the non-linear function. Here we assume that the impulse response g(t) is finite in time and lasts only for N intervals, i.e.,

$$g(t) = 0, \qquad t < 0 \ \text{ or } \ t \ge NT \tag{3.4}$$

and we take advantage of the fact that x(t) is generated from a binary data sequence according to Eq.(3.1). Samples taken from x(t) are denoted as  $x_k$ 

$$x_k = \sum_{n=k-N+1}^{k} a_n \ g_{k-n} \tag{3.5}$$

$$x_k = \sum_{n=0}^{N-1} g_n \ a_{k-n} \tag{3.6}$$

After the nonlinear operation, the samples are

$$y_k = f(x_k) = f(a_k, a_{k-1}, \dots, a_{k-N+1})$$
 (3.7)

where  $f(a_k, a_{k-1}, \dots, a_{k-N+1})$  is a nonlinear function of N binary symbols. There are  $2^N$  possible combinations of these N bits, so  $y_k$  can assume only  $2^N$  possible values. It is then reasonable to express  $y_k$  as a linear function of the  $2^N$ -dimensional augmented binary vector  $\mathbf{z}$ , where

$$\mathbf{z}^{T} = (1, a_{0}, a_{1}, \cdots, a_{N-1}, a_{0}a_{1}, a_{0}a_{2}, \cdots, a_{N-2}a_{N-1}, \cdots, a_{0}a_{1}\cdots a_{N-2}a_{N-1}) \tag{3.8}$$

in the fashion of inner product of two vectors

$$y_{k} = \mathbf{z}^{T} \mathbf{c} \tag{3.9}$$

where c is a  $2^N$ -dimensional coefficient vector.

If we know  $g_k$  and  $f(\cdot)$ , we can calculate  $x_k$  and  $y_k$  for all  $2^N$  combinations of  $(a_k, a_{k-1}, \dots, a_{k-N+1})$  and form a  $2^N$ -dimensional vector  $\mathbf{f}$ .

$$\mathbf{f} = \begin{bmatrix} f(+1,+1,+1,\dots,+1) \\ f(-1,+1,+1,\dots,+1) \\ f(+1,-1,+1,\dots,+1) \\ f(+1,+1,-1,\dots,+1) \\ \vdots \\ f(-1,-1,+1,\dots,+1) \\ \vdots \\ f(-1,-1,-1,\dots,-1) \end{bmatrix} \tag{3.10}$$

Also define  $2^N \times 2^N$  matrix M

$$\mathbf{M} = \left[ \mathbf{z}_1, \, \mathbf{z}_2, \, \cdots, \, \mathbf{z}_{2^N} \, \right] \tag{3.11}$$

where

$$\mathbf{z}_1 = \begin{bmatrix} +1, +1, +1, \dots, +1 \end{bmatrix}^T$$
 (3.12)

and

$$\mathbf{z}_2 = \begin{bmatrix} +1, -1, +1, \dots, +1 \end{bmatrix}^T$$
 (3.13)

follow the same sequence as in Eq.(3.10). Matrix M is orthogonal, and

$$\mathbf{M}\mathbf{M}^T = 2^N \mathbf{I} \tag{3.14}$$

where I is the  $2^N \times 2^N$  identity matrix. Also note that all elements of M are either 1 or -1. Orthogonal matrices whose elements are 1 or -1 are called *Hadamard matrices*, and have been used in other fields of signal processing, like image encoding [27]. To compute the coefficient vector c, note that

$$\mathbf{M}\mathbf{c} = \mathbf{f} \tag{3.15}$$

which admits the closed form solution

$$\mathbf{c} = \frac{1}{2^N} \mathbf{M}\mathbf{f} \tag{3.16}$$

In deriving Eq. (3.16), we use the relation

$$\mathbf{M}^{T} = \mathbf{M} = 2^{N} \mathbf{M}^{-1} \tag{3.17}$$
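The expansion can be illustrated for a small N. In the sketch below the ordering of the symbol patterns and of the product terms is an assumption: both are indexed by bit masks, which yields the symmetric Sylvester form of the Hadamard matrix rather than the degree-ordered arrangement of Eq. (3.8). The tap values `g_taps` are hypothetical.

```python
def popcount(v):
    """Number of set bits in a non-negative integer."""
    return bin(v).count("1")


def hadamard(N):
    """M[i][j] = product of a_n over the subset j, for symbol pattern i.

    Bit n of i set means a_n = -1; bit n of j set means a_n is in the product.
    """
    size = 1 << N
    return [[-1 if popcount(i & j) % 2 else 1 for j in range(size)]
            for i in range(size)]


def symbols_from_index(i, N):
    """Decode pattern index i into the list (a_0, ..., a_{N-1}) of +/-1 values."""
    return [-1 if (i >> n) & 1 else 1 for n in range(N)]


def expansion_coefficients(f, N):
    """Solve M c = f with c = (1 / 2^N) M f, valid because M is symmetric."""
    size = 1 << N
    M = hadamard(N)
    fvec = [f(symbols_from_index(i, N)) for i in range(size)]
    return [sum(M[j][i] * fvec[i] for i in range(size)) / size
            for j in range(size)]


g_taps = [0.2, 1.0, 0.5]  # hypothetical samples g_0, g_1, g_2


def square_law(a):
    """y_k for a square-law device acting on x_k = sum of g_n a_n."""
    return sum(gn * an for gn, an in zip(g_taps, a)) ** 2


c = expansion_coefficients(square_law, 3)
print(c)
```

For the square-law device only the constant term and the pairwise products survive: the first coefficient is the sum of the squared taps and the coefficient of  $a_0 a_1$  is  $2 g_0 g_1$ .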

This vector c is evaluated at a specific phase within a period T. The same derivation can be performed for other phases. If we divide a period T into R phases, then R vectors can be evaluated, each denoted by  $c^r$ , where  $0 \le r \le R-1$ .

After computing all vectors  $c^r$ ,  $0 \le r \le R-1$ , from f at the various phases, which in turn is computed from  $g_k$  at the corresponding phases, we can evaluate the ensemble average of  $y_k^r$  over all the possible combinations of  $\{x_k\}$ .

$$y_k^r = \mathbf{z}^T \ \mathbf{c}^r \tag{3.18}$$

$$E\{y_k^r\} = E\{\mathbf{z}^T \mathbf{c}^r\} = E\{\mathbf{z}^T\}\mathbf{c}^r \tag{3.19}$$

Note that

$$E\{\mathbf{z}^T\} = (+1,0,0,...) \tag{3.20}$$

It is simple to see that the ensemble average of y(t) at phase r is simply the first term in  $c^r$ . Again,  $c^r$  can be easily evaluated if  $g_k$  and  $f(\cdot)$  are known.

Define

$$c(r)=(1,0,0,...)\cdot c^r$$
 (3.21)

Taking the discrete Fourier transform of c(r), we obtain the phase and the magnitude of the timing tone at multiples of  $f_b$ .

When the ensemble average of y(t), namely c(r), is obtained, the timing phase resolved by some other timing recovery techniques can also be evaluated. For example, the wave difference method (WDM) [28,29] resolves the phase  $r_0$  such that

$$c(r_0 + \frac{R}{4}) = c(r_0 - \frac{R}{4}) \tag{3.22}$$

This phase can be easily obtained from c(r). Usually, R is chosen to be large for higher accuracy. Moreover, M is a function of N only, and can be computed once and tabulated for later use.
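The evaluation of c(r), its discrete Fourier transform, and the WDM condition of Eq. (3.22) can be sketched together. The pulse `g`, the full-wave rectifier used as the nonlinearity, and the rule for choosing between the two solutions of Eq. (3.22) are assumptions made for illustration. The ensemble average is taken directly over all  $2^N$  symbol patterns, which equals the first term of  $c^r$ .

```python
import cmath
import math
from itertools import product

T = 1.0  # symbol interval, normalized
N = 2    # pulse duration in symbol intervals
R = 64   # number of phases per symbol interval


def g(t):
    """Hypothetical pulse: sin^2 hump on [0, 2T), symmetric about t = T."""
    return math.sin(math.pi * t / (2.0 * T)) ** 2 if 0.0 <= t < 2.0 * T else 0.0


def ensemble_average(f, r):
    """c(r): mean of f(x) over all 2^N equiprobable symbol patterns at phase r."""
    t = r * T / R
    taps = [g(t + n * T) for n in range(N)]
    total = 0.0
    for a in product((-1, 1), repeat=N):
        total += f(sum(an * gn for an, gn in zip(a, taps)))
    return total / 2 ** N


def timing_tone(c):
    """DFT bin of c(r) at the baud rate f_b."""
    return sum(v * cmath.exp(-2j * math.pi * r / R) for r, v in enumerate(c)) / R


def wdm_phase(c):
    """Phase index r0 with c(r0 + R/4) = c(r0 - R/4), preferring the larger c(r0)."""
    q = R // 4

    def key(r):
        diff = abs(c[(r + q) % R] - c[(r - q) % R])
        return (round(diff, 9), -c[r])

    return min(range(R), key=key)


c = [ensemble_average(abs, r) for r in range(R)]  # full-wave rectifier
tone = timing_tone(c)
print("tone magnitude:", abs(tone), "tone phase:", cmath.phase(tone))
print("WDM phase index r0:", wdm_phase(c))
```

For this symmetric pulse both the spectral line phase and the WDM phase land on r = 0, the instant at which the previous pulse passes through its center.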

#### 3.3.3. Prefiltering

It has been reported in [30,31] that a prefilter with certain properties in front of the square-law device can greatly reduce the jitter. Obviously, if the impulse response of this prefilter is denoted by h(t) as shown in Figure 3.8, this impulse response can be incorporated with g(t) to form an overall effective impulse response and the same approach as described in the previous section can be used to calculate the timing phase. Define u(t) as the effective impulse response, where

$$u(t) = g(t)*h(t)$$
 (3.23)

is the convolution of g(t) and h(t). Franks et al. [30] have shown that if the Fourier transform U(f) of u(t) is real, bandlimited to the interval  $\frac{1}{4T} < |f| < \frac{3}{4T}$ , and even symmetric around  $\frac{1}{2T}$ , the timing signal will have zero crossings at multiples of  $\frac{T}{2}$ , provided that the narrow bandpass filter that follows also has even symmetry about  $\frac{1}{T}$ . Here we will show that if h(t) satisfies the following conditions:

- [1] H(f), the Fourier transform of h(t), is narrow band and centered at  $\frac{1}{2T}$ , and
- [2] H(f) is Hermitian symmetric about  $\frac{1}{2T}$ , i.e.,

$$H(\frac{1}{2T} + \delta) = H^*(\frac{1}{2T} - \delta)$$
 (3.24)

and if the narrow band filter which follows also has the Hermitian symmetry about  $\frac{1}{T}$ , the recovered timing phase maximizes the band edge component and the timing signal has zero crossings at multiples of  $\frac{T}{2}$ .



Figure 3.8. Spectral line method with a prefilter to suppress timing jitter.

Let us assume that G(f) is complex and

$$G\left(\frac{1}{2T}\right) = G_0 e^{j\theta} \tag{3.25}$$

$$G\left(\frac{-1}{2T}\right) = G_0 e^{-j\theta} \tag{3.26}$$

Note that they are complex conjugates because g(t) is real. The band edge component when sampled at  $kT + \tau$  is

$$G_{s}(\frac{1}{2T}) = G_{0}e^{j(\theta - 2\pi\frac{\tau}{2T})} + G_{0}e^{-j(\theta - 2\pi\frac{\tau}{2T})} \tag{3.27}$$

The squared magnitude of this component is then

$$|G_s(\frac{1}{2T})|^2 = 2G_0^2 + 2G_0^2 \cos(2\theta - 2\pi \frac{\tau}{T}) \tag{3.28}$$

The band edge component is maximized at phase  $\tau_o$  such that  $\frac{2\pi\tau_o}{T}=2\theta$ .
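The maximizing phase can be confirmed with a short numerical search. The sketch assumes that the folded contribution from  $-\frac{1}{2T}$  enters with the conjugate phase of the contribution from  $+\frac{1}{2T}$ , so that the two terms add to a real cosine. The values of `G0` and `theta` are hypothetical.

```python
import cmath
import math

T = 1.0      # symbol interval, normalized
G0 = 0.8     # hypothetical band-edge magnitude
theta = 0.6  # hypothetical band-edge phase in radians


def band_edge_power(tau):
    """Squared magnitude of the folded band-edge component for sampling phase tau."""
    upper = G0 * cmath.exp(1j * (theta - math.pi * tau / T))
    lower = G0 * cmath.exp(-1j * (theta - math.pi * tau / T))
    return abs(upper + lower) ** 2


def closed_form(tau):
    """2 G0^2 + 2 G0^2 cos(2 theta - 2 pi tau / T)."""
    return 2.0 * G0 ** 2 * (1.0 + math.cos(2.0 * theta - 2.0 * math.pi * tau / T))


steps = 1000
best_tau = max((i * T / steps for i in range(steps)), key=band_edge_power)
print("best tau:", best_tau, "predicted:", theta * T / math.pi)
```

The search returns the grid point nearest to  $\tau_o = \theta T / \pi$ , in agreement with the condition  $\frac{2\pi\tau_o}{T} = 2\theta$ .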

From Eq. (3.3), the timing tone after the square-law device is

$$A_{1} = \int_{-\infty}^{\infty} U(f)U(\frac{1}{T} - f)df$$
 (3.29)

where

$$U(f) = G(f)H(f)$$
(3.30)

Assuming that G(f) is constant over the passband of H(f), which holds when H(f) is narrow band, the timing tone becomes

$$A_{1} = (G(\frac{1}{2T}))^{2} \int_{-\infty}^{\infty} H(f) H(\frac{1}{T} - f) df$$
 (3.31)

It follows that if H(f) is Hermitian symmetric and zero phase at  $\frac{1}{T}$ , the phase of the timing tone is exactly equal to  $2\theta$ , which maximizes the band edge component. Lyon has shown the same result for timing recovery in QAM systems, where the in-phase and quadrature timing signals are added together after the square-law devices [31]. It can also be shown that if this timing signal is passed through a narrow band filter with Hermitian symmetry about  $\frac{1}{T}$ , the resulting timing waveform satisfies the requirement of periodic zero crossings at multiples of  $\frac{T}{2}$  [30].

### 3.3.4. Summary

The timing phase obtained by the spectral line method using an arbitrary nonlinear function can be calculated analytically if the impulse response  $g_k$  and the nonlinear function  $f(\cdot)$  are known. This timing phase can then be used to compute the corresponding folded spectrum and to evaluate the validity of the method in the presence of linear equalization. For a normal telephone channel, where the magnitude of the frequency response decreases with frequency, a null can occur at the band edge due to the aliasing caused by sampling at less than the Nyquist rate. This problem can be avoided by using a fractionally spaced equalizer, which is a linear equalizer with a higher sampling rate [32-35], or a timing recovery technique which maximizes the band edge component. Fractionally spaced equalizers sample at a higher rate and are therefore more complex. For channels which have periodic valleys or nulls within the band, such as when bridged taps are present, a linear equalizer will always enhance the noise at those frequencies. Other equalization methods, such as decision feedback equalization (DFE) [36-38] or maximum likelihood estimation [39], will result in better performance.

### 3.4. Timing Recovery in DSL

In a digital subscriber loop system, a transceiver on the customer premises communicates with the transceiver in the central office via a 2-wire subscriber loop. The transmitter is required to be synchronized to the received signal in order to ensure that the digital switch with which the transceiver located at the central office interfaces transmits and receives at the same rate. A direct way to achieve this synchronization is to use the recovered clock to drive the transmitter at the customer end. It is more complicated at the central office end because the transceiver at this end gets its clock from the switch. Two different arrangements for the transceiver at the central office end, depending on the sampling rate, will be described in more detail in Chapters 6 and 7.

#### 3.4.1. Practical Considerations

Since the transmitter clock is synchronized to the receiver clock and the receiver clock is loop-timed at the customer end, the recovered timing clock is used for the echo canceller in the echo-cancellation mode system. Equalization is also required at the desired data rate, around 160 kbps, due to the channel impairments, especially those caused by the bridged taps. The echo canceller, filters, equalizer, and detector all employ the recovered clock to perform their intended functions. It is practical to consider only discrete-time techniques for these functions because no known continuous-time circuit techniques can meet the required specifications when implemented in VLSI technology. The interrelated design issues of the timing recovery, the echo canceller, and the equalizer contribute to the practical considerations for timing recovery as listed in the following subsections.

#### 3.4.1.1. Echo Canceller

The time duration of the echo response of a subscriber loop depends on its length, the location and length of bridged taps, etc., and may extend to  $50 \mu s$  or more. The desired 160 kbps line data rate translates to a baud interval of  $6.25 \mu s$ , so that the echo canceller must be designed to operate on 8 or more previously transmitted symbols. Previously reported echo cancellers operate on samples taken at 2 to 8 times the baud rate [1,6,40] because of requirements from the timing recovery circuits. The penalty for oversampling by a factor of k is that the complexity of the echo canceller is proportional to k. Because the echo canceller is the most complex portion of a DSL transceiver, there is a high priority to minimize the sampling rate used. The speed of operation in the echo canceller also increases proportionally with the sampling rate. This implies a factor of k increase in the speed of the data converter if one is used in the echo canceller.

In order to achieve 50 to 60 dB echo cancellation as required in practical applications, the timing jitter on the clock of the echo canceller must be held at a very low level [41,42]. Generally speaking, to achieve 60 dB of echo cancellation requires about -60 dB clock jitter, which is 1/1000 of a baud period, or 6.25 ns for a 160 kbps line transmission rate. A simple approximation of how much jitter is allowed to achieve a certain degree of echo cancellation is discussed in Sections 4.1 and 4.2.
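The figures quoted in this subsection follow from simple arithmetic, reproduced below as a check. The variable names are illustrative, and the jitter estimate uses only the rule of thumb stated above, namely that the jitter as a fraction of a baud period should match the desired cancellation expressed as an amplitude ratio.

```python
import math

line_rate_baud = 160_000                           # symbols per second
baud_interval_ns = 1_000_000_000 / line_rate_baud  # 6250 ns = 6.25 us
echo_span_ns = 50_000                              # 50 us echo response

# Number of baud-spaced taps needed to span the echo response.
min_taps = math.ceil(echo_span_ns / baud_interval_ns)

# Rule of thumb: 60 dB of cancellation needs jitter about 60 dB below a baud period.
cancellation_db = 60.0
jitter_fraction = 10.0 ** (-cancellation_db / 20.0)
jitter_ns = jitter_fraction * baud_interval_ns


def taps_with_oversampling(k):
    """Tap count when the echo canceller runs at k times the baud rate."""
    return k * min_taps


print("baud interval (us):", baud_interval_ns / 1000.0)
print("minimum taps:", min_taps)
print("allowed jitter (ns):", jitter_ns)
print("taps at 8x oversampling:", taps_with_oversampling(8))
```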

### 3.4.1.2. Line Impairments

The characteristics of the twisted pairs present impairments on the transmission channel. The presence of bridged taps also has a great impact on the transmission. Among various impairments, the most serious are the  $\sqrt{f}$  attenuation, which in addition to attenuating the signal magnitude along the line, disperses the signal pulse and causes severe *intersymbol interference (ISI)*, and the reflection of the signal from the bridged taps, which causes distortion of the pulse shape and creates nulls in the signal spectrum. Figs. 3.9(a) and (b) show graphically the effects of lines of various lengths on a fairly ideal pulse in the time domain. The data were taken from computer simulation using *Linemod* [43], a line modelling simulation package. One important observation is that bridged taps only affect the tail portion of the pulse (besides the additional attenuation).

To be useful with a wide range of line configurations, we desire a robust timing recovery which is insensitive to distortion caused by bridged taps.

## 3.4.1.3. Equalization Techniques

The dispersion of the signal pulse due to the  $\sqrt{f}$  attenuation and the distortion due to the presence of bridged taps make adaptive equalization indispensable. Linear equalizers, which have been used extensively in data communication applications, have



Figure 3.9(a). The received minimum ISI pulse on 1, 2, 3 km lines.



Figure 3.9(b). The received minimum ISI pulse on 2 mile line with and without bridged taps.

two drawbacks in the VLSI implementation of digital subscriber loops. Multiplication, an essential operation within the linear equalizer, takes a lot of silicon area and/or computation time. The high data rate, 160 kb/s, rules out software multipliers. Another important drawback of the linear equalizer is the noise enhancement that occurs when bridged taps cause spectral nulls in the input signal.

Adaptive equalization without multiplications can be achieved with a decision feedback equalizer (DFE). The DFE uses past decisions to synthesize the postcursor intersymbol interference (ISI) caused by the previous pulses. This replica of the ISI is then subtracted from the current sampled value. Because of the inherent nonlinearity of the DFE, the noise enhancement problem is reduced [38]. The major problem of the DFE is that this feedback transversal filter cannot cancel the precursor distortion. Unless the timing recovery resolves a phase with minimum precursors, a forward transversal filter with multiplications is needed to cancel the precursors. As previously mentioned, the best timing phase depends on the equalization scheme used. DFE is by far the best candidate for VLSI implementation if a timing recovery technique can be designed to recover a timing phase which minimizes the precursor ISI.
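The multiplication-free nature of a binary DFE can be seen in a minimal sketch. The postcursor values are hypothetical, the coefficients are assumed to be already known rather than adapted, and the channel is noise-free with no precursor. Because past decisions are +1 or -1, the ISI replica is formed by adding or subtracting coefficients only.

```python
import random


def dfe_detect(samples, postcursors):
    """Binary DFE: subtract the ISI synthesized from past decisions, then slice."""
    decisions = []
    for x in samples:
        isi = 0.0
        for i, h in enumerate(postcursors, start=1):
            if len(decisions) >= i:
                # Decisions are +/-1, so each product is just an add or a subtract.
                isi += h if decisions[-i] > 0 else -h
        decisions.append(1 if x - isi >= 0.0 else -1)
    return decisions


rng = random.Random(0)
symbols = [rng.choice((-1, 1)) for _ in range(200)]
cursor = 1.0
postcursors = [0.7, 0.3, 0.1]  # hypothetical postcursor ISI, no precursor

# Noise-free received samples at the baud rate.
received = []
for k in range(len(symbols)):
    x = cursor * symbols[k]
    for i, h in enumerate(postcursors, start=1):
        if k - i >= 0:
            x += h * symbols[k - i]
    received.append(x)

decisions = dfe_detect(received, postcursors)
slicer_only = [1 if x >= 0.0 else -1 for x in received]
dfe_errors = sum(d != s for d, s in zip(decisions, symbols))
slicer_errors = sum(d != s for d, s in zip(slicer_only, symbols))
print("DFE errors:", dfe_errors, "slicer-only errors:", slicer_errors)
```

In this example the postcursors sum to more than the cursor, so a plain slicer makes errors while the DFE recovers every symbol. A precursor term would not be removed by this structure, which is why the recovered timing phase matters.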

A more detailed discussion of the equalization techniques and their tradeoffs is deferred to Chapter 5.

### 3.4.2. Objectives of Timing Recovery

Summarizing the above considerations, the design goals are:

- [1] Sampled-data timing recovery technique with minimum sampling rate.
- [2] Minimum timing jitter.
- [3] Insensitivity to the effect of bridged taps.
- [4] Decision feedback equalizer for equalization.

- [5] Recovered timing phase with minimum precursor intersymbol interference.

There is another incentive toward sampling at the baud rate for timing recovery. It offers a potential advantage of combining DFE with the echo canceller. This configuration removes both the echo replica and the far end signal from the error signal used for echo canceller adaptation if the precursor ISI is zero [11]. The resultant residual error of the echo canceller is smaller and the convergence of the echo canceller can also be speeded up.

### 3.4.3. Summary

Two sampled-data timing recovery methods, the wave difference method (WDM) [28,29] and the baudrate sampling technique (BST) [44], which sample at twice the baud rate and at the baud rate respectively, have been found to be very attractive in the DSL application. They will be discussed in much more detail in Chapters 6 and 7.

### CHAPTER 4

# Phase-Locked Loops

A phase-locked loop (PLL) can be used to extract a sinusoidal signal from a noisy background, among many applications [45]. The PLL has been found to be extremely useful in timing recovery, where the timing tone is embedded in heavy noise. It has also been shown to be much more effective in rejecting high frequency noise than a high-Q tank circuit [41,42]. The high frequency noise is a major source of the pattern jitter in timing recovery applications. It will be shown in Sections 4.1 and 4.2 how the timing jitter, caused basically by the noise accompanying the timing tone, can affect the performance of the echo canceller. We will also show why a digital phase-locked loop (DPLL) is not practical in this application unless some interpolation technique is employed.

This chapter is by no means meant to give an in-depth treatment of phase-locked loops. Only those issues related to the specific application in digital subscriber loops will be mentioned. More extensive material can be found elsewhere [45-47].

A PLL is a device by means of which the phase of a frequency-modulated oscillator output signal is obliged to follow that of the input signal [46]. A diagram of this device is shown in Fig. 4.1. The phase detector compares the phases of the two input signals, one the input signal  $y_i(t)$ , the other the output of the voltage controlled oscillator (VCO)  $y_o(t)$ , and generates an output which is a function of the phase difference between the two signals. This error signal is passed through a loop filter. The output of the loop filter then serves as the control signal of the VCO. There is a great number of variations in the realization of each component in a PLL. For example, the input block, which is now a phase detector, can be a combination of a nonlinear operation and



Figure 4.1. Block diagram of a phase-locked loop (PLL).

a phase detector when the input  $y_i(t)$  is a cyclo-stationary signal such as a pulse amplitude modulation (PAM) signal. The VCO can be either an analog VCO with output frequency varying as a continuous function of the VCO control signal, or a digital VCO with discrete changes in the phase of the output of the VCO.

In Sections 4.3 and 4.4, we will discuss various issues on designing a VCO and PLL. Two types of phase detector, used in WDM and baudrate sampling technique respectively, will be described in Chapters 6 and 7.

#### 4.1. Timing Jitter Effect on Performance of Echo Canceller

An echo canceller implemented with a transversal filter is shown in Fig. 4.2, where a digital delay line contains the past and the current samples from the received data signal with delay equal to T, the time interval between successive data symbols. The tap coefficients, denoted by  $C_n$ , are updated recursively using an adaptive algorithm, which is not shown in the figure. The summer sums up the products of the data bits and the corresponding tap coefficients to generate the echo replica to be subtracted from the received signal.

Shown in Fig. 4.3 is a representative sample of the effective impulse response of the echo channel, where h(nT) denotes the value of the impulse response at time nT. Assume that at time t=kT,  $a_k$  is transmitted with  $a_{k-1},a_{k-2},...$  being the previously transmitted data bits. If all the previous pulses are transmitted at the right time instants, i.e., no timing jitter in the transmitter clock, the echo received at t=kT is

$$e_k = \sum_{i=0}^{N} a_{k-i} h(iT) \tag{4.1}$$

Evidently, the tap coefficients $C_n$ should converge to $h(nT)$ to reproduce the echo if the convergence is perfect. Now assume that the tap coefficients have already converged perfectly and that a timing jitter of magnitude $\Delta T$ was experienced by the pulse associated with $a_{k-2}$ when it was transmitted. The actual echo at t=kT is



Figure 4.2. Echo canceller implemented with a transversal filter.



Figure 4.3. A representative echo response.

$$\hat{e}_k = a_k h(0) + a_{k-1} h(T) + a_{k-2} h(2T + \Delta T) + a_{k-3} h(3T) + \dots \tag{4.2}$$

The replica of the echo is still $e_k$ as in Eq. (4.1) because the tap coefficients have converged after a lengthy convergence process with a long time constant. The long time constant, implemented by a small step size, prevents the adaptive filter from tracking any fast change in the echo channel. The residual error resulting from this timing jitter in the transmitter clock at time t = kT is therefore

$$r_k = h(2T) - h(2T + \Delta T) \tag{4.3}$$

If there is a timing jitter of $\Delta T$ at the receiver clock, which will also appear at the transmitter clock, this is equivalent to all the previously transmitted pulses experiencing a time displacement of $-\Delta T$. The residual error due to imperfect echo cancellation is

$$r_k = (h(T) - h(T - \Delta T)) + (h(2T) - h(2T - \Delta T)) + (h(3T) - h(3T - \Delta T)) + \cdots \tag{4.4}$$

This is the worst case residual error due to a single timing jitter. Timing jitter may happen at every clock instant in a random fashion. The jitter spectrum, if the jitter is random, is used to characterize the timing jitter.

It is clear that the residual echo due to any timing jitter depends on the timing jitter magnitude and the transmitted pulse shape. We will establish an approximate relationship between the residual echo and the jitter magnitude in the next section.

# 4.2. Residual Error in Echo Canceller due to Timing Jitter

A detailed study of the effects of timing jitter on digital subscriber loop echo cancellers has been reported [41,42]. It is shown in [41] that the residual steady-state mean squared error is a function of the step size in the echo canceller, the echo impulse response, and the jitter power spectrum, and that it is especially sensitive to the high-frequency part of the jitter spectrum. It is also concluded that a high-Q tuned circuit is not sufficient for rejecting the high-frequency jitter spectrum, and thus a tuned circuit is deemed inferior to a PLL in this application.

Estimating the residual error is useful in designing the VCO and PLL for timing recovery, especially in a DPLL, where the timing jitter has a fixed magnitude and the frequency of jitter occurrence can be controlled. A very simple approximation of the residual error is described below.

Assume that only one jitter component is nonzero within a time period NT, where N is the duration of the echo impulse response in symbol intervals. From Eq. (4.3), the residual echo is

$$r_k = a_k [h(mT) - h(mT + \Delta T)] \tag{4.5}$$

where mT is the time instant when the jitter occurs and $\Delta T$ is the magnitude of the jitter component. When $\Delta T$ is small, Eq. (4.5) can be approximated by

$$r_k = a_k \Delta T \ h'(mT) \tag{4.6}$$

where h'(mT) is the derivative of h(t) evaluated at mT. The residual error in the echo canceller due to a single jitter component is thus proportional to the jitter magnitude $\Delta T$ and to the derivative of the echo impulse response at this instant, h'(mT).
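As a quick numerical check of the first-order approximation in Eq. (4.6), the sketch below compares the magnitude of the exact difference in Eq. (4.5) with the magnitude of $\Delta T\,h'(mT)$. The exponentially decaying response `h`, its time constant, and the numbers used are purely illustrative assumptions; they are not the measured echo response of Fig. 4.3.

```python
import math

def h(t, tau=2.0):
    """Hypothetical echo impulse response (assumed shape, illustration only)."""
    return math.exp(-t / tau)

def h_prime(t, tau=2.0):
    """Analytic derivative of the hypothetical response above."""
    return -math.exp(-t / tau) / tau

def residual_exact(m, T, dT):
    """|h(mT) - h(mT + dT)|, i.e. Eq. (4.5) with a unit-magnitude data symbol."""
    return abs(h(m * T) - h(m * T + dT))

def residual_first_order(m, T, dT):
    """|dT * h'(mT)|, i.e. Eq. (4.6) with a unit-magnitude data symbol."""
    return abs(dT * h_prime(m * T))

if __name__ == "__main__":
    T, dT = 1.0, 1e-3  # symbol interval and jitter magnitude (arbitrary units)
    for m in range(4):
        print(m, residual_exact(m, T, dT), residual_first_order(m, T, dT))
```

For a jitter that is small compared with the time scale of the response, the two columns should agree closely; the gap grows as $\Delta T$ is increased.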

In practical applications, the echo impulse response is band-limited to f=1/T, and may have a peak at low frequency and a slow roll-off to f=1/T. A typical echo impulse response, assuming a simple hybrid with a 110-ohm balance circuit and the measured characteristic impedance versus frequency of a 26-gauge twisted-pair pulp cable, is shown in Fig. 4.3. It is reasonable to compute the residual error at the frequency f=1/(2T), half the highest frequency. For the worst case, h'(mT) would then be approximated by $h(mT)\omega_c$, where $\omega_c=\frac{2\pi}{T}$. The residual error is estimated as

$$r_k \approx a_k \ h(mT) \ \Delta T \times \frac{2\pi}{T} \tag{4.7}$$

Therefore, the ratio of the residual echo to the uncancelled echo is roughly on the order of  $\Delta T/T$ . For a 60 dB echo cancellation,  $\Delta T$  should be in the range of  $\frac{1}{1000}T$ , or 6.25 ns at 160 kbps data rate.
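The arithmetic behind this estimate can be sketched as follows. The function names are hypothetical, and the code assumes one bit per symbol so that the symbol interval T is the reciprocal of the data rate. The first function uses the order-of-magnitude rule $\Delta T/T$ from the text; the second keeps the $2\pi$ factor of Eq. (4.7) and therefore gives a somewhat tighter bound.

```python
import math

def max_jitter_seconds(cancellation_db, symbol_rate_hz):
    """Jitter for which dT/T equals the target residual-to-echo amplitude ratio.

    Order-of-magnitude rule from the text: the 2*pi factor is ignored.
    """
    T = 1.0 / symbol_rate_hz
    return T * 10.0 ** (-cancellation_db / 20.0)

def max_jitter_seconds_with_2pi(cancellation_db, symbol_rate_hz):
    """Same target, but using the ratio 2*pi*dT/T implied by Eq. (4.7)."""
    return max_jitter_seconds(cancellation_db, symbol_rate_hz) / (2.0 * math.pi)

if __name__ == "__main__":
    print(max_jitter_seconds(60.0, 160e3))           # about 6.25e-9 s
    print(max_jitter_seconds_with_2pi(60.0, 160e3))  # about 1e-9 s
```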

### 4.3. PLL

As shown in Fig. 4.1, a PLL consists of a phase detector, a loop filter, and a VCO. Two types of phase detector, used in WDM and BST respectively, will be discussed in later chapters. In this chapter, we will assume that an adequate phase detector generates an appropriate error signal to the loop filter. The analog phase-locked loops (APLL) and the digital phase-locked loops (DPLL) will be described in this section. Another essential component in a PLL, the VCO, is discussed in the next section.

# 4.3.1. Analog PLL

There is no clear boundary between an APLL and a DPLL. A common approach to separating the two is that all the building blocks, including the phase detector, the loop filter, and the VCO, are analog in an APLL. If one or more building blocks in a PLL are digital, it is a DPLL.

An APLL can be further divided into two groups, the continuous-time APLL and the discrete-time APLL. A continuous-time APLL has all the building blocks in continuous-time and continuous amplitude. A second order APLL, with a first order loop filter, has enough degrees of freedom and yet is still easy to handle analytically. The first-order proportional-plus-integral loop filter, with transfer function

$$F(s) = \frac{1+s\,\tau_2}{s\,\tau_1} \tag{4.8}$$

offers the versatility of independent choice of the three important parameters of the PLL, and has been extensively used and studied [45]. The three parameters are $\omega_n$, the natural frequency, $\zeta$, the damping factor, and $K_v$, the DC gain, which is equal to

$$K_{v} = K_{o} K_{d} F(0) \tag{4.9}$$

where $K_o$ is the VCO gain, $K_d$ is the gain of the phase detector, and F(0) is the DC gain of the loop filter [45]. The closed-loop transfer function of the system, shown in Fig. 4.4, expressing the phase error $\theta_e(s)$ in the Laplace domain is

$$\frac{\theta_e(s)}{\theta_i(s)} = \frac{s^2}{s^2 + 2\zeta\omega_n s + \omega_n^2} \tag{4.10}$$

It can be shown that the continuous-time APLL is always stable.

As we move from a continuous-time system to a discrete-time, sampled-data system, the first concern is stability. Since the two types of phase detector discussed in this work operate in the sampled-data domain, we will limit ourselves to the discrete-time APLL with a sampled-data phase detector. The loop filter which follows the sampled-data phase detector can be either a continuous-time filter, as shown in Fig. 4.5, or a sampled-data loop filter, as shown in Fig. 4.6.

In the case of a continuous-time loop filter with transfer function equal to Eq. (4.8), the *Z-transform* of the closed-loop transfer function is

$$\frac{\theta_e(z)}{\theta_i(z)} = \frac{1}{1 + G(z)} \tag{4.11}$$

where G(z) is the Z-transform of the open-loop transfer function of Fig. 4.5, and is

$$G(z) = K_d K_o K_n \frac{z (T^2 + 2T \tau_2)}{2\tau_1 (z - 1)^2} \tag{4.12}$$

where T is the sampling period. It can be shown that this discrete-time APLL system is *conditionally stable* [48]. If we define the parameters  $\omega_n$  and  $\zeta$  as in the continuous-time APLL case, the condition for stability is

$$4\zeta > \omega_n T \tag{4.13}$$

The requirement on the sampling rate is very loose because $\omega_n$ is small in our application.

Figure 4.4. s-domain representation of a continuous PLL.

Figure 4.5. s-domain representation of a continuous PLL equipped with a sampled-data phase detector.

Figure 4.6. s-domain representation of a continuous PLL equipped with sampled-data phase detector and loop filter.

It is very likely that a sampled-data loop-filter is desired in VLSI implementation.

The equivalent of the proportional-plus-integral filter in the sampled-data domain is

$$H(z) = a + \frac{1}{1 - z^{-1}} \tag{4.14}$$

as shown in Fig. 4.6. The open-loop gain is

$$G(z) = K_d K_o K_n \frac{(1+a)z - a}{(z-1)^2} \tag{4.15}$$

The condition for stability is then

$$a > 0, \qquad K_d K_o K_n T (1+2a) < 4 \tag{4.16}$$
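The stability condition can be cross-checked numerically. With the open-loop gain of Eq. (4.15), the closed-loop characteristic equation $1 + G(z) = 0$ becomes $z^2 + [K(1+a) - 2]z + (1 - Ka) = 0$, and the loop is stable when both roots lie inside the unit circle. The sketch below makes two assumptions that are not in the original: the gain product (including T where it appears in Eq. (4.16)) is lumped into a single positive number `K`, and the function names are hypothetical.

```python
import cmath

def closed_loop_poles(K, a):
    """Roots of z^2 + (K(1+a) - 2) z + (1 - K a) = 0, from 1 + G(z) = 0."""
    b = K * (1.0 + a) - 2.0
    c = 1.0 - K * a
    disc = cmath.sqrt(b * b - 4.0 * c)
    return ((-b + disc) / 2.0, (-b - disc) / 2.0)

def is_stable(K, a):
    """True when both closed-loop poles lie strictly inside the unit circle."""
    return all(abs(p) < 1.0 for p in closed_loop_poles(K, a))

def meets_eq_4_16(K, a):
    """Closed-form condition of Eq. (4.16), assuming a positive lumped gain K."""
    return K > 0 and a > 0 and K * (1.0 + 2.0 * a) < 4.0

if __name__ == "__main__":
    for K, a in [(0.5, 1.0), (2.0, 1.0), (0.5, -0.5)]:
        print(K, a, is_stable(K, a), meets_eq_4_16(K, a))
```

For the sample points printed here, the pole-based test and the closed-form condition agree.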

#### 4.3.2. DPLL

The digital PLL (DPLL) of interest for the DSL application is the one with a digital voltage-controlled oscillator (DVCO) and the same sampled-data phase detectors. The loop filter used in the DPLL can be an analog filter, a sampled-data filter, or even a digital filter. Because the DVCO takes a digital input, there must be an analog-to-digital operation somewhere along the line, since the output of the sampled-data phase detectors is analog. A type of digital filter, also known as a sequential filter, implemented with an up/down counter, can be used as the loop filter if the output of the phase detector is binary-quantized. DPLL's of the type employing binary-quantized phase detection and discrete phase adjustments have been utilized for detection of a binary signal or suppression of phase jitter in data-transmission techniques [49]. We will again restrict ourselves to this specific configuration. A block diagram of the DPLL is shown in Fig. 4.7. A local crystal oscillator with a much higher frequency, very close to N times the frequency to be locked, is required in this configuration. Since crystals with accuracy within 50 ppm are readily available at reasonable cost, this does not present a problem considering the advantage of digital circuits in VLSI technology.

The output signal of this PLL changes phase in discrete steps equal to 1/N of the period of the input frequency. The magnitude of this jitter depends on the frequency of the local crystal: the higher the crystal frequency, the smaller the jitter. The occurrence of these changes can be controlled by the up/down counter. For instance, if the up/down counter overflows or underflows when the count reaches +M or -M, a phase change takes place at least M cycles after the previous phase adjustment. This period can be much longer if the difference between 1/N of the local crystal frequency and the input frequency is very small. The timing jitter can thus be effectively suppressed.

Figure 4.7. Block diagram of a digital PLL.
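The behavior of the up/down counter used as a sequential loop filter can be illustrated with a minimal sketch. The class name, the +1/-1 encoding of the binary-quantized phase-detector output, and the choice to reset the count to zero after an overflow or underflow are assumptions made for illustration; the text specifies only that an adjustment occurs when the count reaches +M or -M.

```python
class UpDownCounterFilter:
    """Sequential loop filter sketch: accumulates binary phase-detector decisions."""

    def __init__(self, M):
        if M < 1:
            raise ValueError("M must be a positive integer")
        self.M = M
        self.count = 0

    def step(self, decision):
        """Feed one decision (+1 or -1); return +1, -1, or 0.

        A nonzero return value requests one discrete phase adjustment of the
        DVCO in the corresponding direction. The counter is reset afterwards
        (an assumption of this sketch).
        """
        if decision not in (+1, -1):
            raise ValueError("decision must be +1 or -1")
        self.count += decision
        if self.count >= self.M:
            self.count = 0
            return +1
        if self.count <= -self.M:
            self.count = 0
            return -1
        return 0

if __name__ == "__main__":
    f = UpDownCounterFilter(M=4)
    print([f.step(+1) for _ in range(8)])          # adjustment every 4th decision
    g = UpDownCounterFilter(M=4)
    print([g.step(d) for d in (+1, -1) * 4])       # balanced decisions: no adjustment
```

With a persistent phase error the adjustments are spaced exactly M decisions apart, which is the minimum spacing; when the decisions are nearly balanced the counter rarely reaches either limit and adjustments become infrequent.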

### 4.4. Voltage Controlled Oscillator (VCO)

A voltage-controlled oscillator (VCO) changes its output frequency as a function of the control voltage, which is the input to the VCO. Depending on whether the frequency change is continuous or discrete, the VCO is termed an analog VCO or a digital VCO respectively. We will again concentrate on VCO's which are realizable in VLSI technology and are applicable to our application.

#### 4.4.1. Analog VCO

Two types of analog VCO are considered. The multivibrator, which is based on the concept of relaxation, can be realized in monolithic form. The *crystal VCO (VCXO)*, which requires an off-chip crystal, generally has a more accurate center frequency and a smaller tuning range.

# 4.4.1.1. Multivibrator Analog VCO

The multivibrator type VCO, equipped with either a single floating capacitor or two grounded capacitors, is the most widely used VCO in integrated-circuit form. Due to the lack of precision in the absolute capacitance values and the control of the current source, the accuracy of the center frequency of the VCO can be poor. Errors of ±50% can be realistically expected. A large VCO gain, which is desirable in many applications, is then required to pull the VCO output frequency to the desired frequency. However, in the application of timing recovery, where the incoming frequency is usually known and fixed, a small VCO gain and an accurate center frequency are required. An accurate center frequency reduces the pull-in time and also removes the possible requirement of a frequency-locked loop. A small VCO gain reduces the timing jitter.

A simplified circuit schematic of a grounded-capacitor multivibrator VCO is shown in Fig. 4.8. The floating-capacitor multivibrator VCO works on the same principle. The current derived from the voltage-controlled current source, composed of MOSFET's F1 to F3 and resistance R, is steered between two paths, p1 and p2, to charge either capacitor C1 or C2, since the two outputs vco1 and vco2 are complementary. The outputs are periodic with period equal to

$$T = \frac{2V_{TH}C}{I_C} \tag{4.17}$$

where $V_{TH}$ is the threshold of the two inverters, C is the capacitance value of the two capacitors C1 and C2, and $I_C$ is the current of the voltage-controlled current source. This current is

$$I_C = \frac{V_C}{R} - \frac{V_T}{R} \tag{4.18}$$

Due to process variations in both electrical parameters and device geometry, the oscillation frequency may vary from its designed value by 50% or more. The VCO gain, the ratio of the change in frequency to the change in control voltage, is normally made large to ensure that the frequency can be pulled to the desired value. The frequency jitter is large and the pull-in time can be long due to the lack of precision in frequency.
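Equations (4.17) and (4.18) together give the oscillation frequency $f = (V_C - V_T)/(2 R V_{TH} C)$ and hence a VCO gain $df/dV_C = 1/(2 R V_{TH} C)$. The sketch below evaluates both. The function names and the component values in the example are hypothetical, chosen only to produce round numbers, and $V_T$ is simply taken as the offset voltage appearing in Eq. (4.18).

```python
def ramp_current(v_c, v_t, r):
    """Eq. (4.18): I_C = V_C/R - V_T/R, in amperes."""
    return (v_c - v_t) / r

def oscillation_frequency(v_c, v_t, r, v_th, c):
    """f = 1/T, with the period T = 2 V_TH C / I_C from Eq. (4.17), in hertz."""
    i_c = ramp_current(v_c, v_t, r)
    if i_c <= 0:
        raise ValueError("control voltage too low: no charging current")
    return i_c / (2.0 * v_th * c)

def vco_gain(r, v_th, c):
    """df/dV_C = 1 / (2 R V_TH C), in hertz per volt."""
    return 1.0 / (2.0 * r * v_th * c)

if __name__ == "__main__":
    # Hypothetical values: R = 100 kohm, V_TH = 2.5 V, C = 10 pF, V_T = 1 V.
    r, v_th, c, v_t = 100e3, 2.5, 10e-12, 1.0
    print(oscillation_frequency(3.0, v_t, r, v_th, c))  # about 400 kHz
    print(vco_gain(r, v_th, c))                         # about 200 kHz/V
```

Because the gain is inversely proportional to R, $V_{TH}$, and C, a spread in any of these parameters shifts both the center frequency and the gain, which is consistent with the large frequency tolerance quoted above.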

A digital-aided analog VCO, as shown in Fig. 4.9, is designed to overcome this problem. A set of binary-weighted current sources that can be turned on or off under digital control is used in conjunction with the voltage-controlled current source. The control of the binary-weighted current sources is performed at initialization with the aid of a frequency-locked loop. The control information is then stored in memory and remains unchanged throughout normal operation. The controllable current sources work on the same principle as binary-weighted digital-to-analog converters. The precision of the current depends on the number of binary-weighted current sources. In normal operation, this current source array supplies a constant current and the frequency adjustment is controlled by the voltage-dependent current source. The VCO gain can be reduced to a value that ensures enough margin of pullability to cover the range of frequency variation due to temperature coefficient or power supply variation.

Figure 4.8. Simplified circuit schematic of a monolithic multivibrator VCO in CMOS.

Figure 4.9. Current source portion of the digital-aided multivibrator VCO.

## 4.4.1.2. Crystal VCO (VCXO)

A crystal VCO (VCXO) always ensures a well-controlled center frequency within a certain percentage of a specified value. The tuning range is normally very small; however, it should be large enough to pull the frequency to any frequency within the tolerance of the crystal. Several design examples using discrete components, including inductors, varactors, bipolar transistors, and crystals, can be found in reference [45]. The main consideration in VLSI implementation is to minimize the number of discrete components. The voltage-dependent junction capacitance under the reverse-biased condition can be used for on-chip varactors.

A possible circuit schematic of a CMOS VCXO with the crystal as the only off-chip component is shown in Fig. 4.10, where the configuration of a Pierce oscillator is used. The large signal swing, as always seen in a Pierce oscillator, has been identified as a source of problems. Under large signal swing, the varactor capacitance is not a function of just the applied bias, and severe nonlinear characteristics result [50]. A small-signal oscillator with a peak detector has been studied in [50]. It is postulated that a CMOS small-signal VCXO can be designed with pullability equal to 200 ppm.



Figure 4.10. Simplified schematic of a crystal VCO using reverse-biased junction capacitance as varactor in CMOS.

# 4.4.2. Digital VCO (DVCO)

A DVCO is simply a frequency divider with a variable divisor. The input is normally a well-behaved periodic signal, most likely from a crystal oscillator, with frequency close to N times the desired output frequency. The divisor is normally N and can be changed to other values, commonly N-1 and N+1, from a control input. By varying the divisor from N to N-1 or N+1, the phase of the output signal can be altered. It should be noted that the phase of the output signal changes abruptly in discrete steps equal to  $\frac{1}{N}$  of the nominal period, which is NT, if T is the period of the input signal.
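The discrete phase steps produced by a variable divider can be illustrated by counting input clock periods. In the sketch below, time is measured in units of the input period T, so a nominal output period is N ticks; the function names and the example divisor sequence are hypothetical.

```python
def output_edge_ticks(divisors):
    """Times of successive output edges, in input clock periods.

    Each entry of `divisors` is the divisor used for one output cycle.
    """
    edges, t = [], 0
    for d in divisors:
        t += d
        edges.append(t)
    return edges

def phase_offset_ticks(divisors, N):
    """Offset of each output edge from the nominal grid k*N, in input periods.

    An offset of -1 means the edge arrives one input period (1/N of the
    nominal output period) early; +1 means one input period late.
    """
    return [e - (k + 1) * N for k, e in enumerate(output_edge_ticks(divisors))]

if __name__ == "__main__":
    N = 8
    # Divide by N, then once by N-1 (advance), later once by N+1 (retard).
    print(phase_offset_ticks([8, 8, 7, 8, 9, 8], N))  # [0, 0, -1, -1, 0, 0]
```

Each use of N-1 or N+1 shifts all subsequent edges by exactly one input period, and the shift persists until a compensating adjustment is made.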

A chip performing both the variable divider and the loop-filter functions was designed using standard cells in NMOS technology [51]. The block diagram is shown in Fig. 4.11, and the chip photo in Fig. 4.12.



Figure 4.11. Block diagram of the DVCO/Filter chip.





# CHAPTER 5

# Equalization

In Chapter 3, some simulation results for the effect of line impairments on a minimum intersymbol interference (ISI) pulse were introduced. The dispersive channel was shown to cause a detrimental effect of ISI on the pulse shape. The characteristics of the channel, which often can be modelled and characterized, are normally unknown to the designer or the user before the call is set up. The only way to mitigate the effect of channel distortion is to include within the transmission system an adjustable filter which can adapt to fit closely the required characteristics for any individual channel. Traditionally this kind of filter has been called an *equalizer*. An equalizer can be considered as a device which, alone or together with other devices, undoes the nonidealities of the transmission medium. In general, a certain penalty must be paid to remove the channel effect when there is noise present in the channel.

This chapter is dedicated to equalization techniques which are either related to the topic of this research or seriously considered as candidates in the digital subscriber loop application. This chapter is not meant to provide a thorough survey or a theoretical treatment of all the equalization techniques in existence to date. Readers who are interested in a more in-depth treatment of equalization are referred to references [12,52]. The equalizers discussed in this chapter are restricted to the *post-equalizer* type, which are generally located at the receiving end of the transmission path. The *pre-equalizer*, located at the transmitting end, is assumed to generate a suitable pulse shape.

In this chapter, some performance bounds are first introduced. No special coding is considered in this chapter. These bounds can serve as the reference points as the more practical equalization implementations are described.

Later on in this chapter, implementation alternatives which approximate some of the optimum receivers are presented. The complexities of these alternatives are compared with performance being considered.

#### 5.1. Performance Bounds

The matched-filter bound on an ideal channel, representing the optimum receiver case, is used as a reference. The matched-filter bound on a nonideal channel is then introduced. The resulting performance degradation represents the penalty induced by the nonideal characteristics of the channel. The bound of a linear PAM receiver, which is a linear filter followed by a symbol-by-symbol decision device, indicates further degradation in performance. The price paid for using a linear PAM receiver is expressed in closed form, and can be very substantial under certain circumstances. The bound of a realizable PAM receiver, which is a linear forward filter followed by a causal decision feedback equalizer (DFE), is also introduced. It is also interesting to note that the matched-filter bound can be achieved with a fictitious nonlinear PAM receiver, with both causal and noncausal DFE. Neither the matched filter nor the noncausal DFE is realizable, however.

# 5.1.1. Matched-Filter Bound in an Ideal Channel

In an ideal channel with additive white Gaussian noise, the optimum receiver for an L-ary PAM signal with N symbols consists of a set of matched filters, each matched to one of the $L^N$ possible data signals, and a decision device which selects the sequence corresponding to the largest output. There are $L^N$ possible sequences, and the number of matched filters is $L^N$, which can be very large. The structure of this matched-filter receiver is shown in Fig. 5.1. It is interesting to note that a different structure, shown in Fig. 5.2, performs the same function as that in Fig. 5.1. This optimum receiver is realized as a filter matched to the transmitted pulse g(t), followed by a sampler operating at the symbol rate 1/T and a subsequent processing algorithm for estimating the information sequence $I_k$ from the sample values. The two matched-filter receivers examine the entire received signal, which is possibly of infinite duration, and then make a single choice as to which of the $L^N$ sequences was transmitted. This is obviously not feasible in practice.

Figure 5.1. Optimum maximum likelihood receiver in ideal channel. There are $L^N$ matched filters, each matched to one of the possible $L^N$ data signals [12].

Figure 5.2. Same receiver as in Fig. 5.1, but with only one filter matched to an isolated pulse, and $L^N$ processing units [54].

Interestingly, a symbol-by-symbol decision type PAM receiver, which consists of a filter followed by a sampler and threshold detector, can make optimum detection in white Gaussian noise and an ideal channel. The only requirement is that the transmit and receive filters each realize the square root of a transfer function with a Nyquist characteristic, as shown in Fig. 5.3, where x(t) is a Nyquist pulse, and $X(\omega)$ is the Fourier transform of x(t). A Nyquist pulse is a pulse with all the precursors and tails (postcursors) passing through 0 at regular T-sec intervals. The probability of error in this case is

$$P_e = 2(1 - \frac{1}{L})Q\left[\left(\frac{3}{L^2 - 1} \frac{P_s}{P_N}\right)^{1/2}\right] \tag{5.1}$$

where $P_e$ is the probability of error, L is the number of levels,

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-t^2/2} dt \tag{5.2}$$

$P_s$ is the signal power at the transmitter output, and $P_N$ is the noise power at the receiver input.
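Equation (5.1) is straightforward to evaluate numerically. The sketch below takes Q(x) to be the upper tail probability of a zero-mean, unit-variance Gaussian, which can be written with the complementary error function as $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$. The function names are hypothetical, and `snr` denotes the linear (not dB) ratio $P_s/P_N$.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pam_error_probability(L, snr):
    """Symbol error probability of Eq. (5.1) for L-level PAM.

    snr is the linear power ratio P_s / P_N.
    """
    if L < 2:
        raise ValueError("L must be at least 2")
    arg = math.sqrt(3.0 / (L * L - 1.0) * snr)
    return 2.0 * (1.0 - 1.0 / L) * q_function(arg)

if __name__ == "__main__":
    for snr_db in (10.0, 15.0, 20.0):
        snr = 10.0 ** (snr_db / 10.0)
        print(snr_db, pam_error_probability(2, snr), pam_error_probability(4, snr))
```

At a fixed signal-to-noise ratio, increasing the number of levels L raises the error probability, since the same power is spread over more closely spaced levels.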

## 5.1.2. Matched-Filter Bound in a Nonideal Channel

Figure 5.3. Yet another optimum PAM receiver in ideal channel. x(t) is a Nyquist pulse with zero ISI [12].

Figure 5.4. Matched-filter bound in nonideal channel represented by N fictitious channels [12].

However, if the channel is not ideal, it will cause ISI on the received pulses, and the matched filter shown in Fig. 5.1 does not give the optimum performance because the pulses overlap. A matched-filter bound can be achieved with a fictitious parallel channel as shown in Fig. 5.4, where there are actually N independent channels, each dedicated to one symbol from the N successive symbols. At the receiving end, each received signal is independently determined with a matched filter. The probability of error can be shown to be:

$$P_{e} = 2(1 - \frac{1}{L})Q\left[\left(\frac{3}{L^{2} - 1}\beta \frac{P_{s}}{P_{N}}\right)^{1/2}\right] \tag{5.3}$$

where

$$\beta = \frac{\int_{-\infty}^{\infty} |C(\omega)|^2 |G_T(\omega)|^2 d\omega}{\int_{-\infty}^{\infty} |G_T(\omega)|^2 d\omega} \tag{5.4}$$

where  $C(\omega)$  is the channel response, and  $G_T(\omega)$  is the Fourier transform of the transmitted pulse.

It should be clear that  $\beta$  represents the signal-to-noise ratio (SNR) degradation caused by the nonideal channel. Note again that  $P_s$  represents the signal power at the transmitter output. This optimum performance, surprisingly enough, can also be achieved by a fictitious filter which is composed of a filter matched to the received effective impulse response, followed by an infinite DFE with both causal and the unrealizable noncausal taps, as shown in Fig. 5.5 [53]. The mean square error (mse) for this case is

$$\epsilon_{opt} = \sigma_a^2 \frac{N_0}{N_0 + R_0} \tag{5.5}$$

where

$$\sigma_a^2 = d^2 \frac{L^2 - 1}{3} \tag{5.6}$$

is the average symbol power, and 2d is the Euclidean distance between any two adjacent symbol levels. $R_m$ is the autocorrelation function of the channel impulse response, given by

$$R_m = T \int_{-\infty}^{\infty} p(mT - \tau) p(-\tau)^* d\tau \tag{5.7}$$

and its Fourier transform is

$$R(\omega) = \sum_{m=-\infty}^{\infty} R_m e^{-jm\omega T}. \tag{5.8}$$

Figure 5.5. The matched-filter bound as represented by Fig. 5.4 can also be achieved by a matched-filter $p^*(T-t)$, followed by a nonrealizable decision forward canceller and a decision feedback equalizer [53].

The noncausal DFE is not realizable because it requires that all the data symbols be known in advance at the receiver.

## 5.1.3. Optimum Linear PAM Receiver

If the receiver is restricted to a linear PAM receiver, defined as a linear filter followed by a threshold detector, more degradation in SNR must be suffered. The optimum linear PAM receiver has the structure shown in Fig. 5.6, where a linear T-spaced equalizer with infinite length removes the ISI caused by the nonideal channel, and a matched filter in the front end serves to maximize the SNR and define the timing phase. The probability of error is shown to be [12]

$$P_e = 2(1 - \frac{1}{L})Q\left[\left(\frac{3}{L^2 - 1}\beta_1 \frac{P_s}{P_N}\right)^{1/2}\right] \tag{5.9}$$

where

$$\beta_{1} = \left[\frac{T}{2\pi} \int_{-\infty}^{\infty} |G_{T}(\omega)|^{2} \, d\omega \cdot \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \frac{d\omega}{\sum_{n=-\infty}^{\infty} |P(\omega + \frac{2\pi n}{T})|^{2}}\right]^{-1} \tag{5.10}$$

where  $P(\omega)$  is the Fourier transform of the received pulse p(t).

Again, $\beta_1/\beta$ represents the further degradation corresponding to the price paid for the use of a linear PAM receiver. Note that this degradation can be very substantial if the overall channel has nulls or valleys in the folded frequency response. This excessive SNR degradation is often called *noise enhancement* in linear PAM receivers.



Figure 5.6. Optimum linear PAM receiver, with a matched filter followed by a T-spaced transversal filter [12].

It sometimes makes more sense to look at the SNR at the output of the equalizer, which is the input of the threshold detector, in the case of PAM receivers. The SNR at the equalizer output will be denoted as $SNR_D$. In the case of a linear transversal filter,

$$SNR_D = \left[\frac{T^2 N_0}{2\pi} \int_{-\pi/T}^{\pi/T} \frac{d\omega}{\sum_{n=-\infty}^{\infty} |P(\omega + \frac{2\pi n}{T})|^2}\right]^{-1} \tag{5.11}$$

It is observed again from Eq. (5.11) that if the folded spectrum of  $P(\omega)$  possesses any zeros, the integrand becomes infinite and the SNR goes to zero. The relation between the mean square error (mse), denoted by  $\epsilon$ , and the SNR at the output of the equalizer is

$$\epsilon = \frac{1}{1 + SNR_D} \tag{5.12}$$

### 5.1.4. Optimum PAM Receiver with DFE

If the PAM receiver is not restricted to a linear filter only, a causal DFE which uses past decisions to remove the postcursor ISI and effectively whitens the output noise, presents some improvement in the SNR. Fig. 5.7 shows the structure of an optimum PAM receiver with DFE. The corresponding output SNR is [38,54]

$$SNR_{D} = -1 + \exp \left[ \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \ln \left[ \frac{\sum_{n=-\infty}^{\infty} |P(\omega + \frac{2\pi n}{T})|^{2} + N_{0}}{N_{0}} \right] d\omega \right] \tag{5.13}$$

The noise enhancement is generally reduced, and better performance results from the addition of the nonlinear DFE.

In this section, four types of optimum receivers with their performance bounds were described. Notice that all the optimum receivers require some kind of matched filter, matched either to a single pulse or to the entire received signal. In the cases of the linear transversal filter and the DFE, an infinite number of taps is also required to achieve the performance bounds.



Figure 5.7. Optimum PAM receiver with DFE.

A matched filter, unless adaptive, is not realizable because the channel characteristics are unknown. In the following sections, we will examine some realizable receivers which approximate these optimum receivers.

### 5.2. Automatic Line Built Out (ALBO)

The transmission characteristics of a line are determined by properties such as the conductivity, diameter, and spacing of the conductors, and the lossy dielectric constant of the insulation. These properties in turn determine the electrical primary constants, R, L, G, and C, representing the uniformly distributed series resistance, series inductance, shunt conductance, and shunt capacitance. Twisted pairs and coaxial cables are commonly characterized by their secondary constants, calculated from the primary constants. The secondary constants consist of the characteristic impedance and the propagation constant [55]. The propagation constant expresses the attenuation, $\alpha$, in dB per km and the phase shift, $\beta$, in degrees per km, where $\gamma = \alpha + j\beta$ is the propagation constant. Coaxial and symmetrical pair cables can be shown to exhibit the following frequency dependence:

$$\alpha = a + b\sqrt{f} + cf \qquad dB / km \tag{5.14}$$

$$\beta = \sqrt{LC} f \qquad degrees / km \tag{5.15}$$

The values of the constants a and c in Eq. (5.14) are very small. The attenuation can be considered proportional to the square root of the frequency f, unless very high transmission rates are involved.

Both the attenuation and phase delay are directly proportional to the length of the cable, and that leads to a simple way to adjust the degree of compensation if the length of the cable can be estimated. The length of the cable can easily be estimated from the power level of the received signal if there is no other factor contributing to the attenuation.

A simple automatic line built out (ALBO), comprised of a single-pole, single-zero filter with adjustable pole and zero locations, can compensate for a wide range of attenuations. The high-pass characteristic of this transfer function, although not exactly $\sqrt{f}$, often performs satisfactorily if the line length can be accurately estimated. Figs. 5.8-5.10 show an example of how a simple ALBO can equalize the channel impairments. Fig. 5.8 shows a minimum ISI pulse at the receiving end if the channel is ideal, where the receive filter is included. Fig. 5.9 shows the pulse at the receiving end of a twisted-pair wire with length equal to 2 km. The $\sqrt{f}$ attenuation causes visible dispersion and significant ISI. Fig. 5.10 shows the received pulse under the same line condition, but equalized by a simple one-pole, one-zero ALBO, with the pole located at 80 kHz and the zero at 23 kHz. The equalized pulse is seen to have very small ISI. The transmission rate in this example is 160 kb/s.
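The high-pass character of the one-pole, one-zero ALBO in this example can be seen from its magnitude response. The sketch below assumes the transfer function $H(s) = (1 + s/\omega_z)/(1 + s/\omega_p)$, normalized to unity gain at DC; this normalization and the function name are assumptions, while the 23 kHz zero and 80 kHz pole are the values quoted in the text.

```python
import math

def albo_gain(f_hz, f_zero_hz=23e3, f_pole_hz=80e3):
    """|H(j 2 pi f)| for H(s) = (1 + s/w_z) / (1 + s/w_p), unity gain at DC."""
    return math.hypot(1.0, f_hz / f_zero_hz) / math.hypot(1.0, f_hz / f_pole_hz)

if __name__ == "__main__":
    for f in (0.0, 10e3, 23e3, 80e3, 160e3, 1e6):
        gain = albo_gain(f)
        print(f, gain, 20.0 * math.log10(gain))
```

The gain rises from unity at DC toward the ratio of pole to zero frequency (80/23, roughly 11 dB) at high frequency, which partially offsets the increasing line attenuation over the signal band.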

Besides this simple one-pole, one-zero ALBO, there has been considerable activity on the design of an adjustable $\sqrt{f}$ equalizer [28,56].

Estimating the line length from the power level works successfully if there are no bridged taps on the line. The bridged taps cause, in addition to the reflections on the trailing portion of the pulse, excessive attenuation. The excessive attenuation makes the line length estimate inaccurate. The ISI caused by the reflections from the bridged taps is detrimental to data detection because it cannot be removed by the simple ALBO or $\sqrt{f}$ equalizer. A DFE is necessary in the presence of bridged taps.

### 5.3. Linear Transversal Filter

The transversal filter, a kind of sampled-data linear filter, has gained wide usage in telecommunication applications. As shown in Fig. 5.11, a transversal filter is comprised of a tapped delay line, a set of tap coefficients to be multiplied by the delayed samples, and a summing stage. A variety of adaptation algorithms, which statistically minimize the error by adjusting the tap coefficients, makes the transversal



Figure 5.8. Minimum ISI pulse in ideal channel.



Figure 5.9. Received minimum ISI pulse undergoing a 2 km line.



Figure 5.10. Pulse in Fig. 5.9 equalized by a one-pole, one-zero ALBO.



Figure 5.11. Transversal filter with delay D. It is a T-spaced equalizer if D = T, and a fractionally-spaced equalizer (FSE) if D < T.

filter very attractive in the presence of unknown channels [52].

It has been pointed out in Section 5.1.3 that a T-spaced transversal filter following a filter matched to the channel impulse response is the optimum linear PAM receiver. However, a matched filter is normally not realizable because the channel characteristics are unknown. In practice, a low-pass filter is used in place of the matched filter. This arrangement, although much simpler, makes the performance sensitive to the timing phase, as described in Section 3.3. The multiplications required by the transversal filter also present a difficulty. Multiplications generally require more complicated circuits or a longer processing time and are to be avoided if possible.

The sensitivity to the timing phase can be avoided by using a fractionally spaced equalizer (FSE), where the delays in the transversal filter are fractions of a data interval. The sampling rate is high enough that there is no aliasing. The spectral null at the band edge caused by aliasing in the T-spaced transversal filter can thus be eliminated. This can be interpreted as the FSE having enough degrees of freedom to synthesize any phase characteristic required for minimum error. If the number of taps is large enough, the FSE can also perform the matched-filter function to achieve the optimum linear PAM receiver performance, because the criterion in adaptation is to minimize the residual error.

There are also drawbacks associated with the FSE. It requires a large number of multiplications. In the absence of noise and when the sampling frequency is much higher than the Nyquist frequency, there may be spectral nulls close to the sampling frequency. Under this condition, more than one tap setting can satisfy the minimum error criterion. The tap coefficients may then "wander" and could become large enough to cause overflow. A method to overcome this problem was proposed in [34]. It introduces a leakage in the tap coefficient adaptation, at the cost of some performance penalty.
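The effect of tap leakage can be illustrated with a minimal Python sketch. It assumes a least-mean-square (LMS) update, which is only one possible adaptation algorithm and is not necessarily the form used in [34]. The step size `mu`, the leakage factor `leak`, and the three-tap training target are hypothetical values chosen for illustration.

```python
import random

def leaky_lms_step(taps, delay_line, desired, mu, leak):
    """One update of a transversal filter with tap leakage (assumed LMS form)."""
    # Filter output: sum of tap coefficients times delayed samples.
    y = sum(c * x for c, x in zip(taps, delay_line))
    e = desired - y
    # The factor (1 - mu*leak) shrinks every tap slightly at each step,
    # which keeps the coefficients from wandering without bound.
    new_taps = [(1.0 - mu * leak) * c + mu * e * x
                for c, x in zip(taps, delay_line)]
    return new_taps, y, e

def train(num_steps=2000, mu=0.05, leak=1e-3, seed=0):
    """Adapt three taps toward a hypothetical target response."""
    rng = random.Random(seed)
    target = [0.8, -0.3, 0.0]          # hypothetical response to be matched
    taps = [0.0, 0.0, 0.0]
    delay_line = [0.0, 0.0, 0.0]
    for _ in range(num_steps):
        # Shift a new binary (+1/-1) sample into the tapped delay line.
        delay_line = [rng.choice((-1.0, 1.0))] + delay_line[:-1]
        desired = sum(t * x for t, x in zip(target, delay_line))
        taps, _, _ = leaky_lms_step(taps, delay_line, desired, mu, leak)
    return taps
```

With `leak = 0` the update reduces to the ordinary LMS recursion. A small positive leakage pulls every coefficient slightly toward zero at each step, which bounds the taps at the cost of a small bias in the converged solution.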

If there are spectral nulls or valleys at frequencies below half the data rate, for example in the presence of severe interference from bridged taps as in Fig. 3.7, the FSE will always enhance the noise at these frequencies.

### 5.4. Decision Feedback Equalizer (DFE)

A DFE can be added to a T-spaced or a fractionally-spaced transversal filter to improve the performance. It was pointed out in Sect. 5.1.4 that this configuration with a matched filter reduces the noise enhancement. Without a matched filter in front, the performance of the T-spaced transversal filter in combination with a DFE is still sensitive to the timing phase. If the T-spaced transversal filter is replaced by an FSE, a matched filter and the optimum transversal filter will both be synthesized by the FSE, and the performance bound of the optimum PAM receiver with DFE can be achieved. The drawback of this configuration is the complexity associated with the multiplications in the transversal filter. Note that the DFE alone does not require any multiplications if the transmission is binary (+/- 1) or bipolar (+/- 1 and 0). Very simple multipliers, with selectable integer multiplicands, usually suffice in multi-level transmission.

If the pulse can be designed such that all the precursor ISI is zero and the timing phase can also be successfully recovered, a suboptimum equalizer consisting of a DFE alone removes all the postcursor ISI without any noise enhancement. Although not optimum against noise, the performance is fairly insensitive to the line impairments, unlike that of the linear transversal filter, where the noise enhancement can be substantial under certain line conditions.
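The absence of multiplications in a binary DFE can be made concrete with a short Python sketch. It assumes a pulse with no precursor ISI, a unit main cursor, known postcursor samples, and no noise. The pulse values and the helper names `channel_output` and `dfe_detect` are hypothetical.

```python
def channel_output(symbols, pulse):
    """Noise-free received samples r_n = sum_i pulse[i] * x_{n-i}."""
    received = []
    for n in range(len(symbols)):
        r = 0.0
        for i, h in enumerate(pulse):
            if n - i >= 0:
                r += h * symbols[n - i]
        received.append(r)
    return received

def dfe_detect(received, postcursors):
    """Binary DFE: cancel postcursor ISI using past decisions.

    Each past decision is +1 or -1, so cancelling its contribution needs
    only an addition or a subtraction of the stored postcursor sample.
    """
    decisions = []
    for r in received:
        z = r
        for i, h in enumerate(postcursors, start=1):
            if len(decisions) >= i:
                if decisions[-i] > 0:
                    z -= h      # past symbol was +1: subtract its tail
                else:
                    z += h      # past symbol was -1: add its tail back
        decisions.append(1 if z >= 0 else -1)
    return decisions
```

For example, with the hypothetical pulse `[1.0, 0.6, 0.3]` (main cursor followed by two postcursors), `dfe_detect(channel_output(x, [1.0, 0.6, 0.3]), [0.6, 0.3])` returns the transmitted sequence `x` as long as the earlier decisions are correct.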

### 5.5. Summary

Performance bounds for several receivers are introduced. To achieve the optimum performance, a matched filter is always required. Practical realizations of receivers approximating their equivalent optimum receivers are also described. Multiplications, which take a large area in implementation and/or a long processing time, are not desirable at high transmission speeds, as in DSL. Noise enhancement also causes performance degradation in the presence of bridged taps if linear equalizers are used. The DFE, being nonlinear, does not require any multiplication and also removes the noise enhancement problem. Therefore the DFE is considered the most viable approach. The timing recovery is then required to recover a timing phase with no precursor ISI. The baudrate sampling technique, to be described in Chapter 7, provides a solution to this problem.

## CHAPTER 6

# Wave Difference Method (WDM)

The wave difference method (WDM) appears to be a good candidate for timing recovery in DSL. The sampling rate is normally at 4 times the baudrate [28], and can be reduced to 2 times the baudrate. A detailed study of its performance is carried out analytically and by computer simulation for the case of binary and alternate mark-inversion (AMI) line coding. A closed form expression describing the binary jitter performance of the WDM and its continuous time counterpart, the spectral line technique, is used to compare the two methods [29]. Analytical and simulation results for recovered phase and jitter are presented for various cable pulse responses carefully chosen to represent worst-case or nearly worst-case conditions.

Two methods for including frequency detection in the WDM, the quadricorrelator and the rotational detector, are also simulated. In sampled-data systems, the so-called wave difference method (WDM) has been proposed [28]. However, no detailed analysis of the performance of the WDM has been published, and some questions arise that deserve further study, namely:

- [1] What is the timing phase recovered by this technique in the presence of severe pulse distortion, as can result for example from bridged taps? The answer to this question also depends upon the type of equalization used, as some methods are more sensitive to timing phase than others.
- [2] What is the jitter performance of the WDM?
- [3] What are the tradeoffs in the design of a phase lock loop based on a WDM phase detector?

- [4] How can a frequency detector be designed in the context of the WDM, increasing the pull-in range of the PLL and decreasing the required free-running frequency accuracy of the VCO?

In Section 6.1 phase and frequency detectors which use the WDM are characterized analytically. In Section 6.2, the performance of the timing recovery system when operating with imperfections such as bridged taps is evaluated by computer simulation. In Section 6.3 tradeoffs in the design of a PLL using a WDM phase/frequency detector are evaluated, including a design example.

## 6.1. Analysis of the Wave Difference Timing Recovery Technique

In this section, the WDM is analyzed, extended to include a frequency detector, and shown under certain conditions to be the sampled-data equivalent of the spectral line method.

### 6.1.1. The Wave Difference Method

Let

$$s(t) = \sum_{k=-\infty}^{\infty} x_k \ h(t-kT) \tag{6.1}$$

be the received signal in a baseband subscriber loop receiver where h(t) is the channel response to the input pulse. In subsequent analytical results, binary line coding will be assumed in which  $x_k$  is an independent identically distributed sequence of transmitted data symbols assuming the values -1 and +1. In many of the simulation results the additional case of AMI line coding will be considered. In this case  $x_k$  can be considered to be the result of applying a first difference operation to an independent identically distributed binary sequence assuming the values 0 and +1.

Define the timing function as

$$w(t) = E\{f\left[\sum_{k=-\infty}^{\infty}x_k\ h\left(t-kT\right)\right]\}\tag{6.2}$$

where  $f(\cdot)$  is some convenient nonlinear function, and E stands for expected value. The timing function w(t) is clearly periodic with period T, and so its spectrum consists of a set of discrete lines at multiples of the data rate. For the particular case  $f(x)=x^2$ ,

$$w(t) = E\{ \left[ \sum_{k} x_{k} \ h(t - kT) \right]^{2} \} = \sum_{k=-\infty}^{\infty} h^{2}(t - kT) \tag{6.3}$$

where we have assumed that the $x_k$ are independent and equally likely. If the data signal is bandlimited to less than $\frac{1}{T}$ Hz, the spectrum of w(t) will be bandlimited to less than $\frac{2}{T}$ Hz, and so must be of the form

$$w(t) = A_0 + A_1 \sin(\frac{2\pi t}{T} + \phi) \tag{6.4}$$

For a general f(x), w(t) will have higher order harmonics.
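The step from the expectation to the sum of squared pulse samples in (6.3) can be checked numerically. The Python sketch below averages $s^2$ over every equally likely binary data pattern for a pulse spanning five symbol intervals, and compares the result with $\sum_k h^2(t-kT)$. The five pulse samples are hypothetical values standing in for $h(t-kT)$ at one fixed t.

```python
import itertools

def timing_function_by_enumeration(pulse_samples):
    """Average of s^2 over all equally likely +1/-1 data patterns."""
    n = len(pulse_samples)
    total = 0.0
    for pattern in itertools.product((-1.0, 1.0), repeat=n):
        s = sum(x * h for x, h in zip(pattern, pulse_samples))
        total += s * s
    return total / 2 ** n

def timing_function_closed_form(pulse_samples):
    """Right-hand side of (6.3): sum of squared pulse samples."""
    return sum(h * h for h in pulse_samples)

# Hypothetical samples h(t - kT) of a pulse of length 5T at one instant t.
pulse_samples = [0.05, -0.2, 0.9, 0.35, -0.1]
```

The cross terms $x_k x_j h(t-kT) h(t-jT)$ with $k \ne j$ average to zero over the 32 patterns, so only the squared terms survive and the two functions agree.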

The phase error function in the WDM is defined as

$$p_n = w(nT + \tau) - w[(n + \frac{1}{2})T + \tau] \tag{6.5}$$

where $\tau$ is some arbitrary sampling phase. If frequency detection is desired, an additional quadrature error signal

$$q_n = w \left[ (n + \frac{1}{4})T + \tau \right] - w \left[ (n + \frac{3}{4})T + \tau \right] \tag{6.6}$$

is defined. In a phase locked loop,  $p_n$  is used to control the frequency of a VCO, and the feedback acts to force  $p_n$  to zero. Fig. 6.1 shows the steady-state sampling phases for  $p_n$  and  $q_n$  on an eye diagram. It will be specified later how frequency detection is performed.
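As an illustration of (6.5) and (6.6), the following Python sketch evaluates the two error signals for the square-law timing function of (6.3). The symmetric Gaussian pulse and its width are assumptions made only for this example, and T is normalized to 1.

```python
import math

T = 1.0  # symbol period (normalized)

def pulse(t):
    """Hypothetical symmetric pulse h(t), peaked at t = 0."""
    return math.exp(-(t / (0.6 * T)) ** 2)

def w(t):
    """Square-law timing function (6.3), truncated to nearby symbols."""
    k0 = math.floor(t / T)
    return sum(pulse(t - k * T) ** 2 for k in range(k0 - 8, k0 + 9))

def p_error(tau):
    """In-phase error (6.5): difference of samples T/2 apart."""
    return w(tau) - w(tau + T / 2)

def q_error(tau):
    """Quadrature error (6.6): same difference, shifted by T/4."""
    return w(tau + T / 4) - w(tau + 3 * T / 4)
```

For this symmetric pulse, `p_error` vanishes at a sampling phase of T/4, so the two samples used by the phase detector straddle the pulse peak, while `q_error` is nonzero there.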

In a practical implementation, the expectation in (6.2) must be replaced by a time average, as in



Figure 6.1. Eye diagram and the steady state sampling phase for  $p_n$  and  $q_n$ .

$$\hat{w}(t) = \frac{1}{K} \sum_{k=0}^{K-1} f(s(t-kT)) \tag{6.7}$$

where K is the number of samples in the average. In the WDM, an oversampling factor R=2 is used. The subtractor function could be located at the input of the transversal filter which performs the averaging. In general the output signal, the phase error estimate, is a slowly varying function of time. A high sampling rate is therefore not necessary, and decimation by a large factor can be accepted. Fig. 6.2 shows a structure that decimates the signal by a factor M. In the limiting case M=K, no transversal filter is required at all.

Sometimes it may not be desirable to decrease the sampling rate excessively. One such case is when one wants to perform frequency detection, when the maximum frequency offset allowed in the VCO cannot be larger than half the sampling rate of the error signals to avoid aliasing. If a large pull-in range is desired in the PLL, a high sampling rate must be used for the error signals, and consequently a longer transversal filter is required in Fig. 6.2. The storage requirements can be reduced while keeping the sampling rate high by using a recursive filter. Since in this application accurate control of the bandwidth is not necessary, the coefficients of the filter can be approximated by numbers of the form  $2^{-N}$  or  $1-2^{-N}$  to avoid multipliers in the case of a digital implementation.
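A multiplier-free recursive filter of this kind can be sketched in Python as follows. The sketch assumes integer samples and a single-pole structure with feedback coefficient $1-2^{-N}$ and input coefficient $2^{-N}$; the function name and the choice of N are hypothetical.

```python
def shift_lowpass(samples, n_shift):
    """Single-pole averaging filter using only add, subtract and shift.

    Implements y[n] = (1 - 2**-N) * y[n-1] + 2**-N * x[n] in integer
    arithmetic, written as y += (x - y) >> N so that no multiplier is needed.
    """
    y = 0
    outputs = []
    for x in samples:
        y += (x - y) >> n_shift   # arithmetic right shift divides by 2**N
        outputs.append(y)
    return outputs
```

Only one state variable is stored, regardless of the effective averaging length, which grows roughly as $2^N$ samples. The truncation in the shift leaves a small residual offset of less than $2^N$ least significant bits for a constant input.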

When oversampling by a factor R=2, neither of the two samples in each period will be taken at the instant of maximum eye opening after $p_n$ is driven to zero by the PLL. They will instead be located approximately (if the pulse is approximately symmetric) at $-\frac{T}{4}$ and $\frac{T}{4}$ relative to that point. Therefore, it seems that an oversampling factor of at least R=4 is needed, but this increase in R is costly in receivers employing the echo canceler method because the complexity of the echo canceler grows linearly with R. An approach which achieves an effective R=4 without increasing the sampling



Figure 6.2. WDM using one transversal filter with decimation.

rate of the echo canceler uses an interpolation filter at the output of the echo canceler to increase the effective oversampling factor. This filter must provide negligible distortion of the signal within the band  $0 \le f \le \frac{1}{T}$ , and a large alias suppression in the band  $f > \frac{1}{T}$ . In order to satisfy these conditions, a relatively complicated filter is needed. A simpler solution to obtain an effective R=4 is the use of a linear phase all-pass network which approximates a delay  $\frac{T}{4}$ . The resulting fractionally-delayed samples can be used in the phase detector, so that one of the original samples will be located at the center of the eye. A second order all-pass section with a transfer function

$$H(z) = \frac{z^{-2} + c_1 z^{-1} + c_2}{1 + c_1 z^{-1} + c_2 z^{-2}} \tag{6.8}$$

has been found by Agazzi in [29] to provide satisfactory results in computer simulations with:

$$c_1 = 0.429968, \qquad c_2 = -0.048017 \tag{6.9}$$
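The behavior of the all-pass section (6.8) with the coefficients (6.9) can be examined with a short Python sketch. It assumes the filter is clocked at the R=2 rate, so that one sample corresponds to $\frac{T}{2}$ and a delay of $\frac{T}{4}$ is half a sample. The group delay is estimated by a numerical derivative of the phase; the function names are hypothetical.

```python
import cmath
import math

C1 = 0.429968
C2 = -0.048017

def allpass_response(omega):
    """H(e^{j omega}) of (6.8); omega in radians per sample."""
    z1 = cmath.exp(-1j * omega)              # z^-1 on the unit circle
    numerator = z1 * z1 + C1 * z1 + C2
    denominator = 1.0 + C1 * z1 + C2 * z1 * z1
    return numerator / denominator

def group_delay_samples(omega, step=1e-5):
    """Group delay in samples, from a central difference of the phase."""
    ratio = allpass_response(omega + step) / allpass_response(omega - step)
    return -cmath.phase(ratio) / (2.0 * step)
```

Under these assumptions the magnitude is unity at all frequencies, and the group delay at low frequencies is approximately 1.5 samples, that is, one whole sample plus roughly the half-sample ($\frac{T}{4}$) fractional delay that the network is meant to approximate.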

Fig. 6.3 shows a typical example run for the case of a 2 mile gauge 26 line, with a 0.5 mile gauge 19 bridged tap at the center. In this example the sampling rate was R=2, but the output was computed 50 times with different values of the sampling phase, and the outputs plotted together so that the signal appears to be a continuous time signal. This was done to compare the pulse shapes before and after the phase shift network. For the same reason the output pulse was displaced in time by an amount equal to the delay of the network, namely  $\frac{T}{4}$ . Although only one example is presented here, many more have been run, with similar or better results. We conclude that the use of this phase shift network provides a very simple and practical solution to the sample interpolation problem.



Figure 6.3. All pass filter approximation to a  $\frac{T}{4}$  delay: comparison of input and output with input appropriately delayed. For an ideal delay, the two waveforms would be identical.

### 6.1.2. WDM Frequency Detector

The WDM lends itself to the implementation, with little increase in complexity, of a frequency detector. This is potentially attractive because of the increase in the pull-in range of the PLL. In order to minimize jitter, very narrow loop bandwidth is required, which results in a limited pull-in range. This is no problem when accurate crystal-controlled VCO's are used, but the use of cheaper low-precision crystals, or even non-crystal VCO's is an economically appealing possibility. The latter, in particular, would enable the monolithic integration of all the components of the VCO on the transceiver chip.

For frequency detection, an oversampling factor R=2 is not sufficient because aliasing distortion would not permit the distinguishing of positive and negative frequency offsets. The minimum oversampling factor depends on the maximum frequency offset allowed for the VCO. Since R=4 can be achieved without an increase in complexity of the echo canceler using the all-pass filter, assume R=4 in the subsequent analysis of the frequency detector.

The basic difference between a phase and frequency detector is that the former measures the phase error modulo T, whereas the latter can keep track of cycle slips and therefore phase errors larger than T. This difference is illustrated in Fig. 6.4. Fig. 6.4a shows the characteristic of a phase detector, and Fig. 6.4b and 6.4c those of frequency detectors. In the case of Fig. 6.4b the error characteristic is linear over a large number of cycles, whereas in Fig. 6.4c, the characteristic saturates for phase errors  $|\phi| \ge \frac{T}{2}$ . A way to make a phase detector into a frequency detector is to keep track of the number and the sign of the cycle slips. With an oversampling factor of R=4, the in-phase and quadrature error signals  $p_n$  and  $q_n$  defined in (6.5) and (6.6) can be used to detect these cycle slips.



Figure 6.4a. Characteristic of a phase detector.



Figure 6.4b. Characteristic of a linear frequency detector.



Figure 6.4c. Characteristic of a nonlinear frequency detector.


A rotational detector [57] detects a cycle slip whenever the vector $(p_n, q_n)$ (Fig. 6.5) passes between the upper and the lower half-plane. The direction of the passage indicates whether the slip was positive or negative. Thus a crossing from quadrant 1 to 4 or from 3 to 2 indicates a negative cycle slip, whereas a crossing from 4 to 1 or from 2 to 3 corresponds to a positive slip. The rotational detector lends itself to a simple implementation as shown in Figure 6.6a, and has been found to perform satisfactorily in computer simulations.
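The crossing rule can be written as a small lookup table, as in the Python sketch below. The quadrant numbering is an assumption, since Fig. 6.5 is not reproduced here: $p_n$ is taken as the horizontal axis and $q_n$ as the vertical axis, with quadrants numbered counterclockwise from the upper right.

```python
import math

def quadrant(p, q):
    """Quadrant of the vector (p, q), numbered 1..4 counterclockwise."""
    if q >= 0:
        return 1 if p >= 0 else 2
    return 3 if p < 0 else 4

# Signed slip assigned to each crossing between the upper and lower
# half-planes, following the rule stated in the text.
SLIP_TABLE = {(1, 4): -1, (3, 2): -1, (4, 1): +1, (2, 3): +1}

def count_slips(vectors):
    """Accumulate signed slips over a sequence of (p_n, q_n) vectors."""
    total = 0
    previous = None
    for p, q in vectors:
        current = quadrant(p, q)
        if previous is not None:
            total += SLIP_TABLE.get((previous, current), 0)
        previous = current
    return total

def rotating_vectors(angles_deg):
    """Helper producing unit vectors at the given angles (for testing)."""
    return [(math.cos(math.radians(a)), math.sin(math.radians(a)))
            for a in angles_deg]
```

In this sketch a vector that completes one full counterclockwise revolution crosses between the half-planes twice and therefore accumulates two positive counts; a clockwise revolution accumulates two negative counts. The accumulated count is the coarse, integer-valued frequency error indication.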

Another frequency detector is based on the quadricorrelator [57], as shown in Fig. 6.6b. The quadricorrelator works on nearly the same principle as the rotational detector. The output of the hard limiter indicates whether the  $(p_n, q_n)$  vector is in the upper or the lower half plane, and the derivative of  $p_n$  indicates whether the vector is moving from the left to the right half plane or vice-versa. Thus the sign of product of both signals represents the sign of frequency error (direction of rotation). The main difference is that the rotational detector counts only integral numbers of slips, whereas the quadricorrelator generates a proportional error signal.
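A minimal Python sketch of the quadricorrelator operation described above is given below. The derivative of $p_n$ is approximated by a first difference, and the sign convention (counterclockwise rotation of $(p_n, q_n)$ reported as a positive frequency error) is an assumption made for this example rather than a detail taken from Fig. 6.6b.

```python
import math

def quadricorrelator(p_samples, q_samples):
    """Proportional frequency error from hard-limited q and differenced p."""
    outputs = []
    for n in range(1, len(p_samples)):
        dp = p_samples[n] - p_samples[n - 1]          # discrete derivative of p
        sign_q = 1.0 if q_samples[n] >= 0 else -1.0   # hard limiter on q
        # In the upper half-plane a counterclockwise rotation moves p to
        # the left (dp < 0), so the product is negated to obtain a
        # positive output for counterclockwise rotation.
        outputs.append(-sign_q * dp)
    return outputs

def rotating_error_signals(cycles_per_sample, num_samples):
    """Helper: p and q for a vector rotating at a constant rate."""
    angles = [2.0 * math.pi * cycles_per_sample * n for n in range(num_samples)]
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]
```

Unlike a slip counter, the average output here grows with the rotation rate, which is what makes the error signal proportional, but it also remains active when the loop is in lock.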

Results of computer simulations of both the rotational detector and the quadricorrelator in the specific case of a subscriber loop receiver using WDM timing recovery are reported in Section 6.3.

Unfortunately, any non-crystal VCO that can be integrated on a monolithic chip in MOS technology without trimming will have large errors in its center frequency. Errors of  $\pm 50\%$  can be realistically expected. Frequency errors of this magnitude cannot be corrected with a continuously running frequency detector as described above. However, preliminary work indicates that it is possible to use a monolithic non-crystal VCO if an initial half-duplex startup sequence is used. During that sequence, pulses are sent from the central office at a much lower rate than the nominal, for example  $\frac{1}{107}$ . A systematic sequence like +1,-1,+1,-1,+1,... is sent. At this low speed, all input



Figure 6.5. Rotational detector detects a cycle slip whenever the vector $(p_n, q_n)$ passes between the upper and lower half plane.



Figure 6.6a. Rotational frequency detector.



Figure 6.6b. Quadricorrelator frequency detector.

filtering and equalization circuitry can be bypassed (obviously this circuitry needs an accurate clock, since it works on sampled data, so it cannot be used until the frequency of the clock has been adjusted). A simple threshold device generates a square wave from the received waveform, which can be used to adjust the VCO frequency using some of the standard digital frequency and phase detection techniques [57]. After frequency lock has been achieved, the operation is switched to full duplex and the WDM phase detector takes over the control of the VCO. The loop lock range must be large enough to allow tracking of frequency drifts caused by temperature variations during the operation of the VCO. Due to the large initial error in the VCO center frequency, a very large VCO dynamic range in frequency<sup>(1)</sup> is required. This is of the order of $10^6$:1, or about 20 bits. The complexity of an analog-digital implementation would be roughly equivalent to a 20 bit DAC, although it seems that if appropriate interpolation techniques are used, the system can be implemented in a reasonable amount of silicon area. Since frequency lock is achieved in this case during the startup sequence, another continuously running WDM frequency detector is not required.

In summary, frequency detection may be advantageous whenever a lower accuracy free-running frequency for the VCO is desired, as would be obtained from using a cheaper crystal. If monolithic non-crystal VCO's are used, a continuously running WDM frequency detector does not provide enough pull-in range, and frequency acquisition must be achieved during an initial startup sequence.

### 6.1.3. Comparison of the WDM with the Spectral Line Method

In the spectral line method, the timing tone obtained after the nonlinear operation on the data signal is recovered by a narrowband filter. Instead of using a bandpass filter centered at the data rate, a *lowpass* filter of equivalent bandwidth can be used if the filtering operation is preceded by a frequency translation of the spectrum down to

<sup>(1)</sup> In a digitally controlled VCO, the frequency can be adjusted only in steps. Dynamic range of the VCO is defined as the ratio of the total tuning range to the smallest frequency step that can be generated.

dc. This frequency conversion can be achieved by multiplication of the timing signal by a sinusoidal reference signal with frequency nominally equal to the data rate. This is the operation usually performed when the timing tone is filtered with a phase locked loop, where the reference signal is the output of the VCO. Any departure of the frequency of the reference signal from its nominal value will create a beat tone at the output of the lowpass filter which can be used to control the VCO that generates the reference signal. If the timing tone is $A\sin(\omega t + \phi)$ and the reference signal is $\sin(\omega_r t)$, demodulation will create the tones $\frac{A}{2}\cos((\omega-\omega_r)t+\phi)$ and $-\frac{A}{2}\cos((\omega+\omega_r)t+\phi)$.

The demodulation of the timing signal can be performed in continuous time or in sampled data fashion. Assume in the latter that the sampling rate is exactly twice the frequency of the reference signal. Then, if the departure of $\omega_r$ from its nominal value $\omega$ is smaller than $\omega_r$, the frequency difference tone will be adequately represented without aliasing distortion. The sum frequency tone will be aliased down to $\frac{A}{2}\cos((\omega-\omega_r)t)$, and will reinforce the difference frequency tone. Clearly, it is not possible to distinguish whether the frequency error is positive or negative, and therefore no frequency detection is possible with a sampling rate twice the reference frequency. With the sampler locked to the VCO as assumed here, the samples of the reference signal have only two values $+\sin\psi$ and $-\sin\psi$, where $\psi$ is the relative phase between the sampler and the VCO. If $\psi = \frac{\pi}{2}$, the samples are +1 and -1. The phase detector is then effectively computing the difference between the even and odd-order samples of the timing signal. If, in addition, the low-pass filter is an averaging filter (that is, the output samples are computed as the average of a certain number K of input samples), this sampled-data version of the spectral line technique becomes exactly equivalent to the WDM as described in Subsection 6.1.1.

So far we have considered that the timing signal consists of a purely sinusoidal tone at the data rate, with no harmonics and no continuous spectral components. When the nonlinear operation performed on the data signal is a square-law operation, no harmonics of the discrete timing tone are generated, but there are continuous spectral components. If the nonlinear operation is other than a square-law operation, there will be both higher order harmonics of the timing tone and continuous components. The WDM is no longer exactly equivalent to the spectral line technique because the phase error signal will include the effect of the dc aliases of the higher order harmonics of the timing tone and the phase recovered by the WDM then is different from the phase recovered by the spectral line technique. Furthermore, the jitter is expected to be different even in the case of a square-law operation, because of the aliasing of continuous components above the data rate. We will show in the next Section that the WDM generally performs better than the spectral line method.

## 6.2. Performance of the WDM

The performance of the WDM when operating on real telephone lines has been studied using channel impulse responses computed by a cable modeling program [58]. This program models the line sections, gauge discontinuities, bridged taps, transformers, transmit and receive filters, and an equalizer. The transmit and receive filters used were all-pole minimal intersymbol interference filters [19], and the equalizer frequency response was

$$F(s) = \left(\frac{b}{a}\right)\left(\frac{s+a}{s+b}\right) \tag{6.10}$$

where a and b are the zero and the pole locations. (Usually b is fixed and a is adjustable).
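The frequency response of (6.10) can be evaluated with the short Python sketch below. It assumes that a and b are expressed in rad/s and, purely as example values, borrows the 23 kHz zero and 80 kHz pole quoted for the ALBO in Chapter 5. The function name is hypothetical.

```python
import math

def equalizer_gain_db(f_hz, zero_hz, pole_hz):
    """Magnitude of F(s) = (b/a)(s + a)/(s + b) at s = j*2*pi*f, in dB."""
    a = 2.0 * math.pi * zero_hz      # zero location in rad/s (assumed units)
    b = 2.0 * math.pi * pole_hz      # pole location in rad/s (assumed units)
    s = 1j * 2.0 * math.pi * f_hz
    response = (b / a) * (s + a) / (s + b)
    return 20.0 * math.log10(abs(response))
```

The factor $\frac{b}{a}$ normalizes the gain to unity at dc, and the gain rises toward $\frac{b}{a}$ at frequencies well above the pole. Moving the zero location a therefore changes the amount of high-frequency boost while leaving the dc gain unchanged.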

A large number of cable configurations were analyzed to determine how cable imperfections affect the system performance. Although the amount of pulse distortion

would increase with more than one bridged tap, only lines with single bridged tap are considered here. More study is needed to assess the validity of the WDM in cases with more than one bridged tap. Model program output showed that the transmission path (unlike the echo-path) impulse response is almost completely insensitive to the location of the bridged tap, and thus a center location was assumed. The reflection originates at the open end of the bridged tap, and the longer the tap the longer the delay of the reflection with respect to the main pulse and the smaller its relative height.

The pulse distortion generated by a bridged tap depends on the transmit and receive filters used. Severe bandlimiting of the signal causes the reflection from the bridged tap to merge with the main pulse. The only observable effect may be simply a widening of the pulse, particularly for lower data rates, for example 80 Kb/s and a gauge 26 bridged tap. When the data rate is increased, causing a corresponding increase in the bandwidth of the transmit and receive filters, the reflections from bridged taps start to be resolved, causing more concern about the phase of the recovered timing. Clearly, the intensity of the reflection is also strongly dependent on the bridged tap gauge. Gauge 26 bridged taps have been found in the modeling to cause very little observable effect in bandlimited systems, even at data rates as high as 144 Kb/s. Because of the larger instantaneous data rates involved, TCM systems may be more susceptible to sampling phase offsets caused by bridged taps.

The typical configuration presented here consists of a 2 mile gauge 26 cable with a bridged tap at the center. The wire gauge of the bridged tap is 19 and its length is varied from 0.1 to 0.5 miles. A gauge 19 bridged tap was used in order to minimize the attenuation of the reflected wave, and thus present the worst case. Fig. 6.7a shows the impulse response of the five different cables considered at a data rate of 144 Kb/s, whereas Fig. 6.7b shows the same response at 256 Kb/s.



Figure 6.7a. Timing phase obtained by WDM at 144 Kb/s.



Figure 6.7b. Timing phase obtained by WDM at 256 Kb/s.

Another computer program was written to evaluate the timing function (6.2). It was assumed that the impulse response length was 5T, and consequently the average was computed over the 32 possible sets of overlapping data bits. The sampling phase recovered by the WDM was found by solving the equation:

$$w(t-\frac{T}{4})=w(t+\frac{T}{4}) \tag{6.11}$$

This phase is shown in Figs. 6.7a and 6.7b. The phase recovered by the WDM is very close to the maximum of the pulse even in the presence of strong distortion caused by a single bridged tap.
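A minimal Python sketch of how (6.11) might be solved numerically is given below. It uses the square-law timing function of (6.3) and a bisection search, neither of which is claimed to match the program actually used. The Gaussian pulse and the added trailing echo are hypothetical stand-ins for the modeled cable responses, and T is normalized to 1.

```python
import math

T = 1.0  # symbol period (normalized)

def timing_function(t, pulse, span=8):
    """w(t) = sum_k h^2(t - kT), truncated to nearby symbols."""
    k0 = math.floor(t / T)
    return sum(pulse(t - k * T) ** 2 for k in range(k0 - span, k0 + span + 1))

def recovered_phase(pulse, iterations=60):
    """Solve w(t - T/4) = w(t + T/4) on [-T/4, T/4] by bisection."""
    def g(t):
        return timing_function(t - T / 4, pulse) - timing_function(t + T / 4, pulse)
    lo, hi = -T / 4, T / 4
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change in the search interval")
    rising = g(hi) > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == rising:
            hi = mid      # root lies to the left of mid
        else:
            lo = mid      # root lies to the right of mid
    return 0.5 * (lo + hi)

def gaussian(t):
    """Hypothetical symmetric pulse."""
    return math.exp(-(t / (0.6 * T)) ** 2)

def with_echo(t):
    """Same pulse with a hypothetical trailing reflection at T/2."""
    return gaussian(t) + 0.3 * gaussian(t - 0.5 * T)
```

For the symmetric pulse the recovered phase coincides with the pulse peak. Adding the trailing reflection moves the solution slightly later than the peak of the undistorted pulse, which is the kind of offset examined in Figs. 6.7a and 6.7b.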

The jitter performance of the WDM when operating on the same example lines was studied using the theory developed by Agazzi in [29], which applies directly to binary line coding. For comparison, the jitter performance of the continuous time spectral line method was also computed using the nonlinear functions $x^2$, $|x|$, and $x^4$. In order to make the examples more directly comparable, the averaging filter of the WDM was replaced here by a recursive filter with a unit sample response $e^{-\alpha nT}$, while in the spectral line method a resonator centered at the data rate, with an impulse response $e^{(-\alpha+j\omega)t}$, was used. The transmission speed was 144 Kb/s, and the filter bandwidth was in both cases $B=\frac{\alpha}{2\pi}=3.6$ Hz. The jitter power was computed after adjusting the signal level to yield a normalized timing tone amplitude with a unity peak value in order to compare with the results of the PLL simulation of Section 6.3. The PLL bandwidth depends on the phase detector gain, which in turn depends on the timing tone intensity. If T is also normalized to 1, the jitter power can be expressed in dB, and directly compared with the output of the PLL in Section 6.3.

The WDM yields a performance comparable or superior to the spectral line method for most cable configurations [29]. The absolute value function in the latter not only is the easiest to implement, but also gives the best results in most cases. The

worst performance is associated with bridged taps of an intermediate length, such that the delay of the reflected pulse is about  $\frac{T}{2}$ . When the delay increases to T, the jitter decreases.

## 6.3. An Example of Timing Recovery Design

In this Section the conclusions of the previous Sections are applied to the practical design of the timing extraction block of a subscriber loop receiver. Both phase and frequency detection are used. For the latter, one of the examples uses a rotational detector, and the other a quadricorrelator. A proportional plus integral (PI) loop filter for the phase error signal, and an integral only filter for the frequency error are used, as recommended in [57]. In the case of the rotational detector, the integral of the frequency error is computed simply by counting slips, a technique that introduces a very coarse quantization in the filter output signal. This is advantageous since, once in lock, the frequency detector does not disturb the phase locked loop. In the quadricorrelator, the frequency detector is always active, even in lock, and the fluctuations of the frequency error signal introduce an extra jitter in the recovered clock.
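The action of the PI loop filter on the phase error can be sketched with a simple noise-free Python model. It is not the simulation reported in [29]: the frequency detector branch is omitted, the phase detector is idealized as a sinusoidal characteristic of unit amplitude (consistent with a timing function of the form (6.4)), and the gains `kp` and `ki` are hypothetical.

```python
import math

def simulate_pll(freq_offset, kp, ki, num_symbols):
    """Second-order loop with a proportional plus integral filter.

    freq_offset is the VCO frequency error expressed as phase drift per
    symbol, in units of T. Returns the phase error history and the final
    value of the integrator.
    """
    phase_error = 0.0
    integrator = 0.0
    history = []
    for _ in range(num_symbols):
        detector = math.sin(2.0 * math.pi * phase_error)  # idealized detector
        integrator += ki * detector                        # integral path
        control = kp * detector + integrator               # PI filter output
        phase_error += freq_offset - control               # VCO correction
        history.append(phase_error)
    return history, integrator
```

With a 2000 ppm offset (`freq_offset = 2000e-6`), `kp = 0.01` and `ki = 1e-4`, the phase error in this model decays to zero while the integrator settles at the frequency offset, which illustrates why the integral path is needed to hold a static frequency error without a residual phase error.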

Block diagrams of the two approaches are shown in Figs. 6.6a and 6.6b. The complete timing recovery systems were simulated by Agazzi [29] for the same five example lines considered in Section 6.2. Simulations for both binary and AMI line codes indicated that the jitter performance is quite similar, and only results for the AMI case are presented here. The convergence transients for an initial VCO offset of 2000 ppm or 288 Hz are shown in Figs. 6.8a, 6.8b, 6.9a, and 6.9b for the rotational detector and the quadricorrelator, respectively. Figs. 6.8a and 6.8b show the phase, and Figs. 6.9a and 6.9b the output of the phase detector. Only one case, corresponding to a 0.5 mile BT is shown here, but similar results were obtained for the other cases. Lock was acquired by the rotational detector in less than 14400 cycles, which corresponds to 100 ms, in all the cases. A tradeoff between speed of acquisition and residual jitter exists in the



Figure 6.8a. Transient behavior of the sampling phase acquired by the rotational detector.



Figure 6.8b. Transient behavior of the sampling phase acquired by the quadricorrelator detector.




Figure 6.9a. Phase detector output of the rotational detector vs. time.



Figure 6.9b. Phase detector output of the quadricorrelator detector vs. time.

case of the quadricorrelator. Using a large gain in the frequency loop, acquisition can be sped up, but the extra jitter introduced by the frequency error signal under lock conditions is higher. In the simulations shown here a longer acquisition time than was obtained for the rotational detector was deliberately accepted to decrease somewhat the steady state jitter. However, it seems that the overall performance is poorer, and thus the rotational detector is to be preferred. The steady state phase after lock coincides with that shown in Fig. 6.7a for the WDM.

## 6.4. Line-coding and Pulse-shaping

A certain degree of randomness is needed for WDM timing recovery. Binary (antipodal, ±1) coding with a scrambler offers the best signal-to-noise ratio for a fixed dynamic range. However, a binary signal has a significant low-frequency component, which is affected by the low-frequency cutoff arising from transformer or capacitor coupling. The effect of this low-frequency cutoff is baseline wander [55].

AMI coding, which ensures alternate positive and negative pulses, removes the spectral component at DC by converting the binary code into a ternary code. The power spectral density at low frequencies is also reduced. The price is that, for a fixed dynamic range, the distance between any two adjacent code levels is reduced by a factor of two, and the total power is reduced by the same factor. Another drawback is the more complicated decision circuit required for a three-level signal. The power spectra of the binary and AMI codes are plotted in Figs. 6.10a and 6.10b.

Another possible line-coding that works well with WDM timing recovery is the  $1-z^{-1}$  partial-response. A more detailed discussion on this special form of partial-response can be found in Sec. 7.1. It will be shown that it is identical to the AMI coding if there is a precoding process before the partial-response coding. Thus it preserves all the properties that AMI offers, which are zero DC power, small low-frequency components, 3-level signal, and alternate-mark-inversion. In addition, it can also be



Figure 6.10a. Power Spectrum of a Binary-Code Transmission.



Figure 6.10b. Power Spectrum of an AMI-Code Transmission.

detected as a binary (antipodal) signal with the aid of a single tap DFE. The drawback is the possible error propagation problem [12]. The advantages associated with this 2-level detection will be described in Chapter 7.

A minimum intersymbol interference (ISI) pulse shape is used. For an ideal channel, this pulse shape provides minimum ISI and is nearly symmetrical, with the center of symmetry at the peak of the pulse. This symmetry ensures that the WDM timing recovery acquires the timing phase where the eye opening is maximum and the ISI is minimum. Equalization, the topic of the next section, is necessary in the presence of line impairments.

## 6.5. Equalization

A simple single-pole, single-zero filter with the transfer function given in Eq. (6.10) was used in the simulation of the performance of the WDM in Section 6.2. This high-pass filter was used to compensate for the $\sqrt{f}$ attenuation of the line. The degree of compensation, set by adjusting $a$ in Eq. (6.10), should depend on the length of the line. More complicated $\sqrt{f}$ equalizers, also adjustable, have been used in the literature [28,56]. The adjustment is usually made adaptive by estimating the line length from the received signal level. However, bridged taps cause excessive attenuation of the pulse and can make the estimate of the line length inaccurate. Excessive high-pass filtering then results, and the burden of removing the ISI caused by this high-pass characteristic is placed on the DFE.

Another complication due to the presence of bridged taps is the reflection of the pulses adding on to the original pulses, which results in postcursor ISI. This ISI can also be removed by DFE.

#### 6.6. Summary

The use of discrete-time timing recovery techniques has been found to be a viable alternative to conversion of the data signal to continuous time followed by continuous-time timing recovery. The former is the preferable approach for realization with MOS monolithic technology. Of particular interest is the WDM, which has been studied both analytically and by computer simulation and found to be as good as or better than its continuous-time counterpart, the spectral line method. Computer simulations have shown that the recovered phase is very satisfactory even in the presence of severe pulse asymmetry due to a single bridged tap.

Alternatives for the implementation of frequency detectors have also been discussed. The advantage of adding a frequency detector in addition to the phase detector is an increased pull-in range. This could allow a less accurate free-running frequency for the VCO.

A complete implementation of a WDM timing recovery circuit has been proposed. The sampling rate is twice the data rate, thus permitting the use of a relatively simple echo canceler, but a second order all pass linear phase network is used to generate a  $\frac{T}{4}$  delayed version of each sample, increasing the effective sampling rate to four times the data rate. At this sampling rate both frequency and phase detection can be performed.

The adjustable $\sqrt{f}$ equalizer is sufficient for equalization of lines without bridged taps. When the effect of the bridged tap, basically the reflection, is not too severe, WDM recovers a timing phase that has very small precursor ISI, with the help of the $\sqrt{f}$ equalizer. Lines with a single bridged tap belong to this category, where a DFE is often needed to remove the reflection and the excessive ISI due to overcompensation. However, if the effect of the bridged taps is significant, for instance when the reflections from multiple bridged taps add up to be larger than the main pulse, WDM can acquire a timing phase that has a large precursor ISI. More complicated equalization techniques, which are undesirable in VLSI implementation, are then necessary. The other method, the baudrate sampling technique, to be discussed in the next chapter, presents a good solution to this problem.

## CHAPTER 7

# Baudrate Sampling Technique (BST)

The baudrate sampling technique in conjunction with a special line coding and timing function fulfills all the objectives for timing recovery in digital subscriber loop systems. It reduces the required sampling rate to the possible minimum, the baudrate. The recovered phase using this method is suitable for decision feedback equalization and insensitive to the presence of bridged-taps. A detailed performance study has been carried out, including analysis, simulation, and experimental measurements on a variety of cable configurations, some including bridged taps.

Jitter of the recovered timing signal is particularly critical to the performance of the echo canceller in hybrid-mode systems. Analysis of the jitter performance leads to design techniques for reducing the jitter magnitude. Because the sampling rate is at the baudrate, the decision feedback equalizer can be placed in front of the timing recovery block, and effectively reduces the timing jitter.

The baudrate sampling technique for timing recovery offers a potential advantage of combining DFE with the echo canceller. This configuration removes both the echo replica and the far end signal from the error signal used for echo canceller adaptation [11]. The resulting residual error of the echo canceller is smaller and the convergence of the echo canceller can also be speeded up.

The approach we have pursued toward this goal is to select a special line code and a corresponding timing function. Details are presented in the next section.

## 7.1. Line Coding and Pulse Shaping

The line coding and pulse shaping have considerable impact on echo cancellation, equalization, and timing recovery. Appropriate pre-equalization, which is performed before echo cancellation or even at the transmitter in the form of line coding, can reduce the number of taps in the echo canceller and equalizer. The timing jitter can also be reduced by line coding. One example of achieving pre-equalization at the transmitter by line coding is the family of so-called self-equalizing line codes, such as Biphase and Wal-2 codes, which significantly reduce the duration of the impulse response in a bandlimited channel.

#### 7.1.1. Line Coding

The line coding proposed for the baudrate sampling technique combines partial-response with self-equalization. Self-equalization reduces the duration of the impulse response in the presence of line impairments. The $1-z^{-1}$ partial-response property ensures zero DC power and a reasonable spectral energy distribution. Note that the $1-z^{-1}$ partial-response is equivalent to AMI coding with appropriate precoding. For instance, suppose that the data to be transmitted form the following sequence

A true AMI code alternates the polarity of the +1's being sent and results in the following sequence

$$\text{AMI:}\qquad -1,+0,+0,+1,-1,+1,+0,+0,\ldots \tag{7.2}$$

Note that it is a three-level code rather than a two-level code like the original data sequence. Applying $1-z^{-1}$ directly to the original data sequence, without any precoding, results in

$$1-z^{-1}\ \text{partial-response:}\qquad +1,+0,-1,+0,+0,+1,+0,\ldots \tag{7.3}$$

Note that it shares with the true AMI code the property that +1 and -1 alternate, so that the DC power is always zero. However, if we precode the data sequence through a process of modulo-2 summation, we end up with

$$(\Sigma)_2:\qquad +1,+1,+1,+0,+1,+0,+0,+0,\ldots \tag{7.4}$$

Applying $1-z^{-1}$ to this sequence, we get

$$(\Sigma)_2 + (1-z^{-1}):\qquad -1,+0,+0,+1,-1,+1,+0,+0,\ldots \tag{7.5}$$

which is identical to the true AMI sequence. A process flow with precoding is shown in Fig. 7.1. It may not be necessary to transmit a true AMI code unless it is standardized, because the $1-z^{-1}$ code has the same properties as the AMI code, such as zero DC power, small low-frequency components, alternate-mark-inversion, and three levels. The bandwidth and spectrum of the $1-z^{-1}$ code are identical to those of the AMI code.
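The equivalence between precoded $1-z^{-1}$ partial response and AMI can be checked numerically. The sketch below is illustrative only. The input data sequence 1, 0, 0, 1, 1, 1, 0, 0 is an assumption, inferred from the modulo-2 sum in (7.4) and the mark positions in (7.2); the function names are hypothetical; and the precoder and the delay element are assumed to start from a zero state. With these conventions the coded sequence agrees with (7.5) up to an overall polarity, which depends only on the assumed initial state.

```python
def precode_mod2(data):
    """Running modulo-2 sum: p[k] = (p[k-1] + d[k]) mod 2, with p[-1] = 0 (assumed)."""
    out, state = [], 0
    for d in data:
        state = (state + d) % 2
        out.append(state)
    return out


def partial_response(seq):
    """Apply 1 - z^-1: y[k] = s[k] - s[k-1], with s[-1] = 0 (assumed)."""
    out, prev = [], 0
    for s in seq:
        out.append(s - prev)
        prev = s
    return out


def ami(data, first_mark=+1):
    """True AMI: every 1 is sent as a mark of alternating polarity, every 0 as 0."""
    out, polarity = [], first_mark
    for d in data:
        if d:
            out.append(polarity)
            polarity = -polarity
        else:
            out.append(0)
    return out


data = [1, 0, 0, 1, 1, 1, 0, 0]     # assumed data sequence (inferred, see above)
precoded = precode_mod2(data)        # compare with (7.4)
coded = partial_response(precoded)   # precoding followed by 1 - z^-1
inverted = [-y for y in coded]       # same sequence with opposite polarity, compare with (7.5)

print(precoded)
print(coded)
print(coded == ami(data, first_mark=+1))
```

The comparison on the last line prints `True`: the precoded partial-response sequence and the AMI sequence coincide when both start with the same mark polarity.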

Although the $1-z^{-1}$ code is ternary, it can be detected as either ternary or binary depending on the structure of the equalizer and the data decision circuit. The greatest advantage of this coding is that it can be detected as an antipodal signal while having the same spectrum as a three-level AMI code. Antipodal detection has a well-defined threshold that is independent of the signal level, which makes start-up and convergence more robust. We will show later, in the computer simulation and experimental results, that timing recovery and the DFE jointly converge without any training sequence. The self-equalization inherent to the $1-z^{-1}$ code is lost if it is detected as a 3-level signal. Fig. 7.2 compares two pulses at the receiving end of a 4 km line, where the only filtering is the raised-cosine pulse shaping. It can be seen that a $1-z^{-1}$ pulse, which corresponds to two-level detection, has a much shorter tail than the pulse corresponding to 3-level AMI detection. The self-equalization property is clearly demonstrated.

Figure 7.1. Block diagram of precoder, line coder, 2-level (a) and 3-level (b) detector, and decoders.

There are other means to remove the spectral component at DC besides the partial-response approach proposed here. A scrambler, which randomizes the data sequence, and AMI coding, which ensures an equal number of positive and negative pulses, also make the DC power zero. The same timing function applies to these cases, and the requirement on the DFE is also relaxed because the postcursor ISI is smaller. However, the scrambler does not guarantee alternate positive and negative pulses, nor does it eliminate the possibility of long strings of zeros. In certain applications it is also undesirable to have the data scrambled. AMI coding, although it ensures alternate positive and negative pulses, turns binary signals into ternary signals, which complicates the detection process.

## 7.1.2. Pulse-shaping

A special pulse shape, which inserts a zero-crossing at the precursor, is also proposed for the sake of timing recovery. The objective of this pulse-shaping is that a timing function can be defined on the pulse such that the resultant timing phase obtained by the timing function is insensitive to line impairments, especially the presence of bridged taps. The timing function will be described in the next section.

As mentioned in Chapter 1, pulse-shaping can be achieved either in the digital domain by means of digital coding or in the analog domain by means of filtering. Digital coding is normally simpler to implement in VLSI technology.

The pulse-shaping proposed for the baudrate sampling technique can also be performed with digital coding at the transmitter end, which will be discussed in the next section. A means of filtering in the sampled-data domain, located in the receiver, is the topic of Sec. 7.1.2.2.

Figure 7.2. Received pulse shape at the end of a 4 km line. (a) a minimum ISI pulse (b) a  $1-z^{-1}$  partial-response coded minimum ISI pulse.

## 7.1.2.1. Pulse-shaping by digital coding

The digital code corresponding to the  $1-z^{-1}$  of the zero-crossing precursor pulse representing a -1 is shown in Fig. 7.3(a). The code for +1 is simply the negative of this waveform. Note that a single pulse extends over two time intervals and creates severe postcursor intersymbol interference, which will be removed by DFE if binary detection is to be performed. This digital code can be easily generated in digital circuits. Further analog filtering is needed to reshape the pulse into a desired form to eliminate high frequency components for crosstalk and noise reasons. This analog filtering can be performed in either the transmitter end or the receiver end, or split between transmitter filter and receiver filter. The filtered pulse shape is also shown in Fig. 7.3(b). It is self-equalizing and DC-free because there is equal energy in positive and negative portions of a code. The zero-crossing created by this special code, if made one baud period ahead of the sampling phase, will make precursor ISI equal to zero.

Digital coding provides a simple means for pulse-shaping. The spectrum of the data signal appearing in the transmission channel, however, contains more high frequency components due to the abrupt change in the waveform.

## 7.1.2.2. Pulse-shaping by analog filtering

Figure 7.3. A  $1-z^{-1}$  coded, digitally shaped pulse in digital domain (a), in analog domain after transmit and receive filtering (b), and the corresponding timing function (c).

The drawback of the additional high-frequency spectral components of the digitally generated pulse shape can be alleviated by placing the analog pulse-shaping filter at the receiving end. A sampled-data transversal filter, with fixed tap coefficients and a sampling rate equal to 4 times the data rate, can be used for this purpose. A raised-cosine filtered pulse, which is more compatible with the conventional approach and contains fewer high-frequency components than the special code in Fig. 7.3(b), can be reshaped to a zero-precursor pulse after passing through a filter with a sampling rate of $4f_b$ and transfer function

$$H(z) = 1 - \alpha z^{-1} \tag{7.6}$$

The block diagrams of the two methods for line coding and pulse-shaping are shown in Figs. 7.4(a) and (b).

## 7.2. Timing Function

The timing function is a function of the timing phase with the property that the values of the function can be used to control the voltage-controlled oscillator (VCO) to adjust the timing phase. An example of a typical timing function defined for a symmetrical pulse is shown in Figures 7.5(a) and (b), where the desired timing phase is defined as  $\tau_0$ . This timing function is positive when the timing phase is ahead, and negative when behind.

The received data signal can be expressed as

$$x(t) = \sum_{k=-\infty}^{\infty} a_k \ h(t-kT)$$
 (7.7)

where  $\{a_k\}$  are the transmitted data symbols and h(t) is the effective channel impulse response. The received signal, if sampled at baud interval at phases  $\tau+nT$ , can be expressed by the weighted sum of the samples, taken also at baud intervals, of the impulse response, as

$$x_n = x(\tau + nT) = \sum_{k=-\infty}^{\infty} a_k \ h(\tau + nT - kT)$$
 (7.8)

It appears possible to estimate the timing function, which is defined on samples of the impulse response, from the received samples  $x_n$ , via a reverse operation. It is also obvious that this reverse operation is a function of data symbols  $a_k$ .

Figure 7.4. Block diagrams of the two pulse shaping approaches, digital coding in (a), and analog filtering in (b).

Figure 7.5. Example of a timing function, where the pulse shape h(t) is assumed to be symmetrical. The timing function is defined as  $f(\tau) = h(\tau - T) - h(\tau + T)$ .

A method to estimate this function from samples of the received signal, taken at one-baud intervals, was proposed by Mueller and Müller [44]. The idea is to introduce a weighting vector, which is an algebraic function of the data symbols, as an operator on the received samples such that the expected value of this weighted sum is equal to the timing function. The weighting vector corresponding to the timing function $f(\tau)=h(\tau-T)-h(\tau+T)$ in Fig. 7.5(b) is:

$$\mathbf{g_k} = \begin{bmatrix} a_k \\ -a_{k-1} \end{bmatrix} \tag{7.9}$$

This same technique, but with a different timing function, is used in the baud-rate sampling timing recovery.
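The estimation principle can be illustrated with a small Monte Carlo sketch. Everything specific in it is an assumption made for illustration: the pulse `h` is a hypothetical symmetric triangle, the data are independent ±1 symbols, time is measured in units of T, and the weighting vector (7.9) is assumed to act on the sample pair $(x_{k-1}, x_k)$, giving the term $a_k x_{k-1} - a_{k-1} x_k$. Under these assumptions the average of that term should approach $f(\tau) = h(\tau - T) - h(\tau + T)$.

```python
import random


def h(t):
    """Hypothetical symmetric pulse: a triangle of half-width 2T (t in units of T)."""
    return max(0.0, 1.0 - abs(t) / 2.0)


def received_samples(a, tau, span=3):
    """x[n] = sum_k a[k] h(tau + (n - k) T), as in Eq. (7.8), truncated to |n - k| <= span."""
    n_sym = len(a)
    x = [0.0] * n_sym
    for n in range(n_sym):
        for k in range(max(0, n - span), min(n_sym, n + span + 1)):
            x[n] += a[k] * h(tau + (n - k))
    return x


def timing_estimate(a, x):
    """Average of a[k] x[k-1] - a[k-1] x[k], skipping a few symbols at each edge."""
    terms = [a[k] * x[k - 1] - a[k - 1] * x[k] for k in range(4, len(a) - 4)]
    return sum(terms) / len(terms)


random.seed(1)
a = [random.choice((-1, 1)) for _ in range(20000)]  # i.i.d. binary data
tau = 0.2                                           # sampling phase offset, in units of T
x = received_samples(a, tau)

estimate = timing_estimate(a, x)
expected = h(tau - 1) - h(tau + 1)                  # f(tau) = h(tau - T) - h(tau + T)
print(f"estimate = {estimate:.3f}, f(tau) = {expected:.3f}")
```

The individual terms are strongly data dependent, so it is only the average that tracks $f(\tau)$; in a loop this averaging would be done by the loop filter.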

The timing function defined on the proposed code, shown in Fig. 7.3(c), is

$$f(\tau) = h(\tau - T) \tag{7.10}$$

and the desired phase is at  $f(\tau_0) = 0$ . Note that the timing function is defined on the leading portion of the pulse and thus is insensitive to the bridged taps. Also note that by definition the precursor at -T is zero and this zero crossing is close to one period before the peak of the pulse.

Figures 7.6(a), (b), and (c) show the timing phase obtained by this timing function for various line configurations. The results show that, for all the lines shown, the recovered timing phases are close to the peak of the pulse. Note from Fig. 7.6(c) that the recovered timing phase is insensitive to the presence of bridged taps, simply because the bridged taps affect only the trailing portion of the pulse. Figs. 7.6(a) and (b) also show that the recovered phase moves closer to the peak as the line length increases, up to 5 km.

Figure 7.6(a). Recovered timing phase indicated by dots for lines with lengths equal to 1, 2, and 3 km. Note that the recovered timing phase is one T after the zero crossing.

Figure 7.6(b). Recovered timing phase indicated by dots for lines with lengths equal to 3, 4, and 5 km. Note that the recovered timing phase is one T after the zero crossing.

Figure 7.6(c). Recovered timing phase indicated by dots for lines with length equal to 2 miles with and without bridged taps intact. Note that the recovered timing phase is one T after the zero crossing.

The weighting vector corresponding to the timing function in Eq. (7.10) is:

$$\mathbf{g}_{k} = \begin{bmatrix} a_{k-1} - a_{k}\, a_{k-1}\, a_{k-2} \\ a_{k-2} + 2a_{k} \\ -a_{k-1} - 2a_{k}\, a_{k-1}\, a_{k-2} \end{bmatrix} \tag{7.11}$$

Its entries assume only the values $0,\pm 1,\pm 2,\pm 3$, for either binary or ternary detection, which facilitates implementation by a shift-and-add operation in the digital domain or by a ratioed switched-capacitor technique in the analog domain. Figure 7.7 shows the signal flow diagram of the timing function generator.
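The claim about the small set of values can be checked by brute-force enumeration. The sketch below assumes that the three entries of the weighting vector in Eq. (7.11) are read as $a_{k-1} - a_k a_{k-1} a_{k-2}$, $a_{k-2} + 2a_k$, and $-a_{k-1} - 2a_k a_{k-1} a_{k-2}$; this reading, and the function names, are assumptions made for illustration.

```python
from itertools import product


def weighting_vector(a_k, a_k1, a_k2):
    """Assumed reading of Eq. (7.11); a_k1 stands for a[k-1] and a_k2 for a[k-2]."""
    return (
        a_k1 - a_k * a_k1 * a_k2,
        a_k2 + 2 * a_k,
        -a_k1 - 2 * a_k * a_k1 * a_k2,
    )


def component_values(alphabet):
    """Collect every value any entry of the vector takes over all symbol triples."""
    values = set()
    for symbols in product(alphabet, repeat=3):
        values.update(weighting_vector(*symbols))
    return values


binary_values = component_values((-1, 1))       # binary (antipodal) decisions
ternary_values = component_values((-1, 0, 1))   # ternary decisions

print(sorted(binary_values))
print(sorted(ternary_values))
```

Both alphabets give the same set of seven integers, so each product with a received sample reduces to a sign change, at most one shift, and at most one addition.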

## 7.3. Timing Jitter Analysis

The jitter of the recovered timing signal in the DSL system was shown to be very critical. Analysis and computation of the jitter provides needed understanding to find means of reducing the timing jitter. It will be shown in this section how the timing jitter can be computed in the baudrate-sampling technique implemented with a PLL. An efficient method to reduce the timing jitter in such a system is also presented. Under ideal conditions where ISI is removed entirely, a jitter-free timing signal can be achieved with the baudrate sampling timing recovery technique.

## 7.3.1. Timing Jitter Model

A simplified version of the baudrate-sampling timing recovery system is shown in Fig. 7.8(a). The VCO output, which is the recovered timing signal, is used to sample the input data signal. The received samples and the correct received data sequence are then processed in the timing function generator block. The output of the timing function generator is equivalent to the output of the phase detector of a conventional PLL [45,46] (as shown in Fig. 7.8(b)). It is passed through a loop filter to drive the VCO.

During steady state operation, and when the phase jitter and phase error are very small, the PLL can be modeled as a linear system, with the two inputs into the phase detector representing the phase of the incoming data signal and that of the VCO output. As shown in Fig. 7.8(c), the phase detector is also replaced by a linearized phase detector with two inputs in the dimension of phase. Under the same assumptions, the system in Fig. 7.8(a) can also be modeled by the system in Fig. 7.8(c). The characteristics corresponding to two types of timing function are shown in Fig. 7.5(b) and 7.3(c). Note that the expected values of the timing functions are monotonic, and approximately



Figure 7.7. The timing function generator, analog delay line for samples  $x_k$ , and digital delay line for data symbols  $a_k$ .



Figure 7.8. (a). Block diagram of the baudrate sampling technique. (b). Block diagram of a conventional PLL for timing recovery. (c). Linearized model of (a).

linear, functions of the difference between the VCO output phase and the ideal sampling phase on the data signal. Also note that the linear system model of the PLL is a low pass filter, the characteristic of which is defined by the loop gain and the loop filter.

The feedback in the PLL tends to drive the VCO such that the average sampling phase of the VCO output coincides with the desired timing phase, defined by the zero crossing of the timing function. However, the actual VCO output phase fluctuates around the desired phase because the timing generator output also fluctuates. The loop filter in the PLL averages the timing generator output, and the time constant of the loop filter determines how much fluctuation gets through.

The actual timing function outputs for the two types of timing function are computed in the Appendix. This instantaneous output of the timing function generator is data dependent. This is one way timing jitter is induced. If the channel impulse response has a finite duration, say NT, where T is the time interval between data symbols, this output is a function of M data symbols only, where M is an integer depending on N and the timing function, and can only assume a finite number of values. With the assumption that the PLL has converged and the timing jitter is very small, it can be shown that if the data sequence $a_k$ is an independent identically distributed sequence assuming the values +1 and -1, the output of the timing function generator is a random sequence, and it is wide-sense stationary. The probability density function can be computed from the $2^M$ combinations of the M data symbols. This random sequence can be referred back to the input of the phase detector, and treated as the jitter of the input phase $\theta_i$. In this case, the input $\theta_i$ is a discrete-time random sequence with index n denoting time.

#### 7.3.2. Jitter Spectrum

The autocorrelation function of this input-referred jitter in two cases is computed in the Appendix. The result is plotted in Figs. 7.9(a) and (b), where N was assumed to be 3 for simplicity. The same derivation can easily be extended to cases with N larger. The power spectra of the jitter are then computed and also are plotted in Figs. 7.9(c) and (d). It is interesting to note that the jitter power at low frequency of Fig. 7.9(c) is inherently much smaller than that in Fig. 7.9(d), which happens to be white. This is intuitively true because of the property of symmetry in Fig. 7.5(a). After referring the jitter to the input, the PLL can be treated as having a linear transfer function with a low-pass characteristic. The jitter power at the output of the VCO, at point A in Fig. 7.8(c), is

$$S_{\theta_o}(\omega) = S_{\theta_i}(\omega)\, |H(\omega)|^2 \tag{7.12}$$

where $H(\omega)$ is the effective transfer function of the PLL, $S_{\theta_i}(\omega)$ is the power spectrum of pattern jitter referred to the input, and $S_{\theta_o}(\omega)$ is the power spectrum of timing jitter. We see that the timing jitter can be reduced to any desired value by making the effective bandwidth of the PLL narrower. However, the bandwidth of the PLL is also subject to other constraints, such as the capture range and the pull-in range.
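The effect of loop bandwidth on the output jitter can be illustrated with a minimal numerical sketch of Eq. (7.12). It assumes a hypothetical first-order discrete-time loop, $\theta_o[n+1] = (1-K)\,\theta_o[n] + K\,\theta_i[n]$, and white input-referred jitter of unit variance; neither the loop order nor the gain values are taken from the text. For this loop the output variance has the closed form $K/(2-K)$, which the numerical integration should reproduce.

```python
import math


def loop_gain_squared(omega, k):
    """|H(e^{j omega})|^2 for the assumed loop H(z) = k z^-1 / (1 - (1 - k) z^-1)."""
    r = 1.0 - k
    return k * k / (1.0 - 2.0 * r * math.cos(omega) + r * r)


def output_jitter_variance(k, input_variance=1.0, n_points=4096):
    """Average S_out = S_in |H|^2 over one period, for a flat (white) S_in."""
    total = sum(
        loop_gain_squared(2.0 * math.pi * i / n_points, k) for i in range(n_points)
    )
    return input_variance * total / n_points


wide = output_jitter_variance(0.2)     # wider loop bandwidth
narrow = output_jitter_variance(0.02)  # narrower loop bandwidth

print(f"K = 0.20: output variance = {wide:.5f}")
print(f"K = 0.02: output variance = {narrow:.5f}")
```

In this sketch, reducing the loop gain by a factor of ten reduces the output jitter variance by roughly the same factor, at the cost of a correspondingly slower loop.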

Another obvious way to reduce jitter is to make $h_{-1}$ and $h_1$ both equal to zero, since the power spectrum is linearly proportional to the magnitude of $h_{-1}$ and $h_1$, as shown in Eqs. (A8) and (A14) in the Appendix. This is equivalent to equalizing the pulse to a Nyquist pulse, as is required by intersymbol-interference-free data detection. This property is a major advantage of this timing recovery technique. In other words, the operation needed for equalization also serves the purpose of reducing timing jitter. With other timing recovery methods, a separate operation is usually needed to reduce the timing jitter [28,29], e.g., a prefilter in the spectral line method as described in Section 3.3.3. Note that this jitter reduction mechanism, if implemented with an adaptive



Figure 7.9. Autocorrelation function (a, b) and power spectrum (c, d) of the timing jitter. Timing functions are  $f(\tau) = h(\tau - T) - h(\tau + T)$  for (a, c) and  $f(\tau) = h(\tau - T)$  for (b, d).

equalizer, is adaptive and data directed. It is therefore reasonable that data dependent jitter can be removed by this technique.

## 7.4. Equalization/Data Detection

The dispersion of the signal pulses caused by the  $\sqrt{f}$  attenuation, and the distortion of the signal pulses caused by the bridged taps make equalization indispensable for data detection. DFE was determined to be the best candidate from the viewpoint of complexity and noise enhancement.

The $1-z^{-1}$ partial-response, as mentioned in Sec. 7.1.1, is equivalent to the AMI code with precoding. The data sequence can therefore be detected as a binary antipodal (2-level) signal, treating it as a partial-response code in which the $z^{-1}$ portion is ISI. It can also be detected as a ternary (3-level) signal, treating it as an AMI code. In the case of binary detection, a large first postcursor ISI is experienced, which must be removed by the first tap of the DFE. In the case of ternary detection, a DFE may not be necessary if there is no bridged tap attached. However, the self-equalization property of the $1-z^{-1}$ code is then lost, so a $\sqrt{f}$ equalizer is needed. If there are bridged taps, a DFE is still required for ternary detection to remove the reflection caused by the bridged taps.
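Binary detection with a single feedback tap can be sketched as follows. This is a minimal illustration under strong assumptions, not the design described in this work: the channel is an ideal, noiseless $1-z^{-1}$ response, the tap is adapted with a plain LMS update, the step size of 0.05 and the initial tap value of -0.5 are chosen arbitrarily, and the symbol preceding the data is assumed known. All names are hypothetical.

```python
import random


def run_single_tap_dfe(symbols, step=0.05, initial_tap=-0.5):
    """Detect +/-1 symbols sent through a 1 - z^-1 channel with a one-tap DFE."""
    tap = initial_tap      # estimate of the first postcursor (true value is -1)
    prev_sent = 1          # assumed known start-up symbol
    prev_decision = 1
    errors = 0
    for a in symbols:
        received = a - prev_sent                    # ideal, noiseless 1 - z^-1 channel
        equalized = received - tap * prev_decision  # cancel the postcursor ISI
        decision = 1 if equalized >= 0 else -1      # antipodal slicer, threshold at zero
        error = equalized - decision
        tap += step * error * prev_decision         # LMS update of the feedback tap
        errors += decision != a
        prev_sent, prev_decision = a, decision
    return tap, errors


random.seed(0)
symbols = [random.choice((-1, 1)) for _ in range(500)]
final_tap, n_errors = run_single_tap_dfe(symbols)
print(f"final tap = {final_tap:.6f}, decision errors = {n_errors}")
```

In this idealized setting the slicer threshold stays at zero regardless of signal level, and the feedback tap settles at the postcursor value of -1 without decision errors.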

A comparison, shown in Table 7.1, of binary coding, binary-detected  $1-z^{-1}$  partial-response, and ternary-detected  $1-z^{-1}$  partial-response shows the tradeoffs among those coding schemes. For the same dynamic range, binary transmission requires 4 times the signal power with 6 dB signal-to-noise ratio (SNR) advantage over the other two. In terms of SNR, binary-detected  $1-z^{-1}$  is identical to ternary-detected  $1-z^{-1}$ . The binary-detected  $1-z^{-1}$  code suffers error propagation, which has been shown to have negligible effect if the number of taps is small [36]. The binary-detected  $1-z^{-1}$  partial-response, compared to the ternary-detected case, offers the advantage of simple decision circuit, well-defined threshold which is independent of the signal level, and robust and fast convergence during start-up.
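As a worked check of the 6 dB figure, using only the relative decision levels ($\tfrac{1}{2}d$ versus $\tfrac{1}{4}d$) and relative average powers ($(1)^2$ versus $(\tfrac{1}{2})^2$) listed in Table 7.1:

$$\frac{P_{\text{binary}}}{P_{1-z^{-1}}} = \frac{(1)^2}{(1/2)^2} = 4, \qquad 10\log_{10} 4 = 20\log_{10}\frac{d/2}{d/4} = 20\log_{10} 2 \approx 6.02\ \text{dB}.$$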

| Coding scheme | Dynamic range | Decision level | Average power |
|---------------|---------------|----------------|---------------|
| Binary | $d$ | $\tfrac{1}{2}d$ | $(1)^2$ |
| AMI | $d$ | $\tfrac{1}{4}d$ | $(\tfrac{1}{2})^2$ |
| Binary ($1-z^{-1}$) | $d$ | $\tfrac{1}{4}d$ | $(\tfrac{1}{2})^2$ |

Table 7.1. Comparison of the 3 coding schemes, binary, AMI, and  $1-z^{-1}$  partial response.

#### 7.5. Subscriber End Transceiver

Since the clock in the transceiver at the subscriber end is slaved to the master clock in the transceiver at the central office, the timing recovery portions of the two transceivers are different. At the subscriber end, the timing recovery is always active, and the recovered timing is used for the transmitter clock, the echo canceller clock, and the clock for the equalization/detection, as shown in Fig. 7.10.

#### 7.6. Central Office End Transceiver

At the central office end, by contrast, a master clock is available for the transmitter and echo canceller. The master clock, which is derived from the central office switch, has very little jitter. Using this clock to sample the received signal can eliminate jitter accumulation. The phase of this clock relative to the received signal, however, depends on the round-trip delay on the line. A simple method, shown in Fig. 7.11, can be used to solve this problem. A sample/delay circuit, sampling the received signal at the right phase, delays the sample for a certain period of time before delivering it to the subtractor. The delay is estimated at start-up, using the same baudrate sampling technique. The delay is implemented with a high-frequency clock and a variable count-down circuit, as in a DPLL.

## 7.7. Integrated Circuit Implementation

Three approaches to the integrated circuit realization of the timing recovery and DFE have been considered: a fully digital timing recovery including a DFE, an analog timing recovery with digital DFE adaptation and analog cancellation, and a fully analog approach. The fully digital approach is only practical when the echo-free signal is available in the digital domain, as it is with certain echo cancellers [1]. Several echo cancellers perform the critical operations in the analog domain to avoid the intermodulation due to nonlinearity in the data conversion [1,6]. The echo-free signal



Figure 7.10. Block diagram of the subscriber transceiver.



Figure 7.11. Block diagram of the central office transceiver.

is available only in analog form in these designs.

#### 7.7.1. Digital Timing Recovery and DFE

Some proposed echo cancellers use digital adaptation and analog cancellation. In these configurations, analog echo-free signals are digitized for adaptation, and both analog and digital echo-free signals are available. A fully digital approach is to utilize the digital echo-free signal for timing recovery and equalization, as shown in Figure 7.12. The required accuracies for the analog-to-digital conversion and the DFE were determined by simulation to be 8 bits and 12 bits, respectively.

The required hardware for the DFE is a set of up/down counters for both storage and adaptation. An arithmetic-logic unit (ALU) and accumulator are used for addition and subtraction. To compute the timing function, the same ALU and accumulator can be used for the shift and add operations. A shift register is needed to store and update the detected data, and a ROM look-up table or some random logic is used to generate the weighting vector.

#### 7.7.2. Analog Timing Recovery and Digital DFE Adaptation with Analog Cancellation

Possibly only the sign of the digital echo-free signal will be used for adaptation. Analog timing recovery is then necessary. Adaptation for the DFE can still be implemented in the digital domain and converted via a D/A (time shared with the D/A in the echo canceller). Postcursor cancellation in the DFE is then performed in the analog domain. A switched-capacitor realization of the analog delay line and the simple multiplications required by the timing recovery is shown in Figure 7.13, where the adaptation and the DFE cancellation are also shown. An analog delay line, implemented with sample-hold amplifiers, is a more conservative approach.



Figure 7.12. Fully Digital Timing Recovery and DFE



Figure 7.13. Analog Timing Recovery and Digital DFE Adaptation with Analog Cancellation

#### 7.7.3. Fully Analog DFE and Timing Recovery

Perhaps the greatest impact of the baudrate sampling technique is that it makes possible a fully analog echo canceller/DFE approach. The fully analog echo canceller was previously abandoned because of the small step size required by the adaptation in the echo cancellation [1]. The small step size was due to the presence of the far-end signal in the error term used for adaptation. Precursor-free timing recovery with baudrate sampling, combined with the DFE, can remove the far-end signal from the error term and therefore allows a much larger step size. Simulation showed that a step size of 0.03 to 0.05, which can be implemented with ratioed switched-capacitor techniques, performs satisfactorily. The fully analog echo canceller/DFE eliminates the problems associated with nonlinearity in the data conversion, which has always been a challenge to the monolithic realization of the echo canceller.

The circuit design associated with this approach is already discussed in Sec. 1.5.

#### 7.8. Simulation Results

The transient behavior of the timing recovery process is too fast to observe experimentally. Computer simulation was used to examine the convergence of the DFE and timing recovery.

Simulation of the system is based on the block diagram shown in Figure 7.14. Note that the samples used to generate the timing function are the echo-free signals which have also been equalized by the DFE. The DFE removes the ISI among pulses and greatly reduces the timing jitter of the recovered timing. This configuration is possible only when the sampling frequency required by the timing recovery is the baud rate, because the DFE works at the baud rate. This, of course, becomes one of the greatest advantages of the techniques described here.



Figure 7.14. Block diagram of the simulated system.

Simulations of the timing recovery and the residual error at the output of the DFE are shown in Figures 7.15(a), (b), 7.16(a), and (b), for two line configurations. The results for other line configurations were very similar. In Figs. 7.15(a) and 7.16(a), the vertical axis is the timing phase, where 64 steps in phase constitute a baud interval. The horizontal axes in all 4 plots are in number of data symbols. In the plots showing the residual error of the DFE, the peaks of the full-scale signal are normalized to +1 and -1. Fig. 7.15(c) shows the tap weights of the DFE after convergence.

More simulations were conducted to compare the timing jitter of different algorithms. The timing phase at the output of a second-order analog VCO is plotted versus time. The two cases plotted, Figures 7.17(a) and (b), are the wave difference method (WDM) [28,29] and the baud-rate sampling technique, where one grid division on the vertical axis corresponds to $\frac{1}{64}$ of a baud interval. The baud-rate sampling technique shows far less jitter because the signal used for timing extraction has been equalized by the DFE. It should be emphasized that it is a unique property of the baudrate sampling technique that the equalizer serves both to remove the ISI for data detection and to reduce timing jitter.

## 7.9. Experimental Results

Two breadboard systems, corresponding to the first two implementation approaches discussed above, have been built. MDACs, rather than switched-capacitor circuits, were used as multipliers in the analog portion. For details of the breadboard design, readers are referred to references [15,16]. A digital phase-locked loop with a clock running at 80 times the data rate was used for simplicity. Experiments were performed at 160 kb/s, on cables ranging from 0 to 5 km in length. The cables used were gauge 24 and 26. Cases with bridged taps were also tested.
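The timing quantities implied by these breadboard parameters can be worked out in a few lines. The sketch below is only arithmetic on the figures quoted above; treating one DPLL clock period as the smallest phase correction is an assumption made here for illustration, not a statement about the actual breadboard.

```python
# Timing arithmetic for the breadboard parameters quoted in the text.
BIT_RATE_HZ = 160e3       # 160 kb/s; the text equates the baud interval with one bit period
DPLL_CLOCK_RATIO = 80     # DPLL clock runs at 80 times the data rate

baud_interval_s = 1.0 / BIT_RATE_HZ               # 6.25 microseconds
dpll_clock_hz = DPLL_CLOCK_RATIO * BIT_RATE_HZ    # 12.8 MHz
# Assumption: the DPLL corrects its phase in units of one clock period.
dpll_step_s = baud_interval_s / DPLL_CLOCK_RATIO  # 78.125 ns

print(f"baud interval : {baud_interval_s * 1e6:.3f} us")
print(f"DPLL clock    : {dpll_clock_hz / 1e6:.1f} MHz")
print(f"phase step    : {dpll_step_s * 1e9:.3f} ns")
```

Under that assumption the phase resolution is 1/80 of a baud interval, slightly finer than the 64 phases per baud used in the simulations of Section 7.8.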

Figure 7.15(a). Convergence of timing recovery for a line of 2 km length. One baud is divided into 64 equally spaced phases.

Figure 7.15(b). Residual error at the DFE output for the 2 km line. The equalized signal is normalized to have peaks at +1 and -1.

Figure 7.16(a). Convergence of timing recovery for a line of 2 mile length with 3 bridged taps. One baud is divided into 64 equally spaced phases.

Figure 7.16(b). Residual error at the DFE output for a 2 km line with 3 bridged taps. The equalized signal is normalized to have peaks at +1 and -1.

Figure 7.17(a). Timing jitter for the WDM method; an analog VCO is used in the simulation. The Y-scale is the same as in Fig. 7.16(a).

Figure 7.17(b). Timing jitter for the BST method; an analog VCO is used in the simulation. The Y-scale is the same as in Fig. 7.16(a).

The waveform of a single pulse corresponding to a "1" is shown in Fig. 7.18(a), where the bottom trace is the transmitter clock. The period of the clock is the baud interval, which is $6.25\ \mu s$ for a 160 kb/s transmission rate. Fig. 7.18(b) depicts the eye pattern (top trace), the recovered clock (second trace), the transmitter clock (third trace), and the received signal (bottom trace) for the case with line length equal to zero. Note that the 3-level eye openings are visible in the received waveform. The eye pattern shown in the picture is the analog version of the DFE output. The eyes are square because the sampling rate is the baud rate. Figure 7.18(c) depicts the same waveforms for a 3.2 km line with 2 bridged taps attached. Experimental results on other line configurations show that the baudrate sampling technique works well up to 4.8 km on a 26 gauge twisted-pair cable.

## **7.10. Summary**

A sampled-data timing recovery technique with the sampling rate equal to the baud rate is described in this chapter. This technique is well-suited for use in a hybrid-mode digital subscriber system, and it can also be applied elsewhere. The recovered timing phase can be used in conjunction with DFE, and this phase is insensitive to line characteristics even in the presence of bridged taps. The timing jitter can be minimized without extra operations. The timing jitter analysis of this technique is also described. Three possible integrated circuit implementations are described. Computer simulations and experimental results are presented.



Figure 7.18. Plate (a) shows the received pulse shape on an ideal line and the transmitter clock. Plates (b) and (c) show the analog version of the DFE output (top trace), the recovered clock (second trace), the transmit clock (third trace), and the received signal (bottom trace) using BST. Note that the DFE output shows a square eye pattern because the sampling rate equals the baud rate.

# CHAPTER 8

# Conclusions

In this chapter, the performance of the two timing recovery methods, namely the wave difference method (WDM) and the baudrate sampling technique (BST), is compared in terms of timing jitter, complexity, line penetration, capability of frequency lock, immunity to noise, and optimum line coding. The impact of the baudrate sampling technique on the digital subscriber loop (DSL) is illustrated through computer simulation of various implementation alternatives. A two-stage echo canceller approach, proposed in [59], is also readily applicable to the timing recovery methods previously analyzed.

## 8.1. Comparison of WDM and BST

The WDM needs a $\sqrt{f}$ equalizer to equalize the pulse shape against the $\sqrt{f}$ attenuation of the line to ensure a correct recovered timing phase. In the case of moderate bridged-tap interference, the WDM may still recover an adequate timing phase for data detection; however, a decision feedback equalizer (DFE) may be required for a better signal-to-noise ratio. The WDM may acquire an entirely erroneous timing phase if the bridged-tap effect is severe. If the BST is used, the DFE alone is sufficient to mitigate the effects of both the $\sqrt{f}$ attenuation and the bridged taps. The number of taps may be reduced if a $\sqrt{f}$ equalizer is placed in front of the echo canceller. The recovered timing phase is almost insensitive to the presence of the bridged taps.

A comparison of the performance of WDM and BST in the presence of timing jitter was presented in Section 7.8. Because BST needs only one sample per baud, a DFE can be used to equalize the signal used for timing recovery. The timing jitter is therefore reduced. In WDM, the timing jitter can only be reduced with additional hardware as described in Section 3.3.3, because it is basically a sampled-data equivalent of the spectral line method.

In terms of line penetration, the application of WDM is limited to line configurations with no bridged tap, one bridged tap, or multiple bridged taps that cause only minor distortion of the pulse shape. Since WDM counts on a $\sqrt{f}$ equalizer to remove the line impairments, a channel composed of lines with mixed gauges may also present difficulty for WDM timing recovery. BST, however, is much more robust, because it takes advantage of the effect of bridged taps. Simulation showed that BST works well under all line configurations as long as the attenuation is not too severe.

The line coding used in WDM and BST can be either alternate mark inversion (AMI) or the $1-z^{-1}$ partial response. It should be noted that the difference between the two is the operation of a modulo 2 summation as described in Section 7.1. In a system with WDM, if no DFE is engaged, the code must be decoded as 3-level, and a post-decoder converts it back to a 2-level sequence. In a BST system, where a DFE is present, a 2-level threshold detector may be used. The convergence of the DFE and the timing recovery presents no difficulty in binary transmission.
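The relationship between the two codes can be illustrated with a short sketch. It assumes the usual convention that the modulo 2 summation is a running XOR applied before the $1-z^{-1}$ coder, with levels 0 and 1 and a zero initial state; the function names are illustrative only.

```python
def ami_encode(bits):
    """Alternate mark inversion: 0 -> 0, each 1 -> alternating +1 / -1."""
    out, polarity = [], 1
    for b in bits:
        if b:
            out.append(polarity)
            polarity = -polarity
        else:
            out.append(0)
    return out


def precode_mod2(bits, initial=0):
    """Running modulo 2 sum: p[k] = b[k] XOR p[k-1]."""
    out, prev = [], initial
    for b in bits:
        prev ^= b
        out.append(prev)
    return out


def partial_response(levels, initial=0):
    """1 - z^-1 coder: y[k] = p[k] - p[k-1]."""
    out, prev = [], initial
    for p in levels:
        out.append(p - prev)
        prev = p
    return out


bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
ami = ami_encode(bits)
pr = partial_response(precode_mod2(bits))
print(ami)   # [1, 0, -1, 1, 0, 0, -1, 0, 1, -1]
print(pr)    # same sequence
```

With the precoder in place, the two coders produce the same ternary sequence, and taking the magnitude of each symbol recovers the original bits.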

AMI coding and the $1-z^{-1}$ coding are shown to have the same noise immunity because both are essentially 3-level transmission. Detection as a 2-level code using a DFE, however, may raise the risk of error propagation. Error propagation has been shown to have little effect when the number of taps is small and the error rate is low [36].

In terms of frequency lock capability, the WDM certainly has the advantage. Because it basically samples at the Nyquist rate, all the frequency information is preserved in the WDM case, and a frequency-locked loop can then be used to improve the frequency and phase acquisition when the free-running frequency of the local VCO is far from the desired frequency. This property may not be important because a crystal with a well-tuned frequency is easily available and may be required in the system for other signal processing purposes.

The most important difference between the two is the sampling rate. By reducing the sampling rate by a factor of two, the complexity of the echo canceller in the BST system is reduced by the same factor compared to the WDM system. The speed requirement on most signal processing units in the transceiver is also reduced accordingly.

# 8.2. Computer Simulation of Echo Canceller Alternatives

Two fundamental versions of the echo canceller are the transversal filter and the table look-up structure. Extensive simulation results on several aspects of the two approaches can be found in reference [6]. In this section, the focus is on the variations with the structure of the echo canceller limited to the transversal filter implementation, simply because the memory size grows enormously in the look-up table approach as the line length gets large. A hybrid form, combining a memory look-up table for the short-delay, large echo, and a transversal filter for smaller echo with longer delay, may also be a viable approach.

The greatest advantage that BST offers compared to other timing recovery methods is the possibility of combining the echo canceller and the DFE. This comes about not only because timing can be recovered using samples taken at the baudrate, but also because the timing phase thus recovered has no precursor intersymbol interference. Because the DFE can synthesize all the postcursors of the far-end signal, and the current far-end signal can also be determined when the data detection is performed, the far-end signal can be removed completely from the error control term used for adaptation. Consequently, the step size used in the adaptation can be enlarged because the disturbance from the far-end signal is no longer present. Fig. 8.1 compares the cases with the far-end signal included in and excluded from the control term. The average short-time power of the residual error is plotted in decibels (dB) vs. the number of iterations. The slope of the curve shows the convergence speed.

Figure 8.1. Residual echo vs. number of iterations in a binary data driven transversal filter echo canceller using stochastic gradient algorithm for adaptation. Curves (a) and (b) are the cases of combining echo canceller and DFE with step size 0.05 and 0.015 respectively; curve (c) has the far-end signal included in the adaptation control term with step size equal to 0.004.

The far-end signal is removed from the control term in cases (a) and (b), where the step sizes are 0.05 and 0.015 respectively. The code is binary, and the adaptation method is the *stochastic gradient algorithm* as described in Section 2.1. The transversal filter echo canceller has 5 taps and the DFE has 4. Curve (c) shows the case with the far-end signal included in the control term. The step size must be reduced to 0.004 to ensure the required 20 dB of echo cancellation. In this simulation, the worst case was assumed, where the power of the far-end signal is 40 dB below that of the echo.
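The effect of removing the far-end signal from the control term can be illustrated with a toy simulation. The sketch below is not the simulation behind Fig. 8.1: the echo response `echo_path`, the far-end amplitude `far_gain`, the absence of noise, and the idealised DFE (the far-end component is assumed to be known exactly and subtracted) are all assumptions chosen for illustration.

```python
import random


def residual_echo_power(step, remove_far_end, n_iter=5000, tail_len=1000, seed=1):
    """Adapt a 5-tap transversal echo canceller with the stochastic gradient
    algorithm and return the mean residual echo power over the last
    tail_len iterations."""
    rng = random.Random(seed)
    echo_path = [1.0, -0.5, 0.25, -0.125, 0.0625]  # hypothetical echo response
    far_gain = 0.01            # far-end amplitude, roughly 40 dB below the echo
    n = len(echo_path)
    taps = [0.0] * n
    tx = [0.0] * n             # most recent transmitted binary symbols
    tail = []
    for k in range(n_iter):
        tx = [rng.choice((-1.0, 1.0))] + tx[:-1]
        far = far_gain * rng.choice((-1.0, 1.0))
        echo = sum(h * a for h, a in zip(echo_path, tx))
        estimate = sum(w * a for w, a in zip(taps, tx))
        residual = echo - estimate
        error = residual + far           # what the receiver actually observes
        if remove_far_end:
            error -= far                 # idealised DFE output subtracted
        taps = [w + step * error * a for w, a in zip(taps, tx)]
        if k >= n_iter - tail_len:
            tail.append(residual ** 2)
    return sum(tail) / len(tail)


res_removed = residual_echo_power(0.05, remove_far_end=True)
res_included = residual_echo_power(0.05, remove_far_end=False)
res_small_step = residual_echo_power(0.004, remove_far_end=False)
print(res_removed, res_included, res_small_step)
```

In this toy setting the residual echo collapses when the far-end signal is removed from the error, whereas with the far-end signal left in, the residual at the same step size settles at a floor set by the far-end power, and only a much smaller step size lowers that floor.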

The correction term in the stochastic gradient algorithm is the product of the step size and the error control term, multiplied by the data bit, as in Eq. (2.1). The hardware needed to perform this adaptation can be simplified if the sign algorithm is used instead, in which the error control term is replaced by its sign. The tap coefficients are either incremented or decremented by an amount equal to the step size according to the signs of the error control term and the data bit. This can be easily implemented with an up/down counter. The sign algorithm therefore becomes attractive in an echo canceller equipped with digital adaptation. The fully analog approach made possible by the baudrate sampling technique, however, can perform the stochastic gradient algorithm easily. Fig. 8.2 compares the convergence speed and residual error of the stochastic gradient algorithm and the sign algorithm. Curves (a) and (b) show the stochastic gradient algorithm with step size equal to 0.05 and 0.015 respectively. Curve (c) is the sign algorithm, where the step size must be reduced to 0.015 to achieve 20 dB of echo cancellation. Note that the convergence speeds are very different.
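The two update rules can be contrasted in a few lines. This is a minimal sketch; the function names, the representation of the taps as integer counter contents, and the example values are assumptions made for illustration.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def stochastic_gradient_step(taps, data, error, step):
    """Stochastic gradient update: each tap moves by step * error * data bit."""
    return [c + step * error * a for c, a in zip(taps, data)]


def sign_algorithm_step(counters, data, error):
    """Sign algorithm on integer up/down counters: each counter moves by one
    count, up or down, according to sign(error) * data bit."""
    return [n + sign(error) * a for n, a in zip(counters, data)]


STEP = 0.015                      # tap change represented by one counter count
data = [1, -1, -1, 1]             # binary (+1/-1) symbols in the delay line
counters = sign_algorithm_step([0, 0, 0, 0], data, error=-0.3)
sign_taps = [STEP * n for n in counters]
sg_taps = stochastic_gradient_step([0.0] * 4, data, error=-0.3, step=STEP)
print(counters)   # [-1, 1, 1, -1]
print(sign_taps)
print(sg_taps)
```

The sign algorithm moves every tap by a full step regardless of how small the error is, which is why it needs a smaller step size to reach the same residual error, while the stochastic gradient correction shrinks with the error.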

Figure 8.2. Residual echo vs. number of iterations in a binary data driven transversal filter echo canceller combined with DFE. Curves (a) and (b) use the stochastic gradient algorithm; curve (c) uses the sign algorithm.

The third variation concerns the number of data levels. Both the AMI and the $1-z^{-1}$ partial-response codes are 3-level codes generated from 2-level codes. If the binary to ternary conversion is linear, as in the case of the $1-z^{-1}$ coder, the coding itself can be absorbed in the channel, and a 2-level echo canceller can be used. Otherwise a 3-level echo canceller, where the adaptation is active only if the data are +1 or -1, must be used. Fig. 8.3 compares the convergence speeds of binary and ternary (bipolar) echo cancellers. It is quite reasonable that a binary echo canceller converges faster than a ternary echo canceller, because its adaptation is always active.

## 8.3. 2-Stage Echo Canceller

The 2-stage echo canceller proposed in [59] can be used to reduce the required degree of echo cancellation by putting an automatic gain control (AGC) stage between two echo cancellers, as shown in Fig. 8.4. Each echo canceller has its own error control voltage for adaptation. The way it works can be explained as follows. Assume that in the worst case the echo to far-end signal power ratio is 40 dB, and a residual error to far-end signal ratio of -20 dB is desired. The total 60 dB of echo cancellation can be split between two echo cancellers. The first stage may accomplish 30 dB of echo cancellation, so that the echo to far-end signal ratio is 10 dB at the output of the first echo canceller. The AGC then amplifies its input by 30 dB to bring the signal level back up to a comfortable dynamic range. The second echo canceller then sees an input signal with an echo to signal ratio of 10 dB. A further 30 dB reduction of echo achieves a 20 dB signal to echo ratio at the output of the second echo canceller. The combination of echo canceller and DFE can still be applied to this structure by adding a stage of attenuation in the generation of the far-end signal, as shown in Fig. 8.5. The degree of this attenuation is equal to the gain of the AGC stage.
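The decibel budget in this example can be checked with a short calculation. The sketch below only restates the numbers given above; converting the 30 dB AGC gain to a linear amplitude factor assumes the usual 20 log10 amplitude convention.

```python
def after_cancellation(echo_to_signal_db, cancellation_db):
    """Echo to far-end signal ratio (dB) after a stage of echo cancellation."""
    return echo_to_signal_db - cancellation_db


ECHO_TO_SIGNAL_DB = 40    # worst-case echo to far-end signal ratio
STAGE_DB = 30             # cancellation provided by each stage
AGC_GAIN_DB = 30          # gain of the AGC between the two stages

after_stage1 = after_cancellation(ECHO_TO_SIGNAL_DB, STAGE_DB)
after_agc = after_stage1  # the AGC scales echo and far-end signal equally
after_stage2 = after_cancellation(after_agc, STAGE_DB)

# Assumption: dB values relate to amplitude through 20 * log10.
agc_gain_linear = 10 ** (AGC_GAIN_DB / 20)

print(after_stage1, after_stage2)   # 10 -20
print(round(agc_gain_linear, 2))    # 31.62
```

The AGC changes the absolute signal level but not the echo to signal ratio, so each stage only has to provide 30 dB of cancellation relative to its own input.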

Note that although the required echo cancellation of each echo canceller is reduced to 30 dB, the timing jitter requirement remains the same. Therefore, a timing recovery with low timing jitter is still required, and the baudrate sampling technique is a good candidate.




Figure 8.3. Residual echo vs. number of iterations in a transversal filter echo canceller combined with DFE using stochastic gradient algorithm for adaptation. Curves (a) and (b) are for binary data; curve (c) is for ternary (bipolar) data.



Figure 8.4. 2-stage echo canceller with reduced echo cancellation requirement as proposed in reference [59].



Figure 8.5. 2-stage echo canceller with reduced echo cancellation requirement combined with DFE and baudrate sampling timing recovery.

## **APPENDIX**

The received data signal in a baseband subscriber loop receiver can be expressed as:

$$x(t) = \sum_{k=-\infty}^{\infty} a_k \, h(t-kT) \tag{A1}$$

where h(t) is the channel response to the input pulse. In subsequent analytical results, binary line coding will be assumed in which  $a_k$  is an independent identically distributed sequence of transmitted data symbols assuming the values +1 and -1. Further assuming that the timing jitter is very small and that the samples are taken at intervals of T, the samples will be:

$$x_n = x(nT) = \sum_{k=-\infty}^{\infty} a_k \, h(nT - kT) \tag{A2}$$

Using a simplified notation  $h_{n-k}$  for h(nT-kT), Eq.(A2) can be rewritten as:

$$x_n = \sum_{k=-\infty}^{\infty} a_k \, h_{n-k} = \sum_{k=-\infty}^{\infty} a_{n-k} \, h_k \tag{A3}$$

In case 1, where

$$f(\tau) = h(\tau - T) - h(\tau + T) \tag{A4}$$

the timing function generator output is

$$z_k = a_k \ x_{k-1} - a_{k-1} \ x_k \tag{A5}$$

Considering the case where h(t) has a finite duration of 3T, plugging (A3) into (A5), we get:

$$z_k = h_{-1} - h_1 + (a_k a_{k-2}) h_1 - (a_{k-1} a_{k+1}) h_{-1} \tag{A6}$$
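As a check on Eq. (A6), the substitution can be written out explicitly. With $h(t)$ limited to the three samples $h_{-1}$, $h_0$, and $h_1$, Eq. (A3) gives

$$x_k = a_{k+1} h_{-1} + a_k h_0 + a_{k-1} h_1, \qquad x_{k-1} = a_k h_{-1} + a_{k-1} h_0 + a_{k-2} h_1$$

and, since $a_k^2 = 1$ for binary symbols,

$$a_k x_{k-1} = h_{-1} + a_k a_{k-1} h_0 + a_k a_{k-2} h_1, \qquad a_{k-1} x_k = a_{k-1} a_{k+1} h_{-1} + a_k a_{k-1} h_0 + h_1$$

The $h_0$ terms cancel in the difference of Eq. (A5), which leaves the four terms of Eq. (A6).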

Note that only $h_{-1}$, $h_0$, and $h_1$ are considered. Cases with duration longer than 3T can easily be deduced in the same fashion. Since $h_1=h_{-1}$ at steady state,

$$z_k = (a_k a_{k-2} - a_{k-1} a_{k+1}) h_1 \tag{A7}$$

Note that  $z_k$  is a wide sense stationary random sequence if  $a_k$  are uncorrelated. To reference  $z_k$  to the input of the phase detector, divide  $z_k$  by the gain of the phase detector, which in this case is the slope s of the timing function, shown in Fig. 7.5(b).

$$\theta_{ik} = z_k \, 2\pi/(sT) \tag{A8}$$

Exhaustive search shows that this random input jitter $\theta_{ik}$ assumes only 3 values, with the probabilities:

$$\begin{aligned}
p_{\theta_{ik}}(4h_1\pi/(sT)) &= 0.25 \\
p_{\theta_{ik}}(-4h_1\pi/(sT)) &= 0.25 \\
p_{\theta_{ik}}(0) &= 0.5
\end{aligned} \tag{A9}$$
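The exhaustive search can be reproduced with a short script. The sketch below enumerates all equally likely sign patterns of the four symbols that enter $z_k$ when $h(t)$ spans 3T, using Eq. (A3) for the samples and Eq. (A5) for the timing function. The numerical pulse samples (with $h_{-1}=h_1=1$ and an arbitrary $h_0$) and the function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def sample(a_next, a_cur, a_prev, h):
    """x_k of Eq. (A3) for a pulse limited to the samples (h_-1, h_0, h_1)."""
    h_m1, h_0, h_p1 = h
    return a_next * h_m1 + a_cur * h_0 + a_prev * h_p1


def z_case1(a_kp1, a_k, a_km1, a_km2, h):
    """Timing function output of Eq. (A5): a_k x_{k-1} - a_{k-1} x_k."""
    x_k = sample(a_kp1, a_k, a_km1, h)
    x_km1 = sample(a_k, a_km1, a_km2, h)
    return a_k * x_km1 - a_km1 * x_k


def distribution(h):
    """Probability of each value of z_k over all binary symbol patterns."""
    counts = Counter(z_case1(*a, h) for a in product((-1, 1), repeat=4))
    total = sum(counts.values())
    return {z: n / total for z, n in counts.items()}


# Steady state: h_-1 = h_1 (set to 1 here); h_0 = 3 is arbitrary and cancels.
dist1 = distribution(h=(1, 3, 1))
print(dist1)
```

The script finds $z_k = \pm 2h_1$ with probability 0.25 each and $z_k = 0$ with probability 0.5; multiplying by $2\pi/(sT)$ as in Eq. (A8) gives the three jitter values listed in Eq. (A9).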

The autocorrelation function of  $\theta_{ik}$  can be computed from eq. (A9) and is plotted in Fig. 7.9(a). The power spectrum of the input phase jitter is the Fourier Transform of the autocorrelation function, and is computed and plotted in Fig. 7.9(b).

In case 2, where

$$f(\tau) = h(\tau - T) \tag{A10}$$

the timing function generator output is

$$z_{k} = (a_{k-1} - a_{k} a_{k-1} a_{k-2}) x_{k-2} + (a_{k-2} + 2a_{k}) x_{k-1} + (-a_{k-1} - 2a_{k} a_{k-1} a_{k-2}) x_{k} \tag{A11}$$

Again, consider the case where h(t) has a finite duration of 3T.

$$z_{k} = 3h_{-1} + (-a_{k-1} a_{k+1} - 2a_{k} a_{k-1} a_{k-2} a_{k+1})h_{-1} + (a_{k-1}a_{k-3} - a_{k} a_{k-1} a_{k-2} a_{k-3})h_{1} \tag{A12}$$

Since  $h_{-1}=0$  at steady state,

$$z_k = (a_{k-1}a_{k-3} - a_k \ a_{k-1} \ a_{k-2} \ a_{k-3})h_1 \tag{A13}$$

To refer $z_k$ to the input of the loop, divide $z_k$ by the gain of the phase detector, which in this case is the slope s of the timing function, shown in Fig. 7.3(c).

$$\theta_{ik} = z_k \, 2\pi/(3sT) \tag{A14}$$

Exhaustive search shows that this random input jitter $\theta_{ik}$ also assumes 3 values, with the probabilities:

$$\begin{aligned}
p_{\theta_{ik}}(4h_1\pi/(3sT)) &= 0.25 \\
p_{\theta_{ik}}(-4h_1\pi/(3sT)) &= 0.25 \\
p_{\theta_{ik}}(0) &= 0.5
\end{aligned} \tag{A15}$$
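The same enumeration can be sketched for case 2. Here the coefficient of $x_{k-2}$ in Eq. (A11) is taken as $a_{k-1} - a_k a_{k-1} a_{k-2}$, which is the reading that reproduces Eq. (A12) term by term; the script checks that agreement before computing the distribution. The numerical pulse samples and function names are again assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def z_case2(a_kp1, a_k, a_km1, a_km2, a_km3, h):
    """Timing function output of Eq. (A11), samples taken from Eq. (A3)."""
    h_m1, h_0, h_p1 = h
    x_k = a_kp1 * h_m1 + a_k * h_0 + a_km1 * h_p1
    x_km1 = a_k * h_m1 + a_km1 * h_0 + a_km2 * h_p1
    x_km2 = a_km1 * h_m1 + a_km2 * h_0 + a_km3 * h_p1
    prod3 = a_k * a_km1 * a_km2
    return ((a_km1 - prod3) * x_km2
            + (a_km2 + 2 * a_k) * x_km1
            + (-a_km1 - 2 * prod3) * x_k)


def z_a12(a_kp1, a_k, a_km1, a_km2, a_km3, h):
    """Closed form of Eq. (A12)."""
    h_m1, _, h_p1 = h
    prod3 = a_k * a_km1 * a_km2
    return (3 * h_m1
            + (-a_km1 * a_kp1 - 2 * prod3 * a_kp1) * h_m1
            + (a_km1 * a_km3 - prod3 * a_km3) * h_p1)


patterns = list(product((-1, 1), repeat=5))

# Eq. (A11) and Eq. (A12) agree for an arbitrary pulse (h_-1, h_0, h_1).
agree = all(z_case2(*a, (2, 5, 3)) == z_a12(*a, (2, 5, 3)) for a in patterns)

# Steady state for case 2: h_-1 = 0; h_1 is set to 1, h_0 = 3 is arbitrary.
counts = Counter(z_case2(*a, (0, 3, 1)) for a in patterns)
dist2 = {z: n / len(patterns) for z, n in counts.items()}
print(agree, dist2)
```

With $h_{-1}=0$ the script finds $z_k = \pm 2h_1$ with probability 0.25 each and $z_k = 0$ with probability 0.5, which after the scaling of Eq. (A14) gives the values in Eq. (A15).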

The autocorrelation function of  $\theta_{ik}$  is computed from eq. (A15) and is plotted in Fig. 7.9(b). The power spectrum of the input phase jitter is white, and plotted in Fig. 7.9(d).

#### REFERENCES

- [1] O. Agazzi, D. A. Hodges, and D. G. Messerschmitt "Large-Scale Integration of Hybrid Method Digital Subscriber Loops" *IEEE Transactions on Communications*, Vol. COM-30, Sept. 1982, pp. 2095-2108.
- [2] O. Agazzi, D. G. Messerschmitt, and D. A. Hodges "Nonlinear Echo Cancellation of Data Signals" *IEEE Transactions on Communications*, Vol. COM-30, Nov. 1982, pp. 2421-2433.
- [3] N. A. M. Verhoeckx, H. C. van der Elzen, W. A. M. Snijders, and P. J. van Gerwen, "Digital Echo Cancellation for Baseband Data Transmission" *IEEE Transactions on Acoustics, Speech, and Signal Processing*, Vol. ASSP-27, Dec. 1979, pp. 768-781.
- [4] N. Holte and S. Stueflotten, "A New Digital Echo Canceller for Two Wire Subscriber Lines" *IEEE Transactions on Communications*, Vol. COM-29, Nov. 1981, pp. 1573-1581.
- [5] D. G. Messerschmitt, "Echo Cancellation in Speech and Data Transmission" *IEEE Journal on Selected Areas in Communications*, Vol. SAC-2, Mar. 1984, pp. 283-297.
- [6] P. J. van Gerwen, N. A. M. Verhoeckx, and T. A. C. M. Claasen, "Design Considerations for a 144 kb/s Digital Transmission Unit for the Local Telephone Network," *IEEE Journal on Selected Areas in Communications*, Vol. SAC-2, Mar. 1984, pp. 314-323.
- [7] S. V. Ahamed, P. P. Bohn, and N. L. Gottfried, "A tutorial on Two-Wire Digital Transmission in the Loop Plant" *IEEE Transactions on Communications*, Vol. COM-29, Nov. 1981, pp. 1554-1564.
- [8] N.-S. Lin, Private Communication.
- [9] O. Agazzi, Ph.D. Thesis, University of California, Berkeley, May 1982.
- [10] D.Falconer, "Adaptive Reference Echo Cancellation", *IEEE Transactions on Communications*, Vol. COM-30, No. 9, September 1982, pp. 2083-2094.
- [11] K. H. Mueller, "Combining Echo Cancellation with Decision Feedback Equalization", Bell Syst. Tech. Jour., Vol. 58, No. 2, Feb. 1979, pp. 491-500.
- [12] R. W. Lucky, J. Salz, and E. J. Weldon. Jr., "Principles of Data Communication", Bell Telephone Laboratories, Inc., 1968.
- [13] M. M. Sondhi and A. J. Presti, "A Self-Adaptive Echo Canceller", Bell Syst. Tech. Jour., Vol. 45, No. 10, December 1966, pp. 1851-1854.
- [14] D. L. Duttweiler and Y. S. Chen, "A Single-Chip VLSI Echo Canceler", Bell Syst. Tech. Jour., Vol. 59, No. 2, February 1980, pp. 149-160.
- [15] P.O'Riordan, "Implementation and Testing of Timing recovery Algorithm for Use in Digital Subscriber Loop", Master's Report, University of California, Berkeley, August 1984.
- [16] P.L.Winship, "An Implementation of a Coding, Equalization, and Timing recovery Algorithm for Use in the Digital Subscriber Loop", Master's Report, University of California, Berkeley, December 1984.
- [17] R.D.Gitlin and J.F.Hayes, "Timing Recovery and Scramblers in Data Transmission", Bell Syst. Tech. J., Vol. 54, No. 3, March 1975, pp. 569-593.
- [18] CCITT Recommendation V-35, Data Transmission at 48 Kilobits per Second Using 60-108 KHz Group Band Circuits, Mar del Plata, 1968.
- [19] S.E.Nader and L.F.Lind, "Optimal Data Transmission Filters", *IEEE Transactions on Circuits and Systems*, Vol. CAS-26, No. 1, January 1979, pp. 36-45.
- [20] L.E.Franks, "Carrier and Bit Synchronization in Data Communication- A Tutorial Review". *IEEE Transactions on Communications*, Vol. COM-28, No. 8, August 1980, pp. 1107-1121.
- [21] W.R.Bennett, "Statistics of Regenerative Data Transmission", Bell Syst. Tech. J., November 1958, pp. 1501-1542.
- [22] B.R.Saltzberg, "Timing Recovery for Synchronous Binary Data Transmission", Bell Syst. Tech. J., March 1967, pp. 593-622.
- [23] R. D. Gitlin and J. Salz, "Timing Recovery in PAM Systems" Bell Syst. Tech. J., May 1971, Vol. 50, No. 5, pp. 1645-1669.
- [24] D.L.Lyon, "Timing Recovery in Synchronous Equalized Data Communication", *IEEE Transactions on Communications*, February 1975, pp. 269-274.
- [25] J.E. Mazo, "Optimum Timing Phase for an Infinite Equalizer", Bell Syst. Tech. J., January 1975, Vol. 54, No. 1, pp. 189-201.
- [26] J.E.Mazo, "Jitter Comparison of Tones Generated by Squaring and Fourth-Power Circuits", Bell Syst. Tech. J., Vol. 57, No. 5, May-June 1978, pp. 1489-1498.
- [27] W.K.Pratt, J.Kane, and H.C.Andrews, "Hadamard Transform Image Coding", Proceedings of the IEEE, Vol. 57, No. 1, January 1969.
- [28] T. Suzuki, H. Takatori, M. Ogawa, and K. Tomooka, "Line Equalizer for a Digital Subscriber Loop Employing Switched Capacitor Technology", *IEEE Transactions on Communications*, Vol. COM-30, No. 9, Sept. 1982, pp. 2074-2082.
- [29] O. Agazzi, C-P. J. Tzeng, D. G. Messerschmitt, and D. A. Hodges, "Timing Recovery in Digital Subscriber Loops", submitted to *IEEE Transactions on Communications*.
- [30] L. E. Franks and J. P. Bubrouski, "Statistical Properties of Timing Jitter in a PAM Timing Recovery Scheme", *IEEE Transactions on Communications*, Vol. COM-22, No. 7, July 1974, pp. 913-920.
- [31] D. L. Lyon, "Timing Recovery using Data-Derived Waveforms in QAM and SQAM Systems", International Conference on Communications, June 1975.
- [32] S.U.H.Qureshi and G.D.Forney, Jr. "Performance and Properties of a T/2 Equalizer" NTC '77
- [33] R.D.Gitlin and S.B.Weinstein, "Fractionally-Spaced Equalization: An Improved Digital Transversal Equalizer" *Bell Syst. Tech. J.*, February 1981, pp. 275-296.
- [34] R.D.Gitlin, H.C.Meadors, Jr., and S.B.Weinstein, "The Tap-Leakage Algorithm: An algorithm for the Stable Operation of a Digitally Implemented, Fractionally Spaced Adaptive Equalizer", Bell Syst. Tech. J., October 1982, pp. 1817-1839.
- [35] G. Ungerboeck, "Fractional Tap-Spacing Equalizer and Consequences for Clock Recovery in Data Modems", *IEEE Transactions on Communications*, Vol. COM-24, No. 8, August 1976, pp. 856-864.
- [36] D.L.Duttweiler, J.E.Mazo, and D.G.Messerschmitt, "An Upper Bound on the Error Probability in Decision-Feedback Equalization", *IEEE Transactions on Information Theory*, Vol. IT-20, No. 4, July 1974, pp. 490-497.
- [37] D.A.George, R.R.Bowen, and J.R.Storey, "An Adaptive Decision Feedback Equalizer", *IEEE Transactions on Communications*, Vol. COM-19, No. 3, June 1971, pp. 281-293.
- [38] J.Salz, "Optimum Mean-Square Decision Feedback Equalization", Bell Syst. Tech. J., Vol. 52, No.8, October 1973, pp. 1341-1373.
- [39] D.G.Messerschmitt, "Design of a Finite Impulse Response for the Viterbi Algorithm and Decision-Feedback Equalizer", *International Conference on Communications*, June 1974.
- [40] K. Buttle, G. Aasen, R. Colbeck, R. Gervais, P. Gillingham, H. Schafer, R. White, and D. B. Ribner, "A 160kb/s Full Duplex Echo Canceling Transceiver", ISSCC Digest of Technical Papers February 1985.
- [41] D. D. Falconer, "Timing Jitter Effects on Digital Subscriber Loop Echo Cancellers: Part I - Analysis of the Effect", submitted for publication.
- [42] D. D. Falconer, "Timing Jitter Effects on Digital Subscriber Loop Echo Cancellers: Part II - Considerations for Squaring Loop Timing Recovery", submitted for publication.
- [43] D. G. Messerschmitt, "LINEMOD A Transmission Line Modelling Program".
- [44] K. H. Mueller and M. Muller, "Timing Recovery in Digital Synchronous Data Receivers", *IEEE Transactions on Communications*, Vol. COM-24, No. 5, May 1976, pp. 516-531.
- [45] F. M. Gardner, "Phaselock Techniques", John Wiley, 1979.
- [46] A. Blanchard, "Phase-Locked Loops", John Wiley, 1976.
- [47] R. E. Best, "Phase-Locked Loops", McGraw Hill, 1984.
- [48] S.Barab and A.L.McBride, "Uniform Sampling Analysis of a Hybrid Phase-Locked Loop with a Sample-and-Hold Phase Detector", *IEEE Transactions on Aerospace and Electronic System*, Vol. AES-11, No. 2, March 1975, pp. 210-216.
- [49] H.Yamamoto and S.Mori, "Performance of a Binary Quantized All Digital Phase-Locked Loop with a New Class of Sequential Filter", *IEEE Transactions on Communications*, Vol. COM-26, No. 1, January 1978, pp. 35-45.
- [50] W. Sun, Master's Report, University of California, Berkeley.
- [51] P.A. Ruetz, "Computer Generation of Digital Filter Banks", Master's Report, University of California, Berkeley, March, 1984.
- [52] M. L. Honig and D.G. Messerschmitt, "Adaptive Filters: Structures, Algorithms, and Applications", Kluwer Academic Publishers, 1984.
- [53] M.S.Mueller and J.Salz, "A Unified Theory of Data-Aided Equalization" Bell Syst. Tech. J., Vol. 60, No.9, November 1981, pp. 2023-2038.
- [54] J.G. Proakis, "Digital Communications", McGraw-Hill, 1983.
- [55] Bell Telephone Laboratories. "Transmission Systems for Communications".
- [56] M.Ishikawa, T.Kimura, and N.Tamaki, "A CMOS Adaptive Line Equalizer", *IEEE Journal of Solid-State Circuits*, Vol. SC-19, No. 5, October 1984, pp.788-793.
- [57] D.G.Messerschmitt, "Frequency Detectors for PLL Acquisition in Timing and Carrier Recovery". *IEEE Transactions on Communications*, Vol. COM-27, No. 9, September 1979, pp. 1288-1295.
- [58] O.Agazzi and A.Adan. "System Level Computer Simulations of a Hybrid Digital Subscriber Loop". CENICE (The National Center for Electronic Component Research), Buenos Aires. Argentina, June 1982. (Internal Publication).
- [59] M. Fukuda, T. Tsuda, and K. Murano, "Digital Subscriber Loop Transmission Using Echo Canceller and Balancing Networks", ICC 85.