Low Energy RF Transceiver Design

Ben Walter Cook

Electrical Engineering and Computer Sciences
University of California at Berkeley

Technical Report No. UCB/EECS-2007-57
http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-57.html

May 16, 2007
Low Energy RF Transceiver Design

by

Ben Walter Cook

B.Eng. (Vanderbilt University) 2001

A dissertation submitted in partial satisfaction of the
requirements for the degree of

Doctor of Philosophy

in

Engineering – Electrical Engineering and Computer Sciences

in the

GRADUATE DIVISION

of the

UNIVERSITY OF CALIFORNIA, BERKELEY

Committee in charge:
Professor Kristofer S. J. Pister (chair)
Professor Ali M. Niknejad
Professor David L. Wessel

Spring 2007
Abstract

Low Energy RF Transceiver Design

by

Ben Walter Cook

Doctor of Philosophy in Engineering – Electrical Engineering and Computer Sciences
University of California, Berkeley
Professor Kristofer S. J. Pister, Chair

The average consumer has relied upon bidirectional RF communication for phone and internet connectivity for years. These devices are either plugged in to wall outlets or rely on large batteries that must be recharged frequently. A new generation of deeply embedded, short-range wireless applications is emerging, fueled by the extreme reductions in cost and power required for sensing and computation afforded by CMOS and MEMS process advancement. The power consumption of wireless communication links, on the other hand, has not scaled down so dramatically. Short range wireless protocols, such as Bluetooth and 802.15.4, have been developed to meet the communication needs of these applications and have already seen substantial commercial success. However, the excessive energy requirements of current commercially available radios, even those aimed at short range WPAN applications, limit the scope and inhibit the growth of the deeply embedded wireless market. A substantial reduction in energy consumption of short-range RF transceivers is necessary to make future pervasive computing applications feasible.
In this work, the energetic requirements of RF wireless communication are evaluated from both purely theoretical and practical standpoints, revealing a large gap between practically achievable energy efficiency and what is offered in today’s commercial market. In the context of minimizing energy per transferred data bit, each level of the physical design of wireless systems will be discussed – from choice of modulation scheme and bandwidth, down to transceiver architectures and low-level circuit designs. Finally, the implementation and measurement results from a 2.4GHz CMOS RF transceiver prototype are presented. Benefiting from energy conscious high-level system decisions and novel circuit architectures, the transceiver achieves a low energy consumption of 1 nJ per received bit and 3 nJ per transmitted bit with 92 dB of link margin.

Professor Kristofer S. J. Pister,
Dissertation Committee Chair
# Table of Contents

**Chapter 1**

1.1 Motivation
1.2 Research Goals
1.3 Thesis Organization

**Chapter 2**

2.1 Introduction
2.2 Shannon’s Theorem
2.3 Theoretical System Energy Limits

**Chapter 3**

3.1 Introduction
3.2 Power and Performance Tradeoffs
3.3 Reducing Overhead Power
3.4 Efficient PA’s with Low Power Output
3.5 Receiver Noise Factor and Passive Voltage Gain
3.6 $E_b/N_0$ and Modulation Scheme
3.7 Conclusion

**Chapter 4**

4.1 Introduction
4.2 System Specifications
4.3 Transceiver Architecture
4.4 Receiver Design
4.5 Transmitter
4.6 Results
4.7 Conclusion and Comparisons

**Chapter 5**

5.1 Research Summary
5.2 Passive Techniques in Future Radios
Chapter 1

Introduction

1.1 Motivation

The average consumer has relied upon bidirectional RF communication for phone and internet connectivity for years. These devices are either plugged in to wall outlets or rely on large batteries that must be recharged frequently. A new generation of deeply embedded, short-range wireless applications is emerging, fueled by the extreme reductions in cost and power required for sensing and computation afforded by CMOS and MEMS process advancement. The power consumption of wireless communication links, on the other hand, has not scaled down so dramatically. Short range wireless protocols, such as Bluetooth and 802.15.4, have been developed to meet the communication needs of these applications and have already seen substantial commercial success. However, the excessive energy requirements of current commercially available radios, even those aimed at short range WPAN applications, limit the scope and inhibit the growth of the deeply embedded wireless market. A substantial reduction in energy consumption of short-range RF transceivers is necessary to make the myriad future pervasive computing applications feasible.
1.2  Research Goals

Short range, ultra-low energy RF is a relatively new design space and, as such, is a rich area for IC research with great potential for energy reduction through design innovation. The goal of this work is to develop a thorough understanding of the unique challenges in designing efficient short range wireless systems and to demonstrate a low-energy prototype transceiver. The energy requirements of RF wireless communication are evaluated from both purely theoretical and practical standpoints, revealing a large gap between practically achievable energy efficiency and what is offered in today’s commercial market. In the context of minimizing energy per transferred data bit, each level of the physical design of wireless systems will be discussed – from choice of modulation scheme and bandwidth, down to transceiver architectures and low-level circuit designs. Given the freedom from compliance with any particular established wireless standard, this work takes a multilateral approach to system design that attempts to balance the results of communication theory with practical circuit implementation issues to minimize energy consumption of the composite wireless system.

1.3  Thesis Organization

Chapter 2 begins with a look at one of the most important results from information theory, Shannon’s celebrated Channel Capacity theorem, which lays out the fundamental requirements for successfully receiving data sent over a noise corrupted channel. Shannon’s theorem, combined with a basic treatment of the tradeoffs between transmit power, path loss, bandwidth and thermal noise, results in a fundamental lower bound on the energy required to transmit and receive a single bit of information, against which wireless systems may be judged. This discussion gives rise to an energy efficiency figure of merit that captures just how far from ideal a given system is.

Chapter 3 discusses general design techniques for reducing energy consumption of wireless systems when link margins and overall power budgets are small. Modulation schemes, entropy coding, transmitter and receiver architectures, as well as circuit designs for specific transceiver sub-blocks are among the topics treated.

In chapter 4, the design and implementation of a 2.4GHz transceiver prototype is discussed with regard to the general design techniques discussed in chapter 3. Theory is developed to quantify the behavior of the transceiver’s non-traditional architecture. Important theoretical results include: the power efficiency and noise performance of a tapped-capacitor LC resonant transformer input network, as well as the gain, noise figure and frequency dependent input impedance profile of a hard-switched CMOS passive mixer.

Finally, chapter 5 contains the measurement results from the prototype transceiver implemented in a 130nm RF CMOS process. The transceiver achieves a low energy consumption of 1nJ per received bit and 3nJ per transmitted bit with 92dB of link margin while operating from just 400mV DC so as to accommodate a single solar cell supply.
2.1 Introduction
This chapter explores the energetic requirements of RF wireless communication from both a theoretical and practical standpoint. The focus is on energy per transferred bit rather than continuous power consumption because it is more closely tied to the battery life of a wireless device. The discussion begins with a look at the fundamental lower limit on energy per received bit resulting from the channel capacity theorem set forth by Claude Shannon. Based on this lower bound, a metric for evaluating energy efficiency of practical RF systems will be derived. This metric conveniently isolates the impact of non-idealities of the transmitter, receiver, and modulation scheme, providing a framework for understanding why and by how much practical systems exceed fundamental energy bounds.

2.2 Shannon’s Theorem
Consider the task of properly detecting a signal with information rate $R$ (in bits per second), and with continuous power $P_0$. The energy per bit in the signal is simply:
\[ E_b = \frac{P_0}{R} \]  

(2.1)

In this section, Shannon’s channel capacity theorem will be used to determine the minimum value of \( E_b \) that will allow successful detection of the signal and relate this to other important system parameters. Shannon’s theorem (2.2) establishes an upper bound on \( R \) for communication over a noisy channel. This bound is called the maximum channel capacity \( C \) – in bits per second.

\[ C = B \log_2 (1 + \text{SNR}) \]  

(2.2)

\( B \) is the signal bandwidth and \( \text{SNR} \) is the ratio of signal power to noise power. Assuming the signal is corrupted by additive white Gaussian noise (AWGN), then (2.2) may be rewritten:

\[ C = B \log_2 \left(1 + \frac{P_0}{N_0 \cdot B}\right) = B \log_2 \left(1 + \frac{E_b}{N_0} \cdot \frac{R}{B}\right) \]  

(2.3)

\( N_0 \) is the noise power spectral density in Watts/Hz. \( P_0 \) is the signal power at the input of the receiver. If the channel is thermal noise limited, then \( N_0 \) is equal to the product \( kT \), where \( T \) is temperature and \( k \) is Boltzmann’s constant.

\[ N_0 = kT \]  

(2.4)

2.2.1 SNR-per-bit \((E_b/N_0)\) and Spectral Efficiency \((R/B)\)

To help clarify the implications of Shannon’s theorem on bandwidth and energy tradeoffs in a communication system, it is necessary to understand the two ratios inside the parentheses in (2.3), \( E_b/N_0 \) and \( R/B \). The ratio \( E_b/N_0 \) is referred to as the SNR-per-bit
and the ratio $R/B$ is a measure of spectral efficiency (in bps/Hz). Both quantities are important metrics for comparing digital modulation schemes. Generally speaking, if a modulation scheme has high spectral efficiency, it is likely to require larger energy per bit for successful reception. Shannon’s theorem establishes a fundamental performance boundary for communication systems based on the relationship between spectral efficiency and energy-per-bit.

It is important to distinguish between $\text{SNR}$ and $E_b/N_0$. $\text{SNR}$ is a ratio of signal power to noise power, while $E_b/N_0$ is a ratio of the energy per bit of the signal to the energy in the noise. The two quantities are related as follows:

$$\text{SNR} = \frac{E_b}{N_0} \cdot \frac{R}{B} \quad (2.5)$$

For the purposes of evaluating a given scheme’s energy per bit performance, $E_b/N_0$ is more meaningful than $\text{SNR}$ because it provides a way to directly compare the energy requirements of a modulation scheme without considering transmission rate or bandwidth.
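As a brief illustration of (2.5), the conversion between $\text{SNR}$ and $E_b/N_0$ can be sketched in a few lines of Python. The function names and the example rate and bandwidth are hypothetical and chosen only for illustration.

```python
import math


def ebn0_to_snr_db(ebn0_db, rate_bps, bandwidth_hz):
    """Apply (2.5) in decibels: SNR = (Eb/N0) * (R/B)."""
    return ebn0_db + 10.0 * math.log10(rate_bps / bandwidth_hz)


def snr_to_ebn0_db(snr_db, rate_bps, bandwidth_hz):
    """Invert (2.5): Eb/N0 = SNR / (R/B)."""
    return snr_db - 10.0 * math.log10(rate_bps / bandwidth_hz)


# Hypothetical link: 250 kbit/s sent in 2 MHz of bandwidth (R/B = 0.125 bps/Hz).
snr_db = ebn0_to_snr_db(10.0, 250e3, 2e6)
print(f"Eb/N0 = 10.0 dB corresponds to SNR = {snr_db:.2f} dB")
```

When the spectral efficiency is below 1 bps/Hz, as in this assumed example, the SNR is lower than $E_b/N_0$, which is why the two quantities should not be used interchangeably.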

2.2.2 Maximum Capacity and Minimum $E_b/N_0$

From (2.3), the capacity of a Gaussian channel increases logarithmically with signal power $P_0$. A cursory glance at (2.2) would suggest that $C$ increases linearly with $B$, but the capacity-bandwidth relationship is actually more subtle due to the dependence of $\text{SNR}$ on $B$. It turns out that $C$ increases monotonically with $B$, but only approaches an asymptotic value. Thus, for a given signal power $P_0$ and noise power density $N_0$, the channel capacity reaches its maximum value as $B$ approaches infinity.
\[
C_\infty = \lim_{B \to \infty} \left( B \log_2 \left( 1 + \frac{P_0}{N_0 \cdot B} \right) \right) = \frac{1}{\ln 2} \cdot \frac{P_0}{N_0} = 1.44 \cdot \frac{P_0}{N_0}
\]

(2.6)

Figure 1 and Figure 2 offer two different perspectives on Shannon’s theorem. In Figure 1, the channel capacity is plotted versus signal bandwidth while \( P_0 \) and \( N_0 \) are held constant and in Figure 2, the maximum spectral efficiency is plotted against \( E_b/N_0 \) (plots adapted from [1]).
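The saturation of capacity with bandwidth can be checked numerically. The following Python sketch evaluates (2.3) for $P_0/N_0 = 1$, the value used in Figure 1, at a few arbitrarily chosen bandwidths and compares the result with the limit $P_0/(N_0 \ln 2)$. The function names are hypothetical.

```python
import math


def capacity(p0_over_n0, bandwidth_hz):
    """Shannon capacity (2.3) in bits/s for a given P0/N0 (in Hz) and bandwidth."""
    return bandwidth_hz * math.log2(1.0 + p0_over_n0 / bandwidth_hz)


def capacity_limit(p0_over_n0):
    """Infinite-bandwidth capacity: P0 / (N0 * ln 2)."""
    return p0_over_n0 / math.log(2.0)


# P0/N0 = 1 as in Figure 1; the bandwidth values are arbitrary sample points.
for b in (0.1, 1.0, 10.0, 100.0, 1000.0):
    print(f"B = {b:7.1f}  C = {capacity(1.0, b):.4f}  limit = {capacity_limit(1.0):.4f}")
```

The printed capacity rises quickly at small bandwidths and then flattens, so most of the available capacity is already reached once $B$ is several times $P_0/N_0$.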

For a given signal power \( P_0 \), \( E_b/N_0 \) is minimized by maximizing the information rate \( R \). Recalling that \( P_0 \) in (2.6) may also be expressed as \( E_b R \), the minimum achievable \( E_b/N_0 \) follows by setting \( R \) equal to \( C_\infty \).

\[
\left( \frac{E_b}{N_0} \right)_{\text{min}} = \frac{P_0}{N_0 \cdot C_\infty} = \ln 2 = -1.6 \text{ dB}
\]

(2.7)

Figure 1. Maximum achievable channel capacity as a function of bandwidth with constant \( P_0/N_0 = 1 \). \( C_{\text{max}} = 1.44 \cdot P_0/N_0 \)
This powerful result implies that error-free communication can be achieved so long as the noise power density is no more than 1.6 dB greater than the energy per bit in the signal. In a thermal noise limited channel (i.e. $N_0 = kT$), the lower limit for Minimum Detectable Signal energy per bit ($E_{b-MDS}$) at the receiver input becomes:

$$\min\{E_{b-MDS}\} = kT \cdot \ln 2 \approx 3 \cdot 10^{-21} \text{ Joules} \text{ per bit} \quad (2.8)$$

Unfortunately, the theorem does not describe any modulation scheme that reaches the limit, and most popular schemes require far greater $E_b/N_0$ than -1.6 dB. For a given modulation scheme (i.e. binary-PSK, OOK, etc.), the spectral efficiency $R/B$ and minimum $E_b/N_0$ required for demodulation, call it $(E_b/N_0)_{\text{min}}$, are fixed values, independent of transmission rate. However, the $R/B$ and $(E_b/N_0)_{\text{min}}$ values of the system can be changed by incorporating coding. Coding techniques and the energy requirements of various modulation schemes will be discussed briefly in chapter 3.
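The tradeoff between spectral efficiency and energy per bit can be made concrete with a short calculation. Setting $R = C$ in (2.3) and solving for $E_b/N_0$ gives $E_b/N_0 \geq (2^{R/B} - 1)/(R/B)$, which tends to $\ln 2$ as $R/B$ approaches zero. The Python sketch below tabulates this bound and evaluates $kT \ln 2$. The function names are hypothetical and $T = 300$ K is an assumed room temperature.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def min_ebn0(spectral_efficiency):
    """Smallest linear Eb/N0 that supports R/B bps/Hz, from (2.3) with R = C."""
    return (2.0 ** spectral_efficiency - 1.0) / spectral_efficiency


def to_db(x):
    """Convert a linear power or energy ratio to decibels."""
    return 10.0 * math.log10(x)


# Arbitrary sample points along the Shannon boundary.
for r_over_b in (0.001, 0.1, 1.0, 2.0, 4.0, 8.0):
    print(f"R/B = {r_over_b:6.3f} bps/Hz  ->  Eb/N0 >= {to_db(min_ebn0(r_over_b)):6.2f} dB")

# Thermal limit on received energy per bit; T = 300 K is an assumption.
print(f"kT ln2 = {BOLTZMANN * 300.0 * math.log(2.0):.2e} J/bit")
```

The bound grows rapidly with $R/B$, which is consistent with the observation that spectrally efficient modulation schemes tend to require more energy per bit.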
2.3 Theoretical System Energy Limits

To this point, the discussion has been limited to the energy per bit at the input of a receiver. The goal now is to use these results to find a lower bound on energy consumed by the system (including receiver and transmitter) per bit \( E_{b\text{-Sys}} \):

\[
E_{b\text{-Sys}} = \frac{P_{TX} + P_{RX}}{R} \quad (2.9)
\]

\( P_{TX} \) and \( P_{RX} \) are the power consumed by the transmitter and receiver, respectively. In the best possible case, with a 100% efficient transmitter and zero power receiver, all the energy consumed by the system would go into the transmitted signal. Therefore, the fundamental lower bounds on \( E_{b\text{-Sys}} \) and transmitted energy per bit \( E_{b\text{-TX}} \) are the same.

\[
\min \{ E_{b\text{-Sys}} \} = \min \{ E_{b\text{-TX}} \} \quad (2.10)
\]

To find the lower bound on \( E_{b\text{-Sys}} \), the minimum transmitted energy per bit \( E_{b\text{-TX}} \) must be considered. \( E_{b\text{-TX}} \) must exceed \( E_{b\text{-MDS}} \) to compensate for attenuation of the signal as it propagates from transmitter to receiver, or path loss. Path loss for a given link is a function of the link distance, the frequency of the signal, the environment through which the signal is propagating, and other variables. Accurate modeling of path loss is beyond the scope of this chapter, but a review of some popular models is presented in [2, 3]. The ratio by which \( E_{b\text{-TX}} \) exceeds \( E_{b\text{-MDS}} \) is known as link margin \( (M) \) and is usually expressed in dB.

\[
M = \frac{\text{TX Output Power}}{\text{RX Minimum Signal Power}} = \frac{E_{b\text{-TX}}}{E_{b\text{-MDS}}} \quad (2.11)
\]
$E_{b-MDS}$, the minimum detectable signal energy per bit, is determined by the thermal noise floor $kT$, receiver noise factor $F$, and the SNR per bit $(E_b/N_0)$ required for demodulation.

$$E_{b-MDS} = kT \cdot F \cdot \left( \frac{E_b}{N_0} \right) \quad (2.12)$$

Or, alternatively, by incorporating the signal bandwidth, the minimum detectable signal power $P_{MDS}$ is expressed below.

$$P_{MDS} = kT \cdot B \cdot F \cdot \left( \frac{E_b}{N_0} \right) \cdot \left( \frac{R}{B} \right) \quad (2.13)$$
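A minimal Python sketch of (2.13) follows, expressed in the dB units commonly used for receiver sensitivity. The function name, the default temperature of 290 K, and the example receiver parameters are assumptions made for illustration. The noise figure passed in is $10 \log_{10} F$.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def sensitivity_dbm(rate_bps, noise_figure_db, ebn0_db, temp_kelvin=290.0):
    """Minimum detectable signal power per (2.13), in dBm.

    The bandwidth B cancels in (2.13), leaving kT * F * (Eb/N0) * R.
    """
    kt_dbm_per_hz = 10.0 * math.log10(BOLTZMANN * temp_kelvin / 1e-3)
    return kt_dbm_per_hz + noise_figure_db + ebn0_db + 10.0 * math.log10(rate_bps)


# Hypothetical receiver: 1 Mbit/s, 10 dB noise figure, 12 dB required Eb/N0.
print(f"P_MDS = {sensitivity_dbm(1e6, 10.0, 12.0):.1f} dBm")
```

Because $B$ cancels, the sensitivity in this model depends on the data rate rather than on the occupied bandwidth; doubling the rate costs about 3 dB.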

For a reliable link, the system must have more link margin than path loss. In a thermal noise limited channel, the fundamental lower bound on $E_{b-TX}$, and thus $E_{b-Sys}$, required to achieve a link margin $M$ is:

$$\min \{ E_{b-Sys} \} = \min \{ E_{b-TX} \} = M \cdot kT \cdot \ln 2 \quad (2.14)$$

To achieve link margin $M$ while only consuming $M \cdot kT \cdot \ln 2$ Joules per bit, a system must meet the following criteria:

1. The receiver adds no noise
2. The modulation scheme achieves the Shannon limit of -1.6dB for $E_b/N_0$
3. The transmitter is 100% efficient
4. The receiver consumes zero energy per bit

Clearly, such a system is impossible to design. In real systems, especially low power systems, transmitters are far from 100% efficient, the modulation scheme requires more
than the limit, and the receivers are noisy and may consume a large portion of the
total system energy. It is not uncommon for a system, especially a low-energy system,
to consume 10,000 times more energy per bit than this lower limit. For instance, radios
targeting sensor network applications (including this work) have reported link margin of
88-120dB[4-15], resulting in a theoretical minimum energy per bit of 1.9-3000 pJ, but
the actual energy consumed by these systems per bit ranges from about 4.4-1320 nJ.
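The lower bound (2.14) is simple to evaluate. The Python sketch below computes it for the two ends of the 88-120 dB link margin range, assuming $T = 300$ K, and lands close to the 1.9-3000 pJ range quoted above. The last lines compare the bound with a total of 4 nJ per bit at 92 dB of link margin, numbers loosely based on the prototype figures in the abstract (1 nJ received plus 3 nJ transmitted) and used here only as an illustrative assumption.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def min_energy_per_bit(link_margin_db, temp_kelvin=300.0):
    """Lower bound (2.14) on system energy per bit, in joules."""
    margin = 10.0 ** (link_margin_db / 10.0)
    return margin * BOLTZMANN * temp_kelvin * math.log(2.0)


for margin_db in (88.0, 120.0):
    bound_pj = min_energy_per_bit(margin_db) * 1e12
    print(f"M = {margin_db:5.1f} dB  ->  at least {bound_pj:.1f} pJ per bit")

# Illustrative comparison: 4 nJ per bit spent at 92 dB of link margin.
excess = 4e-9 / min_energy_per_bit(92.0)
print(f"energy spent is roughly {excess:.0f} times the lower bound")
```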

2.3.1 Evaluating System Energy Efficiency

Since the lower bound on $E_{b-Sys}$ scales with $M$, and $M$ may vary over several orders of
magnitude from system to system, a simple comparison of $E_{b-Sys}$ is not really fair. To
make a fair comparison, an energy efficiency figure of merit ($\eta$) for communication
systems is defined below (2.15). In an ideal system, $\eta$ is equal to 1.

$$\eta = \frac{\text{ideal energy/bit}}{\text{actual energy/bit}} = \frac{M \cdot kT \cdot \ln 2}{E_{b-Sys}}$$

(2.15)

To this point, several factors contributing to low energy efficiency in wireless systems
have been discussed. The goal now is to capture the relative impact of said factors by
incorporating them into an expression for $\eta$. The first step is to express link margin in
terms of other parameters.

$$M = \frac{E_{b-TX}}{E_{b-MDS}} = \frac{E_{b-TX}}{F \cdot kT \cdot (E_b/N_0)_{\text{min}}}$$

(2.16)

$(E_b/N_0)_{\text{min}}$ is the minimum SNR-per-bit required for demodulation and $F$ is called the
receiver noise factor. $F$ is a non-ideality factor ($F \geq 1$) characterizing the noise
performance of a receiver and is discussed in greater detail in chapter 3. In the ideal
case, \( F = 1 \) and, as shown in (2.7), \( E_b/N_0 = \ln 2 \). Using equation (2.16), \( \eta \) can now be expressed in a much more intuitive form.

\[
\eta = \left( \frac{E_{b-TX}}{E_{b-Sys}} \right) \cdot \frac{1}{F} \cdot \left( \frac{\ln 2}{E_b/N_0} \right)
\]  

(2.17)

Each of the three terms in (2.17) may assume values from 0 to 1 and has an ideal value of 1. The first term quantifies the portion of the total energy consumed by the overall system that is radiated as RF signal energy in the transmitter. The second term describes how much the link margin is degraded due to noise added by the receiver. The third term quantifies the non-ideality of the system’s modulation/demodulation strategy as compared to the minimum achievable \( E_b/N_0 \) from (2.7).
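The decomposition in (2.17) lends itself to a short numerical sketch. The Python function below returns the three terms and their product. Its name, its dB-valued arguments, and the example system are hypothetical and are not measurements from this work.

```python
import math


def efficiency(e_b_tx, e_b_sys, noise_figure_db, ebn0_required_db):
    """Return the three terms of (2.17) and their product eta."""
    tx_term = e_b_tx / e_b_sys                                    # radiated share of energy
    noise_term = 1.0 / 10.0 ** (noise_figure_db / 10.0)           # 1 / F
    modulation_term = math.log(2.0) / 10.0 ** (ebn0_required_db / 10.0)
    eta = tx_term * noise_term * modulation_term
    return tx_term, noise_term, modulation_term, eta


# Hypothetical system: 10% of the energy radiated, 10 dB noise figure,
# and 12 dB of Eb/N0 required by the demodulator.
tx_term, noise_term, modulation_term, eta = efficiency(0.1e-9, 1e-9, 10.0, 12.0)
print(f"terms: {tx_term:.3f}, {noise_term:.3f}, {modulation_term:.4f}")
print(f"eta = {eta:.2e} ({10.0 * math.log10(eta):.1f} dB)")
```

Expressing $\eta$ in dB turns the product into a sum, which makes it easy to see which of the three terms dominates the loss for a given design.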

Wireless systems with very high output power are often able to achieve much higher values for \( \eta \) because their increased power budget allows more power to be spent in the receiver, resulting in a lower noise factor and, through coding and coherent demodulation, a reduced \( (E_b/N_0) \), without reducing the overall system efficiency dramatically. For this reason, it is most useful to compare \( \eta \) for systems with similar values of \( E_{b-Sys} \).

Equation (2.17) provides a good starting point for further exploration of low energy system design, but it is not a perfect metric and there are a few caveats attached with its use. First of all, dynamic effects, such as the “startup energy” spent as the voltage regulators stabilize and the transceiver tunes to the proper frequency, have not been considered. Nor has the impact of network synchronization or the overhead bits due to training sequences, packet addressing, encryption, etc. Rather than attempt to capture all the initialization effects that lead to radios being on with no useful data flowing, it has
been assumed that the transmitter and receiver are already time synchronized and their
typical data payload per transmission is large enough that startup energy is negligible.

The following chapter will discuss the design of low-energy wireless communication
systems and techniques that can improve $\eta$, such as proper choice of modulation
strategy, error control coding, and low power overhead transmitter and receiver
architectures.
Chapter 3

Low Energy Transceiver Design

3.1 Introduction

The energy efficiency (\(\eta\)) of an RF transceiver, relative to fundamental limits, can be expressed as a product of three terms:

\[
\eta = \left( \frac{E_{b-TX}}{E_{b-Sys}} \right) \cdot \frac{1}{F} \cdot \left( \frac{\ln 2}{E_b/N_0} \right)
\]  

(3.1)

1. \(E_{b-TX}/E_{b-Sys}\) is the proportion of total energy consumed by the system (transmitter and receiver) that is converted directly to RF transmit power.

2. \(1/F\) is the ratio of the fundamental thermal noise floor \(kT\) to the total input referred noise of the receiver. This ratio translates directly to an increase in required transmission power to maintain a given link margin.

3. \(\ln 2 / (E_b/N_0)\) is the ratio of the ideal minimum SNR per bit from Shannon’s theorem to the actual minimum SNR per bit required by the system.

This chapter examines the impact of design choices, such as modulation scheme, transceiver architecture, and circuit topology, on the overall system efficiency. The task
of energy minimization is complicated by the fact that such design choices are not independent, but rather deeply interwoven; choosing a modulation scheme to increase term 3 will likely cause a decrease in term 1; designing an extremely low noise receiver to maximize term 2 will also reduce term 1, and so on.

It is worth noting that, in most practical cases, RF designers are not free to make system level decisions because their design must comply with a particular commercial protocol or be compatible with previous generations of a product. In such cases, most of the techniques discussed here will still be useful for energy reduction. However, approaching any fundamental energy limits will require a multilateral approach, wherein energy conscious decisions are made at every level of the system design.

3.2 Power and Performance Tradeoffs

This section discusses general transceiver performance tradeoffs and develops a first-order performance versus power model that will be helpful for maximizing a system link margin for a given power budget by distributing power between the receiver and transmitter optimally.
A simplified block diagram of a low-IF or direct conversion RF transceiver is shown in Figure 3, including only the most relevant circuit blocks. The basic functions of the transmitter are: generate a stable RF signal, modulate the frequency, phase and/or amplitude of the RF signal according to information to be transmitted, and drive the modulated signal onto the antenna with a PA. The receiver functions can be summarized as: low-noise, linear amplification, selection of communication channel, and demodulation. The low noise amplifier (LNA) boosts the incoming signal amplitude to overcome the noise of subsequent stages while adding as little of its own noise and distortion as possible.

3.2.1 Transmitter

To maximize the first term in (3.1), the largest possible proportion of the system’s power budget should be dedicated to the PA generating the RF output power because this directly increases link margin. However, there are several circuit blocks necessary to generate the stable RF signal internally before it can be transmitted. Though these circuits are required for functionality, their power consumption constitutes overhead in a sense because it does not contribute directly to the system’s link margin.

The modulation scheme and transmitter architecture have a major impact on the overhead power consumed by all the non-PA blocks, such as the upconversion mixers, DAC’s, baseband filters, VCO, dividers, charge pumps, buffers, etc. In certain circumstances, architecture and modulation choices can actually allow the designer to eliminate many of these blocks, reducing overhead substantially.

To first-order, the RF and Baseband overhead power is actually independent of transmitter output power; once the RF signal is generated internally, the output power can be chosen independently. Efficient transmitter designs will spend proportionally small amounts of energy generating and modulating the RF signal, with the greatest share of energy consumed by the PA. Thus, it is easier to make an efficient transmitter when the desired output power is relatively high.

Figure 4 is a first order model of transmitter output power versus power consumption that represents these tradeoffs (adapted from [16]). $P_{OH-TX}$ represents the power consumption of all non-PA blocks in the transmitter. PA efficiency ($e_{PA}$) is assumed to be constant versus power.
Figure 4. Typical power breakdown in an RF transmitter. The majority of the transmitter's power is often consumed generating and modulating the RF signal internally.

\[
P_{\text{DC-TX}} = P_{\text{OH-TX}} + P_{\text{PA}} = P_{\text{OH-TX}} + \frac{1}{e_{\text{PA}}} P_{\text{OUT}}
\]  

(3.2)

The assumption of a constant efficiency PA may first seem like a gross simplification because it is quite difficult to design a single PA that maintains constant efficiency for a wide range of output power. However, it is not difficult to design a PA that operates efficiently in a small neighborhood around one appropriately chosen power output point. The model does not assume that one system will be able to reach all points on this curve with similar efficiency, but rather that upon choosing one power output point, an efficient PA can be designed. PA’s with efficiencies greater than 40% have been demonstrated with output power from 100µW to 10mW and beyond [4, 10, 11, 13, 17].
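The first-order model (3.2) can be sketched in Python to show how overhead power limits overall transmitter efficiency at low output power. The function names, the 1 mW overhead, and the 40% PA efficiency are assumed values for illustration only.

```python
def tx_dc_power(p_out_w, p_overhead_w, pa_efficiency):
    """Total transmitter DC power per (3.2): overhead plus PA consumption."""
    return p_overhead_w + p_out_w / pa_efficiency


def tx_efficiency(p_out_w, p_overhead_w, pa_efficiency):
    """Fraction of transmitter DC power that leaves as RF output power."""
    return p_out_w / tx_dc_power(p_out_w, p_overhead_w, pa_efficiency)


# Hypothetical transmitter: 1 mW of overhead and a 40% efficient PA.
for p_out in (100e-6, 1e-3, 10e-3):
    total_mw = tx_dc_power(p_out, 1e-3, 0.4) * 1e3
    eff = tx_efficiency(p_out, 1e-3, 0.4)
    print(f"P_out = {p_out * 1e3:5.2f} mW  P_DC = {total_mw:6.2f} mW  efficiency = {eff:.1%}")
```

In this model the overall efficiency approaches the PA efficiency only when the PA consumption is much larger than the overhead, which is the reason the text gives for efficient transmitters being easier to build at high output power.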

PA design, as well as the impact of modulation scheme, transmitter architecture, and circuit topology will be discussed later in this chapter.

3.2.2 Receiver

Receiver performance impacts the system energy efficiency in two ways. First, the power consumed by the receiver reduces the first term in (3.1). Secondly, the receiver
adds its own internally generated noise to the signal as it passes through each stage, resulting in the system noise factor always being greater than one. Noise factor \((F)\) is defined as the ratio of the SNR at the receiver input to the SNR at the output. \(F\) is the factor by which link margin is degraded by the receiver’s own internal noise generators. To maintain a given link margin, an increase in \(F\) must be compensated by an equivalent increase in transmitted power.

In the absence of an input signal, \(F\) can be expressed as the ratio of the system’s total output noise to the output noise due to the source resistance (i.e. the antenna). Referring to stage \(S1\) with voltage gain \(A_v\) in Figure 5 (left), the squared voltage noise at the output is the sum of the source noise times \(|A_v|^2\) and the noise added by \(S1\). Thus, \(F\) can be expressed:

\[
F_{S1} = \frac{SNR_{in}}{SNR_{out}} = \frac{V_{n,S1}^2 + V_{n,src}^2 \cdot |A_v|^2}{V_{n,src}^2 \cdot |A_v|^2}
\]

(3.3)


Figure 5. Left: Noise factor calculation for voltage amplifying stage s1 with source noise due to Rs. Right: Input-referred representation of s1.
Without loss of generality, noise contributions are added using voltage gains and squared voltage noise rather than power gain and noise power. Summing noise voltage is more convenient when the impedances between stages within the receiver are not specified, which is typically the case in integrated transceivers. Rms voltage noise is used here because the noise sources of each stage are assumed to be uncorrelated. Alternatively, the noise added by $S1$ can be represented with an equivalent input voltage source that produces the same total output noise (Figure 5, right).

$$V_{ni,S1}^2 = \frac{V_{n,S1}^2}{|A_v|^2} \quad (3.4)$$

$V_{ni,S1}^2$ is called the input referred noise voltage of $S1$. Referring noise to the input is useful for determining minimum detectable signal levels because it gives a direct measure of how large an input signal must be to overcome the noise contributed by the system and source resistance. From (3.4), the noise factor of $S1$ can be expressed in terms of its input referred voltage noise.

$$F_{S1} = \frac{SNR_{in}}{SNR_{out}} = \frac{V_{ni,S1}^2 + V_{n,src}^2}{V_{n,src}^2} = 1 + \frac{V_{ni,S1}^2}{V_{n,src}^2} \quad (3.5)$$

A receiver is a cascade of stages, each having a different voltage gain and noise contribution (Figure 6). Each stage amplifies the signal and noise at its input and adds its own noise. In general, the noise added by each stage is uncorrelated with the signal at its input. If $A_{vk}$ and $V_{n,k}^2$ represent the voltage gain and output noise of the kth stage, respectively, then the noise factor of the cascaded system can be expressed as in (3.6).
The impact of noise added by a given stage is reduced by the square of the total voltage gain preceding it (3.6). Typically, the first active stage in a receiver is a low-noise amplifier (LNA) achieving 15-20dB of voltage gain. Thus, the following stages can have much greater input referred noise than the LNA and still have only a minor effect on the cascaded system noise factor. The voltage gain in the LNA relaxes the noise requirements of the stages that follow and the system noise factor is often dominated by the LNA.
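The cascade behavior described here can be illustrated with a small Python sketch. It is not a transcription of the cascade expression itself; it applies the input-referral step of (3.4) stage by stage, assuming uncorrelated noise sources and voltage gains that already account for inter-stage loading. The function name, the stage list format, and the numerical values (normalized to the source noise) are hypothetical.

```python
def cascaded_noise_factor(v_n_src_sq, stages):
    """Noise factor of a cascade of voltage-gain stages.

    stages: list of (voltage_gain, output_noise_sq) tuples in signal order,
    where output_noise_sq is the mean-square noise a stage adds at its own output.
    """
    gain_sq = 1.0               # squared voltage gain from the input to the current stage output
    input_referred_sq = 0.0     # running sum of input-referred noise
    for voltage_gain, output_noise_sq in stages:
        gain_sq *= abs(voltage_gain) ** 2
        input_referred_sq += output_noise_sq / gain_sq
    return 1.0 + input_referred_sq / v_n_src_sq


# Hypothetical receiver, noise normalized so the source contributes 1.0:
# an LNA with 20 dB voltage gain (x10) followed by a unity-gain mixer.
lna = (10.0, 100.0)    # adds as much input-referred noise as the source
mixer = (1.0, 50.0)    # 50x the source noise when referred to its own input
print("LNA alone:  F =", cascaded_noise_factor(1.0, [lna]))
print("LNA + mixer: F =", cascaded_noise_factor(1.0, [lna, mixer]))
```

In this assumed example the mixer is fifty times noisier than the source at its own input, yet it raises the cascaded noise factor only from 2.0 to 2.5 because its noise is divided by the squared LNA gain.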
The total LNA output noise is largely determined by the input transconductor, consisting of one or more transistors biased for small signal amplification. The input referred noise of a CMOS transconductor can be related to its current consumption directly:

\[
\frac{V_n^2}{\Delta f} = \frac{4kT \cdot \gamma}{g_m} = \frac{2 \cdot kT \cdot \gamma \cdot V_{dsat}}{I_d}
\]

(3.7)

\(V_{dsat}\) is called the saturation voltage. The right side of (3.7) follows from \(g_m = 2 I_d / V_{dsat}\) and holds for \(V_{dsat} \geq 100\text{mV}\).

Though (3.7) just represents the input noise of a single MOS transistor, it is an accurate estimate for many common RF LNA topologies, such as the common-source and the cascode. The noise of mixers, low-frequency filters, and other stages following the LNA will generally be inversely related to current consumption through a similar relation.

The approximate noise factor of a simple CMOS LNA is expressed below, where \(R\) is the source impedance (typically 50Ω) and \(P_{LNA} = V_{dd} \cdot I_d\) is the power consumed by the LNA.

\[
F_{LNA} = 1 + \frac{\gamma}{g_m \cdot R} = 1 + \frac{\gamma \cdot V_{dsat}}{2 \cdot I_d \cdot R} = 1 + \frac{\alpha}{P_{LNA}}
\]

(3.8)

\[
\alpha = \frac{\gamma \cdot V_{dd} \cdot V_{dsat}}{2 \cdot R}
\]

(3.9)
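As a rough numerical illustration of (3.8) and (3.9), the sketch below evaluates the noise factor over a range of LNA power. The parameter values (\(\gamma = 2/3\), \(V_{dd} = 1\text{V}\), \(V_{dsat} = 200\text{mV}\), \(R = 50\Omega\)) are assumptions chosen only for illustration and do not come from the text.

```python
import math


def alpha(gamma, v_dd, v_dsat, r_source):
    """Constant from (3.9), in watts."""
    return gamma * v_dd * v_dsat / (2.0 * r_source)


def lna_noise_factor(p_lna, a):
    """Noise factor from (3.8) for LNA power p_lna (watts)."""
    return 1.0 + a / p_lna


def to_db(x):
    return 10.0 * math.log10(x)


# Assumed illustrative values, not taken from the text.
a = alpha(gamma=2.0 / 3.0, v_dd=1.0, v_dsat=0.2, r_source=50.0)

for p_lna in (0.1e-3, 0.3e-3, 1e-3, 3e-3, 10e-3):
    f = lna_noise_factor(p_lna, a)
    print(f"P_LNA = {p_lna * 1e3:5.1f} mW  F = {f:6.2f}  ({to_db(f):5.2f} dB)")
```

When \(P_{LNA} = \alpha\) the noise factor is exactly 2 (about 3dB); beyond that point, additional LNA power buys progressively less improvement.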

For the purpose of understanding the general power and performance tradeoffs in the receiver, the cascaded noise factor can be approximated by the LNA noise factor. In cases where the LNA does not have substantial voltage gain or if later stages are very noisy, the system noise factor may be dominated by other blocks, but the general shape of the performance versus power curve for the complete system will still hold. Based on these assumptions, the first order receiver performance model is shown in Figure 7 (adapted from [16]).
Just as in the case of the transmitter, there are several circuit blocks required for functionality that do not directly improve the noise performance of the system. These are the blocks responsible for frequency translation, channel selection, and demodulation. In a low-IF or direct conversion architecture, channel selection and demodulation are accomplished with a VCO, mixers, low frequency filters, and other circuits. The power consumed in these blocks is referred to as the receiver overhead power $P_{OH-RX}$, as denoted in Figure 7.

### 3.2.3 Power Distribution between PA and LNA

The simplified models for transceiver performance versus power consumption can help gain intuition about how power should be distributed between the transmitter and receiver to achieve maximum link margin. The first step is to express total link margin
in terms of the power consumed by the system. Combining equations (2.13) and (3.8), the minimum detectable signal in the receiver is expressed below.

\[ P_{MDS} = kT \cdot B \cdot \text{SNR} \cdot F \equiv \beta \cdot \left( 1 + \frac{\alpha}{P_{LNA}} \right) \]  

(3.10)

The power output from the PA is simply:

\[ P_{OUT} = e_{PA} \cdot P_{PA} \]  

(3.11)

Therefore, the link margin follows.

\[ M = \frac{P_{OUT}}{P_{MDS}} = \frac{e_{PA} \cdot P_{PA}}{\beta \cdot \left( 1 + \frac{\alpha}{P_{LNA}} \right)} \]  

(3.12)

Assuming \( P_{SUM} \) is the total power available to split between the PA and LNA, (3.12) can be optimized with the aid of a simple substitution.

\[ P_{PA} = P_{SUM} - P_{LNA} \]  

(3.13)

Substituting (3.13) into (3.12), the link margin can now be maximized in terms of \( P_{LNA} \) assuming a fixed value for \( P_{SUM} \).

\[ \max \{ M \}\bigg|_{P_{SUM}=C} \Rightarrow \frac{dM}{dP_{LNA}} = 0 \quad \Rightarrow \quad P_{PA} = P_{LNA} \cdot \left( 1 + \frac{P_{LNA}}{\alpha} \right) \]  

(3.14)

This result implies that, when the power available for the PA and LNA is small compared with \( \alpha \) (so that \( P_{LNA} \ll \alpha \)), the power should be split evenly between them to maximize link margin. However, as available power grows, a larger proportion should be burned in the PA. This is an intuitive result because, as LNA power increases, the rate of improvement in noise factor approaches zero whereas power output can always be increased with an increase in PA power.
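The optimum in (3.14) can be checked numerically. Combining (3.13) and (3.14) gives \( P_{SUM} = 2P_{LNA} + P_{LNA}^2/\alpha \), whose positive root is used in the sketch below. The values of \( \alpha \), \( e_{PA} \), \( \beta \), and \( P_{SUM} \) are arbitrary assumptions for illustration.

```python
import math


def link_margin(p_lna, p_sum, alpha, e_pa=1.0, beta=1.0):
    """Link margin from (3.12) with P_PA = P_SUM - P_LNA (3.13)."""
    p_pa = p_sum - p_lna
    return e_pa * p_pa / (beta * (1.0 + alpha / p_lna))


def optimal_lna_power(p_sum, alpha):
    """Positive root of P_LNA**2 + 2*alpha*P_LNA - alpha*P_SUM = 0."""
    return -alpha + math.sqrt(alpha * (alpha + p_sum))


alpha_w = 1e-3  # assumed value of alpha, in watts

for p_sum in (1e-6, 1e-4, 3e-3, 1e-1):
    p_lna = optimal_lna_power(p_sum, alpha_w)
    p_pa = p_sum - p_lna
    print(f"P_SUM = {p_sum:8.1e} W  LNA share = {p_lna / p_sum:5.3f}  "
          f"PA share = {p_pa / p_sum:5.3f}")
```

The LNA share starts near one half at very low total power and shrinks as the total grows, matching the trend described above.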

## 3.3 Reducing Overhead Power

The overhead power in the transmitter and receiver is strongly dependent on the circuit topology chosen to implement each block as well as the amplitude and phase precision required of the modulation scheme. In fact, with very simple modulation schemes, such as OOK and 2-FSK, it is possible to relax hardware performance requirements or even eliminate circuit blocks altogether.

### 3.3.1 Low Overhead Modulation Schemes

When choosing a modulation scheme for low energy operation, \((E_b/N_0)\) does not tell the complete story. Even if \((E_b/N_0)\) is low, the overall system can still be inefficient if the power needed to generate, modulate, and demodulate the signal is comparable to or larger than the transmitted power. For applications requiring relatively small link margin (i.e. low transmit power), such as WPAN and sensor networks, it becomes particularly important to choose a modulation scheme that requires little power to implement so that the system may remain efficient even with low power output. An ideal modulation scheme would maximize link margin or capacity for a given signal power (i.e. smallest \((E_b/N_0)_{\text{min}}\)) without requiring complex, high-power circuits.
In contrast to the highly complex signal constellations used in QAM and PAM, FSK and OOK have a common trait that only one nonzero signal amplitude must be generated. This has important consequences for system efficiency. First of all, the PA can be implemented with a nonlinear amplifier producing only a single output amplitude – making much higher efficiency possible [18, 19]. Secondly, since information is not carried in the amplitude of the signal, the receive chain need not remain linear after channel selection, so demodulation can be accomplished with a 1-bit quantized waveform and simple logic circuits, rather than with an ADC and DSP. Finally, with FSK (and some forms of PSK) it is possible to generate the necessary frequency shifts by directly modulating the frequency of the VCO, thereby eliminating the transmit mixer and saving power. A minimal block diagram of a 2-FSK transceiver is shown in Figure 8.

The potential power savings of direct VCO modulation depend strongly on the phase accuracy required of the transmitter. If moderate frequency or phase errors are tolerable, the VCO can simply be tuned directly to the channel with a digital FLL and modulated open-loop [11], resulting in an extremely simple, low power implementation. For phase-error intolerant specs such as GSM, a variant of direct VCO modulation known as the 2-point method is often used. In the simplest version of the 2-point method, a continuous time fractional-N PLL with relatively low bandwidth attempts to hold the VCO frequency steady while an external input modulates the VCO frequency. A high precision DAC feeds forward a signal to cancel the “error” perceived by the PLL due to the modulation [20]. Though the 2-point method eliminates the need for a transmit mixer, the power consumed by the DAC and PLL curtails the potential power savings. This method has been verified for 802.15.4 [9], Bluetooth [14], GSM [17] and other standards.

From a hardware standpoint, the modulation schemes with lowest overhead are OOK and 2-FSK (with large frequency deviations) because they require only a single non-zero signal amplitude and are tolerant of moderate phase/frequency errors. These relaxed specifications permit a simple low power system architecture with no upconversion mixer in the transmitter nor ADC in the receiver so that a larger proportion of the overall power can be spent in the PA and LNA to maximize link margin. However, even the most barebones low-IF or direct conversion implementations will still require an RF VCO to operate. Thus, *in the limit of system simplicity, overhead power is VCO power.*

### 3.3.2 Overhead Power in the VCO

A VCO is an autonomous circuit using positive feedback or negative resistance to create periodic oscillation at one frequency; that frequency is set by an RC, RL, or resonant LC network. The vast majority of VCO’s designed for communication systems use a
parallel LC resonator (or LC tank) to select the frequency of oscillation because of its potential for superior noise performance. The power requirements and noise performance of an LC VCO are largely determined by the impedance at resonance \( R_T \) and quality factor \( Q_{tank} \) of this resonant LC tank.

Integrated circuit processes are inherently better suited to making capacitors than inductors and, for frequencies below about 10GHz, the value of \( Q_{tank} \) is usually limited by the losses in the inductor. The inductor quality factor \( Q_L \) is:

\[
Q_L = \frac{\omega_0 \cdot L}{R_L} = Q_{tank}
\]  

(3.15)

For the parallel LC tank in Figure 9, the approximate magnitude of the tank impedance at resonance \( R_T \) is given by:

\[
R_T = \omega_0 \cdot L \cdot Q_L
\]  

(3.16)

![Figure 9. LC Tank with a lossy inductor and the parallel approximation.](image)
The maximum tank impedance is also limited by the parasitic capacitance of the active
devices in the VCO, inductor self capacitance, and the amount of tuning range required.
These parameters set a bound on the minimum tank capacitance allowed for the VCO.

In any VCO, a certain minimum amount of current is needed for oscillation to begin, but
the current required to meet output swing requirements is usually much greater.

Typically, $V_o$ must be at least a few hundred millivolts. $V_o$ can be expressed as a
constant times the product of $I_{SS}$ and $R_T$ for both of the two popular VCO topologies in
Figure 10. Hence, $R_T$ must be maximized to minimize current, making high value,
high-Q inductors critical to reducing power in the VCO.

The choice of VCO topology is also an important consideration for minimizing power.
For instance, $V_o$ as a function of $I_{SS}$ and $R_T$ for the NMOS-only VCO is [21]:

$$V_o = \frac{2}{\pi} I_{SS} \cdot R_T$$

(3.17)
Whereas, $V_o$ for the complementary (CMOS) VCO is:

$$V_o \approx \frac{4}{\pi} I_{SS} \cdot R_T$$  \hspace{1cm} (3.18)

The CMOS VCO will deliver twice the output swing for a given current, but its maximum achievable swing is just half that of the NMOS only device, which swings about the supply rail. Thus, the CMOS VCO would be the preferred choice as long as it has enough supply headroom available to generate sufficient swing. For a given bias current, the CMOS VCO provides twice the voltage swing because the commutating current $I_{SS}$ flows through a parallel impedance of $2R_T$, whereas the impedance seen by $I_{SS}$
in the NMOS VCO is only $R_T$. The CMOS VCO can also be seen as a vertical stack of two VCO’s (an NMOS only and a PMOS only) sharing the same bias current and resonant tank. Stacking RF circuits to reuse bias current is a powerful tool for improving system efficiency.
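To make the relationship between tank impedance, topology, and bias current concrete, the sketch below combines (3.16) with (3.17) and (3.18). The operating point (2.4GHz, a 5nH inductor with $Q_L = 10$, and a 300mV swing target) is an assumed example, not a design from the text.

```python
import math


def tank_impedance(f0_hz, l_henry, q_l):
    """Parallel tank impedance at resonance, R_T = w0 * L * Q_L (3.16)."""
    return 2.0 * math.pi * f0_hz * l_henry * q_l


def bias_current_for_swing(v_o, r_t, topology):
    """Bias current I_SS needed for output swing v_o.

    Uses V_o = (2/pi) * I_SS * R_T for the NMOS-only VCO (3.17) and
    V_o ~ (4/pi) * I_SS * R_T for the complementary VCO (3.18).
    """
    k = {"nmos": 2.0 / math.pi, "cmos": 4.0 / math.pi}[topology]
    return v_o / (k * r_t)


# Assumed example values for illustration only.
r_t = tank_impedance(f0_hz=2.4e9, l_henry=5e-9, q_l=10.0)
i_nmos = bias_current_for_swing(0.3, r_t, "nmos")
i_cmos = bias_current_for_swing(0.3, r_t, "cmos")

print(f"R_T         = {r_t:7.1f} ohm")
print(f"I_SS (NMOS) = {i_nmos * 1e3:6.3f} mA")
print(f"I_SS (CMOS) = {i_cmos * 1e3:6.3f} mA")
```

Doubling either $L$ or $Q_L$ doubles $R_T$ and halves the required current, and the complementary topology halves it again as long as its reduced maximum swing remains sufficient.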

### 3.3.3 Voltage Headroom and RF Circuit Stacking
To minimize current in each circuit block, the available voltage headroom must be used optimally. Many mobile systems use a 3.3V lithium supply, but the voltage swing required by the PA, VCO, or LNA may be much lower. For instance, if a VCO is powered by a 3.3V supply but only needs to generate a $300\text{mV}_{\text{0-pk}}$ signal to drive mixers, buffers, or frequency dividers, there will be substantial waste because the VCO swing spec could be met with a much lower supply voltage.

Since supply voltage is typically not a flexible design variable, circuit techniques are needed to optimize use of headroom when supply voltage is high. One way to reduce wasted power is by stacking RF circuits [11]. Stacking is accomplished by placing two RF blocks in series with respect to DC bias currents flowing from the supply and using passive components to decouple their high frequency behavior. Thus, the bias current used in one block is reused by another block. For integrated transceivers, stacking is only feasible for high-frequency circuits where effective isolation can be implemented with on-chip decoupling capacitors or inductors.
A few different stacked configurations are shown in Figure 11. The effect of stacking two small-signal LNA’s is to either double the transconductance $g_m$ (if the inputs and outputs are coupled in parallel), or to increase the voltage gain $A_v$ (if signals traverse the LNA’s in series). Stacking two PA’s doubles the output current, provided the halved voltage headroom is still sufficient. PA stacking techniques are discussed in more detail in the next section. In [11], the VCO was stacked with the LNA in the receiver and with the PA in the transmitter. In this design, the current available to the PA and LNA was set by the VCO’s current consumption.
### 3.3.4 Resonant Drive for PA and Mixers

A substantial portion of the overhead power in a transceiver may be devoted to driving the input capacitance of the PA and mixers with a large RF signal derived from the VCO or frequency synthesizer. In many cases, this overhead power may be reduced by incorporating the input capacitance of the PA or mixer into a resonant tank or even driving them directly from the VCO tank.

Resonant drive of capacitive loads can reduce power if the impedance of the resonant tank is larger than that of the capacitive load at the frequency of interest. Hence, higher frequency systems will often tune out PA and mixer capacitance because the size of inductors drops with frequency while their Q increases. On the other hand, adding the inductance will incur an area or cost penalty that may outweigh the power advantage, and the driving waveform will be limited to a sinusoid.

Another option is to incorporate the input capacitance of the PA and mixers into the VCO’s high-Q tank, thus making additional buffers and inductors unnecessary. This technique is most useful at moderate frequencies where the minimum allowed VCO tank capacitance – set by available inductors, tuning range requirements and parasitics – is much larger than the PA or mixer input capacitance. In this case, the system can basically give up a fraction of its tuning range to incorporate the capacitive loads into its tank without increasing the system power budget. On the other hand, if the capacitive loads are larger than the minimum VCO tank capacitance, then using a separate tank for these loads may offer more benefit.
Direct VCO drive also has other drawbacks. First, direct drive in the receiver increases susceptibility to LO pulling from interfering signals that can couple into the VCO tank through the mixer. Moderate LO pulling results in additional phase errors in the received signal as the VCO deviates from its unperturbed phase trajectory. If the interferer couples strongly enough, it may even injection lock the VCO and change its average frequency [22]. Secondly, the transmitter is limited to modulation schemes requiring only a single amplitude. Hence, direct VCO drive is best for constant-envelope (or single-amplitude) phase-error tolerant schemes such as OOK and 2-FSK with large frequency deviations.

## 3.4 Efficient PA’s with Low Power Output

As output power creeps below 1mW or so, designing an efficient transmitter becomes increasingly difficult. First of all, there are numerous system blocks whose power consumption does not necessarily scale down with transmitted power, resulting in a proportionally large power overhead. Secondly, with typical supply voltages of 1-3V and an antenna impedance of roughly 50Ω, standard PA topologies will be inherently inefficient when putting out so little power. As discussed, power overhead can be minimized by choosing the right modulation scheme, VCO topology, and, if necessary, RF circuit stacking. A couple of techniques for increasing the efficiency of low power PA’s will now be addressed.

If the modulation scheme uses a constant envelope signal, the PA can be implemented with a nonlinear amplifier, making higher efficiency possible. To prevent wasting power, the active element(s) in a nonlinear PA should switch on and off completely and have close to 0V across them when strongly conducting – implying the PA should be
driven at or near its maximum possible voltage swing [18, 19]. Hence, the most efficient output power for the PA is determined by the real part of its load impedance $R_{\text{Load}}$ and available zero-to-peak output voltage swing $v_{o,\text{max}}$.

$$P_{\text{max eff}} = \frac{v_{o,\text{max}}^2}{2R_{\text{Load}}}$$  (3.19)

To design an efficient PA with very low power output, then, it is desirable to have a small $v_{o,\text{max}}$ and large $R_{\text{Load}}$.
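A quick calculation with (3.19) shows why both levers matter at sub-milliwatt output. The swing and load values below are assumed for illustration; they are not taken from the designs cited in the text.

```python
def max_efficiency_power(v_o_max, r_load):
    """Most efficient output power from (3.19), in watts."""
    return v_o_max ** 2 / (2.0 * r_load)


def load_for_power(v_o_max, p_target):
    """Load resistance that makes p_target the most efficient output power."""
    return v_o_max ** 2 / (2.0 * p_target)


# Assumed example: 1 V zero-to-peak swing into a 50 ohm load.
p_full_swing = max_efficiency_power(1.0, 50.0)
# Cutting the available swing by a factor of 4 (as a stacked topology might).
p_quarter_swing = max_efficiency_power(0.25, 50.0)
# Load needed to make 250 uW the most efficient output power at 0.25 V swing.
r_for_250uw = load_for_power(0.25, 250e-6)

print(f"1.00 V into 50 ohm : {p_full_swing * 1e3:.3f} mW")
print(f"0.25 V into 50 ohm : {p_quarter_swing * 1e3:.3f} mW")
print(f"R_Load for 250 uW at 0.25 V: {r_for_250uw:.1f} ohm")
```

Because power scales with the square of the swing, a 4X reduction in $v_{o,\text{max}}$ lowers the most efficient output power by 16X, and any remaining gap must be closed by raising $R_{\text{Load}}$.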

### 3.4.1 PA Topology and $V_{\text{max}}$

Generally speaking, supply voltage is fixed by other design constraints, so $v_{o,\text{max}}$ can only be reduced by changing the PA topology. Figure 12 (adapted from [16]) illustrates a few different PA topologies with different values for $v_{o,\text{max}}$. At the far right, two identical push-pull PA’s are effectively stacked on top of one another to cut $v_{o,\text{max}}$ by a factor of 4. Each electron in the output current flows through the load four times. Thus, for a given average supply current $I_{dc}$, this PA can deliver 4 times as much current to the load as the PA at the far left. The stacked push-pull topology was presented in [11], demonstrating 40% efficiency with 250µW output power in the 900MHz ISM band.

Figure 12. Illustration of three nonlinear PA’s and their maximum power output. When the supply voltage is fixed, PA’s can still be designed for high efficiency across a wide range of output power by manipulating the circuit topology to vary the maximum output swing.

### 3.4.2 Boosting Load Impedance with Resonant Networks

Another key to increasing efficiency at low power output is to increase $R_{Load}$. $R_{Load}$ can be boosted by modifying the antenna design to raise its impedance or by employing a transformer. Figure 13 (top) illustrates the use of an ideal transformer to boost load impedance in the PA. An ideal transformer boosts the load impedance by the square of the turns ratio ($N$).

Figure 13. Top: An ideal transformer can be used to boost PA load impedance and provide voltage gain prior to the LNA in the receiver. Bottom: A simple LC network implements the same function near its resonant frequency.

Unfortunately, off-the-shelf inductive transformers do not perform well at high frequency and thus are usually limited to low-frequency applications. At higher frequencies, $R_{\text{Load}}$ can be boosted with a resonant LC network (Figure 13, bottom) [18, 19]. The ratio of the transformed load resistance $R_L$ to the original $R_A$ typically scales with the square of the overall network quality factor ($Q_{\text{Tank}}$). Hence, transforming impedance by large ratios is only useful for narrowband systems wherein moderate values of $Q_{\text{Tank}}$ are acceptable.

High quality passive components (particularly inductors) are crucial for efficiently boosting impedance because parasitic series resistance in the inductor places an upper
limit on $Q_{tank}$ and, therefore, the maximum achievable impedance transformation ratio. Furthermore, the network efficiency will degrade as the transformation ratio increases because a significant portion of the signal power will be lost in the passive components. The impact of parasitic resistance on the performance of a tapped-capacitor resonant LC transformer will be analyzed in Chapter 4.
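The tradeoff between transformation ratio and loss can be sketched for the simplest resonant transformer, a two-element L-match made of a series inductor and a shunt capacitor. This is an assumed stand-in for illustration, not the tapped-capacitor network analyzed in Chapter 4. For the L-match, the ratio is $1 + Q^2$ with $Q = \omega_0 L / R_A$, and if the inductor's series resistance is the only loss, the fraction of power reaching $R_A$ is $1/(1 + Q/Q_L)$.

```python
import math


def network_q(ratio):
    """Q needed by an L-match for a given resistance ratio (ratio = 1 + Q**2)."""
    return math.sqrt(ratio - 1.0)


def l_match_efficiency(q_network, q_inductor):
    """Fraction of power delivered to R_A when the inductor's series
    resistance is the only loss: R_A / (R_A + r_L) = 1 / (1 + Q / Q_L)."""
    return 1.0 / (1.0 + q_network / q_inductor)


q_l = 10.0  # assumed inductor quality factor

for ratio in (2.0, 5.0, 10.0, 26.0):
    q = network_q(ratio)
    eff = l_match_efficiency(q, q_l)
    print(f"ratio = {ratio:5.1f}  Q = {q:4.2f}  efficiency = {eff * 100:5.1f} %")
```

Under these assumptions, a 10:1 transformation needs $Q = 3$ and already gives up roughly a quarter of the power with a $Q_L$ of 10, which illustrates why inductor quality limits the useful transformation ratio.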

## 3.5 Receiver Noise Factor and Passive Voltage Gain

As previously mentioned, the receiver noise factor ($F$) is the ratio of the SNR at the input to the SNR at the output and $F$ must be compensated by increased transmit power to maintain a given link margin. Voltage gain early in the receive chain is critical to reducing power consumption of subsequent stages. Typically, the first stage of a receiver is an active LNA in which both voltage and power gain are positive (in dB). However, it is possible to achieve voltage gain while having zero or negative power gain (in dB) by transforming the input impedance to a higher value. For example, inductive transformers, resonant LC circuits, or even resonant electromechanical devices, can achieve substantial voltage gain while consuming zero power (Figure 13).

When used for voltage gain, these networks have the added benefit that, unlike active amplifiers, they remain linear in the face of large input signals. However, just as large impedance transformation ratios in the transmitter will degrade efficiency, large passive voltage gain from a resonant network will degrade the system noise factor due to the noise contributed by the parasitic series resistance in the inductor. Hence, high value, high quality inductors are required for achieving significant passive voltage gain. The noise figure, voltage gain, and input impedance of a tapped-capacitor LC network will be analyzed in the next chapter.
Passive voltage gain via resonant networks is a powerful tool for reducing receiver power consumption and is particularly well suited to CMOS because MOS transistors accept voltage as input and have capacitive input impedance that can be incorporated into the resonance. Figure 14 illustrates the impact of using a non-ideal transformer to achieve passive voltage gain before the input of an LNA. The transformer simultaneously increases the value of $R$ in (3.8) and adds a non-zero term equal to its own noise factor. At low power consumption, the passive network improves overall noise performance. However, at higher power, the LNA noise factor asymptotically approaches 1, and the overall noise performance is limited by the network noise factor.
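The behavior described for Figure 14 can be reproduced with a deliberately simplified model. It assumes that a passive network with voltage gain $A_v$ raises the effective source resistance in (3.8) by $A_v^2$ and that its own noise factor $F_{net}$ replaces the leading 1, so that $F \approx F_{net} + \alpha / (A_v^2 \cdot P_{LNA})$. The model and all numerical values are assumptions for illustration, not results from the text.

```python
def noise_factor_active_only(p_lna, alpha):
    """LNA-only noise factor from (3.8)."""
    return 1.0 + alpha / p_lna


def noise_factor_with_passive_gain(p_lna, alpha, a_v, f_net):
    """Assumed model: source resistance boosted by a_v**2, plus network noise."""
    return f_net + alpha / (a_v ** 2 * p_lna)


def crossover_power(alpha, a_v, f_net):
    """LNA power at which both configurations give the same noise factor."""
    return alpha * (1.0 - 1.0 / a_v ** 2) / (f_net - 1.0)


alpha_w = 1e-3   # assumed alpha, watts
a_v = 3.0        # assumed passive voltage gain (about 9.5 dB)
f_net = 1.5      # assumed noise factor of the lossy network

p_cross = crossover_power(alpha_w, a_v, f_net)
print(f"crossover at P_LNA = {p_cross * 1e3:.2f} mW")
for p_lna in (0.1e-3, 1e-3, 10e-3):
    f_a = noise_factor_active_only(p_lna, alpha_w)
    f_p = noise_factor_with_passive_gain(p_lna, alpha_w, a_v, f_net)
    print(f"P_LNA = {p_lna * 1e3:5.1f} mW  active only F = {f_a:6.2f}  "
          f"with passive gain F = {f_p:6.2f}")
```

Below the crossover power the passive network wins; above it, the LNA-only noise factor keeps falling toward 1 while the combined noise factor levels off at $F_{net}$.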

### 3.5.1 Linearity

The biggest drawback of passive voltage gain prior to the LNA is that it degrades the linearity of the system. The input transconductor of an LNA or active mixer has a nonlinear transfer curve that can be represented by a power series expansion.
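A power series of this kind, written with the coefficient names that appear in (3.21) and with $i_o$ as an assumed symbol for the transconductor output current, takes the form:

$$i_o = g_{m1} \cdot v_i + g_{m2} \cdot v_i^2 + g_{m3} \cdot v_i^3 + \cdots$$

Here $g_{m1}$ is the small-signal transconductance, while $g_{m2}$ and $g_{m3}$ set the strength of the second and third order distortion.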
As the input signal $v_i$ grows larger, the nonlinear terms in (3.20) grow faster than the linear term and, at some point, their magnitude will surpass it. Intermodulation is a form of distortion that occurs when two sinusoidal signals at frequencies $f_1$ and $f_2$ are applied to a nonlinear transfer function and produce distortion products at other frequencies. In the case of third-order intermodulation ($IM3$), the important distortion products appear at frequencies $2f_2 - f_1$ and $2f_1 - f_2$ and can actually fold over inside the desired signal band and cause interference. Hence, intermodulation distortion can overpower the desired signal and prevent reception. If $v_i$ is represented by the sum of two equal power sinusoids in (3.20), the magnitude of the $IM3$ products follows [23]:

$$IM3 = \frac{3}{4} g_{m3} \cdot v_i^3$$  \hfill (3.21)

When $v_i$ is small, the fundamental term is much larger than the intermodulation product. However, since $IM3$ is proportional to $v_i^3$ while the fundamental is proportional to $v_i$, the $IM3$ product will eventually surpass the fundamental.

One common metric for characterizing a receiver’s susceptibility to third order intermodulation is called $IIP3$. If two sinusoids of equal magnitude ($v_i$) at frequencies $f_1$ and $f_2$ are applied to the system, $IIP3$ is the input power level at which the power in the intermodulation products at $2f_2 - f_1$ or $2f_1 - f_2$ should theoretically become equal to the power in the fundamental, implying that a system with a higher $IIP3$ is less susceptible to intermodulation. In reality, the system will often saturate well before the $IM3$ products can approach the fundamental. Hence, $IIP3$ is actually an extrapolated intercept based on a measurement of the relative power of the fundamental and $IM3$ components for an input amplitude that is not large enough to cause saturation. By setting $IM3$ equal to the fundamental term in (3.20), $IIP3$ can be derived.

$$IIP3 = 2\sqrt{\frac{g_{m1}}{3 \cdot g_{m3}}}$$

$IIP3$ for a receiver is usually measured at the output of the mixer because this is the first stage that filters out wideband interference. Hence, if there is a high gain LNA preceding the mixer, the nonlinearity of the mixer will typically be the limiting factor in setting $IIP3$ whereas the linearity of the LNA is less of an issue.

Second order intermodulation creates unwanted products at frequencies $f_1 + f_2$ and $f_1 - f_2$, which are often easier to remove with filtering. $IIP2$ is a similar intercept point characterizing second order intermodulation distortion. Depending on the mixer topology, $IIP3$ can be increased by using feedback, increasing $v_{dsat}$, increasing mixer drive amplitude (passive mixers), or other techniques. However, increasing $IIP3$ typically requires extra power consumption or complexity without necessarily improving noise performance. Passive voltage gain effectively reduces $IIP3$ by the amount of gain it provides (Figure 15). Hence, passive gain improves noise performance without increasing power, but does so at the expense of system linearity.

Figure 15. Passive gain in the RF front-end degrades IIP3 by an amount equal to the gain.
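The intercept amplitude and the effect of passive gain can be checked numerically. The sketch equates the $IM3$ amplitude in (3.21) with the fundamental $g_{m1} \cdot v_i$ and then refers the result to the antenna through an assumed passive voltage gain. The coefficient values and the gain are made up for illustration.

```python
import math


def iip3_amplitude(g_m1, g_m3):
    """Input amplitude at which IM3 equals the fundamental.

    From (3/4)*|g_m3|*v**3 = g_m1*v, so v = 2*sqrt(g_m1 / (3*|g_m3|)).
    """
    return 2.0 * math.sqrt(abs(g_m1 / (3.0 * g_m3)))


def im3_amplitude(g_m3, v_i):
    """Amplitude of each third order intermodulation product, as in (3.21)."""
    return 0.75 * abs(g_m3) * v_i ** 3


# Assumed coefficients: g_m1 in A/V, g_m3 in A/V^3.
g_m1, g_m3 = 10e-3, -30e-3
v_ip3 = iip3_amplitude(g_m1, g_m3)

# An assumed passive voltage gain ahead of the transconductor scales the
# amplitude it sees, so the intercept referred to the antenna drops by a_v.
a_v = 3.0
v_ip3_at_antenna = v_ip3 / a_v
shift_db = 20.0 * math.log10(v_ip3_at_antenna / v_ip3)

print(f"intercept at transconductor input: {v_ip3:.3f} V")
print(f"intercept referred to antenna:     {v_ip3_at_antenna:.3f} V")
print(f"IIP3 shift: {shift_db:.2f} dB")
```

The shift in dB equals the passive gain in dB with the opposite sign, which is the degradation illustrated in Figure 15.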

## 3.6 $E_b/N_0$ and Modulation Scheme

The choice of modulation scheme directly impacts a communication system’s bandwidth efficiency ($R/B$) and minimum achievable energy per bit ($E_b/N_0$). To maximize term 3 in (3.1), a wireless system should use a modulation scheme that comes as close to the Shannon minimum limit for $E_b/N_0$ as possible. On the other hand, it is also desirable to keep system complexity low so that the power consumed by the signal generation and modulation circuitry does not become excessive. A reasonable question to ask is: which has the potential for lowest energy per bit, a complex modulation scheme that packs
many bits of data into each signal transition, or a simple binary scheme? The answer to this question is not obvious because there is a tradeoff; complex modulation schemes can achieve higher information rates for a given signaling rate, but they typically also require higher SNR to demodulate, implying more transmit power is needed to maintain the link margin.

Figure 16 provides a comparison of several popular (uncoded) modulation schemes with respect to the Shannon limit, plotting $R/B$ versus the $E_b/N_0$ required for reliable demodulation (i.e. BER = $10^{-4}$). If system link margin is held constant, then the best modulation strategy will largely be determined by which resource is more precious, bandwidth or energy. Schemes with lower $E_b/N_0$ will deliver more data for a fixed amount of energy, while those with higher $R/B$ will deliver the highest transmission rate for a fixed amount of bandwidth. Figure 16 shows that complex schemes can be used to achieve either extremely high bandwidth efficiency (e.g. 64-QAM) or high energy efficiency (e.g. 256-FSK), while simpler binary and quaternary signaling schemes tend to fall somewhere in the middle.

Figure 16. Plot of spectral efficiency (R/B) versus required Eb/N0 for several modulation schemes at BER = 10^-4 (adapted from [1]).
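The Shannon limit curve in a plot of this kind follows from the capacity formula $C = B \log_2(1 + S/N)$ with $S = E_b \cdot R$ and $N = N_0 \cdot B$, which gives a minimum $E_b/N_0$ of $(2^{R/B} - 1)/(R/B)$ at $R = C$. The sketch below evaluates that bound; the function names are illustrative.

```python
import math


def shannon_ebn0_min(spectral_efficiency):
    """Minimum Eb/N0 (linear) for error-free operation at a given R/B."""
    eta = spectral_efficiency
    return (2.0 ** eta - 1.0) / eta


def to_db(x):
    return 10.0 * math.log10(x)


for eta in (0.01, 0.5, 1.0, 2.0, 6.0):
    ebn0 = shannon_ebn0_min(eta)
    print(f"R/B = {eta:5.2f} bit/s/Hz  Eb/N0 min = {to_db(ebn0):6.2f} dB")
```

As $R/B$ approaches zero the bound approaches $\ln 2$ (about -1.6dB), which is why schemes that spend bandwidth freely can come closest to the minimum energy per bit.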

802.11g is an example of a standard that dynamically changes its modulation scheme, allowing it to achieve maximum datarate when the received signal has high SNR but to still maintain a link at a reduced datarate when SNR drops. At its maximum datarate 802.11g employs 64-QAM (OFDM on 48 sub-carriers) to achieve 54Mbps in the crowded 2.4GHz ISM band while only occupying 11MHz of bandwidth. In the case of 64-QAM, high bandwidth efficiency comes at the cost of poor energy efficiency as evidenced by its high $E_b/N_0$ requirement. On the other hand, 802.11g specifies a 6Mbps mode which uses BPSK (OFDM on 48 sub-carriers) also occupying 11MHz and having the same coding rate as the 54Mbps mode. Using BPSK, the data rate decreases by a
factor of 9 but the 802.11 spec requires a 60X receiver sensitivity improvement over the
54Mbps mode, owing to the lower \((E_b/N_0)_{\text{min}}\) of BPSK versus 64-QAM [24].

802.11g in its highest data rate represents a good example of “what not to do” if energy
conservation is the goal because 64-QAM has a high \((E_b/N_0)_{\text{min}}\) and its implementation is
generally power hungry and quite complex. The receivers are high power because
demodulation requires a fast, high-precision ADC, substantial digital signal processing,
and linear amplification along the entire receive chain. The 802.11g transmitters tend to
be power hungry because generating the 64-QAM signals requires a linear PA and a fast,
low-noise PLL and VCO. Since the transistor devices constituting the amplifiers (and
all blocks) in a transceiver are inherently nonlinear, achieving linear amplification in the
receive chain and PA comes at the cost of increased power and/or complexity.

In theory, the lowest energy uncoded modulation scheme would be orthogonal M-ary
FSK with M approaching infinity [1]. The orthogonality condition for M-ary FSK is
satisfied if the tone separation is a multiple of ½ the symbol rate. This strategy is not
popular because \((E_b/N_0)_{\text{min}}\) only decreases incrementally with large M, while the
occupied bandwidth and system complexity grow steadily. In practical systems targeting
low energy, 2,4-PSK, 2-FSK, and OOK are the most common modulation methods –
representing a compromise between energy efficiency and simplicity of implementation.
Radios designed for sensor network applications have used either PSK [7, 9], binary
FSK [4, 6, 11, 13, 14], or OOK [10, 12, 13, 15]. The original 802.15.1 standard
(Bluetooth) uses Gaussian 2-FSK and the 802.15.4 standard uses a form of QPSK (i.e.
4-PSK) that can be implemented as 2-FSK [25, 26]. A newer version of Bluetooth adopts
QPSK and 8-DPSK as alternate modulation techniques to extend data rate to 3Mbps, but
the energy efficiency of the 8-DPSK systems will drop somewhat since the \((E_b/N_0)_{\text{min}}\) for 8-DPSK is substantially higher than the original GFSK format.
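The orthogonality condition quoted above for M-ary FSK (tone separation equal to a multiple of half the symbol rate) can be checked numerically for coherent detection by correlating two tones over one symbol period. The tone frequencies and the normalized symbol period below are assumed values chosen so that the sum frequency term also integrates to zero.

```python
import math


def tone_correlation(f1, f2, t_sym, n=4000):
    """Midpoint-rule estimate of the integral over one symbol period of
    cos(2*pi*f1*t) * cos(2*pi*f2*t)."""
    dt = t_sym / n
    acc = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        acc += math.cos(2.0 * math.pi * f1 * t) * math.cos(2.0 * math.pi * f2 * t)
    return acc * dt


t_sym = 1.0  # normalized symbol period, so the symbol rate is 1

same_tone = tone_correlation(10.0, 10.0, t_sym)     # reference: signal energy
half_rate = tone_correlation(10.0, 10.5, t_sym)     # spacing = 1/2 symbol rate
quarter_rate = tone_correlation(10.0, 10.25, t_sym)  # spacing = 1/4 symbol rate

print(f"same tone:            {same_tone:.4f}")
print(f"spacing 1/2 the rate: {half_rate:.2e}")
print(f"spacing 1/4 the rate: {quarter_rate:.4f}")
```

The half-rate spacing yields a correlation that is zero to within numerical error, while the quarter-rate spacing leaves a large residual, so those tones would interfere in a correlation receiver.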

### 3.6.1 Error Correcting Codes (ECC)

With respect to modulation scheme, a tradeoff between spectral efficiency and energy efficiency has emerged from both theoretical and practical perspectives. First of all, Shannon’s capacity theorem shows that the minimum achievable energy per bit for any communication system is logarithmically related to spectral efficiency and several popular (uncoded) modulation schemes, though not approaching the Shannon limit, do exhibit a strong positive relationship between \(R/B\) and \(E_b/N_0\). Further, from a practical perspective, the schemes with highest \(R/B\), such as \(m\)-PAM or \(m\)-QAM with large \(m\), require complex and high power hardware to implement. The confluence of these factors suggests that simpler schemes, such as 2-FSK, OOK, and 2,4-PSK, will offer the best tradeoff when minimizing energy is the goal.

Even with an optimal demodulator, 2,4-PSK, 2-FSK, and OOK still require at least 10 times higher \((E_b/N_0)_{\text{min}}\) than the Shannon limit to achieve reasonably low probability of error (i.e. \(\text{BER} = 10^{-4}\)). The capacity equation says that, to approach the Shannon limit and reclaim some of this wasted energy, the bandwidth efficiency \(R/B\) will have to be reduced. Error correcting codes (ECC), such as Hamming, Reed-Solomon, Turbo Codes, etc., can reduce \((E_b/N_0)_{\text{min}}\) significantly, but also incur substantial computational power overhead that could increase \(E_{b,\text{Sys}}\) enough to outweigh the \((E_b/N_0)_{\text{min}}\) reduction, particularly in low power systems. ECC’s also generally involve a tradeoff between system latency, decoder complexity and coding gain, with higher latency and complexity delivering more coding gain.
In [27], the \((E_b/N_0)\) reduction (or *coding gain*) and digital computation energy of several ECC’s were evaluated for a 0.18\(\mu\)m CMOS process with 1.8V supply (Figure 17).

Though complex ECC’s, such as Turbo codes, have traditionally only found use in higher power systems, these estimates would suggest that digital computation energy is now low enough that ECC’s are an effective option for even low power systems.

Furthermore, ECC’s will only become more favorable as supply voltages drop and digital process features continue to scale down.

3.6.2 Direct Sequence Spread Spectrum

Direct Sequence Spread Spectrum (DSSS) techniques involve coding that effectively reduces spectral efficiency yet provides little coding gain. Spread spectrum systems employ pseudo-noise (PN) codes to spread the transmitted signal over a larger bandwidth as it passes through the physical channel. The spectral spreading is achieved by multiplying the signal with a PN code prior to transmission. PN codes consist of a sequence of chips with value +1 or -1 with a resultant frequency spectrum that exhibits noise-like properties. This procedure increases signal bandwidth by a ratio equal to the length of the PN code, and this ratio is known as the *processing gain* (not to be confused with coding gain). At the receiver, the incoming signal is simply multiplied by the same PN code as that used in the transmitter. Since the PN code is just a sequence of +1 and -1, the second multiplication by the PN code restores the original data signal [28].
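The spreading and de-spreading operations can be sketched in a few lines. This is a minimal baseband illustration under stated assumptions: the chip sequence is supplied by the caller and stands in for a real PN generator, and the function names are hypothetical.

```python
import random

def spread(bits, pn):
    """Map bits {0,1} to symbols {-1,+1} and multiply each symbol by the PN chips."""
    chips = []
    for b in bits:
        symbol = 1 if b else -1
        chips.extend(symbol * c for c in pn)
    return chips

def despread(chips, pn):
    """Multiply by the same PN code, sum over each symbol period, and slice."""
    n = len(pn)
    bits = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], pn))
        bits.append(1 if corr > 0 else 0)
    return bits

if __name__ == "__main__":
    rng = random.Random(0)
    # Random +/-1 chips as a stand-in for a true PN sequence (assumption).
    pn = [rng.choice((-1, 1)) for _ in range(31)]
    data = [1, 0, 1, 1, 0]
    tx = spread(data, pn)
    print(despread(tx, pn) == data, len(tx) // len(data))
```

The transmitted sequence is longer than the data by a factor equal to the code length, which is the bandwidth expansion described above.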

In the case of orthogonal PN codes, the spreading and de-spreading process does provide a small amount of coding gain that increases with code length [1]. Two codes are said to be orthogonal if their inner-product is zero, meaning they are uncorrelated. Orthogonality of two binary code sequences implies that applying the exclusive-OR operation to the codes generates an equal number of 1’s and 0’s. In an orthogonally coded DSSS system, data are assembled into groups k bits long, and any k bit group corresponds to one of the $2^k$ mutually orthogonal PN codes of length $2^k$. Hence, each length $2^k$ code is a symbol, representing k bits [28].
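As an illustration of orthogonal coding, the sketch below uses Walsh-Hadamard codes as a convenient stand-in for a set of $2^k$ mutually orthogonal codes of length $2^k$. The choice of Walsh codes, the MSB-first bit mapping, and the function names are assumptions made for this example only.

```python
def walsh_codes(k):
    """Return the 2**k rows of a Sylvester-Hadamard matrix as lists of +1/-1 chips."""
    codes = [[1]]
    for _ in range(k):
        same = [row + row for row in codes]
        flipped = [row + [-c for c in row] for row in codes]
        codes = same + flipped
    return codes

def encode(bits, codes):
    """Map a k-bit group (MSB first) to one code word."""
    index = int("".join(str(b) for b in bits), 2)
    return codes[index]

def decode(chips, codes, k):
    """Correlate against every code and return the k bits of the best match."""
    scores = [sum(x * c for x, c in zip(chips, code)) for code in codes]
    index = max(range(len(codes)), key=scores.__getitem__)
    return [(index >> (k - 1 - i)) & 1 for i in range(k)]

if __name__ == "__main__":
    k = 3
    codes = walsh_codes(k)
    word = encode([1, 0, 1], codes)
    print(word, decode(word, codes, k))
```

The inner product of any two distinct rows is zero, so the correlator output is nonzero only for the transmitted code in the absence of noise.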

Interestingly, orthogonal codes of length M have exactly the same bandwidth and energy requirements as orthogonal M-ary FSK. For orthogonal symbols of size $N=2^k$, a tight upper bound for the probability of a symbol error ($P_N$) is shown below [1].

$$P_N < 2 \cdot e^{-k\left[\sqrt{E_b/N_0} - \sqrt{\ln 2}\right]^2}$$  \hspace{1cm} (3.23)

The probability of a single bit error ($P_b$) is related to $P_N$:

\[ P_b = \frac{2^{k-1}}{2^k - 1} \cdot P_N \]  \hspace{1cm} (3.24)

Figure 18. Coding gain versus bandwidth expansion for orthogonal codes and selected ECC’s from the previous figure.

For a given \( P_b \), (3.23) and (3.24) may be combined and solved for \( E_b/N_0 \) to find the energetic benefit or coding gain as a function of code size. Figure 18 illustrates the relationship between orthogonal code length (i.e. bandwidth expansion) and coding gain for a fixed BER of 10\(^{-5}\). Note that the maximum coding gain for an infinitely long code is about 14.3 dB, at which point the system can actually achieve Shannon’s limit. The rate 1/3 turbo code in the previous figure achieves approximately the same coding gain as a PN code of length 1024, but only expands the signal bandwidth by a factor of 3.
Processing gain can grow arbitrarily large, often enabling reliable reception even when $SNR$ (but not $E_b/N_0$) is well below -1.6dB. However, the primary purpose of PN codes is usually just to spread the signal over a wider bandwidth, which is useful for mitigation of multi-path fading, improved localization accuracy (e.g. GPS), multiple user access (e.g. CDMA), resistance to certain types of jamming, and more [1, 28].

For peer-to-peer wireless applications, spread spectrum systems do suffer from some drawbacks – particularly when large processing gains are involved. First of all, spreading the desired signal in the physical channel implies the receiver must accept a wider signal bandwidth, increasing the likelihood of encountering any unwanted interferers. Secondly, the storage and computation required of the receiver may add substantial power overhead. Finally, despite the fact that the de-spreading operation in the receiver reduces the impact of uncorrelated interfering signals within its passband, it is possible for a narrowband interferer inside the band to completely jam a spread spectrum system if it is substantially stronger than the desired signal. A strong jammer can saturate amplifiers in the receive chain or cause the receiver’s AGC to reduce gain until the desired signal is no longer detectable.

The most prolific commercial applications of PN codes include GPS and CDMA systems. GPS operates on a restricted band so that the desired signal coming from a satellite does not have to compete with unwanted interfering signals. The long PN code, providing more than 30dB of processing gain, allows the receiver to resolve the signal with better time resolution to improve ranging accuracy. CDMA cellular systems also operate on restricted bands to guarantee that the only signals inside the receive band are coming from the cell tower. PN codes are used in these systems to provide access to multiple users all sharing the same bandwidth. Each receiver is identified by a unique code that allows it to differentiate its intended signal from other signals sent by the tower.

CDMA and GPS systems can derive benefit from spread spectrum techniques without significant drawbacks because the receive channel is free from foreign interferers by design. If, on the other hand, a system is operating in a frequency band which is likely to have unwanted interfering signals of unknown strength (such as in ISM bands), then spreading the signal over a very large bandwidth can introduce enough interference to degrade link margin substantially. Nonetheless, moderate length orthogonal codes can be a reasonable solution for efficiency enhancement because the hardware involved is very simple and some bandwidth expansion may be tolerable for low datarate systems.

3.7 Conclusion

Designing wireless systems for high energy efficiency involves interdependent tradeoffs that become more challenging as the power budget scales down. For low power systems, approaching ideal efficiency will only be possible with alignment of system design choices across many levels, from the choice of modulation and coding scheme down to transceiver architecture and even transistor level design of the individual circuit components.

From the discussions in this chapter, some basic guiding principles have emerged. First of all, it is important to keep the transceiver architecture as simple as possible to reduce overhead power. Using a low-order, single amplitude modulation scheme with relaxed precision requirements, such as 2-FSK or OOK, can help simplify the architecture and/or eliminate power hungry circuit blocks, effectively giving up bandwidth efficiency for system simplicity. Secondly, using high-value, high-Q inductors in the right places can ease receiver noise tradeoffs, boost efficiency of low-power PA’s and reduce overhead power in the VCO. Finally, with regard to modulation scheme, there exists a fundamental tradeoff between energy efficiency and spectral efficiency, suggesting that schemes with high spectral efficiency should be avoided if possible. Coding techniques offer an opportunity to further leverage this tradeoff and, as digital computation power continues to scale down, these techniques will become more and more attractive for boosting energy efficiency, even for systems with low power budgets.
4.1 Introduction

This chapter discusses the architecture, design, circuit theory, and test measurements of a low-energy 2.4GHz CMOS RF transceiver designed for integration into a complete wireless sensor node on a single chip. Each node in a wireless sensor network is a self-powered, autonomous device capable of collecting, processing, and storing sensor data and communicating wirelessly (Figure 19). Typically, wireless sensor nodes need only communicate over a relatively short range, but must do so with high reliability and extremely low power to enable long lifetime. From a system deployment perspective, mote lifetimes measured in years are required for most applications in building and industrial automation. Nodes must operate from batteries and/or scavenged power and their activity is kept on a very low duty-cycle to minimize the system’s average current consumption [29].
Figure 19. A complete wireless sensor node is an autonomous mixed-signal system with integrated sensing, communication, computation and power storage capabilities. Adapted from [29].

Based on a survey of sensor node hardware, the energy required for wireless communication dwarfs that needed for all other sensor node functions. Sensing operations are often passive or even generate power, while the energy required for an 8 bit analog-to-digital conversion [30] or microprocessor instruction (8-bit [31] or 32-bit [32]) has been reduced to just a few 10’s of pJ per operation. On the other hand, even the lowest energy transceivers designed for short range communication require tens to thousands of nJ to send and receive a single bit [4-8, 10-15]. The objective for this design was to reduce the energy per bit as much as possible while still maintaining adequate performance for sensor network applications.
4.2 System Specifications

The popular 2.4GHz ISM band was chosen for this system because it is an unlicensed band that is accepted worldwide. Further, 2.4GHz is a high enough frequency (compared to 433MHz, 868MHz, and 900MHz ISM bands) to permit integration of high quality inductors of reasonable size on-chip but it still holds a power advantage over higher frequency bands (such as 5.8GHz ISM) because of its lower path-loss and the fact that any RF circuits driven by large signals, such as frequency dividers, will require less current.

4.2.1 Link Margin

For this system, a target of 20m indoor range was chosen to accommodate communication between rooms within a building. Indoor path-loss at 2.4GHz can only be modeled in a statistical sense because of its dependence on the geometry and materials constituting the surrounding environment. Furthermore, even very common time varying disturbances, such as people passing through a room, have been shown to impact path loss for a stationary link by as much as 30dB [33]. Thus, it is not possible to guarantee a reliable link at any distance in all situations. In order to generate reasonable link margin targets for the system, a combination of empirical data and a simplified path-loss modeling technique was employed [2, 3, 33, 34]. If $\lambda$ is the wavelength at 2.4GHz and $r$ is the distance from the RF source, then a popular modification to the Friis free space equation can be used to approximate path loss.

$$L_{PATH} = \left( \frac{4\pi r_0}{\lambda} \right)^2 \left( \frac{r}{r_0} \right)^n$$  \hspace{1cm} (4.1)
The variable $r_0$ is a reference distance ($r_0 = 1\text{m}$ is a default value) beyond which the inverse square law characteristic of the Friis equation no longer governs path loss. The exponent $n$ characterizes the attenuation beyond $r_0$ and has been measured empirically for various propagation conditions. For short range indoor propagation in the low GHz range, $n = 4$ is a common choice for the exponent [2, 33], suggesting that a link margin on the order of 90dB is sufficient. By (2.14), the fundamental minimum achievable energy per bit for the system with 90dB link margin (at $T = 300\text{K}$) would be about $3\text{pJ}$.
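The numbers quoted above can be checked with a short calculation. The sketch evaluates (4.1) in dB for the 20m target with $n = 4$ and $r_0 = 1\text{m}$. Because (2.14) is not reproduced in this section, the energy bound is assumed here to take the form $kT\ln 2$ multiplied by the link loss; the function names are illustrative.

```python
import math

C_LIGHT = 299_792_458.0      # speed of light, m/s
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K

def path_loss_db(r_m, n, f_hz=2.4e9, r0_m=1.0):
    """Path loss of (4.1) in dB: free-space loss out to r0, exponent n beyond it."""
    wavelength = C_LIGHT / f_hz
    free_space = 20.0 * math.log10(4.0 * math.pi * r0_m / wavelength)
    beyond_r0 = 10.0 * n * math.log10(r_m / r0_m)
    return free_space + beyond_r0

def min_energy_per_bit_j(link_margin_db, temp_k=300.0):
    """Assumed form of (2.14): kT*ln(2) at the receiver, scaled by the link loss."""
    loss = 10.0 ** (link_margin_db / 10.0)
    return K_BOLTZMANN * temp_k * math.log(2.0) * loss

if __name__ == "__main__":
    print(f"Path loss at 20 m, n = 4: {path_loss_db(20.0, 4):.1f} dB")
    print(f"Minimum energy per bit at 90 dB: {min_energy_per_bit_j(90.0) * 1e12:.2f} pJ")
```

Under these assumptions the path loss comes out near 92dB and the energy bound near 2.9pJ, in line with the rounded figures of 90dB and 3pJ.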

4.2.2 Linearity and Interference

A channel bandwidth of less than 1MHz was chosen so as to simultaneously achieve sufficient data rate and allow for a dense network of closely spaced nodes on multiple channels within the 85MHz wide 2.4GHz ISM band. The biggest drawback of operating in the 2.4GHz ISM band is overcrowding with potentially high-power interferers, such as WiFi (802.11), Bluetooth (802.15.1), 802.15.4, and cordless phones. To maintain functionality in this crowded band, the radio must be able to avoid direct interference by adjusting its frequency and avoid intermodulation by having a high degree of linearity in the receiver front-end.

WiFi transmitters pose the most common threat because they are virtually pervasive and have relatively high output power of up to $+20\text{dBm}$. Since it is likely that sensor nodes will be placed in close proximity to WiFi radios, the goal for this system was to maintain functionality when placed within 3m of a WiFi transmitter. Applying (4.1), with a conservative choice of 2 for the exponent, implies that interfering WiFi signals as large as $-20\text{dBm}$ can be anticipated at the receiver input. Hence, the receiver’s 1dB input compression point should at least be better than $-20\text{dBm}$. 

Nonetheless, blocking signals as large as -20dBm will still cause receiver sensitivity to be limited by distortion rather than thermal noise. The radio must be able to avoid active WiFi channel frequencies for any possibility of communication to exist, but even then, large signals on nearby channels can still corrupt data transmission via second and third order nonlinearities in the receiver. A generic amplitude and phase modulated interfering signal $I(t)$ is expressed mathematically below.

$$I(t) = A(t) \cdot \cos \left( 2\pi f_0 t + \theta(t) \right)$$  \hspace{1cm} (4.2)

Second order nonlinearities will generate an unwanted baseband signal in the presence of two-tone or single-tone interference. If two interfering tones are present, second order intermodulation will generate a tone at $|f_1 - f_2|$. The receiver will also detect the amplitude envelope of a single tone interferer and downconvert the envelope directly to 0 Hz where it will have twice its original bandwidth due to the squaring operation. Squaring (4.2) leads to a zero-frequency term and a term at $2f_0$.

$$\left[ I(t) \right]^2 = \frac{1}{2} A(t)^2 + \frac{1}{2} A(t)^2 \cdot \cos \left( 4\pi f_0 t + 2\theta(t) \right)$$  \hspace{1cm} (4.3)

The $2f_0$ term can easily be removed with proper filtering, but the zero-frequency term may fall directly on top of the desired baseband signal. When interfering signals are not amplitude modulated or their AM spectrum is narrow, the zero-frequency term can be avoided by using a low-IF receiver topology. WiFi interferers have a relatively wide AM spectrum, so using a low-IF topology to avoid second-order interference effects requires a relatively high IF. For a given interfering signal power level at the input, the second order input intercept point ($IIP2$) can be used to calculate the resulting magnitude of the IM2 products when referred to the input (all quantities expressed in dB).
\[ P_{\text{IM2}} = 2 \cdot P_{\text{RF}} - IIP2 \]  \hspace{1cm} (4.4)

Third order interference effects will occur in the presence of two high-powered interferers and, if the interferers are at appropriate frequency offsets (i.e. \( \Delta f \) and \( 2\Delta f \)), the resulting IM3 component will fall right on top of the desired signal. The third order input intercept point (IIP3) can be used to calculate the magnitude of the IM3 products for a given 2-tone interfering signal.

\[ P_{\text{IM3}} = 3 \cdot P_{\text{RF}} - 2 \cdot IIP3 \]  \hspace{1cm} (4.5)

As an example, assume the maximum interference level is -20dBm and the receiver sensitivity must remain better than -60dBm in the presence of said interferer with 10dB SNR required for demodulation. Thus, the sum of all noise and distortion must be less than -70dBm. Assuming the distortion products dominate over thermal noise at this power level, the resulting specs for IIP2 and IIP3 follow from (4.4) and (4.5) as +30dBm and +5dBm, respectively. If the actual intercept points fall below these, somewhat arbitrary, specs, then the receiver sensitivity will be further degraded, limited by distortion. Assuming the high-powered interferer does not cause significant gain compression, the system will remain functional with a sensitivity limited by the largest intermodulation product.
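The arithmetic of this example can be reproduced by solving (4.4) and (4.5) for the intercept points. The helper names below are hypothetical, and the calculation assumes, as in the text, that a single distortion product sets the noise-plus-distortion floor.

```python
def required_iip2_dbm(p_int_dbm, p_im_max_dbm):
    """Solve (4.4), P_IM2 = 2*P_RF - IIP2, for the minimum IIP2 in dBm."""
    return 2.0 * p_int_dbm - p_im_max_dbm

def required_iip3_dbm(p_int_dbm, p_im_max_dbm):
    """Solve (4.5), P_IM3 = 3*P_RF - 2*IIP3, for the minimum IIP3 in dBm."""
    return (3.0 * p_int_dbm - p_im_max_dbm) / 2.0

if __name__ == "__main__":
    interferer_dbm = -20.0   # maximum interference level from the example
    sensitivity_dbm = -60.0  # sensitivity to be maintained
    snr_db = 10.0            # SNR required for demodulation
    floor_dbm = sensitivity_dbm - snr_db  # allowed noise plus distortion
    print(required_iip2_dbm(interferer_dbm, floor_dbm))
    print(required_iip3_dbm(interferer_dbm, floor_dbm))
```

With a -20dBm interferer and a -70dBm floor, the two functions return the +30dBm and +5dBm targets stated above.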

The intermodulation targets calculated in this example are quite stringent compared to the specs of 802.15.4, 802.15.1 (Bluetooth) and even 802.11 itself, which only require an IIP3 of about -32dBm, -20dBm, and -29dBm, respectively [24-26, 35, 36]. None of these specifications directly stipulates an IIP2 requirement, so the IIP2 spec must be calculated from an adjacent channel blocking test. A more in-depth analysis of general receiver requirements for coexistence with 802.11 and Bluetooth systems is conducted in [36], where the authors suggested -11dBm as an $IIP3$ spec.

4.3 Transceiver Architecture

A simplified block diagram of the transceiver is shown in Figure 20 [4]. The process chosen for this design was standard 130nm, bulk CMOS with high-density MIM capacitors and a thick top metal layer for high quality on-chip inductors. The system operates from a nominal supply voltage of 400mV to accommodate a single solar cell as the power supply. In sunlight the entire transceiver could operate continuously from a 2.6mmx2.6mm solar cell [37].

The ultra-low supply voltage allows for dramatic reductions in the power consumption of digital functions and the threshold of 130nm CMOS transistors is low enough to achieve low-frequency analog amplification with the aid of forward body biasing. However, this leaves very little headroom for the analog amplifiers, so it is important to have a relatively low gain prior to the baseband stages and to filter out wideband interference early in the chain.

Figure 20. Simplified block diagram of the 2.4GHz prototype transceiver.

To help alleviate the stringent headroom and front-end linearity constraints of this system, a passive receiver front-end topology was devised and all active RF circuits in the transceiver utilize symmetric, center-tapped inductive loads to double available headroom by swinging about the supply rail. This passive front-end topology is well suited to low voltage applications because it is highly linear and has only moderate RF gain, yet still maintains good noise performance. The PA and mixers are driven directly from the high Q LC tank of the VCO without buffering both to save power and to improve performance by making use of the doubled output swing.

As mentioned in chapter 3, one danger of directly driving the mixer switches from the VCO is the potential for increased sensitivity to LO pulling. Large input signals can couple into the VCO tank through the gate capacitance of the mixer switches. If the frequency offset between the interferer and the VCO is very small, the coupling will be exacerbated and can cause substantial phase distortion or even injection locking [22]. The symmetry of the passive mixer and VCO causes these interferers to appear as common mode additive charge in the VCO tank, lessening the amount of disturbance. Nonetheless, in experiments, interferers larger than -35dBm with a frequency offset less than 5MHz did cause substantial LO pulling.

The modulation scheme chosen for the transceiver was 2-FSK with a modulation index greater than one. As discussed in chapter 3, 2-FSK allows for a simple and efficient system implementation and reasonable $E_b/N_0$ performance. Increasing the modulation index (i.e. increasing the frequency separation between a 1 and 0) further relaxes phase error tolerance, which helps alleviate LO pulling issues in the receiver and makes direct VCO modulation possible in the transmitter without need for a complex PLL. The relatively large FSK tone separation chosen for this system stands in contrast to the modulation schemes employed in Bluetooth and 802.15.4 (assuming O-QPSK is implemented as MSK) [25, 26] in which tone separation of only 1/3 to 1/2 the bitrate is used to minimize occupied bandwidth. Sacrificing spectral efficiency for reduced system complexity is particularly favorable for wireless sensor network applications wherein data rates below 1Mbps are the norm and 85MHz of unlicensed spectrum is available in the 2.4GHz ISM band.

Modulation in the transmitter is accomplished by adjusting the VCO tank center frequency via digitally switched capacitors. For operation in the field, a digital frequency centering loop is needed to set the VCO frequency and hold it steady. Unfortunately, due to time constraints, this loop was not implemented and hand-tuning was required for testing. The receiver downconverts the 2-FSK signal either to DC or to a low-IF depending on its mode of operation. The signal is subsequently filtered and limited so the receiver provides a 1-bit quantized output for demodulation.

A single LC matching network is used for both the receiver and transmitter, making a front-end switch unnecessary and reducing inductor count. This network is a differential tapped capacitor resonant transformer (Figure 20, upper left). The purpose of the LC network is to boost the PA load impedance in the transmitter and to achieve substantial passive voltage gain in the receiver front-end while presenting a large real impedance to the mixer input. In this design, the voltage gain from receiver input to the mixer output is greater than 15dB and the PA load impedance is boosted from 50Ω to about 1kΩ.

A reconfigurable PA/mixer topology was developed to minimize the parallel capacitance contributed by the front-end transistors by reducing transistor count. In essence, a single quad of transistors can be configured as a PA or passive mixer, depending on bias voltages and the states of a couple switches. Figure 21 illustrates this reconfigurable topology. Since both PA and mixer are driven directly from the VCO tank, this topology has an added benefit of substantially reducing capacitive loading on the VCO.

Figure 21. The reconfigurable PA/Mixer front-end can be configured as a PA (left) or as a passive mixer (right), depending on applied bias voltages and the states of a few switches. This topology reduces capacitive loading on the VCO and input LC network by minimizing transistor count.
4.4 Receiver Design

In its low-power mode, the receiver uses a low-IF architecture, sacrificing image rejection in exchange for cutting power in half. The factor of two reduction in power is due to the fact that only the in-phase VCO signal and baseband chains are needed while the quadrature circuits are disabled. The downside of this low power mode is that eliminating image rejection reduces SNR by 3dB and makes the receiver vulnerable to interfering signals at both positive and negative offsets of \( \Delta f_{IF} \) from the VCO.

When the system requires maximum link margin and minimum susceptibility to interference, a back-gate coupled quadrature VCO generates I & Q signals (Figure 22) [38] and two matching BB chains can be activated to enable full-quadrature downconversion. In this mode, the receiver uses direct-conversion, achieving DC suppression via the bandpass response of the baseband filters. The back-gate coupled QVCO architecture was used here because it produces quadrature outputs without the additional current requirements of coupling transistors, while the cross-coupled NMOS only topology was chosen to maximize the available VCO swing from the 400mV supply.
Figure 22. The quadrature coupled VCO used in this design takes advantage of the body effect to injection lock two VCOs together with 90 degree phase separation, thereby avoiding the power and noise penalty of explicit coupling transistors [38].

At the receiver input, the integrated passive LC network achieves impedance matching and voltage gain. The output of this network connects directly to double balanced passive mixers. Hence, the receiver front-end is entirely passive, making use of the LC network for RF voltage gain and passive mixers for downconversion. A programmable capacitor at the passive mixer output places a low-pass RC corner at about 1MHz in order to filter out wideband interferers before they reach the baseband filters. The resistive element in this RC filter is set by the low frequency output resistance of the passive mixer which is dependent on the amplitude and DC level of the VCO signal driving the switches, so programmability was needed to account for this dependence. The mixer outputs differentially drive a bandpass filter comprising a pair of linearized CMOS inverters. The baseband filter outputs feed into a piecewise logarithmic RSSI that hard-limits the signal, providing a square voltage waveform for demodulation.

4.4.1 LC Input Network

The tapped capacitor resonator presents both series and parallel resonant modes to the receiver input port. Impedance matching from the input port to a real impedance at the output is achieved at the parallel resonance. On-chip resonant networks have typically not been used to achieve large passive voltage gain because the noise contributions of non-ideal passive components increase with gain. However, as the achievable Q of integrated inductors rises, this noise-gain tradeoff improves. This section examines the effect of finite inductor Q on the matching, noise, and voltage gain of the network.

The simplified RLC network shown in Figure 23 is used in the following analysis. Note that the source driving the RF port has magnitude $2V_i$ to account for the voltage dropped across the source resistance $R_S$. If the input impedance of the network is matched to $R_S$, then $V_o = V_i$. There is some parasitic capacitance in parallel with the inductor, due to both finite inductor self resonance frequency (SRF) and the transistors of the PA and mixer which attach directly to the network, but it is neglected in the following analysis for simplicity.
To design this network for maximum voltage gain, the first step is to select the inductor with the highest $LQ$ product at 2.4GHz with sufficiently high SRF and reasonable die area consumption. For integrated inductors, the value and $Q$ for the inductor are constrained by parameters of the IC process, such as the metal layer material conductance and thickness, the substrate conductance, and the distance between the top metal layer and the substrate. Once the inductor is chosen, the values of $C_1$ and $C_2$ must be selected to achieve the appropriate resonant frequency, gain, matching, and noise figure.
At its output, the matching network appears as a simple parallel LC tank with a lossy capacitor and inductor. The lossy capacitor consists of elements $C_1$, $C_2$, and source resistance $R_S$ and its effective $Q$ ($Q_C$) is set by $R_S$ and the ratio of capacitors $C_2$ and $C_1$.

$$Q_C = \omega R_S \frac{C_2}{C_1} \left( C_2 + C_1 \right) + \frac{1}{\omega R_S C_1}$$  \hspace{1cm} (4.6)

$Q_C$ may assume a wide range of values, permitting design flexibility. The overall network $Q$ at the parallel resonance is a parallel combination of the inductor $Q$ ($Q_L$) and $Q_C$. The output impedance of the network at resonance is real and its magnitude is:

$$R_o \bigg|_{\omega=\omega_0} = \omega L \frac{Q_C Q_L}{Q_C + Q_L}$$  \hspace{1cm} (4.7)
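Equations (4.6) and (4.7) can be evaluated directly for the component values used in the figures of this section ($L = 10\,\text{nH}$, $C_1 = 500\,\text{fF}$, $C_2 = 3.15\,\text{pF}$, $Q_L = 18$). The sketch below assumes $R_S = 50\,\Omega$ and an operating frequency of exactly 2.4GHz; the function names are illustrative.

```python
import math

def q_c(f_hz, r_s, c1, c2):
    """Effective Q of the lossy capacitor branch, equation (4.6)."""
    w = 2.0 * math.pi * f_hz
    return w * r_s * (c2 / c1) * (c1 + c2) + 1.0 / (w * r_s * c1)

def r_out(f_hz, inductance, q_cap, q_ind):
    """Output resistance at the parallel resonance, equation (4.7)."""
    w = 2.0 * math.pi * f_hz
    return w * inductance * q_cap * q_ind / (q_cap + q_ind)

if __name__ == "__main__":
    f0 = 2.4e9       # assumed operating frequency, Hz
    r_s = 50.0       # assumed source resistance, ohms
    qc = q_c(f0, r_s, 500e-15, 3.15e-12)
    print(f"Q_C = {qc:.1f}")
    print(f"R_o = {r_out(f0, 10e-9, qc, 18.0):.0f} ohms")
```

For these values $Q_C$ evaluates to roughly 20, close to the assumed $Q_L$ of 18.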

In order to resonate at the right frequency, the composite RC network must present an imaginary impedance of equal magnitude and opposite sign to the inductor at 2.4GHz. For a given value of $C_1$, the imaginary part of the network impedance will have the same value if $C_2$ is zero or infinite, with a somewhat larger imaginary impedance at intermediate values of $C_2$. Hence, for extremely large or small values of $C_2$, the frequency is set by $L$ and $C_1$, while intermediate values of $C_2$ will create a higher frequency resonance.

Figure 24. The center frequency of the tapped-capacitor resonator versus $C_2$. $L = 10\,\text{nH}$, $C_1 = 500\,\text{fF}$, and $Q_L = 18$.

Since $C_2$ connects to an off-chip antenna or RF input source, there is likely to be a substantial parasitic capacitance arising from chip and pc board pads and traces. This parasitic appears in parallel with $C_2$ and is difficult to predict. Figure 24 is a plot of the center frequency of a tapped capacitor resonant transformer, sweeping the value of $C_2$ while $L$ and $C_1$ are held constant. Notice that, for the given component values, the frequency starts and ends at the same point and its maximum deviation is less than 5%. The relative insensitivity of the center frequency to changes in $C_2$ is one of the major benefits of this network.
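The behavior of the center frequency at the extremes of $C_2$ can be reproduced with a simplified model. The sketch below treats the capacitor branch as $C_1$ in series with the parallel combination of $R_S$ and $C_2$, neglects inductor loss and all parasitics, and assumes $R_S = 50\,\Omega$, so its numbers need not match Figure 24. The function names and the bisection bracket are choices made for this example.

```python
import math

def capacitor_reactance(w, r_s, c1, c2):
    """Reactance magnitude of C1 in series with (R_S in parallel with C2)."""
    x = w * r_s * c2
    return w * r_s ** 2 * c2 / (1.0 + x * x) + 1.0 / (w * c1)

def center_frequency(inductance, c1, c2, r_s=50.0, f_lo=1e9, f_hi=5e9):
    """Bisect for the frequency where w*L equals the capacitive reactance."""
    def mismatch(f):
        w = 2.0 * math.pi * f
        return w * inductance - capacitor_reactance(w, r_s, c1, c2)
    for _ in range(60):
        f_mid = 0.5 * (f_lo + f_hi)
        if mismatch(f_mid) > 0.0:
            f_hi = f_mid   # inductive reactance dominates: resonance is lower
        else:
            f_lo = f_mid   # capacitive reactance dominates: resonance is higher
    return 0.5 * (f_lo + f_hi)

if __name__ == "__main__":
    for c2 in (1e-15, 1e-12, 3.15e-12, 10e-12, 1e-9):
        f0 = center_frequency(10e-9, 500e-15, c2)
        print(f"C2 = {c2:.2e} F -> f0 = {f0 / 1e9:.3f} GHz")
```

For very small and very large $C_2$ the result converges to $1/(2\pi\sqrt{LC_1})$, while intermediate values give a higher resonant frequency.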
Noise at the network output is contributed by both $R_S$ and $R_L$. For best noise factor, the proportion of the total output noise due to $R_L$ should be minimized. Given $Q_L$, the value of $Q_C$ can be selected to get the best gain versus noise tradeoff from the network. If $Q_C$ is very large, then the overall network Q will be maximized (limited by the inductor) and most noise at the output will come from $R_L$, leading to high noise factor. On the other hand, the network has the lowest noise factor when $Q_C$ is much smaller than $Q_L$ because then the losses and output noise are dominated by $R_S$.

To quantify the relationships between $L$, $Q_L$, and $Q_C$ the first step is to determine the voltage gain of a noise voltage source at both $R_L$ and $R_S$ to the output, denoted $A_{VL}$ and $A_{VS}$, respectively:

$$|A_{VL}(\omega_o)| = \frac{2Q_C Q_L}{Q_C + Q_L}$$  \hspace{1cm} (4.8)

$$|A_{VS}(\omega_o)| = \frac{2Q_c Q_L}{Q_C + Q_L} \sqrt{\frac{\omega_o L}{R_S Q_C}}$$  \hspace{1cm} (4.9)

Therefore, the noise factor ($F$) of the network at resonance becomes:

$$F|_{\omega=\omega_o} = 1 + \frac{R_L}{R_S} \left( \frac{|A_{VL}(\omega_o)|}{|A_{VS}(\omega_o)|} \right)^2 = 1 + \frac{Q_C}{Q_L}$$  \hspace{1cm} (4.10)

The maximum voltage gain is achieved when the source impedance is perfectly matched to $R_L$. This is an intuitive result because all power delivered to the network must be dissipated in $R_L$ and the output voltage is largest when the current through $R_L$ is maximum. Matching occurs when $Q_L$ and $Q_C$ are equal. Thus, from (4.10), the noise factor is 2 ($NF=3\text{dB}$) when matched. The voltage gain of the network when matched is:

\[ |A_{VS}(\omega_o)|_{\text{max}} = \frac{2\omega_o L}{\sqrt{R_S R_L}} \quad (4.11) \]

Figure 25. Top: Voltage gain (Av in V/V) and Noise figure (NF in dB) of the tapped capacitor plotted against $C_2$. Bottom: $S_{11}$ in dB. $L = 10\,\text{nH}$, $C_1 = 500\,\text{fF}$, and $Q_L = 18$. For $C_2 = 3.15\,\text{pF}$, near-ideal matching occurs and the network has its maximum gain and a 3dB noise figure.

Figure 25 is a plot from a simulation of the voltage gain, noise factor, and $S_{11}$ of the LC network versus $C_2$. A 10nH inductor with $Q_L = 18$ was used for this simulation, based on a conservative estimate of the performance of the custom-designed differential coil used in the actual system. The value of $C_2$ (approximately 3.15pF) that results in near-ideal matching is indicated in the figure by the gray dashed line. At this value of $C_2$ the network achieves its maximum possible gain and a 3dB noise figure. Because the gain reaches only a shallow maximum, a smaller capacitance was used in order to trade matching and some voltage gain for improved noise figure.
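The gain versus noise tradeoff can be made concrete by evaluating (4.9) and (4.10) for a few values of $Q_C$. The sketch assumes $Q_L = 18$, $L = 10\,\text{nH}$ at 2.4GHz, and $R_S = 50\,\Omega$; the function names and the sampled $Q_C$ values are illustrative.

```python
import math

def noise_figure_db(q_cap, q_ind):
    """Noise figure at resonance from (4.10): F = 1 + Q_C / Q_L."""
    return 10.0 * math.log10(1.0 + q_cap / q_ind)

def source_gain(q_cap, q_ind, w0_l, r_s):
    """Voltage gain from the source to the output at resonance, per (4.9)."""
    q_parallel = q_cap * q_ind / (q_cap + q_ind)
    return 2.0 * q_parallel * math.sqrt(w0_l / (r_s * q_cap))

if __name__ == "__main__":
    q_l = 18.0
    w0_l = 2.0 * math.pi * 2.4e9 * 10e-9  # inductor reactance, ohms
    r_s = 50.0                            # assumed source resistance, ohms
    for q_cap in (6.0, 9.0, 18.0, 36.0):
        gain_db = 20.0 * math.log10(source_gain(q_cap, q_l, w0_l, r_s))
        nf_db = noise_figure_db(q_cap, q_l)
        print(f"Q_C = {q_cap:4.1f}: gain = {gain_db:5.2f} dB, NF = {nf_db:4.2f} dB")
```

In this model, halving $Q_C$ from the matched value costs only about half a dB of gain while lowering the noise figure by more than 1dB.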

4.4.2 Passive Mixers

At the output of the LC network, passive mixers downconvert the RF signal. The mixers must present a relatively high impedance to the matching network to avoid reducing its gain. In this section, the input impedance, conversion gain, and noise factor of a passive mixer will be related to the switch on-resistance and characteristics of the driving waveform. The circuit model for the passive mixer used in the following analysis is shown in Figure 26.
The NMOS switches in the passive mixer are driven directly from the VCO’s high-Q tank to save power and achieve sufficient gate drive from the low supply voltage. The driving signal is approximately sinusoidal and, since conductance of a MOSFET in the triode region is linear with $v_{gs}$, the resulting switch conductance waveform resembles a rectified sine wave. However, to simplify the following analysis, the conductance waveform is approximated by a pulse train with a variable duty cycle as in [39]. For sinusoidal drive, variation of the conductance duty cycle is realized by varying the DC level of the driving waveform relative to the switch threshold (Figure 27). The accuracy of this approximation will be discussed later in this chapter.

The conversion gain of the passive mixer at 0Hz offset can be derived by considering sampling a sinusoid that is perfectly in-phase with the switch conductance waveform $g_{sw}(\theta)$. The output voltage is simply the average of the input voltage while the switch is conducting. For calculating the gain near 0Hz, the switch resistance can be ignored since the output voltage will have an infinite number of cycles to settle. The switch resistance and load capacitance determine the time constant for settling at the output, but do not affect 0Hz offset voltage gain.

![Figure 27. Top: Sinusoidal switch conductance waveform and pulsed approximation used in this analysis. Bottom: Mixing function corresponding to pulsed approximation.](image-url)

The output voltage is dependent on the phase of the input wave relative to $g_{sw}(\theta)$, reaching its maximum value when the input is in-phase and producing a zero output when the input is orthogonal (90° out of phase). Since the low frequency gain is determined by the peak value of the output, the gain can be calculated by ignoring the orthogonal wave and considering only the in-phase signal. The gain is expressed below:

$$G_{conv} = \frac{1}{4\pi D} \left( \int_{-\pi D}^{\pi D} \cos \theta \, d\theta - \int_{\pi - \pi D}^{\pi + \pi D} \cos \theta \, d\theta \right) = \frac{1}{2\pi D} \int_{-\pi D}^{\pi D} \cos \theta \, d\theta = \frac{\sin \pi D}{\pi D}$$  \hspace{1cm} (4.12)

The quantity $D$ is the conduction duty cycle, thus $D$ can assume values from 0 to 1. However, equation (4.12) is only valid for $0 < D < \frac{1}{2}$ because overlap in the conduction cycles of the switches must be avoided. For very small $D$, the gain approaches 1 and gain decreases monotonically to $2/\pi$ when $D = \frac{1}{2}$. As frequency offset increases, the gain will roll off due to the low-pass filter formed by $C_L$, $R_S$, and $R_{SW}$. The value of this output pole is:

$$\omega_p = \frac{1}{R_o C_L} = \frac{2D}{(R_s + R_{SW}) C_L}$$  \hspace{1cm} (4.13)

The mixer also has gain at odd harmonics of the switching frequency. The conversion gain for each odd harmonic is:

$$G_{conv}(n \cdot f_s) = \frac{\sin n\pi D}{n\pi D}, \text{ for odd } n$$  \hspace{1cm} (4.14)

With proper choice of $D$, the gain at a particular harmonic can be rejected. For instance, if $D = 1/3$, the gain at the 3rd harmonic is 0.
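The behaviour of (4.12) and (4.14) is easy to tabulate. The following minimal sketch evaluates the conversion gain at the fundamental and at the 3rd harmonic for a few duty cycles; the function name and the chosen duty cycles are illustrative and not taken from the original work.

```python
import math

def conversion_gain(duty, harmonic=1):
    """Passive-mixer conversion gain sin(n*pi*D)/(n*pi*D), per (4.12) and (4.14)."""
    if harmonic % 2 == 0:
        raise ValueError("(4.14) applies to odd harmonics only")
    x = harmonic * math.pi * duty
    return math.sin(x) / x

# Illustrative duty cycles within the valid range 0 < D <= 1/2.
for duty in (0.05, 0.25, 1 / 3, 0.5):
    g1 = conversion_gain(duty)
    g3 = conversion_gain(duty, harmonic=3)
    print(f"D = {duty:.3f}   G(fs) = {g1:.3f}   G(3fs) = {g3:+.3f}")
```

The table this prints shows the gain at the fundamental falling from nearly 1 toward $2/\pi$ as $D$ approaches $\frac{1}{2}$, and the 3rd-harmonic gain passing through zero at $D = 1/3$.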
Next, the input impedance of the passive mixer is calculated for signals both at 0Hz offset and at large frequency offset from the LO. Consider the case of an input sinusoid at 0Hz offset. If $C_L$ is sufficiently large that the mixer output pole ($\omega_p$) is at a much lower frequency than the RF input signal, then one can assume the mixer output voltage holds a quasi-static value over a single conduction period. Thus, the mixer output capacitor may be modeled as an ideal voltage source for this calculation with a DC value equal to the average of the input voltage during the sampling period. Since this circuit model contains no reactive components, the resulting input impedance is real. To calculate $R_{in}$, the power delivered from the source must be integrated to find the energy transferred over one complete period.

If the mixer input impedance were represented by a single resistor $R_{in}$, the average power delivered from the source would be:

$$P_{avg} = \frac{1}{T} \int_{-T/2}^{T/2} \frac{|V_{in}|^2}{R_s + R_{in}} \cos^2\left(\frac{2\pi t}{T}\right) dt = \frac{|V_{in}|^2}{2(R_s + R_{in})}$$

(4.15)

Using the assumptions described in the previous paragraph, the actual average power delivered from the source to the mixer is shown below. Note that this calculation must consider the average of both in-phase and orthogonal inputs because the power delivered, even when normalized by the period, is dependent on the phase relationship of the input signal to the mixing function.

$$P_{avg} = \frac{|V_{in}|^2}{R_s + R_{SW}} \cdot \frac{1}{2\pi} \cdot \frac{1}{2} \left[2 \int_{-\pi D}^{\pi D} \left(\cos \theta - \frac{\sin \pi D}{\pi D}\right)^2 d\theta + 2 \int_{-\pi D}^{\pi D} (\sin \theta - 0)^2 \, d\theta\right]$$

(4.16)
Hence, the equivalent input resistance at 0Hz offset follows by evaluating (4.16), setting it equal to (4.15), and solving for $R_{in}$ as a function of $R_{SW}$, $R_S$, and $D$. The result is shown below.

$$R_{in\mid\Delta f=0} = \frac{R_{SW} + R_S}{2D\left(1-\left(\frac{\sin\pi D}{\pi D}\right)^2\right)} - R_S \quad 0 \leq D \leq \frac{1}{2} \quad (4.17)$$

Now consider the input impedance for the case of an input signal whose frequency offset is much greater than $\omega_p$ and thus, the input signal has nearly zero gain to the output.

Again, it is assumed $C_L$ is large enough that the mixer output voltage is quasi static over one cycle but here, due to attenuation from the mixer output pole at the large frequency offset, it is further assumed that the output voltage is independent of the input signal. In this case, the mixer output is modeled as a short to ground and power is integrated to find the energy transferred from the source during one period:

$$P_{avg} = \frac{|V_{in}|^2}{R_S + R_{SW}} \cdot \frac{1}{2\pi} \cdot \frac{1}{2} \left( 2 \int_{-\pi D}^{\pi D} (\cos \theta)^2 \, d\theta + 2 \int_{-\pi D}^{\pi D} (\sin \theta)^2 \, d\theta \right) \quad (4.18)$$

The input impedance far from the carrier frequency is real and its value is:

$$R_{in\mid\Delta f \gg \omega_p} = \frac{R_{SW} + R_S}{2D} - R_S \quad 0 \leq D \leq \frac{1}{2} \quad (4.19)$$

If $D = \frac{1}{2}$, then the input impedance at large frequency offsets is just $R_{SW}$. The input impedance for a quadrature passive mixer can be calculated in a similar fashion and the results are shown below:
\[
R_{\text{in}} \big|_{\Delta f = 0} = \frac{R_{\text{sw}} + R_s}{4D \left(1 - \left(\frac{\sin \pi D}{\pi D}\right)^2\right)} - R_s \quad 0 \leq D \leq \frac{1}{4} \tag{4.20}
\]

\[
R_{\text{in}} \big|_{\Delta f \gg \omega_p} = \frac{R_{\text{sw}} + R_s}{4D} - R_s \quad 0 \leq D \leq \frac{1}{4} \tag{4.21}
\]

Figure 28 is a simulation of the mixer input impedance against frequency offset for a quadrature mixer with \( D = \frac{1}{4} \). The input impedance closely resembles that of a very high Q parallel LC tank with center frequency set by the VCO and 3dB bandwidth equal to twice the mixer output pole frequency from (4.13). Thus, the passive mixer can be designed to present a very low impedance to signals far from the carrier while remaining high impedance in a narrow band around the switching frequency. The result is that signals at small frequency offsets are passed through the mixer while those at large offsets are strongly attenuated not only at the mixer output, but at the mixer input as well. This filtering effect reduces the amplitude of wideband interferers so that they do not appear as large signals at the mixer input, improving wideband linearity.
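The two limiting input resistances, and the resulting attenuation at the mixer input, can be evaluated directly from (4.17) and (4.19) to (4.21). The sketch below does this for a quadrature mixer. The values $R_S = 50\,\Omega$ and $R_{SW} = 5\,\Omega$ are assumptions borrowed from the switch-model parameters quoted for Figure 30, and the function names are illustrative.

```python
import math

def mixer_input_resistance(r_s, r_sw, duty, phases=1, far_offset=False):
    """Equivalent input resistance of the passive mixer.

    phases=1 gives the single-phase results (4.17)/(4.19);
    phases=2 gives the quadrature results (4.20)/(4.21).
    far_offset=True models offsets well beyond the output pole,
    where the output no longer tracks the input.
    """
    g = math.sin(math.pi * duty) / (math.pi * duty)  # conversion gain, (4.12)
    held_fraction = 0.0 if far_offset else g ** 2
    return (r_s + r_sw) / (2 * phases * duty * (1.0 - held_fraction)) - r_s

def gain_to_mixer_input(r_s, r_in):
    """Resistive divider from the source to the mixer input node."""
    return r_in / (r_s + r_in)

R_S, R_SW, DUTY = 50.0, 5.0, 0.25  # assumed values, see lead-in

r_near = mixer_input_resistance(R_S, R_SW, DUTY, phases=2)
r_far = mixer_input_resistance(R_S, R_SW, DUTY, phases=2, far_offset=True)
rejection_db = 20 * math.log10(
    gain_to_mixer_input(R_S, r_near) / gain_to_mixer_input(R_S, r_far)
)
print(f"R_in near carrier = {r_near:.1f} ohm, far from carrier = {r_far:.1f} ohm")
print(f"attenuation of far-offset signals at the mixer input = {rejection_db:.1f} dB")
```

Under these assumptions the far-offset input resistance collapses to $R_{SW}$ while the near-carrier resistance is several times $R_S$, which is the impedance contrast that produces the filtering effect described above.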
The voltage gain from $V_i$ to the mixer input node $V_x$ ($A_{Vx}$) is easily derived from the voltage divider formed with $R_s$ and $R_{in}$. The interference rejection ratio ($IRR$) is the ratio of $A_{Vx}$ at 0Hz offset to $A_{Vx}$ at large offset. Maximum rejection is achieved for $D = \frac{1}{4}$ in the quadrature mixer and at $D = \frac{1}{2}$ for the single phase case. $IRR$ is 19.2dB under the conditions listed at the top of Figure 29.

\[
\max \{IRR\}_{I\&Q} = \left. \frac{A_{Vx}\big|_{\Delta f = 0}}{A_{Vx}\big|_{\Delta f \gg \omega_p}} \right|_{D = 1/4} = 1 + \frac{R_S}{R_{SW}} \left( \frac{8}{\pi^2} \right)
\]

(4.22)

Figure 29. Voltage gain from RF source (Vi) to the mixer input (Vx) and output (Vo). The passive mixer's sharp impedance profile attenuates wideband interferers at its input. The interference rejection ratio (IRR) is the ratio of the attenuation of large frequency offset signals to those with 0Hz offset.

Finally, the noise performance of a differential passive mixer with both single-phase and quadrature driving signals is analyzed below. As mentioned previously, the mixer downconverts both input signals and thermal noise near the switching frequency and all odd harmonics with a gain given in (4.14). Therefore the total mixer output noise power density at 0Hz is the infinite summation of the noise at all odd harmonics weighted by the gain at each harmonic:

\[
\frac{V_n^2}{\Delta f} = 4kT \left( R_s + R_{SW} \right) \sum_{m=-\infty}^{\infty} \left( \frac{\sin n \pi D}{n \pi D} \right)^2, \quad n = 2m+1
\]

(4.23)
The gain terms at each harmonic are closely related to the Fourier series of the mixing function \( m(\theta) \) (see Figure 27), which is:

\[
m(\theta) = \sum_{m=-\infty}^{\infty} 2D \, \frac{\sin n \pi D}{n \pi D} \, e^{j n \theta}, \quad n = 2m+1
\]

(4.24)

Therefore, the infinite summation can be computed using Parseval’s relation to find the sum of the noise power at all harmonics and thus the total mixer output noise.

\[
\frac{1}{2\pi} \int_{-\pi}^{\pi} |m(\theta)|^2 \, d\theta = 2D = \sum_{m=-\infty}^{\infty} \left( 2D \, \frac{\sin n \pi D}{n \pi D} \right)^2, \quad n = 2m+1 \quad \text{(Parseval)}
\]

(4.25)
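The Parseval identity in (4.25) can be checked numerically by summing the squared Fourier coefficients $2D \sin(n\pi D)/(n\pi D)$ of the mixing function over positive and negative odd harmonics. The sketch below truncates the infinite sum at a large harmonic number, which is an arbitrary choice made here for illustration.

```python
import math

def fourier_power_sum(duty, max_harmonic=200001):
    """Truncated sum of |c_n|^2 over odd n, counting both +n and -n."""
    total = 0.0
    for n in range(1, max_harmonic + 1, 2):
        x = n * math.pi * duty
        c_n = 2 * duty * math.sin(x) / x
        total += 2 * c_n ** 2  # coefficients at +n and -n have equal magnitude
    return total

for duty in (0.25, 0.375, 0.5):
    print(f"D = {duty:.3f}   sum = {fourier_power_sum(duty):.5f}   2D = {2 * duty:.5f}")
```

Dividing the converged sum by $(2D)^2$ gives $1/(2D)$ for the sum of squared harmonic gains, which is the factor that appears in the noise factor expressions that follow.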

From (4.23) and (4.25), the total noise density at the mixer output is:

\[
\frac{V_n^2}{\Delta f} = 4kT \left( R_s + R_{SW} \right) \sum_{m=-\infty}^{\infty} \left( \frac{\sin n \pi D}{n \pi D} \right)^2 = \frac{4kT \left( R_s + R_{SW} \right)}{2D}, \quad n = 2m+1
\]

(4.26)

Given the signal gain and total output noise, the output SNR and hence, the noise factor of the passive mixer follows:

\[
F = \left( \frac{R_s + R_{SW}}{R_s} \right) \frac{1}{2D} \left( \frac{\pi D}{\sin \pi D} \right)^2 \quad 0 \leq D \leq \frac{1}{2}
\]

(4.27)

Taking the derivative of (4.27) with respect to \( D \) and setting it to zero reveals that there exists an optimal \( D \) for a single-phase passive mixer that delivers the best noise factor:

\[
D_{opt} = \frac{\tan \pi D_{opt}}{2 \pi} \approx 0.375
\]

(4.28)
Setting $D = 0.375$ in (4.27) and evaluating the limit as $R_{sw}$ goes to zero, the minimum value of $F$ for the single phase mixer is 2.17 (or 3.36 dB).

For quadrature downconversion, both I & Q mixers are connected to the same input node and overlap in the switching waveforms of the two mixers must be avoided. Since overlap is avoided, the thermal noise at sampling instants of the I and Q mixers is uncorrelated, but the output signals are correlated. The noise factor for a quadrature passive mixer is:

$$F_{I\&Q} = \frac{R_s + R_{SW}}{R_s} \cdot \frac{1}{4D} \left( \frac{\pi D}{\sin \pi D} \right)^2 \quad 0 \leq D \leq \frac{1}{4}$$

(4.29)

The best case noise factor for the passive mixer with quadrature downconversion occurs at $D = \frac{1}{4}$ and its value is given below. Note that, as $R_{sw}$ tends to zero, the mixer noise factor is only $\pi^2/8$, or about 0.9dB.

$$F_{I\&Q} \Big|_{D=0.25} = \frac{R_s + R_{SW}}{R_s} \cdot \frac{\pi^2}{8}$$

(4.30)
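The noise factor expressions (4.27) and (4.29) differ only in the number of phases sharing the input node, so a single helper can evaluate both. The sketch below locates the single-phase optimum with a simple grid search and evaluates the quadrature case at $D = \frac{1}{4}$. The grid resolution and the 50 Ω source resistance are assumptions made for illustration.

```python
import math

def noise_factor(r_s, r_sw, duty, phases=1):
    """Passive-mixer noise factor: (4.27) for phases=1, (4.29) for phases=2."""
    shape = (math.pi * duty / math.sin(math.pi * duty)) ** 2
    return (r_s + r_sw) / r_s * shape / (2 * phases * duty)

def to_db(ratio):
    return 10 * math.log10(ratio)

R_S = 50.0  # assumed source resistance

# Single-phase mixer with an ideal switch: search 0 < D <= 1/2 for the minimum.
duties = [i / 10000 for i in range(1, 5001)]
d_best = min(duties, key=lambda d: noise_factor(R_S, 0.0, d))
f_single = noise_factor(R_S, 0.0, 0.375)
print(f"grid-search optimum D = {d_best:.3f}")
print(f"F at D = 0.375: {f_single:.2f} ({to_db(f_single):.2f} dB)")

# Quadrature mixer at its maximum non-overlapping duty cycle.
f_quad_ideal = noise_factor(R_S, 0.0, 0.25, phases=2)
f_quad_5ohm = noise_factor(R_S, 5.0, 0.25, phases=2)
print(f"quadrature, R_sw -> 0: {to_db(f_quad_ideal):.2f} dB")
print(f"quadrature, R_sw = 5 ohm: {to_db(f_quad_5ohm):.2f} dB")
```

The minimum of the single-phase curve is shallow, so the grid search settling slightly below 0.375 changes the noise factor only in the third decimal place.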

Figure 30 shows the noise figure and Figure 31 the voltage gain of a single-phase, double-balanced passive mixer, both using the circuit parameters shown at the top of Figure 30. The dashed line represents the theoretical results predicted by the analysis above, whereas the solid black and grey lines represent spectreRF simulations of bsim3 devices using square pulse drive and sinusoidal drive, respectively. The results for the switch-based model derived in this work assume $R_{sw}$ to be 5 Ω with a source resistance of 50 Ω. The W/L ratios of the bsim3 devices in the spectreRF simulations were chosen to achieve an on-resistance of 5 Ω at the peak of the driving waveform. The dc level of the sinusoidal driving wave was shifted approximately to the threshold voltage of the bsim3 devices, about 0.37V, in order to achieve a duty cycle of 0.5, and the zero-to-peak amplitude was chosen as 0.63V to match the maximum gate overdrive of the pulsed wave simulation.

**Passive Mixer Noise Figure**

- **Theory (Switch Model):** $R_s=50\Omega$, $R_{sw}=5\Omega$, $D = 0.5$
- **Bsim3 Square Pulse Drive:** $R_s=50\Omega$, $W/L = (96/0.13)$, $V_{on} = 1V$, $V_{off} = 0V$, $D = 0.5$
- **Bsim3 Sine Wave Drive:** $R_s=50\Omega$, $W/L = (96/0.13)$, $V_{0, pk} = 0.63V$, $V_{dc} = V_{th} = 0.37V$

![Graph showing Passive Mixer Noise Figure](image)

Figure 30. Simulation of passive mixer noise figure versus frequency for three cases: basic switch model theory, bsim3 models with ideal square pulse drive, and bsim3 with sinusoidal drive.
Intuitively, the noise figure is expected to be somewhat lower in the case of sinusoidal drive because the reduced harmonic content of the driving wave causes less noise folding. Similarly, the sinusoidally driven mixer has slightly higher gain because, though the drive voltage is greater than $v_{th}$ for a full half cycle, the effective sampling window is less than 0.5 because the slope of the sine wave is finite.

Since the models developed here have not dealt with device capacitance, the actual mixer performance deviates from theory as frequency increases. As expected, the voltage gain begins to drop off at higher frequencies as the device input capacitance begins to attenuate the incoming signal prior to mixing. The noise figure also increases because, as the gain from the input source is attenuated, a larger proportion of the total output noise is contributed by the switching devices themselves.

It is important to note that the simulation data shown above are intended to isolate and validate the theory describing the passive mixer performance only and, as the impact of the LC network is not included, these do not represent the component values used on the actual chip. In the real transceiver, the passive gain preceding the mixer boosted the effective source impedance substantially and therefore much smaller transistors were used.

Given the noise folding properties of the mixer, it is possible to improve upon its performance by filtering out noise from $R_s$ at odd harmonics of the switching frequency. The tapped-capacitor resonator at the input of the receiver discussed here does effectively remove source noise at odd harmonics. However, the thermal noise from the switching devices still comes through unfiltered at all odd harmonics and the LC network contributes its own noise, as previously discussed. In order to isolate the noise contribution of the LC network and the passive mixer when simulating the front-end, two versions of spectreRF simulations were performed: the first assumed the resistive losses in the inductor were noiseless and the second did not. Table 1 summarizes the noise and voltage gain simulation results for the passive front-end.
<table>
<thead>
<tr>
<th></th>
<th>NF Mixer</th>
<th>NF LC Network</th>
<th>\( A_V \), \( R_s \) to mixer output</th>
<th>VCO Drive Amplitude</th>
<th>W/L Mixer transistors</th>
</tr>
</thead>
<tbody>
<tr>
<td>Quadrature Front-end</td>
<td>3.1 dB</td>
<td>1.1 dB</td>
<td>16.2 dB</td>
<td>385 mV (zero-to-peak)</td>
<td>15/0.13</td>
</tr>
<tr>
<td>Single-phase Front-end</td>
<td>4.5 dB</td>
<td>0.9 dB</td>
<td>15.3 dB</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Table 1. SpectreRF simulation data for the passive front-end implemented on this transceiver.

4.4.3 Baseband Chain

The first baseband stage following the mixer is a bandpass filter with a lower cutoff below 100kHz and a programmable upper cutoff frequency between 300kHz and 1MHz. The schematic is shown in Figure 32. At very low frequencies, the amplifier has approximately unity gain because the input signal only appears at the NMOS gates and these devices drive diode-connected PMOS loads with similar \( g_m \). The combination of \( C_{AC} \) and the Miller-gain reduced \( R_d \) creates a high-pass filter with a corner well below 100kHz, passing the input signal to the gates of the PMOS devices in the passband. Thus, the current required to meet noise constraints is reduced because the amplifier utilizes the \( g_m \) of both N and P devices in the passband. The RC corner set by \( C_L \) and \( R_L \) defines the upper cutoff frequency.
Forward body biasing of the PMOS devices is used to set the common mode level of the first baseband stage at mid-rail (200mV). The output of this stage is DC-connected to the PMOS inputs of the subsequent stage which shares the same Nwell. The DC gain through the entire baseband chain is approximately 1. There are 4 baseband stages and a limiter which delivers a square waveform for demodulation at the output. The outputs of each baseband stage drive single transistor amplitude detectors and the currents of these detectors are summed to create a piecewise logarithmic RSSI signal [40] which gives a “linear in dB” estimate of the input from about -110 to -30dBm.
4.5 Transmitter

The goal for the transmitter was to achieve reasonable global efficiency with a low power output in the range of 100-500µW. The power output target was derived from a system level analysis of link margin and power consumption based on a generic transceiver model as discussed in chapter 3. To maintain high global efficiency at low power output, power hungry upconversion mixers and LO buffers must be eliminated. Furthermore, the severe voltage headroom constraint makes active upconversion mixers impractical. Since the large FSK tone separation of this transceiver’s modulation scheme is tolerant of moderate frequency and phase errors, a very simple transmit topology was used. In transmit mode, the VCO and PA are the only RF blocks consuming current and binary FSK is accomplished by directly modulating the VCO tank capacitance. A digital frequency locking loop (FLL) is necessary to select the channel frequency but, due to time constraints, it was not implemented on this chip. The simulated power estimate for a complete digital FLL similar to that reported in [11] is about 25µW.

Given the large output swing afforded by using a differential design and swinging above the supply rail, the PA could easily put out several milliwatts efficiently. To maximize the efficiency of this class-C PA at the selected power output, it is necessary to boost the load impedance so that the PA uses all available voltage headroom. The differential PA drives the same tapped-capacitor resonator analyzed for the receiver. The optimum load impedance for 300µW power output and 800mV zero-to-peak differential voltage is around 1kΩ.
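The 1kΩ figure can be reproduced with the standard relation between sinusoidal peak voltage and average power in a resistive load, $P = V_{pk}^2 / (2R)$. The short sketch below assumes a purely sinusoidal output across a purely resistive load, which is a simplification of the class-C stage.

```python
def load_resistance(v_peak, p_out):
    """Resistance that absorbs p_out from a sinusoid of zero-to-peak amplitude v_peak."""
    return v_peak ** 2 / (2 * p_out)

# 800 mV zero-to-peak differential swing and 300 uW output, as quoted in the text.
r_load = load_resistance(v_peak=0.8, p_out=300e-6)
print(f"optimum load resistance = {r_load:.0f} ohm")
```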
Equation (4.7) expresses the PA load impedance as a function of the component values in the LC network. Given the target impedance of 1kΩ at resonance, the maximum allowable size of the inductor is found by setting $C_2 = 0$ and $R_o = 1\,\text{k}\Omega$ in (4.31):

$$L_{max} = \frac{\sqrt{(R_s + R_L)R_o}}{\omega_o}$$  \hspace{1cm} (4.31)

The IC process used for this design can achieve maximum on-chip Q at 2.4GHz of 20, making $L_{max}$ about 14nH.

Just as losses in the inductor degrade the noise performance of the LC network in the receiver, the finite $Q_L$ effectively reduces the network efficiency in the transmitter because some of the PA output power is dissipated in $R_L$. The efficiency of the network is expressed below:

$$E_{LC} = \frac{P_{R_s}}{P_{R_s} + P_{R_L}} = \frac{Q_L}{Q_C + Q_L}$$  \hspace{1cm} (4.32)

The highest network efficiency is achieved when $C_2 = 0$, where $Q_C$ is minimized. The capacitor $C_2$ was made programmable to accommodate the different optimum capacitance values for the receiver and transmitter.

### 4.6 Results

The transceiver chip was measured on a 4-layer printed circuit board constructed from standard FR4 substrate material. Gold plated traces were used on the board to allow for direct wirebonding and the chip was secured to a gold pad on the PC board by means of conductive silver epoxy. Aluminum wirebonds connected directly from the aluminum pads of the chip to gold pads on the board. The chip-on-board setup was used primarily to minimize parasitic inductance and capacitance on the RF port. The chip was placed as close to the edge of the board as possible to keep RF traces short and cutouts were made in the board’s groundplane underneath the RF pads and traces to reduce capacitance.

Since the transceiver has a differential RF port, a surface-mounted chip balun was used to transform the signal to single-ended for connection to the RF source and spectrum analyzer. Test structures were included on the board to calibrate out losses in the balun, bondwires and board traces. Two baluns were placed in series with the differential ports connected by board traces and a pair of bondwires. The length and geometry of the traces on the test structures were made identical to those connecting the RF port of the chip to the balun and SMA connector. Figure 33 is a diagram of the test structures used for calibration.
A pair of 1.5V AA alkaline batteries was used to power the test board to provide a very low noise supply. Due to the system’s extremely low supply voltage, an op-amp based linear voltage regulator had to be built on the board because no off-the-shelf linear regulators could produce such a low voltage. A simplified schematic of the power supply generation circuitry is shown in Figure 34.
4.6.1 Receiver Measurements

The noise figure was measured by applying an RF input from a calibrated source and comparing the SNR at the receiver input to the SNR at baseband using a spectrum analyzer. Due to the low noise and gain of the passive front-end, as well as its relatively high output impedance, it was difficult to get an accurate noise figure measurement of the front-end alone. Instead, the signal was amplified as it passed through the baseband filters and then buffered to drive off-chip to the spectrum analyzer. At the end of the receive chain, the limiter quantized the signal and provided rail-to-rail swing regardless of the strength of the input signal. The 1-bit quantization step of the limiter converts amplitude noise from previous stages to phase noise and actually increases the $E_b/N_0$ necessary for demodulation by about 3dB [41]. In this design, the performance degradation due to coarse quantization was tolerated in exchange for hardware simplicity.
When in quadrature downconversion mode, the I & Q baseband chains provided separate outputs and the SNR of each was measured individually. A small degree of correlation between the I & Q signal paths was introduced into the system due to the imperfect 90 degree phase separation of the VCO driving signals and mismatch in the transistors of the two passive mixers. This phase error was measured at the mixer outputs by applying a -50dBm RF signal and measuring the phase relationship between the two outputs on a scope. The phase error at the receiver’s nominal operating point was less than 5 degrees on three different test boards.

Linearity measurements were taken with the receiver operating in single-phase downconversion mode and consuming 330µW. RF signals were applied at the receiver input and the differential mixer outputs were converted to single-ended with off-chip baluns and buffered with op-amps to drive the spectrum analyzer. The RF input power was swept from -34dBm to -14dBm in 2dB increments and the resulting fundamental and IM3 components were recorded. \( IIP2 \) was not measured for this receiver, in part because \( IIP2 \) is a strong function of matching and many samples would be needed to get an accurate figure. Hence, Monte-Carlo simulation data were used to estimate \( IIP2 \) at +40dBm. This simulation assumed \( \sigma_{Vt} = 30mV \) as the standard deviation for threshold mismatch between transistors in the mixer. The measurements and simulation results for \( IIP3 \) and the 1dB compression point are shown in Figure 35.
Figure 35. Simulation results and measurements for IIP3 and 1-dB compression point.
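For readers unfamiliar with the two-tone procedure, the sketch below shows the usual extrapolation from a fundamental and IM3 reading to $IIP3$, applied across the same -34dBm to -14dBm sweep in 2dB steps. The output levels are generated from an idealized model (fundamental slope of 1dB/dB, IM3 slope of 3dB/dB) with a hypothetical gain; they are not the measured data behind Figure 35, and the model $IIP3$ is simply set to the reported -7.5dBm.

```python
def iip3_dbm(p_in_dbm, p_fund_dbm, p_im3_dbm):
    """Input-referred third-order intercept from one two-tone reading."""
    return p_in_dbm + (p_fund_dbm - p_im3_dbm) / 2.0

def ideal_two_tone_outputs(p_in_dbm, gain_db, iip3):
    """Idealized small-signal output levels; real data deviates near compression."""
    p_fund = p_in_dbm + gain_db
    p_im3 = 3 * p_in_dbm - 2 * iip3 + gain_db
    return p_fund, p_im3

HYPOTHETICAL_GAIN_DB = 15.0   # placeholder, not a measured value
MODEL_IIP3_DBM = -7.5         # set equal to the reported IIP3 for illustration

sweep_dbm = list(range(-34, -12, 2))  # -34 dBm to -14 dBm in 2 dB steps
estimates = []
for p_in in sweep_dbm:
    p_fund, p_im3 = ideal_two_tone_outputs(p_in, HYPOTHETICAL_GAIN_DB, MODEL_IIP3_DBM)
    estimates.append(iip3_dbm(p_in, p_fund, p_im3))
    print(f"Pin = {p_in} dBm   fund = {p_fund:.1f} dBm   IM3 = {p_im3:.1f} dBm")
```

In a real measurement the estimate is usually taken from the low-power end of the sweep, where the 3dB/dB IM3 slope still holds.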

Figure 36 is a plot of measured receiver noise figure versus total power consumption. There are multiple data points at each receiver power level representing a sweep of the input signal frequency across the passband of the baseband filters. At very low power levels, the VCO has low amplitude swing and the baseband amplifiers have reduced bias current. As a result, both the input referred noise from the amplifiers and the effective $R_{sw}$ of the passive mixers are increased. The linearity of the receiver is also degraded due to the reduced VCO swing. However, as receiver power consumption increases above 300µW, the VCO swing approaches its maximum value of 400mV zero-to-peak, improving the mixer noise and linearity. The sharp jumps in the noise performance, appearing between 300 and 330µW in single-phase mode and between the 530 and 670µW measurement points in quadrature mode, result from coarse quantization in the VCO current control. Due to a design error in the VCO inductor, the Q of the tank was substantially lower than expected. Hence, all receiver measurements were taken at the two highest current settings of the 4-bit resistive DAC of the VCO.

The on-chip RSSI response was measured by applying an RF input signal and sweeping its power from -120dBm to -20dBm. The results are shown in Figure 37. Though the RSSI was not truly “linear in dB”, it did respond monotonically to input power and had a sensitivity range of about -110dBm to -30dBm.

Figure 36. Measured receiver noise figure versus power consumption.
The receiver achieved a noise figure of 7dB and an \( IIP3 \) of -7.5dBm at its nominal operating point while consuming 330\( \mu \)W. Current consumption in the receiver was fairly evenly split between the VCO and the baseband amplifiers. A 5.5dB noise figure was measured in quadrature downconversion mode. Overlap in the conduction cycles of the I & Q mixers was avoided in quadrature mode by biasing the mixer switches well below threshold. Thus, a full 3dB noise improvement was not seen in quadrature mode because the transistors were not switched on as strongly, leading to increased \( R_{sw} \). Note that all measurements in the receiver and transmitter are taken at a frequency slightly below 2.4GHz. The VCO was unable to reach the desired band because of a design error in the custom-made symmetric, center-tapped tank inductance. However, the VCO did have a wide tuning range of about 1.95GHz to 2.38GHz.
4.6.2 Transmitter Measurements

An unmodulated transmitter output spectrum is shown in Figure 38. The phase noise is estimated from this measurement as -106dBc/Hz at 1MHz offset. The relatively low output power of -8.2dBm seen in this measurement reflects the losses in the balun, cables, connectors, and board traces, totaling 2.6dB.
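Adding the 2.6dB of calibrated losses back to the analyzer reading gives the power delivered at the chip's RF port. The conversion is sketched below; it assumes all of the quoted loss sits between the chip and the analyzer.

```python
def dbm_to_microwatts(p_dbm):
    """Convert a power level in dBm to microwatts."""
    return 1000.0 * 10 ** (p_dbm / 10.0)

measured_dbm = -8.2   # spectrum analyzer reading from the text
path_loss_db = 2.6    # balun, cable, connector, and trace losses from the text

chip_output_dbm = measured_dbm + path_loss_db
chip_output_uw = dbm_to_microwatts(chip_output_dbm)
print(f"power at RF port = {chip_output_dbm:.1f} dBm ({chip_output_uw:.0f} uW)")
```

The corrected level falls within the 100-500µW target range stated at the start of Section 4.5.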

Figure 39 is a plot of global transmitter efficiency and PA efficiency as a function of power output. At 300μW power output, the PA is 45% efficient and the overall efficiency is 30%. The capacitor C2 is programmed to its minimum value to maximize the efficiency of the LC network as described above. The current necessary to provide maximum VCO swing is about 20% higher in the transmitter than in the receiver. This increase is due to a change in the VCO loading caused by the large signal swing at the PA output during transmission. The gate capacitance of the PA/mixer transistors forms a significant portion of the overall VCO load, and since the PA output is approximately 180° out of phase with the VCO driving signal, the effective value of $C_{gd}$ on each PA transistor is increased substantially when transmitting.
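The two efficiency figures quoted at 300μW imply a DC power split between the PA and the rest of the transmitter. The sketch below works this out, assuming each efficiency is defined as RF output power divided by the corresponding DC power.

```python
def dc_power_uw(p_out_uw, efficiency):
    """DC power needed to deliver p_out_uw at the given efficiency (0 to 1)."""
    return p_out_uw / efficiency

P_OUT_UW = 300.0
pa_dc_uw = dc_power_uw(P_OUT_UW, 0.45)      # PA alone
total_dc_uw = dc_power_uw(P_OUT_UW, 0.30)   # whole transmitter
overhead_uw = total_dc_uw - pa_dc_uw        # everything except the PA

print(f"PA DC power       = {pa_dc_uw:.0f} uW")
print(f"total TX DC power = {total_dc_uw:.0f} uW")
print(f"non-PA overhead   = {overhead_uw:.0f} uW")
```

Under that definition roughly one third of the transmit power budget goes to blocks other than the PA, which in this architecture is dominated by the VCO.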

Figure 39. Efficiency of PA and overall TX versus output power.
4.7 Conclusion and Comparisons

The chip was fabricated in a 130nm RF CMOS process and a die photo is shown in Figure 40. This front-end topology simultaneously achieves good noise figure and IIP3 at low voltage and low power. The measurement results for the transceiver are summarized in Table 2.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Condition</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<th colspan="3">Overall</th>
</tr>
<tr>
<td>Supply Voltage</td>
<td>Min/Typ/Max</td>
<td>360/400/600 mV</td>
</tr>
<tr>
<td>2-FSK Deviation</td>
<td>Min/Max</td>
<td>300/1000 kHz</td>
</tr>
<tr>
<th colspan="3">RX</th>
</tr>
<tr>
<td>Power Consumption</td>
<td>Min/Max</td>
<td>200/750 μW</td>
</tr>
<tr>
<td>Noise Figure</td>
<td>Min/Max</td>
<td>5.1/11.8 dB</td>
</tr>
<tr>
<td>IIP3</td>
<td>Typ</td>
<td>-7.5 dBm</td>
</tr>
<tr>
<th colspan="3">TX</th>
</tr>
<tr>
<td>Power Consumption</td>
<td>Min/Max</td>
<td>700/1120 μW</td>
</tr>
<tr>
<td>Output Power</td>
<td>Min/Max</td>
<td>140/320 μW</td>
</tr>
<tr>
<td>PA Efficiency</td>
<td>200 &lt; P &lt; 300 μW</td>
<td>&gt;44 %</td>
</tr>
<tr>
<th colspan="3">VCO</th>
</tr>
<tr>
<td>Power Consumption</td>
<td>Min<sub>I only</sub>/Max<sub>I+Q</sub></td>
<td>160/700 μW</td>
</tr>
<tr>
<td>Frequency</td>
<td>Min/Max</td>
<td>1.95/2.38 GHz</td>
</tr>
<tr>
<td>Quadrature Mismatch</td>
<td>Meas. Δφ at I&amp;Q Mixer Out</td>
<td>&lt;5°</td>
</tr>
<tr>
<td>Phase Noise @1MHz</td>
<td>@ P<sub>VCO</sub> = 270 μW</td>
<td>-106 dBc/Hz</td>
</tr>
</tbody>
</table>

Table 2. Summary of measured performance.
Despite its extremely low supply voltage and power consumption, the passive front-end of this receiver achieves noise and distortion performance far more than adequate to meet the specifications of Bluetooth, 802.15.4, and 802.11 [24-26, 35, 36]. However, the overall transceiver is not compatible with any of these specs due to the lack of modulation precision in the transmitter and the relatively unsophisticated baseband filtering and demodulation strategy in the receiver. The baseband filtering circuits were necessarily very simple due to the low supply voltage, while the modulation/demodulation strategy implemented here reflects a preference for hardware simplicity and low power over spectral efficiency.

Figure 41 provides a comparison of the energy efficiency ($\eta$) of this transceiver and reported results from a selection of commercial and academic radios [4, 5, 7, 9, 11-14, 42, 43]. The system energy efficiency ($\eta$) is simply the ratio of the ideal minimum energy per bit for a given system to the total energy per bit it consumes. Thus, $\eta$ has an ideal value of 1 (0dB) and, expressed in dB, becomes increasingly negative as efficiency degrades. As derived in chapter 2, $\eta$ can be expressed much more intuitively as the product of three terms.

$$\eta = \left(\frac{E_{b-TX}}{E_{b-Sys}} \right) \frac{1}{F} \left( \frac{\ln 2}{E_b/N_0} \right)$$

(4.33)

In Figure 41, the $\eta$ values for each system are listed along with a breakdown of efficiency degradation into three categories representing the constituent factors of $\eta$. None of these systems directly reported $E_b/N_0$, so it was inferred indirectly from link margin, noise factor, power output and bandwidth.
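Because (4.33) is a product, the three degradation terms simply add when expressed in dB. The sketch below shows that bookkeeping. The input values are placeholders chosen for illustration and are not the figures behind Figure 41; the first argument is assumed to be the ratio $E_{b-TX}/E_{b-Sys}$ as a fraction between 0 and 1.

```python
import math

def efficiency_breakdown_db(tx_energy_fraction, noise_figure_db, ebn0_db):
    """Return the three terms of (4.33) in dB, plus their sum."""
    tx_term = 10 * math.log10(tx_energy_fraction)    # E_b-TX / E_b-Sys
    noise_term = -noise_figure_db                    # 1 / F
    demod_term = 10 * math.log10(math.log(2)) - ebn0_db  # ln2 / (Eb/N0)
    return tx_term, noise_term, demod_term, tx_term + noise_term + demod_term

# Placeholder inputs for illustration only.
tx_term, noise_term, demod_term, eta_db = efficiency_breakdown_db(
    tx_energy_fraction=0.25, noise_figure_db=7.0, ebn0_db=15.0
)
print(f"TX term    = {tx_term:.1f} dB")
print(f"noise term = {noise_term:.1f} dB")
print(f"demod term = {demod_term:.1f} dB")
print(f"eta        = {eta_db:.1f} dB")
```

An ideal system, with all energy radiated, a noise factor of 1, and $E_b/N_0$ equal to $\ln 2$, evaluates to 0dB with this function.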
The radios are listed left-to-right in order of increasing link margin. In general, if bandwidth is held constant, systems with high link margin can be expected to achieve better energy efficiency because a large portion of the total energy budget is dedicated to the PA, reducing the impact of overhead power on global efficiency. Hence, the receivers in such systems can afford an increase in power consumption to reduce noise factor and reduce \((E_b/N_0)_{\text{min}}\) needed for demodulation by applying coding and using a higher precision ADC.

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Modulation Scheme</strong></td>
<td>GMSK</td>
<td>2-FSK</td>
<td>BPSK</td>
<td>OQPSK</td>
<td>64 QAM</td>
<td>OQPSK</td>
<td>OOK</td>
<td>2-FSK</td>
<td>2-FSK</td>
<td>GFSK</td>
</tr>
<tr>
<td><strong>Data Rate Kbps</strong></td>
<td>270</td>
<td>25</td>
<td>6000</td>
<td>250</td>
<td>54000</td>
<td>250</td>
<td>5</td>
<td>300</td>
<td>100</td>
<td>1000</td>
</tr>
<tr>
<td><strong>Total Power, TX+RX (mW)</strong></td>
<td>9130</td>
<td>32</td>
<td>3465</td>
<td>55</td>
<td>3465</td>
<td>58</td>
<td>1.6</td>
<td>1.3</td>
<td>2.4</td>
<td>119</td>
</tr>
<tr>
<td><strong>Link Margin (dB)</strong></td>
<td>142</td>
<td>121</td>
<td>115</td>
<td>105</td>
<td>95</td>
<td>94</td>
<td>93</td>
<td>92</td>
<td>88</td>
<td>85</td>
</tr>
</tbody>
</table>

Figure 41. Energy efficiency comparison of this work with selected commercial and academic transceivers.

The system efficiency for this work is -30dB, a figure that compares well to other radios with similar link margin. The focus here has been to develop a transceiver topology capable of relatively high transmit efficiency and low noise factor with very low power. Thus, although link margin is low, receiver noise factor and transmitter inefficiency claim only about 13dB of the total 30dB degradation. The bulk of the wasted energy, roughly the remaining 17dB, is due to the non-ideal modulation scheme and, more specifically, the simple, non-coherent 1-bit demodulation strategy. A substantial energy efficiency enhancement could be achieved in a future version of this system by performing coherent demodulation with a multi-bit ADC and utilizing forward error correction coding [1]. Combining the 31pJ/sample, 8-bit ADC in [30] with the rate-1/3 turbo code and power consumption data in [27] (scaled to 1V in 130nm CMOS), and accounting for the 3X chip-rate increase due to coding, suggests that a reduction of more than 10dB in $E_b/N_0$ could be achieved with an associated increase in energy consumption of only about 800 pJ/bit. Incorporation of more sophisticated, yet low-power, demodulation schemes is a logical next step for the field of extremely low energy wireless communication.
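The trade-off described above can be put in rough numbers using (4.33). The sketch below assumes that $E_{b-TX}$ and $F$ stay fixed, so that the only changes are the lower $E_b/N_0$ and the larger $E_{b-Sys}$. The baseline energy per bit is taken from the 300 kbps, 1.3 mW entry of Figure 41 purely for illustration; treating that entry as the baseline is an assumption, and the function name is hypothetical.

```python
import math


def net_efficiency_gain_db(base_energy_per_bit_j: float,
                           added_energy_per_bit_j: float,
                           ebn0_reduction_db: float) -> float:
    """Net change in eta (dB) when E_b/N_0 drops but energy per bit rises.

    Assumes E_b-TX and F are unchanged, so from (4.33) the gain is the
    E_b/N_0 reduction minus the dB increase in total energy per bit.
    """
    energy_penalty_db = 10.0 * math.log10(
        1.0 + added_energy_per_bit_j / base_energy_per_bit_j)
    return ebn0_reduction_db - energy_penalty_db


# Illustrative baseline (assumption): 1.3 mW total power at 300 kbps.
baseline_energy_per_bit_j = 1.3e-3 / 300e3   # about 4.3 nJ/bit
added_energy_per_bit_j = 800e-12             # about 800 pJ/bit, from the text
net_gain_db = net_efficiency_gain_db(
    baseline_energy_per_bit_j, added_energy_per_bit_j, 10.0)
```

Under these assumptions the added 800 pJ/bit costs less than 1dB, so most of the 10dB reduction in $E_b/N_0$ would carry through to $\eta$.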
Chapter 5

Conclusion

5.1 Research Summary

As the scope of market demand for wireless functionality grows, radios are no longer just going to link our phones and laptops to the outside world; they will be embedded into nearly every imaginable device we use. To meet the needs of this rapidly growing embedded wireless market, radios must be able to communicate reliably over relatively short distances and have a long lifetime without battery replacement or recharging.

Wireless communication is very often the most energy-expensive function of such devices, making a reduction in the energy/bit of the communication link crucial to extending their lifetime.

This thesis has shown that currently available commercial short-range radios targeting the embedded market are still several orders of magnitude from any fundamental lower limit on energy/bit. Chapter 2 explored the theoretical limits on the energy consumption of wireless communication and developed a metric for evaluating any wireless system’s energy efficiency in comparison with fundamental limits. Chapter 3 discussed system tradeoffs within RF transceivers and several practical design techniques aimed at improving efficiency. In chapter 4, the design of a 2.4GHz transceiver prototype achieving extremely low energy consumption was presented. A novel passive receiver front-end topology was developed to meet the unique requirements of this design and theory was developed to describe its performance.

As measurement and theory have shown in chapter 4, the passive mixer can achieve a surprisingly low noise figure (as low as 0.92dB as $R_{sw}$ tends to zero, per (4.30)) and, as one would expect, it offers excellent linearity. In this design, the passive mixer was preceded by passive resonant gain, thus saving power by increasing the effective source resistance and reducing both the size of the mixer transistors and the current needed in the subsequent baseband circuitry. The transistors of the mixer were driven directly from the resonant tank of the VCO to reduce power by leveraging the high Q of the tank and to maximize the drive signal amplitude by swinging above the supply rail.

5.2 Passive Techniques in Future Radios

Digital CMOS process features have already scaled down well below the 130nm technology employed in this design and supply voltages have dropped below 1V. As modern RF designers now face the serious challenge of maintaining a high degree of wideband linearity and low noise with the limitations of a low supply voltage, passive front-ends may become an attractive alternative to conventional front-end designs. The high-frequency performance and power consumption of passive mixers will improve with scaling in much the same way as standard digital logic circuits. The excellent noise performance of the passive mixer will make front-ends with no RF LNA, and perhaps without any passive gain, feasible for future radios. Placing all voltage amplification at low frequency (after passive downconversion), where it can be linearized with feedback, will allow systems to meet linearity specs at very low supply voltages.

Another benefit of the passive front-end, already discussed briefly in chapter 4, is the sharp filtering effect created by the input impedance of the mixer. When the mixer is connected to the antenna directly or through an LC network, its input impedance causes attenuation of wideband interference before it can enter the receiver front-end. This is analogous to having an extremely high Q LC tank in parallel with the antenna and tuned to the mixer switching frequency. The fundamental limit on the attenuation achieved by this mechanism is set by the ratio of the source resistance to the mixer switch resistance (4.22), making extreme rejection ratios possible, particularly at low frequencies.

However, at high frequencies, input parasitics and device capacitance will reduce the voltage gain of the mixer and thus degrade attenuation. Simulation data from a quadrature passive mixer ($D = 0.25$) directly connected to a 50 $\Omega$ source suggest that 15-20dB of rejection can be achieved at 2.4GHz in 130nm CMOS without a noise or voltage gain penalty. Fortunately, as device dimensions scale further downward, this filtering mechanism will become more effective because the resistance and capacitance of the mixer switches will continue to drop, enabling larger rejection ratios at higher frequencies and possibly supplanting SAW filters in many future radios.
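The resistance-ratio limit on interferer rejection can be illustrated with a simple voltage-divider model. This is a sketch under stated assumptions and not a reproduction of (4.22). It assumes that the mixer input looks like a very high impedance in band and collapses to roughly the switch resistance out of band, and it ignores the parasitics and device capacitance discussed above. The function name and the switch resistance values are hypothetical.

```python
import math


def ideal_rejection_db(source_resistance_ohm: float,
                       switch_resistance_ohm: float) -> float:
    """Upper bound on out-of-band rejection from a resistive divider.

    Assumed model: in band, the full source voltage appears at the
    mixer input; out of band, the input impedance falls to R_sw, so the
    interferer is attenuated by R_sw / (R_s + R_sw).
    """
    divider = switch_resistance_ohm / (source_resistance_ohm + switch_resistance_ohm)
    return -20.0 * math.log10(divider)


# Hypothetical switch resistances with a 50 ohm source.
rejection_5_ohm_db = ideal_rejection_db(50.0, 5.0)
rejection_10_ohm_db = ideal_rejection_db(50.0, 10.0)
```

In this simplified model a lower switch resistance always increases the bound, which is consistent with the expectation that the filtering improves as switch resistance drops with scaling.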


