# New RF Transmitter Techniques for RFID, Cellular, and mmW Applications



Nai-Chung Kuo, Ali Niknejad

# Electrical Engineering and Computer Sciences, University of California at Berkeley

Technical Report No. UCB/EECS-2019-158 http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-158.html

December 1, 2019

Copyright © 2019, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

Acknowledgement

I wish to thank my advisor, family, and friends.

# New RF Transmitter Techniques for RFID, Cellular, and mmW Applications

by

Nai-Chung Kuo

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy

in

Engineering - Electrical Engineering and Computer Sciences

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor Ali M. Niknejad, Chair

Professor Elad Alon

Professor Liwei Lin

Fall 2018

The dissertation of Nai-Chung Kuo, titled New RF Transmitter Techniques for RFID, Cellular, and mmW Applications, is approved:

University of California, Berkeley

# New RF Transmitter Techniques for RFID, Cellular, and mmW Applications

Copyright 2018 by Nai-Chung Kuo

#### Abstract

#### New RF Transmitter Techniques for RFID, Cellular, and mmW Applications

by

Nai-Chung Kuo

Doctor of Philosophy in Engineering - Electrical Engineering and Computer Sciences

University of California, Berkeley

Professor Ali M. Niknejad, Chair

A reader/transmitter (Tx) for an inductive-power-transfer (IPT) system must both power up the tag and listen to it. When a tiny tag is involved, the main challenges are the low power transfer efficiency (PTE) and the strong Tx-to-Rx interference. To improve PTE, an analytical approach for optimizing the Tx coil and the miniature rectenna has been developed. This equation-based approach optimizes the IPT design rapidly, locating the optimal IPT frequency and the coil geometries. An optimized 2.2-mm IPT uses an RF power of 33.1 dBm, and the designed 0.01-mm<sup>2</sup> CMOS rectenna harvests a dc power of 0.1 mW. To suppress the Tx-to-Rx leakage, a new T/Rx architecture has been invented in which a two-tone Tx simultaneously charges the tag and excites a third-order intermodulation (IM3) tone through the tag nonlinearity. The IM3 tone is modulated by the tag data and transmitted back to the reader Rx via the same coupled coils. The Tx/Rx frequency separation allows the Tx-to-Rx leakage to be filtered substantially. This technique was applied to IPT systems and far-field systems with a custom-designed tag, and was also demonstrated with a commercial UHF Gen2 tag. Multi-tone Tx has also been studied for further performance improvement.
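The power figures and the IM3 idea above can be made concrete with a few lines of arithmetic. The sketch below converts the quoted 33.1-dBm Tx power to milliwatts, forms the ratio of harvested dc power to Tx RF power, and computes the two IM3 products of a two-tone excitation. Two things here are assumptions and not taken from the dissertation: the ratio is treated as a simple end-to-end dc-to-RF power ratio (the dissertation may define PTE differently), and the tone frequencies `F1_HZ` and `F2_HZ` are hypothetical values chosen only for illustration.

```python
import math


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


def ratio_to_db(ratio: float) -> float:
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(ratio)


def im3_frequencies(f1_hz: int, f2_hz: int) -> tuple:
    """Return the (lower, upper) third-order intermodulation products
    of two tones, assuming f1_hz < f2_hz."""
    return 2 * f1_hz - f2_hz, 2 * f2_hz - f1_hz


# Figures quoted in the abstract.
tx_power_mw = dbm_to_mw(33.1)      # roughly 2 W of RF power
harvested_mw = 0.1                 # harvested dc power in mW

# Assumption: a plain dc-out over RF-in ratio, not the author's PTE definition.
dc_to_rf_ratio = harvested_mw / tx_power_mw

# Hypothetical two-tone frequencies, for illustration only.
F1_HZ, F2_HZ = 900_000_000, 910_000_000
im3_low_hz, im3_high_hz = im3_frequencies(F1_HZ, F2_HZ)

print(f"Tx power: {tx_power_mw:.1f} mW")
print(f"dc-to-RF ratio: {ratio_to_db(dc_to_rf_ratio):.1f} dB")
print(f"IM3 products: {im3_low_hz / 1e6:.0f} MHz and {im3_high_hz / 1e6:.0f} MHz")
```

Because each IM3 product sits one tone spacing away from the nearest Tx tone, a reader that listens at an IM3 frequency can filter out the Tx tones, which is the frequency separation the abstract relies on.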

Another key application of RF transmitters is cellular communication. We have fabricated an all-digital CMOS Tx core on three interposers with high output power, > 50% efficiency, and a collective bandwidth from 0.4 to 4 GHz. To achieve frequency reconfiguration within a single package, a band-selecting interposer was designed to combine three identical CMOS Tx cores. Band selection is carried out by reconfiguring the switching devices in the CMOS PAs. By rotating among the three sub-Txs, peak power higher than 23 dBm and efficiency better than 25% are achieved from 0.4 to 4 GHz. Finally, an E-band OOK-QPSK Tx element in 28-nm bulk CMOS is presented. The Tx element is a suitable building block for digitally-modulated phased arrays and high-speed communications. Employing the Tx elements, the implemented Watt-level-EIRP mmW digital array demonstrates good efficiency and the capability of synthesizing a pattern null in a given direction. The leakage-suppression technique exploits the combination redundancy as the array synthesizes the desired spatial symbols, so the conventional high-resolution elements are not required.
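For readers unfamiliar with pattern nulls, the sketch below shows the conventional way to place one: start from the weights that steer a uniform linear array toward the main direction, then subtract their projection onto the steering vector of the null direction. This is a textbook method that needs finely adjustable complex weights, that is, the high-resolution elements the abstract says are not required. It is not the combination-redundancy technique of this dissertation. The array size, half-wavelength spacing, and angles are assumptions made for illustration.

```python
import cmath
import math


def steering_vector(n: int, theta_deg: float, spacing_wavelengths: float = 0.5):
    """Per-element phase factors of an n-element uniform linear array
    for a plane wave at theta_deg from broadside."""
    psi = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * k * psi) for k in range(n)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def null_steered_weights(n: int, main_deg: float, null_deg: float):
    """Steer toward main_deg, then remove the component of the weights
    that lies along the null direction's steering vector."""
    a_main = steering_vector(n, main_deg)
    a_null = steering_vector(n, null_deg)
    coeff = inner(a_null, a_main) / inner(a_null, a_null)
    return [m - coeff * q for m, q in zip(a_main, a_null)]


def array_response(weights, theta_deg: float) -> float:
    """Magnitude of the array response w^H a(theta)."""
    return abs(inner(weights, steering_vector(len(weights), theta_deg)))


# Hypothetical example: 8 elements, beam at broadside, null at 20 degrees.
weights = null_steered_weights(8, 0.0, 20.0)
print(f"response at  0 deg: {array_response(weights, 0.0):.3f}")
print(f"response at 20 deg: {array_response(weights, 20.0):.2e}")
```

The projection costs only a small part of the main-beam response while forcing the response in the null direction to numerical zero.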

# Contents

| Section | Title |
|---------|-------|
|         | Contents |
|         | List of Figures |
|         | List of Tables |
| 1       | <b>INTRODUCTION</b> |
| 1.1     | Emerging Applications for Miniature RFID |
| 1.2     |  |
| 1.3     |  |
| 1.4     |  |
| 1.5     |  |
| 2       | <b>IPT Designs for a Miniature Rectenna</b> |
| 2.1     | IPT Design Example based on EM Simulation for a CMOS Tag with 0.1 mm Coil Size |
| 2.2     | Equation-Based Optimization for Inductive Power Transfer to a Miniature CMOS Rectenna |
| 2.3     | Equation in Weakly-Coupled IPT |
| 3       | <b>Uplink Designs for a Miniature Rectenna</b> |
| 3.1     | Backscattering Uplink |
| 3.2     | HD2 Uplink |
| 3.3     | Near-Field IM3 Uplink |
| 3.4     | Far-Field IM3 Uplink: Custom Tag |
| 3.5     | Far-Field IM3 Uplink: Commercial UHF Gen-2 Tag |
| 3.6     | IM3 Uplink with Three-Tone Tx for Enhanced Uplink Power |
| 4       | <b>Wideband and Efficient RF Transmitters</b> |
| 4.1     | All-Digital RF Transmitter System Overview |
| 4.2     | Design and Analysis of the Inverse Class-D PA |
| 4.3     | Design of the CMOS Phase Modulator |
| 4.4     | Interposer Designs |
| 4.5     | Measured Results |
| 4.6     | Design of the Band-Switching Interposer |
| 4.7     | Switch PA Reconfigured for Band Switching |
| 4.8     | Measured Results |
| 4.9     | Performance Comparison |
| 4.10    | Appendix |
| 5       | <b>mmW Constellation Formation Exploiting Combination Redundancy</b> |
| 5.1     | Leakage Suppression in Conventional Array |
| 5.2     | Combination Redundancy and Leakage Suppression in a Digitally-Modulated Phased Array |
| 5.3     | mmW Tx Element |
| 5.4     | Phased Array Implementation and Characterization |
| 5.5     | Measured Results on Leakage Suppression |
| 5.6     | Appendix |
| 6       | <b>Conclusion and Future Works</b> |

# List of Figures

| Figure | Caption |
|--------|---------|
| 1.1  | (a) SHIELD motivation and (b) the desired "probe" illustrated by DARPA. |
| 1.2  | Photograph of the miniature dielet developed by UC Berkeley team. |
| 1.3  | Simplified block diagram of the (a) conventional backscattering UL and (b) IM3 UL using the upper IM3 component. |
| 1.4  | Frequency allocation in the United States from 300 MHz to 3 GHz. |
| 1.5  | Conventional parallel path multi-band/multi-mode TRx and next-generation TRx concept based on adaptive RF function blocks. |
| 1.6  | Illustration of the realized single-band Tx packages and the single-output package. |
| 1.7  | Phased array spatial leakage suppression for reduced interference to a second device. |
| 2.1  | IPT schematic, design parameters, and die photographs for the 2 and 5 GHz designs in 65-nm CMOS technology. |
| 2.2  | Reader coil optimization. (a) Mutual inductance $M$. (b) Self resistance $R_{reader}$. (c) Optimized quantity $M^2 \omega^2 / R_{reader}$. |
| 2.3  | Tag inductor turns and width design. (a) Mutual inductance $M$. (b) Tag inductance $L_{tag}$. (c) Tag inductor quality factor $Q_{tag}$. |
| 2.4  | Measured tag output voltage for the 2/4.7 GHz tags versus (a) varactor bias and (b) reader power (under the optimal varactor bias). |
| 2.5  | Measured tag output voltage for the 2/4.7 GHz tags versus (a) varactor bias and (b) reader power (under the optimal varactor bias). |
| 2.6  | (a) The z-direction magnetic field induced by a uniform unit-current loop ($B_{z,dc,I=1}$). (b) $FOM_{Tx}$ under a low frequency. |
| 2.7  | Current distributions and mathematical models for (a) a single-ended (SE) coil and (b) a differential (DF) coil. |
| 2.8  | (a) $dB[B_{z,dc,I=1}^2 \sin(\beta l)^2/l^3]$: last term on the RHS of (2.12). (b) Last term on the RHS of (2.13): $dB[4B_{z,dc,I=1}^2 \sin(\beta l/2)^2/l^3]$. $(d, \epsilon_{eq}) = (1, 1)$. |
| 2.9  | (a) Maximum $dB[B_{z,dc,I=1}^2 \sin(\beta l)^2/l^3]$ and (b) maximum $dB[4B_{z,dc,I=1}^2 \sin(\beta l/2)^2/l^3]$ versus the operation frequency. $(d, \epsilon_{eq}) = (1, 1)$. |
| 2.10 | Optimized reader-coil $FOM_{Tx}$ for the SE coil configuration: (a) substrate thickness $h = 0.8 \text{ mm}$ (b) $h = 1.6 \text{ mm}$. |
| 2.11 | Optimized reader-coil $FOM_{Tx}$ for the DF coil configuration: (a) substrate thickness $h = 0.8 \text{ mm}$ (b) $h = 1.6 \text{ mm}$. |

| Figure | Caption |
|--------|---------|
| 2.12 | (a) Inductor reactance ($\omega L_{tag}$) and enclosed area ($A_{tag}$) and (b) inductor quality factor ($Q_{tag,L}$) with different rectenna coil designs. |
| 2.13 | Schematic and die photograph of the designed IPT and CMOS rectenna. |
| 2.14 | Rectenna $FOM_{Rx}$ with different rectenna coil designs. |
| 2.15 | (a) Photograph of the three PCB readers: SE1, SE2, and DF. (b) EM-simulated $FOM_{Tx}$ for the three reader coils. |
| 2.16 | EM-simulated input resistance, mutual inductance, and $FOM_{Tx}$ for the designed DF coil with and without the ground plane. |
| 2.17 | Measured input reflection coefficient (S11) for the three reader coil designs. |
| 2.18 | Measured rectenna output voltage versus the varactor bias for the three IPT designs. |
| 2.19 | Measured reflection powers at the two inputs of the DF reader coil versus the phase difference between the two inputs. |
| 2.20 | Optimized $FOM_{Tx}$ for the SE coil on 1.6-mm FR4 with coupling distance of 1.2 mm. (The results with coupling distance of 1.1 mm are also plotted for the next design example [22]). |
| 2.21 | Rectenna FOMs ($FOM_{Rx}$) with different rectenna coil designs with inductor size of 200 $\mu m \times 200 \mu m$. |
| 2.22 | Calculated $R$ quantities as functions of $L/\lambda$. The (closed-form) analytical approximations are annotated. |
| 2.23 |  |
| 2.24 | Schematic and photograph of the example IPT systems. |
| 2.25 | Simulated and measured output dc voltage versus input power for By2. |
| 2.26 | Performance comparison between $P_{x2}$ and $P_{y2}$. Only the power-up curves are shown. |
| 3.1  | (a) Block diagram for the backscattering measurement. (b) Output spectrum of the backscattering signal. (c) Demodulated baseband constellation using oscilloscope and square demodulation. |
| 3.2  | (a) Block diagram of the first IF-based backscattering uplink. (b) Down-converted Rx spectrum with a 20-kb/s square-wave uplink signal. (c) Demodulated constellation for a 20-kb/s PRBS uplink. |
| 3.3  | (a) Block diagram of the second IF-based backscattering uplink. (b) Down-converted Rx spectrum with a 40-kb/s square-wave uplink signal. (c) Constellation for a 10-kb/s PRBS uplink using square demodulation (top) and quadrature demodulation (bottom). |
| 3.4  | IPT block diagram with uplink employing (a) conventional direct backscattering uplink and (b) proposed SH uplink. |
| 3.5  | (a) Schematic for the SH uplink prototype. (b) Measured Rx SH power and noise (at 3.6 GHz) versus the varactor bias. |
| 3.6  | (a) Measured Rx spectrum for the SH uplink. (b) Measured Rx spectrum for the fundamental backscattering. |

| Figure | Caption |
|--------|---------|
| 3.7  | (a) Schematic and (b) die/PCB photographs of the designed CMOS rectenna and IPT in the IM3 uplink. |
| 3.8  | (a) Rectenna dc voltage versus the varactor bias for one-tone (CW) and two-tone excitation. (b) Simulated rectifier ac swing ($V_{rect}$), output voltage ($V_{out}$), and the tag-generated IM3 currents ($I_{IM3,rect}$) and Rx IM3 power at 5.808 GHz (IM3 freq.). |
| 3.9  | System illustration of the proposed IM3 uplink embedded in the two-tone excited IWPT system. |
| 3.10 | (a) Measured Rx IF spectrum without the tag. (b) Measured Rx IF spectrum with a square-wave ($V_{low} = 0 \text{ V and } V_{high} = 2 \text{ V}$) applied to Vbias. |
| 3.11 | (a) 5-Mb/s PRBS uplink EVM and $V_{out}$ versus $V_{low}$ ($V_{high} = 2$ V). (b) Baseband |
| 3.12 | PCB tag schematic and photograph. |
| 3.13 | (a) Rectifier harvested dc voltage and (b) rectifier-generated IM3 power versus the total rectifier input power (modulator not included). |
| 3.14 | (a) Measured Rx spectrum (at the LNA output) of the IM3 uplink and (b) measured Rx spectrum (at the directional coupler output) of the backscattering uplink. |
| 3.15 | Block diagram of the implemented (conventional) backscattering reader system. |
| 3.16 | Scope-sampled demodulated Rx signal at the demodulator output (Q-channel). |
| 3.17 | (a) Demodulated UL signal and (b) decoded data with the backscattering reader. |
| 3.18 | (a) Block diagram of the proposed FDD reader system. (b) Tx waveform (with attenuation). |
| 3.19 |  |
| 3.20 | Demodulated UL signal and decoded data at coupling distance of (a) 20 cm and (b) 40 cm using the proposed FDD reader. |
| 3.21 | Simplified block diagram of the IM3 UL using the upper IM3 component. |
| 3.22 | Modeled IM3 currents and Tx waveform PAPR versus the second-tone magnitude ($a_2$). |
| 3.23 | Schematic and photograph of the custom-designed rectifier. |
| 3.24 | Simulated rectifier (a) output voltage and (b) input $\lvert S_{11} \rvert$ versus the power delivered into the rectifier. |
| 3.25 | Measured rectifier input $\lvert S_{11} \rvert$ and harvested dc voltage versus the incident (source) power. |
| 3.26 | Two-tone waveform and envelope with 11-dBm peak power ($V_{peak} = 1.12$) for PAPR = 1.1, 1.3, and 1.5. |
| 3.27 | Measured and simulated IM3 power and harvested dc voltage. |
| 3.28 | Calculated (a) IM3 current, (b) Tx waveform PAPR, (c) waveform peak value, and (d) normalized IM3 current. |
| 3.29 | Calculated waveform PAPR and normalized IM3 current for (a) $\phi_1 + \phi_3 = \pi/2$ and (b) $\phi_1 + \phi_2 = \pi$. |
| 3.30 | Calculated normalized IM3 current excited by two-tone and three-tone Tx waveform. |

| 3.31 | (a) Optimal three-tone Tx envelopes with peak voltage of 1 and PAPR of 1.1 and 1.3. (b) Two-tone Tx envelopes with the auxiliary tone turned off                              | 95           |
|------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|
| 3.32 | Measured and simulated IM3 power and harvested dc voltage with optimal three-<br>tone waveforms for different PAPR                                                            | 96           |
| 3.33 | (a) System block diagram. (b) Measured spectrum at critical system nodes                                                                                                      | 98           |
| 3.34 | Measured reader Rx spectrum (after LNA) of the IM3 uplink with (a) two-tone<br>and (b) three-tone Tx waveform.                                                                | 99           |
| 3.35 | Demodulated (PRBS) constellation with (a) two-tone and (b) three-tone Tx wave-<br>form                                                                                        | 99           |
| 3.36 | Block diagram of the WPT/UL system communicating with a commercial UHF                                                                                                        | 101          |
| 3.37 | UL signal after quadrature down-conversion and digital filtering with (a) two-tone<br>and (b) three tone Ty waveform                                                          | 101          |
| 3.38 | EVM distribution with (a) two-tone and (b) three-tone Tx waveform                                                                                                             | $102 \\ 103$ |
| 4.1  | Block diagram of the realized all-digital RF transmitter with on-chip amplitude<br>and phase modulator and on-interposer transformer.                                         | 105          |
| 4.2  | (a) Schematic of the inverse Class-D cell ($V_{DD} = 2.5$, $R_{on} = 1$). (b) Power (in dBm) vs. $Z_L$. (c) DE vs. $Z_L$. The switch current/voltage waveform with $Z_L$ of (d) $10 \Omega$, (e) $100 \Omega$, and (f) $10+10j \Omega$ | 106 |
| 4.3  | inverse Class-D power core versus $x \equiv R_L/R_{on}$ | 108 |
| 4.4  | $R_{on} \ (x \equiv R_L/R_{on})$ | 109 |
| 4.5  | conductance is swept with step size of 0.01 S | 110 |
| 4.6  | current with (b) $Z_L=10 \ \Omega$ and (c) $Z_L=10+10j \ \Omega$ | 112 |
| 4.7  | $(Z_L)$ at (a) 0.6 and (b) 3.6 GHz. | 112 |
| 4.8  | Simulated drain efficiency (DE) for the designed inverse Class-D core vs. the load impedance $(Z_L)$ at (a) 0.6 and (b) 3.6 GHz. | 113 |
| 4.9  | Schematic of (a) the digitally-modulated phase modulator, (b) current DAC in the IQ mixer, and (c) IQ integrator | 114 |
| 4.10 | (a) Mixer I/Q bias current, (b) mixer output fundamental voltage $V_{mixer,fund}$, and (c) $V_{mixer,fund}$ phase and phase step vs. $PM_{code}$ | 115 |
| 4.11 | CMOS output $(D_{out})$ | 116 |
| 4.12 | Integrator output I/Q waveform with frequency of (a) 0.6 and (b) 3.6 GHz. Four integrator currents are tested. | 118 |
| 4.13 | (a) Phase step for the 0.6-GHz CML-CMOS output $(D_{out})$ to $PM_{code}$. (b) Simulated $D_{out}$ noise. | 119 |
| 4.14 | The 0.6-GHz mixer (a) output I/Q current composition and (b) output voltage $(V_{mixer})$ for six sample phase codes. Integrator SEL = 13 | 119 |
| 4.15 | The 0.6-GHz mixer (a) output I/Q current composition and (b) output voltage $(V_{mixer})$ for six sample phase codes. Integrator SEL = 1             | 120          |
| 4.16 | (a) Phase step for the 3.6-GHz CML-CMOS output $(D_{out})$ to $PM_{code}$ . (b) Simulated $D_{out}$ noise.                                           | 121          |
| 4.17 | The 3.6-GHz mixer (a) output I/Q current composition and (b) output voltage $(V_{mixer})$ for six phase codes. Integrator SEL = 13 | 122 |
| 4.18 | HDI interposers | 123 |
| 4.19 |  | 123 |
| 4.20 | $R_p$, $X_{p1}$, $X_{p2}$, and $X_{p1}//X_{p2}$ as functions of the operating frequency. The input impedance $Z_{in} = R_p // jX_{p1} // jX_{p2}$. | 126 |
| 4.21 | Simulated input impedances for LB, MB, $HB_A$ ( $C_s = 1.5$ pF), and $HB_B$ ( $C_s = 0.5$ pF) vs. the operating frequency                            | 126          |
| 4.22 | Simulated (a) peak power and (b) drain efficiency for the designed transmitter                                                                       | 100          |
| 4.23 | (a) Chip photograph. (b) Front and back view of the LB package. (c) Photographs of the LB, MB, and HB interposers after die attachment | 129 |
| 4.24 | Measured peak power and drain/system efficiency for the LB, MB, $HB_A$, and $HB_B$ packages | 130 |
| 4.25 |  | 131 |
| 4.26 | Phase responses at (a) 0.6 GHz with $SEL = 5$, (b) 0.6 GHz with $SEL = 13$, and (c) 3.6 GHz with $SEL = 13$ | 132 |
| 4.27 | Measured 64-QAM constellation and Tx performance at the six testing frequencies: 0.6, 1.2, 1.8, 2.4, 3.0, 3.6 GHz | 133 |
| 4.28 | WLAN signal vs. the signal PAPR | 135 |
| 4.29 | frequencies (i.e. 0.6, 1.2, 1.8, 3.0, 3.6 GHz) | 135 |
| 4.30 | (MB) | 136 |
| 4.31 | signal vs. the signal PAPR. | 137 |
| 4.32 | six testing frequencies from 0.6 to 3.6 GHz. | 138 |
| 4.33 | Measured LTE spectrum and demodulated 64-QAM constellation at (a) 1.2 GHz (LB), (b) 2.4 GHz (MB), and (c) 3.6 GHz $(HB_B)$                           | 138          |
| 4.34 | (a) Photograph and (b) block diagram and schematic of the proposed single-output wideband Tx package | 140 |
| 4.35 | Simplified schematic of the LB sub-Tx | 141 |
| 4.36 | Simulated power of the LB sub-Tx versus the load $(Z_{load,LB}) S_{11} (Z_0 = 50 \Omega)$ at (a) 0.6, (b) 1.0, (c) 1.6, and (d) 2.2 GHz | 142 |
| 4.37 | $S_{11}$ of the load $(Z_{load,LB})$ from 0.4 to 2.2 GHz | 142 |
| 4.38 | Simulated output (a) power and (b) DE for the LB sub-Tx with the interposer lumped model and EM-simulated S-parameters. | 143 |
| 4.39 | Simplified schematic of the MB sub-Tx. | 145 |
| 4.40 | Simulated output power of the MB sub-Tx versus the transformer load impedance $(Z_{load,MB})$ in terms of $S_{11}$. | 145 |
| 4.41 | $S_{11}$ of the transformer load impedance $(Z_{load,MB})$ versus frequency from 1.0 to 2.8 GHz with step size of 0.2 GHz. | 147 |
| 4.42 | Simulated output power for the MB sub-Tx with the interposer simplified lumped model | 148 |
| 4.43 | Simplified schematic of the HB sub-Tx | 148 |
| 4.44 | Simulated output power for the HB sub-Tx with the interposer lumped model | 149 |
| 4.45 | Simulated Tx output power (with interposer S-parameter) with the three sub-Tx turned on alternatively | 151 |
| 4.46 | Simulated Tx output power (with interposer S-parameter) with the three sub-Tx turned on alternatively | 152 |
| 4.47 | (a) Equivalent inductor series resistance $(R_{eq})$ and (b) $R_{eq}$ vs. switch on-resistance | 152 |
| 4.48 | Simulated Tx power (with interposer S-parameter) with non-ideal supply SPDT switches (with non-zero on-resistance) | 153 |
| 4.49 | Schematic illustration with the switch PA reconfigured for the supported band-switching scheme: DTx Reconfiguration 2 | 154 |
| 4.50 | (a) EM-simulated power and (b) efficiency of the two DTx reconfiguration schemes compared to that with ideal external switches | 155 |
| 4.51 | Simulated parasitic loading for the MB sub-Tx $(Z_{para,MB})$ with DTx Reconfiguration 1 and 2. | 157 |
| 4.52 | Simulated parasitic loading for the HB sub-Tx $(Z_{para,HB})$ with DTx Reconfiguration 1 and 2. | 158 |
| 4.53 | Measured (a) Tx package output power and (b) DE with almost-ideal external band-selecting switches (solder on/off) | 160 |
| 4.54 | Measured (a) Tx output power and (b) drain efficiency (DE) with DTx reconfiguration 2 | 161 |
| 4.55 | Normalized Tx output magnitude and phase vs. AM code | 162 |
| 4.56 | AM-PM phase shift versus PA load impedance. | 162 |
| 4.57 | Phase responses at (a) 0.85, (b) 2.2, (c) 3.1 GHz. | 164 |
| 4.58 | Measured 64-QAM constellation and Tx performance at the three testing frequencies: (a) 0.85, (b) 2.2, (c) 3.1 GHz. | 165 |
| 4.59 | Measured WLAN spectrum at 2.2 GHz (MB Sub-Tx). | 165 |
| 4.60 | Measured LTE spectrum and demodulated 64-QAM constellation at (a) 0.85 GHz, (b) 2.2 GHz, and (c) 3.1 GHz | 166 |
| 4.61 | Schematics for the LO and (LVDS Rx and DeSer) clock receivers and the LVDS data receiver (for the AM and PM streams) | 170 |
| 5.1          | Illustration of an 8-element array deployed along the y-axis.                                                                                                 | 172 |
| 5.2          | Array pattern along $\phi = 90^{\circ}$ with uniform antenna excitations.                                                                                     | 173 |
| 5.3          | Array pattern with sidelobe canceller                                                                                                                         | 173 |
| 5.4          | Array EIRP degradation with sidelobe canceller.                                                                                                               | 174 |
| 5.5          | Dolph–Tschebyscheff array pattern and element excitations.                                                                                                    | 174 |
| 5.6 | (a) Element output symbols and the spatially-combined symbols with two elements. (b) Spatial symbols with 8 elements and three QPSK constellations. (c) Spatial symbols with 6 elements and the only two possible 16QAM constellations. | 176 |
| 5.7 | Radiation patterns for 3 of the 70 combinations that synthesize spatial symbol 4+4j | 180 |
| 5.8 | Array pattern with uniform excitation (solid line) and the lowest radiation exploiting the combination redundancy (symbol line) | 181 |
| 5.9 | Radiation patterns for the four 16QAM symbols that use the optimal combinations for the lowest radiation at $\theta = 20^{\circ}$. | 181 |
| 5.10 | Array pattern with uniform excitation (solid line) and the lowest radiations exploiting the combination redundancy (symbol lines) | 183 |
| 5.11 | (a) Transmitter block diagram. (b) Die photograph. The chip size is $1.45 \times 0.87$ mm$^2$. | 184 |
| 5.12         | Schematic for the LO generation and distribution.                                                                                                             | 185 |
| 5.13         | (a) Input return loss of the LO port. (b) Injection pair gate swings with 5-dBm                                                                               | 100 |
|              | LO input source power. (c) Tx locking range. | 180 |
| 5.14<br>5.15 | Schematic for the QPSK modulator. | 101 |
| 5.10<br>5.16 | LVDS Rx schematic | 100 |
| 5.10 | Measured Tx output free-running frequency | 180 |
| 5.18         | Tx (76.3 GHz) output phase noise                                                                                                                              | 190 |
| 5.19         | Measured and simulated linear PA S-parameters                                                                                                                 | 191 |
| 5.20         | Measured PA large-signal power and efficiency.                                                                                                                | 191 |
| 5.21         | (a) Tx output QPSK constellation at 75.4 GHz and (b) dc power breakdown                                                                                       | 192 |
| 5.22 | (a) Photograph of the array PCB board with COB assembly. (b) Rogers PCB stackup | 193 |
| 5.23 | (a) Simulated S-parameters for the antenna array. (b) Simulated antenna $S_{11}$ with $\pm 100 \ \mu m$ offset in the bondwire lengths. | 194 |
| 5.24         | (a) Simulated antenna gain and (b) Tx EIRP for the 8 elements                                                                                                 | 196 |
| 5.25         | Simulated array EIRP with uniform excitations                                                                                                                 | 197 |
| 5.26         | Setup for far-field measurement                                                                                                                               | 197 |
| 5.27         | Measured Tx EIRP at $\theta = 0$ for the eight Tx elements                                                                                                    | 198 |
| 5.28 | Measured locking ranges for the Tx elements under antenna board input power of 20 and 26 dBm. | 199 |
| 5.29 | Demodulated 0.4-Gb/s QPSK constellation from the 8 elements, with and without the IQ compensation. | 200 |
| 5.30 | Output phase calibration (via aligning the phases to the sixth element) for the (a) first, (b) third, (c) fifth, and (d) seventh Tx element | 202 |
| 5.31 | (a) Tx EIRP and constellation EVM at $\theta = 0^{\circ}$ and (b) spatial leakage at $\theta = 20^{\circ}$ for the redundancy-rich QPSK symbols. | 205 |
| 5.32 | Measured spatial leakages associated with the high-EIRP QPSK constellation and the low-EIRP constellation with optimal combinations | 207 |
| 5.33 | (a) Tx EIRP and constellation EVM at $\theta = 0^{\circ}$ and (b) spatial leakages at $\theta = 30^{\circ}$ for the redundancy-rich 16QAM symbols | 208 |
| 5.34 | Measured spatial leakages associated with the high-EIRP 16QAM constellation and the low-EIRP constellation with optimal combinations | 210 |
| 5.35 | 16QAM EVM degradation with deviation in the Rx angle | 211 |
| 5.36 | Leakage over the continuous receiving angle $(\theta)$ with the optimal combinations recorded with resolution of $\theta = 1^{\circ}$. (QPSK constellation synthesized at $\theta = 0^{\circ}$). | 212 |
| 5.37 | Lowest leakage with the optimal combinations recorded with resolution of $\theta = 0.01^{\circ}$. (QPSK constellation synthesized at $\theta = 0^{\circ}$) | 213 |

# List of Tables

| 1.1 | IPT design and optimization characteristics (ordered by reported date)          | 6   |
|-----|---------------------------------------------------------------------------------|-----|
| 2.1 | Optimized IPT performance using SE and DF coil                                  | 42  |
| 2.2 | IPT performance with a miniature tag                                            | 43  |
| 2.3 | Reported IPT Performance with a miniature tag (tag size $< 1 \text{ mm}^2$ )    | 47  |
| 2.4 | Reported IPT performance with a miniature tag [21].                             | 49  |
| 2.5 | Reported IPT Performance with a miniature tag [22] | 50 |
| 3.1 | System parameters for the far-field WPT system                                  | 74  |
| 3.2 | Uplink carrier generation with two-tone Tx waveform.                            | 90  |
| 3.3 | Uplink carrier generation with three-tone Tx waveform                           | 95  |
| 3.4 | Reported FDD RF-powered tag UL carrier generation.                              | 100 |
| 4.1 | Reported CMOS CW performance with peak power $> 26$ dBm (ordered by operating frequency). | 167 |
| 4.2 | Peak performance of wideband/multi-band CMOS Tx designs                         | 168 |
| 4.3 | Reported 20-MHz WLAN CMOS Txs with average power $> 17$ dBm                     | 168 |
| 4.4 | Reported $10/20$ MHz LTE CMOS Txs with average power > 20 dBm (ordered by operating frequency) | 169 |
| 5.1 | Combination redundancy for the QPSK/QAM spatial symbols.                        | 177 |
| 5.2 | 16QAM efficiency achieved by conventional array and the proposed digital array. | 178 |
| 5.3 | Spatially-combined QPSK/16QAM constellations with the highest EIRP.             | 203 |
| 5.4 | Bulk CMOS/SiGe phased array Txs with antenna integration at around 60 GHz       | 203 |
| 5.5 | Spatially-combined QPSK constellations and the corresponding leakage            | 206 |
| 5.6 | Spatially-combined 16QAM constellations and the corresponding leakage.          | 209 |

#### Acknowledgments

First, I would like to express my deepest gratitude to my advisor, Prof. Niknejad, for his guidance and support over the past six years. Together we have worked on many projects and produced a dozen IEEE journal articles. Well known for his contributions to CMOS RF circuits and systems, the real-world Ali is also a gentle and delightful person who serves as a role model in many aspects of life beyond research, and knowing this side of him is a privilege of his students. I have never seen him lose his temper or even say a harsh word. His attention to family has also shown me how to balance the effort dedicated to research and to family. Although I eventually struggled on both sides, I could have done much worse if I hadn't been in Ali's group and covered by his understanding, patience, and funding. I was also treated equally even though I always turned down his invitations to play soccer.

During this journey, I was glad to have the opportunity to work with Prof. Nikolić and Prof. Alon on the DARPA RF-FPGA project. Prof. Nikolić helped me prepare two (very lengthy) T-MTT journal papers to conclude my efforts in the RF-FPGA program. Prof. Alon chaired my qualification exam and was also a committee member in my preliminary exam. Prof. Lin from the ME department was also a committee member in my qualification exam. In addition, he is a co-PI of the DARPA SHIELD program, to which many of my research outcomes and publications are dedicated.

I am fortunate to be part of the Berkeley Wireless Research Center (BWRC), where I met many talented scholars and good friends. The staff are helpful and friendly. The coauthors and research colleagues who contributed to this thesis are: Bo Zhao, Jun-Chau Chien, Bonjern Yang, Lingkai Kong, Angie Wang, Chaoying Wu, Vason Srini, Lorenzo Lotti, Andrew Townley, Greg LaCaille, Sameet Ramakrishnan, Luke Calderin, and Antonio Puglielli. My life at BWRC was also enriched by the company of many other graduate students and visitors: Yi-An Li, Pi-Feng Chiu, Pengpeng Lu, Yongjun Li, Jaeduk Han, Luya Zhang, Ping-Chen Huang, Yue Lu, Yu-Ching Yeh, Haoyen Tang, Ali Ameri, Sashank K, Nima B, Seobin Jung, Yida Duan, Wen Li, Xiao Xiao, Kosta Trotskovsky, Shinwon Kang, Siva Thyagarajan, Lu Ye, Steven Callender, Min-Han Hsieh, and Wilson Chang. I also thank the staff members: Brian, Fred, Jame Dunn, Candy Corpus, Melissa Trevizo, Jessica, Olivia, and Sarah.

I would also like to thank my parents for their support and my master's thesis advisor, Prof. Wang at NTU, for leading me into the area of microwave engineering.

Finally, my wife Tina has accompanied me since 2008. If it hadn't been for her, I am not sure I would have had the momentum to study abroad in the first place (and eventually meet Ali and the BWRC colleagues). During my PhD study, we eventually came to live together and spent a lot of time traveling and nurturing our relationship. Without her I would have poured everything I have into research and maybe finished this thesis a couple of years earlier, but I would have missed so much of the fundamental happiness (and also some bitterness) that life has to offer.

# Chapter 1 INTRODUCTION

# 1.1 Emerging Applications for Miniature RFID

#### **Emerging Applications**

Near-field wireless power transfer through inductive coupling, such as strongly-coupled contactless charging systems, has been used for decades [1]. While strongly-coupled inductive power transfer (IPT) systems usually employ a secondary coil larger than 10 cm, many applications, such as wireless sensing and wafer-level testing [2–4], radio-frequency identification (RFID) [5], and implanted medical devices (IMDs) [6–8], require the receiver to have a small footprint so that it can be easily carried, so the secondary coil in such IPT systems is usually smaller than 1 cm. An emerging application, proposed by the Defense Advanced Research Projects Agency (DARPA), is to have miniature RFID-based passive sensors mounted on electronic devices to detect counterfeit parts [9–11]. The size of the receiver or the secondary coil in such IPT systems might be as small as 0.1 mm.

### **DARPA SHIELD Program**

According to [9–11], the Pentagon has known for years that a significant number of the replacement parts it buys for its missile guidance and satellite systems contain substandard counterfeit microchips, but finding these fakes, as they make their way through a complex global supply chain of fabrication facilities, assembly plants, and parts distributors, can be like searching for a needle in a haystack. The military estimates that up to 15 percent of all spare and replacement parts for its weapons, vehicles, and other equipment are counterfeit, making them vulnerable to dangerous malfunctions. Fig. 1.1.a is a photograph of a chip-recycling operation, found on one of the computers at VisionTech's offices [11]. Government investigators concluded that it likely portrays the production of counterfeit chips for the company.

Figure 1.1: (a) SHIELD motivation and (b) the desired "probe" illustrated by DARPA

The goal of DARPA's SHIELD (Supply Chain Hardware Integrity for Electronics Defense) program is to eliminate counterfeit integrated circuits from the electronics supply chain by making counterfeiting too complex and time-consuming to be cost effective. SHIELD aims to combine NSA-level encryption, sensors, near-field power, and communications into a microscopic-scale chip capable of being inserted into the packaging of an integrated circuit. The 100  $\mu$ m × 100  $\mu$ m "dielet" will act as a hardware root of trust, detecting any attempt to access or reverse engineer the dielet. Authentication of the IC will be achieved through the use of an external probe that can provide power to the dielet and establish a secure link between the dielet and a server, as well as verify the provenance of the IC. Fig. 1.1.b illustrates the desired "probe" and "dielet" for eliminating counterfeits [9].

Previously reported approaches for IPT and uplink designs are limited and cannot be adapted to this challenging application with the world's smallest dielet. Therefore, advanced techniques were developed and demonstrated by our team at UC Berkeley during the project span from 2015 to 2018. Our team has implemented a 100  $\mu$ m × 100  $\mu$ m dielet that can be remotely powered and is capable of bi-directional communication for authentication [12]. The photograph of the dielet is shown in Fig. 1.2.

#### Current Approaches and Limitations in IPT Designs

Figure 1.2: Photograph of the miniature dielet developed by the UC Berkeley team.

At a coupling distance substantially shorter than the electrical wavelength, inductive power transfer (IPT) is almost exclusively used to charge the rectenna [2, 6, 7, 13–25]. Even at a short coupling distance, the miniature rectenna footprint, usually dictated by the application, can still result in a low power transfer efficiency (PTE). For example, at a coupling distance of 1 mm, an RF-dc PTE of only -37 dB was achieved for a 0.13-mm<sup>2</sup> tag coil in a 1.5-GHz IPT [15]. The weak coupling and the small tag coil suggest pushing the IPT operating frequency into the GHz region to enhance the induced coil voltage, but using an excessively high frequency might lower the PTE due to the degraded coil and rectifier quality factors [21]. The appropriate IPT frequency is therefore subject to optimization, along with the coupled-coil geometries and the rectifier design [21].
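To put the quoted efficiency in linear terms, the short Python sketch below converts a PTE in dB into a power ratio and a harvested power. The -37 dB value comes from the text above; the 30-dBm reader power is a hypothetical number chosen only for illustration, and the helper names are not from the original work.

```python
def db_to_ratio(db: float) -> float:
    """Convert a power ratio expressed in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)


def dbm_to_mw(dbm: float) -> float:
    """Convert an absolute power in dBm to milliwatts."""
    return 10.0 ** (dbm / 10.0)


pte_db = -37.0        # RF-dc PTE quoted for the 1.5-GHz IPT in [15]
tx_power_dbm = 30.0   # hypothetical reader power (1 W), for illustration only

pte_ratio = db_to_ratio(pte_db)
harvested_mw = dbm_to_mw(tx_power_dbm) * pte_ratio

print(f"PTE = {pte_ratio:.2e} ({100 * pte_ratio:.3f} %)")
print(f"Harvested dc power = {harvested_mw:.3f} mW")
```

At -37 dB, only about 0.02% of the RF power supplied by the reader reaches the dc load, which is why even modest PTE improvements matter for a miniature tag.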

Coupled-coil optimization for two-coil [2, 6, 7, 14–17, 21, 22], three-coil [18, 23, 24], and four-coil [25] IPT systems has been reported. Although some works pay attention to the specific absorption rate (SAR) [17, 26] and the power delivered to the load [17, 24, 27], the PTE is usually the primary optimization target. Most works optimize the PTE in the RF section, either with a given load resistance [7, 16–18, 23–25, 27, 28] or assuming that an arbitrary load impedance is available for the rectenna coil [2, 6, 26], in which case the PTE becomes the two-port maximum gain ( $G_{max}$ ). However, rectenna designs that emphasize a small footprint or operate at a low frequency usually perform impedance matching by only placing a capacitor in parallel with the rectenna coil, so the achieved PTE is lower than  $G_{max}$. In addition, a good PTE in the RF section does not guarantee a good RF-dc PTE with the rectifier attached, because the optimal RF-RF PTE might be achieved under a load impedance that deviates significantly from the rectifier input resistance. For practical applications, the rectifier should therefore be included in the IPT design along with the coupled coils, and the RF-dc PTE should be optimized [21, 22].
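As a rough illustration of the $G_{max}$ bound mentioned above, consider two coils modeled as simple series R-L elements. With an optimal load, the standard textbook result for the maximum link efficiency is $\eta_{max} = \chi / (1 + \sqrt{1 + \chi})^2$ with $\chi = k^2 Q_1 Q_2$. The Python sketch below evaluates this expression; the lumped coil model and the values of `k12`, `q_reader`, and `q_tag` are assumptions made here for illustration and are not taken from this work.

```python
import math


def max_link_efficiency(k: float, q1: float, q2: float) -> float:
    """Maximum RF-RF efficiency of two coupled coils under an optimal load.

    Assumes each coil is a series R-L element with quality factors q1 and q2
    and coupling coefficient k (a simplified lumped model).
    """
    chi = (k ** 2) * q1 * q2
    return chi / (1.0 + math.sqrt(1.0 + chi)) ** 2


# Hypothetical weakly coupled link, for illustration only.
k12, q_reader, q_tag = 0.01, 100.0, 10.0
eta = max_link_efficiency(k12, q_reader, q_tag)
print(f"G_max = {eta:.4f} ({10 * math.log10(eta):.1f} dB)")
```

In the weak-coupling limit ($\chi \ll 1$) the expression reduces to approximately $\chi / 4$, so the bound scales with $k^2 Q_1 Q_2$. A rectenna matched only by a parallel capacitor would generally fall below this value.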

The PTE optimization can be accelerated by analytical approaches that express the PTE in terms of the coupled-coil geometrical parameters. The reported analytical IPT optimizations in [6, 14, 24] are valid for operating frequencies lower than 50 MHz. They assume a uniform current distribution on the coils and ignore the radiation resistance. However, these two assumptions do not hold in a design context employing a miniature rectenna coil, where the IPT frequency has to be raised into the GHz region [19–22] and the reader coil size is no longer negligible compared to the electrical wavelength. Moreover, the IPT operating frequency is not a design variable in [6, 14, 24], and the optimized quantity is the PTE in the RF section, not including the rectifier.

#### Achieved Breakthrough on the IPT Design

The major breakthrough on the IPT design has been detailed in our published work [29]. This work advances the previous works by including the rectifier, the frequency-dependent current distribution on the coil, and the radiation resistance in the analytical PTE formula for GHz applications. Still, the PTE optimization has numerous variables to be adjusted, including the coupled-coil geometries, the IPT frequency, and the rectifier design. Some works reduced the complexity of the optimization by predetermining the IPT frequency (or the reader-coil geometry) [7, 18, 22, 24]. In general, the PTE optimization has to iterate between the designs of the reader coil and the rectenna coil, as performed in [6, 7, 18, 24], to gradually approach the optimal design, and it can be time-consuming even if the optimization is analytical.

In our design context employing a miniature tag, the tag coil is exposed to a uniform magnetic field; as a result, the reader-coil and the tag-coil designs can be decoupled and optimized individually. This work identifies the figures of merit (FOMs) of the reader coil and the rectenna. Both the reader and the rectenna FOMs can be optimized and plotted as functions of the IPT frequency, and together they determine the optimal IPT frequency and the coil geometries that correspond to the highest RF-dc PTE. No iteration between the reader and the rectenna designs is required. Previous efforts defining the reader and the tag FOMs can also simplify the PTE optimization [16–18], but they assigned the coupling factor ($k_{12}$) to the reader FOM, so the reader and tag designs were not completely decoupled. Our previous work [21], without formally introducing the reader and the rectenna FOMs, utilizes the tag miniaturization to decouple the single-ended (SE) reader and the tag designs; however, it relies fully on time-consuming EM simulation, so the explored design space is very limited, which results in suboptimal PTE performance.

The analytical approach and the decoupling of the reader and the rectenna designs expedite the IPT design, while a better reader-coil topology is still desired to fundamentally improve the PTE. While most previous works adopted SE reader coils [2, 6, 13, 15–22], this work analyzes both the SE and the differentially-driven (DF) reader-coil topologies via the developed analytical approach and demonstrates analytically and experimentally that a DF coil outperforms the same-size SE coil. The realized IPT employing the DF coil topology (and the equation-based design approach) exhibits the highest RF-dc IPT FOM among the reported works that use a miniature rectenna. The coil size of the CMOS rectenna is only 0.01 mm<sup>2</sup> and the coupling distance is 2.2 mm. The rectifier harvests a dc power of 0.1 mW while the total required RF power is 33.1 dBm at 4.8 GHz. The required power is only 1.6 dB higher than our previous work operated at 1.2 mm [21]. The achieved coupling distance of 2.2 mm better addresses the SHIELD application [9, 10, 19–22], where the tag would be inductively powered by a hand-held probe, and the supported coupling distance shall be substantially larger than the height of the electronics packages where the tag resides. Simulation and measurement show the counterpart SE reader coils, realized in this work for comparison, have to use a higher RF power around 40 dBm at the same coupling distance.
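The power figures quoted above can be turned into an end-to-end RF-dc PTE with simple dB arithmetic. The following is a minimal sketch, not part of the original design flow; the function names are hypothetical, and the 40-dBm value for the SE coil is the approximate figure quoted in the text.

```python
import math


def mw_to_dbm(power_mw: float) -> float:
    """Convert a power in milliwatts to dBm."""
    return 10.0 * math.log10(power_mw)


def rf_dc_pte_db(dc_power_mw: float, rf_power_dbm: float) -> float:
    """RF-dc power transfer efficiency in dB: harvested dc power over RF input power."""
    return mw_to_dbm(dc_power_mw) - rf_power_dbm


# DF reader coil: 0.1 mW harvested from 33.1 dBm of RF power at 2.2 mm.
pte_df_db = rf_dc_pte_db(0.1, 33.1)
# SE reader coil: the same 0.1 mW requires roughly 40 dBm (approximate value).
pte_se_db = rf_dc_pte_db(0.1, 40.0)

print(f"DF coil RF-dc PTE: {pte_df_db:.1f} dB")
print(f"SE coil RF-dc PTE: {pte_se_db:.1f} dB")
print(f"DF advantage:      {pte_df_db - pte_se_db:.1f} dB")
```

Under these numbers the DF coil corresponds to an RF-dc PTE of about -43.1 dB, roughly 7 dB better than the SE counterpart at the same coupling distance.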

To summarize, the new contributions are: (i) the development and adoption of the new equation-based design method for both SE and DF coil topologies, and (ii) the realization of an IPT employing a DF reader coil with outstanding PTE. Although the PTE performance is ultimately limited by the adopted coil configuration, the analytical method is valuable for a more time-efficient design optimization. The equation-based approach considers, for the first time, the non-uniform current distribution on the coil and the associated radiation resistance, and it decouples the reader and the tag designs. The analytical approach is accurate and verified by simulated and measured results from the IPT designs in this work and previous works [21, 22]. The reported IPT performances with the design procedure fully discussed in [6, 7, 14, 16–19, 21, 22, 24] are summarized in Table 1.1. Table 1.1 focuses on the design methodology, while the performance comparison between this work and the previous works will be presented later.

#### Current Approaches and Limitations in Uplink Designs

Backscattering uplink (UL) has been the prevailing method for tag-to-reader communication in radio-frequency identification (RFID) systems, where the tag modulates the reflected part of the transmitted signal that delivers the wireless power [30–34]. Backscattering ULs are straightforward to implement because they require neither a tag-generated UL carrier nor the associated power consumption. In addition, operating the wireless power transfer (WPT) and the UL at the same frequency allows single-band antennas and passive matching networks (MNWs) to be used for the maximum WPT efficiency and UL power, while conventional frequency-division duplexing (FDD) reader-to-tag communications require dual-band antennas and MNWs on both the reader and the tag ends to address the two operation frequencies [5, 35–38]. However, self-jamming is a major problem in backscattering radio systems that share the same WPT and UL frequency. A typical reader design supporting the backscattering UL is illustrated in Fig. 1.3(a). The suppression of the Tx-to-Rx leakage usually relies on a circulator or a directional coupler with a rejection of only around 20 dB [30, 32]. This rejection is limited by the input return loss of the reader antenna and ultimately by the circulator/coupler isolation. Although the Tx-to-Rx blocker tone and the blocker phase noise can be alleviated by employing blocker cancellation techniques [33], mixer down-conversion [39], or IF-based backscattering [34], these techniques cannot decrease the leakage from the power amplifier (PA) output white noise. This noise source, linearly proportional to the Tx power, has been shown to limit the UL data rate and makes range extension or tag miniaturization very challenging [39].

Alternatively, the hardware overhead associated with the FDD reader-to-tag communications, the dual-band passives, and the on-tag signal generation, could be worthwhile to improve the UL quality. Indeed, the FDD technique has been extensively used in RFID

| Ref.      | Publication Year/Title | Tag Coil (mm<sup>2</sup>) | Coupled-Coil Topology | Equation-Based Optimization | Model Non-uniform Coil Current |
|-----------|-------------|--------------------|---------------|----------------|--------------|
| [7]       | 07/T-BioCAS | 400                | Two-Coil      | Yes            | No           |
| [24]      | 11/T-BioCAS | 75                 | Three-Coil    | Yes            | No           |
| [6]       | 15/T-BioCAS | 4.3                | Two-Coil      | No (EM-Sim)    | Yes (EM)     |
| [16]      | 16/T-BioCAS | 1 mm <sup>3*</sup> | Two-Coil      | No (EM-Sim)    | Yes (EM)     |
| [19]      | 16/IMS      | 0.01               | Three-Coil    | No (EM-Sim)    | Yes (EM)     |
| [22]      | 16/IMS      | 0.01               | Two-Coil      | No (EM-Sim)    | Yes (EM)     |
| [21]      | 16/MWCL     | 0.01               | Two-Coil      | No (EM-Sim)    | Yes (EM)     |
| [14]      | 16/JSSC     | 8                  | Two-Coil      | Yes            | No           |
| [17]      | 16/T-BioCAS | 1 mm <sup>3*</sup> | Two-Coil      | No (EM-Sim)    | Yes (EM)     |
| [18]      | 17/T-BioCAS | 1 mm <sup>3*</sup> | Three-Coil    | No (EM-Sim)    | Yes (EM)     |
| This Work |             | 0.01               | Two-Coil      | Yes            | Yes (Eqns)   |

\* Use three-dimensional cubic antenna

| Ref.      | Publication<br>Year/Title | Iterative<br>Coupled-Coil | Testing<br>Frequency<br>(MHz) | Optimized<br>Quantity | Rectifier<br>Design |
|-----------|---------------------------|---------------------------|-------------------------------|-----------------------|---------------------|
| [7]       | 07/T-BioCAS               | Yes                       | 1, 5                          | RF-RF PTE             | No                  |
| [24]      | 11/T-BioCAS               | Yes                       | 13.56                         | RF-RF PTE             | No                  |
| [6]       | 15/T-BioCAS               | Yes                       | 100-250                       | $G_{\max}$            | No                  |
| [16]      | 16/T-BioCAS               | No                        | 50-500                        | RF-RF PTE             | No                  |
| [19]      | 16/IMS                    | Yes                       | 2000                          | RF-RF PTE             | No                  |
| [22]      | 16/IMS                    | Yes                       | 2000                          | RF-dc PTE             | Integrated          |
| [21]      | 16/MWCL                   | No                        | 2000, 5000, 8000              | RF-dc PTE             | Integrated          |
| [14]      | 16/JSSC                   | No                        | 50                            | Transimpedance        | No                  |
| [17]      | 16/T-BioCAS               | No                        | 0-100                         | RF-RF PTE, PL,SAR     | No                  |
| [18]      | 17/T-BioCAS               | No                        | 60                            | RF-RF PTE             | No                  |
| This Work |                           | No                        | 500-8000                      | RF-dc PTE             | Integrated          |

Table 1.1: IPT design and optimization characteristics (ordered by reported date).



Figure 1.3: Simplified block diagram of the (a) conventional backscattering UL and (b) IM3 UL using the upper IM3 component.

systems with a miniature tag [5, 35–38]. The widely-separated WPT and UL frequencies allow the Rx noise to be decoupled from the Tx noise. With a diplexer usually seen in FDD systems, the Rx noise can be suppressed to the thermal floor of -174 dBm/Hz, and the frequency selectivity usually imposes little attenuation on the transmitted Tx power.
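The -174 dBm/Hz thermal floor quoted above follows from the noise power density $kT$ at room temperature. The sketch below reproduces that figure; the 290-K reference temperature and the 10-MHz example bandwidth are assumptions chosen for illustration and are not taken from the reader design.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def thermal_noise_density_dbm_per_hz(temperature_k: float = 290.0) -> float:
    """Thermal noise power density kT expressed in dBm/Hz."""
    noise_w_per_hz = BOLTZMANN_J_PER_K * temperature_k
    return 10.0 * math.log10(noise_w_per_hz / 1e-3)


def thermal_noise_dbm(bandwidth_hz: float, temperature_k: float = 290.0) -> float:
    """Thermal noise power kTB in dBm integrated over a bandwidth."""
    return thermal_noise_density_dbm_per_hz(temperature_k) + 10.0 * math.log10(bandwidth_hz)


print(f"{thermal_noise_density_dbm_per_hz():.2f} dBm/Hz")  # close to -174 dBm/Hz
print(f"{thermal_noise_dbm(10e6):.2f} dBm in 10 MHz")       # hypothetical bandwidth
```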

#### Achieved Breakthrough on the Uplink Design

Recently, a second-harmonic distortion (HD2) UL technique has been proposed [40]. The second-order nonlinearity of the diode rectifier on the tag (target) is utilized to generate a new frequency component at the second-harmonic (SH) frequency of the continuous-wave (CW) Tx. The new frequency is distant from the Tx fundamental frequency used for wireless powering, so the undesired Tx interference/noise at the Rx frequency can be filtered at the Tx output without compromising the wireless power transfer. However, the UL signal is weak without the dedicated antenna pairs and matching networks designed at the second-harmonic frequency.

To solve this issue, we have proposed the third-order intermodulation (IM3) UL principle in near-field inductive power transfer (IPT) systems [39, 41] and far-field systems [41, 42]. The IM3 UL principle is illustrated in Fig. 1.3(b). It preserves the design simplicity of the same-frequency backscattering tags but has distinct WPT and UL frequencies for low-interference FDD operation with a duplexer. Improved UL SNR, signal-to-blocker ratio (SBR), and data rate have been demonstrated compared to the counterpart backscattering ULs. In the reported IM3 UL systems, the Tx transmits a closely-spaced two-tone carrier (i.e., 5.728/5.768 GHz in [12, 41, 43] and 890/910 MHz in [42]) and the tag nonlinearity generates an IM3 component. The new carrier is then modulated by the tag baseband data stream and transmitted back to the reader. No explicit frequency generator or modulator is required. In addition to the design simplicity, the IM3 UL only needs a single-band passive antenna and MNW because the UL frequency (e.g., 5.808 GHz in [41] and 930 MHz in [42]) is still within the passive bandwidth. Coherent demodulation is also easy to implement because the UL frequency is fixed and known.
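The uplink frequencies quoted above follow directly from the third-order intermodulation products $2f_1 - f_2$ and $2f_2 - f_1$ of the two Tx tones. A minimal sketch (the helper name `im3_products` is hypothetical, and frequencies are kept as integers in MHz to avoid floating-point rounding):

```python
def im3_products(f1_mhz: int, f2_mhz: int) -> tuple[int, int]:
    """Return the (lower, upper) third-order intermodulation frequencies in MHz."""
    f_low, f_high = sorted((f1_mhz, f2_mhz))
    lower_im3 = 2 * f_low - f_high
    upper_im3 = 2 * f_high - f_low
    return lower_im3, upper_im3


# 5.8-GHz system: two tones at 5.728 and 5.768 GHz.
print(im3_products(5728, 5768))  # (5688, 5808): the upper product is the 5.808-GHz UL
# UHF system: tones at 890 and 910 MHz.
print(im3_products(890, 910))    # (870, 930): the upper product is the 930-MHz UL
```

Both IM3 products sit only one tone spacing away from the Tx tones, which is why the uplink stays within the bandwidth of a single-band antenna and matching network.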

Our latest work [44] discusses the tradeoff between the tag harvested dc power and the UL power in a two-tone IM3 UL. In general, under the same PA peak power, increasing the waveform peak-to-average power ratio (PAPR) from unity decreases the rectifier harvested power while the excited IM3 component can be enhanced. The developed theory and simulation in [44] both show that under the same PA peak power, the UL power from an RF-powered tag can be improved with a three-tone Tx, while the rectifier harvested dc power is not compromised. Two UHF systems are implemented, and the proposed three-tone Tx waveforms are tested against the two-tone Tx waveforms. The first system uses a custom-designed rectenna and the second system uses a commercial RFID inlay. Both systems exhibit a 4-dB UL improvement in terms of the Rx spectrum SNR and the demodulated constellation error vector magnitude (EVM).
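To make the PAPR notion concrete, the sketch below evaluates the envelope PAPR of a multi-tone waveform. It assumes equal-amplitude, in-phase, equally spaced tones, which is an illustrative simplification and not necessarily the tone weighting used in [44]; the function name is hypothetical.

```python
import cmath
import math


def envelope_papr(num_tones: int, samples: int = 1024) -> float:
    """Envelope peak-to-average power ratio of equal-amplitude, in-phase tones.

    The complex envelope is sampled over one period of the tone spacing.
    """
    powers = []
    for n in range(samples):
        phase = 2.0 * math.pi * n / samples
        envelope = sum(cmath.exp(1j * k * phase) for k in range(num_tones))
        powers.append(abs(envelope) ** 2)
    return max(powers) / (sum(powers) / len(powers))


for tones in (1, 2, 3):
    papr = envelope_papr(tones)
    print(f"{tones} tone(s): PAPR = {papr:.2f} ({10 * math.log10(papr):.2f} dB)")
```

Under these assumptions the PAPR equals the number of tones, so for a fixed PA peak power the average transmitted power drops as tones are added. This is the constraint that the waveform design has to balance against the stronger IM3 excitation.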

# **1.2** Next-Generation Wideband RF Transmitters

#### Wideband Requirement

Current projections estimate that global mobile data traffic in 2021 will reach 49 exabytes per month [45]. With the proliferation of multi-band and multi-standard communications, wireless spectrum in the lower GHz regime has become increasingly crowded. While the second (GSM) and the third generation (WCDMA) mobile communications rely only on a handful of radio bands, 4G-LTE requires support for more than 40 bands worldwide, covering spectrum from 450 MHz to 3800 MHz [46], and 5G wireless will cover even more. As seen in the US spectrum allocations in Fig. 1.4, low frequency spectrum has been crowded and saturated.

Compared to the conventional solution, where transmitters (Tx) are custom-designed to



Figure 1.4: Frequency allocation in the United States from 300 MHz to 3 GHz.



Figure 1.5: Conventional parallel path multi-band/multi-mode TRx and next-generation TRx concept based on adaptive RF function blocks

support particular bands, an adaptive Tx module with wideband capability is conceptually simpler and less costly to design. Wideband Tx and other techniques, including wideband receivers [47], tunable duplexers [48], and Tx-to-Rx leakage cancellation [49], enable next-generation adaptive multi-band transceivers, as illustrated in [50]. The conventional multi-mode TRx and the next-generation TRx concept are illustrated in Fig. 1.5.

#### Current Wideband Tx Approaches

A multiband Tx or power amplifier (PA) is usually achieved by incorporating two or three band-dedicated designs on a single CMOS die [51–55]. If the frequency bands are widely separated, a diplexer/multiplexer can be used to combine them [51–54]. Typically, external passive multiplexers with high linearity and frequency selectivity are used, and the insertion loss is usually around 1 dB [56]. On the other hand, a band-selecting SPDT (or SP3T) PA switch [57, 58] is usually required to provide isolation between the sub-Tx cells if the Txs collectively cover a wide and continuous bandwidth [55, 59]. According to [58], RF switches realized in SOI CMOS or III-V materials can achieve a 1-dB insertion loss and a 1-dB compression point ($P_{1dB}$) greater than 1 W, while bulk CMOS switches have significantly lower power-handling capability due to junction diodes and substrate leakage [57]. Alternatively, Tx power across a wide bandwidth can be extracted from a single PA core operating in conjunction with a wideband impedance matching network (MNW). With this approach, a high-order MNW has to be employed [60–63], and some works control the load impedance at harmonic frequencies [62, 63]. The additional loss and the bandwidth limitation of the complicated and unreconfigurable MNW are the main drawbacks. With a fixed PA output MNW, a periphery-reconfigurable PA core has been shown to enhance the bandwidth moderately [64].

#### General Approaches for Efficient Tx Power

The desire for efficient RF transmission has been driving the development of switched PAs, including Class-D [65–67], switched-capacitor [68–70], inverse Class-D [51, 61, 71–76], and Class-E [77–80] topologies. The improved peak efficiency comes from the reduced overlap between the current and the voltage waveforms at the device output. Since the input square-wave drive to the switches usually does not contain any amplitude information, the RF amplitude is usually modulated by varying the device periphery of an inverse Class-D power core [51, 61, 71–76], by power combining two out-phasing Class-D [65, 66] or Class-E [79, 80] power cells, or by supply modulation [69, 76, 78, 81–84]. To efficiently output signals with a high peak-to-average power ratio (PAPR), a switch-based transmitter (Tx) must have both high peak and back-off efficiencies. The device periphery-to-amplitude modulation scheme results in a back-off efficiency similar to that of a linear Class-B PA, where the drain efficiency (DE) degrades by 3 dB when the output power is 6 dB lower than the peak power. Owing to increased loss in the combiner, amplitude modulation achieved by the out-phasing combining of two nonlinear Class-D/E power cores also degrades the back-off efficiency. Several works have been dedicated to increasing the efficiency by improving the power combiners [66, 80] or employing multiple supplies to alleviate the out-phasing angle [79]. Additionally, supply-to-amplitude modulation eliminates the wasted voltage headroom in the back-off operation. Typical methods employ linear/buck regulators [82–84] for envelope tracking (ET) or a lower supply voltage (Class G) for low-power operation [69, 76, 78]. For example, a 0.5-to-3.5-V supply modulator with an efficiency of 80% was reported, targeting 10-MHz LTE signals [82, 84]. The main challenge for ET techniques is the supply-modulator efficiency, which drops as the bandwidth increases [85]; similarly, ignoring the cost of implementing the Class-G supplies results in an optimistic estimation of the Tx efficiency. Also, supply interference due to the buck-converter switching spurs/noise or glitches in the switched supply [86] degrades the signal integrity.
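The Class-B back-off figure quoted above can be checked numerically. The sketch assumes an ideal Class-B stage, whose peak drain efficiency is $\pi/4$ and whose efficiency scales with the output voltage amplitude; real switched PAs deviate from this idealization, and the function name is hypothetical.

```python
import math

IDEAL_CLASS_B_PEAK_DE = math.pi / 4.0  # about 78.5% for an ideal Class-B stage


def class_b_drain_efficiency(backoff_db: float) -> float:
    """Ideal Class-B drain efficiency at a given output power back-off in dB."""
    amplitude_ratio = 10.0 ** (-backoff_db / 20.0)  # output voltage relative to peak
    return IDEAL_CLASS_B_PEAK_DE * amplitude_ratio


peak = class_b_drain_efficiency(0.0)
backed_off = class_b_drain_efficiency(6.0)
print(f"DE at peak power:     {100 * peak:.1f}%")
print(f"DE at 6-dB back-off:  {100 * backed_off:.1f}%")
print(f"Efficiency reduction: {10 * math.log10(peak / backed_off):.1f} dB")
```

Because the efficiency tracks the amplitude rather than the power, a 6-dB power back-off halves the amplitude and costs 3 dB in DE, which is the relationship stated in the text.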

Another approach improves both the peak and the back-off efficiencies via better passives. It has been demonstrated that off-chip passives, fabricated on IPD [77, 87], LTCC [75] or PCB [82, 84], enhance the PA power and efficiency, because they can be designed with thicker metals and low-loss substrates. Although fully integrated solutions with on-chip passives allow the CMOS die to directly connect to the PCB antenna, a substantial amount of buffer space on the coarse-pitched PCB must be allocated to fan out the wirebond connections [71–73], and the wires introduce loss and impedance variation to the Tx. Directly interfacing a flip-chip package to an HDI PCB motherboard (antenna board) results in better signal integrity and more compact packaging [49, 65, 66, 69], but it is cost-ineffective since the motherboard is usually larger than the chip by two orders of magnitude.

# The Proposed Approach to Achieve High Power, High Efficiency, and Frequency Reconfiguration

In our works [59, 88], the CMOS die is flip-chip connected to HDI interposers which then disperse the signals to a coarse-pitched ball-grid array (BGA) on the back side of the interposers. The array signals are matched to connections on the coarse-pitch PCB motherboard. The interposer is still an order of magnitude larger than the chip, so substantial design space for high-quality passives and SMD components is available.

Another attractive aspect of employing both a digitally modulated Tx and on-interposer passives is that the Tx operating frequency can be reconfigured simply by switching between interposers that have different output passive networks. Notice that digital transmitters employ a square-wave LO drive and do not need an input matching network, while a linear PA requires multiple tuned passive networks. As a result, operation at different frequencies can be optimized by modifying the design of the Tx output passive network on the interposer, while the attached CMOS Tx chip remains the same. Compared to using multiple CMOS designs, this provides a more efficient method to support the wide range of frequency bands required by modern communication systems.

Our first work [59] demonstrates three HDI PCB interposers, LB, MB, and HB, targeting three frequency bands that collectively cover 0.7 to 3.5 GHz. The three packages are served by a single generic 2.5-V all-digital Tx in 65-nm CMOS. Thanks to the high efficiency of the inverse Class-D core and the high-quality passives, the continuous-wave (CW) peak powers are 29.2 dBm at 1.1 GHz (LB), 27.7 dBm at 2.3 GHz (MB), and 26 dBm at 3 GHz (HB), with drain efficiencies (DE) of 60%, 54%, and 49%, respectively. The Tx also performs well with 62.5-MS/s 64-QAM, 20-MHz WLAN, and 20-MHz LTE signals at six test frequencies between 0.6 and 3.6 GHz.

Our second work [88] builds on [59] by combining the three Tx packages (LB/MB/HB) into a single-output interposer package via a new band-selecting topology. Fig. 1.6 illustrates the development of our two works [59] and [88], with wideband capability achieved by three single-band Tx packages and by a single-output package, respectively.

In the proposed reconfigurable interposer, the secondary windings of three frequency-selecting transformers are connected in series to share the same antenna output, and the wide RF bandwidth is achieved by rotating the three sub-Tx's. The package achieves an output power higher than 22.9 dBm from 0.4 to 4 GHz with a drain efficiency (DE) higher than 25%.

### **1.3 mmW RF Transmitters and Phased Array**

#### mmW Phased Array for Beamforming

Millimeter-wave (mmW) phased-array or spatially-combining Txs have exhibited beam focusing and watt-level effective isotropically-radiated power (EIRP) [89–91]. At a lower



Figure 1.6: Illustration of the realized single-band Tx packages and the single-output package.

frequency such as 2.4 GHz, such a high EIRP is difficult to achieve within the same device footprint despite the higher Tx efficiency and lower passive loss. The high EIRP and bandwidth associated with mmW phased-array Txs enable high-speed (> 1 Gb/s) and long-distance (> 1 km) communication. Beam-steering is also feasible for phased-array Txs by introducing phase shifts in the RF [89, 92–95], baseband [96, 97], or LO paths [98]. While most works employ phase shifters in the RF paths to avoid implementing the multiple mixers in the LO-phase-shifting scheme, the LO phase shifters drive lighter loads and do not necessarily entail a higher power consumption [99]. Many previous efforts focus on the design of mmW beamformers that only process the RF signal [89–91, 94, 95]. For complete transmitter designs that involve the baseband-to-RF conversion, the prevailing approach involves an IQ mixer driven by the I/Q LO sources and the analog baseband signal, and the modulated mmW signal at the mixer output is then processed by the beamformer [92, 93, 96, 97]. To avoid distorting the modulated RF signal, the buffers and power amplifiers (PAs) in the beamformer must be operated in the linear region, such as at the 1-dB-compression power ($OP_{1dB}$), with an efficiency substantially lower than the peak value.

#### Digital mmW Phased Array for Better Efficiency

To further improve the Tx efficiency, direct digital-to-RF Txs adopting a nonlinear but efficient PA core have been extensively reported at both RF [59] and mmW frequencies [98, 100–104]. The nonlinear PA cores are usually driven by a CW LO signal that cannot back off, and the output power and the output device periphery are digitally modulated by the magnitude codeword. As a result, the dc power consumption is reduced in the back-off region. For a digitally-modulated Tx to support RF standards such as WLAN and LTE, an 8-bit or even higher magnitude resolution is required [59], and the routing effort and layout parasitics are substantial and can be very lossy at mmW frequencies. Fortunately, a magnitude resolution of 6 bits (or even lower) is sufficient for mmW communication using single-carrier QAM modulation [98, 100–104].

The resolution and the design complexity of the digitally-modulated Tx elements in a phased array can be further reduced by allowing nonidentical element contributions to the spatially-combined constellation [98, 101–104], and the elements can have a lower resolution than that of the spatially-synthesized constellation. In addition to the high peak efficiency, the power consumption corresponding to the low-power spatial symbols can be reduced by turning off some of the Tx elements [98, 101, 102]. The Q-band Tx introduced in [101] has four digitally-modulated OOK-BPSK elements on chip, and its W-band version was reported in [102]. The goal of the two works was to ultimately spatially synthesize a 64QAM constellation with four chips and antennas. The two works [98, 103] achieve complete implementations with low-resolution Tx elements packaged with the antenna array. Complicated spatially-combined constellations via nonidentical element contributions are demonstrated, with a peak EIRP of 22 dBm at 60 GHz [98] and 13 dBm at 130 GHz [103].

#### Spatial Null-Forming

For conventional uniformly-driven phased arrays with high-resolution (static) phase and magnitude controls for the analog Tx elements, such as [89, 92–95], a spatial null or lower sidelobes could potentially be synthesized by introducing proper phase shifts and attenuations to the Tx elements, according to [105, 106]. This functionality is important for reducing interference to a device listening to another Tx, as illustrated in Fig. 1.7. The sidelobe canceler and the Dolph-Chebyshev approach are well-known methods for spatial leakage suppression. However, the existing approaches degrade the Tx EIRP and are usually not supported by digitally-modulated Tx elements, which do not have sufficient output magnitude/phase resolution.

### Achieved Breakthrough on Leakage Suppression for Digital Phased Array

Figure 1.7: Phased array spatial leakage suppression for reduced interference to a second device.

This work demonstrates a novel method for a digital array to suppress the spatial leakage in a direction other than the communication direction. It utilizes the redundancy in the element output combinations. In our work with 8 five-state (OOK-QPSK) elements, the total number of element output combinations is $5^8$. These combinations correspond to 145 spatial symbols in the main-beam direction, from which redundancy-rich QPSK and 16QAM constellations can be extracted, with a peak EIRP 3 dB lower than that of the peak-power constellations that have little or no combination redundancy. For each spatially-synthesized symbol in the redundancy-rich constellation, multiple element output combinations can be used. Therefore, the combination corresponding to the lowest radiation (leakage) in the desired low-leakage direction can be selected with little effect on the transmitted power and the constellation EVM observed in the communication direction. Simulation shows that the spatial leakage can be suppressed to -25 dBc (relative to the main-beam power) for most zenith angles between 10° and 50° with the proposed approach and the implemented phased array. The measured leakages indeed improve on the leakage associated with the peak-EIRP constellation. The spatial symbols formed by previous works [98, 101–103] are redundancy-rich as well, but the combination redundancy has not been exploited there.
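The symbol count and the redundancy described above can be reproduced with a short enumeration. The sketch assumes ideal, identical elements whose five output states are 0, +1, +j, -1, and -j, and whose contributions add in phase in the main-beam direction; this is an idealized model for illustration, and the function name is hypothetical.

```python
from collections import Counter

# Five element states as (real, imag): off, +1, +j, -1, -j (idealized OOK-QPSK).
ELEMENT_STATES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]


def symbol_multiplicities(num_elements: int) -> Counter:
    """Map each main-beam spatial symbol to the number of element combinations forming it."""
    counts = Counter({(0, 0): 1})
    for _ in range(num_elements):
        updated = Counter()
        for (re, im), ways in counts.items():
            for d_re, d_im in ELEMENT_STATES:
                updated[(re + d_re, im + d_im)] += ways
        counts = updated
    return counts


counts = symbol_multiplicities(8)
print("element combinations:", sum(counts.values()))      # 5**8
print("distinct spatial symbols:", len(counts))
print("combinations for symbol 8+0j:", counts[(8, 0)])     # peak-power symbol
print("combinations for symbol 4+4j:", counts[(4, 4)])     # 3 dB below the peak
```

In this model the peak-power symbol can be formed in only one way, whereas a symbol 3 dB below the peak, such as 4+4j, has 70 equivalent combinations. Choosing among such equivalent combinations is the degree of freedom that the leakage suppression exploits.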

# **1.4** Structure of the Dissertation

#### Chapter 2

In Sec. 2.1, we will introduce the smallest inductive wireless power transfer and backscattering communication link to date, with a tag coil of only 0.01 mm<sup>2</sup> [21]. The CMOS tag, operated at 4.7 GHz, generates a dc power of 0.1 mW at a coupling distance of 1.2 mm and a reader power of 31.4 dBm. This work provides a complete design flow to minimize the reader power, with the IPT frequency taken into account.

In Sec. 2.2, we will present the proposed equation-based optimization for inductive power transfer to a miniature CMOS rectenna [29]. It starts with the derived RF-dc PTE formula in terms of the high-level coupled-coil and rectifier parameters. The introduction of the equation-based IPT reader-coil analysis and the optimization of the reader-coil FOM  $(FOM_{Tx})$  follow. Both the SE and the DF reader-coil topologies are evaluated. After that, the rectenna FOM  $(FOM_{Rx})$  is optimized. Finally, both  $FOM_{Tx}$  and  $FOM_{Rx}$  are employed to decide the DF reader-coil and the rectenna designs. The PTE formula has been expressed in terms of the fundamental parameters, such as the coupled-coil geometry and the substrate/metal property. EM-simulated and measured results are provided to verify the analytical approach.

In addition, the equation-based approach is again verified by our previous works [21, 22]. It reveals that our previous designs, based on EM-simulation with few operation frequencies and coil sizes considered, are suboptimal. This section also provides the closed-form approximations for the radiation resistances.

The developed analysis has been conducted locally, which means only the information around the operation point is considered. The local analysis is supported by our IPT designs. However, in Sec. 2.3 we further reveal and analyze a potential turning-point bifurcation associated with an IPT system. Turning-point bifurcations have been reported in power amplifier and oscillator circuits [107, 108]. In weakly-coupled IPT systems, the diode rectifier is designed to output only a low dc current, resulting in a high-Q rectifier input impedance, which could couple with the nonlinear rectifier capacitor and trigger the turning-point bifurcation. The revealed turning-point bifurcation in our work invalidates the conventional local analysis (e.g., [6, 7, 27, 28, 109]). Our analysis shows that the turning-point bifurcation and the resulting hysteresis increase the required source power above the value calculated by the local analysis. Two analytical methods, based on nonlinear modeling of the rectifier capacitor at the fundamental frequency and on the conversion matrix in the harmonic-balance expression, have been presented in detail in our work [110]. In addition to simulation, a PCB implementation demonstrates the introduced bifurcation, and a specially designed varactor, when put in parallel with the diode rectifier, mitigates the observed bifurcation with improved performance.

#### Chapter 3

In Chapter 3, the proposed IM3 uplink method will be detailed and several design/system examples will be provided.

As a preamble, Sec. 3.1 presents the conventional and prevailing direct backscattering and IF backscattering, and these UL methods are adopted for our miniature tag [21, 22, 29]. As expected, the achieved UL data rates are low due to the high Tx-to-Rx noise/blocker leakage. In Sec. 3.2, we present the newly developed HD2 UL technique for a low Tx-to-Rx noise leakage. As mentioned previously, the HD2 UL needs a dedicated antenna and matching network on the miniature tag and is not a suitable solution.

Sec. 3.3 presents our revised IM3 UL systems. The Tx filter used in our first IM3 UL system [39] is replaced by a carefully selected duplexer. The two Tx tones (for IPT) are placed at 5.728/5.768 GHz in the duplexer Tx band (5.725-5.77 GHz), and the rectifier-generated uplink frequency of 5.808 GHz is in the duplexer Rx band (5.805-5.85 GHz). (Assuming the duplexer Tx band is $[Tx_L, Tx_H]$ and the Rx band is at the higher frequency range of $[Rx_L, Rx_H]$, the duplexer bands have to satisfy $2Tx_H - Tx_L > Rx_L$ to be applicable to the IM3 uplink.) The duplexer prevents the Tx noise in both the Rx band (5.808 GHz) and the image band (5.688 GHz) from reaching the Rx, and the Tx blockers are also rejected. Therefore, an LNA can be employed in the Rx frontend, and the mixer-generated in-band blocker is negligible. The uplink SNR is 93 dB after the ADC sampling, and an outstanding 10-Mb/s UL data rate is achieved [43].
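The frequency plan above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and frequencies are kept in integer MHz to avoid rounding. It computes the two IM3 products of the Tx tones and tests the duplexer condition $2Tx_H - Tx_L > Rx_L$ quoted above.

```python
def im3_tones_mhz(f1_mhz, f2_mhz):
    """Return the (lower, upper) third-order intermodulation products in MHz."""
    lo, hi = sorted((f1_mhz, f2_mhz))
    return 2 * lo - hi, 2 * hi - lo


def duplexer_supports_im3(tx_band_mhz, rx_band_mhz):
    """Check the band condition 2*Tx_H - Tx_L > Rx_L (Rx band above the Tx band)."""
    tx_l, tx_h = tx_band_mhz
    rx_l, _ = rx_band_mhz
    return 2 * tx_h - tx_l > rx_l


# Tones and duplexer bands quoted in the text (MHz).
image_mhz, uplink_mhz = im3_tones_mhz(5728, 5768)
print("IM3 uplink:", uplink_mhz, "MHz, image:", image_mhz, "MHz")
print("Duplexer usable:", duplexer_supports_im3((5725, 5770), (5805, 5850)))
```

The upper IM3 product is the uplink carrier and the lower product is the image band mentioned in the text.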

Sec. 3.4 further extends the 5.8-GHz IM3 UL to far-field applications. Again benefiting from the FDD operation and the duplexer, a low noise floor close to -174 dBm/Hz is obtained at the duplexer Rx output. At a coupling distance of 55 cm and a total (two-tone) RF power of 1 W, the PCB tag harvests a dc power of 50 $\mu$W. A part of the harvested power (5 $\mu$W) is used to modulate the IM3 carrier through a varactor. For comparison, the reader is reconfigured to operate the conventional direct backscattering UL with the same PCB tag, Tx output power, and reader antenna. The backscattering Rx spectrum has a worse Rx SBR and SNR, by 56 and 32 dB, respectively [41].

Sec. 3.5 implements the world-first single-antenna FDD reader that communicates with a commercial UHF Gen2 RFID inlay [42]. A conventional backscattering reader system was also built and characterized for comparison. The FDD reader, with the main downlink (DL) frequency at 910 MHz and an auxiliary tone at 890 MHz, excites the 930-MHz IM3 UL carrier from the commercial RFID tag, and the returned tag-modulated IM3 carrier is received and demodulated with both low interference and low noise. As in the other IM3 UL systems, only a single pair of reader and tag antennas is required to cover both the DL and UL frequencies. The Rx signal is demodulated non-coherently but still exhibits an error vector magnitude (EVM) 5.7 dB better than that obtained with the conventional backscattering reader using coherent demodulation.

Finally, as the comparison between the IM3 and the backscattering UL has already been conducted in many of our works [39, 41, 43], Sec. 3.6 discusses the tradeoff between the tag harvested dc power and the UL power in a two-tone IM3 UL. The developed theory and simulation show that, under the same PA peak power limitation, the UL power from an RF-powered tag can be improved with a three-tone Tx without compromising the rectifier harvested dc power. Two UHF systems are implemented, and the proposed three-tone Tx waveforms are tested against the two-tone Tx waveforms. The first system uses a custom-designed rectenna and the second system uses a commercial RFID inlay. Both systems exhibit a 4-dB UL improvement in terms of the Rx spectrum SNR and the EVM [44].

### Chapter 4

In Sec. 4, we will demonstrate three HDI PCB interposers: LB, MB, and HB, targeting three frequency bands that collectively cover 0.7 to 3.5 GHz. The three packages are served by a single, generic 2.5-V all-digital Tx in 65-nm CMOS. Thanks to the high efficiency of the inverse Class-D core and the high-quality passives, the continuous-wave (CW) peak powers are 29.2 dBm at 1.1 GHz (LB), 27.7 dBm at 2.3 GHz (MB), and 26 dBm at 3 GHz (HB), with drain efficiencies (DE) of 60%, 54%, and 49%, respectively. The Tx also performs well with 62.5-MS/s 64-QAM, 20-MHz WLAN, and 20-MHz LTE signals at six test frequencies between 0.6 and 3.6 GHz. For the WLAN signal, the Tx has an output power of 21.6 (20.7) dBm at 1.8 (2.4) GHz with a DE of 27% (25%) and a system efficiency (SE) of 23% (21%). The measured EVM is -31 dB, substantially better than the specification of -25 dB, and the spectral mask is not violated. For the LTE signal, the peak performance occurs at 1.8 GHz with an output power of 24 dBm and an SE of 30%. The measured EVM and adjacent channel leakage ratios (ACLR1,2) are -27, -34, and -39 dB, respectively, better than the specifications with margin. In addition to excellent power and efficiency when transmitting both CW and modulated signals over a wide frequency range, this work also features an all-digital input interface, with both an 8-bit inverse Class-D amplitude modulator and a tunable 8-bit IQ-mixer-based phase modulator realized on silicon. More details of this work can be found in [59].

After demonstrating the three separate packages for collective wideband capability, a single-output, band-selecting interposer is designed to combine three identical all-digital CMOS Tx chips. The open-drain CMOS Txs are flip-chip connected to the primary windings of three frequency-selecting transformers realized on a single PCB interposer. The secondary windings of the three transformers are connected in series and share a single output. Across 0.4 to 4 GHz, a peak power higher than 22.9 dBm is achieved collectively by the three sub-Txs with a drain efficiency (DE) better than 25%. The peak powers/DE of the three sub-Txs are 28.6 dBm/50%, 27.4 dBm/49%, and 24.7 dBm/34%, at 0.85, 2.1, and 3 GHz, respectively. This work further demonstrates that the band selection can be achieved by reconfiguring the CMOS inverse Class-D switching PAs. As demonstrated via continuous wave (CW) and 64-QAM, WLAN, and LTE modulation tests, the reconfigurable Tx package exhibits high power and efficiency across all supported bands. More details of this work can be found in [88].

Sec. 4.1 provides an overview of the digitally-modulated Tx system. Sec. 4.2 introduces the employed inverse Class-D PA and amplitude modulation, with a new analysis improved from the previous work [71]. Sec. 4.3 introduces the phase modulator and Sec. 4.4 characterizes the transformer designs on the three HDI PCB interposers. Measurement results for the three HDI Tx packages are summarized in Sec. 4.5. For our second work introducing the single-output, band-selecting wideband package, Sec. 4.6 introduces the HDI PCB interposer design, Sec. 4.7 proposes two methods of operating the wideband Tx package with DTx reconfigurations, which operate the idle cells in the switched PAs as short-circuited nodes, and the measured results are provided in Sec. 4.8. The measured results for the two works are compared to the reported RF Txs in Sec. 4.9.

### Chapter 5

Sec. 5 introduces a new method to suppress the undesired spatial leakage of a mmW (76 GHz) phased array. In this system, the $8 \times 1$ PCB antenna elements are driven by digitally-modulated QPSK-OOK elements fabricated in 28-nm bulk CMOS. With 8 elements, the total number of element output combinations is as large as $5^8$, and QPSK and 16QAM constellations can be formed at the broadside direction ($\theta = 0$) by numerous qualified combinations. The peak array effective isotropic radiated power (EIRP) is measured at 31.6 dBm, and the EIRPs for the redundancy-rich spatially-synthesized QPSK (with 8 elements) and 16QAM (with 6 elements) constellations are 28.5 and 23.3 dBm, respectively. The redundancy is utilized to alleviate the Tx spatial leakage. For a given zenith angle requiring a low leakage, the qualified combinations corresponding to the lowest radiation (leakage) can be adopted. The broadside performance is not affected. As mentioned previously, this technique does not require additional magnitude or phase adjustment for the Tx elements and is very useful for digitally-modulated phased arrays with low element resolution. Simulation shows that the leakage power can be suppressed to -25 dBc for most zenith angles between $10^{\circ}$ and $50^{\circ}$, which is supported by the measured results.
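To give a feel for the combination redundancy, the Python sketch below counts how many of the $5^8$ element-state combinations collapse onto each broadside sum. It rests on simplifying assumptions that are not taken from the prototype: all eight elements are identical and ideal, and each one is either off or outputs one of four unit-magnitude QPSK phases. The state table and function name are hypothetical. The implemented array has nonidentical element contributions (Sec. 5.2), so the counts are only indicative.

```python
from collections import Counter

# Assumed idealized element states as integer (real, imag) pairs:
# off (OOK) plus the four unit-magnitude QPSK phases. Integers keep sums exact.
STATES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]


def broadside_sum_counts(num_elements):
    """Count the state combinations that produce each broadside vector sum."""
    counts = Counter({(0, 0): 1})
    for _ in range(num_elements):
        updated = Counter()
        for (x, y), ways in counts.items():
            for dx, dy in STATES:
                updated[(x + dx, y + dy)] += ways
        counts = updated
    return counts


counts = broadside_sum_counts(8)
print("total combinations:", sum(counts.values()))
print("distinct broadside sums:", len(counts))
print("combinations giving the sum 2+2j:", counts[(2, 2)])
```

Any broadside point reached by more than one combination offers a choice, and that freedom is what the leakage suppression exploits.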

Sec. 5 is arranged as follows: Sec. 5.1 briefly introduces the conventional sidelobe canceler and Dolph-Chebyshev approach for spatial leakage suppression. Sec. 5.2 elaborates the combination redundancy in the implemented phased array with nonidentical element contributions. Sec. 5.3 gives a brief review on the Tx element, which has also been reported in our previous work [104]. Design and measurement of the phased array prototype are provided in Sec. 5.4. Sec. 5.5 presents the measured leakage suppression achieved with the proposed redundancy exploitation.

### 1.5 Related Publications Authored by N.-C. Kuo

Some results presented in this thesis have been organized into technical papers by the author and published in IEEE conferences and journals. The IPT techniques covered in this thesis have been reported in [21, 22, 29, 110], the RF-powered-tag UL techniques in [39–44], the high-power, efficient, and wideband RF Tx techniques in [59, 75, 88], and the mmW phased array in [104, 111].

Publication List

Journal Papers

J15. N.-C. Kuo and A. Niknejad, "Low-spatial-leakage constellation formation exploiting combination redundancy in a digitally-modulated mmW phased array," submitted to *IEEE Trans. Microw. Theory Techn.*

J14. B. Zhao, N.-C. Kuo, A. Niknejad, and B. Nikolic, "A line-array technique for wireless power transfer towards a 100um x 100um coil antenna," submitted to *IEEE Trans. Microw. Theory Techn.*

J13. N.-C. Kuo and A. Niknejad, "RF-powered-tag intermodulation uplink with a three-tone transmitter waveform for enhanced uplink power," under minor revision, *IEEE Journal of Radio Frequency Identification*.

J12. N.-C. Kuo et al., "A 0.4-to-4 GHz all-digital RF transmitter package with a band-selecting interposer combining three wideband CMOS transmitters," *IEEE Trans. Microw. Theory Techn.*, Nov. 2018.

J11. N.-C. Kuo and A. Niknejad, "FDD reader design and communication to a commercial backscattering UHF RFID tag," *IEEE Microwave and Wireless Components Letters*, Jul. 2018.

J10. N.-C. Kuo, B. Zhao, and A. Niknejad, "Equation-based optimization for inductive power transfer to a miniature CMOS rectenna," *IEEE Trans. Microw. Theory Techn.*, May 2018.

J9. N.-C. Kuo, B. Zhao, and A. Niknejad, "Novel inductive wireless power transfer uplink utilizing rectifier third-order nonlinearity," *IEEE Trans. Microw. Theory Techn.*, Jan. 2018.

J8. N.-C. Kuo, B. Zhao, and A. Niknejad, "A 10-Mb/s uplink utilizing rectifier third-order intermodulation in a miniature CMOS tag," *IEEE Microwave and Wireless Components Letters*, Nov. 2017.

J7. N.-C. Kuo et al., "A wideband all-digital CMOS RF transmitter on HDI interposers with high power and efficiency," *IEEE Trans. Microw. Theory Techn.*, Nov. 2017.

J6. B. Zhao, N.-C. Kuo, and A. Niknejad, "A gain boosting array technique for near-field wireless power transfer," *IEEE Trans. Power Electronics*, Sept. 2017.

J5. N.-C. Kuo, B. Zhao, and A. Niknejad, "Inductive power transfer uplink using rectifier second-order nonlinearity," *IEEE Trans. Circuits and Systems I: Regular Papers*, Nov. 2016.

J4. N.-C. Kuo, B. Zhao, and A. Niknejad, "Inductive wireless power transfer and uplink design for a CMOS tag with 0.01 mm2 coil size," *IEEE Microwave and Wireless Components Letters*, Oct. 2016.

J3. B. Zhao, N.-C. Kuo, and A. Niknejad, "An inductive-coupling blocker rejection technique for miniature RFID tag," *IEEE Trans. Circuits and Systems I: Regular Papers*, Aug. 2016.

J2. N.-C. Kuo, B. Zhao, and A. Niknejad, "Bifurcation analysis in weakly-coupled inductive power transfer systems," *IEEE Trans. Circuits and Systems I: Regular Papers*, May 2016.

J1. N.-C. Kuo, J. Chien, and A. Niknejad, "Design and analysis on bidirectionally and passively coupled QVCO with nonlinear coupler," *IEEE Trans. Microw. Theory Techn.*, Apr. 2015.

Conference Papers

C7. L. Zhang, A. Ameri, Y.-A. Li, N.-C. Kuo, M. Anwar, and A. Niknejad, "A 37.5-45.1GHz super-harmonic coupled QVCO with tunable phase accuracy in 28nm bulk CMOS," in *IEEE Asian Solid-State Circuits Conf.*, Nov. 2018.

C6. N.-C. Kuo and A. Niknejad, "An E-band QPSK transmitter element in 28-nm bulk CMOS with multistate power amplifier for digitally-modulated phased arrays," in *IEEE Radio Freq. Integr. Circuits Symp.*, Jun. 2018.

C5. N.-C. Kuo, B. Zhao, and A. Niknejad, "Third-order intermodulation uplink for far-field passive RFID tags," in *IEEE MTT-S Int. Microw. Symp.*, Jun. 2018.

C4. B. Zhao, N.-C. Kuo, Y.-A. Li, Y. Liu, L. Lotti, and A. Niknejad, "A 5.8 GHz power-harvesting $108\mu$m x $108\mu$m "Dielet" near-field radio with on-chip coil antenna," in *IEEE International Solid-State Circuits Conf.*, Feb. 2018.

C3. N.-C. Kuo, B. Zhao, and A. Niknejad, "Near-field power transfer and backscattering communication to miniature RFID tag in 65 nm CMOS technology," in *IEEE MTT-S Int. Microw. Symp.*, May 2016.

C2. N.-C. Kuo et al., "A frequency reconfigurable multi-standard 65nm CMOS digital transmitter on LTCC interposers," in *IEEE Asian Solid-State Circuits Conf.*, Nov. 2014.

C1. J. Chien, N.-C. Kuo, and A. Niknejad, "A 26-GHz low-phase error in-phase-coupled QVCO using modified bi-directional diodes," in *IEEE Radio Freq. Integr. Circuits Symp.*, May 2014.

# Chapter 2

# IPT Designs for a Miniature Rectenna

## 2.1 IPT Design Example based on EM Simulation for a CMOS Tag with 0.1 mm Coil Size

### Design Overview

For a weakly-coupled two-coil system illustrated in Fig. 2.1, assuming the reader-coil and the tag-coil self-impedance are respectively  $R_{reader} + j\omega L_{reader}$  and  $R_{tag} + j\omega L_{tag}$ , and the mutual inductance is M, then according to the analysis in our previous work [110], the reader power can be approximated by (2.1)

$$P_{\text{source}} = \frac{R_{\text{reader}}}{2\omega^2} \times \frac{V_{\text{rect}}^2}{M^2 [Q_{\text{tag},L} \| Q_{\text{rect}}]^2}$$
(2.1)

where $V_{rect}$ is the rectifier input swing corresponding to the designed output dc condition (1 V/0.1 mA), and $Q_{tag} = \omega L_{tag}/R_{tag}$ and $Q_{rect} = R_{rect}/\omega L_{tag}$ are the quality factors of the tag inductor and the rectifier, respectively. The derivation of (2.1) has been detailed step by step in [110]. Eq. (2.1) has been verified by simulation and measurement in [21, 22], and minimizing it yields the highest RF-dc PTE.
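As a minimal numerical sketch of (2.1), the Python snippet below evaluates the required source power from the coil and rectifier parameters. The function names and the example values are illustrative assumptions and are not taken from [110] or from the fabricated designs. The $\|$ operator is read here as the product-over-sum combination of the two quality factors, which is also an assumption.

```python
import math


def parallel(a, b):
    """Product-over-sum combination, assumed meaning of Q_tag || Q_rect."""
    return a * b / (a + b)


def source_power_w(r_reader, omega, mutual_l, q_tag, q_rect, v_rect):
    """Required source power in watts according to Eq. (2.1)."""
    q_loaded = parallel(q_tag, q_rect)
    return r_reader / (2 * omega**2) * v_rect**2 / (mutual_l**2 * q_loaded**2)


def watts_to_dbm(power_w):
    """Convert a power in watts to dBm."""
    return 10 * math.log10(power_w / 1e-3)


# Hypothetical example values, chosen only to exercise the formula.
omega = 2 * math.pi * 5e9          # 5-GHz IPT frequency (rad/s)
p_w = source_power_w(r_reader=1.0, omega=omega, mutual_l=20e-12,
                     q_tag=10.0, q_rect=20.0, v_rect=0.5)
print(f"P_source = {watts_to_dbm(p_w):.1f} dBm")
```

Sweeping one argument at a time with this function shows directly how $M$, the loaded quality factor, and $V_{rect}$ trade against each other in (2.1).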

To exploit (2.1) for the efficiency optimization of our first IPT system with a tag size of $100 \ \mu m \times 100 \ \mu m$ [21], full-wave EM simulation was applied to the coupled coils, and the parameters were extracted from the simulated Z-parameters by $R_{reader}=real(Z_{11})$, $L_{reader}=imag(Z_{11}/\omega)$, $R_{tag}=real(Z_{22})$, $L_{tag}=imag(Z_{22}/\omega)$, and $M=imag(Z_{12}/\omega)$. The extracted parameters are frequency dependent due to the coil parasitic capacitance and the skin effect. IPT frequencies up to 10 GHz were considered in the design, and eventually two tags were fabricated, at 2 and 5 GHz, respectively. The former is the same frequency as in [22], while 5 GHz is the frequency with the highest RF-dc efficiency. Four frequencies (0.1, 2, 5, and 8 GHz) are compared in the following discussion for ease of presentation, while the reader power has been evaluated for other frequencies through the same approach.
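The parameter extraction described above maps directly onto a few lines of code. The sketch below is a hedged illustration: the function name and dictionary keys are hypothetical, and the two-port Z-parameters are assumed to be available as Python complex numbers at a single frequency, for example from an exported EM-simulation result.

```python
import math


def extract_coil_parameters(z11, z12, z22, freq_hz):
    """Extract coil resistances, inductances, and mutual inductance from Z-parameters."""
    omega = 2 * math.pi * freq_hz
    return {
        "R_reader": z11.real,          # reader coil series resistance (ohm)
        "L_reader": z11.imag / omega,  # reader coil inductance (H)
        "R_tag": z22.real,             # tag coil series resistance (ohm)
        "L_tag": z22.imag / omega,     # tag coil inductance (H)
        "M": z12.imag / omega,         # mutual inductance (H)
    }


# Hypothetical single-frequency Z-parameters at 5 GHz, for illustration only.
params = extract_coil_parameters(z11=complex(0.8, 60.0),
                                 z12=complex(0.0, 0.5),
                                 z22=complex(12.0, 150.0),
                                 freq_hz=5e9)
for name, value in params.items():
    print(f"{name} = {value:.3e}")
```

Because the extracted values are frequency dependent, the function would be called once per simulated frequency point.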



Figure 2.1: IPT schematic, design parameters, and die photographs for the 2 and 5 GHz designs in 65-nm CMOS technology.

### Design Context for Using the Analytical Expression

For IPT systems operated at mid-GHz frequencies, the reader coil has to be matched to 50 $\Omega$ to extract the highest power from the PA, so the design with the highest PTE harvests the highest power. This is noticeably different from IPT designs at tens of MHz, where the commonly used driver is a voltage source and the optimal PTE can be achieved at a high reader input resistance with a low power delivered into the reader coil [24]. Differential-driven rectifiers the same as or similar to that used in this work can be found in [21, 22, 112–114]. The rectifier has been demonstrated with a good ac-to-dc PTE of 68% at UHF [114] and 33% at 60 GHz [112, 113]. For this design at 5 GHz, the simulated PTE is 60% under the designed rectifier output dc power of 0.1 mW (1-V harvested voltage at a 10-k$\Omega$ load resistor). This rectifier output condition should be sufficient to supply the future active circuitry realized on the tag [34].

In far-field systems [112, 113], if a (low-loss) impedance matching network (IMN) is put between the antenna and the rectifier, then the mismatch loss does not have to be included in the rectifier PTE [114]. Clearly, the PTE reported in [114] should not be compared directly to the PTE reported in [112, 113], where the rectifier is assumed to connect directly to the antenna (without an IMN) and the mismatch loss is included in the PTE. Generally speaking, the reader and tag designs are not coupled in far-field systems, and improving the rectifier ac-dc PTE also increases the collective far-field RF-dc PTE. For illustration, assuming a rectifier has a low $V_{rect}^2/R_{rect}$ when generating the designed output dc condition, this rectifier would be considered efficient by [114], and would also be preferred by [112, 113] if $R_{rect}$ is close to the antenna impedance of 50 $\Omega$. However, in a near-field IPT system without an IMN between the rectifier and the tag inductor loop, such as this work and many others [21, 22], the required source power ($P_{source}$) is governed by (2.1) and cannot be determined by the rectifier only. Maximizing the rectifier PTE, or equivalently minimizing $V_{rect}^2/R_{rect}$, does not necessarily lead to the lowest $P_{source}$ according to (2.1). The lowest $P_{source}$ has to be achieved with the IPT frequency, coupled-coil geometries, and the rectifier designed collectively.
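The last point can be made concrete with a small numerical sketch. The two rectifiers below are entirely hypothetical, as are the tag-coil values; they are chosen only to show that the rectifier with the lower $V_{rect}^2/R_{rect}$ can still demand the higher source power once the rectenna-dependent factor of (2.1) is evaluated. The $\|$ operator is again assumed to be the product-over-sum combination.

```python
def parallel(a, b):
    """Product-over-sum combination, assumed meaning of Q_tag || Q_rect."""
    return a * b / (a + b)


def v2_over_r(v_rect, r_rect):
    """V_rect^2 / R_rect: the quantity a rectifier-only comparison would minimize."""
    return v_rect**2 / r_rect


def source_power_factor(v_rect, r_rect, q_tag, x_tag):
    """Rectenna-dependent factor of Eq. (2.1): V_rect^2 / (Q_tag || Q_rect)^2."""
    q_rect = r_rect / x_tag            # Q_rect = R_rect / (omega * L_tag)
    return v_rect**2 / parallel(q_tag, q_rect) ** 2


# Hypothetical tag coil: Q_tag = 10 and omega * L_tag = 100 ohm.
Q_TAG, X_TAG = 10.0, 100.0
rect_a = {"v_rect": 1.0, "r_rect": 10e3}   # lower V^2/R
rect_b = {"v_rect": 0.8, "r_rect": 4e3}    # higher V^2/R

for name, rect in (("A", rect_a), ("B", rect_b)):
    print(name,
          "V^2/R =", v2_over_r(**rect),
          "P_source factor =", source_power_factor(q_tag=Q_TAG, x_tag=X_TAG, **rect))
```

With these numbers, rectifier B has the higher $V_{rect}^2/R_{rect}$ but the lower source-power factor, which is why the coil and rectifier have to be optimized together.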

### Design Steps based on EM Simulation

The first step is to optimize the reader coil, and a one-turn, 100-$\mu$m tag inductor placed 1.2 mm away from the reader is used in this step. According to (2.1), the quantity $M^2 \omega^2 / R_{reader}$ shall be maximized and is plotted in Fig. 2.2(c) versus the PCB coil radius. The mutual inductance and reader resistance are plotted in Figs. 2.2(a) and 2.2(b), respectively. The performance drops if the frequency exceeds the self-resonance frequency (SRF) of the reader coil, where capacitive coupling starts to dominate. Reader designs at 0.1 and 8 GHz are ineffective due to the low mutual reactance at the low frequency and the excessive PCB loss at the high frequency. Also, a tiny reader loop with a sufficiently high SRF has to be used for the design at 8 GHz, which is difficult to fabricate on PCB. PCB readers with coil radii of 1.5 mm and 0.7 mm are used for the 2-GHz and 5-GHz designs, respectively, with a trace width of 200 $\mu$m. The two frequencies, 0.1 GHz and 8 GHz, are no longer considered.

The CMOS tag is optimized as the second step, where the quantity to maximize is $M(Q_{tag}//Q_{rect})/V_{rect}$. The optimized parameters are the rectifier size, the number of rectifier stages, and the tag inductor (turns and trace width). As shown in Fig. 2.3, using multiple turns increases M and $Q_{tag}$ (until $R_{tag}$ increases faster than $L_{tag}$ owing to the reduced inductor inner diameter), while on the downside $Q_{rect}$ also becomes lower due to a larger $L_{tag}$. Using the 3-$\mu$m metal trace results in a larger inductor inner diameter and a higher mutual inductance than using the 5-$\mu$m metal trace. The differential-drive rectifier proposed in [114] is adopted, and its schematic is shown in Fig. 2.1. Fig. 2.4(a) shows that using high-stage rectifiers or a large rectifier device reduces $V_{rect}$, but on the downside $R_{rect}$ also reduces, as shown in Fig. 2.4(b). The performance of the rectifier device, fabricated in 65-nm CMOS, is insensitive to the adopted IPT frequencies below 10 GHz.

The optimal selection of the tag inductor and rectifier can be approached iteratively. The design parameters for the two IPTs are listed in Fig. 2.1, along with the die photographs. A six-turn inductor is used for the 2-GHz tag and a four-turn inductor is used for the 5-GHz design, while an identical four-stage rectifier is used for both. The calculated reader power, according to (2.1), is 34.4 dBm and 29.4 dBm for the 2-GHz and 5-GHz designs, respectively. The 5-GHz design benefits from the $\omega^2$ term in the denominator of (2.1) and the higher $Q_{tag}$. The PCB matching network introduces an additional 2-dB loss for both designs.

Figure 2.2: Reader coil optimization. (a) Mutual inductance M. (b) Self resistance $R_{reader}$. (c) Optimized quantity $M^2 \omega^2 / R_{reader}$.

#### Measurement

The measured IPT performance of the two tags is shown in Fig. 2.5. Fig. 2.5(a) plots the output dc voltage versus the varactor bias for the two tags, and Fig. 2.5(b) plots the output dc voltage versus the reader power at the optimal varactor bias (1 V for the 2-GHz tag and 0.6 V for the 4.7-GHz tag). The reader PCB can move horizontally or vertically relative to the fixed tag, controlled by a precise probe station. The ground pad, varactor bias pad, and rectifier output pad of the tag are wirebonded to an external waveform generator and a multimeter for performance characterization. The measured reader power is 38 dBm for the 2-GHz tag and 31.4 dBm for the 4.7-GHz tag (with different reader PCB designs). The operation frequencies correspond to the lowest input reflection of the readers with $|S_{11}| = -15$ dB, determined by the reader coil and the matching network realized by SMD components and microstrip lines. The alignment of the reader and the tag frequencies is achieved by tuning the on-chip varactor. The varactor bias should be higher (lower) if the tag has to align to a higher (lower) operation frequency. It is observed from Fig. 2.5(a) that the output voltage (power) is more sensitive to the varactor bias for the 4.7-GHz tag. This is because $Q_{tag}$ is higher for the high-frequency design.

Figure 2.3: Tag inductor turns and width design. (a) Mutual inductance M. (b) Tag inductance $L_{tag}$. (c) Tag inductor quality factor $Q_{tag}$.

Figure 2.4: Measured tag output voltage for the 2/4.7 GHz tags versus (a) varactor bias and (b) reader power (under the optimal varactor bias).

Figure 2.5: Measured tag output voltage for the 2/4.7 GHz tags versus (a) varactor bias and (b) reader power (under the optimal varactor bias).

## 2.2 Equation-Based Optimization for Inductive Power Transfer to a Miniature CMOS Rectenna

### Analytical Approach Based on Fundamental Properties

In a near-field IPT with a miniature tag, we can assume that the tag is exposed to a uniform magnetic field, and the mutual inductance can be expressed by $M = B_{z,I=1} \times A_{tag}$, where $B_{z,I=1}$ is the reader-generated magnetic field at the rectenna location when the injection current, $I_{inj}$, is 1 A, and $A_{tag}$ is the area enclosed by the rectenna coil. To decouple the reader coil and the rectenna designs, Eq. (2.1) can be rewritten as (2.2):

$$P_{\text{source}} = \frac{R_{\text{reader}}}{2\omega^2 (B_{z,I=1})^2} \times \frac{V_{\text{rect}}^2}{A_{\text{tag}}^2 [Q_{\text{tag},L} \| Q_{\text{rect}}]^2}.$$
(2.2)

The first term on the RHS of (2.2) depends solely on the reader coil design and the second term depends solely on the rectenna design. Therefore, the PTE optimization does not require iterations between the reader-coil and the rectenna designs. The inverses of the two terms are defined as the reader and the rectenna FOM, respectively, where $FOM_{Tx} = 2\omega^2 B_{z,I=1}^2/R_{reader}$ and $FOM_{Rx} = [A_{tag}(Q_{tag,L} \| Q_{rect})]^2/V_{rect}^2$. Higher values of both $FOM_{Tx}$ and $FOM_{Rx}$ are preferred. The explicit frequency factor $\omega^2$ is assigned to $FOM_{Tx}$ for ease of analysis.
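The decoupling can be sanity-checked numerically. In the sketch below, which uses hypothetical function names and arbitrary test values, the source power computed as $1/(FOM_{Tx} \times FOM_{Rx})$ is compared with a direct evaluation of (2.1) using $M = B_{z,I=1} \times A_{tag}$. The $\|$ operator is assumed to be the product-over-sum combination.

```python
import math


def parallel(a, b):
    """Product-over-sum combination, assumed meaning of Q_tag,L || Q_rect."""
    return a * b / (a + b)


def fom_tx(omega, b_unit, r_reader):
    """Reader figure of merit: 2 * omega^2 * B_{z,I=1}^2 / R_reader."""
    return 2 * omega**2 * b_unit**2 / r_reader


def fom_rx(a_tag, q_tag, q_rect, v_rect):
    """Rectenna figure of merit: [A_tag * (Q_tag,L || Q_rect)]^2 / V_rect^2."""
    return (a_tag * parallel(q_tag, q_rect)) ** 2 / v_rect**2


def source_power_direct(r_reader, omega, mutual_l, q_tag, q_rect, v_rect):
    """Direct evaluation of Eq. (2.1)."""
    q_loaded = parallel(q_tag, q_rect)
    return r_reader / (2 * omega**2) * v_rect**2 / (mutual_l**2 * q_loaded**2)


# Arbitrary test values (not design data).
omega, b_unit, r_reader = 2 * math.pi * 5e9, 2e-4, 1.5
a_tag, q_tag, q_rect, v_rect = 1e-8, 12.0, 25.0, 0.6

p_from_foms = 1.0 / (fom_tx(omega, b_unit, r_reader) * fom_rx(a_tag, q_tag, q_rect, v_rect))
p_direct = source_power_direct(r_reader, omega, b_unit * a_tag, q_tag, q_rect, v_rect)
print(p_from_foms, p_direct)
```

Because the two factors share no variables, each can be maximized independently before they are combined.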

According to the Biot–Savart Law, the z-direction magnetic field induced by a circular current loop can be derived as a function of the coil radius r, coupling distance d, and the injection current  $I_{inj}$ , assuming the coil current is uniform. The magnetic field, denoted by  $B_{z,dc}$ , is

$$B_{z,dc} = \left(\frac{\mu_0}{2} \times \frac{r^2}{(d^2 + r^2)^{1.5}}\right) \times I_{inj} \equiv (B_{z,dc,I=1}) \times I_{inj}$$
(2.3)

where $B_{z,dc,I=1}$ is the magnetic field under a unit current injection. Notice that (2.3) is only accurate when the IPT frequency is sufficiently low. Fig. 2.6(a) plots $B_{z,dc,I=1}$ versus the reader coil radius for several coupling distances. In such a low-frequency case, $B_{z,I=1} = B_{z,dc,I=1}$ and $R_{reader} = 2\pi r R_0$, where $R_0$ is the unit-length coil resistance, and $FOM_{Tx} = \omega^2 B_{z,dc,I=1}^2/(\pi r R_0)$. Fig. 2.6(b) plots the low-frequency $FOM_{Tx}$ with $R_0 = 1 \ \Omega/m$ and $\omega = 1$. The corresponding values with a different $R_0$ or $\omega$ can be easily obtained by scaling the curves in Fig. 2.6(b) by a factor of $\omega^2/R_0$. Fig. 2.6(b) indicates that if the IPT frequency is very low, employing a coil radius equal to the coupling distance achieves the best $FOM_{Tx}$. However, both $B_{z,I=1}$ and $R_{reader}$ are strong functions of the IPT frequency due to the radiation resistance, skin-depth effect, and the non-uniform coil current; therefore, the analysis of $FOM_{Tx}$ is much more complicated than what Fig. 2.6(b) illustrates.
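The low-frequency optimum of $r = d$ also follows directly from (2.3). As a short worked check under the same low-frequency assumptions ($B_{z,I=1} = B_{z,dc,I=1}$ and $R_{reader} = 2\pi r R_0$), substituting (2.3) gives

$$FOM_{Tx} = \frac{\omega^2 B_{z,dc,I=1}^2}{\pi r R_0} = \frac{\omega^2 \mu_0^2}{4\pi R_0} \times \frac{r^3}{(d^2 + r^2)^3}$$

and setting the derivative with respect to r to zero yields

$$\frac{\partial}{\partial r}\left[\frac{r^3}{(d^2 + r^2)^3}\right] = \frac{3r^2 (d^2 - r^2)}{(d^2 + r^2)^4} = 0 \quad \Rightarrow \quad r = d.$$

At this radius the low-frequency figure of merit is $FOM_{Tx} = \omega^2 \mu_0^2 / (32 \pi R_0 d^3)$, so it falls with the cube of the coupling distance.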

When the electrical wavelength is comparable to the loop dimension, the maximum current magnitude on the coil can be much higher than  $I_{inj}$  due to the standing-wave effect. As illustrated in Fig. 2.7, the current magnitude on a SE loop, denoted by  $I_{SE}(\Omega)$ , can be modeled by

$$I_{\rm SE}(\Omega) = I_{inj} \times \cos\left(\beta \times l \times \Omega/2\pi\right) / \cos(\beta \times l) \tag{2.4}$$

where $\beta$ is the phase constant ($\beta = 2\pi/\lambda_{eff}$), $\lambda_{eff}$ is the equivalent electrical wavelength, l is the loop length, and $\Omega$ is the coil position described in standard polar coordinates. At the short-circuited via ($\Omega = 0$), the current magnitude has the highest value, where $I_{SE}(0) = I_{inj}/\cos(\beta l)$. At the coil input, $I_{SE}(2\pi) = I_{inj}$. The current distribution is almost uniform when $\beta l \ll 1$, while the via current approaches infinity when the loop length is about a quarter wavelength ($\beta l = \pi/2$). Similarly, the current on a DF loop can be modeled by



Figure 2.6: (a) The z-direction magnetic field induced by a uniform unit-current loop  $(B_{z,dc,I=1})$ . (b)  $FOM_{Tx}$  under a low frequency.




Figure 2.7: Current distributions and mathematical models for (a) a single-ended (SE) coil and (b) a differential (DF) coil.

$$I_{\rm DF}(\Omega) = I_{inj} \times \cos\left(\beta \times l \times (\Omega - \pi)/2\pi\right)/\cos(\beta \times l/2)$$
(2.5)

In this case, the highest current magnitude of  $I_{inj}/\cos(\beta l/2)$  occurs at  $\Omega = \pi$  (loop middle) and reaches a high value when the loop length is about a half wavelength ( $\beta l = \pi$ ).  $I_{DF}(0) = I_{DF}(2\pi) = I_{inj}$  at the coil input.
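The two current models can be evaluated numerically to see how strongly the standing wave peaks. The Python sketch below implements (2.4) and (2.5) as written; the function names and the example electrical length are illustrative assumptions.

```python
import math


def i_se(position, beta_l, i_inj=1.0):
    """Eq. (2.4): current magnitude on a single-ended loop at angle `position` (rad)."""
    return i_inj * math.cos(beta_l * position / (2 * math.pi)) / math.cos(beta_l)


def i_df(position, beta_l, i_inj=1.0):
    """Eq. (2.5): current magnitude on a differential loop at angle `position` (rad)."""
    return (i_inj * math.cos(beta_l * (position - math.pi) / (2 * math.pi))
            / math.cos(beta_l / 2))


# Hypothetical electrical length of the loop: beta * l = 1.2 rad.
BETA_L = 1.2
print("SE peak (via, position 0):   ", i_se(0.0, BETA_L))
print("SE input (position 2*pi):    ", i_se(2 * math.pi, BETA_L))
print("DF peak (middle, position pi):", i_df(math.pi, BETA_L))
print("DF input (position 0):       ", i_df(0.0, BETA_L))
```

For the same electrical length, the DF peak is smaller than the SE peak because its denominator is $\cos(\beta l/2)$ rather than $\cos(\beta l)$.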

In our design, the coil loops are realized by the PCB top metal while the ground plane is kept to prevent the magnetic field from leaking to the PCB backside. The ground plane, separated from the coil by the substrate thickness of h, attenuates the strength of the magnetic field according to the image theory [106]. This attenuation can be captured by a correction coefficient C(r, d, h), expressed by

$$C(r, d, h) = [B_{z, dc, I=1}(r, d) - B_{z, dc, I=1}(r, d+h)] / B_{z, dc, I=1}(r, d)$$
(2.6)

It will be demonstrated later via Fig. 2.16 that the ground plane does not degrade  $FOM_{Tx}$  for the designed reader coil because it also lowers the input resistance of the reader coil.
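A minimal sketch of (2.3) and (2.6) in Python is given below. The function names are hypothetical. The coil radius and coupling distance reuse the 0.7-mm and 1.2-mm values of the 5-GHz design in Sec. 2.1, while the substrate thickness is an assumed value for illustration only.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability (H/m)


def b_z_dc_unit(r, d):
    """Eq. (2.3) with I_inj = 1 A: on-axis field of a uniform-current circular loop (T/A)."""
    return MU_0 / 2 * r**2 / (d**2 + r**2) ** 1.5


def ground_correction(r, d, h):
    """Eq. (2.6): correction coefficient C(r, d, h) for a ground plane at substrate thickness h."""
    direct = b_z_dc_unit(r, d)
    return (direct - b_z_dc_unit(r, d + h)) / direct


# r and d from the 5-GHz design of Sec. 2.1; h = 0.5 mm is an assumed thickness.
r, d, h = 0.7e-3, 1.2e-3, 0.5e-3
print("B_z,dc,I=1 =", b_z_dc_unit(r, d), "T/A")
print("C(r, d, h) =", ground_correction(r, d, h))
```

The coefficient stays between 0 and 1 and moves toward 1 as the substrate gets thicker, that is, as the ground plane moves away from the coil.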

With (2.4) and (2.6) taken into account, the magnetic field generated by an SE loop, denoted by $B_{z,SE}$, can be derived as

$$B_{z,\text{SE}} = C(r, d, h) \times \frac{B_{z,\text{dc},I=1}}{2\pi} \int_{0}^{2\pi} I_{\text{SE}}(\Omega) d\Omega$$
$$= C(r, d, h) \times \frac{B_{z,\text{dc},I=1} \times c}{\omega l \sqrt{\varepsilon_{\text{eq}}}} \times \tan(\beta l) \times I_{\text{inj}}$$
(2.7)

where c is the free space light velocity and  $\epsilon_{eq} = (2\pi c/\omega \lambda_{\text{eff}})^2$ . Similarly, the magnetic field at the coil center of a DF current loop, denoted by  $B_{z,DF}$ , can be derived by

$$B_{z,\mathrm{DF}} = C(r,d,h) \times \frac{B_{z,\mathrm{dc},I=1} \times c}{\omega l \sqrt{\varepsilon_{\mathrm{eq}}}} \times 2 \tan\left(\frac{\beta l}{2}\right) \times I_{\mathrm{inj}}.$$
(2.8)

Notice that $B_{z,SE} = B_{z,DF}$ when $\beta l \ll 1$.
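The closed forms in (2.7) and (2.8) come from averaging the current models (2.4) and (2.5) over the loop. As a hedged numerical cross-check, the sketch below compares a midpoint-rule average with the factors $\tan(\beta l)/(\beta l)$ and $2\tan(\beta l/2)/(\beta l)$ for a unit injection current. The function names and the sample count are illustrative choices.

```python
import math


def loop_average(current_fn, samples=20000):
    """Midpoint-rule average of current_fn over angular positions 0..2*pi."""
    step = 2 * math.pi / samples
    return sum(current_fn((k + 0.5) * step) for k in range(samples)) / samples


def se_average_numeric(beta_l):
    """Loop-averaged SE current from Eq. (2.4) with I_inj = 1 A."""
    return loop_average(
        lambda pos: math.cos(beta_l * pos / (2 * math.pi)) / math.cos(beta_l))


def df_average_numeric(beta_l):
    """Loop-averaged DF current from Eq. (2.5) with I_inj = 1 A."""
    return loop_average(
        lambda pos: math.cos(beta_l * (pos - math.pi) / (2 * math.pi))
        / math.cos(beta_l / 2))


def se_average_closed_form(beta_l):
    """Current-averaging factor appearing in Eq. (2.7)."""
    return math.tan(beta_l) / beta_l


def df_average_closed_form(beta_l):
    """Current-averaging factor appearing in Eq. (2.8)."""
    return 2 * math.tan(beta_l / 2) / beta_l


for beta_l in (0.001, 0.5, 1.0):
    print(beta_l,
          se_average_numeric(beta_l), se_average_closed_form(beta_l),
          df_average_numeric(beta_l), df_average_closed_form(beta_l))
```

For small $\beta l$ both factors tend to unity, which is the equal-field limit noted above.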

To evaluate $FOM_{Tx}$, the frequency-dependent coil input resistance must also be obtained as a function of the coil geometry and the metal/substrate properties. The SE coil is modeled by a short-circuited micro-strip trace with a length of l, and the DF coil is modeled by a micro-strip trace driven at both ends differentially. The input resistances ($R_{reader}$) of an SE coil and a DF coil, denoted by $R_{reader,SE}$ and $R_{reader,DF}$, respectively, are expressed by

$$R_{\text{reader,SE}} = \text{Real}\{Z_0 \times \tanh(\alpha_{\text{total}}l + j\beta l)\} \approx \frac{Z_0 \alpha_{\text{total}}l}{\cos^2(\beta l)}$$
(2.9)

$$R_{\text{reader,DF}} = \text{Real}\left\{2Z_0 \times \tanh\left(\frac{\alpha_{\text{total}}l}{2} + j\frac{\beta l}{2}\right)\right\} \approx \frac{Z_0 \alpha_{\text{total}}l}{\cos^2\left(\frac{\beta l}{2}\right)}.$$
(2.10)

In (2.9) and (2.10), $Z_0$ is the micro-strip characteristic impedance, and $\alpha_{total}$ is the frequency-dependent attenuation constant including the metal and the substrate losses. The approximations in (2.9) and (2.10) are valid for the designs in this work with $\alpha_{total} \ll 1$. According to [115, 116], $\alpha_{total}$ can be expressed by
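The quality of the low-loss approximations in (2.9) and (2.10) can be checked numerically. The sketch below compares the exact real part of the hyperbolic tangent with the approximate forms; the function names and the example values of $Z_0$, $\alpha_{total} l$, and $\beta l$ are assumptions chosen only for illustration.

```python
import cmath
import math


def r_se_exact(z0, alpha_l, beta_l):
    """Exact form of Eq. (2.9): Real{Z0 * tanh(alpha*l + j*beta*l)}."""
    return (z0 * cmath.tanh(complex(alpha_l, beta_l))).real


def r_se_approx(z0, alpha_l, beta_l):
    """Low-loss form of Eq. (2.9): Z0 * alpha*l / cos^2(beta*l)."""
    return z0 * alpha_l / math.cos(beta_l) ** 2


def r_df_exact(z0, alpha_l, beta_l):
    """Exact form of Eq. (2.10): Real{2*Z0 * tanh((alpha*l + j*beta*l) / 2)}."""
    return (2 * z0 * cmath.tanh(complex(alpha_l / 2, beta_l / 2))).real


def r_df_approx(z0, alpha_l, beta_l):
    """Low-loss form of Eq. (2.10): Z0 * alpha*l / cos^2(beta*l / 2)."""
    return z0 * alpha_l / math.cos(beta_l / 2) ** 2


# Hypothetical values: Z0 = 50 ohm, alpha*l = 0.01 Np, beta*l = 0.5 rad.
Z0, ALPHA_L, BETA_L = 50.0, 0.01, 0.5
print("SE exact/approx:", r_se_exact(Z0, ALPHA_L, BETA_L), r_se_approx(Z0, ALPHA_L, BETA_L))
print("DF exact/approx:", r_df_exact(Z0, ALPHA_L, BETA_L), r_df_approx(Z0, ALPHA_L, BETA_L))
```

The agreement degrades as $\beta l$ approaches the resonance values, where the cosine in the denominator tends to zero and the loss term can no longer be neglected.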

$$\alpha_{\text{total}} = \frac{\sqrt{\mu_0}}{wZ_0\sqrt{8S}}\sqrt{\omega} + \left(\frac{1-\varepsilon_{\text{eq}}^{-1}}{1-\varepsilon_{\text{r}}^{-1}}\right)\frac{\tan\delta}{2c}\omega.$$
(2.11)

The first and the second term on the RHS of (2.11) are contributed by the metal loss (proportional to  $\omega^{0.5}$ ) and the substrate loss (proportional to  $\omega$ ), respectively. S is the metal conductivity, w is the trace width,  $\epsilon_r$  is the substrate permittivity, and  $\tan \delta$  is the substrate loss tangent. Both  $\epsilon_{eq}$  and  $Z_0$  can be approximated accurately by analytical functions of  $\epsilon_r$ ,  $\omega$ , and h. The analytical expressions can be found in [115, 116] and are not repeated here.
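For readers who want to experiment with (2.11), the sketch below transcribes the two loss terms into Python as written. The function name is hypothetical. In the example, the 200-$\mu$m trace width is the value used in Sec. 2.1 and the copper conductivity is a standard value, while $Z_0$, $\epsilon_{eq}$, $\epsilon_r$, and $\tan \delta$ are assumed placeholders and not the values of the PCB used in this work.

```python
import math

MU_0 = 4e-7 * math.pi     # vacuum permeability (H/m)
C_LIGHT = 299_792_458.0   # free-space light velocity (m/s)


def alpha_terms(omega, width, z0, conductivity, eps_eq, eps_r, tan_delta):
    """Return the (metal, substrate) terms of Eq. (2.11) in Np/m."""
    metal = (math.sqrt(MU_0) / (width * z0 * math.sqrt(8 * conductivity))
             * math.sqrt(omega))
    substrate = ((1 - 1 / eps_eq) / (1 - 1 / eps_r)) * tan_delta / (2 * C_LIGHT) * omega
    return metal, substrate


# Assumed board parameters for illustration only.
board = dict(width=200e-6, z0=100.0, conductivity=5.8e7,
             eps_eq=2.7, eps_r=3.5, tan_delta=0.004)
for freq_ghz in (2.0, 5.0, 8.0):
    metal, substrate = alpha_terms(2 * math.pi * freq_ghz * 1e9, **board)
    print(f"{freq_ghz} GHz: metal = {metal:.3f} Np/m, substrate = {substrate:.3f} Np/m")
```

Returning the two terms separately makes the $\omega^{0.5}$ and $\omega$ scalings easy to see when the frequency is swept.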

Plugging (2.7)–(2.10) into  $FOM_{Tx} \equiv 2\omega^2 B_{z,I=1}^2/R_{reader}$ , the  $FOM_{Tx}$  for a SE and DF reader coil can be expressed by

$$\text{FOM}_{\text{Tx,SE}} = \frac{2c^2 \times C(r, d, h)^2}{\varepsilon_{\text{eq}} Z_0 \alpha_{\text{total}}} \times \frac{B_{z, \text{dc}, I=1}^2 \sin^2(\beta l)}{l^3}$$
(2.12)

$$\text{FOM}_{\text{Tx,DF}} = \frac{2c^2 \times C(r, d, h)^2}{\varepsilon_{\text{eq}} Z_0 \alpha_{\text{total}}} \times \frac{4B_{z, \text{dc}, I=1}^2 \sin^2\left(\frac{\beta l}{2}\right)}{l^3}.$$
(2.13)

According to (2.12) and (2.13), $FOM_{Tx,SE}/FOM_{Tx,DF}$ is equal to $\cos^2(\beta l/2)$, which never exceeds unity. Therefore, the DF coil configuration is preferred over the SE coil. If the non-uniform current is suppressed, then the last terms on the RHS of (2.12) and (2.13) both become $B_{z,dc,I=1}^2(\beta l)^2/l^3$, which is higher because $\sin(x) < x$. The non-uniform current distribution has been intentionally avoided in [18] by segmenting the reader coil. However, segmenting the reader coil is not applicable here for small reader coils and a short coupling distance around 1 mm. The last terms on the RHS of (2.12) and (2.13) are frequency-dependent and are plotted in Fig. 2.8(a) and (b), respectively, with the coupling distance d = 1 m and $\epsilon_{eq}$ = 1. The best values are -153.8 dB at $(r, freq, \beta l) = (0.44 \text{ m}, 27 \text{ MHz}, \pi/2)$ in Fig. 2.8(a) and -147.8 dB at $(r, freq, \beta l) = (0.44 \text{ m}, 54 \text{ MHz}, \pi)$ in Fig. 2.8(b). If d is not 1 m or $\epsilon_{eq}$ is not unity, Figs. 2.8(a) and (b) are still applicable. In such a case, the horizontal and vertical axes of Fig. 2.8 represent $r/d$ and $(freq)\sqrt{\epsilon_{eq}}\,d$, respectively, and the output value should be scaled by $1/d^5$.
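The quoted optima can be approximately reproduced with a coarse grid search over the last terms of (2.12) and (2.13). The NumPy sketch below is an illustration under stated assumptions: a one-turn circular loop with $l = 2\pi r$, $d = 1$ m, $\epsilon_{eq} = 1$, and a search limited to 1-60 MHz so that only the first resonance region is covered. The function names and grid resolution are arbitrary choices and are not how Fig. 2.8 was generated.

```python
import numpy as np

MU_0 = 4e-7 * np.pi       # vacuum permeability (H/m)
C_LIGHT = 299_792_458.0   # free-space light velocity (m/s)


def last_term_db(r, freq, d=1.0, eps_eq=1.0, differential=False):
    """Last RHS term of Eq. (2.12) (SE) or Eq. (2.13) (DF) in dB, one-turn loop assumed."""
    b_unit = MU_0 / 2 * r**2 / (d**2 + r**2) ** 1.5        # Eq. (2.3), I_inj = 1 A
    loop_len = 2 * np.pi * r
    beta_l = 2 * np.pi * freq * np.sqrt(eps_eq) / C_LIGHT * loop_len
    if differential:
        term = 4 * b_unit**2 * np.sin(beta_l / 2) ** 2 / loop_len**3
    else:
        term = b_unit**2 * np.sin(beta_l) ** 2 / loop_len**3
    return 10 * np.log10(term)


def grid_maximum(differential):
    """Return (radius in m, frequency in Hz, value in dB) at the grid maximum."""
    r = np.linspace(0.05, 2.0, 391)[:, None]       # 5-mm radius steps
    freq = np.linspace(1e6, 60e6, 591)[None, :]    # 0.1-MHz frequency steps
    values = last_term_db(r, freq, differential=differential)
    i, j = np.unravel_index(np.argmax(values), values.shape)
    return float(r[i, 0]), float(freq[0, j]), float(values[i, j])


for label, differential in (("SE", False), ("DF", True)):
    r_best, f_best, v_best = grid_maximum(differential)
    print(f"{label}: r = {r_best:.3f} m, f = {f_best / 1e6:.1f} MHz, {v_best:.1f} dB")
```

The search lands close to the values quoted above, and the DF optimum sits about 6 dB above the SE optimum because of the factor of 4 in (2.13).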

It can be observed in Fig. 2.8 that the best $FOM_{Tx}$ no longer occurs when the loop radius equals the coupling distance; the operation frequency also plays an important role. For a given coil radius r, the highest values in Fig. 2.8 occur when the coil is in resonance (i.e., $\beta l = \pi/2$ for a SE coil and $\beta l = \pi$ for a DF coil). However, because $\alpha_{total}$ increases monotonically with frequency, the best $FOM_{Tx}$ is expected to occur at a frequency lower than the coil SRF. Although (2.12) and (2.13) do not preclude operating the reader coil at its SRF, it is difficult to realize the MNW for the extremely high coil input impedance there.
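As a worked check on the optimal radius in the low-frequency limit and at resonance, the sketch below assumes that $B_{z,dc,I=1}$ is the on-axis field of a single circular loop and that $l = 2\pi r$; neither expression is restated in this section, so both should be read as assumptions. Setting the derivative with respect to r to zero in each case gives:

$$B_{z,\text{dc},I=1} = \frac{\mu_0 r^2}{2(r^2+d^2)^{3/2}}, \qquad l = 2\pi r$$

$$\beta l \ll 1:\quad \frac{B_{z,\text{dc},I=1}^2 (\beta l)^2}{l^3} \propto \frac{r^3}{(r^2+d^2)^3} \;\Rightarrow\; r_{\text{opt}} = d$$

$$\sin(\beta l) = 1:\quad \frac{B_{z,\text{dc},I=1}^2}{l^3} \propto \frac{r}{(r^2+d^2)^3} \;\Rightarrow\; r_{\text{opt}} = \frac{d}{\sqrt{5}} \approx 0.447\,d$$

The second result, in which the frequency is free to follow the coil resonance, is consistent with the optimum near r/d = 0.44 reported for Fig. 2.8.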

For a given operation frequency, the optimal coil radius does not correspond to the coil resonance in most cases. Derived from Fig. 2.8, Fig. 2.9(a) and (b) plot the maximum values of the last terms on the RHS of (2.12) and (2.13), respectively, versus the operation frequency. The optimal coil radius (r) is also plotted. As expected, the optimal r equals the coupling distance when $\beta l \ll 1$. For the SE coil, as the frequency goes higher from a low



Figure 2.8: (a) Last term on the RHS of (2.12): $dB[B_{z,dc,I=1}^2 \sin^2(\beta l)/l^3]$. (b) Last term on the RHS of (2.13): $dB[4B_{z,dc,I=1}^2 \sin^2(\beta l/2)/l^3]$. $(d, \epsilon_{eq}) = (1, 1)$.



Figure 2.9: (a) Maximum $dB[B_{z,dc,I=1}^2 \sin^2(\beta l)/l^3]$ and (b) maximum $dB[4B_{z,dc,I=1}^2 \sin^2(\beta l/2)/l^3]$ versus the operation frequency. $(d, \epsilon_{eq}) = (1, 1)$.

value, the optimal r starts to shrink from the coupling distance, and $\beta l$ approaches $\pi/2$ ($\beta l$ approaches $\pi$ for the DF coil). At the optimal frequency of 27 MHz (54 MHz for the DF coil), $\beta l = \pi/2$ for the SE coil ($\beta l = \pi$ for the DF coil). Beyond the optimal frequency, the optimal r continues to shrink to keep $\beta l$ slightly higher than $\pi/2$ (slightly higher than $\pi$ for the DF coil). The corresponding $l/\lambda_{eff}$ values for several frequencies are annotated. Reader coils with multiple turns are not used in this work because they are more difficult to realize on PCB. In addition, a multiple-turn coil must be small to keep its SRF high, and the resulting $B_{z,dc,I=1}$ becomes too low.

### Improving Modeling Accuracy with Radiation Resistance

The above discussion does not consider the frequency-dependent  $\alpha_{total}$ , geometry-dependent C(r, d, h), and the radiation resistance. The former two items have been included in the  $FOM_{Tx}$  formula in (2.12) and (2.13), which can be further improved with the radiation resistance taken into account.

It is known and verified that the far-field electric field caused by a micro-strip trace can be derived by the dyadic Green's function together with the current distribution [117]. Exploiting these results, the electric field generated by a SE(DF) reader coil, in the standard spherical coordinates  $(R, \theta, \phi)$ , can be expressed by

$$E_{\theta}(\theta,\phi) = \frac{re^{-jk_{0}R}}{R} \times \frac{\omega^{2}\mu_{0}h}{2\pi c} \left(\frac{\sin^{2}(\theta)}{\varepsilon_{r}} - 1\right) \times \int_{0}^{2\pi} I_{\mathrm{SE(DF)}}[-\cos(\phi)\sin(\Omega) + \sin(\phi)\cos(\Omega)] \times e^{jk_{0}X}d\Omega \qquad (2.14)$$

$$E_{\phi}(\theta,\phi) = \frac{re^{-jk_{0}R}}{R} \times \frac{-\omega^{2}\mu_{0}h}{2\pi c}\cos(\theta) \times \int_{0}^{2\pi} I_{\mathrm{SE(DF)}}[\sin(\phi)\sin(\Omega) + \cos(\phi)\cos(\Omega)] \times e^{jk_{0}X}d\Omega \qquad (2.15)$$

where $X = r\cos(\Omega)\sin(\theta)\cos(\phi) + r\sin(\Omega)\sin(\theta)\sin(\phi)$. For a SE coil, another radiation contributor is the via current [117]. Assuming a uniform via current, the far-field electric field contains only a $\theta$-directed component and is expressed as

$$E_{\theta,\text{via}} = \frac{\mu_0 \omega h}{2\pi} \times \frac{e^{-jk_0 R}}{R} \times \frac{\sin(\theta)}{\varepsilon_r} e^{jk_0 r \sin(\theta) \sin(\phi)} \times I_{\text{SE}}(\Omega = 0).$$
(2.16)

According to [106], the radiation resistance, denoted by  $R_{rad}$ , can be calculated by

$$R_{\rm rad} = \frac{1}{I_{\rm inj}^2 \times 120\pi} \int_0^{2\pi} \int_0^{\pi/2} (|E_\theta|^2 + |E_\phi|^2) R^2 \sin(\theta) d\theta d\phi.$$
(2.17)

The radiation resistance contributed by the loop current is denoted by $R_{rad,SEloop}$ for a SE coil and $R_{rad,DFloop}$ for a DF coil. Equations (2.14)-(2.17) indicate that $R_{rad,SEloop}$ and $R_{rad,DFloop}$ are proportional to $\omega^4$ and to the squares of the loop radius and the substrate thickness, and that they are functions of only $\beta l$ and $\epsilon_r$ after being divided by $\omega^4 r^2 h^2$. The radiation resistance contributed by the via, denoted by $R_{rad,SEvia}$, is a single-variable function of $\beta l$ after being divided by $\omega^2 h^2/\epsilon_r^2$. Based on these observations, analytical approximations of the radiation resistances can be obtained and included in the analytical $FOM_{Tx}$ optimization.
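As a worked example of how (2.16) and (2.17) combine, the via contribution can be integrated in closed form. The sketch assumes that only $E_{\theta,via}$ is inserted into (2.17), so any cross term with the loop field is neglected, and it keeps $|I_{SE}(\Omega=0)/I_{inj}|$ as an unevaluated ratio set by the current distribution:

$$R_{\text{rad,SEvia}} = \frac{1}{120\pi}\left(\frac{\mu_0 \omega h}{2\pi\varepsilon_r}\right)^2 \left|\frac{I_{\text{SE}}(\Omega=0)}{I_{\text{inj}}}\right|^2 \int_0^{2\pi}\!\!\int_0^{\pi/2} \sin^3(\theta)\, d\theta\, d\phi = \frac{1}{90}\left(\frac{\mu_0 \omega h}{2\pi\varepsilon_r}\right)^2 \left|\frac{I_{\text{SE}}(\Omega=0)}{I_{\text{inj}}}\right|^2$$

The phase factor in (2.16) drops out of the magnitude, the angular integral evaluates to $4\pi/3$, and the prefactor scales as $\omega^2 h^2/\epsilon_r^2$. The remaining current ratio depends only on the current distribution along the coil, which is consistent with the single-variable dependence on $\beta l$ stated above.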

### Equation-Based IPT Reader Coil Optimization

The reader-coil FOM $(FOM_{Tx} \equiv 2\omega^2 B_{z,I=1}^2/R_{reader})$ is optimized with the gradient method, with the operation frequency swept from 0.5 to 8 GHz and the loop radius and trace width as the optimization variables. The highest frequency is set to accommodate the PA module (ZVE-8G). The coupling distance is set to 2.2 mm, and the PCB substrate is Rogers 4003C with $\epsilon_r$ of 3.5 and $\tan\delta$ of 0.003. Two available substrate thicknesses, 0.8 and 1.6 mm, are explored and compared. Fig. 2.10 and Fig. 2.11 plot the optimized $FOM_{Tx}$ for the SE and DF coil configurations, respectively. The optimized coil radius and $l/\lambda_{eff}$ are annotated for some test frequencies. Equations (2.12) and (2.13) can be directly employed if the radiation resistance is not included.

Figs. 2.10 and 2.11 show that using a DF coil with h = 1.6 mm achieves the highest  $FOM_{Tx}$ . For both the SE and DF coils, the adoption of the 0.8-mm substrate is not recommended due to the serious ground-plane interference (e.g., C(2, 2.2, 0.8) = 0.44). It is observed in Fig. 2.11 that the radiation resistance is relatively small and can be ignored in the DF coils, while Fig. 2.10 shows that the radiation resistance can seriously undermine the optimized  $FOM_{Tx}$  for the SE coils at higher frequency. This again justifies the adoption of a DF coil. The radiation resistance is also better suppressed with a thinner substrate (i.e. 0.8 mm) because the radiation resistance is proportional to  $h^2$ .

In general, using a wider trace can reduce the attenuation constant $\alpha_{total}$ to some extent due to the decreasing metal loss; however, the loss contributed by the substrate increases with the trace width, so the trace width cannot increase indefinitely. Also, reducing the metal loss is not useful if the radiation resistance is relatively high. It can be inferred from Figs. 2.10 and 2.11 that increasing the trace width beyond 0.6 mm does not improve $FOM_{Tx}$ effectively.

The trends of the curves in Fig. 2.10 and Fig. 2.11 without counting the radiation resistance resemble the curves in (the normalized) Fig. 2.9(a) and (b), respectively. Since d



Figure 2.10: Optimized reader-coil  $FOM_{Tx}$  for the SE coil configuration: (a) substrate thickness h = 0.8 mm (b) h = 1.6 mm.



Figure 2.11: Optimized reader-coil  $FOM_{Tx}$  for the DF coil configuration: (a) substrate thickness h = 0.8 mm (b) h = 1.6 mm.

= 2.2 mm and $\epsilon_{eq} \approx 2.5$, the optimal frequencies of 27 MHz in Fig. 2.9(a) for the SE coil and 54 MHz in Fig. 2.9(b) for the DF coil are scaled to 7.8 GHz and 15.6 GHz, respectively. The optimal frequencies are slightly reduced because $\alpha_{total}$ increases with frequency (2.11). If the radiation resistance is included and cannot be ignored, for example for a SE coil on a 1.6-mm substrate, the optimal frequency for the best $FOM_{Tx}$ is further reduced (i.e., to 3 GHz).
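The frequency scaling used here can be sketched in a few lines. The example assumes the normalization stated for Fig. 2.8 (d = 1 m, $\epsilon_{eq}$ = 1), under which the product of frequency, $\sqrt{\epsilon_{eq}}$, and d is preserved and the plotted value scales as $1/d^5$; the function names are hypothetical.

```python
import math


def scale_frequency(freq_norm_hz, d_m, eps_eq):
    """Map a frequency read from the normalized plot (d = 1 m, eps_eq = 1)
    to an actual coupling distance d_m (m) and effective permittivity."""
    return freq_norm_hz / (d_m * math.sqrt(eps_eq))


def scale_value_db(value_norm_db, d_m):
    """Apply the 1/d^5 scaling to a plotted value given in dB."""
    return value_norm_db - 50.0 * math.log10(d_m)


# Values taken from the text: d = 2.2 mm, eps_eq of about 2.5.
f_se = scale_frequency(27e6, 2.2e-3, 2.5)
f_df = scale_frequency(54e6, 2.2e-3, 2.5)
print(f"SE: {f_se / 1e9:.2f} GHz, DF: {f_df / 1e9:.2f} GHz")
```

Both results land close to the 7.8 GHz and 15.6 GHz quoted above; the small difference comes from rounding of $\epsilon_{eq}$.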

In summary, the optimized  $FOM_{Tx}$  for the SE coil configuration can achieve 126.3 dB at 3 GHz, 5.4 dB worse than that of the DF coil configuration, which is 131.7 dB at 7 GHz. Three sanity checks for the developed optimization are to confirm (i) for the same w and h, the optimized  $FOM_{Tx}$  of a DF coil is very close to that of a SE coil at a low frequency, (ii) the optimal coil size at a low frequency is close to the coupling distance, and (iii) as the IPT frequency increases, the optimal radius shrinks, and  $l/\lambda_{eff}$  increases monotonically and approaches 0.25 for the SE coils and 0.5 for the DF coils.

### Equation-Based IPT Rectenna Optimization

This section focuses on the design of the rectenna in a 65-nm CMOS process. The rectenna $FOM_{Rx} \equiv [A_{tag}(Q_{tag,L}//Q_{rect})]^2/V_{rect}^2$ has to be maximized as well. In $FOM_{Rx}$, $V_{rect}$ and $Q_{rect}$ are functions of the rectifier circuitry, and $A_{tag}$ and $Q_{tag,L}$ are functions of the rectenna inductor. Since $Q_{tag,L} = \omega L_{tag}/R_{tag}$ and $Q_{rect} = R_{rect}/\omega L_{tag}$, both are frequency-dependent.

In this design context, the rectenna coil (100  $\mu$ m × 100  $\mu$ m) is significantly smaller than the operation wavelength, so the current distribution on the CMOS inductor can be viewed as uniform. In such a case, fast and accurate analytical modeling and approximations for the inductor have been reported, taking into account the substrate coupling, current constriction, and the proximity effects [118, 119]. The analytical methods have been verified and are not repeated here. EM simulations can also be adopted to obtain more accurate results for on-silicon inductors at the cost of the simulation time.

The small inductor size allows only a limited number of inductor realizations. The minimum metal spacing is 2 $\mu$m and the metal thickness is 0.9 $\mu$m. The inductor reactance ($\omega L_{tag}$) and quality factor ($Q_{tag,L}$) for some inductor designs are plotted in Fig. 2.12(a) and (b), respectively. The inductor enclosed area $A_{tag}$ is also annotated in Fig. 2.12(a), which shows that using multiple inductor turns (N) increases $A_{tag}$ (but decreases $Q_{rect} \equiv R_{rect}/\omega L_{tag}$ due to the high reactance). Using a 3-$\mu$m metal trace ($w = 3\ \mu$m) compares favorably to using a 5-$\mu$m metal trace because more inductor turns can be drawn to achieve a higher $A_{tag}$. Generally speaking, when the inductor inner diameter is already small compared to the outer diameter, further increasing the number of turns N can even degrade $Q_{tag,L}$ at higher frequencies due to the increasing substrate loss. This explains why $Q_{tag,L}$ at 5 GHz with N = 4 (and $w = 3\ \mu$m) is better than that with N = 7. On the other hand, when the inductor inner diameter is close to the outer diameter and the frequency is relatively low (with a low substrate loss), the inductance and resistance are roughly proportional to $N^2$ and N, respectively, and $Q_{tag,L}$ increases with N. This explains why $Q_{tag,L}$ at 1 GHz with N = 2 is only half of that with N = 4.



Figure 2.12: (a) Inductor reactance  $(\omega L_{tag})$  and enclosed area  $(A_{tag})$  and (b) inductor quality factor  $(Q_{tag,L})$  with different rectenna coil designs.



Figure 2.13: Schematic and die photograph of the designed IPT and CMOS rectenna.

For the targeted rectifier output dc condition of 1 V at a 10-k$\Omega$ load, the four-stage rectifier illustrated in Fig. 2.13 is the optimal configuration, with $V_{rect}$ of 0.67 V and $R_{rect}$ of 1.5 k$\Omega$. The rectifier uses the minimum channel length of 65 nm, so both $V_{rect}$ and $R_{rect}$ are insensitive to the operation frequency below 10 GHz. Simulation shows that a two-stage rectifier requires a much higher $V_{rect}$ of 1.3 V. In this case, the number of parallel devices seen by the rectenna inductor in the ac domain is halved, but $R_{rect}$ and $Q_{rect}$ increase only slightly under such a high voltage swing. On the other hand, further increasing the number of rectifier stages (e.g., to six stages) only slightly reduces $V_{rect}$ due to the non-zero device threshold voltage, but $R_{rect}$ scales down accordingly and $Q_{rect}$ degrades severely. More details on the rectifier properties can be found in [21].

Finally, $FOM_{Rx}$ can be calculated for the rectenna inductors, and the results are plotted in Fig. 2.14 as functions of the operation frequency. If the operation frequency is low, such as 0.5 GHz, then $Q_{tag,L} \ll Q_{rect}$ and $Q_{tag,L}//Q_{rect} \approx Q_{tag,L}$. In such a case, increasing the number of inductor turns improves $FOM_{Rx}$ because both $A_{tag}$ and $Q_{tag,L}$ increase. On the other hand, if the IPT is operated at a higher frequency, such as 8 GHz, then using an inductor with a high number of turns can easily make $Q_{rect} \ll Q_{tag,L}$ and $Q_{tag,L}//Q_{rect} \approx Q_{rect}$. In such a case, $FOM_{Rx}$ can be approximated by $[(R_{rect}/V_{rect}\omega) \times (A_{tag}/L_{tag})]^2$, and the number of turns should be reduced to increase $A_{tag}/L_{tag}$ for a better $FOM_{Rx}$. The optimal inductor design should have a high $A_{tag}$, and $Q_{tag,L}$ should be comparable to $Q_{rect}$. Fig. 2.14 shows that $FOM_{Rx}$ has the maximum value of -129.5 dB at 3.5 GHz. This is achieved with an inductor with N = 7 and $w = 3\ \mu$m, and $(A_{tag}, Q_{tag,L}, Q_{rect}) = (0.048\ mm^2, 7, 16)$.
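The $FOM_{Rx}$ definition can be evaluated directly from the quoted optimum. The sketch below assumes that the `//` operator denotes the usual parallel combination $Q_1 Q_2/(Q_1+Q_2)$ and that SI units (m², V) are used before converting to dB; both are assumptions, and the function names are hypothetical. With the rounded values from the text it lands within a few tenths of a dB of the reported -129.5 dB.

```python
import math


def parallel(q1, q2):
    """Parallel combination of two quality factors (assumed meaning of '//')."""
    return q1 * q2 / (q1 + q2)


def fom_rx_db(a_tag_m2, q_tag, q_rect, v_rect):
    """FOM_Rx = [A_tag * (Q_tag // Q_rect)]^2 / V_rect^2, returned in dB."""
    q_loaded = parallel(q_tag, q_rect)
    return 20.0 * math.log10(a_tag_m2 * q_loaded / v_rect)


# Rounded values from the text: A_tag = 0.048 mm^2, Q_tag = 7, Q_rect = 16,
# V_rect = 0.67 V for the four-stage rectifier.
fom = fom_rx_db(0.048e-6, 7.0, 16.0, 0.67)
print(f"FOM_Rx = {fom:.1f} dB")
```

Because the loaded Q is dominated by the smaller of the two quality factors, raising only the larger one yields little benefit, which is why comparable $Q_{tag,L}$ and $Q_{rect}$ are desirable.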



Figure 2.14: Rectenna  $FOM_{Rx}$  with different rectenna coil designs.

### **IPT** Design

Both the $FOM_{Tx}$ plotted in Fig. 2.11(b) and the $FOM_{Rx}$ plotted in Fig. 2.14 are used to decide the reader coil, the rectenna design, and the IPT frequency. The optimized performances for IPT frequencies from 2 to 7 GHz are summarized in Table 2.1. The projected IPT power, in units of dBW, is obtained by summing $-dB(FOM_{Tx})$ and $-dB(FOM_{Rx})$. The lowest calculated power is 28.5 dBm, with a DF reader coil and the IPT designed at 6 GHz. In this work, the IPT is designed at 5 GHz, which has a slightly higher theoretical power of 28.9 dBm. Designing at 4 GHz is also recommended. In practice, RF PAs are more expensive and less efficient at higher frequencies. This factor should be considered in the frequency planning.
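The power bookkeeping described above reduces to one line of arithmetic. The sketch below, with a hypothetical function name, converts the dBW sum to dBm by adding 30 dB and checks it against the DF entries quoted in this section (28.9 dBm at 5 GHz and 28.5 dBm at 6 GHz).

```python
def ipt_power_dbm(fom_tx_db, fom_rx_db):
    """Projected IPT power: -(FOM_Tx + FOM_Rx) in dBW, converted to dBm."""
    return -(fom_tx_db + fom_rx_db) + 30.0


# (frequency in GHz, FOM_Rx in dB, DF FOM_Tx in dB) as listed in Table 2.1
rows = [(5.0, -130.1, 131.2), (6.0, -130.1, 131.6)]
for freq_ghz, fom_rx, fom_tx_df in rows:
    power = ipt_power_dbm(fom_tx_df, fom_rx)
    print(f"{freq_ghz:.1f} GHz: {power:.1f} dBm")
```

The same function applies to the SE columns, for example an $FOM_{Tx}$ of 125.4 dB at 5 GHz gives 34.7 dBm.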

The prototype DF coil is realized with a coil radius of 1.9 mm and a trace width of 0.4 mm, which corresponds to an analytically calculated $FOM_{Tx}$ of 130.0 dB, only 1.2 dB below the optimal value generated by the analytical optimization (with a coil radius of 1.6 mm and a trace width of 0.6 mm). EM simulation shows that the $FOM_{Tx}$ deviation between the two designs is even lower because the 0.4-mm coil has its current centered better at the desired loop radius. Two SE coils, with substantially worse $FOM_{Tx}$, are still realized and measured for comparison. The two SE coils have coil radii of 1.9 mm (SE1) and 1.4 mm (SE2),

| Freq. (GHz) | FOM<sub>Rx</sub> (dB) | SE FOM<sub>Tx</sub> (dB) | SE Power (dBm) | DF FOM<sub>Tx</sub> (dB) | DF Power (dBm) |
|-------------|-----------------------|--------------------------|----------------|--------------------------|----------------|
| 2.0   | -130.9            | 125.5                | 35.4     | 127.2                | 33.7     |
| 3.0   | -129.6            | 126.3                | 33.3     | 129.2                | 30.4     |
| 4.0   | -129.4            | 126.1                | 33.3     | 130.5                | 28.9     |
| 5.0   | -130.1            | 125.4                | 34.7     | 131.2                | 28.9     |
| 6.0   | -130.1            | 124.4                | 35.7     | 131.6                | 28.5     |
| 7.0   | -130.6            | 123.8                | 36.8     | 131.7                | 28.9     |

Table 2.1: Optimized IPT performance using SE and DF coil.

and the calculated $FOM_{Tx}$ are 122.3 and 123.8 dB, respectively. The PCB photographs of the three readers are shown in Fig. 2.15(a), and the EM-simulated $FOM_{Tx}$ for the three designs are plotted in Fig. 2.15(b) versus the IPT frequency. At frequencies very close to the coil SRF (e.g., 4.2 GHz for SE1 and 5.5 GHz for SE2), the $FOM_{Tx}$ cannot be evaluated accurately from the EM simulation because both the simulated input resistance and the mutual inductance are extremely high. At 5 GHz, the EM-simulated $FOM_{Tx}$ are 129.0, 121.5, and 123.0 dB for the DF, SE1, and SE2 coils, respectively. The simulated $FOM_{Tx}$ are about 1 dB lower than the results obtained from the analytical analysis. The good agreement verifies the analytical approach.

The photograph and schematic of the adopted rectenna are shown in Fig. 2.13. The on-silicon inductor uses N = 4 and  $w = 3 \ \mu \text{m}$  and has a  $FOM_{Rx}$  of -131.1 dB at 5 GHz. The achieved  $FOM_{Rx}$  is also close to the optimal value. The IPT performances for the three reader coils and the rectenna are summarized in Table 2.2. For the designed DF coil, the required power, calculated by the analytical approach, is 31.1 dBm. The EM-simulated result, including the loss of the impedance matching network (MNW), is 33.2 dBm. The MNW transforms the coil input impedance to 50  $\Omega$  to extract power from the PA module. Substantially higher reader powers around 10 W are required for the two SE reader coils.

Although the ground plane attenuates the magnetic field generated by the reader coil, the radiation resistance is known to reduce with the presence of the ground plane [106]. The coil input resistance, mutual inductance, and  $FOM_{Tx}$  of the designed DF coil are simulated, with and without the ground plane, and the results are plotted in Fig. 2.16. It shows that removing the ground plane indeed increases the mutual inductance from 4.1 pH to 5.6 pH at 5 GHz, but  $FOM_{Tx}$  does not improve due to the increasing input resistance of the coil from 3.9 to 9.0  $\Omega$ .
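A rough check of this trade-off uses the simulated values at 5 GHz. It assumes that, at a fixed frequency and for a fixed tag, $FOM_{Tx}$ scales as the squared mutual inductance divided by the coil input resistance, which is an illustrative simplification rather than a relation stated in the text:

$$\Delta FOM_{Tx} \approx 20\log_{10}\frac{5.6}{4.1} - 10\log_{10}\frac{9.0}{3.9} \approx 2.7\ \text{dB} - 3.6\ \text{dB} \approx -0.9\ \text{dB}$$

Under this assumption, the gain in mutual inductance from removing the ground plane is more than offset by the higher input resistance.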

| PCB Substrate                                 | 1.6-mm Rogers 4003C   |                  |                  |
|-----------------------------------------------|-----------------------|------------------|------------------|
| Coupling Distance (mm)                        | 2.2                   |                  |                  |
| Reader Coil Structure                         | DF                    | $SE_1$           | $SE_2$           |
| IPT Frequency (GHz)                           | 5.0                   |                  |                  |
| Optimal Reader-Coil FOM<sub>Tx</sub> (dB)     | 131.2                 | 125.4            | 125.4            |
| Optimal Rectenna FOM<sub>Rx</sub> (dB)        | -130.1                |                  |                  |
| Coil Radius (mm)                              | 1.9                   | 1.9              | 1.4              |
| Trace Width (mm)                              | 0.4                   | 0.4              | 0.2              |
| Calculated Reader-Coil FOM<sub>Tx</sub> (dB)  | 130.0                 | 122.3            | 123.8            |
| Rectenna Coil Size (µm)                       | $100 \times 100$      |                  |                  |
| Rectenna Inductor Turns/Width                 | $N = 4/w = 3 \ \mu m$ |                  |                  |
| Calculated Rectenna FOM<sub>Rx</sub> (dB)     | -131.1                |                  |                  |
| Calculated Minimum Power (dBm)                | 28.9                  | 34.7             | 34.7             |
| Calculated Power (dBm)                        | 31.1                  | 38.8             | 37.3             |
| EM-simulated FOM<sub>Tx</sub> (dB)            | 129.0                 | 121.5            | 123.0            |
| EM-simulated FOM<sub>Rx</sub> (dB)            | -131.3<sup>\$</sup>   |                  |                  |
| EM-simulated Power (dBm)                      | 32.3                  | 39.8             | 38.3             |
| EM-simulated Power (dBm)\*                    | 33.2                  | 40.7             | 39.2             |
| Measured Power (dBm)                          | 33.1                  | 40.5<sup>%</sup> | 39.5<sup>%</sup> |

<sup>\$</sup> Use full-wave EM simulation for the CMOS inductor

\* Include the loss of the reader-coil impedance matching network

<sup>%</sup> Estimated from the IPT back-off characteristic with the DF reader coil

Table 2.2: IPT performance with a miniature tag.



Figure 2.15: (a) Photograph of the three PCB readers: SE1, SE2, and DF. (b) EM-simulated  $FOM_{Tx}$  for the three reader coils.

### **IPT** Measurement

The measured input return losses $(|S_{11}|)$ of the three readers are plotted in Fig. 2.17. According to the $dB(|S_{11}|)$ responses, the operation IPT frequencies for the three coils in the measurement, corresponding to the lowest $|S_{11}|$, are 4.80, 4.92, and 4.77 GHz for the DF, SE1, and SE2 PCB readers, respectively. As in the tags reported in [21, 22], the on-chip varactor helps align the resonance frequency of the rectenna LC-tank with the operation frequency. The rectenna output dc voltage versus the varactor bias ($V_{bias}$) is plotted in Fig. 2.18. The rectenna is placed 2.2 mm above the center of the reader coil, controlled precisely by a probe station. The ground pad, varactor bias pad, and rectifier output pad of the rectenna are wirebonded to an external waveform generator and a multimeter for performance characterization. When using the DF coil, two signal generators (SG) and PA modules (ZVE-8G) are used with the same output power (30.1 dBm). The phase of one SG is adjusted to ensure a phase difference of 180° between the two PA outputs.

Fig. 2.18 demonstrates that with a total RF power of 33.1 dBm, the rectenna outputs the desired dc condition of 1 V at a 10-k$\Omega$ load resistor and harvests a dc power of 0.1 mW (RF-dc PTE of -43 dB). The optimal varactor bias is 0.8 V. Fig. 2.18 also shows that with the same RF power of 33.1 dBm, the SE1 and SE2 coils can only charge the rectenna to 0.20 V (4 $\mu$W) and 0.27 V (7 $\mu$W), respectively. To estimate the reader power required for the rectenna to harvest the required 0.1-mW dc power, the rectenna output is also measured with the input power of the DF reader reduced to lower levels, and the results are annotated



Figure 2.16: EM-simulated input resistance, mutual inductance, and  $FOM_{Tx}$  for the designed DF coil with and without the ground plane.

in Fig. 2.18. The back-off characteristics ($V_{out} = 0.27$ V under 26.7 dBm and $V_{out} = 0.20$ V under 25.7 dBm) of the DF coil predict that the input powers have to be raised to 40.5 and 39.5 dBm for the SE1 and SE2 readers, respectively, for the rectenna to harvest the required dc power. This agrees well with the analysis and simulation.
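The measurement arithmetic in this passage can be reproduced with a short sketch. It assumes that an SE coil needs the same extra power (in dB) over the DF coil at every output level, so the dB gap observed at the back-off point carries over to the full 1-V condition; the function names are hypothetical.

```python
import math


def dc_power_dbm(v_out, r_load_ohm):
    """DC power delivered to the load, expressed in dBm."""
    return 10.0 * math.log10(v_out**2 / r_load_ohm / 1e-3)


def rf_dc_pte_db(v_out, r_load_ohm, p_rf_dbm):
    """RF-to-dc power transfer efficiency in dB."""
    return dc_power_dbm(v_out, r_load_ohm) - p_rf_dbm


def required_power_dbm(p_se_dbm, p_df_backoff_dbm, p_df_full_dbm):
    """Estimate the SE reader power for the full dc output.

    p_se_dbm:         power at which the SE coil was measured
    p_df_backoff_dbm: DF power that yields the same rectenna voltage
    p_df_full_dbm:    DF power that yields the full dc output
    """
    return p_df_full_dbm + (p_se_dbm - p_df_backoff_dbm)


pte = rf_dc_pte_db(1.0, 10e3, 33.1)             # 1 V at 10 kOhm, 33.1 dBm
se1 = required_power_dbm(33.1, 25.7, 33.1)      # SE1 reached 0.20 V
se2 = required_power_dbm(33.1, 26.7, 33.1)      # SE2 reached 0.27 V
print(f"PTE = {pte:.1f} dB, SE1 = {se1:.1f} dBm, SE2 = {se2:.1f} dBm")
```

The estimated gaps of 7.4 dB (SE1) and 6.4 dB (SE2) relative to the DF coil are in line with the $FOM_{Tx}$ differences found in the analysis and EM simulation.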

Since the reader MNW is designed for a differential input signal, the reflected powers on both sides can be minimized simultaneously only when the input is differential. Fig. 2.19 plots the measured reflected powers at both inputs of the DF reader versus the phase deviation from the desired 180°. The corresponding rectenna output dc voltage is also plotted. It shows that the phase deviation increases the reflected powers, so less power is delivered into the coil. The rectenna output voltage remains higher than 0.9 V (81 $\mu$W) when the phase offset is less than 50°.

Table 2.3 summarizes the reported IPT performances employing a sub-millimeter tag (i.e., [2, 13, 15, 19–22]). The IPT FOM proposed in [6] compares the works with the coupling distance and the size of the rectenna coil normalized. This work achieves the best RF-dc IPT FOM of -2.7 dB with a well-optimized DF reader coil aided by the equation-based approach.



Figure 2.17: Measured input reflection coefficient (S11) for the three reader coil designs.



Figure 2.18: Measured rectenna output voltage versus the varactor bias for the three IPT designs.



Figure 2.19: Measured reflection powers at the two inputs of the DF reader coil versus the phase difference between the two inputs.

| Ref.      | Tag Coil (mm²) | Distance (mm) | RF Freq. (GHz) | Power into Coil MNW (dBm) | dc Power (mW) | RF-RF Eff. (dB) | RF-dc Eff. (dB) | \*IPT FOM (dB) | Integration Level  |
|-----------|----------------|---------------|----------------|---------------------------|---------------|-----------------|-----------------|----------------|--------------------|
| [13]      | 0.50           | 0.5           | 2.5            | 24                        | 0.09          | N.A.            | -34             | -38            | CMOS Rectenna      |
| [15]      | 0.13           | 1.0           | 1.5            | 17                        | 0.01          | N.A.            | -37             | -24            | CMOS Rectenna      |
| [22]      | 0.04           | 1.1           | 2.0            | 21                        | 0.1           | -29 (Sim)       | -31             | -9             | CMOS Rectenna      |
| [2]       | 0.014          | 0.05          | 3.5            | 22                        | 13.7          | N.A.            | -10             | -21            | CMOS Rectenna      |
| [19]      | 0.01           | 0.5           | 2.0            | N.A.                      | N.A.          | -27             | N.A.            | N.A.           | Coils only         |
| [20]      | 0.01           | 0.5           | 2.5            | 20                        | 0.14          | N.A.            | -29             | -8             | Discrete Rectifier |
| [21]      | 0.01           | 1.2           | 2.0            | 38                        | 0.1           | -46 (Sim)       | -48             | -16            | CMOS Rectenna      |
| [21]      | 0.01           | 1.2           | 4.7            | 31                        | 0.1           | -39 (Sim)       | -41             | -9             | CMOS Rectenna      |
| This Work | 0.01           | 2.2           | 4.8            | 33                        | 0.1           | -41 (Sim)       | -43             | -3             | CMOS Rectenna      |

\*IPT FOM = (Eff.)× Dist.<sup>3</sup>/Tag Coil Area<sup>1.5</sup> [6]

Table 2.3: Reported IPT Performance with a miniature tag (tag size  $< 1 \text{ mm}^2$ )
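The IPT FOM in the table footnote can be evaluated in dB as follows. The sketch assumes that the distance is taken in mm, the tag coil area in mm², and the RF-dc efficiency in dB, which is the unit choice that reproduces the tabulated entries; the function name is hypothetical.

```python
import math


def ipt_fom_db(eff_db, distance_mm, tag_area_mm2):
    """IPT FOM = Eff. x Dist.^3 / (Tag Coil Area)^1.5, evaluated in dB."""
    return (eff_db
            + 30.0 * math.log10(distance_mm)
            - 15.0 * math.log10(tag_area_mm2))


# (label, RF-dc efficiency in dB, distance in mm, tag coil area in mm^2)
entries = [
    ("This work", -43.0, 2.2, 0.01),
    ("[22]", -31.0, 1.1, 0.04),
    ("[21] at 4.7 GHz", -41.0, 1.2, 0.01),
]
for label, eff, dist, area in entries:
    print(f"{label}: {ipt_fom_db(eff, dist, area):.1f} dB")
```

The cubic distance term and the area exponent of 1.5 explain why this work scores best despite its lower raw efficiency: it combines the longest coupling distance with the smallest tag coil among the CMOS rectennas listed.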

### Reexamination of the Reported IPTs in [21, 22]

The two 1.2-mm IPT designs reported in [21] were optimized at 2.0 and 4.7 GHz. Although  $FOM_{Tx}$  and  $FOM_{Rx}$  have not been defined in [21], the reader coil and the rectenna designs were decoupled utilizing the tag miniaturization. The optimization of the reader coil and the rectenna relied completely on EM simulation, so the explored design space is limited. Here the analytical expressions and optimization are adopted again to identify the optimal designs and evaluate the two previous designs, which are revealed to be suboptimal.

The two IPTs both employ SE reader coils, and the coils are realized on a 1.6-mm FR4 substrate with $\tan \delta = 0.02$ and $\epsilon_r = 4.4$. With the updated substrate parameters and the new coupling distance of 1.2 mm, the analytically optimized reader-coil FOM $(FOM_{Tx})$, including the radiation resistance, is plotted in Fig. 2.20. The optimal $FOM_{Tx}$ is 131.4 dB at 2.0 GHz and 133.3 dB at 4.7 GHz, while the calculated $FOM_{Tx}$ are 128.8 dB and 131.6 dB for the previous 2-GHz and 4.7-GHz designs, respectively. The reader-coil design for the 2-GHz IPT deviates substantially from the optimal one. On the other hand, the rectenna was fabricated in the same CMOS process and the coil size is also limited to 0.1 mm × 0.1 mm, so the corresponding $FOM_{Rx}$ can be checked in Fig. 2.14. In [21], the realized $FOM_{Rx}$ for the 2-GHz and 4.7-GHz designs are -131.9 dB (N = 6, $w = 3\ \mu$m) and -131.2 dB (N = 4, $w = 3\ \mu$m), respectively, close to the optimal values of -130.9 dB (N = 7, $w = 3\ \mu$m) and -129.8 dB (N = 6, $w = 3\ \mu$m). Finally, Table 2.4 summarizes the IPT performances obtained by the analytical method, EM simulation, and measurement. Using the analytical expressions, the calculated IPT powers for the previous 2-GHz/4.7-GHz designs (33.1/29.6 dBm) are close to the results from the EM simulation (34.5/31.0 dBm).

The second design example is the 2-GHz IPT reported in [22]. The coupling distance in [22] is 1.1 mm and the rectenna coil size is 0.2 mm × 0.2 mm. This design also uses a SE coil on a 1.6-mm FR4 substrate, and the optimized $FOM_{Tx}$ has already been plotted in Fig. 2.20. The optimal $FOM_{Tx}$ is 132.7 dB at 2 GHz, while the calculated $FOM_{Tx}$ is 129.0 dB for the reader coil used in [22], which has a coil radius of 1.8 mm and a trace width of 0.25 mm. The coil design is far from optimal, with a deviation of 3.7 dB in $FOM_{Tx}$. On the other hand, the calculated $FOM_{Rx}$ are plotted in Fig. 2.21 for different tag inductors with an inductor size of 0.2 mm × 0.2 mm. The larger inductor footprint and the lower operation frequency of 2 GHz suggest using an inductor with more turns. The larger inductor area also allows a wider metal trace of 5 $\mu$m to improve the inductor quality factor. The realized six-turn inductor in [22] with a trace width of 5 $\mu$m corresponds to an $FOM_{Rx}$ of -118.3 dB, quite close to the optimal $FOM_{Rx}$ of -118.0 dB (N = 7). The calculated IPT power is 19.3 dBm, which is substantially higher than the minimum calculated value of 15.3 dBm. The IPT power estimated by the analytical approach (19.3 dBm) is close to the EM-simulated value of 20.0 dBm. The IPT performances obtained by the analytical method, EM simulation, and experiment are summarized in Table 2.5.



Figure 2.20: Optimized  $FOM_{Tx}$  for the SE coil on 1.6-mm FR4 with coupling distance of 1.2 mm. (The results with coupling distance of 1.1 mm are also plotted for the next design example [22])

| PCB Substrate                                 | 1.6-mm FR4       |         |  |
|-----------------------------------------------|------------------|---------|--|
| Reader Coil Topology/Distance (mm)            | SE/1.2           |         |  |
| IPT Frequency (GHz)                           | 2.0              | 4.7     |  |
| Optimal Reader-Coil FOM <sub>Tx</sub> (dB)    | 131.4            | 133.3   |  |
| Optimal Rectenna FOM <sub>Rx</sub> (dB)       | -130.9           | -129.8  |  |
| Reader Coil Radius/Width (mm)                 | 1.5/0.2          | 0.7/0.2 |  |
| Calculated Reader-Coil FOM <sub>Tx</sub> (dB) | 128.8            | 131.6   |  |
| Rectenna Coil Size (µm)                       | $100 \times 100$ |         |  |
| Inductor Turns/Trace Width (µm)               | 6/3              | 4/3     |  |
| Calculated Rectenna FOM <sub>Rx</sub> (dB)    | -131.9           | -131.2  |  |
| Calculated Minimum Power (dBm)                | 29.5             | 26.5    |  |
| Calculated Power for Designs (dBm)            | 33.1             | 29.6    |  |
| EM-simulated Power (dBm)                      | 34.5             | 31.0    |  |
| EM-simulated Power* (dBm)                     | 35.8             | 31.8    |  |
| Measured Power (dBm)                          | 38.0             | 31.4    |  |

\* Include the loss of the reader-coil impedance matching network

Table 2.4: Reported IPT performance with a miniature tag [21].



Figure 2.21: Rectenna FOMs  $(FOM_{Rx})$  with different rectenna coil designs with inductor size of 200  $\mu$ m × 200  $\mu$ m.

| PCB Substrate                                 | 1.6-mm FR4       |
|-----------------------------------------------|------------------|
| Reader Coil Topology/Distance (mm)            | SE/1.1           |
| IPT Frequency (GHz)                           | 2.0              |
| Optimal Reader-Coil FOM<sub>Tx</sub> (dB)     | 132.7            |
| Optimal Rectenna FOM<sub>Rx</sub> (dB)        | -118.0           |
| Reader Coil Radius/Trace Width (mm)           | 1.8/0.25         |
| Calculated Reader-Coil FOM<sub>Tx</sub> (dB)  | 129.0            |
| Rectenna Coil Size (µm)                       | $200 \times 200$ |
| Inductor Turns/Trace Width (µm)               | N = 6/w = 5      |
| Calculated Rectenna FOM<sub>Rx</sub> (dB)     | -118.3           |
| Calculated Minimum Power (dBm)                | 15.3             |
| Calculated Power for the Design (dBm)         | 19.3             |
| EM-simulated Power (dBm)                      | 20.0             |
| EM-simulated Power* (dBm)                     | 20.7             |
| Measured Power (dBm)                          | 21.1             |

\* Includes the loss of the reader-coil impedance matching network.

Table 2.5: Reported IPT Performance with a miniature tag [22]
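The relation between the tabulated FOMs and powers can be checked with a short calculation. The sketch below assumes that the required power in dB rises one-for-one with the shortfall of each FOM from its optimum; this is an inference from the tabulated numbers rather than a formula quoted from the text, and the function name is hypothetical.

```python
def required_power_dbm(p_min_dbm, fom_tx_opt, fom_tx, fom_rx_opt, fom_rx):
    """Minimum power plus the dB shortfalls of both FOMs from their optima.

    Assumption: 1 dB of FOM shortfall costs 1 dB of source power.
    """
    tx_shortfall_db = fom_tx_opt - fom_tx
    rx_shortfall_db = fom_rx_opt - fom_rx
    return p_min_dbm + tx_shortfall_db + rx_shortfall_db


# Values taken from Table 2.5 (2-GHz design of [22]).
p_ref22 = required_power_dbm(15.3, 132.7, 129.0, -118.0, -118.3)

# Values taken from Table 2.4 (2.0-GHz and 4.7-GHz designs of [21]).
p_ref21_2g0 = required_power_dbm(29.5, 131.4, 128.8, -130.9, -131.9)
p_ref21_4g7 = required_power_dbm(26.5, 133.3, 131.6, -129.8, -131.2)

for label, value in [("[22] 2.0 GHz", p_ref22),
                     ("[21] 2.0 GHz", p_ref21_2g0),
                     ("[21] 4.7 GHz", p_ref21_4g7)]:
    print(f"{label}: {value:.1f} dBm")
```

Under this assumption the three results reproduce the "Calculated Power" rows of Tables 2.4 and 2.5 (33.1, 29.6, and 19.3 dBm).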



Figure 2.22: Calculated  $R_{rad,SEloop,n}$ ,  $R_{rad,DFloop,n}$ , and  $R_{rad,SEvia,n}$  as functions of  $l/\lambda$ . The (closed-form) analytical approximations are annotated.

### **Closed-form Approximations for the Radiation Resistances**

Although the radiation resistance can be evaluated numerically from (2.14)-(2.17), closed-form approximations for $R_{rad,SEloop}$, $R_{rad,DFloop}$, and $R_{rad,SEvia}$ are needed so that they can be included in the equation-based PTE optimization. It is observed from Fig. 2.22 that $log(R_{rad,SEvia,n})$, $log(R_{rad,SEloop,n})$, and $log(R_{rad,DFloop,n})$ can be approximated by first-order polynomials of $log(l/\lambda_{eff})$. The resulting closed-form approximations are annotated in Fig. 2.22. They are accurate (i.e., $R^2 > 0.97$) and have been included in the $FOM_{Tx}$ calculation and optimization.
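The fitting step can be illustrated as follows. This is a minimal sketch of a first-order fit in the log-log domain; it does not use the thesis data from (2.14)-(2.17). As stand-in data it uses the textbook free-space radiation resistance of an electrically small loop, $20\pi^2 (l/\lambda)^4$ with $l$ taken as the loop circumference, and the function name `fit_loglog` is hypothetical.

```python
import math


def fit_loglog(x, y):
    """Least-squares fit of log10(y) = slope * log10(x) + intercept.

    Returns (slope, intercept, r_squared).
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(lx, ly))
    ss_tot = sum((b - mean_y) ** 2 for b in ly)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Stand-in data (assumption): small-loop formula, not the thesis curves.
l_over_lambda = [0.01 * k for k in range(1, 11)]
r_rad = [20 * math.pi ** 2 * r ** 4 for r in l_over_lambda]

slope, intercept, r_squared = fit_loglog(l_over_lambda, r_rad)


def closed_form(r):
    """Closed-form approximation recovered from the fit."""
    return 10 ** intercept * r ** slope


print(f"R_rad ~ {10 ** intercept:.1f} * (l/lambda)^{slope:.2f}, R^2 = {r_squared:.4f}")
```

For curves that are not exact power laws, the same fit returns the best straight line in the log-log plane, and the reported $R^2$ indicates how much accuracy the closed form gives up.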

### 2.3 Turning-Point Bifurcation in Weakly-Coupled IPT

Eq. (2.1) indicates that improving the quality factors of the rectenna inductor $(Q_{tag})$ and the rectifier $(Q_{rect})$ can reduce the required source power. However, our work [110] has demonstrated that eq. (2.1) can become invalid as $Q_{tag}//Q_{rect}$ increases. More precisely, with a sufficiently high $Q_{tag}//Q_{rect}$ (e.g., $Q_{tag}//Q_{rect} > 30$), the $V_{rect}$-$P_{source}$ trace



Figure 2.23: Illustrative hysteresis  $V_{dc}$ - $P_{source}$  curve.

can exhibit a hysteresis response induced by the nonlinear matching capacitance and the nonlinear diode capacitance. Fortunately, in our design employing a CMOS rectenna, the undesired hysteresis should not occur.

Detailed analysis and experiments on the hysteresis phenomenon can be found in [110]. The hysteresis response is illustrated in Fig. 2.23. In such a case, as $P_{source}$ increases from a low value, $V_{rect}$ and the rectifier output dc voltage $(V_{dc})$ stay on the lower hysteresis branch. When the $(P_{source}, V_{rect})$ solution reaches the first turning-point bifurcation at a high input power, it jumps to the upper hysteresis branch. The desired solution calculated by (2.1) lies on the upper hysteresis branch, close to the second bifurcation point at a lower input power level. Although the desired solution can then be reached by reducing $P_{source}$, the input power at the first bifurcation point is much higher than the power predicted by (2.1). Therefore, the bifurcation is an undesired phenomenon.

Two IPT examples, one demonstrating the hysteresis (Rx2) and one mitigating it (Rx3), have been realized on FR4. The schematics are shown in Fig. 2.24. Fig. 2.25 plots the rectifier output voltage versus the reader power for the Rx2 rectenna at three test frequencies, all of which exhibit obvious hysteresis. The input powers at the first bifurcation point are 31.5, 28, and 24 dBm for IPT operation at 1.961, 1.911, and 1.872 GHz, respectively. The second rectenna (Rx3) uses an auxiliary parallel varactor to partially neutralize



Figure 2.24: Schematic and photograph of the example IPT systems.

the undesired device nonlinearity. Fig. 2.26 shows the lower hysteresis branches for Rx2 and Rx3. The measured powers at the first bifurcation points are below 23 dBm for Rx3, much lower than those of Rx2. Details on the auxiliary varactor and the compensation mechanism can be found in [110].


Figure 2.25: Simulated and measured output dc voltage versus input power for Rx2.



Figure 2.26: Performance comparison between Rx2 and Rx3. Only the power-up curves are shown.

# Chapter 3

# Uplink Designs for a Miniature Rectenna

### **3.1** Backscattering Uplink

Direct backscattering with direct ADC demodulation is used for the IPT design with the 200-$\mu$m rectenna [22]. The uplink block diagram is illustrated in Fig. 3.1(a). In the backscattering uplink, the reader senses the two digital tag states by measuring the reflected Tx power, which is modulated by applying a digital signal directly to the tag varactor. In this system, when the tag sends a "1" symbol, the on-tag LC-tank is in resonance ($V_{bias} = 1.4$ V); when the tag sends a "0" symbol, the on-tag LC-tank is off resonance ($V_{bias} = 0$ V). Alternatively, the varactor can be replaced by a fixed tuning capacitor in parallel or in series with a switch, so that the two tag states are obtained by turning the switch on and off [30]. Inevitably, the power transfer is compromised in the tag state where the on-tag LC-tank is off resonance. Several studies have been dedicated to energy harvesting while the uplink is operational [120], but this study does not focus on that aspect. A directional coupler (Krytar 158010) is used to extract the backscattering signal [32]. The coupler has an insertion loss of about 0.8 dB, an isolation of about 35 dB, and a coupling loss of 10 dB from the coupler output port to the isolation port. Since the achieved $|S_{11}|$ of the coil network is about -15 dB, the dominant Tx-to-Rx leakage path is the Tx power that is reflected at the coupler output port and then coupled to the isolation port. The noise floor associated with this leakage is substantially higher than the thermal noise floor, so the 10-dB coupling loss does not degrade the Rx SNR.
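The claim about the dominant leakage path can be made concrete with a small dB budget. This is a sketch under stated assumptions: the Tx power at the coupler input (20 dBm here) is a hypothetical value, the isolation is taken from the coupler input to the isolation port, and the reflected path is modeled as insertion loss, then $|S_{11}|$, then coupling loss. The function name is illustrative.

```python
def leakage_paths_dbm(p_tx_dbm, insertion_loss_db=0.8, isolation_db=35.0,
                      coupling_db=10.0, s11_db=-15.0):
    """Return (direct, reflected) Tx-to-Rx leakage levels in dBm.

    direct:    Tx power leaking through the finite coupler isolation.
    reflected: Tx power that passes the coupler, reflects off the coil
               network (|S11|), and couples into the isolation port.
    """
    direct = p_tx_dbm - isolation_db
    reflected = p_tx_dbm - insertion_loss_db + s11_db - coupling_db
    return direct, reflected


# Hypothetical Tx level; the difference between the paths does not depend on it.
direct_dbm, reflected_dbm = leakage_paths_dbm(20.0)
margin_db = reflected_dbm - direct_dbm

print(f"direct: {direct_dbm:.1f} dBm, reflected: {reflected_dbm:.1f} dBm")
print(f"reflected path is stronger by {margin_db:.1f} dB")
```

With the quoted coupler and $|S_{11}|$ values, the reflected path exceeds the isolation-limited path by about 9 dB under this model, which is consistent with the text identifying it as the dominant leakage mechanism.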

Fig. 3.1(b) shows the measured Rx power spectrum at the coupler isolation port, centered at the Tx carrier of 1.99 GHz. In this measurement, the sent data alternates between the two states "1" and "0" at a bit rate of 625 kb/s, as illustrated in Fig. 3.1(a). To keep the output dc voltage close to 1 V, only 3/16 of the 1.6-$\mu$s bit duration (0.3 $\mu$s) is used for data transmission; the remaining 13/16 of the period (1.3 $\mu$s) is used solely for power transmission ($V_{bias} = 1.4$ V), during which the rectenna LC-tank is in resonance.
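The bit-slot arithmetic above can be reproduced exactly with rational numbers. The variable names in this short sketch are illustrative.

```python
from fractions import Fraction

BIT_RATE_BPS = 625_000

# Bit duration in microseconds: 1e6 / 625e3 = 8/5 = 1.6 us.
bit_duration_us = Fraction(1_000_000, BIT_RATE_BPS)

# Portion of each bit used for data (3/16) and for power only (13/16).
data_slot_us = bit_duration_us * Fraction(3, 16)
power_slot_us = bit_duration_us * Fraction(13, 16)

print(f"bit duration: {float(bit_duration_us)} us")
print(f"data slot:    {float(data_slot_us)} us")
print(f"power slot:   {float(power_slot_us)} us")
```

The tank is therefore guaranteed to be in resonance for 13/16 (81.25%) of every bit, regardless of the data being sent.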



Figure 3.1: (a) Block diagram for the backscattering measurement. (b) Output spectrum of the backscattering signal. (c) Demodulated baseband constellation using oscilloscope and square demodulation.

In addition, the source power is raised to 23 dBm to overcome the insertion loss. The spectrum shows that the backscattering signal strength is -70 dBm, the in-band blocker at 1.99 GHz (Tx-to-Rx leakage tone) is -5 dBm, and the signal-to-blocker ratio (SBR) is -65 dBc. The noise is -154 dBm/Hz, so the SNR is 84 dB-Hz. To demodulate the 625-kb/s baseband signal, the backscattering signal was sampled by an Agilent 54855A oscilloscope with 8-bit resolution and a sampling rate of 5 GS/s. The sampling rate must be at least twice the Tx carrier frequency. The Rx signal is attenuated by 13 dB to fit the $\pm 40 \text{ mV}$ ADC full scale, so the blocker, the signal, and the noise are attenuated to -18 dBm, -83 dBm, and -167 dBm/Hz, respectively. It is important to note that the strong blocker forces a large ADC full-scale range and therefore a high quantization noise floor, which degrades the Rx SNR. The quantization noise can be calculated as $\Delta^2/12 = 8.5e-9 V^2$, or 1.7e-18 $V^2/\text{Hz}$ (about -165 dBm/Hz). Clearly, the quantization noise cannot be ignored and dominates the Rx noise.
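The quantization-noise estimate can be reproduced approximately with an ideal-quantizer model. The sketch below assumes an ideal 8-bit quantizer with $2^8$ steps across the 80-mV full-scale span, noise spread uniformly over a bandwidth equal to the 5-GS/s sampling rate (the choice that matches the quoted density), and a 50-$\Omega$ reference. These modeling choices are assumptions, so the results only land within a few percent of the quoted figures.

```python
import math


def quantization_noise(full_scale_span_v=0.080, bits=8, fs_hz=5e9, r_ohm=50.0):
    """Return (total noise in V^2, density in V^2/Hz, density in dBm/Hz)."""
    lsb_v = full_scale_span_v / 2 ** bits        # step size Delta
    total_v2 = lsb_v ** 2 / 12                   # Delta^2 / 12
    density_v2_per_hz = total_v2 / fs_hz         # assumption: spread over fs
    density_mw_per_hz = density_v2_per_hz / r_ohm / 1e-3
    return total_v2, density_v2_per_hz, 10 * math.log10(density_mw_per_hz)


total_v2, density_v2_per_hz, density_dbm_per_hz = quantization_noise()

print(f"Delta^2/12  = {total_v2:.2e} V^2")
print(f"density     = {density_v2_per_hz:.2e} V^2/Hz")
print(f"density     = {density_dbm_per_hz:.1f} dBm/Hz")
```

The resulting density of roughly -165 dBm/Hz sits above the attenuated Rx noise of -167 dBm/Hz, which is why the quantization noise dominates in this receiver.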

Next, the error vector magnitude (EVM) of the demodulated baseband signal is calculated and measured. The demodulation scheme used here is self-mixing, or square, demodulation. After 1000-tap digital filtering, equivalent to a digital comb low-pass filter with a bandwidth of 5 GHz/1000 = 5 MHz, the measured baseband constellation is plotted in Fig. 3.1(c) with an EVM of -24 dB. It has been reported in [22] that the phase of the 1.99-GHz in-band blocker is very close to the phase of the backscattering carrier at the same frequency; therefore, the EVM of the square demodulation, in the presence of the dominant in-band blocker, can be predicted by

$$EVM_{Square Demodulation in RF Domain} = \frac{2}{\pi^2} \frac{BW_{LPF}}{SNR_{SSB@RF}}.$$
(3.1)

Plugging $SNR_{SSB@RF} = 82$ dB-Hz and $BW_{LPF} = 5$ MHz into (3.1) predicts a demodulated baseband EVM of -22 dB, close to the measured result of -24 dB. This demonstrates that uplink communication based on direct backscattering and direct ADC sampling is feasible for this 1.1-mm IPT using a 0.2-mm rectenna. However, as the rectenna size shrinks further, the uplink signal inevitably becomes weaker while the Rx noise floor, the in-band blocker, and the resulting ADC quantization noise all rise and undermine this uplink technique.
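The evaluation of (3.1) can be written out in dB form. The sketch below assumes that the EVM in dB is $10\log_{10}$ of the ratio in (3.1), which is the convention that reproduces the quoted -22 dB; the function name is hypothetical.

```python
import math


def predicted_evm_db(snr_db_hz, bw_lpf_hz, prefactor=2 / math.pi ** 2):
    """EVM in dB from prefactor * BW_LPF / SNR, with SNR given in dB-Hz.

    Assumption: dB value is 10*log10 of the ratio in (3.1).
    """
    return (10 * math.log10(prefactor)
            + 10 * math.log10(bw_lpf_hz)
            - snr_db_hz)


# Values quoted in the text: SNR = 82 dB-Hz, BW_LPF = 5 MHz.
evm_rf_db = predicted_evm_db(82.0, 5e6)
print(f"predicted EVM: {evm_rf_db:.1f} dB")
```

The `prefactor` argument is exposed because the prefactor depends on the demodulation scheme; only the $2/\pi^2$ value of (3.1) is used here.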

The ADC quantization noise has been shown to be the dominant Rx noise contributor in [22]. In the following uplink designs [21, 29, 39, 43], the Rx signals are down-converted so that the strong in-band blocker is converted to dc and can be easily removed by a dc block. As a result, the ADC full-scale range can be tailored to the magnitude of the desired signal and the ADC quantization noise can be ignored. Nevertheless, it has been shown in [39] that the high phase noise associated with the strong in-band blocker cannot be completely canceled by the mixer down-conversion, and the residual noise still overwhelms the weak uplink signal. Direct backscattering is therefore not an appropriate uplink method for the rectenna with a 0.1-mm coil size.

Alternatively, the IF-based backscattering uplink [34] is adopted in [21, 29]. The signal applied to the tag varactor is a 2-MHz IF carrier modulated by a baseband signal. The uplink embedded in the 4.7-GHz, 1.2-mm IPT [21], together with the single-ended reader coil and the Rx system, is illustrated in Fig. 3.2(a). When the tag sends a "0", the tag varactor is driven by a 2-MHz sinusoidal waveform swinging between 0 and 1.2 V, whereas for a "1" the varactor is biased at 0.6 V (on-tag LC-tank in resonance). The 2-MHz frequency spacing between the backscattering signal and the Tx leakage protects the Rx spectrum from being contaminated by the Tx phase noise. Owing to the high Tx-to-Rx noise leakage, the coupler coupling loss and the mixer-first conversion scheme do not degrade the Rx SNR.

The measured Rx spectrum (in the IF domain) is plotted in Fig. 3.2(b). The baseband signal is a 20-kb/s alternating one-zero sequence (square-wave modulation). The phase of the LO signal has to be tuned to maximize the Rx power. It can be observed that the major part of the Rx noise is close to dc and can be easily filtered in the digital domain. The Rx signal is -32 dBm, the Rx noise is -99 dBm/Hz, and the Rx SNR in the IF domain is 67 dBc-Hz. The sampled IF signal is band-pass filtered, squared, and then low-pass filtered to demodulate the baseband signal. With the tag sending a 20-kb/s PRBS signal, the demodulated constellation is plotted in Fig. 3.2(c), and the constellation EVM is -25 dB. The EVM of the square demodulation applied to the IF signal has been calculated in [39] and can be expressed by

$$EVM_{Square Demodulation in IF Domain} = \frac{4}{\pi^2} \frac{BW_{LPF}}{SNR_{SSB@IF}}.$$
(3.2)

Plugging in  $SNR_{SSB@IF} = 67$  dBc-Hz and  $BW_{LPF} = 40$  kHz into (3.2) yields an EVM of -25 dB, very close to the measured result.
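The square demodulation used for this uplink (square, then low-pass filter) can be illustrated on a synthetic, noise-free IF signal. This is a minimal sketch, not the processing chain of [21]: the 20-MS/s sampling rate is a hypothetical choice, the band-pass filter is omitted because no noise is simulated, and the low-pass filter is a boxcar average over one bit. Following the text, a "0" is sent as a 2-MHz tone and a "1" as no tone.

```python
import math


def square_demodulate(samples, samples_per_bit):
    """Square every sample, then boxcar-average over each bit period."""
    squared = [s * s for s in samples]
    return [sum(squared[i:i + samples_per_bit]) / samples_per_bit
            for i in range(0, len(squared), samples_per_bit)]


FS_HZ = 20e6          # hypothetical sampling rate
F_IF_HZ = 2e6         # IF carrier from the text
BIT_RATE_BPS = 20e3   # uplink data rate from the text
SAMPLES_PER_BIT = int(FS_HZ / BIT_RATE_BPS)

tx_bits = [0, 1, 1, 0, 1, 0, 0, 1]

# Build the IF waveform: tone present for "0", absent for "1".
samples = []
for k, bit in enumerate(tx_bits):
    for n in range(k * SAMPLES_PER_BIT, (k + 1) * SAMPLES_PER_BIT):
        tone = math.cos(2 * math.pi * F_IF_HZ * n / FS_HZ)
        samples.append(tone if bit == 0 else 0.0)

energies = square_demodulate(samples, SAMPLES_PER_BIT)
threshold = (max(energies) + min(energies)) / 2
rx_bits = [0 if e > threshold else 1 for e in energies]

print("sent:     ", tx_bits)
print("recovered:", rx_bits)
```

Because squaring discards the IF phase, this detector needs no phase reference, but it also squares any noise inside the band-pass bandwidth.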

The second IF-based backscattering uplink is embedded in the 4.8-GHz, 2.2-mm IWPT using the differential reader coil [29]. In this design the IF frequency is set to 5 MHz. The block diagram is illustrated in Fig. 3.3(a). The Rx spectrum plotted in Fig. 3.3(b) shows a degraded Rx SNR of 62 dB-Hz due to the increased coupling distance. Based on (3.2) and the digital LPF bandwidth of 40 kHz (for a PRBS uplink data rate of 10 kb/s), the calculated demodulated baseband EVM is -20 dB. The measured constellation is plotted in Fig. 3.3(c). The measured EVM is -17.8 dB for the square demodulation, reasonably close to the estimated value. The EVM can be further improved by applying IQ demodulation to the IF signal in post-processing. The 5-MHz in-phase and quadrature signals are synthesized in Matlab, and the sampled IF signal is down-converted again and then low-pass filtered. In such a case, the demodulated baseband EVM improves by 3 dB compared to the square demodulation and can be expressed as:

$$EVM_{IQ \text{ Demodulation in IF Domain}} = \frac{2}{\pi^2} \frac{BW_{LPF}}{SNR_{SSB@IF}}.$$
(3.3)



Figure 3.2: (a) Block diagram of the first IF-based backscattering uplink. (b) Downconverted Rx spectrum with a 20-kb/s square-wave uplink signal. (c) Demodulated constellation for a 20-kb/s PRBS uplink.



Figure 3.3: (a) Block diagram of the second IF-based backscattering uplink. (b) Downconverted Rx spectrum with a 40-kb/s square-wave uplink signal. (c) Constellation for a 10kb/s PRBS uplink using square demodulation (top) and quadrature demodulation (bottom).

The measured constellation using the quadrature demodulation is plotted in Fig. 3.3(c). With the filter bandwidth of 40 kHz, the measured EVM is -20.5 dB. The EVM indeed improves by about 3 dB (2.7 dB measured) compared to the square demodulation.
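The quadrature (IQ) demodulation step can be sketched in the same spirit. This is an illustration under stated assumptions rather than the Matlab post-processing of [29]: the 50-MS/s sampling rate and the 0.7-rad IF phase offset are hypothetical, the signal is noise-free, the low-pass filter is a boxcar over one bit, and a "0" is assumed to be sent as a 5-MHz tone and a "1" as no tone.

```python
import cmath
import math


def iq_demodulate(samples, fs_hz, f_if_hz, samples_per_bit):
    """Mix with a complex LO at f_if, then boxcar-average each bit period."""
    symbols = []
    for start in range(0, len(samples), samples_per_bit):
        acc = 0j
        for n in range(start, start + samples_per_bit):
            lo = cmath.exp(-2j * math.pi * f_if_hz * n / fs_hz)
            acc += samples[n] * lo
        symbols.append(acc / samples_per_bit)
    return symbols


FS_HZ = 50e6            # hypothetical sampling rate
F_IF_HZ = 5e6           # IF frequency from the text
SAMPLES_PER_BIT = 5000  # 10 kb/s at 50 MS/s
IF_PHASE_RAD = 0.7      # hypothetical, unknown to the demodulator

tx_bits = [0, 1, 1, 0]
samples = []
for k, bit in enumerate(tx_bits):
    for n in range(k * SAMPLES_PER_BIT, (k + 1) * SAMPLES_PER_BIT):
        tone = math.cos(2 * math.pi * F_IF_HZ * n / FS_HZ + IF_PHASE_RAD)
        samples.append(tone if bit == 0 else 0.0)

symbols = iq_demodulate(samples, FS_HZ, F_IF_HZ, SAMPLES_PER_BIT)
rx_bits = [0 if abs(s) > 0.25 else 1 for s in symbols]

print("recovered:", rx_bits)
print("tone symbol magnitude:", round(abs(symbols[0]), 6))
```

The tone symbols come out with a magnitude of one half of the tone amplitude and an angle equal to the IF phase, so the complex baseband point carries the signal linearly instead of through a squaring operation.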

### 3.2 HD2 Uplink

It has been revealed that the performance of the direct backscattering uplink is limited by the ADC quantization noise because the high in-band blocker has to be accommodated by the ADC, and the IF-backscattering is limited by the Tx-to-Rx noise leakage. To put it simply, the Tx-to-Rx leakage is the main performance limiter.

An effective approach to remove the Tx-to-Rx interference/noise is to separate the Tx and Rx frequencies, as in frequency-division duplexing (FDD) radios. In FDD radios, the Tx leakage falling into the Rx frequency is filtered at the Tx output, and the Tx leakage outside the Rx frequency is filtered at the Rx input. For instance, a miniature radio [5] uses 24 GHz for the downlink and 60 GHz for the uplink. As expected, this kind of technique has to generate a new frequency on the tag, and two sets of (or dual-band) antennas and matching networks are required on both the tag and the reader for the two widely separated frequencies. To use a single set of passive networks for each tag and each reader, the IF-based backscattering uplink can be employed, which eliminates one set of passive networks from both the tag and the reader. Notice that for the IF-backscattering uplinks with Tx/Rx frequency spacings of 2 MHz [21] and 5 MHz [29], the Tx noise in the Rx band cannot be individually filtered at the PA output without degrading the wireless power transfer. Alternatively, the IF frequency can be increased to a much higher value (e.g., 100 MHz) [34]. This can indeed achieve a low Rx noise, but at the cost of implementing a high-frequency on-chip IF source and modulator, which raises the rectenna power consumption.

In most RFID and wireless power transfer applications, the diode rectifier has been used solely to generate the dc current. Recently, it was reported in [38] that by connecting a commercial UHF RFID tag to a dual-band antenna network (0.87 GHz and 2.61 GHz), the third-harmonic frequency generated by the tag can serve as an auxiliary uplink channel in addition to the main channel at the fundamental frequency. However, the detection range of the auxiliary channel, even with a high-gain horn antenna at Rx, is reduced compared to the fundamental backscattering uplink channel. The harmonic component was assumed to be generated by the rectifier.

A novel uplink scheme was proposed in our previous work [40] to separate the Tx and Rx frequencies without requiring an on-tag IF. That work is the first to propose an IPT design exploiting the rectifier second-order nonlinearity. The developed theory is verified by simulation and PCB implementation. In the proposed second-harmonic (SH) uplink, the Tx transmits a continuous waveform (1.8 GHz) and the tag rectifier works as a frequency doubler. The newly generated frequency (3.6 GHz) is distant from the Tx fundamental frequency used for the wireless powering, so the undesired Tx interference/noise at the Rx frequency can be filtered at the PA output without compromising the wireless power transfer. The tag-to-reader uplink can be performed by modulating the SH frequency. The coupled coils used for wireless power transfer are reused for the uplink (at the cost of a degraded UL power), so no additional area is required on the tag or the reader. Compared to the backscattering uplink, the Rx signal strength of the SH uplink is reduced by 34 dB. Nevertheless, thanks to the effective filtering of the Tx-to-Rx interference and noise, the SNR of the SH uplink

does not degrade, and it demonstrates a significantly improved Rx SBR, 37-dB better than the backscattering uplink.

To be clear, Fig. 3.4(a) illustrates the block diagram of a conventional IPT system employing direct backscattering. The reflected power can be modulated by applying the digital signal directly to the tag varactor/switch. The Rx frequency is very close to the Tx frequency, so the Tx-to-Rx leakage and noise are difficult to suppress without attenuating either the wireless power transfer or the backscattering signal. Fig. 3.4(b) illustrates the proposed SH uplink. Similarly, a digital signal is applied to the tag varactor. When the on-rectenna LC-tank is in resonance, the reactance matching at the tag is enabled and the voltage swing at the diode rectifier is sufficient to generate the required dc current; a substantial SH current is generated as a beneficial by-product. The SH current returns to the reader Rx through the coil coupling. When the tag LC-tank is off resonance in the second tag state, the rectifier input swing is low and little SH current is excited and received at the Rx. Therefore, the tag state can be distinguished by measuring the Rx SH component. This technique does not rely on the coupler/circulator isolation or on accurate reader impedance matching to reduce the Tx-to-Rx leakage. More importantly, as already emphasized, the Tx interference/noise at the SH can be eliminated by an external filter placed at the output of the Tx source. This low-pass filter (LPF) does not degrade the fundamental power. The Tx interference at the fundamental frequency can also be easily filtered by a high-pass filter placed at the Rx input, which does not attenuate the Rx signal.

Fig. 3.5(a) shows the SH uplink prototype on PCB. This IPT and uplink do not target any specific application. The PA output interference at 3.6 GHz is filtered at the PA output, and the Rx out-of-band blocker (1.8 GHz) is filtered at the Rx input. The measured Rx SH power and the associated noise floor are plotted in Fig. 3.5(b) as functions of the rectenna varactor bias voltage. The SH power reaches its maximum value of -56 dBm at $V_{bias} = 2.5$ V, quite close to the simulation [40]. The noise floor is about -152 dBm/Hz. Fig. 3.5(b) indicates that the Tx-to-Rx leakage at the SH can be suppressed to -64 dBm (when $V_{bias} = 0$ V). The desired Rx SH signal (-56 dBm) is thus 8 dB stronger than the Tx-to-Rx leakage at the SH frequency (-64 dBm).

In the second step, the uplink sends a square-wave signal: a square waveform alternating between 2.5 V and 0 V is applied to the varactor at a data rate of 1 Mb/s. The output dc voltage drops to 0.75 V because the power transfer is compromised when the varactor is biased at 0 V. The Rx spectrum centered at 3.6 GHz is shown in Fig. 3.6(a). The SBR is +4 dBc, the noise floor is -150 dBm/Hz, and the Rx SNR is 89 dBc-Hz. Finally, the Rx spectrum measured at the fundamental frequency is presented in Fig. 3.6(b). The measurement is conducted with the HPF removed from the Rx input. It shows that the backscattering uplink improves the signal strength by 34 dB, from -61 dBm to -27 dBm, but the SBR degrades from +4 dBc to -33 dBc, with a high Tx-to-Rx interference of 6.3 dBm. The noise floor is also substantially higher in the backscattering uplink (-116 dBm/Hz). Given the comparable SNR of the two approaches, the better SBR performance (+4 dBc vs. -33 dBc) and the similar system complexity justify the selection of the SH uplink over the backscattering uplink. More details and comparisons can be found in our published



Figure 3.4: IPT block diagram with uplink employing (a) conventional direct backscattering uplink and (b) proposed SH uplink



Figure 3.5: (a) Schematic for the SH uplink prototype. (b) Measured Rx SH power and noise (at 3.6 GHz) versus the varactor bias.

paper [40].
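The SNR and SBR figures quoted for the two uplinks follow from simple dB differences, as the short sketch below shows. The helper names are illustrative, and the SH in-band blocker level is not stated directly in the text, so it is inferred here from the quoted +4-dBc SBR.

```python
def snr_db_hz(signal_dbm, noise_dbm_per_hz):
    """SNR normalized to a 1-Hz bandwidth."""
    return signal_dbm - noise_dbm_per_hz


def sbr_db(signal_dbm, blocker_dbm):
    """Signal-to-blocker ratio."""
    return signal_dbm - blocker_dbm


# SH uplink (3.6 GHz): signal -61 dBm, noise floor -150 dBm/Hz.
sh_snr = snr_db_hz(-61.0, -150.0)
sh_blocker_dbm = -61.0 - 4.0   # inferred from the quoted SBR of +4 dBc

# Fundamental backscattering: signal -27 dBm, noise -116 dBm/Hz, blocker 6.3 dBm.
bs_snr = snr_db_hz(-27.0, -116.0)
bs_sbr = sbr_db(-27.0, 6.3)

sbr_improvement_db = sbr_db(-61.0, sh_blocker_dbm) - bs_sbr

print(f"SH uplink:      SNR = {sh_snr:.0f} dB-Hz, SBR = +4 dBc")
print(f"Backscattering: SNR = {bs_snr:.0f} dB-Hz, SBR = {bs_sbr:.1f} dBc")
print(f"SBR improvement of the SH uplink: {sbr_improvement_db:.1f} dB")
```

Both uplinks end up at 89 dB-Hz, so the 34-dB loss in signal strength of the SH uplink is offset by an equally lower noise floor, while the blocker drops by roughly 71 dB.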

The SH uplink is a useful technique to drastically reduce the Tx-to-Rx leakage, and the use of the rectifier nonlinearity brings little design overhead. However, the improvement on the Rx SNR is not significant for the SH uplink. This is because the rectenna LC-tank does not resonate at the SH frequency, so the Rx uplink power at the SH frequency is very weak. Although the impedance matching network on the reader side is designed to match the coil impedance at both the fundamental and the SH frequencies, the introduced loss is higher at the SH frequency.

### 3.3 Near-Field IM3 Uplink

To solve the (dual-RF-frequency) bandwidth issue while keeping the merit of separating the Tx/Rx frequency, a novel IM3 uplink technique is proposed. In our proposed IM3 uplink prototype introduced in [39, 41–43], a two-tone Tx is adopted and the IM3 frequency generated by the tag rectifier third-order nonlinearity is used as the Rx carrier, which is modulated by a baseband signal sent by the tag. The uplink signal at the IM3 frequency can be picked up by the reader coil. The IPT impedance matching networks for both the reader and the tag coil can be reused efficiently, since the IM3 frequency at 5.808 GHz is close to the Tx fundamental frequencies at 5.728 and 5.768 GHz.

Due to the Tx/Rx frequency separation, the Tx-to-Rx leakage can be suppressed by a commercial duplexer with a Tx band from 5.725 to 5.770 GHz and an Rx band from 5.805 to 5.850 GHz. With the aid of the duplexer, both the Rx signal-to-noise ratio (SNR) and the signal-to-blocker ratio can be significantly improved. The proposed technique with the duplexer is implemented within a 1.1-mm, 5.75-GHz IPT system [43], using a tiny CMOS rectenna with a coil size of only 0.01 mm<sup>2</sup>. The achieved uplink data rate is 10 Mb/s, two orders of magnitude better than previously reported systems.
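The frequency plan can be checked with the standard third-order intermodulation relations $2f_1 - f_2$ and $2f_2 - f_1$. The sketch below works in integer MHz to avoid rounding issues; the function and constant names are illustrative.

```python
def im3_products_mhz(f1_mhz, f2_mhz):
    """Return the (lower, upper) third-order intermodulation frequencies."""
    return 2 * f1_mhz - f2_mhz, 2 * f2_mhz - f1_mhz


def in_band(freq_mhz, band_mhz):
    """True if the frequency lies inside the closed band (low, high)."""
    return band_mhz[0] <= freq_mhz <= band_mhz[1]


DUPLEXER_TX_BAND_MHZ = (5725, 5770)
DUPLEXER_RX_BAND_MHZ = (5805, 5850)

f1_mhz, f2_mhz = 5728, 5768   # two-tone Tx frequencies from the text
im3_low_mhz, im3_high_mhz = im3_products_mhz(f1_mhz, f2_mhz)

print(f"IM3 products: {im3_low_mhz} MHz and {im3_high_mhz} MHz")
print("Tx tones in duplexer Tx band:",
      in_band(f1_mhz, DUPLEXER_TX_BAND_MHZ) and in_band(f2_mhz, DUPLEXER_TX_BAND_MHZ))
print("Upper IM3 in duplexer Rx band:",
      in_band(im3_high_mhz, DUPLEXER_RX_BAND_MHZ))
```

The upper product at 5808 MHz is the one used as the Rx carrier; the lower product at 5688 MHz falls outside both duplexer bands.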

Fig. 3.7 shows the schematic and photograph of the designed 5.75-GHz IPT. The rectenna is fabricated in 65 nm CMOS and the reader coil is on Rogers 4003 substrate. The coupled coils and the rectifier are optimized to have the highest power transfer efficiency to harvest a 1-V dc voltage at a 10-k $\Omega$  load resistor, at a 1.1-mm coupling distance. For the CW excitation (5.748 GHz) and two-tone Tx excitation (5.728/5.768 GHz), the simulated and measured  $V_{out}$  are shown in Fig. 3.8(a) as functions of the varactor bias ( $V_{bias}$ ). The measured results are close to simulation. A total power of 29 dBm is required to power up the rectenna in both schemes, and the optimal  $V_{bias}$  is 2 V.

As illustrated in Fig. 3.7(a), the two-tone voltage swing at the rectifier input, with the magnitude denoted by $V_{rect}$, excites a rectifier IM3 current, denoted by $I_{IM3,rect}$. The LC-tank is close to resonance at the IM3 frequency, so the IM3 current flowing into the tag coil, denoted by $I_{IM3,coil2}$, is higher than $I_{IM3,rect}$. At the reader coil, an induced voltage $V_{IM3} = Z_{21} \times I_{IM3,coil2}$ appears in series with the source impedance $Z_{11}$, and the maximum available Rx IM3 power is $|V_{IM3}|^2/(8 \times Real(Z_{11}))$. A more detailed explanation can be found in [39], which presents the first IM3 UL prototype without a duplexer.
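The available-power expression can be evaluated as follows. This is a sketch with purely hypothetical impedance and current values chosen for illustration (they are not taken from the thesis); it assumes peak-amplitude phasors, which is the convention under which the factor of 8 applies.

```python
import math


def available_im3_power_dbm(z21_ohm, z11_ohm, i_tag_coil_peak_a):
    """Maximum available power of the induced IM3 source, in dBm.

    The induced voltage is V = Z21 * I (peak phasor), and the power
    available from a source with impedance Z11 is |V|^2 / (8 * Re(Z11)).
    """
    v_im3 = z21_ohm * i_tag_coil_peak_a
    p_watt = abs(v_im3) ** 2 / (8 * z11_ohm.real)
    return 10 * math.log10(p_watt / 1e-3)


# Hypothetical example values, for illustration only.
z21 = 0.5j            # weak mutual impedance, ohms
z11 = 2.0 + 40.0j     # reader-coil self impedance, ohms
i_im3 = 1e-3          # tag-coil IM3 current, amperes (peak)

p_im3_dbm = available_im3_power_dbm(z21, z11, i_im3)
print(f"available IM3 power: {p_im3_dbm:.2f} dBm")
```

Only the real part of the reader-coil impedance enters the denominator, so a lower coil loss resistance directly raises the available uplink power for a given mutual impedance.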



Figure 3.6: (a) Measured Rx spectrum for the SH uplink. (b) Measured Rx spectrum for the fundamental backscattering.



Figure 3.7: (a) Schematic and (b) die/PCB photographs of the designed CMOS rectenna and IPT in the IM3 uplink.

With a two-tone power (5.728/5.768 GHz) of 27 dBm/tone, the simulated $V_{rect}$, $V_{out}$, $I_{IM3,rect}$, $I_{IM3,coil2}$, and the upper IM3 power at 5.808 GHz ($IM3_H$) returned to the 50-$\Omega$ source are plotted in Fig. 3.8(b) as the varactor bias is swept. As $V_{bias}$ increases, the LC-tank resonance frequency approaches the Tx frequencies and $V_{rect}$ increases; as a result, $V_{out}$, $I_{IM3,rect}$, $I_{IM3,coil2}$, and the IM3 power all increase. Simulation shows that the Rx IM3 power can reach -60 dBm and is effectively modulated by $V_{bias}$.

Fig. 3.9 illustrates the IM3 uplink system. The spectral representations of signals at many system nodes are provided. The Tx/Rx frequency and bandwidth of the 2-dB-loss duplexer perfectly match the proposed two-tone IPT and IM3 uplink. The duplexer has a Tx-to-Rx isolation better than 70 dB. Although the PA output has a high noise and IM3 spur, respectively of -128 dBm/Hz and -18 dBm (point A), a clean Rx spectrum can be achieved, even after LNA amplification with gain of 25 dB (noise figure = 4 dB). The measured noise floor of -145 dBm/Hz at the mixer input (point C) indicates the noise at the duplexer Rx output is close to the thermal noise. At the mixer input, the OOB Tx blockers are lower than -20 dBm and the PA IM3H leakage is only -67 dBm. The mixer down-conversion (LO = 5.748 GHz) and IF filtering and amplification result in a total conversion gain of 38 dB.

Without the tag present, the Rx spectrum in the IF domain, centered at 60 MHz, is shown in Fig. 3.10(a). The in-band blocker is -29 dBm and the noise floor is -105 dBm/Hz. When the square-wave modulation illustrated in Fig. 3.9 is applied to $V_{bias}$ ($f_{BB} = 2.5$ MHz), the Rx spectrum is shown in Fig. 3.10(b) with two sidebands at -11 dBm. The in-band blocker is 0 dBm, partially contributed by the neighboring circuits co-illuminated by the two-tone Tx power. Based on this spectral information, the desired IM3 power from the rectenna with a constant bias $V_{bias} = 2$ V can be estimated as -1 dBm at the IF output. This measured result is close to the simulation: applying the 2-dB duplexer loss, the 25-dB LNA gain, and the 38-dB conversion gain to the simulated IM3 power of -60 dBm predicts an IM3 power of 1 dBm. The backscattering sidebands close to the Tx frequencies are also plotted in Fig. 3.9, even though they are not used here.
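Both numbers in this comparison can be reproduced with a short calculation. The sketch assumes ideal on-off modulation of the IM3 carrier by a 50%-duty square wave, for which each first-order sideband has $1/\pi$ of the unmodulated carrier amplitude; the text does not state how the -1-dBm estimate was obtained, so this model is an assumption, and the function names are illustrative.

```python
import math


def carrier_from_sideband_dbm(sideband_dbm):
    """Unmodulated carrier power implied by one first-order sideband.

    Assumption: ideal 50%-duty on-off keying, so each first-order
    sideband amplitude is 1/pi of the unmodulated carrier amplitude.
    """
    return sideband_dbm + 20 * math.log10(math.pi)


def cascade_dbm(p_in_dbm, gains_db):
    """Apply a chain of gains (negative values are losses) in dB."""
    return p_in_dbm + sum(gains_db)


# Measured: sidebands at -11 dBm in the IF spectrum.
measured_carrier_dbm = carrier_from_sideband_dbm(-11.0)

# Simulated: -60 dBm IM3 power, then duplexer loss, LNA gain, conversion gain.
predicted_carrier_dbm = cascade_dbm(-60.0, [-2.0, 25.0, 38.0])

print(f"estimated from measurement: {measured_carrier_dbm:.1f} dBm")
print(f"predicted from simulation:  {predicted_carrier_dbm:.1f} dBm")
```

Under these assumptions the measurement-based estimate and the simulation-based prediction differ by about 2 dB.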

Finally, a 5-Mb/s PRBS signal is applied to $V_{bias}$, with a high-state level of 2 V ( $V_{high} = 2$ V) and the low-state level, $V_{low}$, as an operating variable. The Rx IF signal is sampled by a 500-MS/s ADC (full range: $\pm 0.8$ V), for which the quantization noise can be neglected. The sampled 60-MHz IF signal is IQ down-converted and filtered by a 20-MHz comb filter to extract the baseband signal (in Matlab). The rectenna output dc voltage and the EVM of the demodulated constellation are plotted in Fig. 3.11(a) as functions of $V_{low}$. As expected, the harvested $V_{out}$ rises as $V_{low}$ approaches 2 V, but the Rx signal and the EVM both degrade. A 5-Mb/s uplink can be realized with an EVM of -26 dB and a $V_{out}$ of 0.83 V (with $V_{low} = 0$ V). The demodulated EVM can be related to the SNR measured in the IF domain by

$$EVM_{IM3, IQ \text{ Demodulation in IF Domain}} = \frac{2}{\pi^2} \frac{BW_{LPF}}{SNR_{SSB@IF}}.$$
(3.4)



Figure 3.8: (a) Rectenna dc voltage versus the varactor bias for one-tone (CW) and two-tone excitation. (b) Simulated rectifier ac swing $(V_{rect})$, output voltage $(V_{out})$, and the tag-generated IM3 currents $(I_{IM3,rect})$ and Rx IM3 power at 5.808 GHz (IM3 freq.).



Figure 3.9: System illustration of the proposed IM3 uplink embedded in the two-tone excited IWPT system.



Figure 3.10: (a) Measured Rx IF spectrum without the tag. (b) Measured Rx IF spectrum with a square-wave ( $V_{low} = 0$  V and  $V_{high} = 2$  V) applied to Vbias.

Plugging in  $BW_{LPF} = 20$  MHz and  $SNR_{SSB@IF} = 93$  dB-Hz into (3.4) predicts an EVM of -27 dB, quite close to the measured value of -26 dB. The data rate can be further enhanced. Fig. 3.11(b) plots the constellation with a 10-Mb/s uplink, where the EVM is -21 dB and  $V_{out}$  is 0.84 V. The EVM degradation is due to the insufficient bandwidth of the 60-MHz band-pass filter (Mini-Circuits BBP-60+).

This work features an unprecedented 10-Mb/s uplink, enabled by the IM3 uplink system employing a duplexer. Our first IM3 UL prototype, in contrast, was implemented without a duplexer and was compared to the conventional direct and IF-based backscattering uplinks; the details, together with further analysis of the IM3 uplink, can be found in [39]. The first prototype uses Tx frequencies of 4.94 and 5.0 GHz and an Rx frequency of 5.06 GHz. It does not use a duplexer and only places a high-Q filter at the PA output. The high-Q filter reduces the PA output noise at the Rx frequency



Figure 3.11: (a) 5-Mb/s PRBS uplink EVM and  $V_{out}$  versus  $V_{low}$  ( $V_{high} = 2$  V). (b) Baseband constellations for three uplink cases.

(5.06 GHz) by 40 dB, and contributes to a good Rx SNR of 90 dB at 5.06 GHz. However, the high-Q filter is less effective at 4.88 GHz (image band), so the noise there dominates the Rx noise after the mixer down-conversion. Also, the Tx-to-Rx two-tone leakages at 4.94 and 5.0 GHz are 0 dBm, 80-dB higher than the Rx signal. The out-of-band (OOB) blockers are converted to an in-band blocker by the undesired mixer third-order nonlinearity, and the high in-band blocker (in the IF domain) dominates the ADC range and results in a high quantization noise. Eventually, the SNR of the Rx baseband degrades to 76 dB, and the achieved data rate is limited to 100 kb/s in our first prototype [39].

# 3.4 Far-Field IM3 Uplink: Custom Tag

### Design

Subsequently, the IM3 uplink technique is implemented in a far-field power-harvesting reader-tag system with improved uplink quality. A 5.8-GHz prototype is demonstrated and compared to the conventional backscattering uplink. The IM3 uplink is insensitive to the PA output noise and exhibits better Rx SNR and SBR. More details on this implementation can be found in [41].



Figure 3.12: PCB tag schematic and photograph.

The schematic and photograph of the PCB tag, including the rectifier and the modulator, are shown in Fig. 3.12. The tag is realized on an RO4003C substrate. The four-stage rectifier is input matched to 50  $\Omega$  at 5.8 GHz and an input power of -5 dBm. The rectifier drives a 100-k$\Omega$  resistor, a limiting Zener diode, a 32-kHz crystal oscillator (EPSON SG-3040-JC), and a clock buffer that drives the modulator. The total rectifier output dc power is designed to be 50  $\mu$ W (2 V/25  $\mu$ A). The simulated and measured rectifier output voltages versus the total input power (without the modulator) are plotted in Fig. 3.13(a), under continuous-wave (CW) excitation at 5.828 GHz and two-tone excitation at 5.808 and 5.848 GHz. The two-tone excitation, with its higher peak-to-average ratio, achieves a higher PTE. At the designed output condition, the measured PTE is 6% with the CW excitation and 9% with the two-tone excitation. Under the two-tone excitation, the simulated and measured rectifier-generated 5.768-GHz IM3 powers returned to the 50- $\Omega$  source are plotted in Fig. 3.13(b); the measurement is taken through a 20-dB directional coupler. When the output voltage reaches 2 V, the measured IM3 power is -31 dBm (input power at -2.5 dBm), lower than the simulated value of -28 dBm (input power at -5.5 dBm).

The IM3 modulator and the backscattering modulator are the same circuit, realized by a parallel LC-tank. The parallel open stub is longer than a quarter wavelength and is therefore inductive, and the parallel capacitance is a varactor diode in series with an SMD capacitor. The LC-tank is designed to resonate at 5.8 GHz when the varactor bias voltage  $(V_{bias})$  is 0 V to minimize the insertion loss. When  $V_{bias}$  is 2 V, the resonance frequency shifts to 6.1 GHz and both the rectifier input power and the excited IM3 power decrease. The modulator also modulates the input impedance seen by the tag antenna and can serve as the backscattering modulator



Figure 3.13: (a) Rectifier harvested dc voltage and (b) rectifier-generated IM3 power versus the total rectifier input power (modulator not included).

properly. At 5.8 GHz, the simulated modulator insertion loss is 2 dB at  $V_{bias} = 0$  V and is 4 dB at  $V_{bias} = 2$  V. Simulation shows the modulator loss increases the required tag received power  $(P_{in,tag})$  to -2.2 dBm under the CW excitation and -6 dBm/tone under the two-tone excitation.

#### Measurement

The relevant system parameters from simulation and component data sheets are summarized in Table 3.1 to estimate the uplink signal strength, SNR, and SBR for the IM3 uplink and the counterpart backscattering uplink. In the IM3 uplink, the required Tx power at the reader antenna is 24.5 dBm/tone to power up the tag, taking into account the total antenna gain of 12 dB and the path loss of 42.5 dB (according to the Friis equation). The simulated IM3 voltage at the tag antenna alternates between -10.2+4.5j mV and -7.1+1.0j mV, corresponding to the two tag states, with an offset  $|V_{IM3,CM}|$  of 9.1 mV and a difference  $|\Delta V_{IM3}|$  of 4.7 mV. Therefore, with the modulator performing a square-wave modulation, the single-sideband power at the reader Rx, denoted by  $P_{IM3,SSB}$ , is estimated as -70 dBm, by

$$P_{\rm IM3,SSB} = |\Delta V_{IM3}|^2 / (2\pi^2 \times 50\Omega) + Antenna/LNA_{Gain} - Path/Duplexer_{Loss}.$$
 (3.5)

| IM3 Uplink | | | Backscattering Uplink | | |
|---|---|---|---|---|---|
| Tx Frequency (GHz) | | 5.808/5.848 | Tx Frequency (GHz) | | 5.828 |
| Tag Received Power $(P_{in,tag})$ (dBm) | | -6.0 each tone | Tag Received Power $(P_{in,tag})$ (dBm) | | -2.2 |
| Coupling Distance (cm) | | 55 | Coupling Distance (cm) | | 48 |
| Downlink Path Loss (dB) | | 42.5/42.6 | Downlink Path Loss (dB) | | 41.4 |
| Rx Frequency (GHz) | | 5.768 | Rx Frequency (GHz) | | 5.828 |
| IM3 at Tag Antenna (mV) | $V_{\rm bias} = 0$ V | -10.2+4.5j | Tag Input Reflection Coefficient $(S_{11})$ | $V_{\rm bias} = 0$ V | 0.180 - 0.365j |
| | $V_{\rm bias} = 2$ V | -7.1+1.0j | | $V_{\rm bias} = 2$ V | 0.033 - 0.486j |
| Uplink Path Loss (dB) | | 42.5 | Uplink Path Loss (dB) | | 41.4 |
| Duplexer Loss/Rejection (dB) | | 2/>60 | Directional Coupler Insertion Loss (dB) | | 1 |
| LNA Gain/Noise Figure (dB) | | 11/4 | Directional Coupler Coupling Factor (dB) | | 20 |
| Cable Loss (dB) | | 2 | Cable Loss (dB) | | 2 |
| Common Components | | | | | |
| Antenna Gain (dBi) | | | Reader (helix): 10; Tag (monopole): 2 | | |
| 1-W PA Noise | White Noise | | -125 dBm/Hz | | |
| | Phase Noise | | -117 dBc/Hz @ 50 kHz; -129 dBc/Hz @ 1 MHz | | |

Table 3.1: System parameters for the far-field WPT system
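The path-loss and Tx-power entries above can be cross-checked with the free-space Friis relation. The following is a minimal sketch, assuming ideal free-space propagation and using the distances, frequencies, antenna gains, and tag powers from Table 3.1; the function names are illustrative and not taken from the original work.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def friis_path_loss_db(distance_m, freq_hz):
    """Free-space path loss between isotropic antennas, in dB."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C)


def required_tx_dbm(p_tag_dbm, path_loss_db, total_antenna_gain_db):
    """Reader antenna power needed to deliver p_tag_dbm to the tag."""
    return p_tag_dbm + path_loss_db - total_antenna_gain_db


ANTENNA_GAIN_DB = 10.0 + 2.0  # reader helix + tag monopole (Table 3.1)

pl_im3_dl = friis_path_loss_db(0.55, 5.808e9)  # IM3 downlink, lower Tx tone
pl_im3_ul = friis_path_loss_db(0.55, 5.768e9)  # IM3 uplink carrier
pl_back = friis_path_loss_db(0.48, 5.828e9)    # backscattering link

tx_im3_per_tone = required_tx_dbm(-6.0, pl_im3_dl, ANTENNA_GAIN_DB)
tx_back = required_tx_dbm(-2.2, pl_back, ANTENNA_GAIN_DB)

print(f"IM3 DL/UL path loss: {pl_im3_dl:.1f} / {pl_im3_ul:.1f} dB")
print(f"Backscatter path loss: {pl_back:.1f} dB")
print(f"Required Tx: {tx_im3_per_tone:.1f} dBm/tone (IM3), {tx_back:.1f} dBm (CW)")
```

Under these assumptions the sketch lands close to the 42.5-dB and 41.4-dB path losses in the table and to the simulated Tx powers of 24.5 dBm/tone and 27.2 dBm quoted in the text.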

The uplink path loss at 55 cm and frequency of 5.768 GHz is 42.5 dB, very close to the downlink path loss. The IM3 blocker at the reader Rx, denoted by  $P_{IM3,blocker}$ , is estimated as -54 dBm, by

$$P_{\rm IM3, blocker} = |V_{IM3, CM}|^2 / (2 \times 50\Omega) + Antenna/LNA_{Gain} - Path/Duplexer_{Loss}.$$
 (3.6)

The (Viccom) duplexer Tx bandwidth is from 5.805 to 5.850 GHz and the Rx bandwidth is from 5.725 to 5.775 GHz. Since the duplexer has a Tx-to-Rx isolation better than 70 dB, the noise level at the duplexer output is close to the thermal noise floor, and the Rx noise  $(N_{Rx})$  is estimated as -161 dBm/Hz, by

$$N_{\rm Rx} = -174 \, \mathrm{dBm/Hz} + LNA_{NF} + LNA_{Gain} - Cable_{Loss}.$$
(3.7)
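The three estimates above can be reproduced numerically. This is a hedged sketch rather than the original calculation: it converts the linear power terms to dBm before adding the dB terms, it includes the 2-dB cable loss from Table 3.1 in the signal chain (an assumption that reproduces the quoted -70 dBm), and it treats the blocker as an unmodulated tone of amplitude $|V_{IM3,CM}|$, i.e. a power of $|V|^2/(2 \times 50\,\Omega)$ (an assumption that reproduces the quoted -54 dBm).

```python
import math

R0 = 50.0  # reference impedance in ohms


def watts_to_dbm(p_w):
    return 10.0 * math.log10(p_w / 1e-3)


# Simulated IM3 voltages at the tag antenna for the two tag states (Table 3.1)
v_state0 = complex(-10.2e-3, 4.5e-3)  # V_bias = 0 V
v_state1 = complex(-7.1e-3, 1.0e-3)   # V_bias = 2 V

delta_v = abs(v_state0 - v_state1)      # |dV_IM3|
v_cm = abs((v_state0 + v_state1) / 2)   # |V_IM3,CM|

# dB terms from Table 3.1 (cable loss included as an assumption)
antenna_gain_db, lna_gain_db, lna_nf_db = 12.0, 11.0, 4.0
path_loss_db, duplexer_loss_db, cable_loss_db = 42.5, 2.0, 2.0
chain_db = (antenna_gain_db + lna_gain_db
            - path_loss_db - duplexer_loss_db - cable_loss_db)

# Square-wave modulation: each sideband has amplitude |dV|/pi
p_ssb_dbm = watts_to_dbm(delta_v**2 / (2 * math.pi**2 * R0)) + chain_db
# Unmodulated carrier of amplitude |V_CM| (assumption, see lead-in)
p_blocker_dbm = watts_to_dbm(v_cm**2 / (2 * R0)) + chain_db
# Thermal floor raised by the LNA and lowered by the cable, as in (3.7)
n_rx_dbm_hz = -174.0 + lna_nf_db + lna_gain_db - cable_loss_db

print(f"|dV| = {delta_v*1e3:.2f} mV, |V_CM| = {v_cm*1e3:.2f} mV")
print(f"P_SSB = {p_ssb_dbm:.1f} dBm, P_blocker = {p_blocker_dbm:.1f} dBm")
print(f"N_Rx = {n_rx_dbm_hz:.0f} dBm/Hz")
```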

The total Tx power is raised to 30 dBm (27 dBm/tone) in the IM3 uplink measurement to power up the tag to the desired output condition. This is expected because the measured rectifier PTE is lower than the simulated one. The measured Rx spectrum for the IM3 uplink is plotted in Fig. 3.14(a). The IM3 sideband powers are -78 dBm, at 5.768 GHz  $\pm$  32 kHz, and the IM3 blocker at 5.768 GHz is -63 dBm, giving a good SBR of -15 dB. The measured IM3 blocker and sideband powers are lower than the simulated values by about 8 dB. Part of this degradation can be traced to the rectifier-generated IM3, which is lower in measurement than in simulation. Finally, the measured noise floor is -156 dBm/Hz and the SNR is 78 dBc with 1-Hz noise bandwidth. The phase noise (PN) of the IM3 in-band blocker from the tag is found to be non-negligible in the RF spectrum: without the tag, the noise floor drops 3 dB to -159 dBm/Hz, close to the calculated value of -161 dBm/Hz. Therefore, the IM3 uplink signal-to-white-noise ratio (SWNR) is 81 dBc.

#### Comparison to the Backscattering UL

The 5.828-GHz backscattering uplink for comparison employs an antenna power of 30 dBm, also higher than the simulated value of 27.2 dBm. The coupling distance is reduced to 48 cm (with a path loss of 41.4 dB) due to the lower PTE with the CW Tx. The measured Rx spectrum is plotted in Fig. 3.14(b). The Tx blocker is -5 dBm, lower than the antenna power by (CL + RL) dB, where RL is the antenna input return loss of 13 dB and CL is the coupler coupling and cable loss of 22 dB. The measured noise floor is -122 dBm/Hz, dominated by the Tx PN of -117 dBc/Hz. With the backscattering sideband power of -76 dBm, the SNR and SBR are 46 dBc (with 1-Hz noise bandwidth) and -70 dBc, respectively, both significantly worse than the IM3 uplink. The backscattering sideband power, denoted by  $P_{back,SSB}$ , can be estimated as -78 dBm, by

$$P_{\text{back},\text{SSB}} = P_{\text{in},\text{tag}} |\Delta S_{11}|^2 / (\pi^2) + Antenna_{Gain} - Path/Coupler_{Loss}.$$
 (3.8)

where  $|\Delta S_{11}|$  is the  $S_{11}$  deviation between the two tag backscattering states. The simulated  $S_{11}$  is 0.180 - 0.365*j* when  $V_{bias} = 0$  V (in resonance) and 0.033 - 0.486*j* when  $V_{bias} = 2$  V (off resonance), so  $|\Delta S_{11}| = 0.19$ .
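As a cross-check of (3.8), the sketch below evaluates the backscattering sideband power from the simulated reflection coefficients. It is an illustration under stated assumptions: the modulation term is converted to dB before the dB terms are added, and the coupler-plus-cable loss is taken as the 22 dB quoted in the text for CL.

```python
import math

# Simulated tag reflection coefficients (Table 3.1)
s11_on = complex(0.180, -0.365)   # V_bias = 0 V, in resonance
s11_off = complex(0.033, -0.486)  # V_bias = 2 V, off resonance
delta_s11 = abs(s11_on - s11_off)

p_in_tag_dbm = -2.2          # simulated tag received power under CW
antenna_gain_db = 12.0       # reader helix (10 dBi) + tag monopole (2 dBi)
path_loss_db = 41.4          # uplink path loss at 48 cm
coupler_and_cable_db = 22.0  # 20-dB coupling factor + 2-dB cable loss

# Square-wave switching between two states: sideband factor |dS11|^2 / pi^2
modulation_db = 10.0 * math.log10(delta_s11**2 / math.pi**2)

p_back_ssb_dbm = (p_in_tag_dbm + modulation_db + antenna_gain_db
                  - path_loss_db - coupler_and_cable_db)

print(f"|dS11| = {delta_s11:.3f}")
print(f"P_back,SSB = {p_back_ssb_dbm:.1f} dBm")
```

With these inputs the result is close to the -78 dBm estimate, about 2 dB below the measured -76 dBm.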

#### Discussions

Although the measured Rx noise in the backscattering uplink is as high as -122 dBm/Hz at 5.828 GHz + 32 kHz, the white noise floor, observed at 5.928 GHz, is -160 dBm/Hz (-155 dBc/Hz). Therefore, the SWNR is 84 dBc with 1-Hz noise bandwidth. If a passive mixer is adopted for the backscattering uplink to eliminate the -5-dBm blocker, the SWNR might not degrade significantly. Assuming that the Tx PN at the Rx can be completely eliminated without raising the noise floor, the backscattering uplink with the coupling distance extended from 48 to 55 cm and a higher PA power could theoretically achieve an SWNR of 82 dBc, 1 dB better than the IM3-uplink SWNR (81 dBc). However, complete PN cancellation is very challenging because of the multiple leakage paths and the delay mismatch between the leakage path and the compensation path. In addition, the Rx white noise in the backscattering uplink is ultimately dominated by the PA output noise floor, which is low in this demonstration (-125 dBm/Hz at 1 W) but can degrade in systems with a noisier power source or with a longer coupling distance and a higher PA power. In contrast, the IM3 uplink is largely immune



Figure 3.14: (a) Measured Rx spectrum (at the LNA output) of the IM3 uplink and (b) measured Rx spectrum (at the directional coupler output) of the backscattering uplink.

to the PA output noise. If necessary, additional filters (and isolators) can be added to the PA output or the duplexer Rx output to keep the Rx noise close to the thermal noise floor with negligible Tx blockers, and the uplink SBR and SNR are solely determined by the Rx signal strength.

# 3.5 Far-Field IM3 Uplink: Commercial UHF Gen-2 Tag

Although this work [41] uses a custom-made tag, adopting the IM3 uplink principle to the widely-adopted backscattering UHF RFIDs (e.g. EPCglobal UHF Gen2 [42]) is possible with an appropriate UHF duplexer and the reader Tx transmitting an asymmetric two-tone signal in the uplink session.

The follow-up development [42] implements the first single-antenna UHF FDD reader system compatible with a commercial RFID inlay. The proposed IM3 UL system achieves an EVM improvement of 5.7 dB compared to the conventional backscattering system.

#### Backscattering Reader Design/Measurement

The block diagram of the (conventional) backscattering reader, implemented for comparison with the proposed FDD uplink, is illustrated in Fig. 3.15. This backscattering uplink is very similar to those realized in some reported works [33], [121]. The 910-MHz Tx carrier with a peak power of 22 dBm is on-off-keying (OOK) modulated by the DL preamble and command. The adopted downlink preamble selects a Type-A Reference Interval (Tari) of 12.5  $\mu$s, 50% duty cycle, RTcal of 2.5 Tari and TRcal of 5 Tari. The command sequence selects a 128-kb/s backscattering rate and the FM0 encoding [31]. During the uplink session, the Tx sends a 910-MHz, 22-dBm continuous wave (CW) for the wireless power transfer and backscattering. The backscattering signal is collected at the isolation port of the directional coupler, and the Tx-to-Rx leakage is 7 dBm. This 7-dBm leakage power is the 22-dBm Tx power reduced by the antenna return loss of 5 dB and the coupler coupling factor of 10 dB. To relax the linearity requirement on the Rx LNA, a second signal generator, with tunable magnitude and phase, is employed to cancel the Tx-to-Rx leakage during the UL session, and an Rx switch is employed to avoid saturating the Rx LNA during the DL session. The Rx signal is demodulated by an IQ mixer. The demodulated Rx signals are sampled by a scope at 250 MS/s.

When the commercial RFID inlay (Avery Dennison RFID 600598) is involved at a coupling distance of 20 cm, the measured average backscattering power at antenna is -36 dBm with a peak power of -33 dBm. Since the Tx power at antenna is 22 dBm in the UL session, the total path loss is about 55 dB. The baseband quadrature (Q) channel is plotted in Fig. 3.16. The baseband in-phase (I) channel in this system has a much lower output power and is ignored. (This is because in this system the phase of the backscattering carrier at the mixer RF input is close to the phase of the mixer quadrature LO signal.) Fig. 3.16 shows



Figure 3.15: Block diagram of the implemented (conventional) backscattering reader system.

that the leakage of the Tx CW waveform in the UL session has been effectively canceled, so the dc offset after the direct down-conversion is only -10 mV. The leakage in the DL session cannot be canceled by the auxiliary CW source but is isolated by the Rx switch. The demodulated UL signal, filtered digitally by a 2-MHz comb filter ( $BW_{LPF} = 2$  MHz), is plotted in Fig. 3.17(a). The UL preamble and the 16-bit key can be recovered by an FM0 decoder, which also rejects the low-frequency noise [122]. A simple FM0 decoding scheme is adopted here, with the *i*-th decoded bit equal to the absolute difference between the (2*i*)-th and the (2*i* - 1)-th demodulated data. This simple decoder degrades the SNR by 3 dB, assuming that white noise dominates. The decoded baseband constellation is plotted in Fig. 3.17(b) with an EVM of -14.2 dB.
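The absolute-difference FM0 decoding described above can be sketched as follows. This is an illustrative model, not the measurement code: it assumes one ideal sample per half symbol, the standard FM0 rule (a level inversion at every symbol boundary, plus a mid-symbol inversion for a data-0), and hypothetical amplitude and offset values chosen only to resemble the reported 3-mV symbol difference and -10-mV dc offset.

```python
def fm0_encode(bits, start_level=1.0):
    """Return two half-symbol levels (+1/-1) per bit using the FM0 rule."""
    level = start_level
    halves = []
    for bit in bits:
        level = -level          # inversion at every symbol boundary
        first_half = level
        if bit == 0:
            level = -level      # data-0 adds a mid-symbol inversion
        halves.extend([first_half, level])
    return halves


def fm0_abs_difference(halves):
    """Decision metric: |second half - first half| for each symbol.

    With 0-based indexing this is |x[2i+1] - x[2i]|, the same pairing as
    the (2i)-th and (2i-1)-th samples in 1-based notation.
    """
    return [abs(halves[2 * i + 1] - halves[2 * i])
            for i in range(len(halves) // 2)]


def fm0_decide(metrics):
    """A large metric means a mid-symbol transition, i.e. a data-0."""
    threshold = (max(metrics) + min(metrics)) / 2
    return [0 if m > threshold else 1 for m in metrics]


tx_bits = [1, 0, 0, 1, 1, 0, 1, 0]
amplitude_v, dc_offset_v = 1.5e-3, -10e-3  # hypothetical values
rx_halves = [amplitude_v * h + dc_offset_v for h in fm0_encode(tx_bits)]
rx_bits = fm0_decide(fm0_abs_difference(rx_halves))
print(rx_bits)
```

Because the metric is a difference of adjacent samples, a constant dc offset cancels, which illustrates why this decoder rejects low-frequency noise.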

According to Fig. 3.17(b), the difference between the decoded symbols, denoted by  $\Delta V_{BB}$ , is 3 mV and the corresponding average power ( $P_{BB}$ ) is -40 dBm. On the other hand, the Rx noise is dominated by the Tx-to-Rx noise (-140 dBc/Hz) leakage. In the UL session, the Tx leakage power at the coupler isolation port is 7 dBm and the leakage tone can be suppressed by 20 dB at the power combiner output by the auxiliary source. However, the white noise cannot be canceled by the auxiliary source and is -133 dBm/Hz, measured at the combiner output. With the system parameters known, the baseband noise level sampled by the scope, denoted by  $N_{BB}$ , is estimated at -122 dBm/Hz, and the decoded constellation EVM with the IQ demodulation [43] can be evaluated as -16 dB, by

$$EVM = 10 \times \log_{10}(2 \times N_{BB} \times BW_{LPF}/P_{BB}),$$
(3.9)



Figure 3.16: Scope-sampled demodulated Rx signal at the demodulator output (Q-channel).



Figure 3.17: (a) Demodulated UL signal and (b) decoded data with the backscattering reader.

which is close to the measured value.
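The arithmetic behind the -16 dB prediction of (3.9) can be checked with a few lines. This is a minimal sketch using only the values stated above ($N_{BB}$ = -122 dBm/Hz, $BW_{LPF}$ = 2 MHz, $P_{BB}$ = -40 dBm); the helper names are illustrative.

```python
import math


def dbm_to_mw(p_dbm):
    return 10.0 ** (p_dbm / 10.0)


def evm_iq_db(n_bb_dbm_hz, bw_hz, p_bb_dbm):
    """EVM per (3.9): 10*log10(2 * N_BB * BW_LPF / P_BB), linear powers."""
    noise_mw = dbm_to_mw(n_bb_dbm_hz) * bw_hz
    return 10.0 * math.log10(2.0 * noise_mw / dbm_to_mw(p_bb_dbm))


evm_pred_db = evm_iq_db(-122.0, 2e6, -40.0)
print(f"Predicted EVM: {evm_pred_db:.1f} dB")
```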

#### FDD Reader Design/Measurement

The proposed FDD reader is illustrated in Fig. 3.18(a). The antenna-pair setup, coupling distance (20 cm), DL OOK waveform, and the Tx output power are identical to those of the implemented backscattering reader. In the UL session, the reader transmits an asymmetric two-tone signal with an auxiliary tone at 890 MHz. The auxiliary tone is only transmitted for 0.6 ms and is 16 dB lower than the main tone. An attenuated version of the Tx waveform is plotted in Fig. 3.18(b). Although using an auxiliary tone with a higher magnitude can enhance the excited IM3 component from the tag third-order nonlinearity, the commercial RFID tag might not be able to harvest a waveform with a high peak-to-average ratio. The tag-generated upper IM3 tone at 930 MHz is utilized as the UL carrier, and an EGSM duplexer is employed with the reader Tx frequencies in the duplexer lower band (880 – 915 MHz) and the uplink frequency in the duplexer upper band (925 – 960 MHz). The duplexer has an insertion loss of 1 dB and provides Tx-to-Rx rejections of 71, 78, and 80 dB at 890, 910, and 930 MHz, respectively. Thanks to the low Tx leakage, an LNA is connected directly to the duplexer Rx port. The UL signal is then down-converted to 40 MHz and sampled and demodulated/decoded in the digital domain.
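The frequency plan can be verified with a short sketch that computes the two third-order intermodulation products of a tone pair and checks them against the duplexer bands. The band edges are the ones quoted in the text; the helper names are illustrative. The same check is applied to the 5.8-GHz custom-tag system of Section 3.4, where the lower IM3 product is used.

```python
def im3_products(f1_mhz, f2_mhz):
    """Return (lower, upper) third-order intermodulation frequencies."""
    low, high = sorted((f1_mhz, f2_mhz))
    return 2 * low - high, 2 * high - low


def in_band(freq_mhz, band):
    return band[0] <= freq_mhz <= band[1]


# EGSM duplexer bands used by the UHF FDD reader (MHz)
UHF_TX_BAND, UHF_RX_BAND = (880, 915), (925, 960)
uhf_lower, uhf_upper = im3_products(890, 910)

# 5.8-GHz duplexer bands from Section 3.4 (MHz)
C_TX_BAND, C_RX_BAND = (5805, 5850), (5725, 5775)
c_lower, c_upper = im3_products(5808, 5848)

print(f"UHF IM3: {uhf_lower} / {uhf_upper} MHz, "
      f"upper in Rx band: {in_band(uhf_upper, UHF_RX_BAND)}")
print(f"5.8-GHz IM3: {c_lower} / {c_upper} MHz, "
      f"lower in Rx band: {in_band(c_lower, C_RX_BAND)}")
```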

The sampled Rx waveform is plotted in Fig. 3.19. The OOK-modulated IM3 carrier, at 40 MHz, has a high on-off ratio, which implies that the power amplifier (PA) IM3 distortion has been rejected effectively at Rx, and the on-tag backscatter modulates the IM3 carrier very effectively. The 40-MHz UL signal can be modeled by

$$Rx(t) = a(t) \times V_{IF} \times \cos(\omega_{IF}t) + n_{IF}(t), \qquad (3.10)$$

where a(t) = 0 when the tag sends a "0" and a(t) = 1 when the tag sends a "1".  $V_{IF}$  is the magnitude of the 40-MHz uplink carrier and is 0.2 V, corresponding to an average Rx power of -7 dBm. The 930-MHz UL (average) power at the antenna is therefore estimated to be -74 dBm, taking into account the 1-dB duplexer loss, RF gain of 48 dB, mixer conversion loss of 6 dB, and IF gain of 26 dB. Compared to the backscattering UL, the signal strength in the implemented FDD UL is substantially lower; therefore, achieving a better UL SNR relies primarily on the duplexer Tx-to-Rx noise filtering. At the duplexer output, the noise level at 930 MHz has been attenuated to a low level close to the thermal noise floor (-174 dBm/Hz), and the 4-dB-NF LNA, DSB mixer, and the IF amplifier amplify the Rx noise to -99 dBm/Hz (at 40 MHz). As a result, the mean square value of  $n_{IF}$ , denoted by  $\langle n_{IF}^2 \rangle$ , is 6.3e-12  $V^2$ /Hz. The 40-MHz OOK-modulated signal (3.10) is demodulated by a simple square operation followed by a 2-MHz comb filter. Compared to the IQ demodulation [43], the square demodulation [39] is much easier to implement but results in an SNR that is 3 dB lower. The constellation EVM after the square demodulation and FM0 decoding can be evaluated as -23 dB, by



Figure 3.18: (a) Block diagram of the proposed FDD reader system. (b) Tx waveform (with attenuation).



Figure 3.19: Scope-sampled down-converted Rx signal.



Figure 3.20: Demodulated UL signal and decoded data at coupling distance of (a) 20 cm and (b) 40 cm using the proposed FDD reader.

$$EVM_{Square Demod} = 10 \times \log_{10}(16 \times \langle n_{IF}^2 \rangle \times BW_{LPF}/V_{IF}^2).$$
(3.11)
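The FDD link numbers and the EVM prediction of (3.11) can be reproduced as a sanity check. The sketch below is illustrative: it assumes a 50-$\Omega$ system and equiprobable OOK symbols (so the average power is half the on-state power), and it takes the -99 dBm/Hz IF noise level as given from the text rather than rebuilding it from the receiver chain.

```python
import math

R0 = 50.0        # reference impedance in ohms (assumed)
V_IF = 0.2       # 40-MHz carrier amplitude in volts
BW_LPF = 2e6     # comb-filter bandwidth in Hz

# Average Rx power at the IF for OOK with equiprobable symbols (assumption)
p_on_dbm = 10.0 * math.log10(V_IF**2 / (2.0 * R0) / 1e-3)
p_avg_dbm = p_on_dbm + 10.0 * math.log10(0.5)

# Net gain from antenna to IF: duplexer loss, RF gain, mixer loss, IF gain
rx_gain_db = -1.0 + 48.0 - 6.0 + 26.0
p_antenna_dbm = p_avg_dbm - rx_gain_db

# IF noise density converted from dBm/Hz to V^2/Hz across R0
n_if_dbm_hz = -99.0
n_if_v2_hz = 10.0 ** (n_if_dbm_hz / 10.0) * 1e-3 * R0

evm_square_db = 10.0 * math.log10(16.0 * n_if_v2_hz * BW_LPF / V_IF**2)

print(f"Average IF power: {p_avg_dbm:.1f} dBm")
print(f"UL power at antenna: {p_antenna_dbm:.1f} dBm")
print(f"<n_IF^2> = {n_if_v2_hz:.2e} V^2/Hz")
print(f"Predicted EVM: {evm_square_db:.1f} dB")
```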

The demodulated data and the decoded constellation are plotted in Fig. 3.20(a). The decoded constellation EVM is -19.9 dB, better than the -14.2-dB EVM achieved by the backscattering UL with IQ demodulation. Fig. 3.20(b) provides the results with the coupling distance extended from 20 to 40 cm, where the decoded constellation EVM degrades to -12.6 dB, still comparable to the backscattering uplink measured at 20 cm.

# 3.6 IM3 Uplink with Three-Tone Tx for Enhanced Uplink Power

The comparison between the IM3 and the backscattering UL has already been covered in our previous works [39, 42]. The previously introduced IM3 UL systems only transmit two-tone waveforms, and this work demonstrates that under a given transmitter (Tx) peak power, transmitting an optimized three-carrier Tx waveform can improve the UL power and the receiver (Rx) signal-to-noise ratio (SNR) without compromising the rectifier harvested dc power. The developed theory is based on power series analysis and verified by nonlinear simulation and experimental results, which include two RF-powered reader-tag systems: The first one employs a custom designed ultra-high frequency (UHF) PCB rectenna and the second one uses a commercial Gen-2 UHF RFID inlay.

#### IM3 Uplink with Two-Tone Tx Waveform: Tx PAPR

The simplified RF-domain block diagram of the two-tone IM3 UL system is shown in Fig. 3.21. Since conventional duplexers only have a single Rx band, only one of the two IM3 components (e.g., the higher IM3 component at  $f_{IM3,H} = 2f_{Tx2} - f_{Tx1}$ ) is used. For ease of analysis, the slowly varying voltage envelope ([108], Sec. 5.6) at the rectifier input, denoted by  $V_{rect,env}$ , is assumed to resemble the envelope transmitted by the Tx, denoted by  $V_{Tx,env}$ , with minor distortion. In other words,

$$V_{rect,env}(t) = G_{link} \times V_{Tx,env}(t + \Delta t), \qquad (3.12)$$

where  $G_{link}$  is the link voltage gain (i.e., the path loss) from the reader antenna to the rectenna rectifier. Since the transmitted signal is narrowband to fit within the duplexer bandwidth, the in-band channel frequency response can be treated as a constant. The validity of (3.12), implicitly assumed in the works that design the Tx waveform for the highest power transfer efficiency (PTE) [123, 124], still has to be confirmed by simulation because the rectifier has a nonlinear input current-to-voltage response.



Figure 3.21: Simplified block diagram of the IM3 UL using the upper IM3 component.

In addition, this work assumes a Tx (PA) with a fixed peak power and constant power consumption (e.g., most commercial Class-A linear PA modules and the ZHL-16W used in this work). Therefore, the system dc-dc efficiency improves when the Tx transmits a low-PAPR waveform with a higher average Tx power. The works that suggest using a high-PAPR waveform focus on the RF-dc efficiency and on harvesting a higher dc power under the same Tx average power [123, 124]. Nevertheless, given the same average power, a PA with a higher peak power must be implemented to accommodate the high-PAPR waveforms, and the PA dc-RF efficiency inevitably drops from the PA peak efficiency under the high-PAPR operation [125].

With the Tx peak voltage (at the Tx antenna) denoted by  $V_{max}$ , the time-domain waveform of a two-tone Tx fully exploiting the peak voltage is modeled by

$$V_{Tx}(t) = V_{max}(1 - a_2)\cos(\omega_{Tx1}t) + V_{max}(a_2)\cos(\omega_{Tx2}t + \phi_2), \qquad (3.13)$$

where the coefficient  $a_2$  is between zero and unity. The Tx envelope can be calculated as

$$V_{Tx,env}(t) = V_{max} \times \sqrt{a_2^2 + (1 - a_2)^2 + 2a_2(1 - a_2)\cos[(\omega_{Tx2} - \omega_{Tx1})t + \phi_2]}.$$
 (3.14)

The envelope PAPR can be easily calculated as

$$PAPR_{2tone} = \frac{\max(V_{Tx,env}^2)}{\langle V_{Tx,env}^2 \rangle} = \frac{1}{a_2^2 + (1 - a_2)^2}.$$
(3.15)

Evidently,  $PAPR_{2tone}$  does not depend on  $\phi_2$  and is 0 dB for a CW waveform ( $a_2 = 0$  or 1). The relation between the rectifier input voltage and current in the envelope domain is assumed to be governed by a memoryless polynomial, expressed as

$$I_{in,env} = b_1 V_{in,env} + b_3 V_{in,env}^3, \qquad (3.16)$$

where the coefficients  $b_1$  and  $b_3$  are real numbers. Even order nonlinearities are not included since they do not generate in-band frequency components. By replacing  $V_{in,env}$  in (3.16) by  $G_{link}V_{Tx,env}$ , the excited upper and lower IM3 current magnitudes, denoted by  $I_{IM3,u}$  and  $I_{IM3,l}$ , are approximated by

$$I_{IM3,u} = (V_{max}G_{link})^3 \times b_3 \times a_2^2(1-a_2), \qquad (3.17)$$

$$I_{IM3,l} = (V_{max}G_{link})^3 \times b_3 \times a_2(1-a_2)^2, \qquad (3.18)$$

respectively. Fig. 3.22 plots the calculated  $PAPR_{2tone}$  and IM3 currents versus the magnitude of the second tone at  $\omega_{Tx2}$ , which is  $a_2$ . Without loss of generality, the constant coefficient  $V_{max}G_{link}$  is set to unity. If the upper IM3 component is exploited, more power should be allocated to the second tone at  $\omega_{Tx2}$  ( $a_2 > 0.5$ ). The maximum IM3 current is -16.6 dB, occurring at  $a_2 = 0.67$  and PAPR = 1.8.
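The optimum quoted above follows directly from (3.15) and (3.17). The sketch below sweeps $a_2$ on a grid and locates the maximum of the upper IM3 term; it assumes, as in the text, that $V_{max}G_{link}$ is unity, and it additionally sets $b_3$ to unity so that the current is expressed relative to that constant. The grid resolution is an arbitrary choice.

```python
import math


def papr_two_tone(a2):
    """Envelope PAPR of the two-tone waveform, per (3.15)."""
    return 1.0 / (a2**2 + (1.0 - a2) ** 2)


def im3_upper(a2, k=1.0):
    """Upper IM3 current per (3.17); k stands for (V_max*G_link)^3 * b3."""
    return k * a2**2 * (1.0 - a2)


def im3_lower(a2, k=1.0):
    """Lower IM3 current per (3.18)."""
    return k * a2 * (1.0 - a2) ** 2


grid = [i / 1000.0 for i in range(1001)]
best_a2 = max(grid, key=im3_upper)
best_im3_db = 20.0 * math.log10(im3_upper(best_a2))
best_papr = papr_two_tone(best_a2)

print(f"a2 = {best_a2:.3f}, IM3_u = {best_im3_db:.1f} dB, PAPR = {best_papr:.2f}")
```

Setting the derivative of $a_2^2(1-a_2)$ to zero gives $a_2 = 2/3$ analytically, which corresponds to a current of $4/27$ and a PAPR of $9/5$.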

#### **PCB** Example

Eq. (3.17) is verified by simulation and measurement with a custom-designed rectifier. The rectifier schematic and photograph are shown in Fig. 3.23. In this prototype, the targeted RF frequency is 900 MHz (UHF) and the harvested dc power is 5 mW, with a 1-V dc voltage across a 200- $\Omega$  load resistor. Schottky diodes (SMS-7621) are used in a two-stage Dickson rectifier that has an input impedance close to 50  $\Omega$  when driven to create the 1 V/5 mA output condition. The output capacitor (1 × 0.5 mm<sup>2</sup>) is 200 nF, which also presents a low impedance to the low-frequency second-order intermodulation in the following multi-tone testing.

The rectifier output dc voltage and input reflection coefficient  $(S_{11})$  are simulated versus the rectifier input power at 900 MHz, and the results are shown in Fig. 3.24(a) and (b), respectively. Besides the implemented two-diode rectifier, rectifiers composed of different numbers of diodes have also been studied. Although the required rectifier input power is the lowest with a single-diode rectifier, the corresponding input return loss is only 5 dB, so this low input power (i.e., 9.2 dBm) can only be achieved with an additional low-loss



Figure 3.22: Modeled IM3 currents and Tx waveform PAPR versus the second-tone magnitude  $(a_2)$ .



Figure 3.23: Schematic and photograph of the custom-designed rectifier.



Figure 3.24: Simulated rectifier (a) output voltage and (b) input  $|S_{11}|$  versus the power delivered into the rectifier.

MNW interfacing the rectifier and the antenna. For the implemented two-diode rectifier, the simulated RF-dc PTE is 45% without requiring an additional MNW. The modulator adds little loss with a small parallel capacitor composed of a 1-pF capacitor in series with a varactor diode (SMV-1763). The varactor capacitance is 9.8 pF when biased at 0 V and is 3.4 pF when biased at 2 V. The measured rectifier output voltage and input return loss, including the modulator loss, are plotted in Fig. 3.25. The required source power is 10.5 dBm at the desired 1-V output.

Under a two-tone input at 895 and 910 MHz, the tag-generated IM3 component at 925 MHz and the rectifier output dc voltage are measured. Tx waveforms with different waveform PAPR are tested with a fixed peak power of 11 dBm (1.12 V at a 50- $\Omega$  load). To be clear, the input waveforms with PAPR of 1.1, 1.3, and 1.5 are illustrated in Fig. 3.26. Both the



Figure 3.25: Measured rectifier input  $|S_{11}|$  and harvested dc voltage versus the incident (source) power.

complete time-domain waveforms and the waveform envelopes are plotted. Under different waveform PAPR, the simulated and the measured IM3 generation are provided in Fig. 3.27 and are close to each other. The (upper) IM3 power transmitted to the 50- $\Omega$  source is measured via a directional coupler. As the source PAPR increases, the Tx average power and the measured output dc voltage both decrease, while the generated IM3 power increases. The excited IM3 power starts to saturate at PAPR around 1.5. This IM3 response is well predicted by the simplified analysis (3.17). In Fig. 3.27(b), the simulated IM3 power is compared to the IM3 component calculated by (3.17) with good agreement. (The constant coefficient in (3.17),  $(V_{max}G_{link})^3 \times b_3$ , is set at 0.33.)
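To relate the tested PAPR values to the tone weights, (3.15) can be inverted for $a_2$. The sketch below is an illustration, not the original test script: it assumes the root with $a_2 \ge 0.5$ (the stronger tone at the upper Tx frequency, as needed for the upper IM3), reports the IM3 term of (3.17) relative to its constant coefficient, and converts the 11-dBm peak power to a peak voltage across 50 $\Omega$.

```python
import math


def a2_from_papr(papr):
    """Invert (3.15) for the root with a2 >= 0.5; valid for 1 <= PAPR <= 2."""
    return 0.5 * (1.0 + math.sqrt(2.0 / papr - 1.0))


def peak_voltage(p_peak_dbm, r_ohm=50.0):
    """Peak voltage of a sinusoid with the given peak-envelope power."""
    return math.sqrt(2.0 * r_ohm * 10.0 ** (p_peak_dbm / 10.0) * 1e-3)


def relative_im3_upper(a2):
    """a2^2 * (1 - a2) from (3.17), with the constant coefficient at unity."""
    return a2**2 * (1.0 - a2)


P_PEAK_DBM = 11.0
rows = []
for papr in (1.1, 1.3, 1.5):
    a2 = a2_from_papr(papr)
    p_avg_dbm = P_PEAK_DBM - 10.0 * math.log10(papr)
    rows.append((papr, a2, p_avg_dbm, relative_im3_upper(a2)))
    print(f"PAPR {papr}: a2 = {a2:.3f}, P_avg = {p_avg_dbm:.2f} dBm, "
          f"relative IM3 = {rows[-1][3]:.3f}")

print(f"V_peak = {peak_voltage(P_PEAK_DBM):.2f} V")
```

In this model the IM3 term grows with PAPR while the average Tx power falls, and the growth flattens as PAPR approaches 1.8, which is in line with the saturation described above.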

Fig. 3.27 implies that the IM3 component is generated at the cost of reduced harvested dc power. Therefore, the efficiency of the IM3 generation is defined as the generated IM3 power divided by the reduction in the harvested dc power relative to the dc power obtained with the CW excitation (PAPR = 1). The efficiencies derived from measurement are listed in Table 3.2. For example, with a moderate PAPR of 1.3, the harvested dc voltage is 0.93 V, and the corresponding dc power is 1.3 mW lower than that achieved with the CW waveform. The IM3 power returned to the 50- $\Omega$  source is -15.6 dBm (28  $\mu$ W), and the equivalent efficiency is 2.2%. The efficiency can double if the lower IM3 component is also exploited.

The simulated waveform at the rectifier input has been compared to the input waveform, in terms of the waveform PAPR, and the agreement of the two suggests using (3.12) is acceptable [44].



Figure 3.26: Two-tone waveform and envelope with 11-dBm peak power ( $V_{peak} = 1.12$ ) for PAPR = 1.1, 1.3, and 1.5.



Figure 3.27: Measured and simulated IM3 power and harvested dc voltage.

| PAPR | Harvested Voltage (mV) | Reduced dc Power (mW) | IM3<sub>u</sub> Power (mW) | IM3<sub>u</sub> Eff. (%) |
|------|------------------------|-----------------------|----------------------------|--------------------------|
| 1.1  | 1047         | 0.14       | 0.003            | 1.9              |
| 1.2  | 973          | 0.88       | 0.014            | 1.5              |
| 1.3  | 933          | 1.27       | 0.028            | 2.2              |
| 1.4  | 875          | 1.79       | 0.054            | 3.0              |
| 1.5  | 849          | 2.01       | 0.076            | 3.8              |
| 1.6  | 788          | 2.51       | 0.107            | 4.3              |
| 1.7  | 753          | 2.78       | 0.141            | 5.1              |

Table 3.2: Uplink carrier generation with two-tone Tx waveform.
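The efficiency column follows from the definition given above (IM3 power divided by the reduction in harvested dc power). The sketch below recomputes it from the rounded table entries, so small deviations from the listed percentages are expected, especially in the first rows where the IM3 power is given to only one or two significant digits. The data are copied from Table 3.2; the helper names are illustrative.

```python
import math

# (PAPR, reduced dc power in mW, IM3_u power in mW, listed efficiency in %)
TABLE_3_2 = [
    (1.1, 0.14, 0.003, 1.9),
    (1.2, 0.88, 0.014, 1.5),
    (1.3, 1.27, 0.028, 2.2),
    (1.4, 1.79, 0.054, 3.0),
    (1.5, 2.01, 0.076, 3.8),
    (1.6, 2.51, 0.107, 4.3),
    (1.7, 2.78, 0.141, 5.1),
]


def im3_efficiency_percent(reduced_dc_mw, im3_mw):
    """IM3 generation efficiency as defined in the text."""
    return 100.0 * im3_mw / reduced_dc_mw


def dbm_to_mw(p_dbm):
    return 10.0 ** (p_dbm / 10.0)


recomputed = [im3_efficiency_percent(dc, im3) for _, dc, im3, _ in TABLE_3_2]
for (papr, _, _, listed), eff in zip(TABLE_3_2, recomputed):
    print(f"PAPR {papr}: recomputed {eff:.1f}% (listed {listed}%)")

print(f"-15.6 dBm = {dbm_to_mw(-15.6) * 1e3:.1f} uW")
```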

#### IM3 Uplink With Three-Tone Tx Waveform

Assuming equal frequency spacing, the time-domain waveform for a three-tone Tx, denoted by  $V_{Tx,3tone}$ , can be modeled by

$$V_{Tx,3tone}(t) = a_1 \times \cos[(\omega_{Tx} - \Delta\omega)t + \phi_1] + a_2 \times \cos[\omega_{Tx}t] + a_3 \times \cos[(\omega_{Tx} + \Delta\omega)t + \phi_3].$$
(3.19)

If the waveform peak voltage is fixed at  $V_{max,3tone}$ , the coefficients and phases in (3.19) cannot take arbitrary values. In this work, the frequency spacing between adjacent tones is fixed at  $\Delta\omega$  to reduce the number of intermodulation tones. Based on (3.19), the envelope of  $V_{Tx,3tone}$  can be expressed as

$$V_{Tx,3tone,env}(t) = \left(a_1^2 + a_2^2 + a_3^2 + 2a_1a_2\cos[\Delta\omega t - \phi_1] + 2a_2a_3\cos[\Delta\omega t + \phi_3] + 2a_3a_1\cos[2\Delta\omega t + \phi_3 - \phi_1]\right)^{\frac{1}{2}}.$$
 (3.20)

The maximum value of  $V_{Tx,3tone,env}$ , or  $(V_{Tx,3tone,env})_{max}$ , is denoted by  $V_{max,3tone}$ . The excited (upper) IM3 current, denoted by  $I_{IM3,u,3tone}$ , can be calculated from (3.12), (3.16), and (3.20) and the time-domain waveform is expressed as

$$I_{IM3,u,3tone} = b_3 G_{link}^3 a_2 a_3 \times [a_3 \times \cos(\omega_{IM3,u}t + 2\phi_3) + 2a_1 \times \cos(\omega_{IM3,u}t + \phi_3 - \phi_1)]. \quad (3.21)$$

According to (3.21),  $I_{IM3,u,3tone}$  is a function of  $a_1$ ,  $a_2$ ,  $a_3$ , and  $\phi_1 + \phi_3$ . For the special case with  $\phi_1 + \phi_3 = 0$ , the calculated IM3 current is plotted in Fig. 3.28(a) versus the two variables  $a_1$  and  $a_3$ . Without loss of generality,  $a_2$  and the constant coefficient  $b_3 G_{link}^3$  in (3.21) are set to unity. The waveform PAPR, defined by  $V_{max,3tone}^2 / \langle V_{Tx,3tone,env}^2 \rangle$ , is



Figure 3.28: Calculated (a) IM3 current, (b) Tx waveform PAPR, (c) waveform peak value, and (d) normalized IM3 current.

plotted in Fig. 3.28(b), and the waveform peak value,  $V_{max,3tone}$ , is plotted in Fig. 3.28(c). To fairly compare cases with different  $V_{max,3tone}$ , the normalized IM3 current is plotted in Fig. 3.28(d), where the waveform peak value has been normalized to unity by scaling the magnitude coefficients of  $a_1$ ,  $a_2$ , and  $a_3$ . The uniform scaling does not affect the waveform PAPR.

In Fig. 3.28(d), the actual coordinates  $(a_1, a_2, a_3)$ , represented by the normalized  $(a_3^*, a_1^*)$ , are

$$(a_1, a_2, a_3)_{(a_1^*, a_3^*)} = \frac{(a_1^*, 1, a_3^*)}{V_{max,3tone}(a_1^*, 1, a_3^*)}.$$
(3.22)

For example, the normalized IM3 current in Fig. 3.28(d) is around -18 dB at  $(a_3^*, a_1^*) = (2, 0.1)$ , where the actual coordinates  $(a_1, a_2, a_3)$  are (0.03, 0.32, 0.64) since  $V_{max,3tone}(0.1, 1, 2)$  is 3.1 (10 dB).
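The normalization in (3.22) can be checked numerically. The following is a minimal sketch and not the code used in this work: it evaluates the complex envelope of (3.19) on a dense grid over one beat period, takes $\phi_1 = 0$ so that $\phi_3$ equals $\phi_1 + \phi_3$ (an assumption that only shifts the time origin), and scales the coefficients to a unity peak. The helper names `peak_envelope` and `normalize` are hypothetical.

```python
import numpy as np

def peak_envelope(a1, a2, a3, phi_sum=0.0, n=4096):
    """Peak of the three-tone envelope over one beat period (theta = dw * t)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    # Complex envelope relative to the center tone, with phi1 = 0 and phi3 = phi_sum.
    env = a1 * np.exp(-1j * theta) + a2 + a3 * np.exp(1j * (theta + phi_sum))
    return float(np.max(np.abs(env)))

def normalize(a1_star, a3_star, phi_sum=0.0):
    """Scale (a1*, 1, a3*) so that the waveform peak equals unity, as in (3.22)."""
    v_max = peak_envelope(a1_star, 1.0, a3_star, phi_sum)
    scaled = tuple(float(c) / v_max for c in (a1_star, 1.0, a3_star))
    return scaled, v_max

coeffs, v_max = normalize(a1_star=0.1, a3_star=2.0)
print(f"V_max = {v_max:.3f}, (a1, a2, a3) = " + ", ".join(f"{c:.3f}" for c in coeffs))
```

With $\phi_1 + \phi_3 = 0$ all three tones align at the envelope peak, so the peak is simply $a_1 + a_2 + a_3$, and the scaled coefficients agree with the values quoted above to within rounding.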

The waveform PAPR and the normalized IM3 contours, plotted in Fig. 3.28(b) and 3.28(d), respectively, can help in locating the maximum IM3 power for a given PAPR. Fig. 3.29(a) and (b) plot the waveform PAPR and the normalized IM3 contours for the other cases with  $\phi_1 + \phi_3 = \pi/2$  and  $\phi_1 + \phi_3 = \pi$ , respectively. Eqs. (3.20) and (3.21) indicate that the waveform peak voltage, waveform PAPR, and the IM3 current magnitude with  $\phi_1 + \phi_3 = x$  are equal to those with  $\phi_1 + \phi_3 = 2\pi - x$ . Therefore, cases with  $\phi_1 + \phi_3 > \pi$  do not have to be explored. In other words,

$$V_{max,3tone}(\phi_1 + \phi_3) = V_{max,3tone}(2\pi - \phi_1 - \phi_3)$$
(3.23)

$$|I_{IM3,u,3tone}(\phi_1 + \phi_3)| = |I_{IM3,u,3tone}(2\pi - \phi_1 - \phi_3)|.$$
(3.24)
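The symmetry in (3.23) and (3.24) can be seen from the complex envelope of (3.19). The short derivation below is a sketch added for clarity; the symbols $\theta$, $\sigma$, and $E$ are introduced here for convenience and are not used elsewhere in this work. With $\theta \equiv \Delta\omega t - \phi_1$ and $\sigma \equiv \phi_1 + \phi_3$,

$$E(\theta;\sigma) = a_1 e^{-j\theta} + a_2 + a_3 e^{j(\theta+\sigma)}, \qquad V_{Tx,3tone,env} = |E(\theta;\sigma)|.$$

Conjugating and reversing $\theta$ gives

$$E^{*}(-\theta;-\sigma) = a_1 e^{-j\theta} + a_2 + a_3 e^{j(\theta+\sigma)} = E(\theta;\sigma),$$

so replacing $\sigma$ by $2\pi - \sigma$ only time-reverses the envelope, which leaves its peak and mean-square value (and hence the PAPR) unchanged. Similarly, factoring $e^{j(\phi_3 - \phi_1)}$ out of (3.21) gives

$$|I_{IM3,u,3tone}| = b_3 G_{link}^3 a_2 a_3 \left| a_3 e^{j\sigma} + 2a_1 \right|,$$

which is unchanged when $\sigma$ is replaced by $-\sigma$.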

Fig. 3.29 shows that the normalized IM3 contours are not as sensitive to  $\phi_1 + \phi_3$  as the PAPR contours. Although there are multiple  $(a_1, a_2, a_3)$  combinations for a given PAPR, the combination corresponding to the highest (normalized) IM3 current has  $a_3 > 1$ . The maximum (normalized) IM3 current versus the waveform PAPR is calculated for many values of  $\phi_1 + \phi_3$  from 0 to  $\pi$ , and some results are plotted in Fig. 3.30. The optimal performance occurs at  $\phi_1 + \phi_3 = \pi$ . The IM3 current generated by a two-tone Tx is also included for comparison. It is observed that under the same peak waveform voltage and PAPR, the proposed three-tone Tx can increase the IM3 current by 3 dB in the low-PAPR region (e.g., PAPR < 1.2). The performance improvement can also be observed from the time-domain Tx envelopes. Under a fixed Tx peak voltage of 1 V, the optimal three-tone Tx envelopes for PAPR of 1.1 and 1.3 are plotted in Fig. 3.31(a). The optimal  $(a_1, a_2, a_3)$  coefficients and the calculated (normalized) IM3 current are annotated and have been marked on the IM3 contours in Fig. 3.29(b). Although the auxiliary tone at  $\omega_{Tx} - \Delta\omega$ , with magnitude  $a_1$ , is relatively weak, it plays an important role in shaping the waveform envelope. For comparison, the counterpart two-tone Tx envelopes, with the same  $a_2$  and  $a_3$  but with the auxiliary tone turned off  $(a_1 = 0)$ , are plotted in Fig. 3.31(b). Although the two-tone waveforms have normalized IM3 currents  $(IM3_n)$  close to those of the counterpart three-tone waveforms, the waveform PAPR is noticeably higher. Therefore, the three-tone Tx can outperform the two-tone Tx in the low-PAPR region, since the two-tone Tx exhibits a pronounced tradeoff between the waveform PAPR and the IM3 current there.

The optimal three-tone waveforms for each waveform PAPR are applied to the custom-made PCB rectifier, with a peak power of 11 dBm and frequencies of 880, 895, and 910 MHz.



Figure 3.29: Calculated waveform PAPR and normalized IM3 current for (a)  $\phi_1 + \phi_3 = \pi/2$  and (b) $\phi_1 + \phi_3 = \pi$ .



Figure 3.30: Calculated normalized IM3 current excited by two-tone and three-tone Tx waveform.

The simulated and measured harvested power and the 925-MHz IM3 power returned to the source are plotted in Fig. 3.32. As in the two-tone cases, the harvested dc power decreases while the IM3 power increases with the (source) waveform PAPR. The measured results under two-tone excitation, previously shown in Fig. 3.27(a), are also included for comparison. Under the same PAPR (and Tx average power), the harvested dc power with the optimal three-tone waveform is very close to that obtained with the two-tone waveform, while the excited IM3 powers are noticeably higher. The measured improvements in the UL power are 4.8, 3.8, and 2.6 dB at waveform PAPR of 1.1, 1.3, and 1.5, respectively. The (upper) IM3 efficiencies derived from the measurements are listed in Table 3.3. The efficiency is 3.2% under a low PAPR of 1.2 and a high output voltage of 0.97 V. Such an efficiency can only be obtained by a two-tone waveform with a high PAPR of 1.5 and a lower output voltage around 0.85 V.




Figure 3.31: (a) Optimal three-tone Tx envelopes with peak voltage of 1 and PAPR of 1.1 and 1.3. (b) Two-tone Tx envelopes with the auxiliary tone turned off.

| PAPR | Harvested<br>Voltage (mV) | Reduced dc<br>Power (mW) | IM3 <sub>u</sub> Power<br>(mW) | IM3 <sub>u</sub><br>Eff. (%) |
|------|---------------------------|--------------------------|--------------------------------|------------------------------|
| 1.1  | 1028                      | 0.33                     | 0.008                          | 2.4                          |
| 1.2  | 966                       | 0.97                     | 0.031                          | 3.2                          |
| 1.3  | 912                       | 1.46                     | 0.066                          | 4.5                          |
| 1.4  | 849                       | 2.01                     | 0.110                          | 5.5                          |
| 1.5  | 802                       | 2.40                     | 0.138                          | 5.7                          |
| 1.6  | 773                       | 2.63                     | 0.151                          | 5.8                          |
| 1.7  | 736                       | 2.91                     | 0.173                          | 6.0                          |

Table 3.3: Uplink carrier generation with three-tone Tx waveform.
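The efficiency column of Table 3.3 can be reproduced from the other columns. The sketch below assumes that the IM3 efficiency is the ratio of the upper-IM3 power to the reduction in harvested dc power; this reading is inferred from the tabulated values rather than taken from a definition in this section, and the function name is hypothetical.

```python
# Rows of Table 3.3: (PAPR, reduced dc power in mW, upper-IM3 power in mW, listed efficiency in %)
TABLE_3_3 = [
    (1.1, 0.33, 0.008, 2.4),
    (1.2, 0.97, 0.031, 3.2),
    (1.3, 1.46, 0.066, 4.5),
    (1.4, 2.01, 0.110, 5.5),
    (1.5, 2.40, 0.138, 5.7),
    (1.6, 2.63, 0.151, 5.8),
    (1.7, 2.91, 0.173, 6.0),
]

def im3_efficiency_percent(reduced_dc_mw, im3_mw):
    """Assumed definition: IM3 power divided by the dc power given up to generate it."""
    return 100.0 * im3_mw / reduced_dc_mw

for papr, reduced_dc, im3, listed in TABLE_3_3:
    computed = im3_efficiency_percent(reduced_dc, im3)
    print(f"PAPR {papr}: computed {computed:.2f}%, listed {listed}%")
```

Under this assumption the computed values match the listed efficiencies to within the rounding of the table entries.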



Figure 3.32: Measured and simulated IM3 power and harvested dc voltage with optimal three-tone waveforms for different PAPR.

### Uplink System With A Custom Designed Tag

The UL quality, in terms of the spectrum SNR and the EVM of the demodulated constellation, in a WPT and IM3 UL system can benefit from employing an optimized three-tone Tx. Such a system demonstration is introduced in this section with the custom-made UHF PCB rectifier, whose performance under two-tone and three-tone excitation has been characterized above. Under a fixed coupling distance of 45 cm, Tx peak power of 37 dBm (5 W), UL data rate of 1 Mb/s, and Tx waveform PAPR of 1.1 (for a high harvested dc power), the demodulated EVM achieved with the optimal three-tone Tx waveform is 4.1 dB better than that achieved with the two-tone Tx waveform. The harvested dc voltage at the 200- $\Omega$ rectifier dc load is 0.98 V with the three-tone Tx and 0.97 V with the two-tone Tx.

The system block diagram is illustrated in Fig. 3.33(a). The spectral representations of signals at critical system nodes are measured and provided in Fig. 3.33(b). The harvested dc voltage around 1 V implies that the rectenna input power is around 10 dBm and the (one-way) path loss including the monopole antennas (TG.22.0112) is about 27 dB. To accommodate the (cavity) duplexer with the Tx band from 880 to 915 MHz and the Rx band from 925 to 960 MHz, the three-tone waveform uses frequencies of 883, 898, and 913 MHz and the two-tone waveform uses frequencies of 898 and 913 MHz. The UL frequency is 928 MHz for both cases. The duplexer has a low insertion loss of 1 dB and the Tx-to-Rx rejection is higher than 70 dB [42].

For both Tx waveforms, the dominant Tx tone at the antenna is at 913 MHz and is around 37 dBm (5 W). The corresponding Tx (white) noise floors are -110 dBm/Hz. The excellent duplexer Tx-to-Rx isolation suppresses the Rx noise floor significantly and allows an LNA to be connected directly to the duplexer output. The noise floor at the duplexer output should be very close to the thermal noise floor of -174 dBm/Hz, because the noise measured at the RF LNA output is -137 dBm/Hz with an estimated LNA-chain RF gain of 31 dB and noise figure of 5 dB. First, the tag modulator is driven by a square waveform at 1 MHz. The modulator only weakly modulates the IM3 power transmitted back to the reader, and the desired sideband powers are lower than the in-band IM3 blocker by 27 dB. The IM3 powers measured at the Rx LNA output (point B in Fig. 3.33) are -23 dBm and -19.5 dBm for the two-tone and three-tone Tx, respectively. The complete Rx spectra are plotted in Fig. 3.34(a) and (b) for the two-tone and three-tone Tx, respectively. The UL power generated by the three-tone Tx is better by 3.5 dB. The estimated tag-generated IM3 powers under both excitations are close to the measured values.

The power difference between the IM3 sideband and the noise floor (with 1-Hz noise bandwidth) is 87 dB with the two-tone Tx and 90.5 dB with the three-tone Tx. The UL signal is down-converted to 30 MHz, filtered, and then amplified and sampled by a 200-MS/s oscilloscope. The double-sideband mixer down-conversion and the IF amplification further degrade the Rx SNR, and the measured sideband SNRs in the IF domain (point C in Fig. 3.33) are 82 dB and 86 dB for the two-tone and three-tone Tx, respectively.
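The level budget at the LNA output (point B) can be cross-checked with simple dB arithmetic. The sketch below uses only the numbers quoted in this section; the variable and function names are hypothetical, and all noise quantities refer to a 1-Hz bandwidth.

```python
# Noise and signal levels at the LNA output (point B), all in a 1-Hz bandwidth.
THERMAL_FLOOR_DBM_HZ = -174.0    # thermal noise floor quoted in the text
LNA_GAIN_DB = 31.0               # estimated LNA-chain RF gain
LNA_NF_DB = 5.0                  # estimated LNA-chain noise figure
MEASURED_FLOOR_DBM_HZ = -137.0   # noise measured at the LNA output
SIDEBAND_BELOW_IM3_DB = 27.0     # sideband level relative to the IM3 blocker

# Output noise floor expected if the duplexer output sits at the thermal floor.
predicted_floor = THERMAL_FLOOR_DBM_HZ + LNA_GAIN_DB + LNA_NF_DB

def sideband_dbm(snr_db):
    """Sideband power implied by the measured floor and the sideband-to-noise ratio."""
    return MEASURED_FLOOR_DBM_HZ + snr_db

def im3_blocker_dbm(snr_db):
    """IM3 blocker power implied by the sideband level and the 27-dB offset."""
    return sideband_dbm(snr_db) + SIDEBAND_BELOW_IM3_DB

print("predicted floor:", predicted_floor, "dBm/Hz")
for label, snr in (("two-tone", 87.0), ("three-tone", 90.5)):
    print(label, "sideband:", sideband_dbm(snr), "dBm, IM3:", im3_blocker_dbm(snr), "dBm")
```

The predicted floor is within about 1 dB of the measured -137 dBm/Hz, and the implied IM3 levels differ by the same 3.5 dB as the sideband SNRs.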

Finally, a 1-Mb/s PRBS signal is applied to the modulator. The scope-sampled IF signal is quadrature down-converted to baseband and then filtered by a 1-MHz (comb) low-pass filter (LPF). The demodulated constellations, with 500 consecutive symbols, are plotted in Fig. 3.35. The demodulated EVM is -20.6 dB with the two-tone Tx and -24.7 dB with the three-tone Tx. The EVM improvement of 4.1 dB justifies using the optimal three-tone waveform.

Table 3.4 summarizes the UL power and efficiency performance of some recently reported RF-powered tags adopting FDD WPT/UL operation [5, 35–38]. Conventionally, the UL frequency is generated by an oscillator (OSC) or a phase-locked loop (PLL), and the dedicated UL antenna is driven by a buffer amplifier. The oscillator and buffer are only turned on when the tag transmits the OOK-modulated symbol "1". The efficiency is usually determined by the buffer stage and the reported values are around 15%. The proposed IM3 UL system has a lower efficiency of 6% (at Tx PAPR of 1.7) in generating the UL carrier. The efficiency is lower because the WPT input and UL output share the same antenna, so the unidirectional buffer amplifier used in many FDD works cannot be employed. Also, the lower IM3 component is not utilized. The FDD UL exploiting the rectifier third harmonic [38] seems to generate the UL signal for free, although a dual-band MNW and a dual-band antenna are required. However, the third-harmonic power generated by the rectifier could otherwise be redistributed to the rectifier output dc power, so an implicit performance tradeoff exists. Harmonic source pull has revealed that the rectifier PTE can be improved with the rectifier source impedance designed at harmonics [126], where the harmonic currents generated by



[Fig. 3.33 panels: (a) system block diagram; (b) spectra under two-tone and three-tone excitation (PAPR = 1.1) at the duplexer output (point A), after the LNA (point B), at the mixer RF input, and at the amplified 30-MHz IF (point C).]

Figure 3.33: (a) System block diagram. (b) Measured spectrum at critical system nodes.



Figure 3.34: Measured reader Rx spectrum (after LNA) of the IM3 uplink with (a) two-tone and (b) three-tone Tx waveform.



Figure 3.35: Demodulated (PRBS) constellation with (a) two-tone and (b) three-tone Tx waveform.

| Ref.      | UL Freq. (GHz) | UL Power (dBm) | Eff. (%) | UL Carrier Generation |
|-----------|----------------|----------------|----------|-----------------------|
| [35]      | 2.43           | -12.5          | 15\*     | PLL + Buffer          |
| [36]      | 1.46           | -21            | 20       | OSC + Buffer          |
| [37]      | 1.74           | -18.5          | 15       | PLL + Buffer          |
| [5]       | 60             | -3             | 20\*     | Pulser + OSC + Buffer |
| [38]      | 2.60           | N.A.           | N.A.     | Tag HD3 Nonlinearity  |
| This Work | 0.93           | -7.6           | 6.0      | Tag IM3 Nonlinearity  |

\*Buffer only (simulation)

Table 3.4: Reported FDD RF-powered tag UL carrier generation.

the rectifier nonlinearity are filtered by the MNW and cannot be exploited for UL.

### Uplink System With A Commercial Gen2 UHF Tag

In our previous work [42], the two-tone IM3 UL has been shown to be compatible with a commercial Gen-2 UHF RFID inlay. The IM3 UL, with the duplexer-based FDD operation, achieves a better EVM than that obtained with the conventional CW Tx and backscattering UL. In this section, the optimal three-tone Tx is also demonstrated to achieve a better UL quality compared to that achieved with a two-tone Tx waveform. The same commercial tag (Avery Dennison RFID 600598, Monza R6) is placed at a coupling distance of 25 cm.

Both Tx waveforms are limited to a peak power of 28 dBm (7.93 V at 50  $\Omega$ ) and the Tx waveform PAPR is fixed at 1.2. According to the optimization introduced above, the adopted three-tone Tx output voltage uses  $(a_1, a_2, a_3, \phi_1, \phi_3) = (0.347, 1.098, 7.141, \phi_1, \pi - \phi_1)$ , and the magnitude coefficients of the two-tone Tx are  $(a_2, a_3) = (0.71, 7.22)$ . The system block diagram is shown in Fig. 3.36. An OOK modulator is necessary to provide the proper preamble and command to activate the tag. The adopted preamble and command are the same as those used in [42] and the parameters are annotated in Fig. 3.36. The selected backscattering data rate is 128 kb/s with FM0 encoding [31].
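The quoted coefficients can be checked against the stated peak voltage and PAPR. The sketch below is illustrative only: it converts the peak power to a peak voltage across 50 $\Omega$ and evaluates the envelope peak and PAPR of both waveforms from the complex envelope of (3.19), taking $\phi_1 = 0$ (an assumption that only shifts the time origin). The function names are hypothetical.

```python
import numpy as np

def peak_voltage(p_dbm, r_ohm=50.0):
    """Peak sinusoidal voltage for a given peak (envelope) power in dBm."""
    p_watt = 10.0 ** ((p_dbm - 30.0) / 10.0)
    return float(np.sqrt(2.0 * p_watt * r_ohm))

def peak_and_papr(a1, a2, a3, phi_sum, n=8192):
    """Envelope peak and PAPR (peak^2 / mean-square) over one beat period."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    # Complex envelope relative to the center tone, with phi1 = 0 and phi3 = phi_sum.
    env_sq = np.abs(a1 * np.exp(-1j * theta) + a2 + a3 * np.exp(1j * (theta + phi_sum))) ** 2
    return float(np.sqrt(env_sq.max())), float(env_sq.max() / env_sq.mean())

v_pk = peak_voltage(28.0)
three_tone = peak_and_papr(0.347, 1.098, 7.141, np.pi)
two_tone = peak_and_papr(0.0, 0.71, 7.22, 0.0)
print(f"peak voltage for 28 dBm: {v_pk:.2f} V")
print(f"three-tone: peak {three_tone[0]:.2f} V, PAPR {three_tone[1]:.3f}")
print(f"two-tone:   peak {two_tone[0]:.2f} V, PAPR {two_tone[1]:.3f}")
```

Both coefficient sets give an envelope peak close to the stated peak voltage and a PAPR close to 1.2, so the two waveforms are compared under the same constraints.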

The scope-sampled 30-MHz UL signal is processed by quadrature down-conversion and then filtered by a 1-MHz comb LPF. The resulting time-domain baseband waveforms corresponding to the two-tone and three-tone Tx are plotted in Fig. 3.37(a) and (b), respectively. Higher UL power is observed with the three-tone Tx. Noticeable Tx-to-Rx leakage exists when the preamble and command are sent, because the modulated Tx spectrum falls into the UL frequency. The high Tx-to-Rx leakage during the DL session is not an issue because the UL



Figure 3.36: Block diagram of the WPT/UL system communicating with a commercial UHF tag.

signal is only sent in the UL session. The UL preamble and the 16-bit key are recovered by the FM0 decoder, which also rejects the low-frequency noise. The FM0 decoder adopted in this work and [42] simply calculates the *i*-th decoded symbol as the absolute difference between the (2i)-th and the (2i - 1)-th demodulated symbols. The adopted FM0 decoding scheme degrades the SNR (or EVM) by 3 dB.
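The differential decoding rule described above can be sketched in a few lines. This is a minimal illustration and not the decoder used in the measurement: it assumes the demodulated samples are already aligned to half-symbol boundaries, and the function name is hypothetical.

```python
import numpy as np

def fm0_decode(half_symbols):
    """Return |s(2i) - s(2i-1)| for i = 1, 2, ... (1-based indices as in the text)."""
    s = np.asarray(half_symbols, dtype=float)
    if s.size % 2:
        raise ValueError("expected an even number of half-symbol samples")
    # With 0-based indexing the pairs are (s[0], s[1]), (s[2], s[3]), ...
    return np.abs(s[1::2] - s[0::2])

# Example stream with a level inversion at every symbol boundary.
half = np.array([1, 1, 0, 1, 0, 0, 1, 0], dtype=float)
decoded = fm0_decode(half)
print(decoded)

# A constant (or slowly varying) offset cancels in the difference.
decoded_with_offset = fm0_decode(half + 5.0)
print(decoded_with_offset)
```

Under the usual FM0 convention (assumed here), a large difference marks a mid-symbol transition (data-0) and a small difference marks its absence (data-1). Because each output is the difference of two noisy samples, the noise variance doubles if the two samples carry independent noise, which is one way to read the 3-dB penalty noted above.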

Finally, the EVM of the decoded symbols can be calculated. The optimal three-tone waveform achieves an EVM of -29.8 dB and the two-tone Tx achieves an EVM of -25.6 dB. Considering that the UL packet has only 23 symbols, the EVM distribution over 40 UL packets has also been measured. The EVM distributions with both Tx waveforms are plotted in Fig. 3.38, which confirms the better UL performance associated with the three-tone Tx.



Figure 3.37: UL signal after quadrature down-conversion and digital filtering with (a) two-tone and (b) three-tone Tx waveform.



Figure 3.38: EVM distribution with (a) two-tone and (b) three-tone Tx waveform.

## Chapter 4

# Wideband and Efficient RF Transmitters

## 4.1 All-Digital RF Transmitter System Overview

Fig. 4.1 illustrates the block diagram of the designed all-digital RF transmitter. Unlike most reported digitally-modulated transmitters that send amplitude/phase or I/Q codes to the chip in parallel with numerous I/O pins (e.g., [71–74]), this design adopts a flip-chip chip-to-interposer connection to allow for a compact package, high-quality passives, and better signal integrity. As a result, the number of available I/O pins is limited by the 200- $\mu$ m bump pitch, so two high-speed 2.5-Gb/s LVDS receivers, each including a pre-amplifier and a strong-arm comparator, are designed to receive the amplitude-modulation (AM) and phase-modulation (PM) serial code streams. Two 1-to-10 deserializers (DeSer) following the LVDS receivers regenerate the parallel 8-bit AM and PM codes (two of the ten bits are not used) with a symbol rate of 250 MS/s. Two clock receivers, one at 1.25 GHz and one at 250 MHz, are implemented for use by the LVDS receivers and deserializers. Both the rising and falling edges of the 1.25-GHz clock are utilized to sample the 2.5-Gb/s LVDS signals. More details on the Tx periphery circuits are in the Appendix.

The 8-bit amplitude modulator outputs 256 amplitude states, realized by dividing the inverse Class-D core into 19 parallel cells, composed of 4 binary cells and 15 thermometer cells. The binary cells have relative sizes of 1, 2, 4, and 8, and the thermometer cells have a relative size of 16. The unit-cell size on each differential side is 6  $\mu$ m/65 nm for the thin-oxide NMOS switch device and 18  $\mu$ m/280 nm for the cascode thick-oxide device. In the LO path, a static divider is used to create the in-phase and quadrature LO signals for the 8-bit IQ-mixer-based phase modulator, so the LO receiver has to operate at twice the Tx frequency. The supply voltage is 2.5 V for the power core and 1 V for the driver, phase modulator, and LO/CLK receivers. The on-chip amplitude and phase paths have a delay mismatch lower than 100 ps, which degrades the performance very little. The operations of the amplitude and phase modulator will be covered in the following sections.
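The segmentation of the power core can be illustrated with a short sketch. The mapping below is an assumption made for illustration: the four LSBs are taken to drive the binary cells directly and the four MSBs to be thermometer-decoded, which is a common arrangement but is not stated in this section. The names are hypothetical.

```python
BINARY_WEIGHTS = (1, 2, 4, 8)   # relative sizes of the 4 binary cells
THERMO_WEIGHT = 16              # relative size of each thermometer cell
N_THERMO = 15                   # number of thermometer cells

def am_code_to_cells(code):
    """Split an 8-bit AM code into binary-cell enables and a thermometer count."""
    if not 0 <= code <= 255:
        raise ValueError("AM code must be an 8-bit value")
    binary_on = [(code >> k) & 1 for k in range(4)]  # assumed: 4 LSBs
    n_thermo_on = code >> 4                          # assumed: 4 MSBs, 0..15 cells
    return binary_on, n_thermo_on

def enabled_size(code):
    """Total enabled switch size in units of the unit cell."""
    binary_on, n_thermo_on = am_code_to_cells(code)
    return sum(w * b for w, b in zip(BINARY_WEIGHTS, binary_on)) + THERMO_WEIGHT * n_thermo_on

total_cells = len(BINARY_WEIGHTS) + N_THERMO
full_scale = enabled_size(255)
print(total_cells, full_scale)
```

With this mapping the enabled size equals the AM code, so the 19 cells cover all 256 states (sizes 0 to 255) in uniform conductance steps.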



Figure 4.1: Block diagram of the realized all-digital RF transmitter with on-chip amplitude and phase modulator and on-interposer transformer.

## 4.2 Design And Analysis of the Inverse Class-D PA

#### Inverse Class-D Performance with Ideal Switches

The inverse Class-D power core, illustrated in Fig. 4.2(a), features non-overlapping voltage and current waveforms and an ideal maximum DE of 100% (if the switch resistance  $R_{on}$  is zero). Compared to Class-D PAs, which are based on CMOS inverters and thus suffer from the associated  $CV^2 f$  parasitic loss [65–67], the parallel drain capacitance in an inverse Class-D power core can be absorbed into the load resonator, and better performance can be expected at higher frequencies. In practice, the non-zero switch  $R_{on}$  degrades both the power and efficiency, and causes overlap between the current and voltage waveforms. The output power and DE of an inverse Class-D cell are functions of the switch and the load resistance, the LC-tank, and the dc feed inductance, as given by the analytical expressions presented in [71, Eqs. (11) and (12)]. The analysis was conducted by solving the circuit KCL at all harmonics, where the switches were modeled by a time-varying conductance alternating between zero and the switch conductance  $(1/R_{on})$ . The provided expressions contain infinite series and are difficult to interpret, so they are revised here to provide a more general and accessible analysis, where the load is represented by a complex number. Similar to [71], the load impedance is assumed to be zero for odd-order harmonics and infinite for even-order harmonics. According to [71, Fig. 11], the load impedances at even harmonics do not significantly affect the results. After some algebraic manipulations, the output (fundamental) power and DE of an inverse Class-D core are expressed by

$$P_{\text{out\_ClassD-1}} \approx \frac{V_{\text{DD}}^2}{4} \times \frac{20R_L}{\left(5R_{\text{on}} + R_L\right)^2} \tag{4.1}$$



Figure 4.2: (a) Schematic of the inverse Class-D cell ( $V_{DD} = 2.5, R_{on} = 1$ ). (b) Power (in dBm) vs.  $Z_L$ . (c) DE vs.  $Z_L$ . The switch current/voltage waveform with  $Z_L$  of (d) 10  $\Omega$ , (e) 100  $\Omega$ , and (f) 10+10j  $\Omega$ .

$$DE_{\text{ClassD}-1} \approx \frac{1}{1 + 5R_{\text{on}}/R_L},\tag{4.2}$$

where  $R_L$  is the load resistance  $(Z_L = R_L + jX_L)$ . Eqs. (4.1) and (4.2) suggest that the performance of the inverse Class-D power core is more conveniently characterized in terms of the load impedance  $Z_L$  than the load admittance  $(Y_L = G_L + jB_L)$ . The approximations are valid assuming  $4\omega_0 C_s R_{on} \ll 1$ . This assumption implies that the switch capacitance can be neglected and is valid for switches fabricated in modern CMOS processes (e.g., 65-nm CMOS) and operated at RF frequencies (e.g., 4 GHz).

Eqs. (4.1) and (4.2) indicate that the output power and DE are independent of the load reactance  $X_L$ . The simulated power and DE for the Fig. 4.2(a) schematic are plotted in Fig. 4.2(b) and (c), respectively, versus the load impedance. The ideal switches have an on-resistance of 1  $\Omega$  ( $R_{on} = 1 \Omega$ ), and the supply voltage is 2.5 V. If  $R_{on} \neq 1 \Omega$  or  $V_{DD} \neq 2.5$  V, Fig. 4.2(b) and (c) are still applicable. In such a case, the horizontal and vertical axes represent  $R_L/R_{on}$  and  $X_L/R_{on}$ , respectively, and the output power should be scaled by  $(V_{DD}/2.5)^2/R_{on}$ . It is observed that  $X_L$  does not affect the power and efficiency. To illustrate this, the time-domain current and voltage waveforms are plotted in Fig. 4.2(d), (e), and (f), corresponding to  $Z_L$  of 10, 100, and  $10+10j \ \Omega$ , respectively. Because the current is zero during the second half-cycle, the switches only dissipate power in the first half-cycle. With zero load reactance  $(X_L = 0)$ , the peak voltage during the second half-cycle approaches a maximum value of  $\pi V_{DD}$  as  $R_L$  increases (or as  $R_{on}$  decreases), and the voltage in the first half-cycle approaches zero. As shown in Fig. 4.2(f), with a non-zero load reactance ( $X_L \neq 0$ ), the voltage swing can exceed  $\pi V_{DD}$  and also become negative during the second half-cycle. However, the current/voltage waveforms in the first half-cycle and, thus, the switch power dissipation are equivalent to those associated with  $Z_L = R_L + j0$ . Although using a non-zero  $X_L$  results in a higher fundamental voltage, the fundamental voltage is no longer in phase with the fundamental current, and the resulting output power (and DE) does not change.

As predicted by (4.1), with a fixed  $R_{on}$  the output power is lower when  $R_L$  takes either a very high or a very low value. Applying the AM-GM inequality to (4.1), it can be shown that the maximum power of  $V_{DD}^2/4R_{on}$  occurs when  $R_L = 5R_{on}$ . Unfortunately, the drain efficiency corresponding to the maximum power is only 50%. To achieve a higher efficiency, the load resistance must be significantly higher than the switch resistance. A practical design uses  $R_L \approx 10R_{on}$ , resulting in an output power that is 0.5 dB below the maximum value and an efficiency of 67%.
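The design points quoted above follow directly from (4.1) and (4.2). The sketch below evaluates both expressions for the ideal-switch case; it is an illustration under the same approximations, and the function names are hypothetical.

```python
import math

def p_out_inv_class_d(v_dd, r_on, r_l):
    """Fundamental output power of the inverse Class-D core, per (4.1), in watts."""
    return (v_dd ** 2 / 4.0) * 20.0 * r_l / (5.0 * r_on + r_l) ** 2

def de_inv_class_d(r_on, r_l):
    """Drain efficiency of the inverse Class-D core, per (4.2)."""
    return 1.0 / (1.0 + 5.0 * r_on / r_l)

V_DD, R_ON = 2.5, 1.0

# Maximum-power load (R_L = 5 R_on) and the practical choice (R_L = 10 R_on).
p_max = p_out_inv_class_d(V_DD, R_ON, 5.0 * R_ON)
p_practical = p_out_inv_class_d(V_DD, R_ON, 10.0 * R_ON)
backoff_db = 10.0 * math.log10(p_practical / p_max)

# Coarse sweep to confirm that the power peaks at R_L = 5 R_on.
loads = [0.5 * k for k in range(1, 101)]
best_load = max(loads, key=lambda r_l: p_out_inv_class_d(V_DD, R_ON, r_l))

print(f"P_max = {p_max:.4f} W at R_L = {best_load} ohm, DE = {de_inv_class_d(R_ON, 5.0):.2f}")
print(f"R_L = 10 R_on: {backoff_db:.2f} dB relative to P_max, DE = {de_inv_class_d(R_ON, 10.0):.3f}")
```

The same functions can be reused for other supply voltages and switch resistances, since the efficiency depends only on the ratio $R_L/R_{on}$.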

Compared to a Class-D power core with the same supply and switch resistance, the inverse Class-D core has a higher output power. Without using PMOS devices, the inverse Class-D can also achieve a better efficiency. The Class-D power core, illustrated in Fig. 4.3(a), is modeled by a load resistance  $R_L$  with both terminals connected to a square-wave voltage source that has a source resistance of  $R_{on}$ . The two voltage sources are complementary and



Figure 4.3: (a) Model of a Class-D power core. (b) Efficiency comparison of a Class-D and inverse Class-D power core versus  $x \equiv R_L/R_{on}$ .

swing between 0 and  $V_{DD}$ . The Class-D (fundamental) power and efficiency can be easily derived as

$$P_{\text{out\_ClassD}} \approx \frac{4V_{\text{DD}}^2}{5} \times \frac{R_L}{\left(2R_{\text{on}} + R_L\right)^2} \tag{4.3}$$

$$DE_{\text{ClassD}} \approx \frac{4}{5} \times \frac{1}{1 + 2R_{\text{on}}/R_L}. \tag{4.4}$$

According to (4.3), the maximum power achievable by the Class-D cell, reached when  $R_L = 2R_{on}$ , is  $V_{DD}^2/10R_{on}$ . This maximum power is 4 dB lower than that of an inverse Class-D cell. Moreover, for the same  $R_{on}$  and  $R_L$ , the inverse Class-D cell exhibits up to 6 dB higher power. The ratio  $P_{out\_ClassD-1}/P_{out\_ClassD}$  is a function of  $x \equiv R_L/R_{on}$  and is 1.9, 4.9, and 6.0 dB for x = 1, 5, and 10, respectively. The DEs for the two topologies are plotted in Fig. 4.3(b) as functions of x. The Class-D core exhibits a higher DE at lower x, while the inverse Class-D cell obtains a higher DE if x > 10. Notice that an inverse Class-D cell does not need a pull-up PMOS device and can employ a larger switch with lower on-resistance  $(R_{on})$  but similar parasitic capacitance. Finally, the calculated output powers (in dBm) of an inverse Class-D cell and a Class-D cell, both with a 2.5-V supply voltage, are plotted in Figs. 4.4(a) and (b), respectively, as functions of  $(R_L, R_{on})$ . The efficiency, which only depends on  $x \equiv R_L/R_{on}$ , is also annotated. Fig. 4.4 provides a preliminary guide to the design of Class-D and inverse Class-D power cores.
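The comparison between the two topologies reduces to functions of $x \equiv R_L/R_{on}$ once the powers are normalized to $V_{DD}^2/R_{on}$. The sketch below evaluates (4.1) to (4.4) in this normalized form; the function names are hypothetical and the normalization is introduced here only for compactness.

```python
import math

def p_inv_norm(x):
    """Inverse Class-D power from (4.1), in units of V_DD^2 / R_on."""
    return 5.0 * x / (5.0 + x) ** 2

def p_d_norm(x):
    """Class-D power from (4.3), in units of V_DD^2 / R_on."""
    return 0.8 * x / (2.0 + x) ** 2

def de_inv(x):
    """Inverse Class-D drain efficiency from (4.2)."""
    return 1.0 / (1.0 + 5.0 / x)

def de_d(x):
    """Class-D drain efficiency from (4.4)."""
    return 0.8 / (1.0 + 2.0 / x)

def power_ratio_db(x):
    """Inverse Class-D power relative to Class-D power at the same x, in dB."""
    return 10.0 * math.log10(p_inv_norm(x) / p_d_norm(x))

max_ratio_db = 10.0 * math.log10(p_inv_norm(5.0) / p_d_norm(2.0))
print(f"maximum-power advantage: {max_ratio_db:.2f} dB")
for x in (1.0, 5.0, 10.0):
    print(f"x = {x:4.1f}: ratio {power_ratio_db(x):.2f} dB, "
          f"DE inverse {de_inv(x):.3f}, DE Class-D {de_d(x):.3f}")
```

The two efficiency curves cross at x = 10, which is consistent with the inverse Class-D cell being more efficient only for x > 10.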



Figure 4.4: (a) Inverse Class-D and (b) Class-D output power and DE as functions of  $R_L$  and  $R_{on}$  ( $x \equiv R_L/R_{on}$ ).

For ease of design, the amplitude modulation of an inverse Class-D power core is usually realized by digitally modulating the switch conductance  $(G_{on} = 1/R_{on})$  with uniform steps [51, 61, 71–76]. According to (4.1) and (4.2), both the output power and efficiency degrade when  $G_{on}$  decreases from the peak value  $G_{on,max}$  ( $G_{on,max} = 1/R_{on,min}$ ), while the exact roll-off characteristics depend on both  $G_{on,max}$  and  $R_L$ . Fig. 4.5 shows the simulated back-off power, efficiency, output voltage  $(V_{out})$ , and output voltage step of an ideal inverse Class-D power core with  $R_L$  of 1, 10, and 100  $\Omega$ . The switch conductance is swept from 0.01 to 1 S with a step size of 0.01 S. The curves match the calculated results. It is worth noting that the voltage curve shown in Fig. 4.5(c) is a scaled version of the efficiency curve shown in Fig. 4.5(b). This is because  $V_{out} = V_{DD} \times DE \times \pi$  according to (4.1) and (4.2), and the relation  $P_{out} = V_{out}^2/2R_L$ . It has been explained that using  $R_L = 10R_{on,min}$  results in both good peak power and peak efficiency. However, the voltage steps increase and the AM resolution decreases with decreasing switch conductance, when fewer parallel cells operate. According to (4.1), to linearize the output voltage response by designing the load,  $R_L \ll 5R_{on,min}$  has to be satisfied. This inevitably leads to a very low efficiency and is not a preferred method. Instead, digital pre-distortion is usually employed with a look-up table (LUT) to find the proper AM code  $(AM_{code})$  associated with a given output magnitude [51, 61, 71–76]. This method is favored because it is easier to implement than designing a nonlinear  $G_{on}$ -to- $AM_{code}$  response.
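The back-off behavior and the LUT-based pre-distortion can be sketched from (4.1) and (4.2). The code below is an illustration under the ideal-switch approximations and not the simulation behind Fig. 4.5: it sweeps the switch conductance over the same 100 uniform steps, compares the voltage-step nonuniformity for two loads, and looks up the nearest code for a target output voltage. The function names and the 100-entry table are assumptions made for the example.

```python
import numpy as np

V_DD = 2.5  # supply voltage in volts

def v_out(g_on, r_l, v_dd=V_DD):
    """Fundamental output voltage from (4.1) and P_out = V_out^2 / (2 R_L)."""
    r_on = 1.0 / np.asarray(g_on, dtype=float)
    p_out = (v_dd ** 2 / 4.0) * 20.0 * r_l / (5.0 * r_on + r_l) ** 2
    return np.sqrt(2.0 * r_l * p_out)

def drain_efficiency(g_on, r_l):
    """Drain efficiency from (4.2) written in terms of G_on."""
    return 1.0 / (1.0 + 5.0 / (np.asarray(g_on, dtype=float) * r_l))

G_ON = np.arange(1, 101) * 0.01  # 0.01 S to 1 S in 0.01-S steps

def step_ratio(r_l):
    """Largest-to-smallest voltage step across the sweep (first step / last step)."""
    steps = np.diff(v_out(G_ON, r_l))
    return float(steps[0] / steps[-1])

def nearest_code(target_v, lut):
    """Index of the LUT entry whose output voltage is closest to the target."""
    return int(np.argmin(np.abs(lut - target_v)))

lut_10 = v_out(G_ON, 10.0)
print(f"step ratio: R_L = 1 ohm -> {step_ratio(1.0):.2f}, R_L = 10 ohm -> {step_ratio(10.0):.2f}")
print("code for half of full-scale voltage:", nearest_code(0.5 * lut_10[-1], lut_10))
```

With these expressions the ratio $V_{out}/(V_{DD} \times DE)$ evaluates to $\sqrt{10}$, which is within about 1% of $\pi$ and reflects the approximation already made in (4.1) and (4.2).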



Figure 4.5: Back-off (a) power, (b) DE, (c) output voltage, and (d) voltage step of an inverse Class-D cell with 2.5-V supply and  $R_L$  of 1, 10, and 100  $\Omega$ . The switch conductance is swept with step size of 0.01 S.

#### Practical Performance with NMOS Switch

The analysis above uses an ideal switch model, and the derived power and efficiency are independent of the load reactance. This ideal characteristic does not hold when the switches are realized by NMOS devices, since the switch cannot remain in the off state without conducting current when the drain swing becomes exceedingly negative. The voltage and current waveforms depicted in Fig. 4.2(f) therefore do not exist in practice. Instead, under a negative voltage with magnitude exceeding the device threshold, the switch conducts a negative current and limits the negative voltage swing, degrading both the output power and efficiency. Additionally, the abrupt voltage jump at the transition between the first and second half-cycles must be smoothed by the device parasitic capacitance, injecting a large current into the load. This also degrades the performance. The inverse Class-D cell used in this work, illustrated in Fig. 4.6(a), is an NMOS device with a size of 1.53 mm/65 nm. A cascode thick-oxide device withstands a voltage swing that can in theory reach  $\pi V_{DD}$ . Using the cascode device, the simulated switch on-resistance is  $0.6 \Omega$ . The simulated voltage and current waveforms are plotted in Fig. 4.6(b) and (c) for load impedances of 10 and  $10+10j \Omega$ , respectively. It is noted that Fig. 4.6(b) resembles Fig. 4.2(d), while the current waveform in Fig. 4.6(c) is very different from that in Fig. 4.2(f). The voltage can no longer swing deeply below zero and two injection pulses appear, such that the overlap between the current and voltage waveforms increases and both power and efficiency are reduced. The simulated peak power and DE of the designed inverse Class-D power core are plotted in Fig. 4.7 and 4.8, respectively, versus the load impedance  $Z_L$ . The power performance is fairly wideband, and the power contours in Fig. 4.7(a) and (b), simulated at 0.6 and 3.6 GHz, respectively, look similar to each other. Fig. 4.7 confirms that the presence of an imaginary impedance degrades the output power. Similarly, Fig. 4.8 shows that the DE also depends on the load reactance.

As explained by (4.1), when  $R_L$  is sufficiently high (e.g., 10  $\Omega$ ), the output power starts to degrade as  $R_L$  increases. As the output power decreases, the dc power dissipated by the device during the transition between the two half-cycles can no longer be neglected and hurts the efficiency. The DE is also lower at higher operating frequencies, where more power is dissipated during the switch transitions. Finally, to achieve both good output power and DE for the designed inverse Class-D core, a load impedance of about  $8+7j \Omega$  is preferred according to the simulation. This impedance is marked on Figs. 4.7 and 4.8, corresponding to an output power of about 31.5 dBm and a DE of about 70%. The load is inductive to compensate for the device parasitic capacitance.

## 4.3 Design of the CMOS Phase Modulator

The schematic of the digital phase modulator or interpolator (PI) is depicted in Fig. 4.9(a). The PI core is similar to a Gilbert-cell double-balanced IQ mixer [127, Sec. 6]. The four dc bias currents are digitally-modulated and denoted by  $I_{I+}$ ,  $I_{I-}$ ,  $I_{Q+}$ , and  $I_{Q-}$ . The differential



Figure 4.6: (a) Schematic for characterizing the designed CMOS core, and drain voltage and current with (b)  $Z_L=10 \ \Omega$  and (c)  $Z_L=10+10j \ \Omega$ 



Figure 4.7: Simulated peak power for the designed inverse Class-D core vs. the load impedance  $(Z_L)$  at (a) 0.6 and (b) 3.6 GHz.



Figure 4.8: Simulated drain efficiency (DE) for the designed inverse Class-D core vs. the load impedance  $(Z_L)$  at (a) 0.6 and (b) 3.6 GHz.

bias currents for the in-phase and quadrature mixers are  $(I_{I+} - I_{I-})$  and  $(I_{Q+} - I_{Q-})$ , respectively. The mixer output's fundamental current (at the LO frequency  $\omega_{LO}$ ) has an in-phase component proportional to  $(I_{I+} - I_{I-})$  and a quadrature component proportional to  $(I_{Q+} - I_{Q-})$ . Thus, the mixer output's fundamental voltage at  $\omega_{LO}$ , denoted by  $V_{mixer,fund}$ , is proportional to  $[(I_{I+} - I_{I-})\cos(\omega_{LO}t) + (I_{Q+} - I_{Q-})\sin(\omega_{LO}t)]$ . Notice that the mixer output voltage, denoted by  $V_{mixer}$ , includes higher harmonics so that  $V_{mixer} \neq V_{mixer,fund}$ . The current DACs for both the in-phase and quadrature mixers are 7-bit, as illustrated in Fig. 4.9(b), and are composed of three binary cells (with relative sizes of 1, 2, and 4) and 15 thermometer cells (with relative size of 8). The total current in the current DAC is  $127I_{DAC}$ , where  $I_{DAC} \approx 5.5 \ \mu A$  is the unit-cell current. The 7-bit current DAC used by the in-phase (and quadrature) mixer is able to output 128 differential current states from  $127I_{DAC}$  to  $-127I_{DAC}$  with a uniform step size of  $2I_{DAC}$ . The two digitally-modulated currents  $(I_{I+} - I_{I-})$  and  $(I_{Q+} - I_{Q-})$  are designed not to generate redundant phases at the PI output. The PI has an 8-bit resolution, where seven bits are used to select  $I_{I+}$  from 0 to  $127I_{DAC}$  with  $(I_{I+} - I_{I-}) = 2I_{I+} - 127I_{DAC}$ , and one bit is used to select  $(I_{Q+} - I_{Q-})$  between  $(128I_{DAC} - |I_{I+} - I_{I-}|)$  and  $(|I_{I+} - I_{I-}| - 128I_{DAC})$ . The table in Fig. 4.10(a) shows the two mixer currents  $(I_{I+} - I_{I-})$  and  $(I_{Q+} - I_{Q-})$  versus the 8-bit PM code  $(PM_{code})$ . Fig. 4.10(b) plots  $V_{mixer,fund}$  as a complex number in the phasor domain. It is observed that the output phase covers the full  $2\pi$  range. The output phase as a function of  $PM_{code}$  can be expressed



Figure 4.9: Schematic of (a) the digitally-modulated phase modulator (b) current DAC in the IQ mixer, and (c) LO integrator.

by

$$\text{Phase}(V_{mixer,fund}) = \tan^{-1} \left( \left| \frac{1+2x}{127-2x} \right| \right) + \frac{\pi}{2} \left\lfloor \frac{PM_{code}}{64} \right\rfloor$$
(4.5)

where  $x = (PM_{code} \mod 64)$ . Fig. 4.10(c) plots the output phase and phase step versus the  $PM_{code}$ . Notice that the phase step is not constant. The phase resolution has the finest value of  $0.9^{\circ}$ /step when the  $PM_{code} = 0$ , 64, 128, and 192 and the coarsest value of  $1.8^{\circ}$ /step when the  $PM_{code} = 32$ , 96, 160, and 224.
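As a quick numerical check of (4.5), the following minimal Python sketch evaluates the ideal phase for all 256 PM codes and extracts the finest and coarsest steps. The helper name `pi_phase_deg` is hypothetical and introduced here only for illustration; the sketch models the ideal expression, not the simulated circuit.

```python
import math

def pi_phase_deg(pm_code: int) -> float:
    """Ideal phase of V_mixer,fund in degrees for an 8-bit PM code, per (4.5)."""
    x = pm_code % 64            # position inside the current quadrant
    quadrant = pm_code // 64    # floor(PM_code / 64)
    ratio = abs((1 + 2 * x) / (127 - 2 * x))
    return math.degrees(math.atan(ratio)) + 90.0 * quadrant

# Phase increment between consecutive codes (0->1, 1->2, ..., 254->255).
steps = [pi_phase_deg(c + 1) - pi_phase_deg(c) for c in range(255)]

print(f"finest step   : {min(steps):.2f} deg")
print(f"coarsest step : {max(steps):.2f} deg at code {steps.index(max(steps))}")
```

Under these assumptions the steps range from roughly 0.9° near the quadrant boundaries to roughly 1.8° near the middle of each quadrant, in line with the values quoted above.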

To drive the switch amplifier, the mixer output voltage  $V_{mixer}$  is converted to a digital waveform by a CML-CMOS converter, which eliminates the amplitude information. The



Figure 4.10: (a) Mixer I/Q bias current (b) mixer output fundamental voltage  $V_{mixer,fund}$ , and (c)  $V_{mixer,fund}$  phase and phase step vs.  $PM_{code}$ .



Figure 4.11: Exemplary situation without the integrators: (a) Mixer I/Q (differential) currents, (b) mixer output ( $V_{mixer}$ ) and (c) phase steps for  $V_{mixer,fund}$  and CML-CMOS output ( $D_{out}$ ).

converter is composed of a two-stage differential amplifier followed by self-biased inverter chains (see the Appendix). This nonlinear process translates  $V_{mixer}$  to zero and  $V_{DD}$ , respectively, for negative and positive inputs. The output square wave of the converter is denoted by  $D_{out}$ . For two  $D_{out}$  waveforms, the (fundamental) phase difference is equal to the difference in their zero-crossing timing multiplied by  $\omega_{LO}$ . If all of the harmonics at the mixer output are filtered such that  $V_{mixer} = V_{mixer,fund}$ , the CML-CMOS converter preserves the input phase, which means the fundamental phase difference between two  $D_{out}$  waveforms is equal to that of the corresponding two  $V_{mixer}$  waveforms.

Unfortunately, if harmonic components appear at the mixer output  $(V_{mixer})$ , the CML-CMOS converter can significantly distort the phase response. Since the phase modulator has to operate over a wide frequency range, the mixer cannot employ a band-pass or low-pass filter at its output to reject the harmonics [73]. It will be explained in detail that in order to achieve a linear phase response at the CML-CMOS output  $(D_{out})$ , the two integrators at the mixer inputs are critical, creating triangle I/Q LO waveforms with appropriate magnitudes. The integrator schematic has been illustrated in Fig. 4.9(c).

If the integrators are not employed, the LO inputs of the I/Q mixers, as well as the in-phase and quadrature currents flowing into the mixer load, are square waves. The in-phase and quadrature (differential) mixer currents are simulated with square-wave LO drives swinging between 0.4 and 0.8 V. The simulated I/Q current waveforms are plotted in Fig. 4.11(a) for $PM_{code} = 0$, 16, 32, and 48. Since the two current waveforms resemble square waves with two distinct states and sharp transitions, the weighted sum of the I/Q currents and $V_{mixer}$ have zero-crossing points that match those of the dominant current waveform and do not necessarily change with $PM_{code}$. The corresponding $V_{mixer}$ waveforms for the four PM codes are plotted in Fig. 4.11(b). The two $V_{mixer}$ waveforms corresponding to $PM_{code} = 0$ and 16 have similar zero-crossing points, while $V_{mixer}$ for $PM_{code} = 32$ has zero-crossing points associated with a very low slope $(dV_{mixer}/dt)$. As a result, the PI performs poorly without the integrators: the phase steps are extremely fine at most of the PM codes and very coarse at $PM_{code} = 32$. The simulated phase steps for $V_{mixer,fund}$ and $D_{out}$ are plotted in Fig. 4.11(c) versus $PM_{code}$ from 0 to 64. The phase response of $V_{mixer,fund}$ is close to the ideal response (4.5), but the phase response of $D_{out}$ is significantly distorted and is not useful.

The designed integrator sink current $I_{int}$ is digitally tunable with 13 current states (SEL from 1 to 13) from 60 $\mu$A to 780 $\mu$A with a step size of 60 $\mu$A. The CMFB is critical for adjusting the PMOS current source in accordance with the sink current and maintaining the assigned dc level of 0.6 V. The integrator output in-phase and quadrature waveforms at 0.6 and 3.6 GHz are plotted respectively in Figs. 4.12(a) and (b) for four integrator configurations, SEL = 1, 5, 9, and 13. To achieve both good phase linearity and phase noise, the magnitude must be held at a moderate level. Generally speaking, using a low integrator current results in better phase linearity but worse phase noise performance. On the other hand, the phase linearity is expected to degrade if the magnitude of the mixer input waveform is excessively high, such that the mixer differential pairs operate mostly with full current steering as if driven by square waves. The full phase modulator, as illustrated in Fig. 4.9(a), is simulated at 0.6 and 3.6 GHz. It is found that SEL = 5 ($I_{int} = 300\ \mu$A) and



Figure 4.12: Integrator output I/Q waveform with frequency of (a) 0.6 and (b) 3.6 GHz. Four integrator currents are tested.

SEL = 13 ( $I_{int} \approx 780 \ \mu A$ ) are, respectively, appropriate for the two operation frequencies.

For phase modulator operation at 0.6 GHz, the simulated phase step of $D_{out}$ is plotted in Fig. 4.13(a), and the simulated noise at a frequency offset of 10 MHz is plotted in Fig. 4.13(b). The simulations are conducted (in one of the four quadrants) with $PM_{code}$ swept from 0 to 64 and integrator configurations SEL = 1, 5, 9, and 13. The phase step of $V_{mixer,fund}$ (simulated but not shown here) is close to the ideal response (4.5), regardless of the integrator current. The simulation results presented in Fig. 4.13 warrant the use of SEL = 5, which achieves relatively good noise and phase linearity. Using SEL = 13 results in a very nonlinear phase response with a phase step as coarse as $4^{\circ}$/code at $PM_{code} = 30$, and the noise also degrades around $PM_{code} = 30$. To explain this behavior, the 0.6-GHz mixer I/Q currents are plotted in Fig. 4.14(a) and $V_{mixer}$ is plotted in Fig. 4.14(b). The simulations are conducted with SEL = 13 at six phase codes $(PM_{code} = 0, 12, 24, 36, 48, 60)$. The current waveforms have many low-slope regions (e.g. at 0.3 ns), corresponding to complete mixer current steering. The $V_{mixer}$ slope at the zero-crossing points is lower when the I/Q currents are comparable (e.g. $PM_{code} = 24$). A zero-crossing point of $V_{mixer}$ with a low slope is sensitive to the circuit noise and control variables and yields both high phase noise and a coarse phase step when processed by the CML-CMOS converter. The abrupt jumps in the mixer current are caused by charge sharing during the switch transitions.

It is also observed in Fig. 4.13(b) that using SEL = 1 when the PI operates at 0.6 GHz



Figure 4.13: (a) Phase step for the 0.6-GHz CML-CMOS output $(D_{out})$ vs. $PM_{code}$. (b) Simulated $D_{out}$ noise.



Figure 4.14: The 0.6-GHz mixer (a) output I/Q current composition and (b) output voltage  $(V_{mixer})$  for six sample phase codes. Integrator SEL = 13.



Figure 4.15: The 0.6-GHz mixer (a) output I/Q current composition and (b) output voltage  $(V_{mixer})$  for six sample phase codes. Integrator SEL = 1.

results in the worst phase noise, at around -125 dBc/Hz. The mixer currents and $V_{mixer}$ for SEL = 1 are respectively plotted in Figs. 4.15(a) and (b). The mixer LO drive has a peak-to-peak value of only 50 mV with SEL = 1 [see Fig. 4.12(a)], so the mixer differential pair operates in the linear region, and the mixer I/Q currents follow the input drive and are also triangle waveforms, as shown in Fig. 4.15(a). It is observed in Fig. 4.15(b) that the zero-crossing point of $V_{mixer}$ shifts fairly uniformly with the phase code, which implies a linear $D_{out}$ phase response. The good phase linearity has been confirmed by simulation. However, the $V_{mixer}$ peak-to-peak magnitude is only 0.1 V and the slew rate (SR) at the zero-crossing point is only 0.12 V/ns. A 22-dB higher slew rate of 1.5 V/ns is shown in Fig. 4.14(b) at (SEL = 13, $PM_{code} = 0$), accounting for a noise improvement of 17 dB (to -142 dBc/Hz).
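The slew-rate comparison above can be reproduced with a short calculation. This is a minimal sketch that uses only the values quoted in the text; the helper name `ratio_db` is introduced here for illustration.

```python
import math

def ratio_db(a: float, b: float) -> float:
    """Express the amplitude-type ratio a/b in dB (20*log10)."""
    return 20.0 * math.log10(a / b)

sr_sel1 = 0.12    # V/ns, zero-crossing slew rate with SEL = 1
sr_sel13 = 1.5    # V/ns, zero-crossing slew rate with SEL = 13, PM_code = 0

sr_gain_db = ratio_db(sr_sel13, sr_sel1)

noise_sel1 = -125.0                 # dBc/Hz, quoted for SEL = 1
noise_sel13 = noise_sel1 - 17.0     # dBc/Hz, after the quoted 17-dB improvement

print(f"slew-rate ratio: {sr_gain_db:.1f} dB")
print(f"noise with SEL = 13: {noise_sel13:.0f} dBc/Hz")
```

The slew-rate ratio evaluates to about 22 dB, while the quoted noise improvement is 17 dB, so in these simulations the noise does not track the slew rate one-to-one.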

The same simulations are conducted at a PI frequency of 3.6 GHz. The simulated $D_{out}$ phase step and noise are plotted respectively in Fig. 4.16(a) and (b). At such a high frequency, the maximum reachable peak-to-peak magnitude at the integrator output is only 90 mV [see Fig. 4.12(b)]; therefore, the mixer differential pair always operates in the linear region, with the I/Q mixer currents following the input signals. It is observed in Fig. 4.16(b) that using SEL = 1 results in high $D_{out}$ noise. In this case, both the mixer output and input magnitudes are lower than 20 mV, and the integrator output voltage, previously shown



Figure 4.16: (a) Phase step for the 3.6-GHz CML-CMOS output $(D_{out})$ vs. $PM_{code}$. (b) Simulated $D_{out}$ noise.

in Fig. 4.12(b), is dominated by the coupled voltage from the integrator input and does not look like a triangle waveform. Therefore, the highest integrator current with SEL = 13 should be used. The corresponding mixer I/Q currents and $V_{mixer}$ are plotted in Fig. 4.17(a) and (b), respectively. A slew rate (SR) of around 1.5 V/ns is obtained at the zero-crossing point. Although this slew rate is almost the same as the SR achieved at an LO frequency of 0.6 GHz with SEL = 13 and $PM_{code} = 0$, the simulated noise degrades by 7 dB to -135 dBc/Hz due to the more frequent noise injection. Generally speaking, the $D_{out}$ noise degrades as the operation frequency $\omega_{LO}$ increases and improves as the zero-crossing SR increases.

In summary, a high SR for the mixer output  $(V_{mixer})$  at the zero-crossing point contributes to good phase noise and linearity. To achieve this, the integrator output should be a triangle waveform with a proper magnitude (e.g.  $V_{pp} = 200 \text{ mV}$ ), such that the mixer differential pairs operate in the linear region to the maximum extent.

## 4.4 Interposer Designs

Three transformers are designed on five-layer high-density interconnection (HDI) PCBs. The three interposers target three frequency bands, LB, MB, and HB, respectively centered at 0.9, 1.5, and 2.5 GHz. The substrate material is TU768 with a dielectric constant of 4.3 and loss tangent of 0.019. The minimum metal width and spacing are 50  $\mu$ m and the copper thickness is 17  $\mu$ m. While an edge-side transformer with the minimum spacing cannot



Figure 4.17: The 3.6-GHz mixer (a) output I/Q current composition and (b) output voltage  $(V_{mixer})$  for six phase codes. Integrator SEL = 13.

achieve a sufficient coupling factor, the thin dielectric thickness of 60  $\mu$ m supports the use of broadside-coupling transformers. As introduced, the designed inverse Class-D power core prefers a load impedance of 8+7j  $\Omega$  for good power and efficiency, so the transformers are designed with a 1:2 turns ratio, which ideally presents a load impedance of 12.5  $\Omega$  to the PA core. The one-turn primary winding connected to the CMOS chip uses the second PCB metal layer, and the secondary winding, connected to the PCB motherboard and the 50- $\Omega$ load, uses the first and the third metal layers.

Three-dimensional illustrations of the designed LB, MB, and HB transformers are shown in Fig. 4.18. The passive structures are simulated by a full-wave EM simulator (ADS Momentum). The widely used simplified lumped model [128] is shown in Fig. 4.19(a), and the extracted parameters for the three designed transformers and some variations are provided in Fig. 4.19(b). The parameters are extracted based on the simulated Y-parameters, and include the primary- and secondary-winding inductances ($L_p$ and $L_s$) and quality factors ($Q_p$ and $Q_s$), the magnetic coupling factor ($k \equiv M/\sqrt{L_s L_p}$), and the coupling capacitor ($C_c$). The maximum power gain $G_{p,max}$ is calculated. Although $G_{p,max}$ is extensively used in microwave circuit designs, this figure of merit, which allows an arbitrary load impedance, does not



Figure 4.18: Three-dimensional illustrations for (a) LB (b) MB, and (c) HB transformers on HDI interposers.



Figure 4.19: Transformer on PCB interposer: (a) lumped model, and (b) extracted parameters.

properly characterize the transformer in this work, where only an SMD capacitor  $(C_s)$  is placed in parallel with the 50- $\Omega$  load. Alternatively, a new figure of merit  $G_{TXF,max}$  is used here, defined by the maximum power gain  $G_p$  for the present load condition. The simulated  $G_{TXF,max}$  for the three transformers are approximately -0.5 dB, very low-loss and close to  $G_{p,max}$ . The simulation shows that the quality factors of the transformer inductors are higher than 40.

Parasitic extraction reveals a 2-pF parasitic capacitance on each of the PA output nodes and a coupling capacitance of 0.8 pF. The total differential capacitance, denoted by $C_{p}$, is 1.8 pF. The compact flip-chip connection prevents the use of an external parallel SMD capacitor at the transformer primary side, so $C_p$ holds a constant value over the three designs. Since only a single CMOS die is used with the three packages, the transformer footprint must shrink on the MB and HB packages to reduce the inductance and accommodate higher operating frequencies. The LB, MB, and HB transformers are designed with primary inductances of 4.0, 2.0, and 1.0 nH, respectively, and the achieved coupling factors are respectively 0.79, 0.72, and 0.60. In general, achieving a good coupling factor is more difficult when the transformer has a smaller footprint. The low coupling factor can be the main loss contributor, in addition to the metal resistance. Using the same inductance for the primary winding, multiple test structures are simulated to achieve the highest coupling factor and, therefore, better bandwidth and insertion loss. Increasing the transformer size enhances the coupling factor, but the inductance also increases and results in a lower operating frequency. To compensate, a wider trace width has to be used. This approach can increase the coupling factor k to some extent, at the cost of a higher parasitic capacitance between the two windings.
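The capacitance bookkeeping above can be sketched in a few lines of Python. The sketch assumes that the two 2-pF node capacitances appear in series across the differential port and that the 0.8-pF coupling capacitance sits directly across it; this topology is an interpretation that reproduces the quoted 1.8 pF and is not spelled out in the text. The uncoupled primary-tank resonance $1/(2\pi\sqrt{L_pC_p})$ is printed only to illustrate why a fixed $C_p$ forces a smaller $L_p$ at higher bands; it is not a design frequency from the text.

```python
import math

def series_c(c1: float, c2: float) -> float:
    """Series combination of two capacitances."""
    return c1 * c2 / (c1 + c2)

C_NODE = 2.0e-12      # F, parasitic capacitance on each PA output node
C_COUPLE = 0.8e-12    # F, coupling capacitance (assumed to be across the port)

# Assumed topology: node capacitances in series, coupling capacitance in parallel.
c_p = series_c(C_NODE, C_NODE) + C_COUPLE

primary_inductance = {"LB": 4.0e-9, "MB": 2.0e-9, "HB": 1.0e-9}   # H
tank_freq_ghz = {
    band: 1.0 / (2.0 * math.pi * math.sqrt(l_p * c_p)) / 1e9
    for band, l_p in primary_inductance.items()
}

print(f"C_p = {c_p * 1e12:.1f} pF")
for band, f in tank_freq_ghz.items():
    print(f"{band}: uncoupled primary-tank resonance ~ {f:.2f} GHz")
```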

In our design, which targets a low transformer loss, the two LC-tanks at the transformer primary and secondary windings use different resonance frequencies $(L_pC_p \neq L_sC_s)$. The corresponding analysis is provided to relate the transformer input impedance, or equivalently the PA load impedance $(Z_{in})$, to the transformer parameters. (Recall that an inductive $Z_{in}$ of $8+7j\ \Omega$ is desired.) For ease of analysis, the transformer coupling capacitor is neglected, as in many previous works [128]. The low winding resistances are also ignored. With the model illustrated in Fig. 4.19(a), $Z_{in}$ is equal to a resistive load $R_p$ in parallel with two reactive loads $jX_{p1}$ and $jX_{p2}$ (i.e. $Z_{in} = R_p//jX_{p1}//jX_{p2}$), where $R_p$ and $jX_{p1,2}$ are given by

$$R_{\rm p}(\omega) = \frac{L_p R_{\rm ant}}{k^2 L_s} (P^2 + Q^2)$$
(4.6)

$$jX_{\rm p1}(\omega) = \frac{-R_p(\omega)}{R_{\rm ant}} \times \left[\frac{1}{j\omega C_s}//j\omega L_s(1-k^2)\right]$$
(4.7)

$$jX_{p2}(\omega) = \left(j\omega C_{p} + \frac{1}{j\omega L_{p}(1-k^{2})}\right)^{-1},$$
(4.8)

where  $P \equiv 1 - \omega^2 L_s C_s (1 - k^2)$  and  $Q \equiv \omega L_s (1 - k^2) / R_{ant}$ .  $R_{ant}$  is the 50- $\Omega$  antenna load. In conventional designs with  $L_p C_p = L_s C_s$ , the resonance frequencies with a purely real input impedance can be obtained by solving  $X_{p1}(\omega) + X_{p2}(\omega) = 0$ . The three solution frequencies are expressed by

$$\omega_{1,2} = \frac{1}{\sqrt{L_s C_s}} \sqrt{\frac{1}{1-k^2} - \frac{\eta}{2} \mp \sqrt{\frac{\eta^2}{4} + \frac{k^2}{(1-k^2)^2} - \frac{\eta}{1-k^2}}}$$
(4.9)

and $\omega_3 = 1/\sqrt{L_s C_s (1-k^2)}$, where $\eta \equiv L_s/(C_s R_{ant}^2)$. If $\eta$ is sufficiently low, $\omega_1$ can be approximated by $1/\sqrt{L_s C_s (1+k)}$, $\omega_2$ can be approximated by $1/\sqrt{L_s C_s (1-k)}$, and $\omega_1 < \omega_3 < \omega_2$. As $\eta$ increases, $\omega_1$ increases while $\omega_2$ decreases. The two frequencies eventually become equal (lower than $\omega_3$) and then transform into a complex conjugate pair. This known behavior is not applicable here because $L_p C_p \neq L_s C_s$.
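The behavior of the three resonance frequencies versus $\eta$ can be explored numerically. The following minimal Python sketch assumes that (4.9) has the nested-radical form, with the inner square root under the outer one, and that $\omega_1$ is the lower-frequency root, as the ordering $\omega_1 < \omega_3 < \omega_2$ implies. Frequencies are normalized to $1/\sqrt{L_sC_s}$, and the function name `resonances` is hypothetical.

```python
import cmath
import math

def resonances(k: float, eta: float):
    """Return (w1, w2, w3) normalized to 1/sqrt(Ls*Cs) for the matched case LpCp = LsCs.

    w1 and w2 are returned as complex numbers so that the transition to a
    complex-conjugate pair at large eta is visible.
    """
    a = 1.0 / (1.0 - k**2)
    inner = cmath.sqrt(eta**2 / 4.0 + (k * a) ** 2 - eta * a)
    w1 = cmath.sqrt(a - eta / 2.0 - inner)   # lower root
    w2 = cmath.sqrt(a - eta / 2.0 + inner)   # upper root
    w3 = math.sqrt(a)
    return w1, w2, w3

k = 0.79
for eta in (1e-9, 1.0, 2.2):
    w1, w2, w3 = resonances(k, eta)
    print(f"eta = {eta:g}: w1 = {w1:.3f}, w2 = {w2:.3f}, w3 = {w3:.3f}")
```

For a vanishing $\eta$ the two roots approach $1/\sqrt{1+k}$ and $1/\sqrt{1-k}$ in normalized units, and for $k = 0.79$ they have already become a complex pair at $\eta = 2.2$.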

To estimate the transformer input impedance $Z_{in}$ graphically, $R_p$, $X_{p1}$, $X_{p2}$, and $X_{p1}//X_{p2}$ expressed in (4.6)–(4.8) are plotted in Fig. 4.20 versus the operating frequency. The LB design parameters $k = 0.79$, $L_p = 4.0$ nH, $C_p = 1.8$ pF, $L_s = 13.1$ nH, $C_s = 2.3$ pF, and $R_{ant} = 50\ \Omega$ are used. Several critical points are annotated parametrically. First of all, $R_p$ at low frequency, denoted by $R_{p0}$, is given by $R_{ant}L_p/(k^2L_s)$. As the frequency increases, $R_p$ decreases and reaches its minimum value of $R_{p,min} = R_{p0}(x - x^2/4)$ at $\omega_x = \omega_0\sqrt{1 - 0.5x}$, where $\omega_0 = 1/\sqrt{L_sC_s(1-k^2)}$ and $x \equiv \eta(1-k^2)$. With the previously mentioned parameters, $R_{p0} = 24.5\ \Omega$, $x = 0.86$, $\omega_0 = 9.4 \times 10^9$ rad/s (1.5 GHz), $\omega_x = 7.1 \times 10^9$ rad/s (1.1 GHz), and $R_{p,min} = 16.5\ \Omega$. As the operating frequency increases beyond $\omega_x$, $R_p$ increases rapidly and can be approximated by $R_p \approx R_{p0}\omega^4 L_s^2 C_s^2 (1-k^2)^2$. Secondly, $X_{p1}(\omega)$ is equal to the scaled input reactance of a capacitance $C_s$ in parallel with an inductance of $L_s(1-k^2)$, where the frequency-dependent scaling factor is $-R_p(\omega)/R_{ant}$. $X_{p1}(\omega)$ is negative for $\omega < \omega_0$, positive for $\omega > \omega_0$, and eventually approaches the cubic approximation $R_{p0}L_s^2C_s(1-k^2)^2\omega^3/R_{ant}$. Finally, $X_{p2}(\omega)$ is equal to the input reactance of a capacitor $C_p$ in parallel with an inductor of $L_p(1-k^2)$. The resonance frequency is $\omega_p = 1/\sqrt{L_pC_p(1-k^2)}$ (3.1 GHz) with the parameters used.

The input impedance is the three components in parallel: $R_p//jX_{p1}//jX_{p2}$. At $\omega_x$, where $R_p$ reaches $R_{p,min}$, $(X_{p1}//X_{p2}) \approx \omega_x L_p(1 - k^2/2)$. At $\omega_0$, where $X_{p1}$ is infinite, $X_{p1}//X_{p2} = X_{p2} \approx \omega_0 L_p(1 - k^2)$ and $R_p = xR_{p0}$. Finally, the load impedances at $\omega_x$ and $\omega_0$ are respectively $[R_{p,min}//j\omega_x L_p(1 - k^2/2)]$ and $[xR_{p0}//j\omega_0 L_p(1 - k^2)]$. Plugging in the LB design parameters, $Z_{in}$ is calculated at $\omega_x$ and $\omega_0$ to be 10+8j $\Omega$ and 7+10j $\Omega$, respectively. At both frequencies, $Z_{in}$ is close to the preferred value.
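The LB numbers quoted above can be reproduced with a short script. This is a minimal sketch under the same simplifications as the analysis (coupling capacitor and winding resistances neglected). It takes $\omega_x = \omega_0\sqrt{1-0.5x}$, the form that reproduces the quoted $7.1 \times 10^9$ rad/s, and uses $xR_{p0}$ for the parallel resistance at $\omega_0$; the variable and helper names are illustrative only.

```python
import math

# LB design values quoted in the text.
K = 0.79
L_P, C_P = 4.0e-9, 1.8e-12      # H, F
L_S, C_S = 13.1e-9, 2.3e-12     # H, F
R_ANT = 50.0                    # ohm

def parallel(z1: complex, z2: complex) -> complex:
    """Impedance of two elements in parallel."""
    return z1 * z2 / (z1 + z2)

leak = 1.0 - K**2                              # (1 - k^2)
r_p0 = R_ANT * L_P / (K**2 * L_S)              # low-frequency R_p
eta = L_S / (C_S * R_ANT**2)
x = eta * leak
w_0 = 1.0 / math.sqrt(L_S * C_S * leak)
w_x = w_0 * math.sqrt(1.0 - 0.5 * x)           # assumed form, see lead-in
r_p_min = r_p0 * (x - x**2 / 4.0)
w_p = 1.0 / math.sqrt(L_P * C_P * leak)

# Approximate load impedances at the two annotated frequencies.
z_at_wx = parallel(r_p_min, 1j * w_x * L_P * (1.0 - K**2 / 2.0))
z_at_w0 = parallel(x * r_p0, 1j * w_0 * L_P * leak)

print(f"R_p0 = {r_p0:.1f} ohm, x = {x:.2f}, R_p,min = {r_p_min:.1f} ohm")
print(f"w_0 = {w_0:.2e} rad/s, w_x = {w_x:.2e} rad/s, w_p = {w_p:.2e} rad/s")
print(f"Z_in(w_x) ~ {z_at_wx:.1f} ohm, Z_in(w_0) ~ {z_at_w0:.1f} ohm")
```

Both impedances land within about 1 $\Omega$ of the rounded values 10+8j $\Omega$ and 7+10j $\Omega$ quoted in the text.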

The simulated input impedances of the three packages, employing the simulated S-parameters and with the operating frequency swept, are plotted in Fig. 4.21. Some operating frequencies are annotated (in GHz). The input impedance is designed to be around $8 + 7j\ \Omega$ at the operating frequencies of each of the three packages. The center frequencies of the LB, MB, and HB packages are 1.1, 1.8, and 2.5 GHz, respectively. According to the


Figure 4.20:  $R_p$ ,  $X_{p1}$ ,  $X_{p2}$ , and  $X_{p1}//X_{p2}$  as functions of the operating frequency. The input impedance  $Z_{in} = R_p//jX_{p1}//jX_{p2}$ .



Figure 4.21: Simulated input impedances for LB, MB,  $HB_A$  ( $C_s = 1.5$  pF), and  $HB_B$  ( $C_s = 0.5$  pF) vs. the operating frequency.

input impedance response shown in Fig. 4.21, the LB, MB, and HB packages are expected to have good power and efficiency from 0.4 to 1.7 GHz, 0.9 to 2.4 GHz, and 1.6 to 3.2 GHz, respectively.

To improve the performance at higher frequencies (e.g. 3.5 GHz), a smaller  $C_s$  of 0.5 pF can be used for the HB package  $(HB_B)$ . It can be inferred that although  $L_pC_p = L_sC_s$  with  $C_s = 0.5$  pF, there should only be a single resonance frequency with  $\text{Imag}(Z_{in}) = 0$  according to (4.9). Also, the transformer loss at 2.5 GHz should be higher with  $C_s = 0.5$  pF. Observing the two HB curves in Fig. 4.21, it is noticed that the filled square-symbol curve, associated with  $C_s = 1.5$  pF  $(HB_A)$ , has an input impedance of about  $10 + 8j \Omega$  across a wider frequency range (i.e. from 1.6 to 3.2 GHz) than that of the empty diamond-symbol curve, associated with  $C_s = 0.5$  pF  $(HB_B)$ . However, as the frequency further increases, the  $HB_A$  curve has its reactance increasing faster versus frequency than the  $HB_B$  curve, and the output power and efficiency should be higher for the  $HB_B$  package at higher frequencies (e.g. 3.5 GHz).

Fig. 4.22(a) plots the simulated peak power of the designed LB, MB, $HB_A$ ($C_s = 1.5$ pF), and $HB_B$ ($C_s = 0.5$ pF) packages. The simulated results with some variations introduced in the passive designs, as listed in Fig. 4.19(b), are also plotted in Fig. 4.22(a). Fig. 4.22(a) shows that higher power can be obtained with a higher coupling factor k. Fig. 4.22(b) plots the simulated DE of the used designs. The simulation exhibits peak power/DE of 30.4 dBm/68%, 29.8 dBm/70%, and 29.7 dBm/68%, respectively for the LB/MB/HB packages at 1.1, 1.6, and 2.3 GHz. The second HB solution ($HB_B$) with $C_s = 0.5$ pF improves both power and efficiency for frequencies higher than 3.4 GHz. In simulation, the LB, MB, and $HB_B$ packages together achieve a wide bandwidth from 0.4 to 4 GHz with output power higher than 26 dBm and drain efficiency above 26%.

### 4.5 Measured Results

The chip photograph and the front/back view of the LB package are respectively shown in Fig. 4.23(a) and (b). The chip is fabricated in the TSMC 65-nm CMOS LOGIC process without MIM capacitors, and the top-metal thickness is only 0.9 $\mu$m. The active area on chip is about 1.7 mm<sup>2</sup>, and the number of available flip-chip bumps is limited by the chip area to less than 30. This limitation warrants streaming the 8-bit amplitude and phase codes into the chip via two serial 2.5-Gb/s LVDS inputs. An on-chip scan chain controls the static setup of the transmitter, including the bias currents for the integrator, phase interpolator, and the LO/CLK receivers. As shown in Fig. 4.23(b), the interposer is about 7 × 14 mm<sup>2</sup>, limited by the 7 × 14 ball grid array (BGA) on its bottom, which is used to interface with the PCB motherboard through a spring pin socket (Ironwood CBT-BGA 6001). Fig. 4.23(c) shows the photographs of the three interposers after die attachment. The interposers have a large unused area on the top and inner layers to accommodate the transformer designs. This warrants the proposed strategy of moving the passive structures out of the chip for better power and efficiency and a more generic CMOS transmitter design.



Figure 4.22: Simulated (a) peak power and (b) drain efficiency for the designed transmitter incorporating the on-interposer passives.





Figure 4.23: (a) Chip photograph. (b) Front and back view of the LB package. (c) Photographs of the LB, MB, and HB interposers after die attachment.



Figure 4.24: Measured peak power and drain/system efficiency for the LB, MB, $HB_A$, and $HB_B$ packages.

The measured peak power, drain efficiency (DE), and system efficiency (SE) of the LB, MB, $HB_A$, and $HB_B$ packages with a supply voltage of 2.5 V are shown in Fig. 4.24. The $HB_A$ package uses a $C_s$ of 1.5 pF and the $HB_B$ package uses a $C_s$ of 0.5 pF. The SE includes the power consumption of all components on the all-digital Tx. The power is measured via a spectrum analyzer calibrated with a power meter. In this measurement, both the AM and PM codes are set to the maximum value of 255 ($AM_{code} = PM_{code} = 255$). The two LVDS signals are generated by an FPGA board (Xilinx VC707) connected to the PCB motherboard. The signals are fed into the chip via the socket and the interposer. The LO and the two clock signals are sinusoidal and generated by three signal generators sharing the same 10-MHz reference. The LB package has its best power/DE/SE of 29.2 dBm/60%/56% at 1.1 GHz. The corresponding peak performances for the MB and $HB_A$ packages are 28.8 dBm/56%/52% and 28.4 dBm/58%/52%, respectively at 1.5 and 2.4 GHz. The $HB_B$ package is able to output 25.5 dBm at 3.5 GHz with a good DE of 40% and SE of 35%. The collective bandwidth of the three packages (LB/MB/$HB_B$) covers 0.7 to 3.5 GHz, with peak power higher than 25.5 dBm and DE higher than 40%.



Figure 4.25: Normalized Tx output magnitude and phase vs. the AM code.

Fig. 4.25 shows the normalized output magnitude and phase of the MB package at 2.4 GHz. The AM code is swept quasi-statically from 0 to 255, while the PM code is set at  $PM_{code} = 0$ . In this measurement, the LVDS signal is dc for the phase and a bit sequence with a repetition period of 4 ns for the AM stream. The compressive AM-AM response results from the inverse Class-D amplitude modulation scheme, where the switch conductance is proportional to the AM code. The AM-AM response is very similar to the illustrative curve shown in Fig. 4.5(c) with  $R_L = 10$ . For a desired transmitter complex output, the measured AM-AM table is consulted to select the proper AM code corresponding to an approximated output magnitude, while the AM-PM distortion associated with the selected AM code will be compensated later by the phase modulator.

Fig. 4.26 shows the measured phase response of the phase modulator. In this measurement, the PM code is swept quasi-statically from 0 to 255 while the AM code is set at 255. Fig. 4.26(a) and (b) plot the output phase and phase step when the LO frequency is 0.6 GHz, with the integrator current set respectively at 300 $\mu$A (SEL = 5) and 780 $\mu$A (SEL = 13). The phase responses cover the full 360°, and the phase steps exhibit repetitive patterns every 90° (or 64 PM codes). Larger steps are found around PM codes of 32, 96, 160, and 224, and smaller steps are found around PM codes of 0, 64, 128, and 192. These nonlinear behaviors are expected with the designed phase modulator and have been explained. It can be observed by comparing Fig. 4.26(a) and (b) that the phase response at SEL = 5, with a maximum



Figure 4.26: Phase responses at (a) 0.6 GHz with SEL = 5, (b) 0.6 GHz with SEL = 13, and (c) 3.6 GHz with SEL = 13.

phase step of about $2.5^{\circ}$/step, is more linear than that at SEL = 13. The performance deviation caused by the integrator setup has also been explained. At 3.6 GHz, a higher integrator current should be used, and the measured phase response with SEL = 13 is plotted in Fig. 4.26(c). The obtained maximum phase step is also about $2.5^{\circ}$/step. Fig. 4.26 demonstrates the wideband capability of the phase interpolator, which is able to work from 0.6 to 3.6 GHz. For a desired complex output, the PM-PM table is consulted to correct the phase distortion introduced by the AM modulator and to approximate the required output phase. The PM-AM effect is very minor and can be neglected.

Figure 4.27: Measured 64-QAM constellation and Tx performance at the six testing frequencies: 0.6, 1.2, 1.8, 2.4, 3.0, and 3.6 GHz.

After characterizing both the amplitude and phase modulators, modulation tests, including those for 64-QAM and 20-MHz WLAN and LTE, were performed at six RF frequencies with the LB, MB, and $HB_B$ packages. The LB, MB, and $HB_B$ packages are responsible for supporting frequencies of 0.6/1.2 GHz, 1.8/2.4 GHz, and 3.0/3.6 GHz, respectively. It is important to note that the transmitter AM-AM, AM-PM, and PM-PM responses, together with the dc settings, depend on the operating frequency. Therefore, each operating frequency has its dedicated AM-AM, AM-PM, PM-PM, and dc setup tables.

The measured 64-QAM constellations (via a Keysight spectrum analyzer and VSA software) at the six frequencies are plotted in Fig. 4.27. The modulation symbol rate is 62.5 MS/s, corresponding to a data rate of 375 Mb/s. The measured average power, DE and SE, and constellation error vector magnitude (EVM) are annotated in Fig. 4.27. Good power, efficiency, and linearity in terms of EVM are achieved at the six frequencies. For example, the MB package generates an output power of 23.3 dBm at 2.4 GHz with a drain efficiency of 34% and an EVM of -33 dB. The modulation output power is lower than the peak power due to a signal PAPR of 3.7 dB.

To demonstrate a wideband and multi-standard Tx capability, the three packages are programmed to output a 20-MHz WLAN (802.11g) signal at the six testing frequencies. The OFDM-based signal has a high PAPR value of 8.7 dB, which, without further compression, will result in a low transmitter power and efficiency. Recall that, in inverse Class-D designs with an output magnitude controlled by the switch conductance, the transmitter has its DE proportional to the output voltage. Thus, the transmitter has only 37% of its peak efficiency when operated with an 8.7-dB power back-off. To improve both the output power and efficiency, the ideal WLAN signal is scaled such that some of its output magnitudes are higher than the maximum transmitter output $(AM_{code} = 255)$. The signal is then clipped by replacing the out-of-bounds magnitudes with the maximum transmitter output. Scaling and clipping the signal causes its PAPR to decrease at the expense of increased signal distortion, degraded EVM, and a higher out-of-band noise floor. Fig. 4.28 plots the simulated EVM and noise (at an 80-MHz offset) of a WLAN signal versus the degree of signal scaling/clipping in terms of PAPR. The simulated EVM and noise of the signal, as approximated by the quantized outputs of the amplitude and phase modulators, are also plotted. The AM-AM, AM-PM, and PM-PM tables measured at 2.4 GHz are used in the simulation. Following the approximation, the simulated noise degrades roughly 15 dB to approximately -120 dBc/Hz. The noise is dominated by the quantization noise of the all-digital transmitter and does not increase with PAPR, unlike the original signal. On the other hand, the transmitter output resolution is sufficient to approximate the original signal with little EVM degradation if the original signal has an EVM higher than -40 dB.
To achieve a good output power and efficiency while the EVM still meets the specification of -25 dB with sufficient margin, the transmitter is programmed to approximate a WLAN signal with a PAPR of 6.2 dB.
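The scale-and-clip step and the back-off efficiency estimate can be illustrated with a small Python sketch. It is not the simulation behind Fig. 4.28: a complex Gaussian sequence stands in for the OFDM baseband signal, the clip level is simply placed 6.2 dB above the original RMS magnitude, and all function names are hypothetical.

```python
import math
import random

def papr_db(samples):
    """Peak-to-average power ratio of a complex sequence, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))

def clip_magnitude(samples, limit):
    """Replace out-of-bounds magnitudes with the limit while keeping the phase."""
    return [s if abs(s) <= limit else s * (limit / abs(s)) for s in samples]

def relative_de(backoff_db):
    """DE relative to peak when DE is proportional to the output voltage."""
    return 10.0 ** (-backoff_db / 20.0)

rng = random.Random(0)   # fixed seed so the run is repeatable
signal = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]

rms = math.sqrt(sum(abs(s) ** 2 for s in signal) / len(signal))
limit = rms * 10.0 ** (6.2 / 20.0)      # clip level 6.2 dB above the original RMS
clipped = clip_magnitude(signal, limit)

print(f"PAPR before clipping: {papr_db(signal):.1f} dB")
print(f"PAPR after clipping : {papr_db(clipped):.1f} dB")
print(f"relative DE at 8.7-dB back-off: {relative_de(8.7):.2f}")
```

The clipped PAPR ends up slightly above 6.2 dB because clipping also lowers the average power. The last line reproduces the 37% figure quoted for an 8.7-dB back-off.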

Finally, the measured output power, drain/system efficiency, and EVM of the three packages (LB, MB, and $HB_B$) at the six frequencies are plotted in Fig. 4.29. At the standard frequency of 2.4 GHz, the transmitter achieves a high output power of 20.7 dBm with an EVM better than -30 dB. The drain and system efficiencies are respectively 25% and 21%. If the transmitter is instead centered at 1.8 or 3.0 GHz, similar output power and efficiency can be obtained while the EVMs still meet specification. The measured output spectra at 2.4 GHz and the demodulated 64-QAM constellation from the OFDM subcarriers are plotted in Fig. 4.30. The spectra meet the required mask, but the noise levels are high, at -118 dBc/Hz at an 80-MHz offset. The noise level is about 15 dB higher than that of the state-of-the-art noise-filtering all-digital Tx [73], which employs a higher symbol rate of 1 GS/s and mixed-signal filtering.

The 20-MHz 64-QAM LTE uplink signal is used for the third modulation test. Similarly, the LTE baseband signal has a high PAPR of 8.9 dB and must be scaled and clipped to support good output power and efficiency. Although the LTE Single-Carrier Frequency-Division Multiple Access (SC-FDMA) technique allows using a lower PAPR than the WLAN OFDM, excessively scaling/clipping the signal can still result in a violation of linearity and ACLR requirements. Fig. 4.31(a) shows the simulated EVM and spectrum noise of the original LTE signal versus the signal PAPR. The counterpart EVM/noise at the Tx output is also plotted, again using the measured AM-AM, AM-PM, and PM-PM tables at 2.4 GHz.



Figure 4.28: Simulated EVM and noise for an ideal and quantized (by the all-digital Tx) WLAN signal vs. the signal PAPR.



Figure 4.29: Measured WLAN performance (power, DE/SE, EVM) at 2.4 GHz and other frequencies (i.e. 0.6, 1.2, 1.8, 3.0, 3.6 GHz).



Figure 4.30: Measured WLAN spectrum and demodulated 64-QAM constellation at 2.4 GHz (MB).

To achieve an EVM of -30 dB, the signal PAPR can be as low as 4 dB, and the dominant quantization noise is at approximately -117 dBc/Hz. Regulations on adjacent channel leakage power ratio (ACLR) also require attention and are simulated versus the signal PAPR, before and after the transmitter quantization. The results are plotted in Fig. 4.31(b). To achieve an  $ACLR_1$  lower than -30 dB and  $ACLR_2$  lower than -36 dB with sufficient margin, the transmitter being measured is programmed to output an LTE signal with a PAPR of 4.1 dB.

The measured Tx output power, drain and system efficiency, EVM, and  $ACLR_{1,2}$  on both sides of the RF carrier are summarized in Fig. 4.32. For all the six test frequencies, the output power is higher than 20 dBm with a DE above 25% and SE above 20%. The measured EVM and ACLRs meet specification. Among the six testing frequencies, the MB package at 1.8 GHz achieves its highest output power of 24 dBm with an excellent SE of 30%. The measured spectrums and the demodulated 64-QAM constellations at two testing frequencies, 1.2 and 3.6 GHz, are plotted in Fig. 4.33. The spectrums meet the ACLR specifications with substantial margin, and the output noise is approximately -115 dBc/Hz at an 80-MHz frequency offset for all the tested frequencies.



Figure 4.31: Simulated (a) EVM, noise and (b) ACLR for an ideal and quantized 20-MHz LTE signal vs. the signal PAPR.



Figure 4.32: Measured 20-MHz LTE power, drain/system efficiency, ACLR, and EVM at the six testing frequencies from 0.6 to 3.6 GHz.



Figure 4.33: Measured LTE spectrum and demodulated 64-QAM constellation at (a) 1.2 GHz (LB), (b) 2.4 GHz (MB), and (c) 3.6 GHz  $(HB_B)$ .

# 4.6 Design of the Band-Switching Interposer

#### Overview

The photograph of the proposed single-output, band-selecting Tx package is shown in Fig. 4.34(a), and the simplified schematic is illustrated in Fig. 4.34(b). The size of the interposer package is  $2 \times 2 \times 0.05 \ cm^3$ . The 6-layer HDI interposer uses minimum trace spacing and width of 50  $\mu$ m, and the minimum layer thickness is 60  $\mu$ m. The interposer disperses the signals/supplies of the CMOS ICs to a 1-mm pitch ball-grid array (BGA) on the interposer backside to match connections on the coarse-pitch PCB motherboard. Substantial design space is available on the interposer for high-quality SMD and PCB passives.

The band-switching scheme illustrated in Fig. 4.34(b) is as follows: When one of the three sub-Txs operates, the two switches on both sides of its primary winding are turned off, and the other four switches, associated with the two idle sub-Txs, are turned on (short-circuited). The power supplies of the two idle sub-Txs, fed through the transformer center tap, are also grounded by SPDTs.

The main design variables in this work are the three external SMD tuning capacitors ($C_L$, $C_M$, and $C_H$) connected to the secondary windings of the three transformers. In [59], these SMD capacitors are simply used to resonate with the secondary windings. However, their roles in this reconfigurable Tx package are more profound. The capacitors and the interposer jointly determine the load impedance presented to the PA core and the bandwidth of the operating sub-Tx. Therefore, the performances of the three sub-Txs are affected by all three external capacitors. The three PCB transformers on the three packages in [59] have been optimized to cover different frequency bands and collectively cover a wide bandwidth. The transformers in this work adopt the same dimensions as those in [59], so the extracted parameters are the same and are listed in Fig. 4.19(b). Completely redesigning the transformers could improve the overall performance, but the optimization would be complex. This is because each transformer needs to be optimized for the loss/bandwidth of its associated Tx, and its loading effect on the other sub-Txs when the associated Tx is turned off also has to be considered.

At a supply voltage of 2.5 V, the ideal output of an inverse Class-D switching PA is a half-sinusoidal waveform swinging between 0 V and 7.5 V [59]. Because the voltages at the PA drain nodes are always positive, the drain-bulk junctions of the band-switching switches are always reverse-biased, which avoids the undesired junction leakage of typical bulk-CMOS switches. In addition, the operating PA core sees a load impedance of around 10 $\Omega$, stepped down by the transformer, so the "off" switches do not have to withstand the substantial peak-to-peak swing of 20 V present at the 50-$\Omega$ antenna load. (Conventional PA switches are usually deployed at the antenna load [55].) As a result, the switch design can be simplified. However, the transformer primary windings of the two idle Txs are short-circuited by the band-selecting switches, and the non-zero switch on-resistance degrades the output power of the operating Tx because it is stepped up by the transformer and presents a series resistance to the antenna load. The degradation is not significant with a low switch on-resistance (i.e., $< 1\ \Omega$).

Figure 4.34: (a) Photograph and (b) block diagram and schematic of the proposed single-output wideband Tx package.

Figure 4.35: Simplified schematic of the LB sub-Tx.
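The benefit of placing the switches on the low-impedance primary side can be illustrated with a short calculation. This is a rough sketch that assumes a sinusoidal voltage and a 30-dBm output power (an assumed round number that reproduces the 20-V swing quoted above); the helper name is hypothetical.

```python
import math

def peak_to_peak_swing(p_out_dbm, r_load):
    """Peak-to-peak sinusoidal voltage across r_load at a given output power."""
    p_watt = 10 ** ((p_out_dbm - 30) / 10)
    return 2 * math.sqrt(2 * p_watt * r_load)

# Assumed 30-dBm output; 50 ohm is the antenna load, 10 ohm the stepped-down load.
for r_load in (50.0, 10.0):
    print(f"{r_load:4.0f} ohm: {peak_to_peak_swing(30.0, r_load):.1f} V peak-to-peak")
```

For the same power, the swing scales with the square root of the impedance, which is why the switches at the transformer primary see a much smaller voltage than a switch placed at the antenna.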

#### Operation of the LB Sub-Tx

Fig. 4.35 presents a schematic used to model the operation of the LB sub-Tx. In this case, a low-pass filter (LPF) forms at the LB transformer output. The secondary windings of the MB and HB transformers become two series inductors with inductance values of $(1-k_{M2}^2)L_{s2} \approx 3.3$ nH and $(1-k_{M3}^2)L_{s3} \approx 2.2$ nH, respectively. The parasitic (interwinding) coupling capacitors and the external tuning capacitors are the parallel capacitors in the LPF. The load-pull contours for the designed inverse Class-D PA core have been provided in [59] to help design the transformers. However, with the transformers fixed, it is more useful from the design perspective to simulate the power delivered to the transformer load versus the transformer load impedance. These load-pull contours are frequency-dependent and are plotted in Fig. 4.36 at 0.6, 1.0, 1.6, and 2.2 GHz. The contours are plotted on a Smith chart, where the load impedance is converted into a reflection coefficient ($S_{11}$) with a reference impedance ($Z_0$) of 50 $\Omega$. At 0.6 GHz, the preferred load impedance ($Z_{load,LB}$) is close to 50 $\Omega$ ($S_{11} = 0$), and at 1.0 GHz the preferred load impedance is around $50//(-50j)\ \Omega$. This characteristic justifies placing a parallel capacitor at the transformer output to extend the bandwidth.

For the output MNW illustrated in Fig. 4.35, the transformer load impedance ($Z_{load,LB}$) is also presented on the Smith chart in Fig. 4.37, versus an operating frequency from 0.4 to 2.2 GHz. In this sweep, $C_L$ is the only variable, and $C_M$ and $C_H$ are temporarily fixed at 1.5 pF and 0.5 pF, respectively. It shows that $Z_{load,LB}$ can be approximated by $50//(j\omega C_L)^{-1}\ \Omega$ at the frequencies of interest. From the power contours in Fig. 4.36(b)-(d) and the load trajectory in Fig. 4.37, it appears that there is an optimal range of $C_L$, between 2 and 4 pF, that results in good power performance up to 1.6 GHz (e.g., > 25 dBm). Using an excessively high $C_L$ of 8 pF results in a low output power at frequencies beyond 1 GHz. Because the maximum power contour travels to the lower side of the Smith chart as the frequency increases, using a smaller $C_L$ (e.g., 1 pF) that keeps $S_{11}$ close to the origin is also not preferred.

Figure 4.36: Simulated power of the LB sub-Tx versus the load ($Z_{load,LB}$) $S_{11}$ ($Z_0 = 50\ \Omega$) at (a) 0.6, (b) 1.0, (c) 1.6, and (d) 2.2 GHz.

Figure 4.37: $S_{11}$ of the load ($Z_{load,LB}$) from 0.4 to 2.2 GHz.

Figure 4.38: Simulated output (a) power and (b) DE for the LB sub-Tx with the interposer lumped model and EM-simulated S-parameters.
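The approximate load trajectory can be reproduced with a few lines of code. The sketch below uses only the approximation $Z_{load,LB} \approx 50//(j\omega C_L)^{-1}$ quoted above and ignores $C_M$, $C_H$, and the interposer parasitics; the function names are hypothetical.

```python
import math

Z0 = 50.0  # reference impedance in ohms

def parallel(z1, z2):
    """Parallel combination of two impedances."""
    return z1 * z2 / (z1 + z2)

def s11(z, z0=Z0):
    """Reflection coefficient of impedance z referred to z0."""
    return (z - z0) / (z + z0)

def z_load_lb(freq_hz, c_l):
    """Approximate LB transformer load: 50 ohm in parallel with C_L."""
    z_cap = 1 / (1j * 2 * math.pi * freq_hz * c_l)
    return parallel(Z0, z_cap)

for c_l_pf in (1, 2, 4, 8):
    row = []
    for f_ghz in (0.6, 1.0, 1.6, 2.2):
        gamma = s11(z_load_lb(f_ghz * 1e9, c_l_pf * 1e-12))
        row.append(f"{gamma.real:+.2f}{gamma.imag:+.2f}j")
    print(f"C_L = {c_l_pf} pF:", "  ".join(row))
```

A larger $C_L$ or a higher frequency moves $S_{11}$ further into the lower (capacitive) half of the Smith chart, which is the trend compared against the power contours in Fig. 4.36.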

Fig. 4.38(a) and (b) plot the simulated LB output power and drain efficiency (DE), respectively, for various $C_L$ candidates. The results are obtained by co-simulating the active devices with both the simplified interposer lumped model and the more accurate interposer *S*-parameters extracted from EM simulation (ADS Momentum). The close agreement between the two shows that the lumped model is sufficiently accurate.

#### Operation of the MB Sub-Tx

When the MB sub-Tx operates, the transformer center taps of the LB and HB sub-Txs are switched to ground by the SPDT supply switches, and the primary windings of the LB and HB transformers are connected to ground as well. Fig. 4.39 shows a simplified schematic. The load presented to the MB transformer, denoted by $Z_{load,MB}$, is a $\pi$-type LPF in series with a parasitic LC-tank ($Z_{para,MB}$). The $\pi$-type LPF has a series inductor approximated by $(1 - k_{M3}^2)L_{s3} \approx 2.2$ nH and two parallel capacitors, $C_M$ and $C_H$. The parasitic LC-tank, in series with the antenna load, is formed by the two capacitors, $C_L$ and $C_{c1}$, and the inductance seen at the secondary side of the LB transformer, approximated by $(1 - k_{M1}^2)L_{s1} \approx 4.9$ nH. As a result, the MB sub-Tx delivers low power to the antenna load at the resonance frequency of this LC-tank, denoted by $f_{stop,MB}$, where

$$f_{\rm stop,MB} \approx 1/[2\pi\sqrt{L_{s1}C_L(1-k_{M1}^2)}].$$
 (4.10)

 $f_{stop,MB}$  determines the lower-bound frequency of the MB sub-Tx. The magnitude of  $Z_{para,MB}$  is relatively high at  $f_{stop,MB}$ . Since the LB sub-Tx cannot output a high power at a high frequency such as  $f_{stop,MB}$ , the output capability of the Tx package at the LB/MB intersection is limited.

For the LB transformer, if $L_{p1}C_p < L_{s1}(C_L + C_M)$ and $\sqrt{(1 - k_{M1}^2)L_{s1}/[2(C_L + C_M)]} < 50$ [59, eq. (6)-(8)], the upper-bound frequency of the LB sub-Tx, denoted by $f_{upper,LB}$, can be approximated by

$$f_{\text{upper,LB}} \approx \frac{1}{2\pi} \sqrt{\frac{2}{L_{s1}(C_L + C_M) \left(1 - k_{M1}^2\right)} - \frac{1}{(C_L + C_M)^2 (50)^2}}.$$
(4.11)

The two inequalities required for (4.11) to be valid hold with $C_p \approx 1.8$ pF, $C_L + C_M \gg 1$ pF, and the LB transformer parameters listed in Fig. 4.19. At $f_{upper,LB}$, the output conductance seen by the LB (open-drain) PA core is $(k_{M1}^2 L_{s1})/(50 L_{p1})$, which is 0.04 S and deviates from the desired load conductance of $\approx 0.1$ S. According to (4.10) and (4.11), the upper-bound frequency of the LB sub-Tx is lower than, or in the best case not sufficiently higher than, $f_{stop,MB}$ ($f_{upper,LB} < \sqrt{2}f_{stop,MB}$), and the ratio $f_{upper,LB}/f_{stop,MB}$ decreases further with a low $C_L$. Therefore, the power performance at the LB/MB intersection worsens when a small $C_L$ is used. Fortunately, it will be demonstrated that the LB/MB intersection can be covered better by the HB sub-Tx.

Figure 4.39: Simplified schematic of the MB sub-Tx.

Figure 4.40: Simulated output power of the MB sub-Tx versus the transformer load impedance ($Z_{load,MB}$) in terms of $S_{11}$.
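The dependence of the LB/MB intersection on $C_L$ can be checked numerically from (4.10) and (4.11). The sketch below uses the leakage inductance $(1-k_{M1}^2)L_{s1} \approx 4.9$ nH quoted above and assumes $C_M = 2$ pF for illustration; only the second validity condition of (4.11) is checked, because $L_{p1}$ and $L_{s1}$ are not given individually in this section. The function names are hypothetical.

```python
import math

L_LEAK_LB = 4.9e-9  # (1 - k_M1^2) * L_s1 in henries, value quoted in the text
R_ANT = 50.0        # antenna load in ohms

def f_stop_mb(c_l):
    """Eq. (4.10): resonance of the parasitic LC-tank loading the MB sub-Tx."""
    return 1 / (2 * math.pi * math.sqrt(L_LEAK_LB * c_l))

def f_upper_lb(c_l, c_m):
    """Eq. (4.11): approximate upper-bound frequency of the LB sub-Tx."""
    c_tot = c_l + c_m
    if math.sqrt(L_LEAK_LB / (2 * c_tot)) >= R_ANT:
        raise ValueError("second validity condition of (4.11) is violated")
    radicand = 2 / (L_LEAK_LB * c_tot) - 1 / (c_tot * R_ANT) ** 2
    return math.sqrt(radicand) / (2 * math.pi)

C_M = 2e-12  # assumed value for illustration
for c_l_pf in (2, 4, 8):
    c_l = c_l_pf * 1e-12
    fs, fu = f_stop_mb(c_l), f_upper_lb(c_l, C_M)
    print(f"C_L = {c_l_pf} pF: f_stop,MB = {fs / 1e9:.2f} GHz, "
          f"f_upper,LB = {fu / 1e9:.2f} GHz, ratio = {fu / fs:.2f}")
```

The ratio stays below $\sqrt{2}$ for every $C_L$ and shrinks as $C_L$ is reduced, in line with the argument above.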

Similar to Fig. 4.36, the power delivered to the MB transformer load is simulated for all the passive loads at the transformer output, and the load-pull results are plotted in Fig. 4.40, at 1.0, 1.6, 2.2, and 2.8 GHz. Fig. 4.40 indicates that a capacitive load is preferred for the MB transformer. Fig. 4.41 plots the  $S_{11}$  trajectories of MB transformer load impedance  $(Z_{load,MB})$  from 1.0 to 2.8 GHz. Several combinations of the two external capacitors  $(C_L, C_M)$  are explored.  $C_H$  is fixed at 0.5 pF, which has only a minor effect on the trajectories.

As mentioned, adopting a large $C_L$ (e.g., 8 pF) decreases $f_{stop,MB}$, so the corresponding $S_{11}$ trajectories in Fig. 4.41 leave the high-impedance region (e.g., $\text{Real}(S_{11}) > 0.6$) at a lower frequency (e.g., 1 GHz). Although using an 8-pF $C_L$ achieves a higher $f_{upper,LB}/f_{stop,MB}$, the LB/MB intersection still cannot be covered properly. With $C_L = 8$ pF, the power simulation in Fig. 4.38(a) shows noticeable power degradation beyond 1 GHz, and Figs. 4.40 and 4.41 show that at 1 GHz the MB $S_{11}$ trajectories with $C_L = 8$ pF still correspond to a low output power. With a $C_L$ of 4 pF or 2 pF, the Tx output performance at the LB/MB intersection degrades further; considering the LB/MB intersection alone, a $C_L$ of 8 pF is preferred. Further increasing $C_L$ beyond 8 pF makes the LB sub-Tx very narrowband.

Fig. 4.41 also shows that with the same  $C_L$ , using a higher  $C_M$ , such as 3 pF, decreases  $\text{Imag}(S_{11})$  and can extract a high power from the PA core, according to the load-pull information in Fig. 4.40. However, with a higher  $C_M$  the  $S_{11}$  trajectories enter the low-power region (e.g.,  $\text{Real}(S_{11}) < -0.3$ ) at a lower frequency so the MB upper-bound frequency decreases. It appears that the MB upper-bound frequency can be extended by using a small  $C_L$ , since decreasing  $C_L$  moves  $S_{11}$  toward the bottom/RHS on the Smith chart. This is because when the MB sub-Tx operates at a higher frequency, the parasitic LC-tank from the LB transformer can be viewed as a series capacitor of  $C_L$  on the signal path, and decreasing  $C_L$  is equivalent to adding a series capacitor to compensate the inductive load. However, as mentioned and explained earlier, decreasing  $C_L$  worsens the output power at the LB/MB intersection.

The simulated output powers of the MB sub-Tx using the simplified lumped model are plotted in Fig. 4.42 with different  $(C_L, C_M)$  sets and  $C_H$  fixed at 0.5 pF. (The simulated results obtained with the interposer S-parameters are in the Appendix.) The occurrence of the low output power at  $f_{stop,MB}$ , the better high-frequency performance with a smaller  $C_L$ , and the power-bandwidth tradeoff associated with the  $C_M$  selection have all been explained.

#### Operation of the HB Sub-Tx

When the HB sub-Tx operates, the LB and MB sub-Txs are turned off and their primary windings are connected to ground. The simplified schematic for the HB sub-Tx is illustrated in Fig. 4.43. In this case, the HB transformer is loaded by a high-order LC-tank, and the equivalent parasitic component in series with the output antenna load can be modeled by

$$Z_{\text{para,HB,lossless}} \approx \frac{j\omega(L_M + L_L) \left(1 - \omega^2 \frac{L_M L_L C_L}{(L_M + L_L)}\right)}{(1 - \omega^2 L_M C_M)(1 - \omega^2 L_L C_L) - \omega^2 C_M L_L}.$$
(4.12)

Figure 4.41: $S_{11}$ of the transformer load impedance ($Z_{load,MB}$) versus frequency from 1.0 to 2.8 GHz with a step size of 0.2 GHz.

The parasitic resistors are ignored for now. The two resonance frequencies of $Z_{para,HB,lossless}$, denoted by $f_{stop1,HB}$ and $f_{stop2,HB}$, can be approximated by

$$f_{\text{stop1,HB}} \approx 1/\left[2\pi\sqrt{L_{s1}\left(1-k_{M1}^2\right)(C_L+C_M)}\right]$$
 (4.13)

$$f_{\text{stop2,HB}} \approx 1/\left[2\pi \sqrt{L_{s2} \left(1 - k_{M2}^2\right) (C_L / / C_M)}\right],$$
 (4.14)

where $C_L//C_M = C_L C_M/(C_L + C_M)$ and $f_{stop1,HB}$ is substantially lower than the HB operation frequency. Since $f_{stop1,HB}$ (4.13) $< f_{stop,MB}$ (4.10) $< f_{stop2,HB}$ (4.14), the HB sub-Tx can be operated at $f_{stop,MB}$ to cover the LB/MB intersection.

Figure 4.42: Simulated output power for the MB sub-Tx with the interposer simplified lumped model.

Figure 4.43: Simplified schematic of the HB sub-Tx.

Figure 4.44: Simulated output power for the HB sub-Tx with the interposer lumped model.
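The ordering of the three stop frequencies, and the accuracy of the approximations (4.13) and (4.14), can be checked against the exact poles of (4.12). The sketch below assumes that $L_L$ and $L_M$ in (4.12) are the leakage inductances $(1-k_{M1}^2)L_{s1} \approx 4.9$ nH and $(1-k_{M2}^2)L_{s2} \approx 3.3$ nH quoted earlier, and that $C_L = C_M = 2$ pF; the helper names are hypothetical.

```python
import math

L_L = 4.9e-9  # assumed: (1 - k_M1^2) * L_s1, LB leakage inductance
L_M = 3.3e-9  # assumed: (1 - k_M2^2) * L_s2, MB leakage inductance
C_L = 2e-12
C_M = 2e-12

def f_from_lc(l, c):
    """Resonance frequency of an LC pair."""
    return 1 / (2 * math.pi * math.sqrt(l * c))

f_stop_mb = f_from_lc(L_L, C_L)                        # eq. (4.10)
f_stop1_hb = f_from_lc(L_L, C_L + C_M)                 # eq. (4.13)
f_stop2_hb = f_from_lc(L_M, C_L * C_M / (C_L + C_M))   # eq. (4.14)

def exact_poles(l_m, l_l, c_m, c_l):
    """Zeros of the (4.12) denominator, a quadratic in w = omega^2:
    a*w^2 + b*w + 1 = 0."""
    a = l_m * c_m * l_l * c_l
    b = -(l_m * c_m + l_l * c_l + l_l * c_m)
    disc = math.sqrt(b * b - 4 * a)
    roots = sorted(((-b - disc) / (2 * a), (-b + disc) / (2 * a)))
    return [math.sqrt(w) / (2 * math.pi) for w in roots]

f_pole1, f_pole2 = exact_poles(L_M, L_L, C_M, C_L)
print(f"approx: {f_stop1_hb / 1e9:.2f}, {f_stop_mb / 1e9:.2f}, {f_stop2_hb / 1e9:.2f} GHz")
print(f"exact poles of (4.12): {f_pole1 / 1e9:.2f}, {f_pole2 / 1e9:.2f} GHz")
```

With these values the exact lossless poles fall slightly outside the two approximations, while the ordering $f_{stop1,HB} < f_{stop,MB} < f_{stop2,HB}$ is preserved.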

Fig. 4.42 has shown that decreasing $C_M$ can extend the upper-bound frequency of the MB sub-Tx; however, $f_{stop2,HB}$ also increases as $C_M$ decreases, and the output capability of the Tx package degrades at the MB/HB intersection. Since the LB sub-Tx cannot cover the MB/HB intersection, a higher $C_M$ is preferred for better power performance there. Finally, using a smaller $C_M$ extends the upper frequency of the HB sub-Tx.

The simulated output powers of the HB sub-Tx are plotted in Fig. 4.44 using the simplified lumped model. $C_L$ is fixed at 2 pF as suggested by the previous discussions. The lumped model is accurate since the results are close to those obtained with the interposer S-parameter. The dependences of $f_{stop1,HB}$ and $f_{stop2,HB}$ on $C_M$ are clearly visible. Compared to using $C_H = 0$ pF, using $C_H = 0.5$ pF improves the in-band performance of the HB sub-Tx, but the output power degrades rapidly at the upper band edge. This effect has been seen in the MB sub-Tx, where increasing $C_M$ brings a similar tradeoff.

#### Design Summary

A collective consideration including the three sub-Txs is required when selecting $(C_L, C_M, C_H)$. In summary, using a small $C_M$ (e.g., 1 pF) helps extend the maximum frequency of the MB and HB sub-Txs, but the performance at the MB/HB intersection suffers. For example, the simulated HB power in Fig. 4.44 shows that with a $C_M$ of 1 pF the HB sub-Tx delivers low power below 3.2 GHz, while the MB Tx also cannot cover up to 3.2 GHz according to Fig. 4.42. Nevertheless, using a small $C_M$ improves the high-frequency performance of the HB sub-Tx. A moderate $C_M$ of 2 pF is therefore selected. On the other hand, the LB/MB intersection can be covered by the HB sub-Tx. Therefore, a small $C_L$, such as 2 pF, can be adopted. The adoption of a small $C_L$ extends the upper frequency of the MB sub-Tx and improves the output power at the MB/HB intersection.

The output power of the Tx package is simulated with the three sub-Txs turned on alternately, and the results are plotted in Fig. 4.45 for a number of $(C_L, C_M)$ combinations. Similar to [59], $C_H$ is selected at 0 pF to improve the high-frequency performance of the HB sub-Tx at the cost of the HB peak power. The interposer S-parameter is used in this simulation. Fig. 4.45 demonstrates the explained performance dependences on the capacitor selection. The LB/MB intersection is observed to be covered partially by the HB sub-Tx when $C_L$ is selected at 2 pF, and the highest power at the MB/HB intersection is achieved with a low $C_L$ (2 pF) together with a high $C_M$ (2 pF). In the end, $C_L$, $C_M$, and $C_H$ are selected at 2 pF, 2 pF, and 0 pF, respectively. With the selected ($C_L$, $C_M$, $C_H$) and the extracted transformer parameters listed in Fig. 4.19(b), $f_{stop,MB}$ (4.10), $f_{upper,LB}$ (4.11), $f_{stop1,HB}$ (4.13), and $f_{stop2,HB}$ (4.14) can be calculated as 1.60, 1.39, 1.13, and 2.78 GHz, respectively. Indeed, with the selected capacitors, Fig. 4.45 shows that the HB sub-Tx exhibits relatively low output power at $f_{stop1,HB}$ and $f_{stop2,HB}$, and the MB and LB output powers are low at $f_{stop,MB}$ and $f_{upper,LB}$, respectively.

# 4.7 Switch PA Reconfigured for Band Switching

The above discussions assume the band-switching switches are ideal, with zero on-resistance and infinite off-impedance. In practice, the non-zero switch on-resistance and the switch off-capacitance degrade the power performance. The package output power is simulated under several practical switch realizations in the same 65-nm bulk CMOS process, and the results are plotted in Fig. 4.46. The band switches are assumed to be realized by cascode thick-oxide devices. It is observed from Fig. 4.46 that if the external switch is large (e.g., $w = 4$ mm, $R_{on} \approx 0.7\ \Omega$) to achieve a low switch resistance, the switch off-capacitance degrades the HB performance. The switches cannot be too small either; otherwise, the primary windings associated with the two idle sub-Txs are not properly shorted. The switch on-resistance is stepped up by the transformer and presents a series resistance in the transformer's secondary winding. The equivalent series resistance of the idle transformers, denoted by $R_{eq}$, is shown in Fig. 4.47(a). When the LB Tx operates, both $R_{eq}$ from the MB and HB transformers are in the power path and in series with the antenna load, while only the HB $R_{eq}$ is in the power path when the MB Tx operates. Adopting the lumped-model parameters provided in Fig. 4.19(b), Fig. 4.47(b) plots the MB and HB $R_{eq}$ when the LB sub-Tx operates at 1 GHz. Fig. 4.47(b) also plots the HB $R_{eq}$ when the MB sub-Tx operates at 2 GHz. Even with a substantial on-resistance of 1 $\Omega$, the power degradation due to the non-zero switch on-resistance is minor because the additional series resistance to the 50-$\Omega$ antenna load is only 5.2 $\Omega$ when the LB sub-Tx operates and 2.1 $\Omega$ when the MB sub-Tx operates. In general, the LB performance is more vulnerable to the non-zero switch on-resistance, and the HB performance is more vulnerable to the switch off-capacitance. A scaled technology with lower parasitics can improve performance.

Figure 4.45: Simulated Tx output power (with interposer S-parameter) with the three sub-Txs turned on alternately.
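A first-order feel for why a few ohms of series resistance is tolerable can be obtained as follows. This is only a rough estimate under a stated assumption: the current through the output path is taken to be unchanged by the added resistance, so the delivered power splits in proportion to the resistances. The actual degradation depends on how the PA core reacts to the modified load. The function name is hypothetical.

```python
import math

def series_loss_db(r_eq_total, r_load=50.0):
    """Power lost in r_eq_total placed in series with r_load, in dB,
    assuming the same current flows through both (simplifying assumption)."""
    return -10 * math.log10(r_load / (r_load + r_eq_total))

# Series resistances quoted in the text for a 1-ohm switch on-resistance.
print(f"LB operation, 5.2 ohm in series: {series_loss_db(5.2):.2f} dB")
print(f"MB operation, 2.1 ohm in series: {series_loss_db(2.1):.2f} dB")
```

Both estimates are well below 1 dB, which agrees with the statement that the degradation from the switch on-resistance is minor.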

On the other hand, the SPDT supply switches at the center taps of the three transformer primary windings are necessary to connect the 2.5-V supplies of the two idle DTxs to ground. The non-zero switch on-resistance at the inductor center tap of the operating Tx consumes dc power and can degrade both output power and efficiency. Fortunately, the switch parasitic capacitance is not an issue here because the SPDTs connect the transformer center taps to low-impedance nodes, either supply or ground, so a large switch device (or multiple switch devices in parallel) can be used to reduce the on-resistance. The power degradation under non-zero switch on-resistance is simulated and plotted in Fig. 4.48. The power degradation is about 0.5 dB with a 0.2-$\Omega$ supply SPDT on-resistance. The SMD SPDTs used in this work have an on-resistance much lower than 0.1 $\Omega$, so the associated performance degradation is negligible.

Figure 4.46: Simulated Tx output power (with interposer S-parameter) with the three sub-Txs turned on alternately.

Figure 4.47: (a) Equivalent inductor series resistance ($R_{eq}$) and (b) $R_{eq}$ vs. switch on-resistance.

Figure 4.48: Simulated Tx power (with interposer S-parameter) with non-ideal supply SPDT switches (with non-zero on-resistance).

The switch-based inverse Class-D PAs work well in conjunction with this band-switching interposer. The switch PA cores of the two idle sub-Txs can be reconfigured to short-circuit the associated transformer primary windings. As a result, no external switch is required and no extra loading effect (i.e., switch off-capacitance) is introduced to the operating sub-Tx. Turning on the PA output common-source devices on both sides simultaneously for the idle sub-Txs, denoted by DTx Reconfiguration 1, is not supported by the design in this work and [59], but this static function is not difficult to implement. Alternatively, this CMOS design can turn off the switch devices on both sides simultaneously. Such a configuration, denoted by DTx Reconfiguration 2, also presents a low impedance to the transformer primary windings. This is because the total device output (parasitic) capacitance for DTx Reconfiguration 2 is substantial, including the output routing capacitance, $C_{db}$, $C_{dg}$, $C_{sg}$, and $C_{sb}$ of the cascode devices, and $C_{db}$, $C_{ds}$, and $C_{dg}$ of the common-source devices. After the passive extraction, the parallel capacitance that loads the primary winding, denoted by $C_{p,diff}$, is about 8 pF. The simplified schematics of the Tx package under LB, MB, and HB operation are plotted in Fig. 4.49 with the supported DTx Reconfiguration 2. The schematic corresponding to DTx Reconfiguration 1 can be obtained by pulling the inputs of the common-source devices in the idle DTxs to 1 V.

Figure 4.49: Schematic illustration with the switch PA reconfigured for the supported band-switching scheme: DTx Reconfiguration 2.

Using the interposer S-parameter, the simulated Tx power and DE achieved with ideal band-switching switches, the supported switch PA reconfiguration (DTx Reconfiguration 2), and the recommended reconfiguration (DTx Reconfiguration 1) are plotted in Fig. 4.50(a) and (b), respectively. Owing to the low switch-PA on-resistance of only 0.6 $\Omega$, the results with DTx Reconfiguration 1 are very close to the ideal case. On the other hand, the supported DTx Reconfiguration 2 has its power and DE degraded from the ideal case, but the degradation is minor at most frequencies.

With the primary windings of the two idle sub-Txs terminated by  $C_{p,diff}$ , the equivalent inductor seen at the transformer secondary winding, denoted by  $L_{eq}$ , becomes

$$L_{\rm eq} \approx \left(1 - k_M^2\right) L_s + k_M^2 L_s / (1 - \omega^2 L_p C_{p,\rm diff})$$
(4.15)



Figure 4.50: (a) EM-simulated power and (b) efficiency of the two DTx reconfiguration schemes compared to that with ideal external switches.

$L_{eq} \approx L_s$ when the operation frequency is substantially lower than the resonance frequency $f_{res} = 1/(2\pi \sqrt{L_p C_{p,diff}})$, $L_{eq}$ is extremely high at $f_{res}$, and $L_{eq} \approx (1 - k_M^2)L_s$ at a high frequency where $\omega^2 L_p C_{p,diff} \gg 1$. For the LB, MB, and HB transformers, the corresponding resonance frequencies of $L_{eq}$ ($L_{eq,LB}$, $L_{eq,MB}$, and $L_{eq,HB}$), denoted by $f_{res,LB}$, $f_{res,MB}$, and $f_{res,HB}$, are estimated at 0.9, 1.3, and 1.8 GHz, respectively.
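The three regimes of (4.15) can be illustrated numerically. In the sketch below, $C_{p,diff} = 8$ pF and $f_{res,LB} = 0.9$ GHz come from the text, the primary inductance is back-calculated from them, and the secondary inductance and coupling coefficient are hypothetical placeholders (the extracted values are listed in Fig. 4.19(b), which is not reproduced here).

```python
import math

C_P_DIFF = 8e-12  # extracted primary-side capacitance from the text, in farads

def l_primary_from_fres(f_res, c_p=C_P_DIFF):
    """Primary inductance implied by f_res = 1 / (2*pi*sqrt(L_p * C_p,diff))."""
    return 1 / ((2 * math.pi * f_res) ** 2 * c_p)

def l_eq(freq_hz, l_s, l_p, k_m, c_p=C_P_DIFF):
    """Eq. (4.15): inductance seen at the secondary of an idle transformer
    whose primary is terminated by C_p,diff."""
    w2 = (2 * math.pi * freq_hz) ** 2
    return (1 - k_m ** 2) * l_s + k_m ** 2 * l_s / (1 - w2 * l_p * c_p)

L_S = 12e-9  # hypothetical secondary inductance
K_M = 0.77   # hypothetical coupling coefficient
L_P = l_primary_from_fres(0.9e9)  # back-calculated from f_res,LB = 0.9 GHz

for f_ghz in (0.1, 0.85, 0.95, 3.0):
    print(f"{f_ghz:5.2f} GHz: L_eq = {l_eq(f_ghz * 1e9, L_S, L_P, K_M) * 1e9:8.1f} nH")
```

Below resonance the full secondary inductance is seen, near resonance the magnitude grows sharply and changes sign, and well above resonance only the leakage inductance remains.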

When the LB sub-Tx operates, a high attenuation is expected to occur at operation frequencies around $f_{res,MB}$. When the MB sub-Tx operates, a high attenuation is expected to occur at $f_{res,HB}$ (1.8 GHz), which is below the MB operation frequency but still degrades the MB power on the low-frequency end. With the presence of $C_{p,diff}$, the original resonance frequency of the parasitic LC-tank loading the MB sub-Tx, at $f_{stop,MB}$ (4.10), splits into two frequencies, which can be roughly estimated by

$$f_{\text{stop,MB,New1}} \approx f_{\text{stop,MB}} \times \sqrt{\left(1 + x - \sqrt{1 + x^2 + x(4k_{M1}^2 - 2)}\right)/2}$$
 (4.16)

$$f_{\text{stop,MB,New2}} \approx f_{\text{stop,MB}} \times \sqrt{\left(1 + x + \sqrt{1 + x^2 + x(4k_{M1}^2 - 2)}\right)/2}$$
 (4.17)

where $x = (L_{s1}C_L)/(L_{p1}C_{p,diff})$. Functions $F_1(x, k_M)$ and $F_2(x, k_M)$ are defined by the second factor on the RHS of (4.16) and (4.17), respectively. $f_{stop,MB,New1}$ and $f_{stop,MB,New2}$ can be calculated as 0.7 and 2.0 GHz, respectively. Fig. 4.51 plots the simulated parasitic loading for the MB sub-Tx ($Z_{para,MB}$) with DTx Reconfiguration 1 and 2. The simplified lumped model is used. The simulated $Z_{para,MB}$ has a peak at 1.5 GHz with DTx Reconfiguration 1 and has peaks at 0.7 and 1.9 GHz with DTx Reconfiguration 2. The peak at 0.7 GHz is distant from the MB operation frequency, while the peak at 1.9 GHz does not introduce significant attenuation to the MB sub-Tx because it has a low magnitude. This is because the parasitic LC-tank has a low tank quality factor at the resonance frequency $f_{stop,MB,New2}$.
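The splitting factors in (4.16) and (4.17) are the two roots of the coupled-resonator equation $y^2 - (1+x)y + x(1-k_M^2) = 0$, with $y = (f/f_{stop,MB})^2$. The sketch below evaluates them; the values of $x$ and $k_M$ are hypothetical, chosen only to be roughly consistent with the frequencies reported above, since the extracted transformer parameters are not reproduced in this section.

```python
import math

def split_factors(x, k_m):
    """Second factors on the RHS of (4.16) and (4.17): F1(x, k) and F2(x, k)."""
    root = math.sqrt(1 + x * x + x * (4 * k_m ** 2 - 2))
    f1 = math.sqrt((1 + x - root) / 2)
    f2 = math.sqrt((1 + x + root) / 2)
    return f1, f2

F_STOP_MB = 1.6e9  # from (4.10) with the selected capacitors
X = 0.75           # hypothetical (L_s1 * C_L) / (L_p1 * C_p,diff)
K_M1 = 0.78        # hypothetical LB coupling coefficient

f1, f2 = split_factors(X, K_M1)
print(f"F1 = {f1:.3f}, F2 = {f2:.3f}")
print(f"split frequencies: {F_STOP_MB * f1 / 1e9:.2f} and {F_STOP_MB * f2 / 1e9:.2f} GHz")
```

The same function applies to the HB stop frequencies discussed next, with $x$ and $k_M$ replaced by the corresponding transformer and capacitor values.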

Finally, with the presence of a finite $C_{p,diff}$, the original HB parasitic resonance frequency at $f_{stop2,HB}$ (4.14), calculated at 2.8 GHz, splits into $f_{stop2,HB} \times F_1(x_2, k_{M2})$ ($\approx 1.1$ GHz) and $f_{stop2,HB} \times F_2(x_2, k_{M2})$ ($\approx 3.0$ GHz), where $x_2 = L_{s2}(C_L//C_M)/(L_{p2}C_{p,diff})$. Indeed, it can be observed from Fig. 4.50(a) that the HB transmission zero around 2.5 GHz with DTx Reconfiguration 1 increases slightly with DTx Reconfiguration 2. On the other hand, the original HB parasitic resonance frequency at $f_{stop1,HB}$ (4.13), calculated at 1.1 GHz, splits into $f_{stop1,HB} \times F_1(x_1, k_{M1})$ ($\approx 0.6$ GHz) and $f_{stop1,HB} \times F_2(x_1, k_{M1})$ ($\approx 1.7$ GHz), where $x_1 = L_{s1}(C_L + C_M)/(L_{p1}C_{p,diff})$. Fig. 4.52 plots the simulated parasitic loading for the HB sub-Tx ($Z_{para,HB}$ in Fig. 4.43). The simulated $Z_{para,HB}$ with DTx Reconfiguration 1 has peaks at 1.01 GHz and 2.87 GHz, and the simulated $Z_{para,HB}$ with DTx Reconfiguration 2 has peaks at 0.54, 1.05, 1.72, and 3.12 GHz. The peak at 1.72 GHz with DTx Reconfiguration 2 has a low magnitude and does not introduce noticeable attenuation to the HB sub-Tx.



Figure 4.51: Simulated parasitic loading for the MB sub-Tx  $(Z_{para,MB})$  with DTx Reconfiguration 1 and 2.

Similarly, this is because the parasitic LC-tank formed by $(C_L+C_M)$ and $L_{eq,LB}$ has a low tank quality factor at the resonance frequency $f_{stop1,HB} \times F_2(x_1, k_{M1})$.

In summary, with DTx Reconfiguration 2, the HB sub-Tx degrades little compared to DTx Reconfiguration 1. The output performance mainly degrades at the upper frequencies of the LB sub-Tx and the lower frequencies of the MB sub-Tx. In such a case, the HB sub-Tx plays a more important role in covering the LB/MB intersection.

# 4.8 Measured Results

As previously shown in Fig. 4.34(a), the interposer is about $2 \times 2 \ cm^2$, limited by the 1-mm pitch ball-grid array (BGA) on its backside. The package interfaces with the PCB motherboard through a spring pin socket (Ironwood CBT-BGA 6001).

First, the output power and DE of the Tx package are characterized with the band-switching functionality carried out by almost-ideal switches. Both sides of the primary windings of the three transformers on the interposer can be connected to or disconnected from the system ground plane on the interposer via minimum-footprint zero-$\Omega$ resistors or removable solder paste. The measured package output power and DE for the three sub-Txs are plotted in Fig. 4.53. The simulated results previously shown in Fig. 4.50 are also attached. The Tx package achieves a peak power higher than 22.9 dBm from 0.4 to 4.0 GHz with DE better than 25%. The LB sub-Tx has a peak power of 28.6 dBm at 0.85 GHz with DE of 50%, and the MB sub-Tx has its peak power of 27.4 dBm at 2.1 GHz with DE of 49%. The HB sub-Tx covers the LB/MB intersection and reaches 26.4, 24.7, and 22.9 dBm at 1.5, 3.0, and 4.0 GHz, respectively, with DE of 46, 34, and 26%. Overall, the measured results are predicted well by the simulated results. The fast power roll-off at frequencies higher than 4 GHz could result from the parasitics introduced by the spring pin socket.

Figure 4.52: Simulated parasitic loading for the HB sub-Tx ($Z_{para,HB}$) with DTx Reconfiguration 1 and 2.

In the following measurements with CW and modulated waveforms, the external connection pins on interposer are kept open and the introduced DTx Reconfiguration 2 is put into service. When one sub-Tx operates, the 2.5-V PA supply voltages and the 1-V common-source gate biases are short-circuited to ground for the two idle sub-Txs, while the 2.5-V gate biases for the cascode devices are still on. The measured output power and DE are plotted in Fig. 4.54. As expected, the output power of the LB sub-Tx degrades noticeably when the frequency goes higher, while the MB and HB performances degrade little. The degraded bandwidth of the LB sub-Tx is partially compensated by the HB sub-Tx. The peak power/DE from the reconfigurable Tx package are 28.3 dBm/47% at 0.85 GHz (LB), 26.3 dBm/44% at 1.6 GHz (HB), 27.2 dBm/49% at 2.05 GHz (MB), 24.9 dBm/32% at 3.1 GHz (HB), and 23.6 dBm/25% at 3.8 GHz (HB).

The reconfigurable Tx package has also been tested with modulated signals. To approximate the desired Tx output, the RF-DAC of the three sub-Txs, including the 8-b amplitude modulator and the 8-b phase modulator, must be characterized. For the three testing frequencies at 0.85 GHz (LB sub-Tx), 2.2 GHz (MB sub-Tx), and 3.1 GHz (HB sub-Tx), the output magnitude and phase of the amplitude modulators were measured with the amplitude modulator code swept from 0 to 255 while the phase modulator code was fixed at 0, and the results are plotted in Fig. 4.55. Although the three CMOS designs are identical, the measured AM-AM and AM-PM responses shown in Fig. 4.55 differ. This is because the load impedances seen by the three inverse Class-D cores differ at the different operating frequencies. More details on the analysis of the characteristics of an inverse Class-D core can be found in [[59], Section III]. To output a modulated signal, the AM-AM response is consulted to approximate the desired Tx output magnitude, and the AM-PM distortion is corrected by the phase modulator. Although the AM-PM effect is therefore not an issue here, the measured AM-PM phase shifts are 27°, 27°, and 17° for the LB, MB, and HB operation, respectively. The AM-PM curves have a similar trend, but the phase shifts vary up to 60

While the capacitance nonlinearity plays a role in the AM-PM effect, the inverse Class-D PA analysis in [71] shows that even if the switch devices do not have any parasitic capacitance, the AM-PM effect still exists if the PA load impedance is not purely resistive. The AM-PM phase shift for this DTx core is simulated versus the load impedance ( $Z_L = R_L + jX_L$ ) at 1 GHz. The results are plotted in Fig. 4.56, and the AM-PM phase shift is quite sensitive to  $Z_L$ . The preferred load impedance of the DTx is around 8+7j  $\Omega$  [59] and corresponds to a low phase shift of around 10°. However,  $Z_L$  is not constant across frequency and across the three sub-Txs, which accounts for the phase-shift deviation. The measured phase shifts in this work



Figure 4.53: Measured (a) Tx package output power and (b) DE with almost-ideal external band-selecting switches (solder on/off).



Figure 4.54: Measured (a) Tx output power and (b) drain efficiency (DE) with DTx Reconfiguration 2.


Figure 4.55: Normalized Tx output magnitude and phase vs. AM code.



Figure 4.56: AM-PM phase shift versus PA load impedance.

are around 20°. A recently reported inverse Class-D PA makes the output capacitance of an on-state cell close to that of an off-state cell [129] and achieves a good AM-PM phase shift from 7° to 14° with operation frequency from 2.5 to 4 GHz.

The 8-b phase modulators of the LB, MB, and HB sub-Txs are characterized at 0.85, 2.2, and 3.1 GHz, respectively, and the output phases are plotted in Fig. 4.57 with the

phase modulator code swept from 0 to 255. The AM code was fixed at the peak value of 255. The phase-modulated square wave that drives the switching PA is generated by a CML-to-CMOS converter following the phase modulator core. Two integrators are employed at the mixer in-phase and quadrature LO inputs to transform the LO square waves into triangle waveforms with proper magnitude (i.e.,  $V_{pp} \approx 200 \text{ mV}$ ). The linearity of the phase modulator suffers significantly without the integrators. Substantial effort in [[59], Section IV] has been dedicated to explaining the necessity of the integrators and the proper integrator current in the wideband phase modulator. The integrator current should increase with the operating frequency. The measured phase-modulator responses plotted in Fig. 4.57(a), (b), and (c) are at 0.85, 2.2, and 3.1 GHz, respectively, and the integrator currents must be adjusted accordingly. The proper integrator currents, annotated in the figures, can limit the maximum phase step to 2.4°, which verifies the wideband capability of the phase modulator.

The measured AM and PM responses are both employed to approximate the required Tx output, and each operating frequency has its dedicated AM-AM, AM-PM, PM-PM, and dc setup tables. For the RF frequencies of 0.85 (LB), 2.2 (MB), and 3.1 GHz (HB), the measured demodulated constellations for a 62.5-MS/s, 64-QAM modulated signal are plotted in Fig. 4.58. The measured output power, DE and SE, and the constellation EVM are annotated. For example, the Tx package, operated at 2.2 GHz with the MB sub-Tx, achieves an output power of 22 dBm, DE of 28%, and EVM of -30 dB. This custom IQ modulation test verifies the wide RF bandwidth of the reconfigurable Tx package.
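The table-based mapping described above can be sketched as follows. This is only an illustration under stated assumptions: the three 256-entry tables are synthetic placeholders rather than the measured responses of Figs. 4.55 and 4.57, and the function names `map_symbol` and `tx_output` are hypothetical. The sketch picks the AM code whose AM-AM entry is closest to the target magnitude, then picks the PM code that best cancels the AM-PM shift of that AM code.

```python
import numpy as np

CODES = np.arange(256)

# Placeholder characterization tables (assumed shapes, NOT measured data).
AM_MAG = np.sin(0.5 * np.pi * CODES / 255)        # AM-AM: normalized magnitude
AM_PHASE_DEG = 20.0 * (CODES / 255) ** 2          # AM-PM: phase shift vs. AM code
PM_PHASE_DEG = CODES * 360.0 / 256 + np.sin(2 * np.pi * CODES / 256)  # PM-PM


def map_symbol(target):
    """Return (am_code, pm_code) approximating a complex target, |target| <= 1."""
    # Nearest AM-AM entry sets the output magnitude.
    am_code = int(np.argmin(np.abs(AM_MAG - abs(target))))
    # The phase modulator must supply the target phase minus the AM-PM shift.
    wanted_deg = np.degrees(np.angle(target)) - AM_PHASE_DEG[am_code]
    wrapped_err = (PM_PHASE_DEG - wanted_deg + 180.0) % 360.0 - 180.0
    pm_code = int(np.argmin(np.abs(wrapped_err)))
    return am_code, pm_code


def tx_output(am_code, pm_code):
    """Complex Tx output predicted by the placeholder tables."""
    phase_deg = AM_PHASE_DEG[am_code] + PM_PHASE_DEG[pm_code]
    return AM_MAG[am_code] * np.exp(1j * np.radians(phase_deg))


target = 0.5 * np.exp(1j * np.pi / 3)
am_code, pm_code = map_symbol(target)
error = abs(tx_output(am_code, pm_code) - target)
print(am_code, pm_code, error)
```

With 8-b tables the residual error is set by the code step sizes, so a separate table set per operating frequency, as used in the measurements, would simply replace the three placeholder arrays.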

By feeding the proper AM and PM data streams into the Tx package with a total data rate of  $2 \times 2.5$  Gb/s and a Tx symbol rate of 250 MS/s, the Tx package can support 20-MHz WLAN and LTE signals. Fig. 4.59 plots the Tx output spectrum when the MB sub-Tx is programmed to output a 54-Mb/s, 20-MHz WLAN signal at 2.2 GHz. The output signal has a peak-to-average power ratio (PAPR) of 6.3 dB and is a clipped version of the original signal with a PAPR of 8.7 dB. Decreasing the signal PAPR increases the Tx output power and efficiency, but degrades the signal integrity. The simulated EVM and noise for the 20-MHz WLAN signal versus the signal compression in terms of PAPR have been provided in [[59], Section VI]. The Tx package achieves an average output power of 19.9 dBm, and the DE and SE are 23% and 19%, respectively. The demodulated 64-QAM constellation and pilot signals from the OFDM subcarriers are also shown in Fig. 4.59. The measured EVM is -31 dB, meeting the specification of -25 dB with substantial margin, and the spectrum also meets the required mask. The out-of-band noise is about -117 dBc/Hz at an 80-MHz offset.
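The PAPR reduction by clipping can be illustrated with a short sketch. Several assumptions apply: a complex Gaussian sequence stands in for the OFDM waveform, the envelope is hard-limited (the actual clipping method used in this work is not specified here), and the reported distortion is a time-domain error ratio, not the demodulated EVM. The helper names are hypothetical. The clipping level is found by bisection so that the clipped signal reaches the 6.3-dB PAPR quoted above.

```python
import numpy as np


def papr_db(x):
    """Peak-to-average power ratio of a complex baseband signal, in dB."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max() / power.mean())


def clip_envelope(x, level):
    """Hard-limit the envelope to `level` while preserving the phase."""
    mag = np.maximum(np.abs(x), 1e-12)  # avoid division by zero
    return x * np.minimum(1.0, level / mag)


def clip_to_papr(x, target_db, iterations=60):
    """Bisect the clipping level until the clipped PAPR reaches target_db."""
    lo, hi = 0.0, np.abs(x).max()
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if papr_db(clip_envelope(x, mid)) > target_db:
            hi = mid  # still too peaky: clip harder
        else:
            lo = mid
    return clip_envelope(x, hi)


rng = np.random.default_rng(0)
n = 4096
# Gaussian stand-in for an OFDM envelope (assumption, not the measured signal).
x = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
clipped = clip_to_papr(x, 6.3)
distortion_db = 10 * np.log10(
    np.mean(np.abs(clipped - x) ** 2) / np.mean(np.abs(x) ** 2)
)
print(papr_db(x), papr_db(clipped), distortion_db)
```

Lowering the target PAPR raises the average power available for a fixed peak, at the cost of a larger clipping distortion, which is the trade-off noted in the text.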

Finally, considering that the LTE bands are allocated over a wide range of operating frequencies [46], the Tx package is programmed and reconfigured to output a 20-MHz, 64-QAM LTE signal at three frequencies: 0.85 GHz (LB sub-Tx), 2.2 GHz (MB sub-Tx), and 3.1 GHz (HB sub-Tx). The measured spectra and demodulated constellations (from the 1200 subcarriers) are plotted in Fig. 4.60. The LTE signal, using the single-carrier frequency-division multiple access (SC-FDMA) technique, can be operated with a lower PAPR compared to the WLAN signal without violating the ACLR and EVM specifications (EVM < -25 dB,  $ACLR_1 < -30$  dB,  $ACLR_2 < -36$  dB). With sufficient margin, the Tx package is programmed to output







Figure 4.57: Phase responses at (a) 0.85, (b) 2.2, (c) 3.1 GHz.



Figure 4.58: Measured 64-QAM constellation and Tx performance at the three testing frequencies: (a) 0.85, (b) 2.2, (c) 3.1 GHz.



Figure 4.59: Measured WLAN spectrum at 2.2 GHz (MB Sub-Tx).



Figure 4.60: Measured LTE spectrum and demodulated 64-QAM constellation at (a) 0.85 GHz, (b) 2.2 GHz, and (c) 3.1 GHz.

a clipped LTE signal with a PAPR of 4.1 dB. Similarly, the simulated EVM, noise, and  $ACLR_{1,2}$  for the 20-MHz LTE signal versus the signal PAPR have been provided in [[59], Section VI]. The measured average power, DE/SE, EVM, and ACLR performances are annotated in Fig. 4.60 for the three testing frequencies. At 2.2 GHz, the Tx package outputs an average power of 22.1 dBm with DE of 27% and SE of 23%, and the corresponding  $ACLR_{1,2}$  and EVM meet the specifications. The output noise is -115 dBc/Hz at an 80-MHz offset.

### 4.9 Performance Comparison

Table 4.1 compares the CW performance of this work to reported CMOS Txs/PAs with peak power higher than 26 dBm [61, 70, 74–80, 82, 130–137]. For a fair comparison, note that the works employing discrete baluns or tested with GSSG probes would suffer some degradation in power and efficiency (e.g., 1 dB) if an on-chip or on-package balun were realized and the associated loss included. Using a moderate supply voltage of 2.5 V, the achieved peak power of this work is among the highest, and the achieved system efficiency is arguably the best, considering that the power consumption of all components in the all-digital transmitter has been accounted for.
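The relation between DE, SE, and the dc power can be made concrete with a small worked example. It assumes that DE is the output power divided by the dc power of the PA supply and that SE is the output power divided by the total dc power; under that assumption the difference between the two dc powers estimates the consumption of the remaining transmitter circuits. The numbers are the 0.85-GHz DTx Reconfiguration 2 values (28.3 dBm, DE of 47%, SE of 44%); because the efficiencies are rounded to integers, the result is only a rough estimate.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def dc_power_mw(p_out_dbm, efficiency_percent):
    """DC power implied by an output power and an efficiency figure."""
    return dbm_to_mw(p_out_dbm) / (efficiency_percent / 100)


p_out_dbm, de_percent, se_percent = 28.3, 47, 44
pa_dc_mw = dc_power_mw(p_out_dbm, de_percent)      # PA supply only (assumed DE definition)
total_dc_mw = dc_power_mw(p_out_dbm, se_percent)   # all circuits (assumed SE definition)
overhead_mw = total_dc_mw - pa_dc_mw
print(round(pa_dc_mw), round(total_dc_mw), round(overhead_mw))
```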

Table 4.2 compares the measured CW performance of this wideband and reconfigurable Tx package to the reported wideband Tx designs [51, 52, 54, 55, 61–64, 129]. The achieved power, DE, and system efficiency (SE) with all the on-chip power consumption included are comparable to the state-of-the-art single-band solutions. This work also features a single-output continuous bandwidth from 0.4 to 4 GHz and the all-digital input interface. One must

| Reference | Freq.<br>(GHz) | Peak<br>Power<br>(dBm) | DE<br>(%) | SE or<br>PAE<br>(%) | Output<br>Off-Chip<br>Network | Power<br>Core | Modulation<br>Test |
|-----------|----------------|------------------------|-----------|---------------------|-------------------------------|---------------|--------------------|
| [130] 12' TMTT | 0.93 | 29.4 | 28 | 26 | None | Class AB | 10-MHz LTE |
| [75] 14' ASSCC | 1.2 | 27.1 | 51 | N.A. | LTCC Interposer | Class D<sup>-1</sup> | 20-MHz LTE |
| [77] 09' TMTT | 1.7 | 33.8 | N.A. | 50 | Balun on IPD | Class E | None |
| [133] 16' TMTT | 1.75 | 31.2 | N.A. | 33 | None | Class AB | WCDMA |
| [82] 13' TMTT | 1.85 | 30.2 | N.A. | 48 | Balun on PCB | Class AB | 10-MHz LTE |
| [131] 15' JSSC | 1.9 | 28.0 | N.A. | 34 | None | Doherty | 20-MHz LTE |
| [70] 17' JSSC | 1.9 | 26.0 | N.A. | 25 | GSSG Probe | Class D | 20-MHz LTE |
| [134] 16' JSSC | 1.9 | 35.3 | N.A. | 33 | On PCB | Class AB | WCDMA |
| [78] 09' JSSC | 2.0 | 29.3 | N.A. | 69<sup>&</sup> | None | Class E | 20-MHz WLAN |
| [135] 17' ISSCC | 2.0 | 30.3 | N.A. | 33 | None | Class D | 20-MS/s 64QAM |
| [74] 14' RFIC | 2.25 | 29.1 | N.A. | 42 | GSSG Probe | Class E | 20-MHz 64QAM |
| [79] 12' JSSC | 2.4 | 27.7 | N.A. | 45 | Discrete Balun | Class E | 20-MHz WLAN |
| [80] 14' ESSCIRC | 2.4 | 29.5 | 46.7 | N.A. | None | Class E | 20-MHz LTE |
| [137] 17' ISSCC | 2.4 | 25.3 | N.A. | 30 | None | Class D | 10-MS/s 256QAM |
| [132] 16' RFIC | 2.48 | 26.7 | 48 | 39 | None | Class A | 20-MHz WLAN |
| [136] 17' ISSCC | 2.5 | 28.6 | N.A. | 35 | None | Class D | WLAN, LTE |
| [61] 16' JSSC | 2.6 | 28.1 | 41 | 35 | None | Class D<sup>-1</sup> | 8-MS/s 256QAM |
| [76] 16' JSSC | 3.7 | 26.7 | 40 | N.A. | None | Doherty | 1-MS/s 16QAM |
| This Work: Three Packages, LB | 0.7/1.1 | 27.7/29.2 | 40/60 | 38/56 | HDI PCB Interposer | Class D<sup>-1</sup> | 63-MS/s 64QAM<br>20-MHz WLAN<br>20-MHz LTE |
| This Work: Three Packages, MB | 1.5/2.3 | 28.8/27.7 | 56/54 | 53/49 |  |  |  |
| This Work: Three Packages, HB | 3.0/3.5 | 26.8/25.5 | 37/40 | 33/35 |  |  |  |
| This Work: Wideband, DTx Reconfig. 2 | 0.4/0.85 | 27.2/28.3 | 30/47 | 26/44 |  |  |  |
| This Work: Wideband, DTx Reconfig. 2 | 2.1/3.1 | 27.1/24.9 | 48/31 | 43/27 |  |  |  |

Table 4.1: Reported CMOS CW performance with peak power > 26 dBm (ordered by operating frequency).

be reminded that an insertion loss around 1 dB has to be accounted for when a diplexer is used in some reported works to create a dual-band Tx [51, 52, 54].

Table 4.3 compares the WLAN performance of this work to state-of-the-art Txs/PAs with output power higher than 17 dBm [51, 54, 65–68, 72, 78, 79, 132]. With the balun loss included, both the achieved power (20.7 dBm) and system efficiency (> 21%) are also excellent.

Finally, the reported CMOS LTE Txs/PAs with output power higher than 20 dBm are summarized in Table 4.4 [52, 70, 80–84, 130, 131, 136, 138, 139]. The prevailing technique is envelope tracking (ET) with an off-chip buck converter [82–84]. Unlike in an all-digital Tx, the input signal of an ET amplifier is already a scaled version of the output RF signal, as in conventional linear PAs. As can be seen, the 24-dBm output power and 30% SE achieved by this all-digital Tx are comparable to those achieved by the state-of-the-art ET amplifiers transmitting 10-MHz and 20-MHz LTE bandwidths.

| Reference | Freq.<br>(GHz) | Wideband<br>Design<br>Method | Peak<br>Power<br>(dBm) | DE<br>(%) | SE or<br>PAE<br>(%) | Off-Chip<br>Output<br>Network | Power<br>Core | Modulation<br>Test |
|-----------|----------------|------------------------------|------------------------|-----------|---------------------|-------------------------------|---------------|--------------------|
| [64] 16' MWCL | 0.65/0.9 | Adaptive PA Core Periphery | 27.8/28.2 | N/A | 52/54 | None | Class AB | WLAN |
| [52] 17' ISSCC | 0.9/1.95 | Two PA Cores + Diplexer* | 31.9/31.9 | N/A | 56/46 | On IPD | Class F | WCDMA/LTE |
| [61] 16' JSSC | 2.6/4.5 | 1 PA Core + Wideband MNW | 28.1/26.0 | 41/27 | 35/21 | None | Class D<sup>-1</sup> | 8MS/s 256QAM |
| [129] 18' JSSC | 2.5/3.1 | 1 PA Core + Wideband MNW | 24.5/24.9 | 43/38 | N/A | None | Class D<sup>-1</sup> | 20MHz WLAN |
| [54] 17' JSSC | 2.4/5 | Two PA Cores + Diplexer* | 29.0/26.0 | N/A | N/A | None | Class AB | WLAN |
| [51] 16' ISSCC | 2.4/5.5 | Two PA Cores + Diplexer* | 27.0/25.5 | N/A | N/A | PCB Balun | Class D<sup>-1</sup> | WLAN |
| [55] 14' RFIC | 0.8-6.0 | 3 Different PA Cores + SP3T | 22.8-25.2 | 20-34 | N/A | None | Class E | N/A |
| [63] 15' ISSCC | 2-6 | 1 PA Core + Wideband MNW | 20.1-22.4 | N/A | 19-28 | None | Class AB | WLAN |
| [62] 16' IMS | 2.8-6 | 1 PA Core + Wideband MNW | 20.8-22.1 | N/A | 37-44 | None | Class AB | 64QAM |
| This Work, DTx Reconfig. 2 | 0.4 | Three Identical Tx Cores + Band-selecting Interposer | 27.2 | 30 | 26 | HDI PCB Interposer | Class D<sup>-1</sup> | 63MS/s 64QAM<br>20MHz WLAN<br>20MHz LTE |
|  | 0.85 |  | 28.3 | 47 | 44 |  |  |  |
|  | 2.1 |  | 27.1 | 48 | 43 |  |  |  |
|  | 3.1 |  | 24.9 | 31 | 27 |  |  |  |

\* diplexer not implemented and diplexer loss not included

Table 4.2: Peak performance of wideband/multi-band CMOS Tx designs.

| Reference | Freq.<br>(GHz) | Power<br>(dBm) | DE / (SE<br>or PAE)<br>(%) | EVM<br>(dB) | Output<br>Off-Chip<br>Network | Modulator |
|-----------|----------------|----------------|----------------------------|-------------|-------------------------------|-----------|
| [78] 09' JSSC | 2.0 | 19.6 | N.A./22.6<sup>&</sup> | -32 | None | Class G + ET (Off-chip AM/PM) |
| [65] 11' JSSC | 2.4 | 19.6 | N.A./21.8 | -25 | Discrete Balun | Outphasing AM (Off-chip PM) |
| [68] 11' JSSC | 2.25 | 17.7 | N.A./27 | -32 | Discrete Balun | Switched-Cap AM (Off-chip PM) |
| [79] 12' JSSC | 2.4 | 20.2 | N.A./27.6 | -31 | Discrete Balun | Class G + Outphasing AM (Off-chip PM) |
| [66] 12' ISSCC | 2.4 | 20 | 22/18.6 | -25 | Discrete Balun | Outphasing AM (On-chip PM) |
| [72] 13' ISSCC | 2.4 | 18.8 | 17/15 | -25 | None | Digitally-modulated IQ Combining |
| [67] 15' RFIC | 2.1 | 18.3 | N.A./13 | -26 | Discrete Balun | RF-Pulse Width Modulation |
| [51] 16' ISSCC | 2.5 | 19.5 | 14.5/10 | -30 | Discrete Balun<sup>#</sup> | Digitally-modulated IQ Combining |
| [132] 16' RFIC | 2.48 | 20.1 | 21/18 | -25 | None | Linear PA |
| [54] 17' JSSC | 2.4 | 23.5 | N.A./15 | -32 | None | Linear PA |
| This Work, Three Packages | 1.8 | 21.6 | 27/23 | -31 | HDI PCB Interposer | Class D<sup>-1</sup> AM (On-chip PM) |
| This Work, Three Packages |  | 20.7 | 25/21 | -31 |  |  |
| This Work, DTx Reconfig. 2 | 2.2 | 19.9 | 23/19 | -31 |  |  |

<sup>&</sup> Loss in supply modulator not included. <sup>#</sup> Balun loss included.

Table 4.3: Reported 20-MHz WLAN CMOS Txs with average power > 17 dBm.

| Reference | Freq.<br>(GHz) | BW<br>(MHz)<br>/QAM | Power<br>(dBm) | DE/<br>SE or PAE<br>(%) | Output<br>Off-Chip<br>Network | Modulator |
|-----------|----------------|---------------------|----------------|-------------------------|-------------------------------|-----------|
| [130] 12' TMTT | 0.93 | 10/16 | 25.1 | N.A./15 | None | Linear PA |
| [70] 17' JSSC | 1.8 | 10/64 | 20.9 | N.A./15.2 | GSSG probe | Switched-Cap IQ Combining |
| [81] 13' ISSCC | 1.8 | 20/64 | 21.3 | N.A./18 | None | Linear PA |
| [138] 16' TMTT | 1.7 | 10/16 | 28.5 | N.A./37 | None | ET (Off-chip AM and PM) |
| [82] 13' TMTT | 1.85 | 10/16 | 26 | N.A./34.1 | Balun on PCB | ET (Off-chip AM and PM) |
| [84] 15' MWCL | 1.85 | 10/16 | 27.5 | N.A./42.4 | Balun on PCB | ET (Off-chip AM and PM) |
| [131] 15' JSSC | 1.9 | 20/16 | 23.4 | N.A./23.3 | None | Doherty |
| [83] 14' ISSCC | 1.95 | 20/16 | 25.6 | N.A./32.2 | None | ET (Off-chip AM and PM) |
| [52] 17' ISSCC | 1.95 | 20/16 | 27.7 | N.A./33 | On IPD | Linear PA |
| [80] 14' ESSCIRC | 2.4 | 20/64 | 22.8 | 21/N.A. | None | Out-phasing AM |
| [139] 17' ISSCC | 2.4 | 20/16 | 23.9 | N.A./36 | None | ET (Off-chip AM and PM) |
| [136] 17' ISSCC | 2.5 | 10/64 | 20.7 | N.A./15 | GSSG probe | Switched-Cap IQ Combining |
| This Work, Three Packages, LB | 1.2 | 20/64 | 24.5 | 32/29 | HDI PCB Interposer | Class D<sup>-1</sup> AM (On-chip PM) |
| This Work, Three Packages, MB | 1.8 |  | 24.0 | 34/30 |  |  |
| This Work, Three Packages, MB | 2.4 |  | 23.1 | 28/24 |  |  |
| This Work, DTx Reconfig. 2, LB | 0.85 |  | 23.6 | 28/25 |  |  |
| This Work, DTx Reconfig. 2, MB | 2.2 |  | 22.1 | 27/23 |  |  |

Table 4.4: Reported 10/20MHz LTE CMOS Txs with average power > 20 dBm (ordered by operating frequency)

### 4.10 Appendix

The schematics of the transmitter periphery circuits, including the receivers for the sinusoidal LO and clocks and the 2.5-Gb/s LVDS receivers, are illustrated in Fig. 4.61. With the two strong-arm comparators, the LVDS receiver is able to exploit both the rising and falling edges of the 1.25-GHz LVDS Rx clock. The comparators alternately output their bit decisions, which are then held and retimed by the subsequent latch and flip-flops. They are then presented to two 1-to-5 deserializers on the falling edge of the 1.25-GHz clock. The two deserializers are serial registers that take in input data at the rising edge of the 1.25-GHz clock and output 10 bits in parallel at the rising edge of the 250-MHz DeSer clock. The same receiver topology is used for both the LO and the LVDS Rx and DeSer clocks. The receiver is based on the CML-to-CMOS converter, composed of a two-stage differential amplifier followed by self-biased inverter chains. The common-mode rejection provided by the differential amplifier is critical for rejecting coupling from the transmitter output into the clock receivers through ground bounce.
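A behavioral sketch of the double-edge sampling and 1-to-5 deserialization is given below. It is a bit-level model only, not a description of the circuit timing: the assignment of even-indexed bits to the rising-edge comparator and the interleaved bit order inside each 10-bit word are assumptions made for illustration, and the function name is hypothetical.

```python
LVDS_CLOCK_HZ = 1.25e9          # LVDS Rx clock
DESER_RATIO = 5                 # each deserializer is 1-to-5

bit_rate = 2 * LVDS_CLOCK_HZ                    # both clock edges are used
word_rate = LVDS_CLOCK_HZ / DESER_RATIO         # DeSer clock


def ddr_deserialize(bits):
    """Split a serial bit list into 10-bit parallel words.

    Assumed mapping: even-indexed bits are captured on rising edges,
    odd-indexed bits on falling edges, each stream feeding one
    1-to-5 deserializer; the two 5-bit outputs are interleaved.
    """
    rising = bits[0::2]
    falling = bits[1::2]
    words = []
    for w in range(len(bits) // (2 * DESER_RATIO)):
        r = rising[DESER_RATIO * w: DESER_RATIO * (w + 1)]
        f = falling[DESER_RATIO * w: DESER_RATIO * (w + 1)]
        # Restore the original serial order within the word.
        words.append([b for pair in zip(r, f) for b in pair])
    return words


stream = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1,
          0, 1, 0, 0, 1, 1, 1, 0, 0, 1]
words = ddr_deserialize(stream)
print(bit_rate, word_rate, words)
```

The rate check mirrors the text: two edges of a 1.25-GHz clock give 2.5 Gb/s per lane, and ten bits per word give the 250-MHz parallel output rate.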



Figure 4.61: Schematics for the LO and (LVDS Rx and DeSer) clock receivers and the LVDS data receiver (for the AM and PM streams).

### Chapter 5

### mmW Constellation Formation Exploiting Combination Redundancy

### 5.1 Leakage Suppression in Conventional Array

In a conventional phased array, the radiated power in a specific direction can be eliminated by a simple sidelobe canceler [105]. Assuming a half-wavelength-spaced array with $N = 8$ elements is deployed along the y-axis, as illustrated in Fig. 5.1, the pattern synthesis is the most effective on the vertical plane with  $\phi = \pm 90^{\circ}$ . The standard spherical coordinates  $(R, \theta, \phi)$  are adopted here. With the input voltage at the *i*-th antenna element denoted by $v_i$ and the far-field electric field at  $(\theta, \phi = \pm 90^{\circ})$  denoted by  $E_{\theta}$  for the  $\theta$ -direction component and  $E_{\phi}$  for the  $\phi$ -direction component, we have

$$E_{\theta(\phi)} = \sum_{i=1}^{N} G_{i,\theta(\phi)} \times v_i \times e^{-j\pi \times i \times \sin(\theta)}, \tag{5.1}$$

where  $G_{i,\theta(\phi)}$  is the antenna transfer function, generally dependent on  $\theta$  and inversely proportional to the coupling distance R. The time-harmonic signals are described in the phasor domain. Assuming the antennas are identical, omni-directional and driven by the same input voltage, the normalized radiation, denoted by  $RAD_n$ , can be calculated by

$$RAD_n(\theta) = \frac{|\sin(N\pi \times \sin(\theta)/2)|}{|N\sin(\pi \times \sin(\theta)/2)|}. \tag{5.2}$$
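Eq. (5.2) can be evaluated numerically with the short sketch below, which locates the strongest sidelobe of a uniformly driven array. The function name `rad_n` and the search interval (between the first two pattern nulls at $\sin\theta = 2/N$ and $4/N$) are choices made for this illustration.

```python
import numpy as np


def rad_n(theta_deg, n):
    """Normalized pattern of Eq. (5.2) for an n-element, half-wavelength array."""
    u = np.pi * np.sin(np.radians(np.asarray(theta_deg, dtype=float)))
    num = np.abs(np.sin(n * u / 2))
    den = np.abs(n * np.sin(u / 2))
    # The 0/0 limit at broadside equals 1 (all elements add in phase).
    return np.where(den < 1e-12, 1.0, num / np.maximum(den, 1e-12))


n = 8
# Search between the first two nulls, located at sin(theta) = 2/n and 4/n.
theta = np.linspace(np.degrees(np.arcsin(2 / n)),
                    np.degrees(np.arcsin(4 / n)), 2001)
pattern = rad_n(theta, n)
sidelobe_db = 20 * np.log10(pattern.max())
sidelobe_theta = theta[pattern.argmax()]
print(f"{sidelobe_db:.1f} dB at {sidelobe_theta:.1f} deg")
```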

Eq. (5.2) predicts a significant first sidelobe with a magnitude only about 13 dB lower than the main-beam magnitude. Fig. 5.2 plots  $RAD_n$  for several array sizes. The simple sidelobe canceler works by tailoring the antenna input voltages. Assuming a spatial notch is desired at  $\theta = \theta_n$ , the corresponding antenna input voltages can be calculated by



Figure 5.1: Illustration of an 8-element array deployed along the y-axis.

$$[v_i]_{N \times 1} = [1]_{N \times 1} - [w_n]_{N \times 1} \times r_n, \tag{5.3}$$

where  $[w_n] = [e^{j\pi \times (N-1) \times \sin(\theta_n)} \; e^{j\pi \times (N-2) \times \sin(\theta_n)} \; \dots \; 1]^*$  and  $r_n = [w_n]^* \times [1]_{N \times 1}/N$ . With N = 8, Fig. 5.3 plots the relative radiation obtained with (5.3) for the spatial notch placed at  $\theta = 20^{\circ}$ ,  $30^{\circ}$ , and  $60^{\circ}$ . The corresponding antenna excitations  $([v_i]_{8\times 1})$  are also given in Fig. 5.3. Both fine-grained attenuations and phase shifts must be introduced to the elements if a spatial notch is to be synthesized at a  $\theta_n$  other than the notch angles of the uniform excitations (e.g.,  $\theta_n = 30^{\circ}$ ). The nonuniform excitations, with some element outputs lower than the peak value, result in a lower Tx EIRP, and the degradation is significant if the synthesized notch is close to  $\theta = 0$ . The Tx EIRP degradation and the maximum element attenuation are functions of  $\theta_n$  and are plotted in Fig. 5.4 for N = 6, 8, and 20. EIRP degradation of around 2 dB occurs when the null is placed on the first sidelobe of the uniformly-driven array (i.e.,  $\theta_n = 21^{\circ}$  for N = 8).
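The notch synthesis of (5.3) can be checked with the following sketch. Two assumptions are made: the elements are indexed from 0 to N-1 instead of 1 to N (this only changes a common phase factor), and the EIRP degradation is taken as the broadside field of the excitation, scaled so that its largest element magnitude equals 1, relative to N uniformly driven elements. The exact definition behind Fig. 5.4 may differ, and the function names are hypothetical.

```python
import numpy as np


def steering(theta_deg, n):
    """Per-element phase factors of Eq. (5.1), with elements indexed 0..n-1."""
    i = np.arange(n)
    return np.exp(-1j * np.pi * i * np.sin(np.radians(theta_deg)))


def notch_excitation(theta_n_deg, n):
    """Excitations of Eq. (5.3): uniform drive minus its projection on w_n."""
    w = np.conj(steering(theta_n_deg, n))
    r = np.vdot(w, np.ones(n)) / n          # r_n = w^H [1] / N
    return np.ones(n) - w * r


def field(v, theta_deg):
    """Far field (up to a common factor) radiated toward theta_deg."""
    return np.sum(v * steering(theta_deg, len(v)))


def eirp_degradation_db(v):
    """Broadside loss versus uniform drive, peak element normalized to 1 (assumed)."""
    v_norm = v / np.abs(v).max()
    return -20 * np.log10(np.abs(np.sum(v_norm)) / len(v))


v20 = notch_excitation(20.0, 8)
print(abs(field(v20, 20.0)))                       # null depth at the notch
print(eirp_degradation_db(notch_excitation(21.0, 8)))
```

For a notch angle where the uniform array already has a null (such as 30° with N = 8), `r` evaluates to zero and the excitation stays uniform, which is consistent with the remark above.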

Classic linear array synthesis such as the Dolph-Chebyshev technique [[106], Sec. 5] can suppress the sidelobe levels to a given value. Although additional phase shifts are not required, the element output magnitudes follow the coefficients in the Chebyshev polynomials and are nonuniform, which also entails EIRP degradation. Fig. 5.5 plots the synthesized radiation patterns of an 8-element array with multiple levels of sidelobe suppression. The antenna excitations are also listed and the corresponding EIRP degradation can be as high as 4 dB.
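A Dolph-Chebyshev taper and its EIRP penalty can be sketched as below. The weights are obtained by sampling the Chebyshev polynomial $T_{N-1}(x_0\cos(u/2))$ at N inter-element phases u and solving the resulting linear system, which is one of several equivalent ways to compute them and is not necessarily the procedure used for Fig. 5.5. The EIRP loss again assumes that the largest element output is normalized to 1.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def dolph_chebyshev(n, sll_db):
    """Real, symmetric weights giving equal sidelobes sll_db below the main beam."""
    r = 10 ** (sll_db / 20)                      # main-beam to sidelobe voltage ratio
    x0 = np.cosh(np.arccosh(r) / (n - 1))
    k = np.arange(n)
    u = 2 * np.pi * k / n                        # sampled inter-element phases
    pos = k - (n - 1) / 2                        # element positions, array-centered
    target = cheb.chebval(x0 * np.cos(u / 2), [0] * (n - 1) + [1])
    a = np.exp(1j * np.outer(u, pos))
    w = np.linalg.solve(a, target.astype(complex)).real
    return w / w.max()


def pattern_db(w, u):
    """Array pattern versus inter-element phase u, normalized to broadside."""
    pos = np.arange(len(w)) - (len(w) - 1) / 2
    af = np.abs(np.exp(1j * np.outer(u, pos)) @ w)
    return 20 * np.log10(np.maximum(af, 1e-12) / np.abs(w.sum()))


n, sll_db = 8, 30.0
w = dolph_chebyshev(n, sll_db)
u = np.linspace(0, np.pi, 4001)
x0 = np.cosh(np.arccosh(10 ** (sll_db / 20)) / (n - 1))
sidelobes = pattern_db(w, u)[x0 * np.cos(u / 2) <= 1.0]
eirp_loss_db = -20 * np.log10(w.sum() / (n * w.max()))
print(np.round(w, 3), sidelobes.max(), eirp_loss_db)
```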



Figure 5.2: Array pattern along  $\phi = 90^{\circ}$  with uniform antenna excitations.



Figure 5.3: Array pattern with sidelobe canceller.



Figure 5.4: Array EIRP degradation with sidelobe canceller.



Figure 5.5: Dolph–Tschebyscheff array pattern and element excitations.

### 5.2 Combination Redundancy and Leakage Suppression in a Digitally-Modulated Phased Array

#### **Constellation Formation and Combination Redundancy**

Neither the sidelobe canceler nor the Dolph-Chebyshev technique can be applied to existing digitally-modulated phased arrays, because the mmW-DAC output magnitude and phase resolution are insufficient. As a result, a new Tx spatial-filtering technique is required. In this work, the Tx elements can output QPSK symbols (1, j, -1, -j) and can also be turned off (0) to save dc power. The element output constellation is illustrated in Fig. 5.6(a), along with the spatially-combined constellation of two elements whose outputs are allowed to be nonidentical (nonidentical element contribution). The total numbers of code combinations and spatial symbols are  $5^2$  and 13, respectively. Therefore, redundancy exists when forming some spatial symbols. For example, the spatial symbol -1+j can be achieved by two code combinations, with the local symbol from the first element at either -1 or j. The spatial symbols with the highest power (i.e., 2, 2j, -2, -2j) each have only one combination, with identical element outputs.

With eight elements in the phased array, there are 145 spatial symbols, as illustrated in Fig. 5.6(b), from the 5<sup>8</sup> code combinations. Many QPSK constellations can be extracted from the 145 symbols, and the one with the highest EIRP, (8, 8j, -8, -8j), does not have any combination redundancy. Alternatively, the number of combinations can be maximized by selecting the QPSK constellation (4+4j, -4+4j, -4-4j, 4-4j), where each spatial symbol operates four of the eight elements with quadrature LO and therefore has 70 code combinations. Since the local elements are operated at the maximum power, the total Tx output power and package heat dissipation remain the same across the combinations. This redundancy-rich constellation has an EIRP 3 dB lower than the maximum EIRP. The third QPSK constellation (1+7j, -7+1j, -1-7j, 7-1j) has an EIRP only 1.1 dB lower than the peak value, and each symbol has 8 code combinations.
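The symbol and combination counts quoted above can be reproduced with a short counting sketch. Each element contributes one of five integer offsets (off, 1, j, -1, -j), and the number of code combinations reaching each spatial symbol is accumulated element by element. The function names are hypothetical, and the EIRP back-off is computed simply from the symbol magnitude relative to all elements adding in phase.

```python
import math
from collections import Counter

# Element outputs as (real, imag): off, 1, j, -1, -j.
ELEMENT_SYMBOLS = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]


def count_combinations(n_elements):
    """Map each spatial symbol (x, y) to its number of code combinations."""
    counts = Counter({(0, 0): 1})
    for _ in range(n_elements):
        nxt = Counter()
        for (x, y), c in counts.items():
            for dx, dy in ELEMENT_SYMBOLS:
                nxt[(x + dx, y + dy)] += c
        counts = nxt
    return counts


def eirp_backoff_db(symbol, n_elements):
    """EIRP reduction of a spatial symbol relative to all elements in phase."""
    return -20 * math.log10(math.hypot(*symbol) / n_elements)


two = count_combinations(2)
eight = count_combinations(8)
print(len(two), two[(-1, 1)])                        # spatial symbols, combos for -1+j
print(len(eight), eight[(8, 0)], eight[(4, 4)], eight[(1, 7)])
print(eirp_backoff_db((4, 4), 8), eirp_backoff_db((1, 7), 8))
```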

On the other hand, only six elements are needed for the array to synthesize a more complicated 16QAM constellation. The 85 available spatial symbols are shown in Fig. 5.6(c), and the two possible 16QAM realizations are highlighted. The constellation with the highest EIRP is  $(2, 6, 4+2j, 2+4j) \times (1, j, -1, -j)$ , and similarly the high-power symbols (6, 6j, -6, -6j) each have only one combination. Multiple combinations exist for the other 12 low-power symbols. To increase the number of combinations, the second 16QAM constellation at  $(1+j, 3+j, 1+3j, 3+3j) \times (1, j, -1, -j)$  can be adopted at the cost of a 3-dB reduction in EIRP. In this case, the high-power symbols have 20 combinations.
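The six-element case is small enough to verify by brute force. The sketch below enumerates all $5^6$ code combinations, builds the two 16QAM constellations by rotating their four first-listed symbols by (1, j, -1, -j), and reports the least redundant symbol of each. The helper `rotate` is a hypothetical name introduced for this illustration.

```python
from collections import Counter
from itertools import product

# Element outputs as (real, imag): off, 1, j, -1, -j.
ELEMENT_SYMBOLS = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]

# Exhaustively enumerate every code combination of six elements.
counts = Counter()
for combo in product(ELEMENT_SYMBOLS, repeat=6):
    counts[(sum(s[0] for s in combo), sum(s[1] for s in combo))] += 1


def rotate(points):
    """Multiply each symbol x + jy by 1, j, -1, and -j."""
    out = []
    for x, y in points:
        out += [(x, y), (-y, x), (-x, -y), (y, -x)]
    return out


high_eirp = rotate([(2, 0), (6, 0), (4, 2), (2, 4)])
high_redundancy = rotate([(1, 1), (3, 1), (1, 3), (3, 3)])

print(len(counts))                                   # available spatial symbols
print(min(counts[p] for p in high_eirp))             # least redundant symbol
print(min(counts[p] for p in high_redundancy))
```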

The numbers of combinations for the symbols in the mentioned spatially-combined QPSK and 16QAM constellations are summarized in Table 5.1. When the array synthesizes the 16QAM symbols, some combinations turn off a part of the Tx elements and are more efficient than the others. For example, for the symbol +2 in the high-EIRP operation, operating only



Figure 5.6: (a) Element output symbols and the spatially-combined symbols with two elements. (b) Spatial symbols with 8 elements and three QPSK constellations. (c) Spatial symbols with 6 elements and the only two possible 16QAM constellations.

QPSK synthesized by 8 QPSK-OOK elements:

| High EIRP: QPSK symbol | High EIRP: # of selections | Medium: QPSK symbol | Medium: # of selections | High redundancy: QPSK symbol | High redundancy: # of selections |
|------|------|------|------|------|------|
| +8   | 1 | +1+7j | 8 | +4+4j | 70 |
| +8j  | 1 | -7+1j | 8 | -4+4j | 70 |
| -8   | 1 | -1-7j | 8 | -4-4j | 70 |
| -8j  | 1 | +7-1j | 8 | +4-4j | 70 |

16QAM synthesized by 6 QPSK-OOK elements:

| High EIRP: 16QAM symbol | # of sel. | High redundancy: 16QAM symbol | # of sel. |
|------|------|------|------|
| (+2) × (1, j, -1, -j)    | 480 (15\*) | (+1+1j) × (1, j, -1, -j) | 690 (30\*) |
| (+4+2j) × (1, j, -1, -j) | 15 (15\*)  | (+3+1j) × (1, j, -1, -j) | 150 (60\*) |
| (+2+4j) × (1, j, -1, -j) | 15 (15\*)  | (+1+3j) × (1, j, -1, -j) | 150 (60\*) |
| (+6) × (1, j, -1, -j)    | 1 (1\*)    | (+3+3j) × (1, j, -1, -j) | 20 (20\*)  |

\*efficient combination

Table 5.1: Combination redundancy for the QPSK/QAM spatial symbols.

two elements (e.g.,  $v_i = [1; 1; 0; 0; 0; 0]$ ) is more efficient than operating more elements (e.g.,  $v_i = [1; 1; 1; -1; 1; -1]$ ).
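The 16QAM entries of Table 5.1, including the starred efficient counts, can be reproduced by also tracking how many elements are switched on. The sketch below assumes that an "efficient" combination is one that uses the fewest active elements for its symbol, which matches the starred values; the function names are illustrative.

```python
from collections import Counter

# Element outputs as (real, imag) integer pairs: off, 1, j, -1, -j.
ALPHABET = ((0, 0), (1, 0), (0, 1), (-1, 0), (0, -1))


def counts_by_active(num_elements):
    """Return {(real, imag, active elements): number of code combinations}."""
    counts = Counter({(0, 0, 0): 1})
    for _ in range(num_elements):
        updated = Counter()
        for (re, im, on), n in counts.items():
            for d_re, d_im in ALPHABET:
                is_on = 0 if (d_re, d_im) == (0, 0) else 1
                updated[(re + d_re, im + d_im, on + is_on)] += n
        counts = updated
    return counts


TABLE = counts_by_active(6)


def redundancy(symbol):
    """Return (all combinations, combinations with the fewest active elements)."""
    re, im = symbol
    total = sum(n for (r, i, _), n in TABLE.items() if (r, i) == (re, im))
    efficient = TABLE[(re, im, abs(re) + abs(im))]
    return total, efficient


for symbol in [(2, 0), (4, 2), (6, 0), (1, 1), (3, 1), (3, 3)]:
    print(symbol, redundancy(symbol))
```

A symbol a+bj needs at least |a|+|b| active elements, so the efficient count is read directly from the entry with that many elements on.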

#### **Array Efficiency**

The peak efficiency of a phased array whose antennas are driven by digitally-modulated Tx elements is higher than that of a conventional phased array driven by linear Tx elements, and the back-off efficiency of a digital array can be improved by turning off some elements to save dc power. Table 5.2 compares the array efficiency for 16QAM achieved by a 6-element conventional phased array with linear Tx elements and by the proposed digitally-modulated elements that synthesize the two constellations described in Fig. 5.6(c). The maximum-power constellation has symbols  $(2, 6, 4+2j, 2+4j) \times (1, j, -1, -j)$ , and the redundancy-rich constellation has symbols  $(1+j, 3+j, 1+3j, 3+3j) \times (1, j, -1, -j)$ . The redundancy-rich constellation synthesized by the digital array has an EIRP degradation of 3 dB but can potentially achieve a low spatial leakage, and the efficiency of this case is compared to the conventional phased array operated with 3-dB power back-off, which could be close to the

Maximum EIRP:

| Spatial symbol | EIRP | Conventional: total Tx power | Conventional: $P_{dc}$ (Class-A) | Conventional: $P_{dc}$ (Class-B) | Digital: total Tx power | Digital: $P_{dc}$ |
|------|------|------|------|------|------|------|
| (+2)×(1, j, -1, -j)    | 4x  | 0.67 | 6y | 2y    | 2 | 2y/z |
| (+4+2j)×(1, j, -1, -j) | 20x | 3.33 | 6y | 4.5y  | 6 | 6y/z |
| (+2+4j)×(1, j, -1, -j) | 20x | 3.33 | 6y | 4.5y  | 6 | 6y/z |
| (+6)×(1, j, -1, -j)    | 36x | 6    | 6y | 6y    | 6 | 6y/z |
| Average                | 20x | 3.33 | 6y | 4.25y | 5 | 5y/z |

| Array (maximum EIRP) | EIRP/$P_{dc}$ | Total Tx power/$P_{dc}$ |
|------|------|------|
| Conventional: Class-A | 20x/6y    | 3.33/6y    |
| Conventional: Class-B | 20x/4.25y | 3.33/4.25y |
| Digital               | 20xz/5y   | z/y        |

EIRP 3-dB back-off:

| Spatial symbol | EIRP | Conventional: total Tx power | Conventional: $P_{dc}$ (Class-A) | Conventional: $P_{dc}$ (Class-B) | Digital (efficient code): total Tx power | Digital (efficient code): $P_{dc}$ | Digital (inefficient code): total Tx power | Digital (inefficient code): $P_{dc}$ |
|------|------|------|------|------|------|------|------|------|
| (+1+j)×(1, j, -1, -j)  | 2x  | 0.33 | 6y | 1.41y | 2 | 2y/z | 6 | 6y/z |
| (+3+1j)×(1, j, -1, -j) | 10x | 1.67 | 6y | 3.16y | 4 | 4y/z | 6 | 6y/z |
| (+1+3j)×(1, j, -1, -j) | 10x | 1.67 | 6y | 3.16y | 4 | 4y/z | 6 | 6y/z |
| (+3+3j)×(1, j, -1, -j) | 18x | 3    | 6y | 4.24y | 6 | 6y/z | 6 | 6y/z |
| Average                | 10x | 1.67 | 6y | 3.00y | 4 | 4y/z | 6 | 6y/z |

| Array (3-dB back-off) | EIRP/$P_{dc}$ | Total Tx power/$P_{dc}$ |
|------|------|------|
| Conventional: Class-A     | 10x/6y  | 1.67/6y |
| Conventional: Class-B     | 10x/3y  | 1.67/3y |
| Digital (efficient code)   | 10xz/4y | z/y     |
| Digital (inefficient code) | 10xz/6y | z/y     |

Table 5.2: 16QAM efficiency achieved by conventional array and the proposed digital array.

case when the conventional leakage suppression methods are involved. The back-off efficiency of the conventional Tx elements is assumed to follow either the Class-A or the Class-B back-off characteristic, that is, with a constant dc power or with a dc power proportional to the output voltage, respectively.

Without loss of generality, the maximum element output power is set at 1 and the corresponding spatial power (EIRP) is set at x. The element power consumption at the peak power is set at y for the linear Tx element and y/z for the more efficient digitally-modulated Tx element (z > 1). Many reported mmW PAs have a peak efficiency around twice their OP1dB efficiency (z = 2) [140–143].

The symbol-wise and average spatial power, total Tx output power, and total dc power consumption  $(P_{dc})$  are parametrized in Table 5.2. To generate the peak-EIRP constellation, the digital array and the conventional Class-A and Class-B arrays have array efficiencies of 4zx/y, 3.33x/y, and 4.71x/y, respectively. This array efficiency, denoted by  $EFF_1$ , is defined by EIRP/ $P_{dc}$ . A higher  $EFF_1$  can be achieved by the digital array compared to the conventional Class-B array because z > 1.18 can be expected. The digital array is assumed to synthesize the symbols (2, 2j, -2, -2j) with only two elements. The digital array performs even better if the second efficiency metric  $(EFF_2)$  is adopted, which is defined by the total Tx output power divided by  $P_{dc}$ . A higher  $EFF_2$  indicates that a smaller portion of the dc power is converted to package heat dissipation.

In generating the 16QAM constellation with 3-dB power back-off, the digital array can select either efficient or inefficient combinations for the low-power symbols. The highest  $EFF_1$  (EIRP/ $P_{dc}$ ) is achieved with all the symbols synthesized by efficient combinations, while  $EFF_2$  does not depend on the selected combinations. The digital array always has a better  $EFF_2$  than a conventional Class-B array. With z = 2 and efficient combinations, the digital array also has a higher  $EFF_1$  (10xz/4y = 5x/y versus 10x/3y ≈ 3.33x/y for Class-B in Table 5.2), and its  $EFF_1$  remains comparable to the Class-B value (10xz/6y ≈ 3.33x/y) even if all the low-power spatial symbols take inefficient combinations.
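The efficiency entries of Table 5.2 follow from a few lines of arithmetic. The sketch below normalizes x = y = 1, models Class-A dc power as constant and Class-B dc power as proportional to the element output voltage, and takes an efficient digital code to be one with the fewest active elements. The function names are illustrative, and small differences from the table come from its rounded entries.

```python
from statistics import mean

N = 6  # number of array elements

MAX_EIRP = [2 + 0j, 4 + 2j, 2 + 4j, 6 + 0j]
BACKOFF = [1 + 1j, 3 + 1j, 1 + 3j, 3 + 3j]


def conventional(symbols, pa_class):
    """Return (EFF1 in units of x/y, EFF2 in units of 1/y) for linear elements."""
    eirp = mean(abs(s) ** 2 for s in symbols)
    amplitude = [abs(s) / N for s in symbols]  # element voltage over its peak
    tx_power = mean(N * a ** 2 for a in amplitude)
    p_dc = mean(N * (1.0 if pa_class == "A" else a) for a in amplitude)
    return eirp / p_dc, tx_power / p_dc


def digital(symbols, z, efficient=True):
    """Return (EFF1, EFF2) for QPSK-OOK elements that each draw y/z when on."""
    eirp = mean(abs(s) ** 2 for s in symbols)
    active = [abs(s.real) + abs(s.imag) if efficient else N for s in symbols]
    tx_power = mean(active)
    p_dc = mean(a / z for a in active)
    return eirp / p_dc, tx_power / p_dc


for name, symbols in [("max EIRP", MAX_EIRP), ("3-dB back-off", BACKOFF)]:
    print(name)
    print("  Class-A:", [round(v, 2) for v in conventional(symbols, "A")])
    print("  Class-B:", [round(v, 2) for v in conventional(symbols, "B")])
    print("  digital, efficient, z=2:", [round(v, 2) for v in digital(symbols, 2.0)])
    print("  digital, inefficient, z=2:",
          [round(v, 2) for v in digital(symbols, 2.0, efficient=False)])
```

In this sketch the back-off values for the inefficient digital code and the Class-B array come out nearly equal at z = 2, while the efficient digital code is clearly higher.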

#### Redundancy Exploitation for Low Spatial Leakage

First, the redundancy-rich QPSK constellation with spatial symbols of  $(4+4j) \times (1, j, -1, -j)$  is synthesized with 8 elements. From Table 5.1, each QPSK symbol has 70 code combinations. Once the combination with the lowest leakage has been found for one QPSK symbol, the optimal output combinations for the other three QPSK symbols are obtained by simultaneously phase-shifting all the element outputs by 90°, 180°, and 270°, which results in the same leakage. Fig. 5.7 plots the calculated radiation at  $\phi = 90^{\circ}$  for 3 of the 70 combinations that synthesize the spatial symbol 4+4j. The combination  $v_i = [j; j; 1; j; 1; j; 1; 1]$  has the lowest leakage at  $\theta = 10^{\circ}$ , 20°, and 40°, and the combination  $v_i = [1; 1; j; j; 1; 1; 1; j; 1; 3]$  is the optimal combination if a low radiation is desired at  $\theta = 50^{\circ}$ . Relative radiation higher than 0 dB can be observed, which is possible because the maximum array EIRP is 3 dB higher than that achieved at  $\theta = 0^{\circ}$ .
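The search over the 70 combinations can be sketched as below. This is not the calculation behind Fig. 5.7: it assumes an idealized uniform linear array with half-wavelength pitch and isotropic elements, and the function names are hypothetical, so the selected code words need not match the ones quoted above.

```python
import cmath
import math
from itertools import combinations

N = 8  # number of array elements


def array_factor(v, theta_deg, spacing_wl=0.5):
    """Far-field sum of element outputs v at zenith angle theta (ideal linear array)."""
    step = 2 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
    return sum(vi * cmath.exp(-1j * step * i) for i, vi in enumerate(v))


def codes_for_4_plus_4j():
    """Yield every code word with four elements at 1 and four at j."""
    for ones in combinations(range(N), N // 2):
        yield tuple(1 if i in ones else 1j for i in range(N))


def leakage_db(v, theta_deg):
    """Radiation at theta relative to the broadside (theta = 0) radiation."""
    ratio = abs(array_factor(v, theta_deg)) / abs(array_factor(v, 0.0))
    return 20 * math.log10(max(ratio, 1e-12))  # guard against log(0) at a null


def best_code(theta_deg):
    """Return the code word with the lowest radiation toward theta."""
    return min(codes_for_4_plus_4j(), key=lambda v: abs(array_factor(v, theta_deg)))


for theta in (10, 20, 40, 50):
    code = best_code(theta)
    print(theta, code, round(leakage_db(code, theta), 1))
```

Multiplying every entry of a code word by j rotates the spatial symbol by 90° without changing the magnitude of the array factor, which is why one search serves all four QPSK symbols.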

The lowest leakages that can be achieved at each zenith angle ( $\theta$ ) via redundancy exploitation are calculated and plotted in Fig. 5.8. The leakage suppression can be better than -25 dBc for most angles, even at  $\theta = 7^{\circ}$ . The radiation pattern of a uniformly-excited 8-element array, previously shown in Fig. 5.2, is also included. The comparison shows the



Figure 5.7: Radiation patterns for 3 of the 70 combinations that synthesize spatial symbol 4+4j.

digitally-modulated array utilizing combination redundancy can improve the spatial leakage compared to the radiation generated by uniformly driving the antennas.

Secondly, the 16QAM constellation with spatial symbols of  $(1+1j, 3+1j, 1+3j, 3+3j) \times (1, j, -1, -j)$  is synthesized with six Tx elements. The optimal combinations must be located and adopted for the four symbols in the first quadrant. According to Table 5.1, the spatial symbols 1+1j, 3+1j, 1+3j, and 3+3j have 690, 150, 150, and 20 combinations, respectively, and among them 30, 60, 60, and 20 combinations are efficient. For a special case that minimizes the array radiation at  $\theta = 20^{\circ}$ , the optimal combinations and the corresponding radiation patterns for the four symbols are calculated and plotted in Fig. 5.9. In this special case, the leakages of the symbols 3+1j and 1+3j cannot be suppressed as well as those of the other two symbols, and they dominate the leakage power. The average leakage is -21.8 dB, defined by the average leakage power of the four symbols divided by that at  $\theta = 0^{\circ}$ . The optimal combination for the symbol 1+1j uses  $v_i = [1; -1; 1; j; -j; j]$  and is an inefficient combination.

Fig. 5.10 plots the lowest radiations achieved via redundancy exploitation for two cases. The first case involves all the combinations and the second case involves only the



Figure 5.8: Array pattern with uniform excitation (solid line) and the lowest radiation exploiting the combination redundancy (symbol line).



Figure 5.9: Radiation patterns for the four 16QAM symbols that use the optimal combinations for the lowest radiation at  $\theta = 20^{\circ}$ .

efficient combinations. As expected, the leakage suppression is less effective when only the efficient combinations are used. For most  $\theta$  angles from 10° to 60°, the average leakage can be suppressed to -25 dBc with all the combinations and to -20 dBc with only the efficient combinations. The radiation pattern of a uniformly-excited 6-element array, previously shown in Fig. 5.2, is also included. This is the radiation pattern with all the elements outputting identical 16QAM waveforms, which improves the EIRP by 3 dB but is not supported by the QPSK-OOK elements in this work. Alternatively, the high-EIRP 16QAM spatial waveform, illustrated in Fig. 5.6(c), can be formed with nonidentical element contributions, and multiple combinations exist for the low-power symbols. Therefore, the spatial leakage for this high-EIRP 16QAM constellation can also be reduced to some extent by exploiting the very limited redundancy, and the minimum radiation has been included in Fig. 5.10. The leakage can be suppressed substantially in most directions by adopting the low-EIRP constellation. Some exceptions exist; for example, the high-EIRP constellation can achieve a very low leakage at  $\theta = 20^{\circ}$  and should be adopted if a low radiation is desired there. Finally, it can be observed in Fig. 5.10 that the leakage suppression is not effective at  $\theta = 42^{\circ}$  for the three cases having nonidentical element outputs. With  $G_{i,\theta(\phi)}$  in (5.1) set at unity and  $\theta$  at 42°,  $E_{\theta(\phi)}$  is reduced to  $(v_1 + v_4) + (v_2 + v_5)e^{-j2\pi/3} + (v_3 + v_6)e^{-j4\pi/3}$ . Although  $E_{\theta(\phi)} = 0$  can be obtained for the spatial symbol "3+3j" with  $v_i = [1; 1; 1; j; j; j]$ , a low radiation cannot be achieved for other spatial symbols such as "3+1j" and "4+2j".
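The behavior at  $\theta = 42^{\circ}$  can be checked numerically from the reduced expression above. The sketch below evaluates that expression for every six-element code word forming a given spatial symbol and reports the smallest field magnitude; the function names are illustrative.

```python
import cmath
import math
from itertools import product

ALPHABET = (0, 1, 1j, -1, -1j)     # off, 1, j, -1, -j
W = cmath.exp(-2j * math.pi / 3)   # phase step between neighboring elements


def reduced_field(v):
    """Reduced field at theta = 42 deg for a six-element code word v."""
    return (v[0] + v[3]) + (v[1] + v[4]) * W + (v[2] + v[5]) * W ** 2


def lowest_field(symbol):
    """Smallest |E| over all code words whose outputs add up to `symbol`."""
    return min(
        abs(reduced_field(v))
        for v in product(ALPHABET, repeat=6)
        if sum(v) == symbol
    )


print(abs(reduced_field((1, 1, 1, 1j, 1j, 1j))))  # numerically zero
for symbol in (3 + 3j, 3 + 1j, 4 + 2j):
    print(symbol, lowest_field(symbol))
```

A null requires the three pair sums  $v_1 + v_4$ ,  $v_2 + v_5$ , and  $v_3 + v_6$  to be equal, which is possible for 3+3j (each pair equal to 1+j) but not for 3+1j or 4+2j, because neither is three times a sum of two element outputs.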

### 5.3 mmW Tx Element

#### Circuit Design

This section introduces the Tx element used in the array implementation. Although the element has been reported in [104], necessary results are reorganized and presented in this work with some new simulated results.

The Tx block diagram is illustrated in Fig. 5.11(a) and the chip photograph is shown in Fig. 5.11(b). The chip size is  $1.45 \times 0.87 \text{ mm}^2$ . The Tx element is fabricated in TSMC 28-nm bulk CMOS and is composed of a 2-Gb/s QPSK modulator in series with a 15.7-dBm E-band PA. The PA and modulator can be characterized independently with GSG pads at the modulator output and the PA input. In such a sub-element characterization, the interconnection between the two components is removed by focused ion beam (FIB) post-processing.

Fig. 5.12 illustrates the schematic for Tx LO generation and distribution. The mmW LO signal is generated by a frequency doubler driven by an injection-locked frequency tripler (ILFT). The ILFT is based on a 40-GHz LC-tank VCO and the injection is the third-harmonic current from a differential common-source pair. The LO phase can be shifted by adjusting the VCO varactor bias voltage, and this functionality is important for aligning multiple Tx elements. LO buffer amplifiers follow the doubler and amplify the differential LO to a total power of 6 dBm (in simulation). The in-phase and quadrature LO waveforms driving



Figure 5.10: Array pattern with uniform excitation (solid line) and the lowest radiations exploiting the combination redundancy (symbol lines).

the QPSK modulator are generated by a differential quadrature hybrid with a design methodology similar to that reported in [144]. The quadrature hybrid and the following matching transformers have 5-dB loss, and the total LO power delivered into the QPSK modulator is 1 dBm.

The ILFT locking range depends on the injection currents and therefore on the ac voltage swings at the gates of the injection common-source pair, which is matched to single-ended 50  $\Omega$  by a transformer and matching capacitors. The simulated LO-port input S11 is plotted in Fig. 5.13(a) and is centered at 13.2 GHz (79.2-GHz Tx output frequency). However, the bondwire inductance of around 1.5 nH is not compensated at the LO port, which degrades the input matching and increases the required input LO power. The simulated voltage swings at the gates of the injection cells are plotted in Fig. 5.13(b). With the bondwire, the simulated voltage swings are around 0.35 V with a 5-dBm/13-GHz LO source, and the simulated output locking range is around 300 MHz. The simulated locking ranges of the Tx element are plotted in Fig. 5.13(c) under different LO source powers and Tx free-running frequencies. A higher LO power is required to operate the Tx element at a lower frequency to compensate for the input mismatch.



Figure 5.11: (a) Transmitter block diagram. (b) Die photograph. The chip size is  $1.45 \times 0.87 \text{ mm}^2$ .



Figure 5.12: Schematic for the LO generation and distribution.

The schematic for the QPSK modulator is illustrated in Fig. 5.14. Two double-balanced mixers (with in-phase and quadrature LO) are combined in the current domain. The mixer LO ports are driven by the quadrature LO and the baseband ports are digitally-modulated by the 2-bit phase codeword. The I/Q mismatch can be corrected by the digitally-modulated auxiliary vector modulator. The auxiliary modulator is put in parallel with the main modulator and has a similar structure. Its LO ports are also driven by the in-phase and quadrature LO, and the baseband ports and the tail currents, controlling respectively the IQ polarity and current magnitude of the correcting current, are controlled by the memoryless pre-distortion information stored in the on-chip look-up table (LUT) [145]. The digital codewords are also the input of the LUT. A three-stage RF buffer amplifier follows the modulator and the simulated output power is 5 dBm.

The PA schematic is shown in Fig. 5.15. The three-stage CMOS PA is designed as a hybrid linear/digitally-modulated PA. The first two stages are realized by minimum-length (28-nm) common-source devices with a supply voltage of 1 V and total device peripheries of 80  $\mu$m and 160  $\mu$m, respectively. The PA output stage is realized by two differential cascode devices combined by a distributed active transformer (DAT), and the total common-source device size is 448  $\mu$m. This power-combining technique has been widely adopted in mmW PAs [140, 142]. The supply voltage,  $V_{cas,supply}$ , can be conveniently provided from the center tap of the DAT transformer. The cascode cells are composed of three subcells put in parallel, with relative device sizes of 1, 2, and 4, to potentially provide a 3-bit magnitude resolution, although in this work the OOK-QPSK Tx element only needs 1-bit magnitude resolution. The cascode gates are driven with high-state and low-state voltages of  $V_{up}$  and  $V_{dn}$ , respectively. When the standalone PA is characterized as a linear PA, all the cascode cells are turned on with  $V_{up}$  of 1.6 V and  $V_{cas,supply}$  of 1.85 V. At 75 GHz, the



Figure 5.13: (a) Input return loss of the LO port. (b) Injection pair gate swings with 5-dBm LO input source power. (c) Tx locking range.




Figure 5.14: Schematic for the QPSK modulator.

simulated PA S21, peak power, and PAE are 16 dB, 17 dBm, and 13%, respectively. When the PA is operated as a digitally-modulated magnitude modulator,  $V_{cas,supply}$  is reduced to 1.6 V to relax the drain-gate stress of the off cascode cells with  $V_{dn}$  of 0.5 V.

Finally, the Tx element has a 5-Gb/s LVDS Rx that includes a preamplifier, a strong-arm comparator, and a 1-to-5 DFF-based deserializer. The block diagram is illustrated in Fig. 5.16. The received 3-bit AM code, the 2-bit PM code, and the corresponding pre-loaded 32 sets of IQ compensation codes are retimed by the negative edge of the 1-GHz clock and sent to the modulator and the PA. The clock Rx is a two-stage differential pair followed by a self-biased inverter chain. Although realized in a different CMOS process, the data and clock Rx designs are similar to those used in [59].

#### Measurement

The Tx output frequency is determined by the VCO tuning range (TR). The total capacitance of the VCO LC-tank is controlled by a 7-state capacitor bank and a varactor. The measured free-running Tx LO frequency is plotted in Fig. 5.17 versus the discrete VCO states and the continuous varactor tuning voltage. The measured TR is 14% from 73 to 84 GHz, very close to the simulated value of 15%.
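The quoted tuning range is consistent with the usual fractional definition, namely the frequency span divided by the center frequency. The following one-line check assumes that definition, which the text does not state explicitly; the function name is illustrative.

```python
def tuning_range(f_low_ghz, f_high_ghz):
    """Fractional tuning range: frequency span over center frequency."""
    center = (f_high_ghz + f_low_ghz) / 2
    return (f_high_ghz - f_low_ghz) / center


tr = tuning_range(73.0, 84.0)
print(f"{100 * tr:.1f} %")  # about 14 %
```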



Figure 5.15: PA Schematic.



Figure 5.16: LVDS Rx schematic.



Figure 5.17: Measured Tx output free-running frequency.

The Tx output phase noise (PN) is indirectly measured at 3 GHz with the aid of an E-band down-conversion module. The phase noise of the free-running 76.3-GHz Tx (VCO capacitor bank state = 5 and  $V_{tune} = 1.12$  V) is -92 dBc/Hz at a 1-MHz offset. The calculated FOM, defined by PN - 20log(Freq./$\Delta f$) + 10log($P_{dc}$/1 mW), is -177 dB with the VCO power consumption at 17 mW. The measured free-running Tx noise is plotted in Fig. 5.18 and follows the 20-dB/decade slope.

The measured Tx noise with the ILFT locked to 76.3 GHz by a 12.72-GHz LO source, with the power delivered into the chip estimated at 5 dBm, is also plotted in Fig. 5.18. The PCB is driven by a 15-dBm signal generator to compensate for the estimated PCB trace loss of 6 dB and bondwire mismatch loss of 4 dB. When the VCO free-running frequency is aligned to the injection frequency (12.72 GHz), the measured locking range is about 250 MHz, and the modulator phase noise at offsets below 100 MHz is suppressed significantly, to the level of the source noise. The accumulated Tx noise from 0.1 MHz to 1 GHz is lower than -40 dBc, which should be able to support a wide communication bandwidth higher than 1 GHz.

In addition, the 250-MHz locking range, together with the varactor DACs (on PCB) with a voltage resolution around 1.5 mV, can provide a fine phase resolution for the multiple Tx elements to be aligned properly in the phased array. With the injection frequency fixed at 12.72 GHz, the VCO stays locked within an 80-mV  $V_{tune}$  window ranging from 1.08 to 1.16 V, which corresponds to a substantial LO phase shift of around  $260^{\circ}$  ( $130^{\circ}$  before the doubler). The phase noise of an ILO is known to depend on the difference between the free-running and the injection frequencies [108, Sec. 4]. Fig. 5.18 shows that if  $V_{tune}$  is set at 1.16 V to create a substantial phase shift around 80°, the accumulated Tx noise degrades noticeably.
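The FOM and the injection power budget quoted above can be reproduced directly from the stated numbers. This is a minimal sketch using the FOM definition given in the text; the function and variable names are illustrative.

```python
import math


def vco_fom_db(pn_dbc_hz, freq_hz, offset_hz, p_dc_mw):
    """FOM = PN - 20log(Freq/offset) + 10log(Pdc / 1 mW)."""
    return (pn_dbc_hz
            - 20 * math.log10(freq_hz / offset_hz)
            + 10 * math.log10(p_dc_mw / 1.0))


fom = vco_fom_db(pn_dbc_hz=-92.0, freq_hz=76.3e9, offset_hz=1e6, p_dc_mw=17.0)
print(round(fom, 1))  # close to the quoted -177 dB

# Injection power reaching the chip: generator power minus the estimated losses.
generator_dbm, trace_loss_db, bondwire_loss_db = 15.0, 6.0, 4.0
print(generator_dbm - trace_loss_db - bondwire_loss_db)  # 5.0 dBm

# Output frequency of the tripler followed by the doubler (x6 overall).
print(round(12.72 * 3 * 2, 2))  # 76.32 GHz
```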

The PA S-parameter was on-probe characterized with a VNA and achieves a maximum



Figure 5.18: Tx (76.3 GHz) output phase noise.

 $S_{21}$  (power gain) of 13.7 dB at 74 GHz and a 3-dB bandwidth from 67 to 80 GHz. The measured and simulated S-parameters are plotted in Fig. 5.19. The PA power consumption is 0.35 W, and the measured peak power is 15.7 dBm at 78 GHz with a compressed power gain of 8 dB and a PAE of 8.9%. The PA operates linearly up to its  $OP_{1dB}$  of 11.8 dBm, with a PAE of 4.2% at  $OP_{1dB}$ . The large-signal PA power performance, measured by a power meter, is shown in Fig. 5.20. The small-signal gain measured with the power meter matches the VNA-measured dB( $S_{21}$ ).
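The peak PAE is consistent with the standard definition, (output power minus input power) divided by dc power, applied to the numbers quoted above. The sketch below assumes that definition; the helper names are illustrative.

```python
def dbm_to_mw(power_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def pae(p_out_dbm, gain_db, p_dc_mw):
    """Power-added efficiency: (Pout - Pin) / Pdc."""
    p_out = dbm_to_mw(p_out_dbm)
    p_in = dbm_to_mw(p_out_dbm - gain_db)
    return (p_out - p_in) / p_dc_mw


peak_pae = pae(p_out_dbm=15.7, gain_db=8.0, p_dc_mw=350.0)
print(f"{100 * peak_pae:.1f} %")  # about 8.9 %
```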

The 2-Gb/s QPSK constellation from the complete Tx element, including the PA and the QPSK modulator, is plotted in Fig. 5.21(a). The achieved constellation EVM is -22 dB with the IQ compensation provided by the auxiliary modulator. The measured output power is 13.1 dBm at 75 GHz, lower than the PA maximum power because the power driving the PA is insufficient. (The measured output power from the QPSK modulator is about 4 dBm [104].) Considering the lower PA output swing, the PA supply voltage is reduced to 1.6 V to improve the Tx efficiency when the PA is connected to the QPSK modulator. The Tx efficiency is 4.8%, including all the on-chip power consumption. The total power consumption of the Tx element is 430 mW, where the PA and modulator consume 270 mW



Figure 5.19: Measured and simulated linear PA S-parameters.



Figure 5.20: Measured PA large-signal power and efficiency.



Figure 5.21: (a) Tx output QPSK constellation at 75.4 GHz and (b) dc power breakdown.

and 160 mW, respectively. Fig. 5.21(b) summarizes the dc power breakdown of the Tx element.

The PA and Tx performances have been summarized in [104] and compared to the reported 28-nm CMOS PAs and CMOS Txs, including [100, 142, 143, 145–148] and many others. The achieved Tx power and efficiency, including the power consumed in generating the LO signal, are competitive, while the data rate is on the low end, limited by the 1-to-5 Rx DeSer. The injection-locked LO sextupler, all-digital interface, IQ compensation, and PA magnitude modulation make the Tx element an appropriate building block for the digitally-modulated phased arrays implemented in this work.

### 5.4 Phased Array Implementation and Characterization

#### Design of the Antenna Elements

The 8-element phased array antennas are fabricated on a 6-layer Rogers PCB with third-order high-density interconnection (HDI) that employs 75-$\mu$m blind laser drills. The board photograph and the PCB stackup are shown in Fig. 5.22(a) and (b), respectively. The chip-on-board (COB) assembly uses bondwire connections. The mutual distance between the array elements has been enlarged to 2.3 mm to ease the routing of the dense signals.

To reduce the signal attenuation and impedance mismatch induced by the bondwire inductance, the wires connecting the Tx outputs to the antenna inputs are made as short



Figure 5.22: (a) Photograph of the array PCB board with COB assembly. (b) Rogers PCB stackup.

as possible, at around 500  $\mu$m. This is achieved by thinning the CMOS dies to 100  $\mu$m and placing the dies into the PCB cavities close to the antennas. Nevertheless, the design of the PCB antennas must include the bondwire inductance. HFSS full-wave EM simulation, including the bondwires and the silicon substrates, has been performed to optimize the design of the slot-loop antennas. The slot loops use the top metal layer  $(L_1)$  for both the signal trace and the ground plane. The antennas have a slot width of 75  $\mu$m and outer dimensions of 1.15 mm by 0.95 mm. The antennas have to be designed at a larger size than in the case without the wire, such that the antenna resonance frequency without the bondwire is lower than the operation frequency. In such a case, the antenna input impedance is capacitive and can compensate for the wire inductance. The simulated 8-port S-parameters for the antenna array are plotted in Fig. 5.23(a). The simulated antenna bandwidth, with input  $S_{11}$  lower than -5 dB, is 4 GHz, from 71.8 GHz to 75.8 GHz. The mutual couplings between the antenna elements are very weak, with  $S_{ij}$  lower than -25 dB between two adjacent ports and lower than -35 dB between two distant ports. Fig. 5.23(b) plots the simulated  $S_{11}$  for the antenna elements with the bondwire lengths offset by  $\pm 100 \ \mu m$  from the nominal value. It can be observed that the antenna bandwidth can tolerate such a reasonable variation.

The simulated antenna patterns at 75 GHz, along the y-axis with $\phi = \pm 90^{\circ}$, are plotted in Fig. 5.24(a) for the eight antennas including the wires. The radiation component along the $\theta$ direction (i.e., $E_{\theta}$) is significantly lower than $E_{\phi}$ and can be ignored. The peak antenna gains for the elements are around 3 dBi at the broadside direction ($\theta = 0$), and the radiation efficiencies are about 50%, defined by the total radiated power divided by the power



Figure 5.23: (a) Simulated S-parameters for the antenna array. (b) Simulated antenna  $S_{11}$  with  $\pm 100 \ \mu m$  offset in the bondwire lengths.

delivered into the antenna. The 3-dB beam-width is around 90°. In addition to the antenna gain, the antenna array is co-simulated with the active circuits and the simulated EIRP, also along the y-axis, is plotted in Fig. 5.24(b). The elements have peak EIRP around 17 dBm. Again, only the radiation along the  $\phi$  direction and the  $E_{\phi}$  component are considered. When one element operates, the other seven elements are terminated by the measured PA  $S_{22}$  to improve the simulation accuracy.

Finally, the simulated EIRPs at 75 GHz with eight and six antennas uniformly driven by Tx elements are plotted in Fig. 5.25. Since the antenna elements have a wide beamwidth, the simulated results resemble the ideal calculation previously shown in Fig. 5.2. The simulated maximum EIRP is 34.6 dBm with 8 elements and 32.1 dBm with 6 elements. These values are 17.6 dB and 15.1 dB above the unit-cell EIRP, which agrees well with the $20\log(N)$ relation for ideal combining, where N is the number of elements. Although not the focus of this study, beam steering is possible with this phased array because the output phase of each Tx element can be continuously and statically adjusted over a wide range (i.e., $260^{\circ}$) by tuning the ILFT varactor, and the QPSK modulator dynamically provides discrete phase shifts of 90°, 180°, and 270°.
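The comparison with the $20\log(N)$ relation can be checked numerically. The following is a minimal sketch that uses only the simulated values quoted above; the function and variable names are illustrative and do not come from the original work.

```python
import math


def ideal_array_gain_db(n_elements: int) -> float:
    """EIRP gain over one element for ideal coherent combining: 20*log10(N)."""
    return 20.0 * math.log10(n_elements)


# Simulated values quoted in the text (dBm).
unit_cell_eirp_dbm = 17.0
simulated_array_eirp_dbm = {8: 34.6, 6: 32.1}

for n, eirp_dbm in simulated_array_eirp_dbm.items():
    ideal = ideal_array_gain_db(n)
    simulated = eirp_dbm - unit_cell_eirp_dbm
    print(f"N={n}: ideal {ideal:.2f} dB, simulated {simulated:.2f} dB, "
          f"shortfall {ideal - simulated:.2f} dB")
```

For both array sizes the simulated gain falls short of the ideal value by roughly 0.5 dB.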

#### Measured Results of the Array Elements

The measurement setup for the far-field spatial formation and leakage suppression is illustrated in Fig. 5.26.

The HDI Rogers PCB hosts the eight chips and distributes the dc supplies via the on-PCB linear regulators and programmable voltage dividers. The eight LVDS data streams are generated by an 11-channel digital pattern generator (DPG11-8M). The maximum data rate of the pattern generator is 1 GS/s, which limits the output symbol rate of the Tx elements to 0.2 GS/s. Therefore, while the Tx element has been demonstrated to achieve a QPSK data rate of 2 Gb/s, the phased array Tx is only demonstrated at lower data rates of 0.4 Gb/s and 0.8 Gb/s for the spatially-formed QPSK and 16QAM constellations, respectively. The 200-MHz and 1-GHz deserializer clocks for the eight elements are generated by two supporting PCBs, which take in a single clock signal and distribute it into 8 paths with individual digitally-controlled delay line (DCDL) ICs. The LO source at around 13 GHz is generated by a signal generator and distributed to the Tx elements by a 1-to-8 Wilkinson power divider realized on the Rogers antenna board.
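The data rates quoted above follow from the symbol rate and the number of bits per symbol. A minimal sketch of this arithmetic is given below; the function name is illustrative.

```python
import math


def data_rate_gbps(symbol_rate_gsps: float, constellation_size: int) -> float:
    """Raw data rate in Gb/s: symbol rate times log2(constellation size)."""
    bits_per_symbol = math.log2(constellation_size)
    return symbol_rate_gsps * bits_per_symbol


# Instrument-limited array symbol rate (0.2 GS/s) and element symbol rate (1 GS/s).
for label, rate_gsps, size in [
    ("array QPSK", 0.2, 4),
    ("array 16QAM", 0.2, 16),
    ("element QPSK", 1.0, 4),
]:
    print(f"{label}: {data_rate_gbps(rate_gsps, size):.1f} Gb/s")
```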

The Rx antenna is placed at $\theta = 0$, 40 cm from the antenna board. The Rx antenna is a standard WR-10 horn with a high antenna gain of 23 dBi. The received mmW signal is down-converted to 1.2 GHz, amplified, and finally collected and processed by an oscilloscope. The total Rx conversion gain has been calibrated with a power meter connected directly to the Rx antenna. The second Rx measures the spatial leakage along the $\phi$ direction (i.e., $E_{\phi}$) at a coupling distance of 60 cm. The second Rx does not have to demodulate the constellation and is simply composed of the same WR-10 horn antenna, a mmW amplifier module, and a power meter. For both the main and the second Rx, the received power is translated into the Tx EIRP via the Friis transmission equation.
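The translation from received power to Tx EIRP can be sketched as follows. This is a minimal illustration of the free-space Friis relation with the distances, horn gain, and carrier frequency stated in this chapter; it assumes that the received power is referred to the horn output (i.e., the calibrated Rx conversion gain has already been removed), and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def free_space_path_loss_db(freq_hz: float, distance_m: float) -> float:
    """Free-space path loss 20*log10(4*pi*d/lambda) in dB."""
    wavelength_m = C / freq_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)


def eirp_from_received_power_dbm(p_rx_dbm: float, rx_gain_dbi: float,
                                 freq_hz: float, distance_m: float) -> float:
    """Friis equation solved for EIRP: EIRP = P_rx - G_rx + FSPL (all in dB units)."""
    return p_rx_dbm - rx_gain_dbi + free_space_path_loss_db(freq_hz, distance_m)


# Path loss for the main Rx (40 cm) and the second Rx (60 cm) at 76 GHz.
for distance_m in (0.4, 0.6):
    loss_db = free_space_path_loss_db(76e9, distance_m)
    print(f"d = {distance_m:.1f} m: path loss {loss_db:.1f} dB")
```

With a 23-dBi horn, a received power of $P_{rx}$ at 40 cm therefore maps to an EIRP of roughly $P_{rx} + 39$ dB under these assumptions.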




|               | Ant1 | Ant2 | Ant3 | Ant4 | Ant5 | Ant6 | Ant7 | Ant8 |
|---------------|------|------|------|------|------|------|------|------|
| Rad. Eff. (%) | 50.4 | 50.1 | 50.0 | 50.1 | 50.1 | 50.2 | 50.2 | 50.5 |

(a)



Figure 5.24: (a) Simulated antenna gain and (b) Tx EIRP for the 8 elements.



Figure 5.25: Simulated array EIRP with uniform excitations.



Figure 5.26: Setup for far-field measurement.


Figure 5.27: Measured Tx EIRP at  $\theta = 0$  for the eight Tx elements.

First, the element EIRPs are measured at $\theta = 0$ versus the Tx output frequency. The elements are turned on one at a time and the Tx frequency is swept. The measured EIRPs for the eight elements are plotted in Fig. 5.27. The Tx EIRP degrades as the Tx frequency increases from 75 to 79 GHz. This degradation is expected because the antenna input $S_{11}$ and the PA $S_{21}$ degrade with frequency in this region. The WR-10 components have a lower cut-off frequency of 75 GHz, which sets the lower bound of the measurement. The peak measured EIRP is around 15.7 dBm, with the second element operated at 75.3 GHz. This peak value is quite close to the simulated EIRP of 17 dBm. However, the fourth element has a substantially lower EIRP of 12.9 dBm at 75.3 GHz, so the second element would have to be attenuated and its maximum power could not be exploited. Instead, the operation frequency is selected at 76 GHz. At 76 GHz, the uncompensated element EIRPs range from 13.2 to 14.3 dBm; this variation is not significant and corresponds to a maximum magnitude error of -24 dB. The magnitude mismatch can be further reduced by adjusting the PA first-stage gate bias voltages for some of the Tx elements.

It has been mentioned that the Tx elements must have sufficient locking ranges (e.g., 250 MHz) for a low Tx noise and to ensure the elements can be aligned properly with a low sensitivity to voltage and temperature variations. The measured locking ranges for the eight elements are plotted in Fig. 5.28. In agreement with the simulated results, a higher locking



Figure 5.28: Measured locking ranges for the Tx elements under antenna board input power of 20 and 26 dBm.

range can be obtained at a higher frequency under the same LO power. The element locking ranges achieve 250 MHz at 76 GHz with an LO source of 26 dBm at the input of the antenna PCB. The additional power loss on the 1-to-8 Wilkinson power divider is estimated at 6 dB.

The eight elements are injection-locked and have been operated one at a time to output the 0.4-Gb/s QPSK constellation with EIRP around 13.5 dBm, and the received and demodulated constellations are plotted in Fig. 5.29. The constellations without operating the auxiliary IQ suffer from IQ mismatch and only have EVMs around -18 dB. Since the mismatch could be correlated, it may not be averaged out and suppressed by spatially combining multiple elements [149]. Therefore, the local constellations are compensated, with EVMs improved to around -25 dB, before the spatial combining.

## Spatial Combining with Peak Array EIRP

To spatially combine the eight elements, the element output phases, observed at $\theta = 0$, must be the same. Since the sixth element achieves the best local QPSK constellation with EVM of only -28 dB, it is selected as the test-key (reference) element for aligning the other



Figure 5.29: Demodulated 0.4-Gb/s QPSK constellation from the 8 elements, with and without the IQ compensation.

elements. To further reduce the magnitude mismatch, the element EIRPs are adjusted to 13.6 dBm by fine-tuning the PA gate biases.

For each (injection-locked) Tx element to be calibrated, its VCO varactor bias is swept around the nominal value that sets the free-running frequency at 76 GHz, and the four QPSK phases (I+/I-/Q+/Q-) are spatially-combined with the opposite phases generated by the sixth (test-key) element. The Rx power should be canceled if the phases are aligned properly. Owing to the residual phase and magnitude mismatch between elements and the finite varactor voltage resolution, the cancellations that can be achieved are around 22 dB, which corresponds to a phase error of 4.4° if the magnitude mismatch is negligible. The measured cancellations for phase-aligning the first, third, fifth, and seventh elements (to the sixth element) are shown in Fig. 5.30. For all seven elements to be phase-aligned to the sixth element, the varactor biases are within 5 mV of the nominal values and should not cause Tx noise degradation, as explained via Fig. 5.18.
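The relation between the achievable cancellation and the residual phase error can be illustrated with a short calculation. The sketch below assumes two equal-magnitude signals combined in anti-phase and references the residual to the power of a single element; this reference is an assumption on our part, chosen because it reproduces the 22-dB and 4.4° figures quoted above. The function name is illustrative.

```python
import cmath
import math


def cancellation_db(phase_error_deg: float, amplitude_ratio: float = 1.0) -> float:
    """Suppression (positive dB) of two anti-phase signals, relative to one element.

    The first signal is 1, the second is -amplitude_ratio * exp(j*phase_error).
    """
    delta = math.radians(phase_error_deg)
    residual = abs(1.0 - amplitude_ratio * cmath.exp(1j * delta))
    if residual == 0.0:
        return math.inf  # perfect cancellation
    return -20.0 * math.log10(residual)


for error_deg in (1.0, 2.0, 4.4, 10.0):
    print(f"phase error {error_deg:4.1f} deg -> cancellation {cancellation_db(error_deg):.1f} dB")
```

Under this assumption a 4.4° phase error limits the cancellation to about 22.3 dB, and any magnitude mismatch (an `amplitude_ratio` different from 1) lowers it further.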

In this work, with a modest symbol rate of 200 MS/s, properly aligning the 200-MHz deserializer clocks for the eight elements is not very challenging. The clock phases at the input of the antenna board are matched by the DCDLs (SY89296UTG) on the supporting PCB, which have a minimum delay resolution of 10 ps. The length mismatch between the clock paths on the array PCB has been controlled within 1 cm, which corresponds to a relatively small mismatch in time (around 50 ps) compared to the 5-ns symbol period.

After the output phase alignment of the elements, the spatial QPSK and 16QAM constellations are synthesized and measured at $\theta = 0$ with identical element contributions and the maximum possible EIRP. The QPSK constellations are spatially-combined by eight and six elements, and the 16QAM constellation is formed by six elements. No redundancy is available for the high-power symbols in the constellations. In addition to the main-beam performance, the leakage powers at $\theta \approx 20^{\circ}$, $30^{\circ}$, $45^{\circ}$, and $60^{\circ}$ are also measured one at a time. The measured results are summarized in Table 5.3. The peak QPSK EIRP is 31.6 dBm with 8 elements and 29.1 dBm with 6 elements, and the total on-silicon power consumptions are 3.23 and 2.39 W, respectively. The measured leakages will be compared later to the leakages achieved by the proposed constellation formation exploiting combination redundancy.

The 16QAM constellation is formed by six elements. The high-power symbols  $6 \times (1, j, -1, -j)$  have unique combinations and the back-off symbols  $(4+2j) \times (1, j, -1, -j), (2+4j) \times (1, j, -1, -j)$ , and  $2 \times (1, j, -1, -j)$  all have 15 efficient combinations. With the small number of combinations utilized, the lowest leakages measured at the four angles are listed in Table 5.3. The average EIRP is 26.5 dBm with total on-silicon power consumption of 2.21 W.

The performance of this mmW phased array Tx and of prior art ([89, 92, 93, 95, 96, 98]) is summarized in Table 5.4. Table 5.4 only includes works with the antenna package and the active circuits fabricated in bulk CMOS/SiGe processes. The demonstrated element output power is the highest in the table, and the array EIRP is exceeded only by the 64-element array of [89]. The array efficiencies are also high and could be further improved with a higher RF power that operates the Tx PAs at the peak PAE.

Two array efficiencies, $EFF_1$ and $EFF_2$, are defined as $\text{EIRP}/P_{dc}$ and (total output power)/$P_{dc}$, respectively. Since the array power consumption is proportional to



Figure 5.30: Output phase calibration (via aligning the phases to the sixth element) for the (a) first, (b) third, (c) fifth, and (d) seventh Tx element.

|                                         | QPSK, 8 Elements | QPSK, 6 Elements | 16QAM, 6 Elements |
|-----------------------------------------|------------------|------------------|-------------------|
| Carrier Frequency (GHz)                 | 76               | 76               | 76                |
| EIRP at $\theta = 0^{\circ}$ (dBm)      | 31.6             | 29.1             | 26.5              |
| EVM (dB)                                | -28.5            | -27.7            | -25.8             |
| System dc Power (W)                     | 3.23             | 2.39             | 2.21              |
| Leakage at $\theta = 20^{\circ}$ (dBm)  | 13.4 (-18.0 dBc) | 14.5 (-14.6 dBc) | 12.7 (-13.8 dBc)  |
| Leakage at $\theta = 30^{\circ}$ (dBm)  | 11.7 (-19.7 dBc) | 12.2 (-16.9 dBc) | 8.2 (-18.3 dBc)   |
| Leakage at $\theta = 45^{\circ}$ (dBm)  | 9.0 (-22.4 dBc)  | 11.0 (-18.1 dBc) | 10.4 (-16.1 dBc)  |
| Leakage at $\theta = 60^{\circ}$ (dBm)  | 4.5 (-26.9 dBc)  | 2.7 (-26.4 dBc)  | 1.0 (-25.5 dBc)   |

Table 5.3: Spatially-combined QPSK/16QAM constellations with the highest EIRP.

| Ref.             | Process         | # of<br>Elem. | Freq.<br>(GHz) | Element<br>Peak<br>(dBm) | Peak<br>EIRP<br>(dBm) | Tx<br>$P_{dc}$<br>(W) | Peak Array<br>Efficiency<sup>&amp;</sup><br>$EFF_{1,n}$ (%) | Peak Array<br>Efficiency<sup>&amp;</sup><br>$EFF_2$ (%) | Antenna<br>Package |
|------------------|-----------------|---------------|----------------|-----------------|--------------|----------------|---------------------|----------------------------------|--------------------|
| [95]<br>12'TMTT  | 65-nm<br>CMOS   | 4             | 61             | 7               | N.A.         | 0.4            | N.A.                | 5.0                              | LTCC               |
| [93]<br>13'JSSC  | 0.18-μm<br>SiGe | 4             | 80             | 5               | N.A.         | 1.0            | N.A.                | 1.3                              | PCB                |
| [98]<br>13'ISSCC | 65-nm<br>CMOS   | 8             | 60             | 9.6             | 22           | 0.38           | 5.2                 | 19.2                             | PCB                |
| [92]<br>14'JSSC  | 40-nm<br>CMOS   | 16            | 60             | 8               | 27           | 1.19           | 2.6                 | 8.5                              | PCB                |
| [89]<br>16'TMTT  | 0.13-µm<br>SiGe | 64            | 60             | 5               | 38           | 7.0            | 14.1                | 2.7                              | Glass              |
| [96]<br>16'ISSCC | 28-nm<br>CMOS   | 4             | 60             | 7.5             | 24           | 0.67           | 9.4                 | 3.3                              | PCB                |
| This<br>Work     | 28-nm<br>CMOS   | 8             | 76             | 13.1<br>(15.7*) | 31.6         | 3.23           | 5.6                 | 5.1<br>(9.2*)                    | PCB                |

 $^{\&}EFF_{1,n} \equiv \text{EIRP}/(\# \text{ of Element})/(\text{Tx } P_{\text{dc}}); EFF_2 \equiv (\text{Total Output Power})/(\text{Tx } P_{\text{dc}})$ \*element peak output power

Table 5.4: Bulk CMOS/SiGe phased array Txs with antenna integration at around 60 GHz.

the number of elements (N) while the array EIRP is proportional to $N^2$, $EFF_1$ is normalized by N ($EFF_{1,n} \equiv EIRP/P_{dc}/N$) to compare works with different N more fairly. $EFF_1$ emphasizes the communication performance, while $EFF_2$ better characterizes the package heat dissipation and does not depend on the antenna gain. A higher $EFF_2$ indicates that a smaller portion of the dc power is dissipated as heat in the package. It has been explained via Table 5.2 that both $EFF_1$ and $EFF_2$ can be higher using digitally-modulated array elements with nonidentical element contributions. This efficiency-enhancement method has also been adopted by [98].
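The two efficiency definitions can be evaluated directly from the entries of Table 5.4. The sketch below reproduces the "This Work" row; it assumes that the total output power is N times the listed element power, and the function names are illustrative.

```python
def dbm_to_watt(power_dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10.0 ** (power_dbm / 10.0) / 1000.0


def eff1_normalized(eirp_dbm: float, pdc_w: float, n_elements: int) -> float:
    """EFF_1,n = EIRP / P_dc / N (as a fraction)."""
    return dbm_to_watt(eirp_dbm) / pdc_w / n_elements


def eff2(element_power_dbm: float, pdc_w: float, n_elements: int) -> float:
    """EFF_2 = (total output power) / P_dc, assuming N identical element powers."""
    return n_elements * dbm_to_watt(element_power_dbm) / pdc_w


# "This Work" row of Table 5.4: 8 elements, 31.6-dBm EIRP, 3.23-W dc power.
print(f"EFF_1,n = {100 * eff1_normalized(31.6, 3.23, 8):.1f} %")
print(f"EFF_2   = {100 * eff2(13.1, 3.23, 8):.1f} % (13.1-dBm element power)")
print(f"EFF_2*  = {100 * eff2(15.7, 3.23, 8):.1f} % (15.7-dBm element peak power)")
```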

Although the maximum symbol rate achieved by the Tx element is 1 GS/s, which projects a spatially-combined 16QAM data rate of 4 Gb/s, the phased array data rate is limited by the instrument at 0.8 Gb/s.

## 5.5 Measured Results on Leakage Suppression

The novelty of this work lies in the redundancy exploitation for the digitally-modulated mmW phased array to achieve a low spatial leakage, which is the focus of this section. First, the redundancy-rich 0.4-Gb/s QPSK constellation is synthesized by eight elements with spatial symbols of $(4+4j) \times (1, j, -1, -j)$, where each symbol has 70 combinations. The Tx EIRP and constellation EVM are measured at $\theta = 0^{\circ}$ with the combinations swept, and the results are presented in Fig. 5.31(a).

In this measurement, the same code index is used for the four spatial symbols, where the symbols -4+4j, -4-4j, and 4-4j are synthesized by introducing uniform phase shifts of 90°, 180°, and 270°, respectively, to the element outputs that synthesize 4+4j. The Tx performance at the communication direction is quite insensitive to the adopted combination. Across the 70 combinations, the measured EIRPs are around 28.5 dBm, 3 dB lower than the maximum EIRP, and the EVMs are lower than -25.7 dB. The array power consumption is 3.2 W.
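The number of combinations and the 3-dB EIRP back-off of the redundancy-rich QPSK symbol can be reproduced with a few lines. This sketch assumes that the symbol 4+4j is formed by four elements outputting 1 and four elements outputting j, so that the count is a binomial coefficient; the function names are illustrative.

```python
import math


def balanced_combinations(n_elements: int) -> int:
    """Ways to pick which half of the elements output '1' (the rest output 'j')."""
    return math.comb(n_elements, n_elements // 2)


def eirp_backoff_db(symbol: complex, n_elements: int) -> float:
    """EIRP of a spatial symbol relative to all N elements adding in phase."""
    return 20.0 * math.log10(abs(symbol) / n_elements)


print(balanced_combinations(8))               # combinations for 4+4j with 8 elements
print(balanced_combinations(12))              # combinations for 6+6j with 12 elements
print(f"{eirp_backoff_db(4 + 4j, 8):.2f} dB")  # back-off from the peak EIRP
```

The same count for N = 12 gives the 924 combinations per symbol quoted in the appendix of this chapter.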

The spatial leakages at $\theta$ of 20°, 30°, 45°, and 60° have also been measured for all the combinations. As an example, the measured leakages at $\theta$ of 20° are presented in Fig. 5.31(b), which also includes the leakages corresponding to the static QPSK symbols. As expected, the leakages of the static symbols deviate only slightly from the leakage of the modulated signal. The leakages (EIRP) span a wide range of power levels from 4 to 17 dBm, which indicates that selecting the optimal combination, (17, 17, 17, 17) in this case, is important. Similarly, the optimal combinations for the lowest leakage at $\theta$ of 30°, 45°, and 60° are located at code indices of 45, 29, and 2, respectively.

The measured results at the communication direction and the low-leakage directions are summarized in Table 5.5. The lowest radiations at  $\theta$  of 20°, 30°, 45°, and 60° are 4.1, -2.8, -0.1, and -1.6 dBm, respectively, which correspond to relative leakages from -24.5 to -31.3 dB. The leakages have been improved from those associated with the peak-EIRP QPSK constellation.



Figure 5.31: (a) Tx EIRP and constellation EVM at  $\theta = 0^{\circ}$  and (b) spatial leakage at  $\theta = 20^{\circ}$  for the redundancy-rich QPSK symbols.

| Low-Leakage<br>Direction                   | $\theta = 20^{\circ}$  | $\theta = 30^{\circ}$   | $\theta = 45^{\circ}$   | $\theta = 60^{\circ}$   |
|--------------------------------------------|------------------------|-------------------------|-------------------------|-------------------------|
| Code Combination<br>$(I_n, I_p, Q_n, Q_p)$ | (17, 17, 17, 17)       | (45, 45, 45, 45)        | (29, 29, 29, 29)        | (2, 2, 2, 2)            |
| EIRP at $\theta = 0^{\circ}$               | 28.7 dBm               | 28.3 dBm                | 28.7 dBm                | 28.4 dBm                |
| EVM                                        | -27.4 dB               | -27.3 dB                | -27.5 dB                | -26.3 dB                |
| Data Rate                                  | 0.4 Gb/s               | 0.4 Gb/s                | 0.4 Gb/s                | 0.4 Gb/s                |
| Leakage at $\theta = 20^{\circ}$           | 4.1 dBm<br>(-24.5 dBc) | 13.1 dBm<br>(-15.4 dBc) | 8.8 dBm<br>(-19.7 dBc)  | 12.7 dBm<br>(-15.8 dBc) |
| Leakage at $\theta = 30^{\circ}$           | 10.7 dBm<br>(-17.8 dBc) | -2.8 dBm<br>(-31.3 dBc) | 6.2 dBm<br>(-22.3 dBc)  | 12.5 dBm<br>(-16.0 dBc) |
| Leakage at $\theta = 45^{\circ}$           | 2.0 dBm<br>(-26.5 dBc) | 2.4 dBm<br>(-26.1 dBc)  | -0.1 dBm<br>(-28.6 dBc) | 8.3 dBm<br>(-20.2 dBc)  |
| Leakage at $\theta = 60^{\circ}$           | 7.8 dBm<br>(-20.7 dBc) | 9.4 dBm<br>(-19.1 dBc)  | 3.2 dBm<br>(-25.3 dBc)  | -1.6 dBm<br>(-30.1 dBc) |

Table 5.5: Spatially-combined QPSK constellations and the corresponding leakage.

The leakages of the uniformly-driven array and the digital array exploiting redundancy have been plotted in Fig. 5.8, assuming ideal Tx and antennas. The updated results, with EM-simulated antennas co-simulated with the active circuits, are plotted in Fig. 5.32. The measured leakages at the four angles are also marked in Fig. 5.32 for comparison. With the proposed technique, the relative leakages are lowered by 3.2 to 11.6 dB, which verifies its effectiveness.
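The improvement range follows from the measured relative leakages in Tables 5.3 and 5.5. The short sketch below repeats this subtraction; the dictionary names are illustrative, and the values are copied from the 8-element QPSK column of Table 5.3 and the optimized entries of Table 5.5.

```python
# Relative leakage (dBc) of the peak-EIRP 8-element QPSK constellation (Table 5.3).
peak_eirp_leakage_dbc = {20: -18.0, 30: -19.7, 45: -22.4, 60: -26.9}

# Relative leakage (dBc) with the optimal combination for each direction (Table 5.5).
optimized_leakage_dbc = {20: -24.5, 30: -31.3, 45: -28.6, 60: -30.1}

improvement_db = {
    angle: round(peak_eirp_leakage_dbc[angle] - optimized_leakage_dbc[angle], 1)
    for angle in peak_eirp_leakage_dbc
}

for angle, gain in improvement_db.items():
    print(f"theta = {angle} deg: leakage lowered by {gain} dB")
```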

Secondly, the 0.8-Gb/s 16QAM constellation with spatial symbols of $(1+j, 3+j, 1+3j, 3+3j) \times (1, j, -1, -j)$ is synthesized by six elements. The four symbols (1+j, 3+j, 1+3j, 3+3j) have 690, 150, 150, and 20 code combinations, respectively, among which 30, 60, 60, and 20 combinations are efficient. The spatial symbols in the second, third, and fourth quadrants with the same code index as their first-quadrant counterparts only involve uniform phase shifts. Therefore, similar to the QPSK case, the combination codes with the lowest leakages for the first-quadrant symbols can be directly adopted for the other symbols. The average EIRP and constellation EVM, for 20 of the many combinations, are plotted in Fig. 5.33(a). The measurement again shows that the performance at $\theta = 0^{\circ}$ is insensitive to the adopted combination. The average EIRP at $\theta = 0^{\circ}$ is about 23.3 dBm with EVM better than -24 dB.
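The combination counts quoted above can be reproduced by brute-force enumeration. The sketch below assumes that each of the six OOK-QPSK elements outputs one of the five states {0, 1, j, -1, -j}, and it treats a combination as "efficient" when it uses the minimum number of active (non-zero) elements for its symbol. This definition of efficiency is our assumption; it happens to reproduce the quoted numbers. All names are illustrative.

```python
from collections import defaultdict
from itertools import product

STATES = (0, 1, 1j, -1, -1j)           # OOK-QPSK element outputs (assumed)
N_ELEMENTS = 6
TARGETS = (1 + 1j, 3 + 1j, 1 + 3j, 3 + 3j)  # first-quadrant 16QAM symbols


def count_combinations(n_elements=N_ELEMENTS, targets=TARGETS):
    """Return {symbol: (total combinations, efficient combinations)}."""
    histograms = {t: defaultdict(int) for t in targets}
    for combo in product(STATES, repeat=n_elements):
        symbol = sum(combo)
        if symbol in histograms:
            active = sum(1 for state in combo if state != 0)
            histograms[symbol][active] += 1
    counts = {}
    for symbol, hist in histograms.items():
        total = sum(hist.values())
        efficient = hist[min(hist)]  # fewest active elements
        counts[symbol] = (total, efficient)
    return counts


for symbol, (total, efficient) in count_combinations().items():
    print(f"{symbol}: {total} combinations, {efficient} efficient")
```

The enumeration visits only $5^6 = 15625$ element-state vectors, so it runs almost instantly.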

Ideally, all the inefficient and efficient code combinations for the four first-quadrant symbols should be explored. Nevertheless, for the spatial symbol 1+j, only 120 of the 660 inefficient combinations are tested, and a sufficiently low leakage has been found. Fig. 5.33(b) plots the measured leakages at $\theta$ of 30° for the four first-quadrant symbols, and the optimal code indices are 93, 55, 81, and 14 for symbols 1+j, 3+j, 1+3j, and 3+3j, respectively.

Figure 5.32: Measured spatial leakages associated with the high-EIRP QPSK constellation and the low-EIRP constellation with optimal combinations.

The 16QAM measured results at $\theta = 0$ and the four low-leakage directions are summarized in Table 5.6. The lowest leakages at $\theta$ of 20°, 30°, 45°, and 60° are -18.0, -24.4, -25.0, and -20.4 dB, respectively. Both efficient and inefficient combinations are included in the optimization, while the four sets of optimal combinations employ only inefficient combinations, in which all six elements operate as QPSK cells. The power consumption is 2.38 W. The optimized leakages employing only the efficient combinations are also included in Table 5.6. The in-beam performances are roughly the same, while the total dc power is reduced substantially from 2.38 W to 2.01 W. Since the spatial symbols have fewer efficient combinations, the achieved leakages become higher, with degradation around 1 dB.

Fig. 5.34 plots the simulated counterparts for the theoretical curves in Fig. 5.10, which include the radiation from a uniformly-driven 6-element array, the minimum leakage from the low-EIRP constellation utilizing all the combinations, and the minimum leakage from



Figure 5.33: (a) Tx EIRP and constellation EVM at $\theta = 0^{\circ}$ and (b) spatial leakages at $\theta = 30^{\circ}$ for the redundancy-rich 16QAM symbols.

| Low-Leakage<br>Direction                                     | $\theta = 20^{\circ}$   | $\theta = 30^{\circ}$   | $\theta = 45^{\circ}$   | $\theta = 60^{\circ}$  |
|--------------------------------------------------------------|-------------------------|-------------------------|-------------------------|------------------------|
| Exploiting All Combinations                                  |                         |                         |                         |                        |
| Code Combination<br>("1+1j", "3+1j", "1+3j", "3+3j")         | (116, 77, 62, 7)        | (93, 55, 81, 14)        | (33, 56, 77, 7)         | (88, 23, 31, 9)        |
| EIRP at $\theta = 0^{\circ}$                                 | 23.3 dBm                | 23.3 dBm                | 23.3 dBm                | 23.3 dBm               |
| System dc Power (W)                                          | 2.38                    | 2.38                    | 2.38                    | 2.38                   |
| Data Rate                                                    | 0.8 Gb/s                | 0.8 Gb/s                | 0.8 Gb/s                | 0.8 Gb/s               |
| EVM                                                          | -25.4 dB                | -25.2 dB                | -25.2 dB                | -25.5 dB               |
| Leakage at $\theta = 20^{\circ}$                             | 5.3 dBm<br>(-18.0 dBc)  | 11.2 dBm<br>(-12.1 dBc) | 15.5 dBm<br>(-7.8 dBc)  | 14.6 dBm<br>(-8.7 dBc) |
| Leakage at $\theta = 30^{\circ}$                             | 9.0 dBm<br>(-14.3 dBc)  | -1.1 dBm<br>(-24.4 dBc) | 9.9 dBm<br>(-13.4 dBc)  | 5.1 dBm<br>(-18.2 dBc) |
| Leakage at $\theta = 45^{\circ}$                             | 9.2 dBm<br>(-14.1 dBc)  | 11.7 dBm<br>(-11.6 dBc) | -1.7 dBm<br>(-25.0 dBc) | 7.0 dBm<br>(-16.3 dBc) |
| Leakage at $\theta = 60^{\circ}$                             | 11.2 dBm<br>(-12.1 dBc) | 7.7 dBm<br>(-15.6 dBc)  | 7.7 dBm<br>(-15.6 dBc)  | 2.9 dBm<br>(-20.4 dBc) |
| Exploiting Only Efficient Combinations                       |                         |                         |                         |                        |
| System dc Power (W)                                          | 2.01                    | 2.01                    | 2.01                    | 2.01                   |
| EIRP at $\theta = 0^{\circ}$                                 | 23.3 dBm                | 23.3 dBm                | 23.3 dBm                | 23.3 dBm               |
| EVM                                                          | -23.9 dB                | -24.1 dB                | -23.6 dB                | -24.0 dB               |
| Achieved Leakage                                             | 6.3 dBm<br>(-17.0 dBc)  | -0.5 dBm<br>(-23.8 dBc) | -0.3 dBm<br>(-23.6 dBc) | 4.5 dBm<br>(-18.8 dBc) |

Table 5.6: Spatially-combined 16QAM constellations and the corresponding leakage.



Figure 5.34: Measured spatial leakages associated with the high-EIRP 16QAM constellation and the low-EIRP constellation with optimal combinations.

the high-EIRP constellation utilizing the limited redundancy. Only the latter two cases can synthesize a 16QAM constellation with the six OOK-QPSK elements. The measured leakages at the four angles are also marked in Fig. 5.34 for comparison. With the proposed technique, lower relative leakages can be achieved at $\theta$ of 20°, 30°, and 45°, and the improvements are 4.2, 6.1, and 8.9 dB, respectively. However, the proposed method has a higher leakage at $\theta = 60^{\circ}$ and should not be adopted when a low leakage is desired there. To explain briefly, at $\theta = 60^{\circ}$ the simulated radiated fields from adjacent elements have phase shifts around 180°; therefore, the leakages can be very low for the high-EIRP constellation if the four symbols are synthesized by $v_i = [1; 1; 0; 0; 0; 0]$, $[1; 1; 1; 1; j; j]$, $[1; 1; j; j; j; j]$, and $[1; 1; 1; 1; 1; 1]$. However, the highest suppression for the symbol 3+3j in the low-EIRP constellation is only 9.5 dBc.
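The cancellation at $\theta = 60^{\circ}$ can be illustrated with an idealized array-factor calculation. The sketch below assumes a uniform linear array of isotropic elements with the 2.3-mm element spacing and the 76-GHz carrier stated in this chapter, and it assumes that the entries of $v_i$ are ordered by element position. It ignores the element patterns and mutual coupling that the EM co-simulation includes, so it is only an approximation of the simulated results; all names are illustrative.

```python
import cmath
import math

C = 299_792_458.0    # speed of light in m/s
FREQ_HZ = 76e9       # carrier frequency
SPACING_M = 2.3e-3   # element spacing


def interelement_phase_deg(theta_deg: float) -> float:
    """Phase shift between adjacent elements observed at angle theta."""
    wavelength_m = C / FREQ_HZ
    return 360.0 * SPACING_M * math.sin(math.radians(theta_deg)) / wavelength_m


def array_factor(weights, theta_deg: float) -> complex:
    """Sum of element outputs with the progressive phase seen at angle theta."""
    psi = math.radians(interelement_phase_deg(theta_deg))
    return sum(w * cmath.exp(1j * n * psi) for n, w in enumerate(weights))


def relative_leakage_db(weights, theta_deg: float) -> float:
    """Radiation at theta relative to the broadside (theta = 0) radiation, in dB."""
    ratio = abs(array_factor(weights, theta_deg)) / abs(array_factor(weights, 0.0))
    return 20.0 * math.log10(ratio)


HIGH_EIRP_SYMBOLS = {
    "2": [1, 1, 0, 0, 0, 0],
    "4+2j": [1, 1, 1, 1, 1j, 1j],
    "2+4j": [1, 1, 1j, 1j, 1j, 1j],
    "6": [1, 1, 1, 1, 1, 1],
}

print(f"adjacent-element phase at 60 deg: {interelement_phase_deg(60.0):.1f} deg")
for name, weights in HIGH_EIRP_SYMBOLS.items():
    print(f"symbol {name}: {relative_leakage_db(weights, 60.0):.1f} dB")
```

Because adjacent elements carrying the same output are nearly in anti-phase at this angle, every pair in the vectors above cancels almost completely in this idealized model.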

## 5.6 Appendix

The digital array has been demonstrated to simultaneously synthesize a QPSK/16QAM constellation at $\theta = 0^{\circ}$ and a low leakage at another direction. The constellation is distorted when observed from other directions due to the nonuniform phase shifts applied to the constellation compositions [99, 103]. In contrast, the constellation of a conventional phased array is less affected by the observation direction. Fig. 5.35 plots the calculated EVM degradations



Figure 5.35: 16QAM EVM degradation with deviation in the Rx angle.

for the 16QAM constellation. The degradation is quite insensitive to the adopted element output combinations, while the observation window (i.e., EVM < -20 dB) is very narrow with  $-0.7^{\circ} < \theta < 0.7^{\circ}$ . In the 16QAM measurement with the lowest radiation synthesized at  $\theta = 30^{\circ}$ , the position of the main Rx has been slightly adjusted along the y-direction by  $\pm 0.5$  cm, which corresponds to the angle shift in  $\theta$  by  $\pm 0.7^{\circ}$ . The measured EVMs indeed degrade significantly from -25 dB to -18 dB.
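The correspondence between the lateral Rx displacement and the angle shift is simple trigonometry. The sketch below uses the 40-cm Rx distance stated in the measurement setup; the function name is illustrative.

```python
import math


def angle_shift_deg(lateral_offset_m: float, range_m: float) -> float:
    """Observation-angle change caused by moving the Rx sideways at a fixed range."""
    return math.degrees(math.atan2(lateral_offset_m, range_m))


# Rx moved by 0.5 cm along the y-direction at a distance of 40 cm.
print(f"{angle_shift_deg(0.005, 0.40):.2f} deg")
```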

Although low leakage can be achieved over a wide range of directions with the optimal combinations, each optimal combination only corresponds to a narrow low-leakage window. Therefore, a substantial number of optimal combinations must be located and recorded. Fig. 5.8 has shown the lowest leakage with a QPSK constellation synthesized at $\theta = 0^{\circ}$. The optimal combinations have been located with a resolution of $\theta = 1^{\circ}$. For each optimal combination, the leakages around its intended direction are calculated and plotted in Fig. 5.36, which shows that the existing optimal combinations do not have a sufficiently high resolution to achieve a low leakage over the continuous receiving angle ($\theta$). For example, neither the optimal combination for $\theta = 11^{\circ}$ nor that for $\theta = 12^{\circ}$ achieves a low leakage at $\theta = 11.5^{\circ}$, and a dedicated optimal combination might be required for $\theta = 11.5^{\circ}$. Assuming the number of recorded optimal combinations is not limited, the lowest leakages over the continuous receiving angle are also calculated for N = 12; they can be suppressed to -35 dBc or below for most directions thanks to the larger number of code combinations (i.e., 924 combinations for each symbol).



Figure 5.36: Leakage over the continuous receiving angle ( $\theta$ ) with the optimal combinations recorded with resolution of  $\theta = 1^{\circ}$ . (QPSK constellation synthesized at  $\theta = 0^{\circ}$ )



Figure 5.37: Lowest leakage with the optimal combinations recorded with a resolution of $\theta = 0.01^{\circ}$ . (QPSK constellation synthesized at $\theta = 0^{\circ}$ )

## Chapter 6

## **Conclusion and Future Works**

This thesis addressed various design challenges for RF transmitters targeting RFID, cellular, and mmW applications.

Achieving a good PTE is arguably the most important design goal in IPT designs that couple to a miniature tag. We have proposed a novel analytical method to optimize the RF-dc PTE of a two-coil inductive power transfer system. The analytical approach, which expresses the IPT RF-dc PTE in terms of the coupled-coil geometries and the metal/substrate properties, was verified by simulated and measured results. An IPT was optimized at a distance of 2.2 mm with a differential PCB reader coil and a CMOS rectenna with a coil size of only 0.01 mm<sup>2</sup>. The IPT was designed at 4.8 GHz and achieves a state-of-the-art RF-dc PTE.

Based on the developed tools, future work could further improve the WPT PTE by employing thick-metal processes for the tag. It would also be meaningful to further reduce the Tx power consumption with a custom-designed, integrated PA. The PA should be optimized to deliver its peak power to the Tx coil through a simple interfacing matching network.

On the uplink side, the Tx-to-Rx blocker and noise leakage are known to limit the uplink data rate. We have proposed a novel two-tone UL technique that can be adopted in both near-field and far-field systems. An uplink data rate of 10 Mb/s was demonstrated with the aid of a carefully selected duplexer. This data rate is two orders of magnitude higher than that of the conventional backscattering UL communicating with the same miniature tag, whose coil size is only 0.01 mm<sup>2</sup>. A waveform shaping method has also been invented to improve the UL power. It demonstrates that, with the proposed three-tone Tx waveforms, the tag-excited IM3 carrier can be enhanced by 4 dB without compromising the Tx peak power or the tag harvested dc power.
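As a reminder of where the IM3 carrier falls relative to the two Tx tones, the following minimal sketch computes the third-order intermodulation frequencies 2f1 - f2 and 2f2 - f1. The tone frequencies used here are hypothetical placeholders chosen for illustration, not the frequencies of the implemented system, and `im3_tones` is an assumed helper name.

```python
def im3_tones(f1_hz, f2_hz):
    """Return the (lower, upper) third-order intermodulation frequencies
    generated by two tones passing through a third-order nonlinearity."""
    lo, hi = sorted((f1_hz, f2_hz))
    return 2 * lo - hi, 2 * hi - lo

# Hypothetical Tx tones, 100 MHz apart (illustration only).
f1, f2 = 4_800_000_000, 4_900_000_000
lower, upper = im3_tones(f1, f2)

# Each IM3 product sits one tone spacing outside the two-tone pair, which is
# the frequency separation that allows Tx-to-Rx leakage to be filtered.
print(lower, upper)
print(f1 - lower, upper - f2)
```

The spacing between each IM3 product and the nearest Tx tone equals the tone spacing, so the choice of tone spacing sets how much room a duplexer or filter has to reject the Tx leakage.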

Future explorations could apply the IM3 uplink technique (or the more general multi-tone Tx technique) to radar and vital-sign detection applications. An uplink system with a less stringent SNR requirement may even adopt an integrated electrical duplexer, which is less bulky but has worse power handling and isolation. Compared to the linear PA modules used throughout our work, an integrated, custom-made Tx can output the multi-tone Tx waveforms more efficiently.

In addition to the Tx efficiency, a wide RF bandwidth is required for multi-mode and multi-band cellular transmitters. This thesis also presented a frequency-reconfigurable CMOS all-digital transmitter with integrated phase and amplitude paths. The same CMOS chip was flip-chip connected to three PCB HDI interposers targeting three coarse frequency bands. The combined operating bandwidth of the three packages covers a wide range from 0.7 to 3.5 GHz, where a CW power higher than 25.5 dBm and a DE above 40% can be achieved. The three packages have peak powers higher than 28 dBm and a peak DE of 60%. They have been tested with digital modulation schemes, including 64-QAM, 802.11g WLAN, and 20-MHz LTE, at frequencies from 0.6 to 3.6 GHz, verifying the design's ability to adapt to different standards and to reconfigure its frequency. This thesis further presented a single-output band-switching HDI interposer design that hosts three identical CMOS Tx chips. The output power of the Tx package was higher than 22.9 dBm from 0.4 to 4 GHz, with a DE better than 25%. The excellent bandwidth was achieved by band-selecting and rotating the three sub-Tx's (LB/MB/HB) via the on-interposer passives. The reconfigurable and wideband Tx package also exhibited high power and efficiency for modulated signals, including 64-QAM, 20-MHz WLAN, and LTE.
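To put the quoted dBm and DE figures in absolute terms, the sketch below converts dBm to milliwatts and estimates the dc power implied by a given drain efficiency, taking DE as the ratio of RF output power to dc power. Pairing the 25.5-dBm power floor with the 40% DE floor is an illustrative assumption, since the text does not state that the two occur at the same frequency. The helper names are hypothetical.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)

def dc_power_mw(p_out_dbm, drain_efficiency):
    """DC power implied by an RF output power and a drain efficiency,
    with drain efficiency defined as P_out / P_dc."""
    return dbm_to_mw(p_out_dbm) / drain_efficiency

# CW power floor quoted for the three packages, in mW.
print(round(dbm_to_mw(25.5)))

# Implied dc power if that output were delivered at exactly 40% DE
# (assumed pairing, for illustration only).
print(round(dc_power_mw(25.5, 0.40)))
```

Under this assumed pairing, 25.5 dBm is about 355 mW of RF power and corresponds to roughly 0.89 W drawn from the supply.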

Our high-power and wideband module still has a relatively high out-of-band noise that can overwhelm the FDD receiver if no duplexer is used. Attention should also be paid to recent developments in tunable duplexers, including important metrics such as Tx-to-Antenna loss, $OP_{1dB}$, power consumption, Antenna-to-Rx loss, and noise figure. In addition, the recommended DTx reconfiguration that turns on the idle DTx switches (to short-circuit the primary winding of the idle transformer) has not been implemented in our work and can be carried out as an immediate next step.

Finally, many mmW phased arrays have exhibited beam focusing and watt-level effective isotropically radiated power (EIRP). The conventional mmW phased array is also capable of synthesizing a radiation null or low sidelobe levels. However, for an efficient digitally-modulated mmW phased array, the element output resolution is usually not sufficient to support the existing leakage suppression techniques. This thesis presented a digitally-modulated 8-element phased array Tx. The array achieved a peak EIRP of 31.6 dBm at 76 GHz with a dc power consumption of 3.2 W. The Tx elements are efficient OOK-QPSK cells in 28-nm bulk CMOS. The array demonstrates that QPSK and 16QAM constellations can be spatially synthesized efficiently with nonidentical element contributions. In addition, the QPSK and 16QAM constellations can be synthesized by multiple element output combinations, and a low spatial leakage of around -25 dBc can be achieved in most directions by adopting the optimal combinations (redundancy exploitation).
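The redundancy idea summarized above can be illustrated with a toy far-field model. The sketch below is a generic textbook array sum and not the implemented array. It assumes a uniform linear array with half-wavelength spacing, isotropic elements, and ideal element outputs restricted to off or one of the four unit-amplitude QPSK phases. The function name `far_field_sum` and the two example combinations are hypothetical.

```python
import cmath
import math

def far_field_sum(element_outputs, theta_deg, spacing_wavelengths=0.5):
    """Complex far-field sum of the element outputs seen from angle theta,
    for an idealized uniform linear array of isotropic elements."""
    phase_step = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(theta_deg))
    return sum(a * cmath.exp(1j * n * phase_step)
               for n, a in enumerate(element_outputs))

# Two different OOK-QPSK element combinations (0 means the element is off).
combo_a = [1, 1j, 0, 0, 0, 0, 0, 0]
combo_b = [1j, 1, 0, 0, 0, 0, 0, 0]

# Both combinations synthesize the same symbol at broadside (theta = 0).
print(far_field_sum(combo_a, 0.0), far_field_sum(combo_b, 0.0))

# Away from broadside they radiate very differently, so one of them can be
# preferred when low leakage toward that direction is required.
print(abs(far_field_sum(combo_a, 30.0)), abs(far_field_sum(combo_b, 30.0)))
```

In this toy case both combinations produce the symbol 1 + j at broadside, but at 30° the first nearly cancels while the second does not. Selecting among such equivalent combinations according to their off-axis radiation is the degree of freedom that redundancy exploitation relies on.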

The data rate of our implemented phased array is directly limited by the off-the-shelf LVDS data generator. Mismatches between PCB traces and the delay resolution of the PCB delay components are secondary data rate limiters. Therefore, future work (or demonstrations) should place all the Tx elements on the same CMOS chip and preload the data into on-chip memory.

## Bibliography

- J. Garnica, R. A. Chinga, and J. Lin, "Wireless power transmission: From far field to near field," *Proceedings of the IEEE*, vol. 101, no. 6, pp. 1321–1331, 2013.
- [2] S. Han and D. D. Wentzloff, "0.61 W/mm<sup>2</sup> resonant inductively coupled power transfer for 3D-ICs," in *Custom Integrated Circuits Conference (CICC)*, 2012 IEEE, IEEE, 2012, pp. 1–4.
- [3] A. Radecki, H. Chung, Y. Yoshida, N. Miura, T. Shidei, H. Ishikuro, and T. Kuroda, "6W/25mm<sup>2</sup> inductive power transfer for non-contact wafer-level testing," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, 2011 IEEE International, IEEE, 2011, pp. 230–232.
- [4] R. Matias, B. Cunha, and R. Martins, "Modeling inductive coupling for wireless power transfer to integrated circuits," in Wireless Power Transfer (WPT), 2013 IEEE, IEEE, 2013, pp. 198–201.
- [5] M. Tabesh, N. Dolatsha, A. Arbabian, and A. M. Niknejad, "A power-harvesting pad-less millimeter-sized radio," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 4, pp. 962–977, 2015.
- [6] M. Zargham and P. G. Gulak, "Fully integrated on-chip coil in 0.13μm CMOS for wireless power transfer through biological media," *IEEE transactions on biomedical* circuits and systems, vol. 9, no. 2, pp. 259–271, 2015.
- [7] U.-M. Jow and M. Ghovanloo, "Design and optimization of printed spiral coils for efficient transcutaneous inductive power transmission," *IEEE Transactions on biomedical circuits and systems*, vol. 1, no. 3, pp. 193–202, 2007.
- [8] S. Rao and J.-C. Chiao, "Body electric: Wireless power transfer for implant applications," *IEEE Microwave Magazine*, vol. 16, no. 2, pp. 54–64, 2015.
- K. Bernstein. (2015). Supply chain hardware integrity for electronics defense (shield), [Online]. Available: https://www.darpa.mil/program/supply-chain-hardwareintegrity-for-electronics-defense (visited on 11/09/2018).
- [10] L. Greenemeier. (2017). The pentagon's seek-and-destroy mission for counterfeit electronics, [Online]. Available: https://www.scientificamerican.com/article/thepentagon-rsquo-s-seek-and-destroy-mission-for-counterfeit-electronics (visited on 11/09/2018).

- R. J. Smith. (2011). Counterfeit chips plague pentagon weapons systems, [Online]. Available: https://www.publicintegrity.org/2011/11/07/7323/counterfeitchips-plague-pentagon-weapons-systems (visited on 11/09/2018).
- [12] B. Zhao, N.-C. Kuo, B. Liu, Y.-A. Li, L. Lotti, and A. M. Niknejad, "A 5.8 GHz power-harvesting 116μm × 116μm "dielet" near-field radio with on-chip coil antenna," in *Solid-State Circuits Conference-(ISSCC)*, 2018 IEEE International, IEEE, 2018, pp. 456–458.
- [13] X. Chen, W. G. Yeoh, Y. B. Choi, H. Li, and R. Singh, "A 2.45-GHz near-field RFID system with passive on-chip antenna tags," *IEEE Transactions on Microwave Theory* and *Techniques*, vol. 56, no. 6, pp. 1397–1404, 2008.
- [14] Y. Shi, M. Choi, Z. Li, Z. Luo, G. Kim, Z. Foo, H.-S. Kim, D. D. Wentzloff, and D. Blaauw, "A 10 mm<sup>3</sup> inductive coupling radio for syringe-implantable smart sensor nodes," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 11, pp. 2570–2583, 2016.
- [15] W. Biederman, D. J. Yeager, N. Narevsky, A. C. Koralek, J. M. Carmena, E. Alon, and J. M. Rabaey, "A fully-integrated, miniaturized (0.125 mm<sup>2</sup>) 10.5 μW wireless neural sensor," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 4, pp. 960–970, 2013.
- [16] D. Ahn and M. Ghovanloo, "Optimal design of wireless power transmission links for millimeter-sized biomedical implants.," *IEEE Trans. Biomed. Circuits and Systems*, vol. 10, no. 1, pp. 125–137, 2016.
- [17] A. Ibrahim and M. Kiani, "A figure-of-merit for design and optimization of inductive power transmission links for millimeter-sized biomedical implants," *IEEE transactions* on biomedical circuits and systems, vol. 10, no. 6, pp. 1100–1111, 2016.
- [18] S. A. Mirbozorgi, P. Yeon, and M. Ghovanloo, "Robust wireless power transmission to mm-sized free-floating distributed implants," *IEEE transactions on biomedical circuits and systems*, vol. 11, no. 3, pp. 692–702, 2017.
- [19] L. Gao, Y. Yang, A. Brandon, J. Postma, and S. Gong, "Radio frequency wireless power transfer to chip-scale apparatuses," in *Microwave Symposium (IMS)*, 2016 *IEEE MTT-S International*, IEEE, 2016, pp. 1–4.
- [20] B. Arakawa, L. Gao, Y. Yang, J. Guan, A. Gao, R. Lu, and S. Gong, "Simultaneous wireless power transfer and communication to chip-scale devices," in *Microwave Symposium (IMS)*, 2017 IEEE MTT-S International, IEEE, 2017, pp. 311–314.
- [21] N.-C. Kuo, B. Zhao, and A. M. Niknejad, "Inductive wireless power transfer and uplink design for a CMOS tag with 0.01 mm<sup>2</sup> coil size," *IEEE Microw. Wireless Compon. Lett.*, vol. 26, no. 10, pp. 852–854, 2016.
- [22] N.-C. Kuo, B. Zhao, and A. M. Niknejad, "Near-field power transfer and backscattering communication to miniature RFID tag in 65 nm CMOS technology," in *Microwave* Symposium (IMS), 2016 IEEE MTT-S International, IEEE, 2016, pp. 1–4.

- [23] B. Lee, M. Kiani, and M. Ghovanloo, "A triple-loop inductive power transmission system for biomedical applications," *IEEE transactions on biomedical circuits and* systems, vol. 10, no. 1, pp. 138–148, 2016.
- [24] M. Kiani, U.-M. Jow, and M. Ghovanloo, "Design and optimization of a 3-coil inductive link for efficient wireless power transmission," *IEEE transactions on biomedical circuits and systems*, vol. 5, no. 6, pp. 579–591, 2011.
- [25] A. K. RamRakhyani and G. Lazzi, "On the design of efficient multi-coil telemetry system for biomedical implants," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 7, no. 1, pp. 11–23, 2013.
- [26] S. Kim, J. S. Ho, and A. S. Poon, "Wireless power transfer to miniature implants: Transmitter optimization," *IEEE Transactions on Antennas and Propagation*, vol. 60, no. 10, pp. 4838–4845, 2012.
- [27] M. E. Halpern and D. C. Ng, "Optimal tuning of inductive wireless power links: Limits of performance," *IEEE Trans. on Circuits and Systems*, vol. 62, no. 3, pp. 725–732, 2015.
- [28] R. Jegadeesan and Y.-X. Guo, "Topology selection and efficiency improvement of inductive power links," *IEEE Transactions on Antennas and Propagation*, vol. 60, no. 10, pp. 4846–4854, 2012.
- [29] N.-C. Kuo, B. Zhao, and A. M. Niknejad, "Equation-based optimization for inductive power transfer to a miniature CMOS rectenna," *IEEE Transactions on Microwave Theory and Techniques*, vol. 66, no. 5, pp. 2393–2408, 2018.
- [30] E. Moradi, S. Amendola, T. Björninen, L. Sydänheimo, J. M. Carmena, J. M. Rabaey, and L. Ukkonen, "Backscattering neural tags for wireless brain-machine interface systems," *IEEE Transactions on Antennas and Propagation*, vol. 63, no. 2, pp. 719– 726, 2015.
- [31] EPCglobal. (2015). [2] EPC radio-frequency identity protocols generation-2 UHF RFID, version 2.0.1, [Online]. Available: https://www.gs1.org/sites/default/ files/docs/epc/Gen2\_Protocol\_Standard.pdf (visited on 11/09/2018).
- [32] W.-K. Kim, M.-Q. Lee, J.-H. Kim, H.-S. Lim, J.-W. Yu, B.-J. Jang, and J.-S. Park, "A passive circulator with high isolation using a directional coupler for RFID," in *Microwave Symposium Digest, 2006. IEEE MTT-S International*, IEEE, 2006, pp. 1177–1180.
- [33] J.-Y. Jung, C.-W. Park, and K.-W. Yeom, "A novel carrier leakage suppression frontend for UHF RFID reader," *IEEE Transactions on Microwave Theory and Techniques*, vol. 60, no. 5, pp. 1468–1477, 2012.
- [34] A. Shirane, Y. Fang, H. Tan, T. Ibe, H. Ito, N. Ishihara, and K. Masu, "RF-powered transceiver with an energy-and spectral-efficient IF-based quadrature backscattering transmitter," *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 2975–2987, 2015.

- [35] G. Papotto, F. Carrara, A. Finocchiaro, and G. Palmisano, "A 90nm CMOS 5Mb/s crystal-less RF transceiver for RF-powered WSN nodes," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International*, IEEE, 2012, pp. 452–454.
- [36] Y. Sun, D. Li, and A. Babakhani, "A wirelessly-powered 1.46GHz transmitter with on-chip antennas in 180nm CMOS," in 2018 IEEE/MTT-S International Microwave Symposium-IMS, IEEE, 2018, pp. 278–280.
- [37] Y. Rajavi, M. Taghivand, K. Aggarwal, A. Ma, and A. S. Poon, "An RF-powered FDD radio for neural microimplants," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 5, pp. 1221–1229, 2017.
- [38] G. A. Vera, Y. Duroc, and S. Tedjini, "Third harmonic exploitation in passive UHF RFID," *IEEE Transactions on Microwave Theory and Techniques*, vol. 63, no. 9, pp. 2991–3004, 2015.
- [39] N.-C. Kuo, B. Zhao, and A. M. Niknejad, "Novel inductive wireless power transfer uplink utilizing rectifier third-order nonlinearity," *IEEE Transactions on Microwave Theory and Techniques*, vol. 66, no. 1, pp. 319–331, 2018.
- [40] N.-C. Kuo, B. Zhao, and A. M. Niknejad, "Inductive power transfer uplink using rectifier second-order nonlinearity," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 63, no. 11, pp. 2073–2085, 2016.
- [41] N.-C. Kuo, B. Zhao, and A. M. Niknejad, "Intermodulation uplink for far-field passive RFID applications," in 2018 IEEE/MTT-S International Microwave Symposium-IMS, IEEE, 2018, pp. 274–277.
- [42] N.-C. Kuo and A. M. Niknejad, "Single-antenna FDD reader design and communication to a commercial UHF RFID tag," *IEEE Microwave and Wireless Components Letters*, vol. 28, no. 7, pp. 630–632, 2018.
- [43] N.-C. Kuo, B. Zhao, and A. M. Niknejad, "A 10-Mb/s uplink utilizing rectifier thirdorder intermodulation in a miniature CMOS tag," *IEEE Microwave and Wireless Components Letters*, vol. 27, no. 11, pp. 1031–1033, 2017.
- [44] N.-C. Kuo and A. M. Niknejad, "RF-powered-tag intermodulation uplink with threetone transmitter for enhanced uplink power," Manuscript submitted for publication, 2018.
- [45] C. V. N. Index. (2017). Global mobile data traffic forecast update, 2016-2021 white paper, [Online]. Available: https://www.cisco.com/c/en/us/solutions/ collateral/service-provider/visual-networking-index-vni/mobile-whitepaper-c11-520862.html (visited on 11/09/2018).
- [46] Wikipedia. (2015). LTE frequency bands, [Online]. Available: https://en.wikipedia. org/wiki/LTE\_frequency\_bands (visited on 11/09/2018).

- [47] X. Liu, A. Nejdel, M. Palm, L. Sundström, M. Törmänen, H. Sjöland, and P. Andreani, "A 65 nm CMOS wideband radio receiver with delta-sigma-based A/D-converting channel-select filters," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 7, pp. 1566– 1578, 2016.
- [48] A. Goel, B. Analui, and H. Hashemi, "Tunable duplexer with passive feed-forward cancellation to improve the RX-TX isolation," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 62, no. 2, pp. 536–544, 2015.
- [49] L. Calderin, S. Ramakrishnan, A. Puglielli, E. Alon, B. Nikolić, and A. M. Niknejad, "Analysis and design of integrated active cancellation transceiver for frequency division duplex systems," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 8, pp. 2038– 2054, 2017.
- [50] W. E. Neo, Y. Lin, X.-D. Liu, L. C. De Vreede, L. E. Larson, M. Spirito, M. J. Pelk, K. Buisman, A. Akhnoukh, A. De Graauw, et al., "Adaptive multi-band multi-mode power amplifier using integrated varactor-based tunable matching networks," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 9, pp. 2166–2176, 2006.
- [51] Z. Deng, E. Lu, E. Rostami, D. Sieh, D. Papadopoulos, B. Huang, R. Chen, H. Wang, W. Hsu, C. Wu, et al., "A dual-band digital-WiFi 802.11 a/b/g/n transmitter soc with digital I/Q combining and diamond profile mapping for compact die area and improved efficiency in 40nm CMOS," in Solid-State Circuits Conference (ISSCC), 2016 IEEE International, IEEE, 2016, pp. 172–173.
- [52] J. Ko, X. Guo, C. Cao, S. Rajapandian, S. Peng, J. Li, W. Lee, N. Baskaran, and C. Wang, "A high-efficiency multiband class-F power amplifier in 0.153 μm bulk CMOS for WCDMA/LTE applications," in *Solid-State Circuits Conference (ISSCC)*, 2017 *IEEE International*, IEEE, 2017, pp. 40–41.
- [53] W. Kim, K. S. Yang, J. Han, J. J. Chang, and C.-H. Lee, "An EDGE/GSM quadband CMOS power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 10, pp. 2141–2149, 2014.
- [54] S. T. Yan, L. Ye, R. Kulkarni, E. Myers, H.-C. Shih, H. Wu, S. Saberi, D. Kadia, D. Ozis, L. Zhou, et al., "An 802.11 a/b/g/n/ac WLAN transceiver for 2× 2 MIMO and simultaneous dual-band operation with +29 dBm psat integrated power amplifiers," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 7, pp. 1798–1813, 2017.
- [55] H. Wang and H. Hashemi, "A 0.5–6 GHz 25.6 dBm fully integrated digital power amplifier in 65-nm CMOS," in *Radio Frequency Integrated Circuits Symposium*, 2014 *IEEE*, IEEE, 2014, pp. 409–412.
- [56] TDK. (2016). Multilayer diplexer dpx105950dt-6012a1 datasheet, [Online]. Available: https://product.tdk.com/info/en/documents/data\_sheet/rf\_dpx\_ dpx105950dt-6012a1\_en.pdf (visited on 11/09/2018).

- [57] Y. Jin and C. Nguyen, "Ultra-compact high-linearity high-power fully integrated dc– 20-GHz 0.18-μm CMOS T/R switch," *IEEE Transactions on Microwave Theory and Techniques*, vol. 55, no. 1, pp. 30–36, 2007.
- [58] Skyworks. (2016). Product brief: General purpose RF switches, [Online]. Available: http://www.skyworksinc.com/uploads/documents/PB\_RFSwitches\_PB121\_15B. pdf (visited on 11/09/2018).
- [59] N.-C. Kuo, B. Yang, A. Wang, L. Kong, C. Wu, V. P. Srini, E. Alon, B. Nikolić, and A. M. Niknejad, "A wideband all-digital CMOS RF transmitter on HDI interposers with high power and efficiency," *IEEE Transactions on Microwave Theory and Techniques*, vol. 65, no. 11, pp. 4724–4743, 2017.
- [60] H. Wang, C. Sideris, and A. Hajimiri, "A CMOS broadband power amplifier with a transformer-based high-order output matching network," *IEEE journal of solid-state circuits*, vol. 45, no. 12, pp. 2709–2722, 2010.
- [61] J. S. Park, S. Hu, Y. Wang, and H. Wang, "A highly linear dual-band mixed-mode polar power amplifier in CMOS with an ultra-compact output network," *IEEE Journal* of Solid-State Circuits, vol. 51, no. 8, pp. 1756–1770, 2016.
- [62] J.-K. Nai, Y.-H. Hsiao, Y.-S. Wang, Y.-H. Lin, and H. Wang, "A 2.8–6 GHz highefficiency CMOS power amplifier with high-order harmonic matching network," in *Microwave Symposium (IMS)*, 2016 IEEE MTT-S International, IEEE, 2016, pp. 1– 3.
- [63] W. Ye, K. Ma, and K. S. Yeo, "A 2-to-6GHz class-AB power amplifier with 28.4% PAE in 65nm CMOS supporting 256QAM," in *Solid-State Circuits Conference-(ISSCC)*, 2015 IEEE International, IEEE, 2015, pp. 1–3.
- [64] B. Kim, D.-H. Lee, S. Hong, and M. Park, "A multi-band CMOS power amplifier using reconfigurable adaptive power cell technique," *IEEE Microwave and Wireless Components Letters*, vol. 26, no. 8, pp. 616–618, 2016.
- [65] H. Xu, Y. Palaskas, A. Ravi, M. Sajadieh, M. A. El-Tanani, and K. Soumyanath, "A flip-chip-packaged 25.3 dBm class-D outphasing power amplifier in 32 nm CMOS for WLAN application," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 7, pp. 1596– 1605, 2011.
- [66] P. Madoglio, A. Ravi, H. Xu, K. Chandrashekar, M. Verhelst, S. Pellerano, L. Cuellar, M. Aguirre, M. Sajadieh, O. Degani, et al., "A 20dBm 2.4 GHz digital outphasing transmitter for WLAN application in 32nm CMOS," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, IEEE, 2012, pp. 168–170.
- [67] K. Cho and R. Gharpurey, "A 25.6 dBm wireless transmitter using RF-PWM with carrier switching in 130-nm CMOS," in *Radio Frequency Integrated Circuits Sympo*sium (RFIC), 2015 IEEE, IEEE, 2015, pp. 139–142.

- [68] S.-M. Yoo, J. S. Walling, E. C. Woo, B. Jann, and D. J. Allstot, "A switched-capacitor RF power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 12, pp. 2977– 2987, 2011.
- [69] S.-M. Yoo, J. S. Walling, O. Degani, B. Jann, R. Sadhwani, J. C. Rudell, and D. J. Allstot, "A class-G switched-capacitor RF power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 5, pp. 1212–1224, 2013.
- [70] W. Yuan and J. S. Walling, "A multiphase switched capacitor power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 5, pp. 1320–1330, 2017.
- [71] D. Chowdhury, S. V. Thyagarajan, L. Ye, E. Alon, and A. M. Niknejad, "A fullyintegrated efficient CMOS inverse class-D power amplifier for digital polar transmitters," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 5, pp. 1113–1122, 2012.
- [72] C. Lu, H. Wang, C. Peng, A. Goel, S. Son, P. Liang, A. Niknejad, H. Hwang, and G. Chien, "A 24.7 dBm all-digital RF transmitter for multimode broadband applications in 40nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers* (ISSCC), 2013 IEEE International, IEEE, 2013, pp. 332–333.
- [73] L. Ye, "Design and analysis of digitally modulated transmitters for efficiency enhancement," PhD thesis, UC Berkeley, 2013.
- [74] R. Bhat and H. Krishnaswamy, "A watt-level 2.4 GHz RF I/Q power DAC transmitter with integrated mixed-domain FIR filtering of quantization noise in 65 nm CMOS," in Proc. IEEE Radio Freq. Integr. Circuits Symp., 2014, pp. 413–416.
- [75] N.-C. Kuo, B. Yang, C. Wu, L. Kong, A. Wang, M. Reiha, E. Alon, A. M. Niknejad, and B. Nikolic, "A frequency-reconfigurable multi-standard 65nm CMOS digital transmitter with LTCC interposers," in *Solid-State Circuits Conference (A-SSCC)*, 2014 IEEE Asian, IEEE, 2014, pp. 345–348.
- [76] S. Hu, S. Kousai, and H. Wang, "A broadband mixed-signal CMOS power amplifier with a hybrid class-G Doherty efficiency enhancement technique," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 3, pp. 598–613, 2016.
- [77] H. Lee, C. Park, and S. Hong, "A quasi-four-pair class-E CMOS RF power amplifier with an integrated passive device transformer," *IEEE Transactions on Microwave Theory and Techniques*, vol. 57, no. 4, pp. 752–759, 2009.
- [78] J. S. Walling, S. S. Taylor, and D. J. Allstot, "A class-G supply modulator and class-E PA in 130 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 9, pp. 2339– 2347, 2009.
- [79] P. A. Godoy, S. Chung, T. W. Barton, D. J. Perreault, and J. L. Dawson, "A 2.4-GHz, 27-dBm asymmetric multilevel outphasing power amplifier in 65-nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 10, pp. 2372–2384, 2012.

- [80] A. Banerjee, R. Hezar, L. Ding, N. Schemm, and B. Haroun, "A 29.5 dBm class-E outphasing RF power amplifier with performance enhancement circuits in 45nm CMOS," in *European Solid State Circuits Conference (ESSCIRC)*, ESSCIRC 2014-40th, IEEE, 2014, pp. 467–470.
- [81] K. Onizuka, S. Saigusa, and S. Otaka, "A 1.8 GHz linear CMOS power amplifier with supply-path switching scheme for WCDMA/LTE applications," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, 2013 IEEE International, IEEE, 2013, pp. 90–91.
- [82] D. Kang, B. Park, D. Kim, J. Kim, Y. Cho, and B. Kim, "Envelope-tracking CMOS power amplifier module for LTE applications," *IEEE Trans. Microw. Theory Techn.*, vol. 61, no. 10, pp. 3763–3773, 2013.
- [83] K. Oishi, E. Yoshida, Y. Sakai, H. Takauchi, Y. Kawano, N. Shirai, H. Kano, M. Kudo, T. Murakami, T. Tamura, et al., "A 1.95 GHz fully integrated envelope elimination and restoration CMOS power amplifier using timing alignment technique for WCDMA and LTE," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 12, pp. 2915–2924, 2014.
- [84] S. Jin, B. Park, K. Moon, J. Kim, M. Kwon, D. Kim, and B. Kim, "A highly efficient CMOS envelope tracking power amplifier using all bias node controls," *IEEE Microwave and Wireless Components Letters*, vol. 25, no. 8, pp. 517–519, 2015.
- [85] P. Asbeck and Z. Popovic, "ET comes of age: Envelope tracking for higher-efficiency power amplifiers," *IEEE Microwave Magazine*, vol. 17, no. 3, pp. 16–25, 2016.
- [86] S. Sehajpal, S. S. Taylor, D. J. Allstot, and J. S. Walling, "Impact of switching glitches in class-G power amplifiers," *IEEE Microwave and Wireless Components Letters*, vol. 22, no. 6, pp. 282–284, 2012.
- [87] H.-C. Lu, C.-C. Kuo, S.-A. Wei, P.-S. Huang, and H. Wang, "Ultra broad band cmos balanced amplifiers using quadrature power splitters on glass integrated passive device (GIPD) and LTCC with flip chip interconnects for SiP integration," in *Microwave* Symposium Digest (MTT), 2012 IEEE MTT-S International, IEEE, 2012, pp. 1–3.
- [88] N.-C. Kuo, B. Yang, A. Wang, L. Kong, C. Wu, V. P. Srini, E. Alon, B. Nikolić, and A. M. Niknejad, "A 0.4-to-4-GHz all-digital RF transmitter package with a bandselecting interposer combining three wideband CMOS transmitters," *IEEE Transactions on Microwave Theory and Techniques*, vol. 66, no. 11, pp. 4967–4984, 2018.
- [89] S. Zihir, O. D. Gurbuz, A. Kar-Roy, S. Raman, and G. M. Rebeiz, "60-GHz 64and 256-elements wafer-scale phased-array transmitters using full-reticle and subreticle stitching techniques," *IEEE Transactions on Microwave Theory and Techniques*, vol. 64, no. 12, pp. 4701–4719, 2016.
- [90] T. Chi, F. Wang, S. Li, M.-Y. Huang, J. S. Park, and H. Wang, "A 60GHz on-chip linear radiator with single-element 27.9 dBm *Psat* and 33.1 dBm peak EIRP using multifeed antenna for direct on-antenna power combining," in *Solid-State Circuits Conference (ISSCC), 2017 IEEE International*, IEEE, 2017, pp. 296–297.

- [91] B. Hanafi, O. Guerbuez, H. Dabag, J. F. Buckwalter, G. Rebeiz, and P. Asbeck, "Q-band spatially combined power amplifier arrays in 45-nm CMOS SOI," *IEEE Transactions on Microwave Theory and Techniques*, vol. 63, no. 6, pp. 1937–1950, 2015.
- [92] M. Boers, B. Afshar, I. Vassiliou, S. Sarkar, S. T. Nicolson, E. Adabi, B. G. Perumana, T. Chalvatzis, S. Kavvadias, P. Sen, et al., "A 16TX/16RX 60 GHz 802.11 ad chipset with single coaxial interface and polarization diversity," *IEEE journal of solid-state* circuits, vol. 49, no. 12, pp. 3031–3045, 2014.
- [93] S. Shahramian, Y. Baeyens, N. Kaneda, and Y.-K. Chen, "A 70-100 GHz directconversion transmitter and receiver phased array chipset demonstrating 10 Gb/s wireless link," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 5, pp. 1113–1125, 2013.
- [94] A. Natarajan, A. Valdes-Garcia, B. Sadhu, S. K. Reynolds, and B. D. Parker, "Wband dual-polarization phased-array transceiver front-end in SiGe BiCMOS," *IEEE Transactions on Microwave Theory and Techniques*, vol. 63, no. 6, pp. 1989–2002, 2015.
- [95] J.-L. Kuo, Y.-F. Lu, T.-Y. Huang, Y.-L. Chang, Y.-K. Hsieh, P.-J. Peng, I.-C. Chang, T.-C. Tsai, K.-Y. Kao, W.-Y. Hsiung, et al., "60-GHz four-element phased-array transmit/receive system-in-package using phase compensation techniques in 65-nm flip-chip CMOS process," *IEEE Transactions on Microwave Theory and Techniques*, vol. 60, no. 3, pp. 743–756, 2012.
- [96] G. Mangraviti, K. Khalaf, Q. Shi, K. Vaesen, D. Guermandi, V. Giannini, S. Brebels, F. Frazzica, A. Bourdoux, C. Soens, et al., "A 4-antenna-path beamforming transceiver for 60GHz multi-Gb/s communication in 28nm CMOS," in *Solid-State Circuits Conference (ISSCC), 2016 IEEE International*, IEEE, 2016, pp. 246–247.
- [97] S. Kishimoto, N. Orihashi, Y. Hamada, M. Ito, and K. Maruhashi, "A 60-GHz band CMOS phased array transmitter utilizing compact baseband phase shifters," in *Radio Frequency Integrated Circuits Symposium, 2009. RFIC 2009. IEEE*, IEEE, 2009, pp. 215–218.
- [98] J. Chen, L. Ye, D. Titz, F. Gianesello, R. Pilard, A. Cathelin, F. Ferrero, C. Luxey, and A. M. Niknejad, "A digitally modulated mm-wave cartesian beamforming transmitter with quadrature spatial combining," in *Solid-State Circuits Conference Digest* of Technical Papers (ISSCC), 2013 IEEE International, IEEE, 2013, pp. 232–233.
- [99] J. Chen, Advanced architectures for efficient mm-Wave CMOS wireless transmitters. University of California, Berkeley, 2013.
- [100] K. Dasgupta, S. Daneshgar, C. Thakkar, K. Datta, J. Jaussi, and B. Casper, "A 25 Gb/s 60 GHz digital power amplifier in 28nm CMOS," in ESSCIRC 2017-43rd IEEE European Solid State Circuits Conference, IEEE, 2017, pp. 207–210.

- [101] A. Balteanu, I. Sarkas, E. Dacquay, A. Tomkins, G. M. Rebeiz, P. M. Asbeck, and S. P. Voinigescu, "A 2-bit, 24 dBm, millimeter-wave SOI CMOS power-DAC cell for watt-level high-efficiency, fully digital m-ary QAM transmitters," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 5, pp. 1126–1137, 2013.
- [102] A. Balteanu, S. Shopov, and S. P. Voinigescu, "A high modulation bandwidth, 110 GHz power-DAC cell for IQ transmitter arrays with direct amplitude and phase modulation," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 10, pp. 2103–2113, 2014.
- [103] S. Shopov, O. D. Gurbuz, G. M. Rebeiz, and S. P. Voinigescu, "A D-band digital transmitter with 64-QAM and OFDM free-space constellation formation," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 7, pp. 2012–2022, 2018.
- [104] N.-C. Kuo and A. M. Niknejad, "An E-band QPSK transmitter element in 28-nm CMOS with multistate power amplifier for digitally-modulated phased arrays," in 2018 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), IEEE, 2018, pp. 184–187.
- [105] H. L. Van Trees, Optimum array processing: Part IV of detection, estimation, and modulation theory. John Wiley & Sons, 2004.
- [106] R. S. Elliot, Antenna theory and design. John Wiley & Sons, 2006.
- [107] A. Suarez, "Check the stability: Stability analysis methods for microwave circuits," *IEEE Microwave Magazine*, vol. 16, no. 5, pp. 69–90, 2015.
- [108] A. Suárez, Analysis and design of autonomous microwave circuits. John Wiley & Sons, 2009, vol. 190.
- [109] W. Zhang, S.-C. Wong, K. T. Chi, and Q. Chen, "Analysis and comparison of secondary series-and parallel-compensated inductive power transfer systems operating for optimal efficiency and load-independent voltage-transfer ratio," *IEEE Transactions on Power Electronics*, vol. 29, no. 6, pp. 2979–2990, 2014.
- [110] N.-C. Kuo, B. Zhao, and A. M. Niknejad, "Bifurcation analysis in weakly-coupled inductive power transfer systems," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 63, no. 5, pp. 727–738, 2016.
- [111] N.-C. Kuo and A. M. Niknejad, "Low-leakage constellation formation exploiting combination redundancy in a digitally-modulated mmw phased array," Manuscript submitted for publication, 2018.
- [112] M. Nariman, F. Shirinfar, A. P. Toda, S. Pamarti, A. Rofougaran, and F. De Flaviis, "A compact 60GHz wireless power transfer system," *IEEE Transactions on Microwave Theory and Techniques*, vol. 64, no. 8, pp. 2664–2677, 2016.
- [113] M. Nariman, F. Shirinfar, S. Pamarti, A. Rofougaran, and F. De Flaviis, "High-efficiency millimeter-wave energy-harvesting systems with milliwatt-level output power," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 64, no. 6, pp. 605–609, 2017.

- [114] K. Kotani, A. Sasaki, and T. Ito, "High-efficiency differential-drive CMOS rectifier for UHF RFIDs," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 11, pp. 3011–3018, 2009.
- [115] S. C. Thierauf, *High-speed circuit board signal integrity*. Artech House, 2017.
- [116] S. J. Orfanidis. (2016). *Electromagnetic waves and antennas*, [Online]. Available: http://www.ece.rutgers.edu/~orfanidi/ewa/ (visited on 11/09/2018).
- [117] H.-C. Hsieh, C.-N. Chiu, C.-H. Wang, and C. H. Chen, "A new approach for fast analysis of spurious emissions from RF/microwave circuits," *IEEE Transactions on Electromagnetic Compatibility*, vol. 51, no. 3, pp. 631–638, 2009.
- [118] A. M. Niknejad and R. G. Meyer, "Analysis, design, and optimization of spiral inductors and transformers for Si RF ICs," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 10, pp. 1470–1481, 1998.
- [119] S. S. Mohan, M. del Mar Hershenson, S. P. Boyd, and T. H. Lee, "Simple accurate expressions for planar spiral inductances," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 10, pp. 1419–1424, 1999.
- [120] J. Kimionis and M. M. Tentzeris, "RF tag front-end design for uncompromised communication and harvesting," in *RFID Technology and Applications Conference (RFID-TA), 2014 IEEE*, IEEE, 2014, pp. 109–114.
- [121] I. Mayordomo, R. Berenguer, A. García-Alonso, I. Fernández, and Í. Gutiérrez, "Design and implementation of a long-range RFID reader for passive transponders," *IEEE Transactions on Microwave Theory and Techniques*, vol. 57, no. 5, pp. 1283–1290, 2009.
- [122] N. F. B. Bautista and J. J. S. Marciano, "Enhanced FM0 decoder for UHF passive RFID readers using duty cycle estimations," in *RFID-Technologies and Applications (RFID-TA), 2011 IEEE International Conference on*, IEEE, 2011, pp. 306–312.
- [123] A. Boaventura, D. Belo, R. Fernandes, A. Collado, A. Georgiadis, and N. B. Carvalho, "Boosting the efficiency: Unconventional waveform design for efficient wireless power transfer," *IEEE Microwave Magazine*, vol. 16, no. 3, pp. 87–96, 2015.
- [124] M. S. Trotter, J. D. Griffin, and G. D. Durgin, "Power-optimized waveforms for improving the range and reliability of RFID systems," in *RFID, 2009 IEEE International Conference on*, IEEE, 2009, pp. 80–87.
- [125] S. C. Cripps, *Advanced techniques in RF power amplifier design*. Artech House, 2002.
- [126] J. Bae, H. Koo, H. Lee, W. Lim, W. Lee, H. Kang, K. C. Hwang, K.-Y. Lee, and Y. Yang, "High-efficiency rectifier (5.2 GHz) using a class-F Dickson charge pump," *Microwave and Optical Technology Letters*, vol. 59, no. 12, pp. 3018–3023, 2017.
- [127] B. Razavi, *RF Microelectronics*. Upper Saddle River, NJ: Prentice Hall, 2011.

- [128] C. J. Galbraith, T. M. Hancock, and G. M. Rebeiz, "A low-loss double-tuned transformer," *IEEE Microwave and Wireless Components Letters*, vol. 17, no. 11, pp. 772–774, 2007.
- [129] J. S. Park, Y. Wang, S. Pellerano, C. Hull, and H. Wang, "A CMOS wideband current-mode digital polar power amplifier with built-in AM–PM distortion self-compensation," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 2, pp. 340–356, 2018.
- [130] B. François and P. Reynaert, "A fully integrated watt-level linear 900-MHz CMOS RF power amplifier for LTE-applications," *IEEE Transactions on Microwave Theory and Techniques*, vol. 60, no. 6, pp. 1878–1885, 2012.
- [131] E. Kaymaksut and P. Reynaert, "Dual-mode CMOS Doherty LTE power amplifier with symmetric hybrid transformer," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 9, pp. 1974–1987, 2015.
- [132] H. Ahn, S. Baek, H. Ryu, I. Nam, and O. Lee, "A highly efficient WLAN CMOS PA with two-winding and single-winding combined transformer," in *Radio Frequency Integrated Circuits Symposium (RFIC), 2016 IEEE*, IEEE, 2016, pp. 310–313.
- [133] P. Oßmann, J. Fuhrmann, K. Dufrêne, J. Fritzin, J. Moreira, H. Pretl, and A. Springer, "Design of a fully integrated two-stage watt-level power amplifier using 28-nm CMOS technology," *IEEE Transactions on Microwave Theory and Techniques*, vol. 64, no. 1, pp. 188–199, 2016.
- [134] H. Qian, Q. Liu, J. Silva-Martinez, and S. Hoyos, "A 35 dBm output power and 38 dB linear gain PA with 44.9% peak PAE at 1.9 GHz in 40 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 3, pp. 587–597, 2016.
- [135] R. Bhat, J. Zhou, and H. Krishnaswamy, "A > 1W 2.2 GHz switched-capacitor digital power amplifier with wideband mixed-domain multi-tap FIR filtering of OOB noise floor," in *Solid-State Circuits Conference (ISSCC), 2017 IEEE International*, IEEE, 2017, pp. 234–235.
- [136] A. Passamani, D. Ponton, E. Thaller, G. Knoblinger, A. Neviani, and A. Bevilacqua, "A 1.1 V 28.6 dBm fully integrated digital power amplifier for mobile and wireless applications in 28nm CMOS technology with 35% PAE," in *Solid-State Circuits Conference (ISSCC), 2017 IEEE International*, IEEE, 2017, pp. 232–233.
- [137] V. Vorapipat, C. S. Levy, and P. M. Asbeck, "A class-G voltage-mode Doherty power amplifier," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 12, pp. 3348–3360, 2017.
- [138] B. Park, D. Kim, S. Kim, Y. Cho, J. Kim, D. Kang, S. Jin, K. Moon, and B. Kim, "High-performance CMOS power amplifier with improved envelope tracking supply modulator," *IEEE Transactions on Microwave Theory and Techniques*, vol. 64, no. 3, pp. 798–809, 2016.

- [139] X. Liu, H. Zhang, M. Zhao, X. Chen, P. K. Mok, and H. C. Luong, "A 2.4 V 23.9 dBm 35.7%-PAE −32.1 dBc-ACLR LTE-20MHz envelope-shaping-and-tracking system with a multiloop-controlled AC-coupling supply modulator and a mode-switching PA," in *Solid-State Circuits Conference (ISSCC), 2017 IEEE International*, IEEE, 2017, pp. 38–39.
- [140] J. Chen and A. M. Niknejad, "A compact 1V 18.6 dBm 60GHz power amplifier in 65nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International*, IEEE, 2011, pp. 432–433.
- [141] C.-W. Wu, Y.-H. Lin, Y.-H. Hsiao, C.-F. Chou, Y.-C. Wu, and H. Wang, "Design of a 60-GHz high-output power stacked-FET power amplifier using transformer-based voltage-type power combining in 65-nm CMOS," *IEEE Transactions on Microwave Theory and Techniques*, no. 99, pp. 1–13, 2018.
- [142] D. Zhao and P. Reynaert, "21.3 dBm 18.5 GHz-BW 8-way E-band power amplifier in 28 nm high performance mobile CMOS," *Electronics Letters*, vol. 53, no. 19, pp. 1310–1312, 2017.
- [143] S. V. Thyagarajan, A. M. Niknejad, and C. D. Hull, "A 60 GHz drain-source neutralized wideband linear power amplifier in 28 nm CMOS," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 61, no. 8, pp. 2253–2262, 2014.
- [144] S. Kang, S. V. Thyagarajan, and A. M. Niknejad, "A 240 GHz fully integrated wideband QPSK transmitter in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 10, pp. 2256–2267, 2015.
- [145] T. LaRocca, Y.-C. Wu, K. Thai, R. Snyder, N. Daftari, O. Fordham, P. Rodgers, M. Watanabe, Y. Yang, M. Ardakani, et al., "A 64QAM 94GHz CMOS transmitter SoC with digitally-assisted power amplifiers and thru-silicon waveguide power combiners," in *Radio Frequency Integrated Circuits Symposium, 2014 IEEE*, IEEE, 2014, pp. 295–298.
- [146] A. Medra, V. Giannini, D. Guermandi, and P. Wambacq, "A 79GHz variable gain low-noise amplifier and power amplifier in 28nm CMOS operating up to 125 deg C," in *European Solid State Circuits Conference (ESSCIRC), ESSCIRC 2014-40th*, IEEE, 2014, pp. 183–186.
- [147] Y. Chao, L. Li, and H. C. Luong, "An 86-to-94.3 GHz transmitter with 15.3 dBm output power and 9.6% efficiency in 65nm CMOS," in *Solid-State Circuits Conference (ISSCC), 2016 IEEE International*, IEEE, 2016, pp. 346–347.
- [148] K. K. Tokgoz, S. Maki, S. Kawai, N. Nagashima, J. Emmei, M. Dome, H. Kato, J. Pang, Y. Kawano, T. Suzuki, et al., "A 56Gb/s W-band CMOS wireless transceiver," in *Solid-State Circuits Conference (ISSCC), 2016 IEEE International*, IEEE, 2016, pp. 242–243.

- [149] B. Rupakula, A. H. Aljuhani, and G. M. Rebeiz, "Linearity and efficiency improvements in phased-array transmitters with large number of elements and complex modulation," in *2018 IEEE/MTT-S International Microwave Symposium-IMS*, IEEE, 2018, pp. 791–793.