# Energy-Efficient mm-Wave Systems for Communication and Sensing



Maryam Tabesh

# Electrical Engineering and Computer Sciences University of California at Berkeley

Technical Report No. UCB/EECS-2016-171 http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-171.html

December 1, 2016

Copyright © 2016, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

### Energy-Efficient mm-Wave Systems for Communication and Sensing

by

Maryam Tabesh

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy

in

Engineering - Electrical Engineering and Computer Sciences

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor Ali M. Niknejad, Chair

Professor Jan M. Rabaey

Professor Paul K. Wright

Fall 2014

Copyright 2014 by Maryam Tabesh

#### Abstract

#### Energy-Efficient mm-Wave Systems for Communication and Sensing

by

Maryam Tabesh

Doctor of Philosophy in Engineering - Electrical Engineering and Computer Sciences

University of California, Berkeley

Professor Ali M. Niknejad, Chair

Wireless technology is facing several important trends. Growing demand for throughput and capacity on one end, and the ever-increasing push towards ubiquitous connectivity on the other, have strained current systems and standards. Energy efficiency remains a critical issue for all of these applications, and has become an important challenge to balance in the face of higher performance requirements. New applications and standards require radio systems that can provide multi-Gb/s wireless links on mobile devices with minimal impact on battery life. At the same time, connectivity is projected to move from people to objects in the realm of the Internet of Things (IoT), scaling to trillions of radios in the next decade. This work addresses the capacity and connectivity demands of next-generation radios.

In the area of IoT, ultra-low-power, highly miniaturized smart radios that can provide unique IP addresses and their locations are an important requirement of the system. In this context, battery-less radios are the ultimate frontier in scaling the size and cost of a communication node. However, several key challenges still need to be addressed in this area. Cost (dominated by the antenna board and interface), the number of readable transponders (and the latency in reading them), data-rate capacity, localization, and miniaturization are the issues faced by today's designers. To overcome these challenges, a new approach to the miniaturization of smart passive radios is proposed. We demonstrate a single-chip 24GHz/60GHz radio implemented in 65nm CMOS. This millimeter-sized radio chip is fully self-sufficient, with no pads or external components (e.g. power supply).

To address capacity demands, the 60GHz band, with 7GHz of unlicensed spectrum, provides a unique opportunity. A shift to mm-wave systems also enables beamforming by phased-array systems, which is critical in interfacing with the proposed passive radio. However, integration, energy efficiency, and system cost remain obstacles in this space. In the second part of this thesis, we propose new system and circuit solutions and strategies for this application, and demonstrate an example design with a 4-element 60GHz phased-array receiver in 65nm CMOS. Energy and area efficiency are achieved by utilizing a baseband phase-shifting architecture, holistic impedance optimization, and lumped-element-based design. Finally, to enable scalability to larger arrays and to support hybrid RF/IF phase shifting, a lumped-element architecture for 60GHz phase shifters is proposed and is shown to achieve low loss in a compact form factor.

# Contents

| Section | Title |
|---------|-------|
|         | List of Figures |
|         | List of Tables |
| 1       | Introduction |
| 1.1     | Capacity Requirements and mm-Wave Spectrum |
| 1.2     | Highly Miniaturized Radios for Sensors and IoT |
| 1.2.1   | Recent Market Trends |
| 1.2.2   | Connectivity for IoT |
| 1.3     | Organization |
| 2       | mm-Wave Miniaturized Radio |
| 2.1     | System-Level Observations for Wireless Connectivity |
| 2.2     | Present State of Knowledge in the Field |
| 2.3     | Design of a mm-Sized Radio |
| 2.3.1   | Friis Propagation Law |
| 2.3.2   | Effect of Wavelength Match in Communicating with Miniaturized Sensors |
| 2.3.3   | Dielectric Loading of Antennas |
| 2.3.4   | Choice of Downlink Frequency |
| 2.3.5   | Uplink Communications and Bandwidth Requirements |
| 2.4     | Link Budget Calculations |
| 2.4.1   | Uplink |
| 2.4.2   | Downlink |
| 3       | Ultra Low-Power Transponder Design |
| 3.1     | Power Recovery and Regulation |
| 3.1.1   | Voltage Multiplier |
| 3.1.1.1 | Voltage Multiplier Theory |
| 3.1.1.2 | Input Matching Network |
| 3.1.1.3 | 24GHz AC-DC Convertor Design |
| 3.1.2   | Limiter |
| 3.1.3   | Reference Current and Voltage Generator |
| 3.1.4   | Series Voltage Regulator (LDO) |
| 3.2     | Power On Reset (POR) |
| 3.3     | Data Recovery and Demodulation |
| 3.3.1   | Encoding and Modulation Type |
| 3.3.2   | ASK Demodulator |
| 3.3.3   | PPE Decoder |
| 3.3.4   | State Machine |
| 3.4     | Timing, Synchronization, and Multi-Access |
| 3.4.1   | Multiaccess Algorithm |
| 3.4.2   | Modified M-PPM Modulation |
| 3.4.3   | Notch Period or Duration of Time slot |
| 3.4.4   | Ring Oscillator Frequency |
| 3.4.4.1 | Accumulated Jitter |
| 3.4.4.2 | Power Consumption |
| 3.4.4.3 | Start up Time of 60GHz Pulser |
| 3.4.5   | Notch Detector |
| 3.4.6   | Ring Oscillator and Calibration Block |
| 3.4.7   | M-PPM Modulator |
| 3.5     | 60GHz Transmitter |
| 3.6     | On-Chip Antenna Design |
| 3.6.1   | 24GHz Folded Dipole Antenna |
| 3.6.2   | Half-wave 60GHz Dipole Antenna |
| 4       | Experimental Results |
| 4.1     | Power Recovery and Demodulator Measurements |
| 4.2     | Uplink Measurements |
| 4.3     | Wireless Measurement |
| 4.4     | Single-Chip Measurement |
| 5       | 60GHz Phased-Array TRX |
| 5.1     | Introduction |
| 5.2     | Theory of Phased-Array Antennas |
| 5.3     | mm-Wave Phased-Array Architectural Choices |
| 5.3.1   | Architecture Choice |
| 5.3.2   | Number of Elements |
| 5.3.3   | Phase Resolution |
| 5.4     | Link Budget Calculation |
| 5.5     | Transceiver |
| 5.6     | Receiver |
| 5.6.1   | LNA |
| 5.6.2   | Mixer |
| 5.6.2.1 | Mixer Architecture |
| 5.6.2.2 | Mixer Design |
| 5.6.3   | Baseband Phase Rotator |
| 5.7     | Experimental Results |
| 5.8     | Hybrid Array |
| 6       | 60GHz Phase Shifters |
| 6.1     | RF Phase Shifters |
| 6.2     | Reflective Type Phase Shifter |
| 6.3     | Hybrid Design |
| 6.4     | Cascade RTPS |
| 6.5     | RLT Reflective-Type Phase Shifter |
| 6.6     | Experimental Results |
| 7       | Conclusion |
|         | Bibliography |

# List of Figures

| Figure | Caption |
|--------|---------|
| 1.1, 1.2, 1.3 | Global mobile yearly data traffic (from [5]) |
| 2.1  | General constraints of the miniaturized radio |
| 2.2  | Mismatch between wavelength and the available aperture size |
| 2.3  | Cross-sectional area of TX and RX antennas (a) TX and RX are both large antennas (b) TX is large antenna but RX is a small antenna |
| 2.4  | Tradeoff between power recovery efficiency and available received power |
| 2.5  | The received power in terms of frequency and range for a fixed aperture size on the transmitter. |
| 2.6  | The received power in terms of frequency and range for a fixed transmitter's EIRP. |
| 2.7  | First-order model for the chip, antenna, and L-matching network. |
| 2.8  | Input voltage of multiplier as a function of frequency for different $R_s$ values |
| 2.9  | Input voltage of multiplier as a function of frequency, for frequency dependent $R_s$ value (blue trace) and constant $R_s$ value (red trace), for different inductor quality factors |
| 2.10 | Input voltage of multiplier as a function of frequency for frequency dependent $R_s$ value (blue trace) and constant $R_s$ value (red trace), for different chip input capacitance |
| 2.11 | SNR as a function of spectral efficiency |
| 3.1  | Block diagram of the passive RFID |
| 3.2  | Block diagram of the $V_{DD}$ generator |
| 3.3  | Schematic of one stage Dickson voltage multiplier (a) Diodes (b) Using CMOS diode connected devices |
| 3.4  | Equivalent model for input impedance of voltage multiplier |
| 3.5  | Equivalent model for input transformer |
| 3.6  | Thevenin equivalent of input matching network |
| 3.7  | Voltage gain of matching network and voltage multiplier as a function of number of stages for $450\,\text{k}\Omega$ load and 0.9V output voltage |
| 3.8  | Schematic of the DC limiter |
| 3.9, 3.10 | Output voltage of the limiter as a function of rectifier's input power |
| 3.11 | Reference current and voltage as a function of both temperature and $V_{dd}$ variation for SS, TT, and FF corners |
| 3.12 | Schematic of the Low-Drop-Out (LDO) voltage regulator |
| 3.13 | Open loop gain and phase of the regulator for nominal $1.4\,\mu\text{A}$ output current and 1nF load capacitance |
| 3.14 | The closed loop bandwidth and phase margin as a function of output load current |
| 3.15 | Schematic of the power on reset circuit |
| 3.16 | Transient output of POR for TT, SS, and FF corners |
| 3.17 | Block diagram of the demodulator |
| 3.18 | Baseband waveform of (a) NRZ encoding (b) Manchester encoding (c) Pulse pause encoding |
| 3.19 | Schematic of the ASK Demodulator |
| 3.20 | Schematic of the PPE Decoder |
| 3.21 | Command signals from reader to tag |
| 3.22 | Schematic of the timing block |
| 3.23 | Using electronic beamforming to communicate with radio tags sector by sector |
| 3.24 | ALOHA anti-collision algorithms (a) PA (b) SA (c) FSA |
| 3.25 | M-PPM system with (a) uncorrected fixed CLK frequency on the reader, and (b) Modified version with 3 pulses and a variable corrected CLK frequency on the reader |
| 3.26 | (a) Block diagram of the notch detector (b) schematic of the synchronous binary counter (c) schematic of the JK flip-flop |
| 3.27 | (a) The schematic of the ring oscillator and the calibration block, (b) simulation result of the ring oscillator for FF process corner |
| 3.28 | Simulation of the ring oscillator output frequency before and after calibration (a) FF corner (b) SS corner |
| 3.29 | Block diagram of the M-PPM modulator |
| 3.30 | Block diagram and schematic of the TX blocks |
| 3.31 | Oscillation frequency as a function of supply voltage |
| 3.32 | Output pulse of the 60GHz pulser |
| 3.33 | Overview of the final chip architecture |
| 3.34 | Overall simulated pattern and gain for the 24 GHz and 60 GHz chip antennas |
| 4.1  | Chip micrograph of the pad-less transponder. |
| 4.2  | Measured chip sensitivity and regulator output. |
| 4.3  | Down-converted measurements for the single ID pulse mode. The red traces show the envelope of the downlink data sequence to the chip. The black traces are the downconverted measurements of the single 60GHz uplink pulses from the chip. The bottom row shows the zoomed version of the same measurement |
| 4.4  | Direct measurement of output pulses using sampling oscilloscope |
| 4.5, 4.6, 4.7, 4.8 | Downconverted output pulse for $N = 2$ and $D = 3$ (red: DL, black: UL)<br>Downconverted output pulses with multiple tag ID number<br>Downconverted output pulses with multiple output data |
| 4.9, 4.10 | Wireless setup for measuring of single-chip tag |
| 5.1, 5.2 | An N-element time-array system |
| 5.3  | Block diagram of 2-element RX, LO distribution, and TX for (a) BB phase shifting (b) RF phase shifting (c) LO phase shifting architectures |
| 5.4  | … number of elements |
| 5.5  | Maximum allowable bandwidth as a function of number of elements for incident angle of $\pm 90^{\circ}$ |
| 5.6  | SNR improvement as a function of incident angle for multiple phase resolutions (a) $N = 4$ (b) $N = 16$ |
| 5.7  | BER as a function of SNR for BPSK, QPSK, and 16QAM modulations |
| 5.8  | Block diagram of 4-element phased array transceiver |
| 5.9  | Schematic of one receiver element |
| 5.10 | Schematic of the 3-stage LNA |
| 5.11 | Simulated power gain, noise figure, and input $S_{11}$ of LNA for four different bias currents |
| 5.12 | Schematic of single balanced mixer |
| 5.13, 5.14 | Transceiver die photo (left side: RX, center: LO generation and distribution, right: TX). |
| 5.15 | Measured $S_{11}$ of the 4-element receiver |
| 5.16 | Single-element RX gain and BW measurement setup (a) Overall BW measurement (b) RF BW measurement |
| 5.17 | Single-element RX measured gain and BW: Overall BW (red) and RF front end BW (black). |
| 5.18 | Measured receiver noise figure |
| 5.19 | Measured RX phase constellations for all four elements |
| 5.20 | Measured RX's two-element pattern |
| 5.21 | Synthesized array patterns with array steered to: (a) $30^{\circ}$ (b) $-45^{\circ}$ |
| 5.22 | Block diagram of a 16-element RF-IF hybrid array |
| 6.1  | Block diagram of vector modulator-based phase shifter |
| 6.2  | Schematic of varactor loaded transmission line phase shifter (a) Distributed model (b) Lumped model |
| 6.3  | Block diagram of reflective type phase shifter |
| 6.4  | Schematics of different reflective loads (a) Varactor load (b) SRL (c) RLT |
| 6.5  | Circuit schematic of the transformer-based hybrid. |
| 6.6  | Equivalent circuit for even/odd mode excitation (a) $\Gamma_{ee}$ (b) $\Gamma_{eo}$ (c) $\Gamma_{oe}$ (d) $\Gamma_{oo}$. |
| 6.7  | Simulation results for $L$, $C_1$, and $C_2$ as a function of $C_3$ and transformer coupling factor ($k$) |
| 6.8  | Circuit schematics of the cascaded RTPS |
| 6.9  | Required varactor tuning range (TR) as a function of minimum capacitance for different number of cascade stages. |
| 6.10 | Total and per stage loss for different number of cascade stages |
| 6.11 | Circuit schematics of the RLT RTPS |
| 6.12 | (a) Phase shift as a function of $C_v$ for different values of $C_T$ (b) Phase shift as a function of $C_T$ for $C_{min} = 48\,\text{fF}$ and $TR = 3.3$ |
| 6.13 | Chip micrographs (Top: RLT, Bottom: Cascade) |
| 6.14 | Measured cascaded RTPS performance (a) phase shift vs. control voltage, (b) phase shift vs. frequency, (c) Loss vs. control voltage, (d) Loss vs. frequency, (e) $S_{11}$ vs. control voltage, and (f) $S_{11}$ vs. frequency. |
| 6.15 | Measured RLT RTPS performance (a) phase shift vs. control voltage, (b) phase                                                                                        |     |
|      | shift vs. frequency, (c) Loss vs. control voltage, (d) Loss vs. frequency, (e) $S_{11}$                                                                             |     |
|      | vs. control voltage, and (f) $S_{11}$ vs. frequency. $\ldots$ $\ldots$ $\ldots$ $\ldots$                                                                            | 123 |

# List of Tables

| 2.1, 2.2 | Summary of transmitter link budget   | 27, 27  |
|----------|--------------------------------------|---------|
| 4.1      | Summary of chip performance          | 79      |
| 5.1, 5.2 | Link budget analysis                 | 92, 106 |
| 6.1      | Summary of measured chip performance | 120     |

## Acknowledgments

During my time at Berkeley I've had the privilege of interacting with and learning from many brilliant and talented individuals who made my PhD experience so much richer. First, I would like to thank my adviser Professor Niknejad. He is kind and generous, and has always been a constant source of encouragement and support. I am thankful for his advice as well as wonderful teaching in EE142 and EE242.

I would also like to thank Professors Jan Rabaey and Paul Wright for taking part in my dissertation and qualifying exam committees, and for their time and feedback. I am grateful to Professor Bahai for being a member of my qualifying exam committee and for his support. I am thankful for having the chance to work with Professor Elad Alon in the 60GHz project and for his support.

Most important of all, I would like to thank my best friend, colleague, husband, and my love, Amin, who has been with me in every step of the way in the past twelve years. I'd like to thank him for his endless support and unconditional love. We've had one hell of a ride together and I look forward to many exciting experiences to come!

My special thanks go to Professor Haideh Khorramabadi, a terrific teacher and more importantly, an infinite source of never-ending kindness and support. She is just an amazing person. I am very fortunate to know Haideh.

I would like to thank my friends and colleagues at BWRC. Jiashu Chen and Cristian Marcu were brilliant collaborators and great friends. I enjoyed our discussions on various topics. I interacted with a great group of students at BWRC including Lingkai Kong, Shinwon Kong, Steven Callender, Rikky Muller, Siva Thyagarjan, Milos Jorgovanovic, and Simone Gambini. I would like to thank them all. I am thankful to Ehsan Adabi and Bagher Afshar for many discussions and great feedback.

My warmest thanks go to fantastic friends I have been fortunate to have in my life, including Ashkan, Sanaz, Alireza, Shadi, Arash, Pedram, Maryam Ziaei, and Maryam Vareth. Life was way more enjoyable with such amazing friends.

My gratitude goes to former and current BWRC faculty, staff, and students, and in particular Tom Boot, Gary Kelson, Professor David Allstot, Brian Richards, Bira, Sarah, Leslie, and Olivia.

I am also thankful to Ali Tassoudji and Bo Sun, my mentors for the Qualcomm Innovation Fellowship. They both provided important and helpful feedback on the progress of this project. My greatest thanks and gratitude go to my parents Fereshteh and Sirous and also my awesome brother Saman. This work is the result of their endless love and constant support.

# Chapter 1 Introduction

Since the early experiments of Marconi and the fascinating demonstration of the first transatlantic transmission with "Hertzian waves", wireless technology has come a long way, and is now an integral part of our lives. A century after Marconi's first transatlantic message in 1901, wireless technology provides coverage to billions of people across the planet. Major milestones that reflect the tremendous growth of wireless technology, for example the number of wireless subscriptions surpassing landline subscriptions, are already behind us.

The explosive growth of mobile connectivity in the past decade is clearly visible in numbers: in 1990 mobile subscriptions numbered 12.4 million, while by 2011 this had rocketed to four billion. Cisco reported an 81% increase in mobile data usage in 2013, and projected that global mobile data traffic is on track to increase 11-fold between 2013 and 2018 [1].

Projections reflect continuing strong growth in connectivity worldwide, and new factors that could surpass previous growth catalysts (e.g., mobile devices) are seen to be the main drivers for ubiquitous wireless connectivity. One major factor in this space is the fast-growing rate of mobile data traffic, which necessitates better portal devices to access and process this large volume of data, as well as a strong communication backbone to enable the growth in data transfer. Another trend, with the potential to offer a new inflection point in the number of connected devices, is the technology to connect everyday objects to the Internet and to facilitate machine-to-machine networks. This falls under the umbrella of what is widely known as the Internet-of-Things (IoT) or Internet-of-Everything (IoE) [2] [3] [4] [5]. These two trends are shown in Fig. 1.1 and Fig. 1.2. The first figure shows the exponential growth in mobile data traffic, roughly doubling every two years, and how it is now dominated by large tablet devices. This trend is projected to grow further, and will exert pressure on the capacity of current wireless networks. Figure 1.2 portrays the IoT trend, displaying the growth in connecting objects and "things" to the Internet. Both of these trends will be discussed in this chapter.



Figure 1.1: Global mobile yearly data traffic (from [5])

## 1.1 Capacity Requirements and mm-Wave Spectrum

With the ever-increasing demand for connection capacity in various electronic systems, from data servers to mobile devices and personal area networks (PAN), there has been a constant quest for high-throughput wireless links that can meet current needs as well as enable new applications and future growth [1] [6]. In many ways, current systems have optimized spectral usage to a great extent, and have utilized available degrees of freedom in modulation schemes, network topology, and spatial diversity. We are now at a point where future capacity needs have to be (at least partially) met by new spectrum. Significant gains in available bandwidth can only be achieved by migrating to higher carrier frequencies, for several reasons. First, the frequencies below 6 GHz are already heavily occupied by current wireless radio and radar standards. Second, with a fixed fractional bandwidth, moving to higher carrier frequencies leads to larger available absolute bandwidths. Higher frequencies will also enable better electronic beamforming with a given aperture size, which can improve range and capacity by providing access to spatial diversity and point-to-point wireless links. For these and other reasons that will be described later, frequency bands in the mm-wave range are currently being investigated for a variety of communication applications.
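The fixed-fractional-bandwidth argument can be made concrete with a short calculation. The following is a minimal sketch; the 10% fractional bandwidth and the list of carrier frequencies are illustrative assumptions, not values taken from this work.

```python
# Absolute bandwidth available at a fixed fractional bandwidth.
# The 10% figure and the carrier list are assumptions for illustration only.

def absolute_bandwidth_hz(carrier_hz: float, fractional_bw: float) -> float:
    """Return the absolute bandwidth (Hz) for a carrier and a fractional bandwidth."""
    return carrier_hz * fractional_bw


FRACTIONAL_BW = 0.10  # assumed fractional bandwidth

for carrier_ghz in (2.4, 5.0, 60.0):
    bw_ghz = absolute_bandwidth_hz(carrier_ghz * 1e9, FRACTIONAL_BW) / 1e9
    print(f"{carrier_ghz:5.1f} GHz carrier -> {bw_ghz:.2f} GHz of absolute bandwidth")
```

Under these assumptions, the same fractional bandwidth that yields a few hundred MHz near 2.4 GHz yields several GHz at 60 GHz.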

Figure 1.2: The IoT trend [2]

Millimeter-wave technology is projected to have strong growth in the mobile sector in the coming years. For example, global mm-wave technology market revenue is expected to grow from \$208.12 million in 2014 to \$1,941.23 million in 2020 [7]. Mobile devices, tablets, personal computers, TV sets, and many other electronic devices are expected to carry mm-wave phased-array systems within the next 5 years. Several industry-led standards, including WirelessHD and IEEE 802.11ad/WiGig (Wireless Gigabit Alliance), have already been released [8] [9] [10]. These standards directly address wireless capacity needs and are expected to deliver data rates of several Gbps over distances of 1-10 meters. The critical requirement, especially for proliferation into mobile devices, is the energy efficiency of the mm-wave link. Solutions that consume several watts of power for these links cannot be readily integrated into portable devices.

In this dissertation, we address fundamental challenges in the design of efficient silicon-based mm-wave phased arrays, and propose a CMOS design that provides the required energy efficiency for mobile applications.

## 1.2 Highly Miniaturized Radios for Sensors and IoT

For over a hundred years, starting from Marconi's experiment of connecting "stations", several generations of wireless devices have connected people with stations and with each other, resulting in over 6 billion mobile subscribers in the world. Much as the previous inflection point, the crossover from connecting "places" to connecting people, led to unprecedented growth in mobile connectivity, the next inflection point will be just as extraordinary, if not more so. This next exponential growth in connectivity is in connecting objects and machines in the age of the "Internet of Things (IoT)". This is where the Internet moves beyond our cell phones, tablets, and laptops, and enters everyday objects including consumer electronics, household appliances, engines, oil rig drills, entire factories, warehouses, wearable electronics, thermostats, light bulbs, coffee-makers, and a million other "things". Although discussions of IoT began over two decades ago, it is only now that we have the right recipe of technology, ecosystem, and market to make this a reality. In fact, former PARC chief scientist Mark Weiser wrote about "ubiquitous computing" in a 1991 article that appeared in Scientific American [11].

Today, most of the information on the web is captured, created, or manipulated by human beings. This happens through recording audio, video, and photographs, typing, communicating (audio and visual), and all other actions taken by human beings. Even though this data by itself is enormous and growing rapidly (see Fig. 1.1), it is still generated and processed by humans. If we had machines that would "know" and sense information about the environment and the things in it, and that would take action, communicate, and process the complexity of the world without direct supervision from us, we would live in a potentially more efficient and organized world. The nature of the information will also be different, being mainly contributed by machines for the consumption of other machines and systems. This is the premise of IoT.

Looking into a future with ubiquitous connectivity and IoT, a world of inexpensive wireless sensors (embedded everywhere, from wearables and clothing to home lighting systems) would work in concert to monitor and measure everything from diet to sleep behavior, and then use this information to modify future actions in valuable ways. For example, the system could provide feedback on sugar intake or caffeine use at night, or control light patterns to provide better sleep. We are already seeing snippets of these technologies in the form of "apps" popping up here and there. With IoT we will enter a new realm of devices taking action based on available data.

In another commonly envisioned scenario, the move toward smarter homes can provide adequate monitoring capability for the elderly. Billions of dollars will be saved by transferring patients from hospitals to homes. If 25% of the people in nursing homes went home, we could save \$12B annually in the US alone [12]. This becomes more important once we consider that by 2030 people aged 65 and older will represent 19% of the population, up from 12.4% in 2000 [12].

Projections show sensor demand growing from billions in 2012 to trillions within the next decade, and this is largely fueled by the emergence of smart sensors that combine computation, communication, and sensing (Fig. 1.3) [13]. Ultra-low power smart radios that can provide unique IP addresses and their locations are a requirement for IoT.



Figure 1.3: Projected growth of sensors in the next few decades (from [13])

### 1.2.1 Recent Market Trends

According to the McKinsey Global Institute, the Internet of Things has the potential to create an economic impact of \$2.7 trillion to \$6.2 trillion annually by 2025 [14]. There is a huge financial component to this technology, with scaling behavior that surpasses anything from the past. This has led companies and organizations to come together to understand the market and need, and to provide solutions. Consortiums are formed and alliances initiated.

Samsung, Intel, Dell, Atmel and Broadcom have joined forces to launch the Open Interconnect Consortium (OIC), an organization that will set standards for connecting billions of household gadgets and appliances<sup>1</sup>.

Haier, LG Electronics, Panasonic, Qualcomm, Sharp, Technicolor, Silicon Image and TP-LINK announced the AllSeen Alliance in December, which now has a total of 51 members<sup>2</sup>. Apple announced a new smart home framework called HomeKit, which can be used for controlling connected devices inside of a user's home.

<sup>1</sup> http://www.openinterconnect.org

<sup>2</sup> https://allseenalliance.org

Google Glass provides fast access to information by speaking commands to the microphone built into the smart eyewear device. This year, Google acquired smart thermostat company Nest for \$3.2 billion and WiFi-enabled security camera company Dropcam for \$555 million.

In 2012 GE unveiled an "industrial Internet" campaign. GE's industrial-Internet pitch was structured around the huge economic gains that improvement in efficiency might bring to a number of industries if they used more analytics software [15].

Cisco has made several projections for future trends and applications of IoT [1] [2]. They project that the Internet of Everything (IoE) would create \$14.4 trillion in Value at Stake, which combines the increased revenues and lower costs that are created or will migrate among companies and industries, from 2013 to 2022 [16].

### 1.2.2 Connectivity for IoT

A major research and technological challenge lies in the scalable connectivity requirements in the context of the Internet of Everything. In a world where objects are "smart", connected to the Internet, make local decisions, and provide an unimaginable amount of sensory data to the cloud, the wireless connection is of paramount importance. The infrastructure for the IoE sector will be mostly wireless and designed to be scalable. That is the only way to scale to trillions of sensors. Can our current wireless standards address the need to connect trillions of objects to the cloud? Do we want to include a WiFi or Bluetooth radio on every light bulb? For many everyday objects, current wireless solutions are simply not applicable once we look at the scaling requirements, energy efficiency, complexity, and most importantly cost. Cisco has indicated sensor energy to be one of the top three challenges and barriers for deployment of IoT devices (together with deployment of IPv6 and standards) [2].

Wearable devices are another example of an emerging area in IoT. In this space power is a big problem. For example, Google Glass currently provides a few hours of use before it needs to be recharged, and the biggest power draw is usually the wireless chip that lets these devices communicate. It is therefore essential to investigate scalable and sustainable wireless solutions that can work in this space. Today, BLE (Bluetooth Low Energy) is the main technology addressing this space, but for many IoT applications the efficiency of BLE is not sufficient in terms of cost and energy.

## 1.3 Organization

Chapter 2 will focus on system-level design for a highly-miniaturized wireless radio that would enable many of the IoT applications, as well as connectivity for a variety of sensor needs in home networking, gaming, entertainment, and applications that would benefit from aggressive sensor miniaturization. The target radio is passive, grain-sized, and disposable due to the low fabrication cost. We make the case for moving to mm-wave frequencies as a necessary step for scaling the size of the proposed passive radio. Benefits of this migration will be outlined, and various tradeoffs examined. As argued in this chapter, future wireless portal devices (e.g. smart phones, tablets, and some of the wearable devices) will already have energy-efficient mm-wave technology integrated, and this proposal leverages that trend to transform the mobile device into the access point of the IoT.

Chapter 3 provides the detailed design of the passive radio "tag". Various design tradeoffs are examined, and details of energy recovery, frequency selection, circuit architecture, and block diagrams are discussed. In this chapter the design of actual circuits in 65nm CMOS is also presented.

Chapter 4 covers the measurement of the passive radio.

Chapter 5 focuses on the design of an energy-efficient 60GHz phased-array receiver. The need for pushing energy-efficiency in mm-wave phased-arrays, in both IoT applications as well as in providing Gbps links to address capacity needs, was previously discussed. The architecture choice as well as the specifics of the circuit design will be presented.

Chapter 6 provides a detailed design of a passive mm-wave phase shifter architecture that could be used for scaling the mm-wave array to larger number of elements.

Finally, Chapter 7 concludes this dissertation, and provides a summary of presented designs as well as a list of possible future directions.

# Chapter 2 mm-Wave Miniaturized Radio

Based on the discussions in chapter 1, the design of an energy efficient, passive, miniaturized, and low-cost wireless node that could provide scaling to the IoT applications will be a major challenge in terms of technology adoption. In that context, a major research goal is to overcome fundamental challenges related to the miniaturization of smart electronic sensors, and to develop disruptive technologies to scale dimensions down to the millimeter regime while providing the capability of highly asymmetric communication and also multi-access to extremely dense networks. While great progress has been made in the design of extremely complex digital systems for a variety of sensing, communication, and computation functions, relatively little has been done to reduce the size and cost of intelligent passive sensor radios. Miniaturization often leads to major disruption in terms of sensor function, and more specifically, the operating range [17] [18] [19] [20].

This chapter outlines the design philosophy and framework for a true single-chip radio that meets the range and functionality goals. This radio will be designed on a single silicon chip that contains all major functional blocks including the antennas, energy storage, computation, and communication. No external connections will be required, and hence in electronic terms the chip is designed to be pad-less (no wire connections coming into or out of the chip). To enable this, wireless power delivery to a small sensor node is a crucial first step. Figure 2.1 shows the general constraints for the proposed miniaturized radio.

## 2.1 System-Level Observations for Wireless Connectivity

The IoT space is an application area unlike anything from the past. New questions emerge on a regular basis and cover topics from reliability, coverage, and interoperability all the way to security (and of course the privacy component). Even with this level of complexity, several observations can be made regarding the underlying wireless connectivity needs in this application:



Figure 2.1: General constraints of the miniaturized radio

#### Massive Scale

Cellular and wireless technology standards have so far addressed the needs for connecting people, and have scaled to millions and billions of subscribers. That has been no small feat and one of the major technological advances of recent times. Nevertheless, connecting the toaster and the mug to the web is a whole new story, both in terms of scale as well as the density of the network. Current design methods do not scale to the trillion-radio scenario. In a world in which data is primarily generated and processed by machines, autonomously or with minimal supervision, we would have to rethink the wireless system architecture in a fundamental way.

#### Asymmetric Links for Hierarchical Radios

Multiple solutions exist, but based on past experiences, it is imaginable that to provide scaling we will need to design asymmetric links. In other words, the data from these trillion radios passes through central access points (e.g., cellular phones, personal computing devices, WiFi access points, base stations, wall plugs, etc.). These modules do not have the same power, size, and cost constraints. This asymmetry has some interesting system consequences that will be discussed later.

#### Cost and Complexity Constraints are Critical

As with the previous point, realizing that we have a new scale in number of radios and acknowledging the need to design asymmetry in the link leads to designs that push complexity, cost, and system size to central units rather than the sensors. The radio has to be designed for minimum footprint, cost, and overhead to gain traction in this area. This requires new design techniques to achieve synchronization, multi-access, and high data rates. For example, individual crystal references and large optimal antennas may be a luxury that cannot be afforded in the trillion-radio world. Antennas, the PCB and its components, and various interfaces including batteries and supply regulation often dominate cost. Therefore, we propose a single-chip solution without any external components, including antenna or battery.

#### Functional Requirements

In addition to the minimum wireless capability, there are additional functional requirements for the IoT radio. At the minimum, the radio will have a bi-directional wireless link and a unique IP address. The basic units will have several bits to turn devices ON or OFF and to provide feedback on operation. Additional features may include various forms of sensing (e.g., temperature, humidity), position-awareness to provide localization with varying level of precision depending on application, and actuation. To achieve these goals, a smart radio with a minimum level of computation and communication is required. Many of today's passive transponders, including most RFID systems, will not meet the functional requirements in this respect.

#### Reliability and Lifetime

IoT applications work at a scale that is unprecedented in terms of current wireless standards. The number of devices is projected to reach 50 billion within a few years, and to scale quickly to trillions in the next decade. In addition to cost, this has important implications in terms of reliability and lifetime. We propose fully passive systems that operate from wirelessly recovered RF energy. This provides a virtually unlimited lifetime for the radio.

#### Localization

Many of the IoT applications share the common requirement to provide localization-aware radio links. Precise geographic location of a thing, and perhaps the dimension and movement trajectory will be critical. Automation, navigation, sensing, and automated actuation are all examples in this area. The time-space context of a decision is vital once the human supervision is completely removed. Future standards will have to address this geospatial component of IoT.

## 2.2 Present State of Knowledge in the Field

Miniaturization of passive sensor nodes (i.e. no explicit local power source) or RFID systems has received considerable attention in the past decade or more. Unfortunately, most of the attention has been paid to actual packaging and mechanical scaling rather than the fundamental physical limitations in size reduction. To date, none of the proposed solutions can scale down to millimeter dimensions while maintaining a "range-to-size ratio" that gets even close to 100. There is also no clear path for this scaling.

One of the more comprehensive examples is a recent effort from UC Berkeley, called Picocube, with an initial goal of integrating several functional sensor blocks into a sub-cm single package that operates from harvested energy of $<6\,\mu\mathrm{W}$ [18] [21]. In this design, the final system grew in size (occupying 3.8 cm x 2.5 cm x 0.85 cm) and operates from a battery, which greatly reduces lifetime/reliability and renders it unable to address a variety of applications in the passive sensor domain. The key problem is that a 1.8 GHz TX frequency corresponds to a 17 cm wavelength, which is a huge mismatch to the antenna dimension. Therefore, although much effort has been put into optimizing the efficiency of the transmitter circuits, the overall system is far from optimal. A cm-sized antenna cannot, by physical laws, couple energy efficiently to 17 cm wavelength signals. The group tried adding high-permittivity material to get a closer match in wavelength but, as predicted by theoretical analysis, this resulted in smaller bandwidth and lower efficiency due to impedance/wavelength mismatch to air.

A more recent sensor design example uses micro-machined and advanced 3D packaging technologies to reduce the total size to a few millimeters [22] [23]. Power delivery is through a solar harvester as well as a battery, which makes it less applicable to long-term and autonomous readout. Again, this approach only concentrates on reducing the package size and fails to address the aperture coupling discrepancy and, as a result, is not effective at communicating over reasonable distances.

In 2003, Hitachi released the mu-chip, which at $0.4\,\text{mm}^2$ was the smallest RFID designed to date [17]. The chip operates at 2.4 GHz and has an embedded antenna. However, with a 12 cm wavelength interacting with the 0.4 mm aperture, the chip was extremely inefficient and obtained a communication range of only 1.2 mm. To extend this range to 30 cm, Hitachi proposes an external resonant antenna of 6 cm dimensions. Again, the range/dimension ratio of this passive sensor is roughly 5 and there is no clear path to scale the size below 6 cm.
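The wavelength and range-to-size figures quoted in this section follow from simple arithmetic. The sketch below recomputes them from the numbers stated in the text; the helper names are hypothetical and the free-space wavelength $\lambda = c/f$ is assumed.

```python
# Recompute the wavelength and range-to-size figures quoted in the text.
# Helper names are illustrative; free-space propagation is assumed.

C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_hz: float) -> float:
    """Free-space wavelength in meters."""
    return C / freq_hz


def range_to_size_ratio(range_m: float, size_m: float) -> float:
    """Communication range divided by the largest node dimension."""
    return range_m / size_m


# Carrier frequencies quoted for the two prior designs.
for name, freq_hz in (("Picocube TX, 1.8 GHz", 1.8e9), ("mu-chip, 2.4 GHz", 2.4e9)):
    print(f"{name}: wavelength = {wavelength_m(freq_hz) * 100:.1f} cm")

# mu-chip with its embedded antenna (1.2 mm range, 0.4 mm aperture)
# and with the external resonant antenna (30 cm range, 6 cm antenna).
print("embedded antenna ratio:", round(range_to_size_ratio(1.2e-3, 0.4e-3), 2))
print("external antenna ratio:", round(range_to_size_ratio(0.30, 0.06), 2))
```

Both ratios fall far short of the range-to-size ratio of 100 that this section uses as a benchmark.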

## 2.3 Design of a mm-Sized Radio

The fundamental issue associated with designing a passive mm-sized wireless radio is the power delivery to the sensor and communication with it. The main challenge is the large mismatch between wavelength and the available aperture of the node. This limits the energy transfer efficiency for small radios and renders the system incapable of higher-level functionality. Figure 2.2 shows the concept.



Figure 2.2: Mismatch between wavelength and the available aperture size.

### 2.3.1 Friis Propagation Law

Choice of wavelength affects propagation properties in a wireless power or data link. From a first look, Friis's equation may seem to suggest that higher frequencies incur larger path loss, and are therefore unsuitable for communication with or power delivery to passive sensor nodes. We will start with examining the first-order Friis equation

$$P_R = \frac{P_T}{4\pi R^2} \frac{G_T G_R \lambda^2}{4\pi} = P_T G_T G_R \left(\frac{\lambda}{4\pi R}\right)^2 \tag{2.1}$$

where $P_R$ is the received power, $P_T$ is the transmitter power, $G_T$ is the transmitter gain, $G_R$ is the receiver gain, $\lambda$ is the wavelength, and $R$ is the distance between the transmitter and the receiver.

This is the most widely used form of the Friis equation. It shows that for fixed antenna gain on the transmitter and the receiver, using lower frequencies, or equivalently a longer wavelength, will lead to a lower path loss. One caveat with this form of the equation is that it assumes a fixed antenna gain, and therefore, larger antennas for lower frequencies. However, in many of the practical applications, the physical area of the transmitter and/or receiver is limited. For example, in mobile devices, the available real-estate is limited to the size of the handheld device. More severely, in sensors and IoT applications, many of the radios would have to work with a limited footprint. In order to address this issue this formula can be revised to take into account the limited area of the transmitter/receiver antennas. In doing that we will start with the following relationship between gain and aperture of an antenna [24]
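To illustrate the fixed-gain behavior of Eq. (2.1), the sketch below evaluates it at two carrier frequencies. The link parameters (1 W transmit power, unity-gain antennas, 1 m separation) are assumptions chosen only to expose the frequency dependence.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def friis_received_power_w(p_t_w, g_t, g_r, freq_hz, distance_m):
    """Eq. (2.1): received power for fixed (linear) antenna gains."""
    wavelength_m = C / freq_hz
    return p_t_w * g_t * g_r * (wavelength_m / (4 * math.pi * distance_m)) ** 2


# Assumed link: 1 W transmitter, unity-gain antennas, 1 m separation.
p_low = friis_received_power_w(1.0, 1.0, 1.0, 2.4e9, 1.0)
p_high = friis_received_power_w(1.0, 1.0, 1.0, 60e9, 1.0)

# With gains held fixed, received power scales as 1/f^2.
extra_loss_db = 10 * math.log10(p_low / p_high)
print(f"Extra path loss at 60 GHz relative to 2.4 GHz: {extra_loss_db:.1f} dB")
```

With gains held fixed, moving from 2.4 GHz to 60 GHz costs a factor of $(60/2.4)^2 = 625$ in received power, or about 28 dB.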

$$A = G \frac{\lambda^2}{4\pi} \tag{2.2}$$

where $A$ is the aperture of the antenna. This relationship assumes that the antenna size is on the order of or larger than the wavelength, in which case the aperture is proportional to the actual physical cross-sectional area (Fig. 2.3(a)). For example, the transmitter or receiver could be either large antennas or antenna arrays. The case of small antennas will be addressed in the next section (Fig. 2.3(b)).

Figure 2.3: Cross-sectional area of TX and RX antennas (a) TX and RX are both large antennas (b) TX is large antenna but RX is a small antenna

Under these assumptions, the received power can be described as

$$P_R = \frac{P_T A_T}{R^2 \lambda^2} A_R \tag{2.3}$$

This suggests that, with a limitation imposed on the physical size of the antennas (or arrays) on either side, resorting to higher frequencies is beneficial. It does not, however, include the effect of atmospheric absorption or actual circuit losses, which eventually limit the frequency of operation. The conceptual tradeoff is shown in Fig. 2.4. We will show that the optimal frequency range falls in the microwave/mm-wave regime.
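As a numerical check, the sketch below substitutes the gains from Eq. (2.2) into Eq. (2.1) and compares the result with Eq. (2.3). The aperture sizes, distance, and frequencies are assumed values, chosen so that both antennas are at least a wavelength across; atmospheric absorption and circuit losses are ignored, as in the equations themselves.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def gain_from_aperture(aperture_m2, freq_hz):
    """Eq. (2.2) solved for gain: G = 4*pi*A / lambda^2."""
    wavelength_m = C / freq_hz
    return 4 * math.pi * aperture_m2 / wavelength_m ** 2


def received_power_fixed_gain_w(p_t_w, g_t, g_r, freq_hz, distance_m):
    """Eq. (2.1): Friis equation with fixed antenna gains."""
    wavelength_m = C / freq_hz
    return p_t_w * g_t * g_r * (wavelength_m / (4 * math.pi * distance_m)) ** 2


def received_power_fixed_aperture_w(p_t_w, a_t_m2, a_r_m2, freq_hz, distance_m):
    """Eq. (2.3): Friis equation with fixed antenna apertures."""
    wavelength_m = C / freq_hz
    return p_t_w * a_t_m2 * a_r_m2 / (distance_m ** 2 * wavelength_m ** 2)


# Assumed values for illustration only.
P_T_W = 1.0        # transmit power
A_T_M2 = 25e-4     # 5 cm x 5 cm transmit aperture
A_R_M2 = 1e-4      # 1 cm x 1 cm receive aperture
DISTANCE_M = 2.0

results = {}
for freq_hz in (30e9, 60e9):
    g_t = gain_from_aperture(A_T_M2, freq_hz)
    g_r = gain_from_aperture(A_R_M2, freq_hz)
    via_gain = received_power_fixed_gain_w(P_T_W, g_t, g_r, freq_hz, DISTANCE_M)
    via_aperture = received_power_fixed_aperture_w(P_T_W, A_T_M2, A_R_M2, freq_hz, DISTANCE_M)
    # Both routes must agree, since Eq. (2.3) is Eq. (2.1) with Eq. (2.2) substituted.
    assert math.isclose(via_gain, via_aperture, rel_tol=1e-9)
    results[freq_hz] = via_aperture
    print(f"{freq_hz / 1e9:.0f} GHz: P_R = {via_aperture * 1e3:.3f} mW")
```

With the apertures held fixed, doubling the frequency quadruples the received power, the opposite of the fixed-gain trend.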

### 2.3.2 Effect of Wavelength Match in Communicating with Miniaturized Sensors

A major goal is to provide the ability to power up and communicate with aggressively scaled wireless sensor nodes. For energy to couple efficiently to the small sensor, its wavelength, regardless of the nature of the incoming wave (EM in form of RF or optical, or mechanical), must match the dimensions of the sensor aperture. The physical basis of this requirement is widely documented and we refer to it as aperture matching.

For example, in the RF/microwave regime, the EM wave cannot couple efficiently to a "small" antenna [25] [26] [27] [24] [28] [29] [30] [31]. A small antenna is defined as an antenna



Figure 2.4: Tradeoff between power recovery efficiency and available received power

that occupies a small fraction of the radian sphere in space. The radian sphere is the spherical region whose radius equals the wavelength divided by  $2\pi$ . This space is mainly occupied by stored electric and magnetic energy. It can be shown that a reduction in size inevitably comes at the expense of bandwidth and/or efficiency. In a practical circuit, this inefficiency is due, among other reasons, to the highly reactive impedance and extremely low radiation resistance of the antenna (i.e. a high quality factor) [24]. For example, the radiation resistance of a small dipole is  $80\pi^2(l/\lambda)^2$ , where l is the dipole length and  $\lambda$  is the wavelength. For a 1mm antenna interacting with a 5cm wavelength, this results in a radiation resistance of 0.3  $\Omega$  and a quality factor in excess of 4,000. This leads to 1) low radiation efficiency (due to intrinsic losses in the antenna itself), and 2) very poor power transfer efficiency in the presence of matching losses.
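The numbers quoted for the 1mm dipole can be reproduced with a short calculation. This is a sketch under stated assumptions: the quality factor is estimated with the Chu-McLean lower bound for a single-mode small antenna, and the enclosing-sphere radius is assumed to be half the dipole length (0.5mm). The function names are hypothetical.

```python
import math


def dipole_radiation_resistance(length_m, wavelength_m):
    """Radiation resistance of a small dipole: 80 pi^2 (l / lambda)^2."""
    return 80 * math.pi ** 2 * (length_m / wavelength_m) ** 2


def chu_q_lower_bound(radius_m, wavelength_m):
    """Lower bound on Q for a small single-mode antenna: 1/(ka)^3 + 1/(ka)."""
    ka = 2 * math.pi * radius_m / wavelength_m
    return 1 / ka ** 3 + 1 / ka


# 1 mm dipole at a 5 cm wavelength; sphere radius = l/2 is an assumption
r_rad = dipole_radiation_resistance(1e-3, 50e-3)
q_min = chu_q_lower_bound(0.5e-3, 50e-3)
print(f"R_rad = {r_rad:.2f} ohm, Q >= {q_min:.0f}")
```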

Here, we aim to provide a quantitative framework for these losses, and to demonstrate fundamental challenges in delivering RF power to a highly miniaturized radio. We approach the problem from first principles.

The Friis equation can be modified to take into account matching and radiation losses. The following basic assumptions are used for this first-order analysis:

1. The receiver is working in the small antenna regime (e.g.,  $L < \frac{\lambda}{5}$ ) (Fig. 2.3(b)). This is a reasonable assumption for millimeter-sized sensors that work with frequencies up to mm-wave.

2. Component quality factors are compatible with integrated circuits (e.g., Q < 20, @ GHz).

3. The required bandwidth is smaller than the available bandwidth of the antenna. This assumption is justifiable for power delivery links that are narrowband and do not pass information.

4. The TX aperture is fixed, but works in the  $L_{TX} > \lambda$  regime (i.e. beamforming is possible).

5. Operation conditions are such that the far-field approximations are valid. This assumption holds for many of the IoT applications.

6. Radiation efficiency due to conductor losses in the antenna will be modeled for a small loop antenna with a copper conductor.

For small antennas, assuming perfect matching and efficiency conditions, the effective aperture is equal to 3/2 times the area of a circle with radius  $r = \lambda/2\pi$  [24] [28]. In other words, the directivity of a small antenna is fixed at 3/2. Although the radiation pattern and directivity remain the same as the size of a small antenna varies, the radiation resistance is affected (e.g. in proportion to the square of the size-to-wavelength ratio in a small electric dipole, as discussed above). In a practical circuit with actual lumped components that have finite quality factors, this has a substantial effect on overall efficiency since other circuit resistances do not scale the same way. With these working assumptions, the power available to the receiver electronics for power recovery can be calculated as

$$P_{R} = \frac{P_{T}}{4\pi R^{2}} \frac{G_{T}G_{R}\lambda^{2}}{4\pi} = \frac{P_{T}G_{T}}{4\pi R^{2}} A_{eff,R} \tag{2.4}$$

The expression on the right uses the effective receiver aperture  $(A_{eff,R})$  instead of gain. To calculate the effective aperture, here, only matching losses and core antenna efficiency (limited for example by antenna conductor losses) are taken into account. This will therefore serve as an upper bound in the achievable efficiency.

$$A_{eff,R} = \frac{3}{2} \frac{\lambda^2}{4\pi} \underbrace{\left[\frac{R_{rad}}{R_{rad} + R_{loss}}\right]}_{\eta_a} \underbrace{\left[\frac{Q_C^2 + 1}{Q_C(Q_C + Q_A)}\right]}_{\eta_m} \tag{2.5}$$

Here,  $\eta_a$  and  $\eta_m$  are the antenna and matching efficiencies, and  $Q_C$  and  $Q_A$  are the component and antenna quality factors, respectively. For matching losses a first order bound is used (similar to [32] and [33]). The effects of  $\eta_a$  and  $\eta_m$  will be discussed separately.

The quality factor (Q) of a small antenna scales inversely with the normalized volume to wavelength. In other words, smaller antennas have smaller radiation resistance and become more reactive, therefore presenting a higher Q factor. While the radiation resistance scales down, other loss resistances in the circuit do not scale the same way, and therefore the overall efficiency of the circuit drops. It can be shown that the radiation resistance of a small dipole scales with  $80\pi^2(l/\lambda)^2$  and for a small loop this will be proportional to the fourth power of the circumference [24]. Chu and McLean have derived the fundamental relationship between antenna size and Q [30] [27]. It can be shown that the lower bound of the Q of a small single-mode antenna can be calculated as

$$Q_A \ge \frac{1}{k^3 a^3} + \frac{1}{ka} \tag{2.6}$$

where k is the wavenumber  $(2\pi/\lambda)$  and a is the radius of the smallest sphere that contains

the antenna. To quantify the effect of antenna radiation efficiency  $(\eta_a)$ , mainly limited by the conductive losses, a small loop antenna is taken as an example. The radiation resistance of a small loop (diameter  $< 0.1\lambda$ ) is given by

$$R_{rad} = 320\pi^2 \left(\frac{\pi a}{\lambda}\right)^4 \tag{2.7}$$

The antenna series loss resistance, limited by skin effect, can be calculated as

$$R_{loss} = \sqrt{\frac{f\mu_{\circ}}{\pi\sigma}} \left(\frac{\pi a}{a_{wire}}\right) \tag{2.8}$$

where  $a_{wire}$  is the equivalent radius of the wire for the loop antenna,  $\sigma$  is the DC conductivity of the wire, and  $\mu_{\circ}$  is the free space permeability. Under these conditions, and assuming a fixed wire diameter and small loop size where the conductive loss dominates in the  $\eta_a$  term, the radiation efficiency ( $\eta_a$ ) would be proportional to the volume divided by  $\lambda^{3.5}$ .

$$\eta_a \propto \frac{a^3}{\lambda^{3.5}} \tag{2.9}$$
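The scaling in Equation (2.9) can be verified directly from Equations (2.7) and (2.8). The sketch below assumes a copper conductivity of 5.8e7 S/m (a typical handbook value, not stated in the text) and uses a=1mm and a wire radius of 1.8µm as example values; the function names are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi      # free-space permeability, H/m
C0 = 299_792_458.0        # speed of light, m/s
SIGMA_CU = 5.8e7          # assumed copper conductivity, S/m


def loop_r_rad(a, freq_hz):
    """Equation (2.7): radiation resistance of a small loop of radius a."""
    lam = C0 / freq_hz
    return 320 * math.pi ** 2 * (math.pi * a / lam) ** 4


def loop_r_loss(a, a_wire, freq_hz, sigma=SIGMA_CU):
    """Equation (2.8): skin-effect series loss resistance of the loop."""
    return math.sqrt(freq_hz * MU0 / (math.pi * sigma)) * (math.pi * a / a_wire)


def radiation_efficiency(a, a_wire, freq_hz):
    """eta_a = R_rad / (R_rad + R_loss)."""
    r_rad = loop_r_rad(a, freq_hz)
    return r_rad / (r_rad + loop_r_loss(a, a_wire, freq_hz))


# Doubling the frequency halves lambda; in the loss-dominated regime the
# efficiency should then grow by about 2**3.5 (roughly 11.3).
eta_1g = radiation_efficiency(1e-3, 1.8e-6, 1e9)
eta_2g = radiation_efficiency(1e-3, 1.8e-6, 2e9)
print(f"eta(2 GHz) / eta(1 GHz) = {eta_2g / eta_1g:.2f}")
```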

For matching losses, we have used first order loss approximations [32] and [33]. Matching a component with higher Q using finite-Q components incurs extra losses.

Summarizing all the loss and efficiency terms for the assumptions described above, the received power on a millimeter-sized radio can be calculated from Equation (2.4) and Equation (2.5). This is an approximation since the interactions of the two terms are not taken into account (e.g. antenna losses lower effective Q for matching). Figure 2.5 shows the plot of received power in terms of frequency and range for a fixed aperture on the transmitter. In calculating the received power in terms of frequency, the following parameters have been used: a=1mm (radius of sphere containing the antenna),  $a_{wire}=1.8\mu$ m, R(range)=1m,  $A_T=10$ cm by 5cm, and  $P_T=36$ mW (which leads to EIRP<40dBm for highest frequency in the range). The small antenna assumptions approximately hold for all the frequency range plotted in this figure. The plot demonstrates that getting closer to an aperture match situation, where antenna size matches the operation wavelength, is beneficial for power recovery. For the second graph, where the variable is range, we have assumed a component quality factor  $Q_C=15$ . These two plots demonstrate efficiency improvement with moving to higher frequencies.
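Equations (2.4) through (2.8) can be chained numerically as a rough check of the frequency trend. The sketch below is not the script used to generate Figure 2.5. It assumes a copper conductivity of 5.8e7 S/m, uses the same value of a for the loop radius and for the enclosing-sphere radius, takes $Q_A$ at the lower bound of Equation (2.6), and derives the transmitter gain from its aperture using Equation (2.2). The function name is hypothetical.

```python
import math

MU0 = 4e-7 * math.pi      # free-space permeability, H/m
C0 = 299_792_458.0        # speed of light, m/s
SIGMA_CU = 5.8e7          # assumed copper conductivity, S/m


def received_power(freq_hz, p_t=36e-3, a_t=0.10 * 0.05, dist=1.0,
                   a=1e-3, a_wire=1.8e-6, q_c=15.0, sigma=SIGMA_CU):
    """Received power (W) on a small loop, from Equations (2.4) and (2.5)."""
    lam = C0 / freq_hz
    g_t = 4 * math.pi * a_t / lam ** 2                 # Eq. (2.2), TX gain
    ka = 2 * math.pi * a / lam
    q_a = 1 / ka ** 3 + 1 / ka                         # Eq. (2.6) at equality
    r_rad = 320 * math.pi ** 2 * (math.pi * a / lam) ** 4          # Eq. (2.7)
    r_loss = (math.sqrt(freq_hz * MU0 / (math.pi * sigma))
              * (math.pi * a / a_wire))                            # Eq. (2.8)
    eta_a = r_rad / (r_rad + r_loss)                   # antenna efficiency
    eta_m = (q_c ** 2 + 1) / (q_c * (q_c + q_a))       # matching efficiency
    a_eff = 1.5 * lam ** 2 / (4 * math.pi) * eta_a * eta_m         # Eq. (2.5)
    return p_t * g_t / (4 * math.pi * dist ** 2) * a_eff           # Eq. (2.4)


for f_ghz in (5, 10, 20):
    p_r = received_power(f_ghz * 1e9)
    print(f"{f_ghz:>2} GHz: {10 * math.log10(p_r / 1e-3):.1f} dBm")
```

Under these assumptions the received power rises monotonically over the 5GHz to 20GHz span, consistent with the trend described for Figure 2.5.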

Figure 2.6 demonstrates the same effect but this time with a fixed transmitter EIRP (instead of  $A_T$ ). In this case the EIRP is set to 10W or 40dBm while keeping all the other parameters for size, range, and wire diameter. The received power is also plotted as a function of range. The same trends are observable. Higher frequencies will be beneficial, since matching and radiation efficiencies improve drastically, all from an improvement in the radiation resistance.



Figure 2.5: The received power in terms of frequency and range for a fixed aperture size on the transmitter.



Figure 2.6: The received power in terms of frequency and range for a fixed transmitter's EIRP.

For higher frequencies than what is plotted here, several other effects take place. Most importantly, the small antenna assumption breaks down and the radiation resistance no longer scales with the same cubic term (as the antenna approaches its "resonance" size). Once the frequency is high enough so that a resonant antenna with a reasonable resistance is obtained, the trend starts to reverse. This is due to the insertion loss when matching a resistive source to the input impedance of the chip, which has a large capacitive susceptance.

To show this effect, the circuit in Fig. 2.7 will be examined. Here, the chip input impedance



Figure 2.7: First-order model for the chip, antenna, and L-matching network.

consists of a parallel RC network. This is the impedance of the AC-DC converter seen at the input terminal. To simplify the analysis, an L-match network is utilized. The antenna impedance is assumed to be purely real, reflecting the fact that we are now operating in the antenna resonance regime. The real part of the secondary side of the matching network is determined by both the chip input shunt resistance  $(R_H)$  and the loss from the shunt inductor in the L network  $(R_p)$ . This shunt resistance is given by

$$R_{tot} = R_H ||R_p = R_H ||Q_L \omega L_x \tag{2.10}$$

where  $Q_L$  is the inductor Q. Under these assumptions, the shunt susceptance  $(B_{tot})$  on the secondary can be calculated as

$$B_{tot} = \omega C_H - \frac{1}{\omega L_x} = \frac{-1}{\omega L_{tot}} \tag{2.11}$$

After the shunt-series transformation, the equivalent series resistance of this network has to be equal to  $R_s$ . We therefore have the following relationship

$$R_s = \frac{R_{tot}}{1 + \left(\frac{R_{tot}}{\omega L_{tot}}\right)^2} \tag{2.12}$$

Finally, the input voltage to the circuit, which determines the AC-DC converter efficiency, can be calculated as

$$V_H = \frac{V_s}{2}\sqrt{1+Q^2} \tag{2.13}$$

where

$$V_s = \sqrt{8P_{av,s}R_s} \tag{2.14}$$



Figure 2.8: Input voltage of multiplier as a function of frequency for different  $R_s$  values

and

$$Q = \frac{R_{tot}}{\omega L_{tot}} \tag{2.15}$$

Figure 2.8 plots  $V_H$  as a function of frequency for different  $R_s$  values. Here,  $P_{av,s}$  is assumed to be -15dBm, the inductor quality factor  $(Q_L)$  is 15, the capacitor is assumed to have  $Q \gg Q_L$ ,  $R_H=1.5\,\mathrm{k}\Omega$ , and  $C_H=40\,\mathrm{fF}$ . As can be seen here, for a purely real antenna impedance, the output voltage drops with frequency. The same trend holds for all resistor values, while the absolute voltage decreases as the antenna resistance is reduced (which again emphasizes the significance of avoiding the small antenna regime). This, together with Fig. 2.5 and Fig. 2.6, demonstrates the existence of an optimal frequency for a given antenna size and set of circuit parameters, as predicted in the conceptual Fig. 2.4.
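Equations (2.10) to (2.15) can be evaluated numerically by solving for the shunt inductor that makes the series-equivalent resistance equal to $R_s$. The following sketch is one possible way to do this, not the author's script: it works with admittances, finds the inductor susceptance by bisection, and uses the parameter values quoted above as defaults. The function name is hypothetical.

```python
import math


def l_match_voltage(freq_hz, r_s, r_h=1.5e3, c_h=40e-15, q_l=15.0,
                    p_av_dbm=-15.0):
    """Chip input voltage V_H (peak, volts) for the L-match of Fig. 2.7."""
    w = 2 * math.pi * freq_hz
    b_c = w * c_h                          # susceptance of C_H

    def series_equivalent(b_l):
        # b_l = 1/(w L_x). Eq. (2.10) as a conductance: 1/R_H + 1/(Q_L w L_x)
        g_tot = 1 / r_h + b_l / q_l
        x = b_l - b_c                      # 1/(w L_tot), from Eq. (2.11)
        return g_tot / (g_tot ** 2 + x ** 2), g_tot, x   # Eq. (2.12)

    lo, hi = b_c, b_c + 1 / r_s            # series resistance falls below r_s at hi
    if series_equivalent(lo)[0] <= r_s:
        raise ValueError("R_tot at resonance is not larger than R_s")
    for _ in range(200):                   # bisection on the inductor susceptance
        mid = 0.5 * (lo + hi)
        if series_equivalent(mid)[0] > r_s:
            lo = mid
        else:
            hi = mid
    _, g_tot, x = series_equivalent(0.5 * (lo + hi))
    q = x / g_tot                          # Eq. (2.15)
    p_av = 1e-3 * 10 ** (p_av_dbm / 10)
    v_s = math.sqrt(8 * p_av * r_s)        # Eq. (2.14)
    return 0.5 * v_s * math.sqrt(1 + q ** 2)   # Eq. (2.13)


for f_ghz in (10, 24, 60):
    print(f"{f_ghz} GHz, R_s = 30 ohm: V_H = {l_match_voltage(f_ghz * 1e9, 30.0):.3f} V")
```

The bisection relies on the series-equivalent resistance decreasing monotonically as the inductor susceptance increases above $\omega C_H$, which holds for this network.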

To include the first-order effect of the size of the antenna in Equation (2.12), we can include the frequency dependency of antenna resistance for a small loop antenna of a=1mm (Equation (2.7)). This leads to the blue traces in Figure 2.9 which show  $V_H$  as a function of frequency. This assumption of course only holds in the small antenna regime, and close to resonance the impedance will no longer scale with the same rate. If a constant radiation resistance ( $R_S=30\Omega$ ) is assumed, the red traces in the figure will be obtained. Above the intersection point, the red trace will take over and show the resulting reduction in voltage.


Figure 2.9: Input voltage of multiplier as a function of frequency, for frequency dependent  $R_s$  value (blue trace) and constant  $R_s$  value (red trace), for different inductor quality factors.

Using larger antenna impedance values does not change the general trend and conclusions from this analysis. Figure 2.10 shows the same concept for different chip input capacitance values. Increasing the input capacitance reduces the achievable voltage.

It should be emphasized that Figures 2.8 and 2.9 only take matching losses from a real input impedance to a chip load into account and do not consider the other loss terms described in the calculations related to Fig. 2.5 and Fig. 2.6.

### 2.3.3 Dielectric Loading of Antennas

Changing the dielectric material that surrounds the antenna would affect the wave velocity and hence the resonant frequency of the antenna. It may then be conceivable that by heavily loading the antenna with a dielectric material, it would be possible to achieve miniaturization. This, however, has negative loss consequences and will not lead to efficiency improvements. Essentially, the high quality factor of small antennas implies a large reactive power component surrounding the antenna, and with the addition of this loading, the dielectric losses in the medium considerably degrade efficiency. This can be described by the following first order efficiency term [34]



Figure 2.10: Input voltage of multiplier as a function of frequency for frequency dependent  $R_s$  value (blue trace) and constant  $R_s$  value (red trace), for different chip input capacitance

$$\mathrm{Eff} \propto \frac{1}{1 + Q\tan(\delta)} \tag{2.16}$$

where Q is the antenna quality factor and  $\tan(\delta)$  is the loss tangent of the dielectric. Also, Wheeler's original paper [25] showed that increasing the permittivity  $\epsilon$  inside the antenna increases Q roughly in proportion to  $\epsilon$ , but increasing the permeability  $\mu$  can decrease Q by up to a factor of 3. Therefore, dielectric loading of the antenna will not help with the overall efficiency or efficiency-bandwidth product.

### 2.3.4 Choice of Downlink Frequency

Based on the analysis provided in previous sections, it is clear that operating the radio at a frequency where the wavelength matches the dimensions of the antenna or aperture has significant benefits for the overall recovered power. Given the goal of implementing a radio in a millimeter-sized footprint, our operation frequency will end up in the millimeter-wave regime. On the other hand, as previously pointed out, higher frequencies impose higher atmospheric losses as well as lower circuit efficiency. For example, the power recovery circuit becomes less efficient at higher frequencies, mainly from the loss coming from matching the input capacitance of the chip. For these reasons, an optimal frequency exists, and this is a function of several factors including 1) radio dimensions, 2) quality factor of passives, and 3) transistor speeds and process technology. Practical limitations arising from FCC regulations will of course also play a role. Atmospheric absorption bands (e.g., associated with Oxygen absorption) will not be a limiting factor at short distance (e.g. 60GHz Oxygen absorption results in 10dB/km of extra losses [35] [36] and this will not affect budget for several meters).

For our specific case of limiting the size of antenna to a few millimeters, and working with a 65nm digital CMOS process, we chose 24GHz to balance various tradeoffs presented in this section. The link budget, using our assumptions and 24GHz as the carrier, will be calculated in the following sections.

### 2.3.5 Uplink Communications and Bandwidth Requirements

After powering up the passive radio, an uplink data path strategy is needed to enable communication from sensors to the reader. Conventional RFIDs communicate back using backscatter modulation [37]. However, backscatter communication poses several major limitations: 1) it is inefficient due to relying on a two-way path loss ($1/r^4$ rather than $1/r^2$ losses), 2) it limits the number of simultaneously readable tags due to significant signal collision [37], and 3) it does not enable any form of ranging and localization on the radio due to the low bandwidth dictated by receiver constraints. We will, therefore, avoid backscatter techniques for the proposed mm-wave radio.

The alternative to passive backscatter is to use active transmission. In this regard, our aim is to maximize the energy efficiency from the transmitter point of view. Assuming an asymmetric link, all the emphasis is placed on the energy efficiency of the transmitter (radio) and not the receiver (reader). A brief description of the design choice is provided. The case for using mm-wave frequencies for the radio has been presented in previous sections. It will be argued that larger bandwidth leads to improved transmitter energy efficiency, among other benefits that will be explained later.

Channel capacity, determined by the Shannon equation, places an upper bound on the information rate through the wireless link from the miniaturized radio [38]. Therefore, the maximum data rate is given by

$$R < B \log_2\left[1 + \frac{P}{N_0 B}\right] \tag{2.17}$$

where R is the maximum data rate, B is the bandwidth, P is the average transmitted power, and  $N_0$  is the thermal noise density.

Our goal is to minimize the consumed energy per information bit transmitted from the radio. Assuming that the system has an efficiency of  $\eta$, the energy consumption per bit of actual information transmitted can be calculated as

$$E_{TX} = \frac{E_b}{\eta} = \frac{1}{\eta} \left( \frac{E_s}{\log_2(M)} \right) = \frac{1}{\eta} \frac{P}{\frac{\log_2(M)}{T_s}} = \frac{1}{\eta} \frac{P}{R} \tag{2.18}$$

Here,  $E_b$  is the energy transmitted per bit of information,  $E_s$  is the energy per symbol, M is the number of symbols in the M-ary modulation alphabet, and  $T_s$  is the symbol length. For example, for an M-PPM modulation where one pulse is transmitted in M possible slots ( $T_s = MT_b$, where  $T_b$  is the bit length), the information flow rate is  $\log_2(M)/T_s$, since  $\log_2(M)$  bits are transmitted per symbol length of  $T_s$ .

At this point we can define a spectral efficiency factor (S), which is the ratio R/B. Substituting S = R/B into the previous calculations, we arrive at the following equation.

$$S < \log_2\left[1 + \frac{SE_b}{N_0}\right] \tag{2.19}$$

After rearranging terms, we find the following relationship between spectral efficiency and energy efficiency

$$\frac{E_{TX}\eta}{N_0} = \frac{2^S - 1}{S} \tag{2.20}$$

Therefore, improving energy efficiency (i.e., minimizing  $E_{TX}$ ) can potentially be achieved by sacrificing spectral efficiency. In the limit as S goes to 0, Equation (2.20) shows that a minimum  $E_b/N_0$  of -1.59dB is needed to enable communication. This general trend points us in the direction of finding larger available bandwidths in order to achieve better energy efficiency. Figure 2.11 shows this tradeoff.
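The tradeoff in Equation (2.20) is easy to tabulate. The sketch below evaluates the right-hand side in dB for a few spectral efficiencies and checks the low-S limit, which approaches $\ln 2$; the function name is hypothetical.

```python
import math


def min_ebn0_db(s):
    """Right-hand side of Equation (2.20), (2^S - 1) / S, expressed in dB."""
    return 10 * math.log10((2 ** s - 1) / s)


for s in (1e-6, 0.1, 1.0, 2.0, 4.0, 8.0):
    print(f"S = {s:>8g} b/s/Hz -> Eb/N0 >= {min_ebn0_db(s):6.2f} dB")

# As S -> 0 the bound tends to ln(2), i.e. about -1.59 dB
limit_db = 10 * math.log10(math.log(2))
```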

The modulation scheme will also affect this tradeoff. Quadrature Amplitude Modulation (QAM), Phase Shift Keying (PSK), or other bi-orthogonal modulations enable larger spectral efficiency with an added number of symbols (i.e. larger M) at the cost of increasing SNR per bit, and hence are not suitable for our application. On the other hand, M-ary orthogonal modulations (e.g., Frequency Shift Keying (FSK), or Pulse Position Modulation (PPM)) with a large number of dimensions can achieve the energy efficiency goal. The upper limit in M stems from practical constraints, such as jitter and timing issues.

In addition to improved energy efficiency, use of larger bandwidth improves localization capability of the system and, as was pointed out before, this is an important requirement in the sensor application space. We would like to be able to estimate the position of the radio with sub-centimeter accuracy. The resulting estimation problem, related to measuring the arrival time of the incoming pulse from the radio, can be analyzed by looking at one of the theoretical lower bounds on the variance of the estimator, for example the Cramer-Rao



Figure 2.11: SNR as a function of spectral efficiency

Lower Bound (CRLB) [39]. Assuming white noise, or equivalently colored noise with sampling at nulls of the auto-correlation function, the lower bound on the timing uncertainty can be derived as [39]

$$\overline{t_0^2} = \frac{1}{\beta^2 \frac{E}{N_0/2}} \tag{2.21}$$

where  $\beta$  is the effective bandwidth and the other term represents the SNR. The range estimator can be calculated with R = ct and hence the variance on the range estimator is given by

$$\sigma_R^2 = \frac{c^2}{\beta^2 \frac{E}{N_0/2}} \tag{2.22}$$

Therefore, increasing the signal bandwidth will lead to a better estimation on range.
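Equation (2.22) can be turned into a quick estimate of the achievable ranging accuracy. In the sketch below, the effective bandwidth of $2\pi \times 1$ GHz and the 12dB value of $E/N_0$ are illustrative assumptions only, and the function name is hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light, m/s


def range_std_m(beta_rad_s, e_over_n0_db):
    """Square root of Equation (2.22): lower bound on range standard deviation."""
    snr = 2 * 10 ** (e_over_n0_db / 10)    # E / (N0 / 2)
    return C0 / (beta_rad_s * math.sqrt(snr))


beta = 2 * math.pi * 1e9                   # assumed effective bandwidth, rad/s
sigma_1 = range_std_m(beta, 12.0)
sigma_2 = range_std_m(2 * beta, 12.0)      # doubling bandwidth halves the bound
print(f"sigma_R = {sigma_1 * 1e3:.2f} mm, with 2x bandwidth: {sigma_2 * 1e3:.2f} mm")
```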

In order to meet these requirements, this work uses an M-PPM modulation scheme in the 60GHz frequency range, where a large 7GHz band is available. Using pulsed M-PPM has the added benefits of enabling heavy duty-cycling of the transmitter and reducing collisions.

## 2.4 Link Budget Calculations

### 2.4.1 Uplink

As discussed in the previous section, we propose the 24GHz frequency for downlink and the 60GHz unlicensed band for uplink. Using mm-wave frequencies will help match the aperture of the sensor to the wave and improves efficiency. More importantly, for the uplink, the 60GHz band allows for a large 7GHz bandwidth pulse that, in turn, enables a high aggregate data rate, accurate time-of-flight ranging, and extreme duty cycling of transmitter down to sub-nanosecond windows. In this section a first order link budget analysis will be provided to demonstrate the feasibility of mm-wave power up and data communication.

We will start with the energy consumption of the mm-wave transmitter, which is an important part of the system. To transmit a pulse at 60GHz, in addition to the pulse width itself, a setup time is required for the oscillator and other circuits to start up ( $T_{startup}$ ). Assuming that the receiver chain can recover a DC power of  $2\mu$ W from which around  $1\mu$ W is available to the transmitter, the following relationship between various time windows, power, and efficiencies must hold

$$P_{TX} = \frac{NT_{active}\left(\frac{P_{rad}}{\eta_{TX}}\right) + T_{startup}P_{tx,oh,off}}{T_{cycle}} < 1\mu W \tag{2.23}$$

Here,  $\eta_{TX}$  is the transmitter efficiency,  $P_{rad}$  is the radiated power during a pulse,  $T_{active}$  is the pulse width,  $P_{tx,oh,off}$  is the transmitter overhead in OFF mode due to various startup circuits including timing,  $T_{cycle}$  is the duty cycle window, and N is the number of pulses transmitted for a symbol. We can use this as the basis to calculate the total energy consumption of the transmitter.

Table 2.1 summarizes the assumptions and the parameters used for calculation of the required transmitter power from the radio. Here, we are designing the uplink for a longer range than downlink (2 meters), intended to also cover the applications where the transmit and receive array are not co-located on the reader (e.g., periodic 24GHz repeaters distributed across the field). With the parameters in Table 2.1 an EIRP of -6.5dBm is required from the chip. Assuming a -1dBi antenna gain, this leads to an effective transmit power of -5.5dBm or approximately 0.3mW. For our power budget calculations we assume an equal partition between the two components in the numerator of Equation (2.23), each of which takes 15pJ of energy. For example, for a 0.5ns pulse width and N=3, this requires that the transmitter efficiency ( $\eta_{TX}$ ) be better than 3%. With these assumptions, a total 30pJ energy will be required to transmit  $\log_2(M)$  bits. With a  $T_{cycle}$  of  $30\mu$ s, this leads to an average power consumption of  $1\mu$ W which is within our budget. The rest of the available power will be dedicated to other non-duty cycled blocks, as well as any additional duty-cycled sensing circuits.
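The figures in this paragraph can be cross-checked with a short calculation. The sketch below is one way to arrive at the required EIRP from the Table 2.1 entries, assuming a thermal noise density of -174dBm/Hz and treating the reader as a lossless aperture of the stated area; this may differ in detail from the author's own calculation. The energy budget lines use the rounded 0.3mW transmit power from the text. All function and variable names are hypothetical.

```python
import math


def required_eirp_dbm(bw_hz, nf_db, snr_db, margin_db, a_rx_m2, range_m,
                      noise_density_dbm_hz=-174.0):
    """EIRP needed so that the power captured by the RX aperture meets the SNR."""
    noise_dbm = noise_density_dbm_hz + 10 * math.log10(bw_hz) + nf_db
    p_rx_dbm = noise_dbm + snr_db + margin_db
    # Fraction of the isotropically spread power intercepted by the aperture
    capture_db = 10 * math.log10(a_rx_m2 / (4 * math.pi * range_m ** 2))
    return p_rx_dbm - capture_db


# Table 2.1 entries: BW = 2 GHz, NF = 6 dB, SNR = 12 dB, margin = 3 dB,
# A_RX = 1.5 cm x 1.5 cm, range = 2 m
eirp_dbm = required_eirp_dbm(2e9, 6.0, 12.0, 3.0, 0.015 * 0.015, 2.0)

# Energy budget from the text: 15 pJ for the active pulses, 30 pJ in total
p_rad = 0.3e-3                  # W, rounded value used in the text
n_pulses, t_pulse = 3, 0.5e-9   # N = 3 pulses of 0.5 ns
eta_min = n_pulses * t_pulse * p_rad / 15e-12   # minimum TX efficiency
p_avg = 30e-12 / 30e-6                          # average power over T_cycle

print(f"EIRP = {eirp_dbm:.1f} dBm, eta_TX >= {eta_min:.1%}, P_avg = {p_avg * 1e6:.2f} uW")
```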

The required capacitor size to hold the energy for this intermittent transmission can also be

| Parameter | Value          |
|-----------|----------------|
| Frequency | 60GHz          |
| Range     | 2m             |
| $NF_{RX}$ | 6dB            |
| SNR       | 12dB           |
| BW        | 2GHz           |
| $A_{RX}$  | 1.5cm by 1.5cm |
| Margin    | 3dB            |
| $P_T G_T$ | -6.5dBm        |

Table 2.1: Summary of transmitter link budget

Table 2.2: Summary of receiver link budget

| Parameter             | Value  |
|-----------------------|--------|
| Frequency             | 24GHz  |
| Transmitter EIRP      | 40dBm  |
| Receiver Antenna Gain | -1dBi  |
| RX Sensitivity        | -14dBm |
| Range                 | 89cm   |

calculated. Assuming we can only tolerate a 5% ripple in the voltage of this capacitor, and for 30 pJ energy, the required capacitor size is given by

$$\Delta E = 0.5C(V_{sup}^2 - V_2^2) = 0.5CV_{sup}^2\left(1 - \left(1 - \frac{\Delta V_{sup}}{V_{sup}}\right)^2\right) = 30pJ \tag{2.24}$$

The capacitor size is calculated as 600pF. This will ensure that with the burst of energy being drawn for pulse transmission, the supply voltage ripple stays below 5%.
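Equation (2.24) can be solved for C directly. The supply voltage is not stated in this passage; the sketch below assumes $V_{sup}=1$V, which gives a value close to the quoted 600pF. The function name is hypothetical.

```python
def storage_capacitor(delta_e, v_sup, ripple):
    """Solve Equation (2.24) for C given energy drawn and allowed ripple."""
    return 2 * delta_e / (v_sup ** 2 * (1 - (1 - ripple) ** 2))


# 30 pJ per transmission, 5% ripple, V_sup = 1 V (assumed)
c_store = storage_capacitor(30e-12, 1.0, 0.05)
print(f"C = {c_store * 1e12:.0f} pF")
```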

### 2.4.2 Downlink

In the downlink, the mm-wave signal with frequency of 24GHz is transmitted from the reader (e.g., cellular phone) and will be picked up by the radio for power and data. Our focus in the link budget calculations will be on the power recovery component, which is the dominant factor in downlink. In the worst case conditions of the radio being at maximum distance from the reader, we assume a receiver sensitivity of -14dBm for the power recovery circuits, and also assume a DC power consumption of  $1.5\mu$ W for the sensor node. Under these conditions, and for a nominal 40dBm available EIRP from the base, we can calculate the operational range of the radio. Table 2.2 summarizes our working assumptions and the calculated range of the radio.

# Chapter 3

# Ultra Low-Power Transponder Design

This chapter focuses on describing the design of the miniaturized radio, building upon the analysis from Chapter 2. The description will start with an overview of the system and the outline regarding the block diagram. Design details related to individual blocks will follow.

The block diagram of the RFID system is shown in Fig. 3.1. The mm-wave 24GHz on-chip antenna feeds the harvested signal to a step-up 1:2 transformer-matched 24GHz 6-stage Dickson AC-DC rectifier. The rectified voltage is regulated with an LDO fed by a first-order temperature compensated reference. It will activate the Power-On-Reset (POR) circuit which generates the activation signal for the whole chip. The transponder also incorporates a data recovery path with an envelope detector, decoder, and state machine. After decoding, the chip communicates back its data using a 3-pulse modified M-PPM on 60GHz.

For multi-access readout from thousands of transponders, a minimum level of synchronization among the tags as well as with the reader is required. The timing block performs this synchronization between the tags and the reader.

The TX is a highly duty-cycled 60GHz pulse-based system with programmable pulse width. The 60GHz TX optimizes the total consumed energy by reducing overheads and using a fast-starting oscillator. The pulses are transmitted using a half-wave 60GHz dipole. In this chapter a brief description of each block is presented.

This chapter is divided into the following six sections: 1) power recovery and regulation, 2) power-on reset, 3) data recovery and demodulation, 4) timing, synchronization, and multi-access, 5) 60GHz transmitter, and 6) on-chip antenna design. In each section, a description of the circuit architecture and system choices will be followed by a detailed explanation of the circuit design.



Figure 3.1: Block diagram of the passive RFID

## 3.1 Power Recovery and Regulation

The block diagram of the  $V_{DD}$  generator is shown in Fig. 3.2. It includes the 24GHz transformer-based input matching network, voltage multiplier (rectifier), DC-limiter, reference generator, and regulator. The voltage multiplier converts part of the incoming RF signal power to DC for the power supply of all active circuits on the chip. Charge pump circuits are among the most popular voltage multipliers.

The output voltage of the voltage multiplier experiences large variations based on the RF input voltage. To limit the output voltage range of the multiplier, a DC limiter is used. The regulator circuit following the limiter performs two major functions: First is to regulate the front-end output voltage to a pre-defined value and maintain variations within an acceptable range. The second is to protect the inner circuits from breakdown at high RF input power. The output of the voltage regulator will be used as the circuit's  $V_{DD}$  voltage.

### 3.1.1 Voltage Multiplier

#### 3.1.1.1 Voltage Multiplier Theory

The voltage multiplier circuit converts the received 24GHz input signal to a stable DC voltage. A popular voltage multiplier topology is the Dickson charge pump circuit [40]. A



Figure 3.2: Block diagram of the  $V_{DD}$  generator

single stage Dickson voltage multiplier is shown in Fig. 3.3. Here,  $V_{in}$  is the input DC voltage from previous stages,  $V_{RF}$  is the RF input voltage, C is the clock coupling capacitor, and  $C_p$  is the parasitic capacitor. When the input RF signal of a single stage voltage multiplier is in the negative cycles, the voltage across capacitor C is charged through diode  $D_1$  to the peak of the input signal minus diode drop voltage. In the positive cycles,  $D_1$  is OFF and  $D_2$  rectifies the voltage sum of the  $V_C$  and  $V_{in}$  and produces a DC voltage ( $V_{out}$ ) equal to twice the peak of the input signal minus the ON voltage drop ( $V_D$ ) across the two diodes. Therefore, for the  $n^{th}$  stage of the voltage multiplier and during the negative cycle of the input voltage, the voltage at node A would be given by

$$V_{An} = V_{out,n-1} + \frac{C}{C + C_p} V_{RF} - V_{D1} \tag{3.1}$$

where  $V_{RF}$  is the peak of the input signal. Then in the positive cycle, the voltage at the output is

$$V_{out,n} = V_{out,n-1} + \frac{2C}{C+C_p} V_{RF} - V_{D1} - V_{D2} \tag{3.2}$$

The equation above is for the unloaded case. For the case that there is a load current equal to  $I_{out}$ , the equation is rewritten as [40]

$$V_{out,n} = V_{out,n-1} + \frac{2C}{C+C_p} V_{RF} - V_{D1} - V_{D2} - \frac{I_{out}}{(C+C_p)f} \tag{3.3}$$

where f is the frequency of operation and the last term is the voltage that the capacitor is charged and discharged to when supplying  $I_{out}$  to the load. Therefore, by looking at the above equation, the output voltage of an N stage voltage multiplier is given by

$$V_{out} = N\left(\frac{2C}{C+C_p}V_{RF} - V_{D1} - V_{D2}\right) - \frac{NI_{out}}{(C+C_p)f} \tag{3.4}$$

In CMOS technology, the diodes are implemented using diode-connected devices instead of Schottky diodes, so the diode voltage drops will be replaced by  $V_{GS}$  of the diode connected



Figure 3.3: Schematic of one stage Dickson voltage multiplier (a) Diodes (b) Using CMOS diode connected devices.

devices or in the extreme case by their threshold voltage (Fig. 3.3).

$$V_{out} = N\left(\frac{2C}{C+C_p}V_{RF} - V_{GS1} - V_{GS2}\right) - \frac{NI_{out}}{(C+C_p)f} \tag{3.5}$$

Equation 3.5 shows that the output voltage is a strong function of the transistor threshold voltage. For that reason, low threshold voltage CMOS transistors were used.
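Equation 3.5 lends itself to a quick sensitivity check. In the sketch below, the component values (coupling and parasitic capacitors, RF amplitude, device voltage drops) are illustrative assumptions rather than design values from this work; only the 6 stages, the 24GHz frequency, and a load current on the order of 2µA follow from figures quoted elsewhere in the text. The function name is hypothetical.

```python
def dickson_vout(n_stages, v_rf, c, c_p, v_drop1, v_drop2, i_out, freq_hz):
    """Output voltage of an N-stage Dickson multiplier, Equation (3.4)/(3.5)."""
    per_stage = 2 * c / (c + c_p) * v_rf - v_drop1 - v_drop2
    droop = i_out / ((c + c_p) * freq_hz)    # loading term per stage
    return n_stages * (per_stage - droop)


# Assumed example: C = 100 fF, C_p = 10 fF, 0.3 V RF peak, 2 uA load
for v_gs in (0.05, 0.10, 0.15):
    v = dickson_vout(6, 0.3, 100e-15, 10e-15, v_gs, v_gs, 2e-6, 24e9)
    print(f"V_GS = {v_gs:.2f} V -> V_out = {v:.2f} V")

# Ideal limit: no parasitics, no drops, no load gives 2 * N * V_RF
v_ideal = dickson_vout(6, 0.3, 100e-15, 0.0, 0.0, 0.0, 0.0, 24e9)
```

Each additional 50mV of device drop costs 0.6V at the output of a 6-stage multiplier under these assumptions, which illustrates why low-threshold devices matter.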

#### 3.1.1.2 Input Matching Network

The input impedance of voltage multiplier can be modeled as a parallel RC network (Fig. 3.4). Here, the shunt capacitance is mainly due to the  $C_{GS}$  of diode connected devices in parallel with the bottom plate parasitic capacitors of clock coupling capacitors, and the shunt resistance is caused by the ON resistance of diodes during conduction [41]. To use the transformer for the input matching network, the simplified model shown in Fig. 3.5 is assumed. In this transformer model,  $R_{s1}$  and  $R_{s2}$  are the series resistances of primary and



Figure 3.4: Equivalent model for input impedance of voltage multiplier

secondary windings, K is the coupling coefficient of the transformer, and  $C_s$  is the series capacitor added to bring up the input impedance and to resonate with the primary inductance (similar to an L-match network). Based on the analysis in [42], to find the conjugate matching conditions, we can derive the Thevenin equivalent of the circuit (Fig. 3.6) where  $R_T$  and  $V_T$  are given as



Figure 3.5: Equivalent model for input transformer

$$R_T = R_{s2} + \frac{(\omega M)^2}{R_{ant} + R_{s1}} \tag{3.6}$$

and

$$V_T = \frac{j\omega M}{R_{ant} + R_{s1}} V_{ant} \tag{3.7}$$

where  $M = K\sqrt{L_1L_2}$  and we assume that  $\omega L_1 = \frac{1}{\omega C_s}$ . To have the conjugate matching condition

$$\omega L_2 = \frac{1}{\omega C_p} \tag{3.8}$$



Figure 3.6: Thevenin equivalent of input matching network

and

$$R_T = \frac{R_p}{(Q_L)^2} \tag{3.9}$$

where

$$Q_L = R_p C_p \omega \tag{3.10}$$

By using the above equations, the voltage gain from  $V_{ant}$  to  $V_L$  can be found as [42]

$$G_{matching} = \left| \frac{V_L}{V_{ant}} \right| = \frac{\omega M}{2(R_{ant} + R_{s1})} \sqrt{1 + \left(\frac{R_{s2}}{\omega L_2} + \frac{(\omega M)^2}{\omega L_2(R_{ant} + R_{s1})}\right)^{-2}} \tag{3.11}$$

Equation 3.11 shows that increasing the secondary winding inductance or, in other words, decreasing the input capacitance of the chip increases the voltage gain, as expected. Lower series resistance on the secondary side (higher transformer quality factor) also increases the voltage gain.
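Equations 3.6 and 3.11 can be evaluated numerically with a short Python sketch such as the one below. The term in parentheses in Equation 3.11 equals $R_T/(\omega L_2)$, so the gain can be written as $\frac{\omega M}{2(R_{ant}+R_{s1})}\sqrt{1+Q_L^2}$ with $Q_L = \omega L_2 / R_T$, which follows from Equations 3.8 to 3.10. The inductances and resistances used here are assumed values for illustration only; the function name is hypothetical.

```python
import math

def matching_gain(freq, l1, l2, k, r_ant, r_s1, r_s2):
    """Voltage gain |V_L / V_ant| of Equation 3.11 under conjugate matching."""
    w = 2.0 * math.pi * freq
    m = k * math.sqrt(l1 * l2)                     # mutual inductance M
    r_t = r_s2 + (w * m) ** 2 / (r_ant + r_s1)     # Thevenin resistance, Eq. 3.6
    q_l = w * l2 / r_t                             # loaded Q implied by Eqs. 3.8-3.10
    return w * m / (2.0 * (r_ant + r_s1)) * math.sqrt(1.0 + q_l ** 2)

# Assumed example values: 200 pH / 800 pH windings, 50 ohm antenna.
g_10 = matching_gain(24e9, 200e-12, 800e-12, 0.8, 50.0, 2.5, 10.0)
g_05 = matching_gain(24e9, 200e-12, 800e-12, 0.8, 50.0, 2.5, 5.0)
print(f"gain with R_s2 = 10 ohm: {g_10:.2f}")
print(f"gain with R_s2 =  5 ohm: {g_05:.2f}")
```

Comparing the two calls shows the trend stated above: a lower secondary series resistance yields a higher matching gain.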

#### 3.1.1.3 24GHz AC-DC Convertor Design

In the design of the power recovery unit, several competing effects have to be balanced to achieve optimal overall efficiency. In the multistage AC-DC convertor, increasing the number of stages at a fixed load current increases the voltage gain of the voltage multiplier, but it also increases the input capacitance and therefore, based on Equation 3.11, reduces the matching network gain. This presents a basic tradeoff in selecting the optimal number of stages based on the available load and source impedances. Also, the selected number of stages in the charge pump affects the size of the input matching transformer. Taking the size of the transformer and practical limitations on area and loss into account, there will be an optimum point for the number of stages. Figure 3.7 shows the

voltage gain of matching network and voltage multiplier as a function of number of stages for  $450 \text{k}\Omega$  load and 0.9V output voltage. In this simulation we are assuming a transformer quality factor Q=12 and coupling coefficient k=0.8.

The efficiency of the charge pump also depends on the size of the diode-connected transistors, through two effects. First, increasing the size of the transistors increases the gain of the voltage multiplier by reducing the diodes' drop-out voltages. Second, increasing the size of the transistors adds to the input capacitance, which reduces the voltage gain of the matching network (i.e., increases its power loss) [41].

In this design, a six stage Dickson charge pump multiplier with transistor size of  $W = 5\mu m$  has been used. The size of the clock-coupling capacitor should be large enough to provide the required average load current. Here, the clock coupling capacitor size is 150fF. For the matching network, a series capacitor and a 1:2 transformer have been used. The transformer insertion loss and coupling coefficient are 1.7dB and 0.8, respectively. The simulations show that with the current parameters and antenna impedance, the AC-DC convertor can work with a minimum available power of -11dBm.



Figure 3.7: Voltage gain of matching network and voltage multiplier as a function of number of stages for  $450k\Omega$  load and 0.9V output voltage.



Figure 3.8: Schematic of the DC limiter

#### 3.1.2 Limiter

The output voltage of the voltage multiplier depends on the RF input voltage/power, and therefore, it varies across a wide range. To limit this variation, a DC limiter is used. Ideally, in the low input power regime, the limiter should not load or affect the output of the voltage multiplier or the efficiency. In the high RF energy periods, the limiter makes a bypass path for current and limits the output voltage to a certain value. This circuit is designed to limit the rectifier output voltage variation to 1.4V.

The limiter circuit is shown in Fig. 3.8. The circuit shown uses a fraction of the generated output voltage to control the gate voltage of the transistor  $M_6$ . This transistor acts very similar to a tunable load where the load current is controlled through the gate voltage of  $M_6$ . If we neglect the body effect, then we can safely say that transistor  $M_6$  can be turned on when the voltage of node A goes above four threshold voltages. With the body effect taken into account, the turn on voltage will be even higher. The load current of transistor  $M_6$  can be calculated as

$$I_{limiter} = \frac{1}{2}\mu_n C_{ox} \left(\frac{W}{L}\right)_6 (V_B - V_{th})^2 \tag{3.12}$$

where  $V_B = V_A - V_{gs1} - V_{gs2} - V_{gs3} - V_{gs4} \approx \frac{V_A}{5}$.

With the given analysis, one can now choose the transistor sizes based on the system specifications. A good starting point would be the input RF dynamic range, which is set by regulatory limits and the power consumption of the system. The maximum operation range of the tag determines the minimum chip power consumption. In the next step, one needs to determine the maximum tolerable output voltage. This value is determined by the number of stacked NMOS transistors. After deciding the turn-on voltage of the limiter and the output voltage variation that can be tolerated, the size of transistor  $M_6$  can be found. The output voltage of the limiter as a function of the rectifier's input power is shown in Fig. 3.9.



Figure 3.9: Output voltage of the limiter as a function of rectifier's input power

### 3.1.3 Reference Current and Voltage Generator

In passive systems like RFIDs, the chip reference voltage should be designed to be independent of temperature and power supply variations, since there are no external accurate references to correct for variations. The essence of the operation of the proposed reference voltage generator is similar to bandgap references: it uses a PTAT current reference that drives the current into a stack of diode-connected devices and resistors. The schematic of the PTAT current source is shown in Fig. 3.10. PMOS diode-connected devices in the subthreshold region are used instead of PNP transistors, since they have a smaller  $V_{gs}$  drop compared to



Figure 3.10: Schematic of the PTAT current source and the reference generator

the  $V_{be}$  drop of BJT devices, which makes it possible to stack the devices under a 0.9V supply. To achieve a higher supply rejection while maintaining  $V_x = V_y$ , an op-amp is used. To make sure that  $V_x = V_y$  within the temperature range, the op-amp bias circuit uses a PTAT current source with a current mirror on top. The current of the PTAT current source in the sub-threshold region can be found as

$$I_{ref} = \frac{nV_T}{R_1} \ln\left(\frac{(W/L)_2}{(W/L)_1}\right) \tag{3.13}$$

where  $V_T$  is the thermal voltage and  $n = 1 + \frac{C_{dep}}{C_{ox}}$  [43].

To minimize the power consumption of the regulator, the reference current should be on the order of 10-20 nA. To reduce  $I_{ref}$ , either resistor  $R_1$  can be increased or the ratio of the transistors' width-to-length ratios can be selected to be close to unity. Although the latter looks attractive in terms of layout area reduction, it may not be desirable in terms of the sensitivity of  $I_{ref}$  to this ratio. In this project, the ratio  $(W/L)_2/(W/L)_1$  is chosen to be 3 and  $R_1$  is 2M $\Omega$, leading to a current of 18nA. To generate  $V_{ref}$, the current is mirrored and flows through a resistor and a diode connected transistor (Fig. 3.10).

$$V_{ref} = V_{gs3} + anV_T \ln\left(\frac{(W/L)_2}{(W/L)_1}\right)\left(\frac{R_2}{R_1}\right) \tag{3.14}$$



Figure 3.11: Reference current and voltage as a function of both temperature and  $V_{dd}$  variation for SS, TT, and FF corners

where  $a = \frac{(W/L)_6}{(W/L)_5}$. An important aspect of the reference voltage is its temperature dependency. The variation of the reference voltage as a function of temperature is given by

$$\frac{\partial V_{ref}}{\partial T} = \frac{\partial V_{gs3}}{\partial T} + \left(\frac{aR_2}{R_1}\right) \frac{\partial \Delta V_{gs12}}{\partial T} \tag{3.15}$$

Here, the sensitivity of  $V_T$  to temperature is positive and the sensitivity of  $V_{gs}$  is negative, and they tend to cancel each other to first order.

Figure 3.11 shows the reference current and voltage as a function of both temperature and  $V_{dd}$  variation for SS, TT, and FF corners. As can be seen, the reference voltage increases in the SS corner due to the larger threshold voltage leading to a higher  $V_{gs3}$ . The variations in  $V_{ref}$  are 40ppm/°C for temperature variation from  $-20^{\circ}$ C to 100°C and 18mV when  $V_{dd}$  changes from 0.9 to 1.8V.
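As a numerical check on Equation 3.13, the sketch below evaluates the PTAT current for the stated ratio of 3 and $R_1 = 2\,\text{M}\Omega$. The subthreshold slope factor $n$ is not given in the text; the value 1.27 used here is an assumption picked so that the result lands near the quoted 18 nA at 300 K. The function name is hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C

def ptat_current(temp_k, n, r1, ratio):
    """Subthreshold PTAT reference current following Equation 3.13."""
    v_t = BOLTZMANN * temp_k / ELECTRON_CHARGE   # thermal voltage kT/q
    return n * v_t * math.log(ratio) / r1

# n = 1.27 is an assumed slope factor, not a value from the design.
i_300 = ptat_current(temp_k=300.0, n=1.27, r1=2e6, ratio=3.0)
i_350 = ptat_current(temp_k=350.0, n=1.27, r1=2e6, ratio=3.0)
print(f"I_ref at 300 K: {i_300 * 1e9:.1f} nA")
print(f"I_ref at 350 K: {i_350 * 1e9:.1f} nA")
```

The second call illustrates the proportional-to-absolute-temperature behavior: with a temperature-independent $R_1$ (an idealization), the current scales linearly with $T$.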



Figure 3.12: Schematic of the Low-Drop-Out (LDO) voltage regulator

## 3.1.4 Series Voltage Regulator (LDO)

The output voltage of the limiter circuitry (Fig. 3.8) varies significantly with threshold voltage variations. However, analog and digital circuits require the supply voltage to be constant for proper function, and therefore, a voltage regulator is required to stabilize the supply voltage. Here, an LDO circuit is used to generate the 0.7 V regulated voltage.

The series voltage regulator, shown in Fig. 3.12, is simply a differential amplifier with feedback. The feedback senses the output voltage and compares it with the  $V_{ref}$  voltage provided by the voltage reference. For the differential amplifier, a two-stage PMOS amplifier is chosen. An NMOS current mirror acts as the load of the first stage and copies the current to the second stage. The second stage uses an active PMOS current mirror as the load. The regulator is designed to provide output currents in the range of 0 to  $2\mu$ A.

Stability is a critical issue, especially in the case of large load currents where the pole associated with the output node (dominant pole), given by  $\frac{1}{R_{load}C_{load}}$ , moves closer to the pole at the output of the differential amplifier. Figure 3.13 shows the open loop gain and phase of the regulator for nominal  $1.4\mu$ A output current and 1nF load capacitance. It shows that for  $1.4\mu$ A of output current, the closed loop bandwidth and phase margin are 87kHz and 83°, respectively.

Our system uses an M-PPM modulation scheme for the uplink path from the radio to the reader unit. N pulses will be transmitted every  $T_{cycle}$  to send  $\log_2(M)$  bits. The loop bandwidth of the regulator (speed of regulator) will therefore be determined by the required



Figure 3.13: Open loop gain and phase of the regulator for nominal  $1.4\mu$ A output current and 1nF load capacitance

dynamic settling error of the supply voltage. This is to ensure fast recovery after the transient voltage drop that occurs after the transmission of N pulses. This cycle repeats every  $T_{cycle}$ , therefore, the regulator should recover to a given error voltage within this window. As will be discussed later in this chapter, our proposed  $T_{cycle}$  is  $50\mu$ s. The settling error is given as

$$\varepsilon = e^{-\frac{t}{\tau}} \tag{3.16}$$

where  $\tau = \frac{1}{BW}$ . To achieve a settling error of less than 3% (< 1mV), the loop bandwidth should be higher than 70kHz for the nominal current setting. The closed loop bandwidth



Figure 3.14: The closed loop bandwidth and phase margin as a function of output load current

and phase margin as a function of output load current are shown in Fig. 3.14 for 1nF of load capacitance. The total current consumption of the reference voltage generator and the LDO is 230nA from the 0.9V unregulated supply voltage.
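The bandwidth requirement derived from Equation 3.16 can be reproduced with a few lines of Python. The sketch assumes that the full $T_{cycle} = 50\mu$s window is available for recovery and uses $\tau = 1/BW$ exactly as defined above; the function names are hypothetical.

```python
import math

def settling_error(t, bandwidth_hz):
    """Residual error after time t, Equation 3.16 with tau = 1/BW."""
    tau = 1.0 / bandwidth_hz
    return math.exp(-t / tau)

def min_bandwidth(t, max_error):
    """Smallest loop bandwidth that settles to max_error within time t."""
    return math.log(1.0 / max_error) / t

T_CYCLE = 50e-6   # recovery window taken from the text
bw_required = min_bandwidth(T_CYCLE, max_error=0.03)
err_at_87k = settling_error(T_CYCLE, 87e3)
print(f"required bandwidth: {bw_required / 1e3:.1f} kHz")
print(f"error at 87 kHz:    {err_at_87k * 100:.2f} %")
```

Under these assumptions the 3% target maps to roughly 70 kHz, and the simulated 87 kHz loop bandwidth leaves some margin.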

## 3.2 Power On Reset (POR)

The power on reset (POR) circuit performs two essential functions for the correct operation of the chip. The first is to generate the reset signal for the digital section of the chip, and the second is to disconnect the chip from the supply when the input power level falls below a critical level. The POR circuit measures the power supply level and compares it to a certain threshold. Once the supply voltage exceeds the required threshold, the POR circuit generates the command signals that start the operation of the chip. The chip is disabled as soon as the supply voltage drops below the threshold. To avoid oscillatory behavior, two thresholds are used for activating and deactivating the chip, much like a Schmitt trigger.

The schematic of the POR circuit is shown in Fig. 3.15 [44]. The first stage consists of series diode-connected devices which are connected to the regulated supply voltage through  $M_{n1}$  and  $M_{p1}$ , which have their sources and gates connected. These two devices will limit the current consumption of the POR circuit. At the beginning, when the supply voltage is low, the diode-connected devices are OFF and the input of the inverter will follow the supply voltage and therefore the POR is low. By increasing the supply voltage, the diodes turn on and at some point the input of the inverter will pass its threshold voltage, and from that point the POR output will follow the supply voltage. The threshold of the inverter will determine the OFF-to-ON point of the POR circuit. After the output goes high, transistor  $M_{n2}$  will short one of the diodes so the ON-to-OFF threshold of circuit would be lower than its OFF-to-ON threshold.

The POR transient simulation is shown in Fig. 3.16 for TT, SS, and FF corners. The ON and OFF thresholds of the circuit are 640mV and 480mV, respectively.
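The two-threshold behavior can be summarized with a small behavioral model. The following Python sketch is not a circuit simulation; it only captures the Schmitt-trigger-like hysteresis using the 640 mV and 480 mV thresholds quoted above. The supply samples are made up for the example, and the function name is hypothetical.

```python
def por_output(supply_samples, v_on=0.640, v_off=0.480):
    """Behavioral POR model: enable above v_on, disable below v_off."""
    enabled = False
    states = []
    for v in supply_samples:
        if not enabled and v > v_on:
            enabled = True           # OFF-to-ON threshold crossed
        elif enabled and v < v_off:
            enabled = False          # ON-to-OFF threshold crossed
        states.append(enabled)
    return states

# Assumed supply trajectory in volts: ramp up, dip, collapse, recover.
supply = [0.30, 0.50, 0.60, 0.65, 0.55, 0.50, 0.45, 0.60, 0.70]
states = por_output(supply)
for v, s in zip(supply, states):
    print(f"{v:.2f} V -> {'ON' if s else 'OFF'}")
```

Note that the chip stays enabled at 0.55 V and 0.50 V on the way down, but is not re-enabled at 0.60 V on the way back up, which is the hysteresis that prevents oscillation around a single threshold.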

## 3.3 Data Recovery and Demodulation

The block diagram of the demodulator circuit is shown in Figure 3.17. The input to the demodulator is a Pulse-Pause-Encoding (PPE) signal, which is ASK modulated. In the nominal case, the bit period of the PPE signal is 100ns for a "0" and 200ns for a "1", with a zero part (notch) of 50ns. In the demodulator section, the envelope of the PPE signal is extracted using an envelope detector and an averaging filter. The comparator then produces a constant envelope PPE signal ready for detection. The integrator and comparator measure the duration of each pulse and decide whether the pulse is "0" or "1". A reset circuit signals the arrival of a new bit and resets the integrator circuit. The state machine then decodes the command signal from the reader and activates the corresponding operation mode.



Figure 3.15: Schematic of the power on reset circuit



Figure 3.16: Transient output of POR for TT, SS, and FF corners



Figure 3.17: Block diagram of the demodulator

## 3.3.1 Encoding and Modulation Type

The demodulator receives and detects the command signals and the synchronization beacons from the reader. This data link uses the same 24 GHz channel that is used for power delivery. Therefore, the type of data encoding for the downlink (reader to radio) communication should minimize the effect on the wireless power flow from the reader to the tags during data transmission. Furthermore, the data rate of the downlink is also important for two main reasons: 1) the communication speed should not be limited by the downlink rate, and 2) the wireless synchronization signals are received through the demodulator. Given that there are no internal reference clocks available to the demodulator, and that these synchronization beacons are the only mechanism for calibrating and correcting the local clock, the frequency of these beacons should be high enough to allow the required accuracy for the radio.

The simplest type of encoding is non return to zero (NRZ), shown in Fig. 3.18(a). Here, the signal "1" is represented by a full-power voltage across the entire bit period and a signal "0" is represented by a zero voltage during the whole bit period. NRZ encoding has two problems. First, if the input data contain a long stream of "0"s, there will be no power received for a long time and the supply voltage can drop below the critical limit, which will limit the operation range. Second, since there are no transitions between consecutive "1"s or "0"s, the chip will not be able to pick up the CLK signal from rising or falling edges at the input.

Another possibility is Manchester encoding (Fig. 3.18(b)) [42] [45]. Here, the signal "1" is represented by a high-to-low transition occurring in the middle of the bit period while a signal "0" is represented by a low-to-high transition. For Manchester encoding, since there is always a transition in the middle of the bit period, the CLK can be extracted from the input data. Here, the power up efficiency of the system is 50%. However, we could increase the peak power since regulations are concerned mainly with the average transmitter power.

Another alternative is pulse pause encoding (PPE) [42]. In this scheme, the pulse starts with a short off interval followed by a long pulse for signal "1" and a short pulse for signal "0" (Fig. 3.18(c)). An advantage of PPE is that it maximizes the duration of the full-power interval of both binary "1" and binary "0" to maximize the amount of the harvested power.



Figure 3.18: Baseband waveform of (a) NRZ encoding (b) Manchester encoding (c) Pulse pause encoding

Also, the CLK can be extracted from the notches at the beginning of each data bit. However, the data rate is data dependent and can be lower than that of Manchester encoding for the same bandwidth.

For this design the PPE encoding was chosen since it will maximize the power delivery to the radio and also provides self-clocking as well as a reasonable data rate.
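To make the power-delivery argument concrete, the sketch below encodes a bit string with PPE and computes the fraction of time the carrier is at full power. It uses the nominal timing quoted at the start of Section 3.3 (50 ns notch, 100 ns and 200 ns bit periods) and assumes 100% modulation depth, so the notch carries no power. The function names are hypothetical.

```python
NOTCH_NS = 50                      # off interval at the start of every bit
PULSE_NS = {"0": 50, "1": 150}     # on interval: 100 ns and 200 ns bit periods

def ppe_encode(bits):
    """Return (level, duration_ns) segments: a notch, then a pulse, per bit."""
    segments = []
    for b in bits:
        segments.append((0, NOTCH_NS))
        segments.append((1, PULSE_NS[b]))
    return segments

def on_fraction(segments):
    """Fraction of the total time spent at full power."""
    total = sum(d for _, d in segments)
    on = sum(d for level, d in segments if level == 1)
    return on / total

for pattern in ("0000", "0101", "1111"):
    frac = on_fraction(ppe_encode(pattern))
    print(f"{pattern}: carrier on {frac * 100:.1f} % of the time")
```

With this nominal timing the on-time ranges from 50% (all zeros) to 75% (all ones), so PPE matches the 50% of Manchester encoding in the worst case and exceeds it whenever the data contain ones.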

For the downlink modulation, there are three different options: amplitude shift keying (ASK), frequency shift keying (FSK), and phase shift keying (PSK). The drawback of ASK modulation compared to PSK and FSK is that it does not have constant amplitude, so the amount of power harvested by the tag fluctuates with the data sent by the reader. Also, since the data is carried on the amplitude of the signal, it is more susceptible to noise and disturbances. Despite this, the demodulator architecture for ASK is much simpler and lower power compared to PSK and FSK schemes. ASK modulation has therefore been used for its simplicity in the design of the demodulator.

For the ASK modulation, the modulation index (depth) is defined as

$$m = \frac{A_1 - A_0}{A_1} \tag{3.17}$$

where  $A_1$  and  $A_0$  are the amplitude of the signal when the data is "1" and "0". Larger

modulation depth means higher fluctuation on the harvested power but, on the other hand, larger signal to noise ratio for the data path.

## 3.3.2 ASK Demodulator

A simple ASK demodulator consists of an envelope detector and a comparator/hysteresis circuit. The main problem with only using an envelope detector is the variation in the tracking time constant between the minimum and maximum input voltages. The input RF voltage range can vary from approximately 250mV to 1V. If the envelope detector is designed to follow the input envelope with the required precision in the 250mV case, it will fail to track a one-to-zero transition in the case of the maximum input envelope. This is due to the hysteresis; with a high gain in the envelope detector rectifier, the maximum voltage of the "1" state would be too high to be discharged down to the threshold within the time window for the zero state. So the high-to-low transition will be too slow. On the other hand, if the detector is designed to meet the high-to-low requirements by reducing the maximum voltage in the "1" state, then in the minimum voltage case, a low-to-high transition would fail to reach the threshold. The lower gain in the rectifier would not be enough to bring up the voltage in a zero-to-one transition under the minimum RF input voltage.

To solve this problem, an envelope detector and an averaging filter are used along with the comparator (Fig. 3.19) [46]. The time constant of the envelope detector should be small enough that it can follow the envelope of the signal and large enough to reduce the high-frequency ripples on the circuit. Furthermore, to reduce the high frequency ripples, and for the correct functionality of the comparator, a low-pass filter is used after the envelope detector so that it tracks only the average of the signal, not its envelope.

Since the diode-connected devices are connected to the 24GHz input, their capacitances will add up to the input capacitor which will lead to an increase in matching losses in the input network. So the size of the devices should be kept as small as possible. On the other hand, during a zero input signal, the capacitor is discharged through the reverse current of the diode-connected device, which will determine the minimum size device to rapidly discharge the capacitor. Also, for the averaging filter, we want the reverse current to be as small as possible, to reduce the ripple on the average signal. To accomplish this, the device size on the averaging filter must be much smaller than the envelope detector, but the capacitor size of the averaging part must be much larger than the envelope detector section.

The important parameters for the comparator are its sensitivity, speed, and power consumption. The sensitivity is the minimum voltage that the comparator can detect at its input. This value also depends on the input offset of the comparator. The speed of the comparator for the slew-limited case is given by



Figure 3.19: Schematic of the ASK Demodulator

$$t_{switching} = \frac{C\Delta V}{I_{bias}} \tag{3.18}$$

where C is the capacitance at the output of the comparator,  $I_{bias}$  is the bias current of the second stage of the amplifier, and  $\Delta V$  is equal to half of the supply voltage, where the inverter switches its output. Therefore, the bias current of the comparator is proportional to its speed.

The notch width in the nominal case is set to 50ns, and the widths of the signal for the 0 and 1 are 50ns and 150ns, respectively. The measurements show that the notch width can be reduced to 35ns without any degradation in the performance. The modulation index of the system can vary from 0.6 to 1. The current consumption of the ASK demodulator is 110nA from the 0.7V supply.

#### 3.3.3 PPE Decoder

The PPE decoder consists of an integrator, reference generator, comparator, and a flip-flop (Fig. 3.20). The CLK of the system is provided by the falling edges of notches. The integrator consists of a constant current source which flows through a capacitor during the pulses



Figure 3.20: Schematic of the PPE Decoder

for signal "0" and "1". The output of the integrator is reset by the notch of the next data signal. The maximum output voltage of the integrator is determined by

$$\Delta V = \frac{I\Delta t}{C} \tag{3.19}$$

where  $\Delta t$  is the pulse width of "0" or "1" signals.

The reference generator is a voltage divider from the supply voltage built from a stack of diode-connected devices, which, to first order, leads to a reference voltage that is independent of temperature and process variations. In this case, the reference value should be designed so that  $\Delta V_0 < V_{ref} < \Delta V_1$  and  $\Delta V_1 - V_{ref} > V_{comparator}$  for all temperature and process corners. Here,  $V_{comparator}$  is the sensitivity of the comparator. The output of the comparator is sampled at the end of each data period to define the data at the output. The total current consumption of the integrator and comparator is 220nA from the 0.7 V supply.
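The decision rule of the PPE decoder can be sketched in Python using Equation 3.19 and the nominal 50 ns and 150 ns pulse widths. The integrator current, capacitor, and reference voltage below are assumed values chosen only so that the swings fit under a 0.7 V supply; they are not the design values. The function names are hypothetical.

```python
def integrator_swing(i_bias, pulse_width, cap):
    """Integrator output at the end of a pulse, Equation 3.19."""
    return i_bias * pulse_width / cap

def decode_pulse(pulse_width, i_bias, cap, v_ref):
    """Return 1 if the integrated pulse exceeds the reference, else 0."""
    return 1 if integrator_swing(i_bias, pulse_width, cap) > v_ref else 0

# Assumed component values for illustration only.
I_BIAS = 100e-9   # A
CAP = 30e-15      # F
V_REF = 0.33      # V, placed between the two swings

dv0 = integrator_swing(I_BIAS, 50e-9, CAP)    # short pulse, data "0"
dv1 = integrator_swing(I_BIAS, 150e-9, CAP)   # long pulse, data "1"
print(f"dV0 = {dv0:.3f} V, dV1 = {dv1:.3f} V, V_ref = {V_REF} V")
print(decode_pulse(50e-9, I_BIAS, CAP, V_REF), decode_pulse(150e-9, I_BIAS, CAP, V_REF))
```

Because the two pulse widths differ by a factor of three, the two swings are well separated, which gives the reference generator and the comparator offset a comfortable margin in this example.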



Figure 3.21: Command signals from reader to tag

## 3.3.4 State Machine

After decoding the PPE signal, the state machine determines the operation mode and other parameters (e.g., pulse width for the 60GHz transmitter) based on the data received from the reader. In this system there are two operation modes: ID pulse and data transmission. The ID pulse mode is used in the initial multi-access algorithm to find out the number of tags in the system as well as assigning the random ID number to the tags. This will be described later. In the data transmission mode, the tag compares the notch count with the tag ID number and activates the ring oscillator one slot before the actual position. This single slot is used for calibration of the ring oscillator. After the calibration sequence is complete, the tag sends the local ring oscillator's frequency as well as its data with the 3-pulse M-PPM scheme.

The command sequence from the reader consists of 5 data bits followed by a notch. The first bit is always zero to activate the state machine. Bits 2 and 3 distinguish between the operation modes. With a "01" sequence, the state machine generates a 3 ns-5 ns pulse and sends it to the 60GHz transmitter to generate the ID pulse. If the signal is "10", the output of the ASK demodulator is connected to the notch detector to send the M-PPM pulses to the 60GHz transmitter. The  $4^{th}$  bit activates the pulse width signal. The last bit is always "1" to reset the system. The command signals are shown in Fig. 3.21. After the command sequence, a series of periodic notches is sent to act as beacons for calibration of the local radio clocks.
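The command format described above can be expressed as a small parser. The Python sketch below is a behavioral illustration, not the on-chip state machine; the mode names, the returned dictionary, and the rejection of the unspecified "00" and "11" mode codes are assumptions made for the example.

```python
def parse_command(bits):
    """Parse a 5-bit reader command given as a string of '0'/'1' characters."""
    if len(bits) != 5 or any(b not in "01" for b in bits):
        raise ValueError("expected a string of five '0'/'1' characters")
    if bits[0] != "0" or bits[4] != "1":
        raise ValueError("frame must start with '0' and end with '1'")
    modes = {"01": "id_pulse", "10": "data_transmission"}
    mode = modes.get(bits[1:3])
    if mode is None:
        # The text does not define "00" or "11"; rejecting them is an assumption.
        raise ValueError("undefined operation mode")
    return {"mode": mode, "pulse_width_bit": int(bits[3])}

print(parse_command("00101"))   # ID pulse mode, pulse-width bit cleared
print(parse_command("01011"))   # data transmission mode, pulse-width bit set
```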

## 3.4 Timing, Synchronization, and Multi-Access

In conventional RFID systems, at the onset of interrogation, the reader has no information about the number of tags or their IDs. A similar situation exists in our application: the central node (reader) starts the communication protocol without any information on nearby radios or their tag numbers. This may be the case for many sensing or IoT applications. In this section we describe the design of the timing circuitry and the multi-access functionality of the radio. Before getting into the details, a brief overview is provided below.

There are multiple algorithms for accessing tags' IDs. Some of these algorithms are deterministic, as with the tree-based protocol, which is basically a binary search scheme [47] [48]. Other algorithms are stochastic, as with ALOHA [45]. Our proposed protocol is similar to the Framed Slotted ALOHA (FSA) [49]. In the first phase of the proposed multi-access algorithm, the reader identifies the number of tags by asking for transmission of a single identification pulse (ID pulse) in a random position. After that, the reader will broadcast a slot counter (Q) that shows the number of slots to the individual tags. Then, tags will choose a random slot between  $0 - 2^{Q-1}$ . The reader will monitor the slots and if a slot is idle or occupied by more than one signal, the slot counter will be modified accordingly.

After fixing the frame size (number of slots), the reader establishes timing slots by broadcasting a periodic notch signal (with nominal period of  $T_{slot}=500$  ns). Tags are assigned slots ( $T_{slot}$ ) randomly and communicate with a modified M-PPM algorithm. In the data transmission mode, the tag compares the notch count with the tag ID and activates the ring oscillator (RO) one slot before the actual position for calibration.

For communication and multi-access readout from thousands of transponders, a certain level of synchronization between tags as well as with the reader is required. However, the tag does not have any explicit timing reference (e.g. crystal reference) to generate an accurate clock. Therefore, this design incorporates feedback and calibration to combat drift and jitter. Two strategies are utilized. First, the local RO is activated one  $T_{slot}$  before the actual transmission and a counter-based calibration loop compares the local clock frequency with that of the notch period for a first order correction of the RO. Second, a modified M-PPM scheme is used to reduce dependency on absolute clock accuracy. In this scheme, for each symbol, three pulses are transmitted; the first two are used to communicate the local clock on the tag and the third represents data in a 6-bit M-PPM scheme. For this particular setting, the aggregate data rate amounts to 12Mbps, which could be divided evenly between an arbitrary number of tags (limited by the number of assigned addresses). The schematic of the timing section is shown in Fig. 3.22.
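The quoted aggregate rate follows directly from the slot timing. The sketch below assumes that one 6-bit symbol is sent per 500 ns slot, which is consistent with the 12 Mb/s figure. The 100-tag frame in the example is an assumption used only to illustrate the even split between tags, and the function names are hypothetical.

```python
def aggregate_rate_bps(bits_per_symbol, t_slot):
    """Aggregate uplink rate when one symbol is sent in every slot."""
    return bits_per_symbol / t_slot

def per_tag_rate_bps(bits_per_symbol, t_slot, num_tags):
    """Rate seen by each tag when the slots are shared evenly."""
    return aggregate_rate_bps(bits_per_symbol, t_slot) / num_tags

T_SLOT = 500e-9   # nominal notch period from the text
total = aggregate_rate_bps(6, T_SLOT)
per_tag = per_tag_rate_bps(6, T_SLOT, num_tags=100)   # 100 tags is an assumed example
print(f"aggregate: {total / 1e6:.1f} Mb/s, per tag: {per_tag / 1e3:.1f} kb/s")
```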

### 3.4.1 Multiaccess Algorithm

In RFID systems, at the beginning of the communication, the reader does not know the number of tags in its communication sector. Furthermore, the tags operate asynchronously and do not share a common reference. Therefore, to enable individual data transfer between each tag and the reader, there should be a multi-access algorithm to avoid data collision.

The first proposed step to reduce collisions is to utilize a directional reader, which probes the target volume sector by sector (Fig. 3.23). This is an important part of the proposed system, and by limiting the number of tags being interrogated simultaneously, one can significantly simplify the multi-access algorithm. Since downlink and uplink frequencies in our radio are in the mm-wave range (24GHz and 60GHz), and due to the resulting small wavelength, the



Figure 3.22: Schematic of the timing block

reader can use a phased-array architecture for electronic beamforming<sup>1</sup>. Even after dividing the space into smaller sectors, an anti-collision protocol is still required to address the radios that fall inside each sector. The protocol, however, then communicates with fewer radios and can therefore rely on more efficient algorithms.

The most common channel division in RFID systems is time division multiple access (TDMA). This is adopted as the multi-access algorithm of this system. The most common TDMA techniques are the binary search, ALOHA, slotted ALOHA, and the framed slotted ALOHA algorithms [45] [50] [51].

In the binary search algorithm, the tags are randomly separated into two subgroups, by being assigned a "0" or "1" by a one bit local random generator, until all tags are identified. The tags with the random "0" will transmit their IDs to the reader right away. If a collision occurs again, the collided tags are split again by selecting "0" or "1" (and hence the binary progression). The tags that select "1" will wait until all the tags with "0" are successfully identified by the reader. This procedure is repeated until there is no further collision. The disadvantage of this algorithm is the large number of iterations and commands between tags and the reader. This scheme cannot readily scale to a larger number of tags [52].
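The iteration count of the binary splitting procedure can be explored with a short Monte Carlo sketch. The Python model below is a simplification under stated assumptions: it counts one reader query per group that is asked to transmit (including groups that turn out to be empty), it starts with all tags transmitting together, and it ignores ID length and command overhead. The function name is hypothetical.

```python
import random

def binary_split_queries(num_tags, rng):
    """Count reader queries needed to isolate every tag by random binary splitting."""
    queries = 0
    identified = 0
    pending = [num_tags]          # sizes of waiting groups; the last one transmits next
    while pending:
        group = pending.pop()
        queries += 1
        if group == 0:
            continue              # idle query, nobody answered
        if group == 1:
            identified += 1       # exactly one tag answered, ID received
            continue
        # Collision: every tag in the group draws a random bit.
        zeros = sum(1 for _ in range(group) if rng.random() < 0.5)
        pending.append(group - zeros)   # tags that drew "1" wait
        pending.append(zeros)           # tags that drew "0" transmit first
    return queries, identified

rng = random.Random(1)
for n in (4, 16, 64):
    q, found = binary_split_queries(n, rng)
    print(f"{n} tags: {found} identified in {q} queries")
```

In this model the number of queries is always at least $2N-1$ for $N \geq 2$ tags, since every collision spawns two further queries, which illustrates why the command overhead grows quickly with the tag population.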

<sup>&</sup>lt;sup>1</sup>The design of millimeter-wave beamforming arrays needed for this system is described in later chapters.



Figure 3.23: Using electronic beamforming to communicate with radio tags sector by sector.

In the pure ALOHA (PA) system, a tag responds with its ID at a random time after being powered up by the reader. If a collision happens, the tag backs off for a random time window and retransmits its ID. This algorithm requires fewer reader-to-tag commands than the tree-based protocol and can adapt dynamically to a varying tag population. The disadvantage of this algorithm is its high probability of collision (partial or complete).

In the Slotted ALOHA (SA) algorithm, tags can only transmit their ID in predefined, synchronous time slots. If there is a collision, tags retransmit after a random delay. The period in which a collision can occur is only half of that in the pure ALOHA algorithm, so the maximum throughput is doubled.
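The factor of two can be checked against the classical textbook throughput expressions for Poisson-distributed traffic, $S = Ge^{-2G}$ for pure ALOHA and $S = Ge^{-G}$ for slotted ALOHA, where $G$ is the offered load per packet time. These expressions come from the standard ALOHA analysis rather than from this work, and the function names below are hypothetical.

```python
import math

def pure_aloha_throughput(g):
    """Classical pure ALOHA throughput: vulnerable period of two packet times."""
    return g * math.exp(-2.0 * g)

def slotted_aloha_throughput(g):
    """Classical slotted ALOHA throughput: vulnerable period of one slot."""
    return g * math.exp(-g)

s_pa_max = pure_aloha_throughput(0.5)      # maximum occurs at G = 0.5
s_sa_max = slotted_aloha_throughput(1.0)   # maximum occurs at G = 1
print(f"pure ALOHA peak:    {s_pa_max:.3f}")
print(f"slotted ALOHA peak: {s_sa_max:.3f}")
print(f"ratio:              {s_sa_max / s_pa_max:.2f}")
```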

In ALOHA or SA systems, if a tag has a high response rate, it will frequently collide with responses from other tags. To solve this problem, framed slotted ALOHA (FSA) systems force each tag to respond only once in a frame [50]. Figure 3.24 shows the difference between PA, SA, and FSA systems.

Here, the proposed protocol is similar to Framed Slotted Aloha. In the first phase of the multi-access algorithm, the reader identifies the number of tags by asking for transmission



Figure 3.24: ALOHA anti-collision algorithms (a) PA (b) SA (c) FSA

of a single identification pulse (ID pulse) in a random position. Since the pulses are narrow, the probability of collision is low and after a few iterations the reader determines the number of tags. After that, the reader will broadcast a slot counter (Q) that shows the number of slots to the tags. Then, tags will choose a slot randomly from  $0 - 2^{Q-1}$ . The reader will monitor the slots. If a slot is idle or occupied by more than one tag, the reader will decrease or increase the slot counter. After fixing the slot counter, each tag will communicate with the reader on the  $N^{th}$  slot where N is the number from their random generator.
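One frame of the slot-selection step can be sketched as a short simulation. The Python code below is an illustrative model only: each tag picks a slot uniformly at random, the reader classifies slots as idle, singly occupied, or collided, and a hypothetical adjustment rule doubles or halves the frame size. If the frame holds a power-of-two number of slots, doubling or halving corresponds to incrementing or decrementing the slot counter; both that mapping and the specific rule are assumptions, since the text does not specify the reader's update policy.

```python
import random
from collections import Counter

def run_frame(num_tags, num_slots, rng):
    """Let every tag pick one random slot; return (idle, single, collided) slot counts."""
    picks = Counter(rng.randrange(num_slots) for _ in range(num_tags))
    single = sum(1 for count in picks.values() if count == 1)
    collided = len(picks) - single
    idle = num_slots - len(picks)
    return idle, single, collided

def adjust_frame(num_slots, idle, collided):
    """Hypothetical rule: grow the frame on net collisions, shrink it on net idle slots."""
    if collided > idle:
        return num_slots * 2
    if idle > collided and num_slots > 1:
        return num_slots // 2
    return num_slots

rng = random.Random(0)
slots = 8
for frame in range(4):
    idle, single, collided = run_frame(num_tags=20, num_slots=slots, rng=rng)
    print(f"frame {frame}: {slots} slots, idle={idle}, single={single}, collided={collided}")
    slots = adjust_frame(slots, idle, collided)
```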

#### 3.4.2 Modified M-PPM Modulation

For reasons that were discussed in Chapter 2, a high bandwidth pulse-based modulation scheme is used to send data to the reader. Three kinds of pulse modulation are possible: pulse width modulation (PWM), pulse amplitude modulation (PAM), and pulse position modulation (PPM).

In PWM, the data is encoded in the width of the pulse sent to the reader. Since the bandwidth is then data dependent, the localization accuracy is also data dependent, which is unsuitable for many applications. Longer pulse durations also lead to larger total energy drawn from the supply.

Another option is PAM where the information is in the amplitude of the pulse. Here the BW is kept constant but there are two disadvantages. First, since the information is in the amplitude, it is more sensitive to noise and non-idealities in the channel. Second, to generate the pulses, we need a linear PA in the tag which will be power hungry.

In PPM, the data is sent by a pulse in one of $2^n$ positions, where n is the number of bits sent to the reader. This system is more immune to noise and also has a data-independent BW. For this project, PPM is chosen as the baseline modulation method.

In PPM, since the information is in the position of the pulses, an accurate CLK is required in both the tag and the reader. The clock in the tag is generated by a duty-cycled, free-running ring oscillator, which is not stable under process and temperature variations. Therefore, an algorithm is proposed in which the local tag clock frequency is also communicated to the reader with every transmitted symbol. To accomplish this, a modified M-PPM system is used in which each symbol consists of three pulses, only one of which encodes the actual data. The first two pulses are set one clock cycle apart to communicate the local reference clock to the reader. Data is encoded in the position of the third pulse. Figure 3.25 shows the M-PPM system for the conventional case as well as the proposed system, in which the clock frequency is communicated and the reader uses it to decode the data sequence.

#### 3.4.3 Notch Period or Duration of Time Slot

The selection of the notch period is a trade-off between the ripple on the power supply and the aggregate data rate of the transponder. The 24GHz notches disrupt power transfer to the radios and therefore increasing the notch period will reduce the drop on the supply voltage, but on the other hand it will also decrease the data rate. Therefore, in order to achieve the maximum data rate, the notch period is selected as the minimum value that would be tolerable for the power recovery circuits.

If the frequency of the ring oscillator for the M-PPM modulator is fixed  $(f_0)$ , then the number of bits transmitted in each time slot is given by

$$n = \log_2(T_{slot}f_0) \tag{3.20}$$

If we have N tags in the system, the data rate for each tag is



Figure 3.25: M-PPM system with (a) uncorrected fixed CLK frequency on the reader, and (b) modified version with 3 pulses and a variable corrected CLK frequency on the reader

$$Datarate_{tag} = \frac{\log_2(T_{slot}f_0)}{NT_{slot}} \tag{3.21}$$

where N is the number of tags in the system.
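Equations 3.20 and 3.21 can be evaluated for representative numbers. The sketch below assumes a 500ns slot, a 200MHz ring oscillator, and 128 tags, which are values quoted elsewhere in this chapter; rounding the bit count down to a whole number is an added assumption, since only an integer number of bits can be sent per slot.

```python
import math

def bits_per_slot(t_slot, f0):
    """Whole bits per slot from Eq. 3.20, n = log2(T_slot * f0), rounded down."""
    return math.floor(math.log2(t_slot * f0))

def data_rate_per_tag(t_slot, f0, num_tags):
    """Per-tag data rate from Eq. 3.21, using the whole number of bits."""
    return bits_per_slot(t_slot, f0) / (num_tags * t_slot)

t_slot = 500e-9   # slot duration in seconds
f0 = 200e6        # ring-oscillator frequency in Hz
num_tags = 128    # tag population sharing the frame

n = bits_per_slot(t_slot, f0)
print(f"bits per slot: {n}")
print(f"per-tag rate: {data_rate_per_tag(t_slot, f0, num_tags) / 1e3:.2f} kb/s")
print(f"aggregate rate: {n / t_slot / 1e6:.1f} Mb/s")
```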

As previously mentioned, increasing the notch period will reduce the data rate. For this design, the width of the notch signal is limited by the speed of the demodulator circuit. If we define D as the ratio of the whole time slot to the notch width, so that the active part of the slot is a fraction $1 - 1/D$ of the whole slot, then the average reduction in the received input power of the circuit is

$$P_{reduced} = 10\log(1 - \frac{1}{D}) \tag{3.22}$$

To limit the reduction in power to less than 0.5dB, the time slot should be at least 10 times longer than the notch width. In this design the notch period can be set between 350ns and 500ns, based on the required data rate and modulation index.
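The 0.5dB figure follows from Eq. 3.22 when D is read as the ratio of the slot duration to the notch width. The sketch below evaluates the equation for a few combinations of the slot and notch durations reported for this design; the function name is hypothetical.

```python
import math

def power_reduction_db(slot_to_notch_ratio):
    """Average received-power change in dB from Eq. 3.22 for D = T_slot / T_notch."""
    return 10.0 * math.log10(1.0 - 1.0 / slot_to_notch_ratio)

for t_slot_ns, t_notch_ns in [(500, 50), (350, 35), (350, 50)]:
    d = t_slot_ns / t_notch_ns
    print(f"T_slot={t_slot_ns} ns, notch={t_notch_ns} ns, D={d:.1f}: "
          f"{power_reduction_db(d):.2f} dB")
```

With D = 10 the loss is slightly below 0.5dB, while pairing the shortest slot with the widest notch exceeds that budget.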

#### 3.4.4 Ring Oscillator Frequency

The next design parameter is the frequency of the local ring oscillator (RO). Increasing the RO frequency increases the number of available PPM positions (M) in each time slot, leading to an increase in the data rate. However, practical constraints limit the frequency of the RO. The main limitations are the timing error caused by ring-oscillator jitter, the power consumption, and the start-up time of the 60GHz pulser.

#### 3.4.4.1 Accumulated Jitter

One of the trade-offs in the choice of the RO frequency is the accumulated jitter of the ring. Since M-PPM is used to transmit the data to the reader, the worst-case jitter is the jitter accumulated after M cycles, where $M = 2^n$ and n is the number of bits sent in one time slot. The accumulated jitter after M cycles is

$$\sigma_{T_{accumulated}}^2 = M \sigma_T^2 \tag{3.23}$$

where $\sigma_T^2$ is the variance of the period jitter of the ring oscillator. For a current-starved NAND-based ring oscillator with K stages, the variance of the period jitter is

$$\sigma_T^2 = K(\sigma_{t_{dN}}^2 + \sigma_{t_{dP}}^2) \tag{3.24}$$

where  $\sigma_{t_{dN}}^2$  and  $\sigma_{t_{dP}}^2$  are the variance of the delay of pulldown and pull-up parts which can be calculated as [53]

$$\sigma_{t_{dN}}^2 = \frac{S_{in} t_{dN}}{2I_N^2} + \frac{kTC}{I_N^2} \tag{3.25}$$

and

$$\sigma_{t_{dP}}^2 = \frac{S_{ip} t_{dP}}{2I_P^2} + \frac{kTC}{I_P^2} \tag{3.26}$$

Here $t_{dN}$ and $t_{dP}$ are the delays of the pull-down and pull-up circuits, $I_N$ and $I_P$ are the currents of the NMOS and PMOS current-starved devices, and $S_{in}$ and $S_{ip}$ are the drain current noise spectra of the NMOS and PMOS devices. Since the current is set so that the devices are in the sub-threshold region, the noise spectra of the NMOS and PMOS devices are

$$S_{in} = 2qI_N \tag{3.27}$$

and

$$S_{ip} = 2qI_P \tag{3.28}$$

Substituting $t_d = \frac{CV_{DD}}{2I}$ into equations 3.25 and 3.26, the accumulated jitter variance is proportional to M, to the number of stages in the ring oscillator (K), and to the rising and falling delay $(t_d)$, and is inversely proportional to I. If we assume that $I_P$ and $I_N$ are equal, the accumulated jitter based on the equations above is

$$\sigma_{T_{accumulated}}^2 = \frac{T_{slot}}{I} \left(q + \frac{2kT}{V_{DD}}\right) \tag{3.29}$$

where $T_{slot}$ is the notch period $(T_{slot} = \frac{M}{f_0})$ and $f_0 = \frac{1}{K(t_{dN}+t_{dP})}$ is the frequency of the ring oscillator. From this relationship we see that, for a constant M, increasing the notch period requires higher power consumption to keep the same accumulated jitter. Likewise, if the frequency is constant, increasing the number of bits transmitted in each time slot requires higher power consumption to keep the same accumulated jitter.

To achieve a bit error rate (BER) of $10^{-3}$, $3\sigma_{T_{accumulated}}$ should be less than half of one RO clock period $(\frac{1}{2f_0})$, which is the time allotted to each PPM position. Increasing $f_0$ leaves less tolerance to jitter, so for a fixed $T_{slot}$ more power is needed to keep the error in the acceptable range.
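Equation 3.29 and the $3\sigma$ criterion can be evaluated numerically. The sketch below assumes a 500ns slot, a 500nA oscillator current, and a 0.7V supply, which are values quoted in this chapter, together with a temperature of 300K, which is an assumption. It evaluates only the first-order expression and is not meant to reproduce the full design budget.

```python
import math

Q_E = 1.602176634e-19   # electron charge, C
K_B = 1.380649e-23      # Boltzmann constant, J/K

def accumulated_jitter_rms(t_slot, current, vdd, temp_k=300.0):
    """RMS accumulated jitter over one slot from Eq. 3.29."""
    variance = (t_slot / current) * (Q_E + 2.0 * K_B * temp_k / vdd)
    return math.sqrt(variance)

def meets_jitter_budget(t_slot, current, vdd, f0, temp_k=300.0):
    """Check the 3-sigma criterion: 3 * sigma < 1 / (2 * f0)."""
    sigma = accumulated_jitter_rms(t_slot, current, vdd, temp_k)
    return 3.0 * sigma < 1.0 / (2.0 * f0)

sigma = accumulated_jitter_rms(t_slot=500e-9, current=500e-9, vdd=0.7)
print(f"sigma_accumulated = {sigma * 1e12:.0f} ps")
print("200 MHz within budget:", meets_jitter_budget(500e-9, 500e-9, 0.7, 200e6))
```

Because the variance scales as $T_{slot}/I$, quadrupling the current halves the RMS jitter, which is the power versus jitter trade-off described above.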

#### 3.4.4.2 Power Consumption

Increasing the RO frequency makes the PPM modulator and calibration circuits work at a higher frequency, which increases the dynamic power consumption of these circuits $(CV^2f_0)$. The RO is part of the duty-cycled section of the circuit that activates during this startup period, and therefore the total energy used by the circuit is important in calculating the size of the storage capacitor. The duty-cycled energy is provided only by the capacitor, and the capacitor size determines the voltage ripple. This was explained in Chapter 2.

#### 3.4.4.3 Start up Time of 60GHz Pulser

The pulser circuit has a digital block to generate the control sequence based on the input M-PPM command to generate the 60GHz pulses. The sum of the delays on the control block and the start up time of the oscillator and the 60GHz pulse width should be smaller than the period of the RO. This therefore places another limit on the upper frequency of oscillation.

#### 3.4.5 Notch Detector

In the notch detector block, the tag counts the number of input notches (indicating the time slot number) and compares it with the assigned number N (e.g., from its random generator). If the slot number is equal to N-1, one slot before the actual transmission, the tag activates the local clock generator (RO) and the calibration block to run a first-order calibration for process and temperature variations. Then, in slot N, the M-PPM modulator block is activated. After that, the tag deactivates the timing blocks again until the next time frame.
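The sequencing described above can be written as a short behavioral model. This is a sketch only: the action labels are hypothetical names rather than signal names from the chip, and counting slots from 1 is an assumption.

```python
def tag_actions(assigned_slot, num_notches):
    """Return the action a tag takes after each notch, for a tag assigned slot N.

    Slots are counted from 1 as notches arrive (an assumption of this sketch).
    """
    actions = []
    for slot in range(1, num_notches + 1):
        if slot == assigned_slot - 1:
            actions.append((slot, "enable RO and calibration"))
        elif slot == assigned_slot:
            actions.append((slot, "enable M-PPM modulator"))
        else:
            actions.append((slot, "idle"))
    return actions

for slot, action in tag_actions(assigned_slot=4, num_notches=6):
    print(slot, action)
```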

Here, the notch detector is a 7-bit synchronous binary counter using JK flip-flops [54].



Figure 3.26: (a) Block diagram of the notch detector (b) schematic of the synchronous binary counter (c) schematic of the JK flip-flop

The CLK of the counter comes from the incoming notches and the reset signal comes from the demodulator state machine. The schematics of the counter and the JK flip-flop are shown in Fig. 3.26. The notch detector can detect up to 128 tags in each time frame and its speed is 2 MHz. The power consumption of the notch detector is 220nW from the 0.7V supply.

#### 3.4.6 Ring Oscillator and Calibration Block

As previously described, the clock of the system is generated by an oscillator once the notch detector activates the enable signal. This part of the circuit is duty cycled and is ON for $1\mu$s out of every $64\mu$s. First-order calculations of the total allowed energy of the startup circuits were performed in Chapter 2. Here, we refine the calculations based on the actual circuits and their power consumption. In order to keep the ripple on the supply due to the RO and calibration blocks below 20mV (less than 3%), the energy consumption of this part is calculated to be less than 14pJ, which dictates an average current of $20\mu$A for these blocks. The calculations are given by

$$E_{consumed} = 0.5C_{reg}[V_{DD}^2 - (V_{DD} - V_{ripple})^2] \tag{3.30}$$

and

$$I = \frac{E_{consumed}}{t_{on}V_{DD}} \tag{3.31}$$

where  $C_{reg}$  is the supply capacitor at the output of regulator.
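Equations 3.30 and 3.31 can be checked numerically. The value of $C_{reg}$ is not stated in this section, so the sketch below assumes 1nF, chosen because it gives an energy budget close to the 14pJ quoted above. The 0.7V supply, 20mV ripple, and $1\mu$s on-time are taken from the text.

```python
def energy_budget(c_reg, vdd, v_ripple):
    """Energy that can be drawn from C_reg for a given supply droop (Eq. 3.30)."""
    return 0.5 * c_reg * (vdd ** 2 - (vdd - v_ripple) ** 2)

def average_current(energy, t_on, vdd):
    """Average current over the on-time (Eq. 3.31)."""
    return energy / (t_on * vdd)

C_REG = 1e-9   # assumed value in farads; not stated in this section
e = energy_budget(C_REG, vdd=0.7, v_ripple=20e-3)
i = average_current(e, t_on=1e-6, vdd=0.7)
print(f"energy budget: {e * 1e12:.1f} pJ")
print(f"average current: {i * 1e6:.1f} uA")
```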

A NAND-based, three-stage, current-starved ring oscillator has been designed to meet the fast start-up time requirements (Fig. 3.27(a)). Based on the overall budget and the power assigned to the counters, if we assume that the current of the ring oscillator is limited to 500nA, then according to equation 3.29 the frequency should be set below 230MHz. We chose the nominal operation frequency to be 200MHz. In this design, because of the current-starved transistors, the swing at the output of the RO is not rail-to-rail, and at some process corners or temperatures it may not cross the threshold of the following buffer. To solve this problem, an AC-coupled buffer is used after the RO, which sets the DC value to half of the supply voltage. This way the swing is always centered around the threshold voltage of the buffer, and the buffer output has a rail-to-rail swing that can be used as the clock (Fig. 3.27(a) and (b)).

Since the system is running open loop, the frequency can change by temperature and process variations. The minimum tolerable frequency is defined by the number of bits being transmitted in each time slot. To send 6 bits in each slot, the minimum frequency is 140 MHz. The maximum frequency is limited by the power consumption and also the start up time of 60GHz pulser, as was previously pointed out, which would set an upper bound of 250MHz. Therefore, the calibration block should modify and correct the current setting of the RO to keep its frequency in the range of 140MHz-250MHz.

A counter-based calibration, clocked by the output of the RO, is used to correct and calibrate the RO frequency. If the counter output is higher than a preset limit during the calibration cycle, the frequency is higher than expected and the current of the RO is reduced to decrease the frequency. The opposite action takes place if the counter output is lower than expected. Simulations of the RO output frequency for the FF and SS corners, before and after calibration, are shown in Fig. 3.28.
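The calibration loop can be sketched behaviorally as follows. The linear frequency models, the corner scaling factors, the 400ns counting window, and the integer current code are all hypothetical; only the 140MHz to 250MHz target range comes from the text.

```python
def calibrate(freq_of_code, start_code, window, count_min, count_max, max_steps=32):
    """Step a current-setting code until the cycle count falls inside the limits.

    freq_of_code: hypothetical model mapping a current code to RO frequency (Hz).
    window: counting window length in seconds.
    """
    code = start_code
    for _ in range(max_steps):
        count = int(freq_of_code(code) * window)   # whole cycles seen in the window
        if count > count_max:
            code -= 1        # too fast: reduce the RO current
        elif count < count_min:
            code += 1        # too slow: increase the RO current
        else:
            break
    return code, freq_of_code(code)

# Assumed linear models: 12.5 MHz per code step, scaled per process corner.
def fast_corner(code):
    return 1.4 * 12.5e6 * code

def slow_corner(code):
    return 0.6 * 12.5e6 * code

window = 400e-9                      # assumed counting window
count_min = round(140e6 * window)    # cycles at the lower frequency limit
count_max = round(250e6 * window)    # cycles at the upper frequency limit
for name, model in [("FF", fast_corner), ("SS", slow_corner)]:
    code, freq = calibrate(model, start_code=16, window=window,
                           count_min=count_min, count_max=count_max)
    print(f"{name}: code={code}, f={freq / 1e6:.1f} MHz")
```

In this model the fast corner ends with a lower current code than the starting value and the slow corner with a higher one, and both settle inside the target range.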

Another important factor in design of current starved RO and its bias circuitry is the frequency variation as a function of ripple on the supply, since the calibration block cannot account for this. Frequency variation as a function of ripple on the supply can cause an error



Figure 3.27: (a) The schematic of the ring oscillator and the calibration block, (b) simulation result of the ring oscillator for FF process corner



Figure 3.28: Simulation of the ring oscillator output frequency before and after calibration (a) FF corner (b) SS corner

in the transmitted data. The worst case frequency variation can be found as


$$\sum_{i=1}^{2^n} \left(\frac{1}{f - \frac{i\Delta f}{2^n}} - \frac{1}{f}\right) = \frac{1}{2f} \tag{3.32}$$

In our case, with f=200 MHz and with 10mV supply ripple during 500ns of data transmission, the maximum frequency variation should be less than 3MHz. To make the bias independent of the supply voltage to first order, a Widlar current source with long-channel transistors has been used. The simulated maximum frequency variation is 1.56MHz.
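The 3MHz bound can be recovered by solving Eq. 3.32 for $\Delta f$. The sketch below does this by bisection for f = 200MHz and n = 6 bits; the function names and the solver tolerance are choices made for this illustration.

```python
def timing_error_sum(f, delta_f, n_bits):
    """Left-hand side of Eq. 3.32 for nominal frequency f and total drift delta_f."""
    m = 2 ** n_bits
    return sum(1.0 / (f - i * delta_f / m) - 1.0 / f for i in range(1, m + 1))

def max_frequency_drift(f, n_bits, tol=1.0):
    """Largest delta_f (Hz) satisfying Eq. 3.32, found by bisection."""
    lo, hi = 0.0, 0.5 * f
    target = 1.0 / (2.0 * f)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if timing_error_sum(f, mid, n_bits) < target:
            lo = mid
        else:
            hi = mid
    return lo

drift = max_frequency_drift(f=200e6, n_bits=6)
print(f"maximum drift: {drift / 1e6:.2f} MHz")
```

To first order the sum reduces to $\frac{(2^n+1)\Delta f}{2f^2}$, so the bound is approximately $f/(2^n+1)$, which is about 3MHz here.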

#### 3.4.7 M-PPM Modulator

The M-PPM block turns on in the second half of the $1\mu$s duty-cycle period, when the output of the notch detector is equal to N and the calibration block has been turned off. The output of the M-PPM block consists of three pulses. The first two have a constant spacing (one CLK period) and are used to send the reference clock to the reader, and the position of the third pulse encodes the output data.

The block diagram of the M-PPM modulator is shown in Fig. 3.29. It consists of a 7-bit counter, digital comparator blocks, and a 7-bit adder. The clock of the counter comes from the output of the ring oscillator. The first and second pulses are generated when the counter output is equal to 1 or 3. The last pulse is generated when the counter output is equal to data+4 and is used for sending 6 bits of information to the reader. When all three pulses are sent, the enable is deactivated and the M-PPM modulator and ring oscillator go to idle mode until the next time frame. The power consumption of this part is the same as that of the calibration block.
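The way the reader recovers data without knowing the tag clock can be sketched as follows. The counter values 1, 3, and data+4 are taken from the description above; the decoding arithmetic on the reader side is an assumption of this sketch, not a description of the actual reader.

```python
PULSE1_COUNT = 1   # counter value of the first reference pulse
PULSE2_COUNT = 3   # counter value of the second reference pulse
DATA_OFFSET = 4    # the data pulse fires at counter value data + 4

def encode_symbol(data, clk_period):
    """Return the three pulse times (s) for a 6-bit value at the tag's clock period."""
    if not 0 <= data < 64:
        raise ValueError("data must fit in 6 bits")
    return (PULSE1_COUNT * clk_period,
            PULSE2_COUNT * clk_period,
            (data + DATA_OFFSET) * clk_period)

def decode_symbol(t1, t2, t3):
    """Recover the data from pulse spacings only, without the tag clock frequency."""
    est_period = (t2 - t1) / (PULSE2_COUNT - PULSE1_COUNT)
    count = round((t3 - t1) / est_period) + PULSE1_COUNT
    return count - DATA_OFFSET

# The same data decodes correctly across the calibrated RO frequency range.
for f_ro in (140e6, 200e6, 250e6):
    pulses = encode_symbol(data=37, clk_period=1.0 / f_ro)
    print(f"{f_ro / 1e6:.0f} MHz ->", decode_symbol(*pulses))
```

Because the reader measures the third pulse relative to the spacing of the first two, a static offset in the tag clock frequency cancels out of the decoded value.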



Figure 3.29: Block diagram of the M-PPM modulator

## 3.5 60GHz Transmitter

The TX is a highly duty-cycled, pulse-based system operating at 60GHz. It consists of a digital control block, a 60GHz oscillator, and a 60GHz output buffer. The 60GHz TX optimizes the total consumed energy. To that end, the overhead time to start and shut down the oscillator is reduced and the TX core is designed for energy efficiency. Figure 3.30 shows the block diagram and schematic of the TX blocks.

The digital control block generates short 400ps-800ps pulses at the rising edge of the 3.5ns-6.5ns incoming command pulses from the PPM modulator. The pulse width can be wirelessly programmed by changing the capacitor bank $C_x$. The block includes a level converter to change the signal level of the pulses from 0.7V to the unregulated voltage (0.9V and higher). Finally, this block generates the offset pulse sequences that control the start-up time of the 60GHz oscillator (Fig. 3.30). The nominal offset time is 70ps for a 0.9V supply voltage.

For the 60GHz pulser, a cross-coupled oscillator with current switching has been used for faster start-up time [55]. A cross-coupled topology was selected for the core to achieve a low-power and robust design. The start-up time of the oscillator is a function of the quality factor of the resonator and the initial condition, and is inversely proportional to the frequency of operation and the open loop gain. The overall start-up time of the oscillator can be reduced by forcing a more suitable initial condition. The proposed fast-starting oscillator uses an offset pulse sequence for the tail current to introduce asymmetry in the core, which generates an initial voltage across the capacitor $C_s$. The value of $C_s$ is a trade-off between the amplitude of the initial condition and the $g_m$ degeneration. A smaller value of $C_s$ increases the initial condition but reduces the open loop gain. At 60GHz, a 200fF capacitor presents an impedance of $13\Omega$. Extra assist switches are used for an abrupt and stable ON to OFF transition. A two-turn inductor with an inductance of 180pH and a quality factor of 12 is used at the output. The oscillator is AC coupled to the 60GHz buffer stage. To increase the operation range of the transmitter, the oscillator uses the unregulated supply, which is at a higher voltage than the LDO output. The oscillation frequency versus unregulated supply voltage is shown in Fig. 3.31. The oscillation frequency remains in the 57.9-60.25 GHz range for supply variations of 0.8V to 1.8V. The frequency decreases as the supply voltage increases, since the $g_m$ of the cross-coupled transistors increases and the tank moves its resonance frequency to a lower value to compensate for the negative resistance of the cross-coupled pair.
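The $13\Omega$ figure is the magnitude of the capacitor impedance, $1/(2\pi f C_s)$. The sketch below evaluates it for the 200fF value from the text and for two other capacitances that are included only for comparison.

```python
import math

def capacitor_reactance(c_farads, f_hz):
    """Magnitude of a capacitor's impedance, 1 / (2*pi*f*C), in ohms."""
    return 1.0 / (2.0 * math.pi * f_hz * c_farads)

for c_ff in (100, 200, 400):   # 200 fF is the design value; the others are for comparison
    z = capacitor_reactance(c_ff * 1e-15, 60e9)
    print(f"C_s = {c_ff} fF -> |Z| = {z:.1f} ohm")
```

Halving $C_s$ doubles the degeneration impedance, which is the gain penalty traded against the larger initial condition.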

A current-switched 60GHz buffer provides the interface to the antenna. Cascode devices are used for their higher Maximum Stable Gain (MSG) and unconditional stability in this frequency range. The cascode devices are sized larger than the input devices to keep the input transistors in saturation at the minimum supply voltage. The optimum output impedance of the buffer is calculated to be $500\Omega$ in parallel with $j300\Omega$ by a



Figure 3.30: Block diagram and schematic of the TX blocks



Figure 3.31: Oscillation frequency as a function of supply voltage

load-pull analysis to maximize the output power. A two-to-one transformer is used to match this impedance to the 60GHz output antenna. The matching network presents 3.3dB of insertion loss. The  $S_{11}$  of the output matching stays better than -10dB for 58 to 62GHz. Peak output power of the buffer is -3dBm for a 0.9V supply voltage. Figure 3.32 shows the output pulses of 60GHz pulser.

## 3.6 On-Chip Antenna Design

The goal of this project is to integrate the entire radio on a single chip, and therefore two on-chip antennas at 24GHz and 60GHz are integrated. On-chip antennas are known for their low radiation efficiency, primarily due to the power loss in the low resistivity silicon substrate. The excitation of TM/TE surface wave modes in the lossy silicon substrate significantly degrades the antenna performance [56] [57]. The energy coupled into surface waves can radiate out of the backside, diffract from the chip edges (causing sidelobes in the radiation pattern), or be resistively lost as heat within the conductive p-type doped Si substrate. In [58], it was shown that the surface wave losses can be mitigated by employing a thinner substrate, shrinking the chip size, or properly choosing the antenna type and its excitation. A dielectric lens (with the same permittivity as the silicon) attached to the backside of the chip can mitigate the excitation of the surface modes in the lossy silicon [59]. Alternatively, introducing a superstrate material on top of the chip could enhance the efficiency [60]. This is due to the increased amount of radiation into the top side

Thanks to Mustafa Rangwala and Nemat Dolatsha for assistance in the simulation and design of the on-chip antennas.



Figure 3.32: Output pulse of the 60GHz pulser

rather than the lossy silicon substrate (in the case of infinite substrates this amount of radiation is proportional to $(\epsilon_{top}/\epsilon_{Si})^{3/2}$). However, adding extra substrate layers adds cost and complexity and requires additional processing steps. Another approach is to use on-chip wirebond antennas, which offer relatively higher efficiency. The wirebonds could run from one side of the chip to the other. Still, this comes with a penalty in terms of added cost and complexity. Our aim is to eliminate any extra processing steps and to reduce the cost and size of the proposed radio. In our design, we have chosen on-chip integrated antennas due to their lower cost, and we have tried to address the typical challenges of the on-chip antenna. In the following sections, particular challenges and concerns regarding the 24GHz and 60GHz antenna designs are presented and addressed.

#### 3.6.1 24GHz Folded Dipole Antenna

The main constraints in the design of the 24GHz on-chip antenna are the size, the efficiency, and the high input impedance (required to generate a larger voltage swing and so achieve high efficiency in the AC-DC circuit). To this aim, resonant-type antennas are the primary choice due to their potentially smaller sizes. Although on-chip slot dipole antennas integrated on relatively thin chip substrates demonstrate slightly higher radiation efficiency than electric dipoles, they require a large area due to their ground plane. In the case of a thicker grounded substrate, the slot dipole suffers from lower efficiency due to the propagation of strong TM surface waves along the chip. A folded electric dipole is used because it has a higher input impedance than an electric dipole (while sharing the same loss and efficiency mechanisms). It should be noted that the design of the 60GHz antenna is less challenging because of its smaller dimensions due to the shorter wavelength. A simple electric dipole can fulfill the requirements.

The potential challenges in this design are substrate and metal conductive losses, the large size of the 24GHz antenna, mutual coupling between the antennas and the surrounding metal pieces, and the size and location of the on-chip storage capacitor. Here, the substrate loss due to surface wave modes propagating on the silicon substrate is strongly mitigated by the thin 120$\mu$m chip thickness and the reduced chip size [58]. As the current distribution along a folded dipole is most pronounced around the feed point in the center and gradually vanishes at both dipole ends, the current distribution at the ends of the arms has a negligible effect on the radiation gain and pattern. Therefore, both ends of the dipole can be bent into the chip to shrink the total chip size without degrading the antenna performance. Simulations confirm that for bend sections of up to 1000$\mu$m, there is no significant drop in the antenna gain. In order to reduce the conductive losses, the antenna traces are made wide (200$\mu$m) and are realized in the top metal layer, which offers a relatively thick metal layer (~1$\mu$m).

Another challenge for the 24GHz antenna is the placement of the large storage capacitors and the 60GHz antenna, both of which could potentially disrupt the current distribution and present significant mutual coupling effects. Metal sections in the proximity of the folded dipole simply resemble a director in Yagi-like antennas. The mutual coupling between the folded dipole and these metal sections can degrade the gain of the antenna in the broadside direction. In addition, this mutual coupling lowers the input impedance of the folded dipole. We have investigated different locations and architectures for the capacitors (e.g., in the space inside the folded dipole at the center or on both ends). For example, simulations show that long capacitor sections along the folded dipole significantly degrade the 24GHz antenna broadside gain. Such sections are close to resonance, so a large current is induced on the capacitor structures. The optimum location and shape of the capacitors is seen in Fig. 3.33. The final dimensions of the active circuit and the capacitors are 0.2mm by 0.5mm and 0.2mm by 0.635mm, respectively. After optimization, adding the active circuit model and capacitors lowers the resonance impedance of the folded dipole antenna by $\sim 7\Omega$. This also leads to an efficiency drop that is partially due to the extra conductive loss of the current induced on the storage capacitors and the 60GHz dipole. Another loss mechanism is that the capacitors and the 60GHz antenna direct a larger share of the power towards the sides of the chip, where it incurs larger substrate loss. We have also investigated the effect of the various metal filling squares needed for chip density requirements (seen in Fig. 3.33 in the area between the folded dipole lines). These small metal pieces are unconnected, with no large current induced on them, and if they are placed far enough from the dipole lines to minimize direct coupling, their overall effect can be minimized. Simulations confirm that their influence on antenna performance is negligible.

Figure 3.33 demonstrates the proposed architecture of the chip. It includes a half-wavelength folded electric dipole, metal plates mimicking the active circuit and storage capacitors, and the 60 GHz half-wavelength electric dipole, all integrated on the chip. The effective wavelength (which is seen by the average permittivity of air and the silicon dioxide) at 24 GHz is  $\sim$ 8mm.

#### 3.6.2 Half-wave 60GHz Dipole Antenna

The 60GHz antenna is less challenging because of its smaller size due to the shorter wavelength at 60GHz. The initial length of the dipole is chosen as $\lambda_{eff}/2 = 1.6$ mm (where $\lambda_{eff}$ is the effective wavelength seen by the average permittivity of air and silicon dioxide). Simulations show that in the absence of the folded dipole and the storage capacitors, its radiation pattern is oriented more towards the -X direction and the efficiency can approach 50%, which is due to the high permittivity of the Si substrate. However, in the presence of the folded dipole and the storage capacitors, the radiation pattern becomes more directive in the opposite direction. Here, the folded dipole acts as a big reflector for the 60GHz dipole. The overall efficiency is close to 40% in that case.
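The effective wavelengths quoted for both antennas can be reproduced with a short calculation. The sketch below assumes a relative permittivity of 3.9 for silicon dioxide, a commonly used value that is not stated in the text, and averages it with that of air.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum, m/s
EPS_SIO2 = 3.9       # assumed relative permittivity of silicon dioxide

def effective_wavelength(f_hz, eps_a=1.0, eps_b=EPS_SIO2):
    """Wavelength in a medium whose permittivity is the average of two materials."""
    eps_eff = 0.5 * (eps_a + eps_b)
    return C0 / (f_hz * math.sqrt(eps_eff))

for f in (24e9, 60e9):
    lam_mm = effective_wavelength(f) * 1e3
    print(f"{f / 1e9:.0f} GHz: lambda_eff = {lam_mm:.2f} mm, "
          f"half-wave = {lam_mm / 2:.2f} mm")
```

Under this assumption the result is close to 8mm at 24GHz and to a 1.6mm half-wave length at 60GHz, consistent with the dimensions given above.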



Figure 3.33: Overview of the final chip architecture



Figure 3.34: Overall simulated pattern and gain for the 24 GHz and 60 GHz chip antennas

Figure 3.34 shows the overall simulated pattern and gain for the 24GHz and 60GHz chip antennas when the chip is placed on a 1mm substrate (RO5880 selected here as an example). The overall efficiency of the 24GHz antenna is 28% with an antenna impedance of ~60$\Omega$. The 60GHz antenna has an efficiency of 44% and an input impedance of ~165$\Omega$.

## Chapter 4

## **Experimental Results**

Two versions of the transponder were fabricated in a 65nm CMOS process: one with access for directly probing the mm-wave input and output, and a second, pad-less version with on-chip antennas. The direct-probing version characterizes the chip prior to the antenna and the wireless testing. We use these initial measurements to characterize the front-end performance, power recovery efficiency, transmitter frequency and pulse shape, and chip efficiency.

The technology is a 65nm digital CMOS process with no extra RF options. The two antennas take a large portion of the chip area and care was taken to minimize density rule violations despite the large antenna modules. Dummy filaments were inserted throughout the chip, including under both antennas and around circuit elements. Some of these filaments are visible inside the larger (24GHz) antenna element.

The die photo of the pad-less version is shown in Fig.4.1. The chip occupies a footprint of 3.7mm by 1.2mm including both on-chip antennas at 24GHz and 60GHz. As previously discussed, this chip does not have any RF or DC pads and all connections, both for signal and power, are wireless and through the two antennas.

## 4.1 Power Recovery and Demodulator Measurements

The downlink power measurements use a Rohde & Schwarz SMF100A (100kHz-43.5GHz) RF signal generator to deliver the 24GHz input signal and ultra-low-leakage voltmeters to measure the DC signal at the output of the rectifier and regulator. Losses in the path are carefully de-embedded to provide an accurate reading of the power delivered to the chip at the GSG pads. A power sweep is performed to characterize the power recovery efficiency. The output DC voltages are carefully monitored during this power sweep. At the point where the power-on-reset (POR) is turned ON, the circuit is activated and loads the rectifier, so there is a slight decrease in the output voltage of the rectifier. The measurement shows that the POR is turned ON when the output voltage of the rectifier is around 680mV, which



Figure 4.1: Chip micrograph of the pad-less transponder.

matches the expected value from simulations.

Fig. 4.2 shows the measured sensitivity and the low-dropout regulator (LDO) range. The measurement shows that the chip requires -10.5dBm input power to generate 0.9V of unregulated voltage at the output of the rectifier stages. For a 40dBm EIRP 24GHz transmitter and with 0dB peak broadside gain at the receiver antenna, this -10.5dBm power sensitivity translates to a 32cm range. Our actual circuit operates with voltages slightly lower than 0.9V and this is not the absolute lower bound. Also, the wireless version has a different source impedance than the  $50\Omega$  here, and efficiency is slightly better.
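The range estimate can be checked with the free-space (Friis) path loss model, using the EIRP, receive antenna gain, and sensitivity quoted above. The sketch below assumes ideal free-space propagation with no polarization or matching losses; the function name is hypothetical.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum, m/s

def free_space_range(eirp_dbm, rx_gain_db, sensitivity_dbm, f_hz):
    """Distance (m) at which the free-space received power equals the sensitivity."""
    allowed_loss_db = eirp_dbm + rx_gain_db - sensitivity_dbm
    wavelength = C0 / f_hz
    return wavelength / (4.0 * math.pi) * 10.0 ** (allowed_loss_db / 20.0)

d = free_space_range(eirp_dbm=40.0, rx_gain_db=0.0, sensitivity_dbm=-10.5, f_hz=24e9)
print(f"free-space range: {d * 100:.0f} cm")
```

Under these assumptions the model gives roughly a third of a meter, close to the range quoted above.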

To measure the functionality of the demodulator, the 24GHz signal source is programmed to generate the data command sequence that activates the chip in a particular mode to transmit one 60GHz pulse with a nominal pulse width (single ID pulse command). The input data sequence uses Pulse-Pause Encoding (PPE). This command setting is modulated by the 24GHz carrier and delivered to the chip. The chip is powered and activated with the same sequence and responds on the 60GHz output. This output is extracted and down-converted through an external RX chain consisting of LNA, a downconverter (IF frequency from 0.1 to 3GHz), and IF amplifiers and is displayed on the Agilent DSA90804A oscilloscope (8GHz of BW and 40GSa/sec sampling rate).

As previously discussed, for multi-access capability, the reader broadcasts periodic notches to synchronize all the sensors in the network. The frequency of these notch signals is programmable and dependent on network parameters. Nevertheless, these notch signals will



Figure 4.2: Measured chip sensitivity and regulator output.

inevitably reduce the power available to the chip and cause supply ripple. A faster demodulator can handle shorter notch signals and therefore reduce the supply ripple that results from the unavailability of input power during the notch. We use a nominal notch width of 50ns in testing. However, measurements confirm that this width can be reduced to 35ns without affecting the performance of the receiver. The downconverted downlink and uplink measurements are shown in Fig. 4.3 for the single ID pulse command.

## 4.2 Uplink Measurements

The probe-testing version of the chip enables direct measurements of the TX signal waveform. A sampling oscilloscope (Agilent 86100C with 70GHz sampling heads) directly probes the 60GHz uplink waveform to characterize the TX pulse properties. Figure 4.4 shows the direct measurement of the three output pulses for the nominal case. The measurement shows that the bandwidth of the 60GHz pulse can be programmed between 1.25 and 2.5GHz, with a nominal value of 1.6GHz.

The timing, multi-access, and PPM modulation functions are tested by sending commands to activate the data communication mode, followed by a train of 50ns wide notches with a 500ns period. The width and period of the notches can be programmed between 35-50ns and 350-500ns, respectively. The tags activate their timing circuits and modulation block after the $N^{th}$ input notch (where N is the tag ID number). After activation, the tag transmits three 60GHz pulses. As previously discussed, the position of the third pulse encodes the output data. Here, 6 bits are encoded in each 500ns time slot, so there are 64 different positions for the third pulse. The downconverted output data for N = 2 and D = 3 is shown in Fig. 4.5.

Figure 4.3: Down-converted measurements for the single ID pulse mode. The red traces show the envelope of the downlink data sequence to the chip. The black traces are the downconverted measurements of the single 60GHz uplink pulses from the chip. The bottom row shows the zoomed version of the same measurement.
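As a quick check on the slot arithmetic above, the following minimal sketch computes the number of pulse positions and the raw uplink rate for a given number of bits per slot. The function names are illustrative only and are not taken from the chip or its test software.

```python
def ppm_positions(bits_per_slot: int) -> int:
    """Number of distinct pulse positions needed to encode the given bits."""
    return 2 ** bits_per_slot


def ppm_rate_bps(bits_per_slot: int, slot_s: float) -> float:
    """Raw data rate when one PPM symbol is sent per time slot."""
    return bits_per_slot / slot_s


if __name__ == "__main__":
    # 6 bits per 500ns slot, as in the measurement described above
    print(ppm_positions(6))
    print(ppm_rate_bps(6, 500e-9) / 1e6, "Mbps")
```

The same helper applies to other programmable settings, for example a shorter slot with fewer bits per slot.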

The chip was tested for 128 different tag ID numbers. Figure 4.6 shows the measured uplink and downlink sequences for N equal to 4, 16, and 48. As shown in the figure, the tag ID determines the time slot in which the tag with ID equal to N is activated.

The chip was also tested for all 64 output data positions. Examples of three different output data sequences are illustrated in Fig. 4.7. The measurements show that the chip achieves the expected throughput in all operation modes.



Figure 4.4: Direct measurement of output pulses using sampling oscilloscope



Figure 4.5: Downconverted output pulse for N = 2 and D = 3 (red: DL, black: UL)

The external 60GHz receiver used for testing introduces some non-idealities into the system. These non-idealities include a higher noise level as well as non-quadrature detection, which introduces timing errors into the chain. More specifically, the low-IF non-coherent single-path receiver presents an offset error in measuring the arrival time of the envelope, leading to decoder timing errors. A quadrature down-converter driven with an accurate clock would eliminate this problem. Measurements were performed to characterize the actual achievable



Figure 4.6: Downconverted output pulses with multiple tag ID number

bit rate and BER. The measured average and standard deviation of the timing errors, normalized to the bit period, are shown in Fig. 4.8. These measurements do not de-embed any errors and include all RX measurement errors in addition to the chip clock jitter and all other non-idealities. To achieve a BER better than $10^{-3}$, equation 4.1 should hold for all possible data positions. The measurements show that, despite the timing error from the external RX path, the error rate stays within acceptable bounds, confirming a BER better than $10^{-3}$ for 5bit/slot modulation (equivalent to 14Mbps with $T_{slot} = 350ns$).

$$-0.5 < \mu_{error} \pm 3\sigma_{error} < 0.5 \tag{4.1}$$
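The criterion of equation 4.1 can be checked mechanically once the mean and standard deviation of the normalized timing error are known for each data position. The sketch below is one possible way to do this; the function names and the sample statistics are hypothetical and are not measured values from Fig. 4.8.

```python
def meets_timing_criterion(mu: float, sigma: float, k: float = 3.0) -> bool:
    """Check -0.5 < mu +/- k*sigma < 0.5 (errors normalized to the bit period)."""
    return -0.5 < mu - k * sigma and mu + k * sigma < 0.5


def all_positions_ok(stats) -> bool:
    """stats: iterable of (mu, sigma) pairs, one pair per data position."""
    return all(meets_timing_criterion(mu, sigma) for mu, sigma in stats)


if __name__ == "__main__":
    # Hypothetical per-position statistics, for illustration only
    example_stats = [(0.05, 0.08), (-0.10, 0.10), (0.12, 0.09)]
    print(all_positions_ok(example_stats))
```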



Figure 4.7: Downconverted output pulses with multiple output data

## 4.3 Wireless Measurement

Wireless tests with the pad-less chip use an external low-IF down-converter connected to a real-time oscilloscope to observe chip functionality and performance. Figure 4.9 shows the measurement setup for wireless testing. The chips are glued on a board using double-sided tape. The 24GHz downlink signal is generated using the SMF signal source, an external PA, and a 24GHz horn antenna. End-to-end full wireless testing shows the pad-less chip is fully functional and achieves the expected throughput in all operation modes. The external 60GHz



Figure 4.8: (a) Measured average of pulse arriving timing error normalized to bit period and (b) measured standard deviation of pulse arriving timing error normalized to bit period


receiver used for testing introduces some non-idealities into the system, such as a higher noise level and non-coherent detection (previously discussed). The chip achieves a measured range of 28cm and 50cm with an EIRP of 40dBm and 45dBm, respectively. The 60GHz transmitter range was also measured separately with the external 24GHz downlink source at a distance of 20cm. Under these conditions, the measured 60GHz transmitter range is 85cm.

Table 4.1 summarizes the performance of the single-chip tag.



Figure 4.9: Wireless setup for measuring of single-chip tag

| Parameter                  | Value                                                                |
|----------------------------|----------------------------------------------------------------------|
| Frequency [GHz]            | 24 (DL) / 60 (UL)                                                    |
| Antenna Type               | DL: Folded/Meandered Dipole<br>UL: Dipole                            |
| Modulation Type            | DL: Pulse Pause Mod.<br>UL: Wideband M-PPM                           |
| Data Rate                  | DL: 6.5Mbps<br>UL: $>12Mbps$ (aggregate)                             |
| Min Required Power for DL  | -10.5dBm (Typ.)<br>-11.5dBm (Functional)<br>-9.3dBm (w/ Antenna)     |
| Standby Power              | $< 1.5 \ \mu W \ (0.9V)$                                             |
| UL Pulse BW                | 1.25-2.5 GHz                                                         |
| Uplink $P_{out}$           | -3dBm                                                                |
| Measured Range             | 28 cm (w/ 40 dBm EIRP)<br>50 cm (w/ 45 dBm EIRP)                     |
| Overall Size               | 3.7mm by 1.2mm                                                       |
| Energy Efficiency $(pJ/b)$ | 19 (28 cm)<br>7.9 (20.5 cm)                                          |

Table 4.1: Summary of chip performance

## 4.4 Single-Chip Measurement

The transponder can also be used as a low-power radio by itself. In that case, there is a trade-off between the data rate and the operating range. Higher data rates require a lower duty cycle, which translates to a higher average power requirement (lower sensitivity). The energy efficiency of the single-chip radio as a function of range and for multiple data rates is shown in Figure 4.10.



Figure 4.10: Wireless communication energy efficiency with range

## Chapter 5

# Low-Power Scalable 60GHz Phased Array Receiver for Mobile Applications

## 5.1 Introduction

The need for an energy efficient mm-wave phased-array system that would act as the gateway for the proposed IoT devices was previously discussed. Our main focus in this chapter will be on the design of a scalable silicon architecture for 60GHz mm-wave phased-arrays. We will begin with an overview of the system and then discuss architecture details, circuit design, and finally, measurements.

Due to the availability of 7GHz of unlicensed bandwidth, the 60GHz band provides an attractive solution for multi-Gb/s short range communication [61] [62] [63] [64] [65] [66]. Commercial solutions in silicon for wall-powered applications are now available [67]. Portable devices, e.g. mobile phones and tablets, are much more sensitive to system cost and energy efficiency compared to non-mobile devices. This represents an important challenge in the realization of an integrated 60GHz solution suitable for such low power mobile applications.

Energy and area efficient phased array transceivers are key to enabling such multi-Gb/s communications in mobile devices at 60GHz. Phased arrays enable beam-steering, which provides an agile means of overcoming path-loss, fading, and security issues, as well as allowing spatial power combining in order to ease the design of the power amplifier at mm-wave frequencies [68] [69] [70] [71] [72]. An N-element phased array solution (on both the transmitter and the receiver) improves the link budget by a factor of $N^2$ on the transmitter due to coherent spatial combining and by a factor of N on the receiver due to improved signal-to-noise ratio. This overall $N^3$ improvement in the link budget is critical in overcoming the large inherent path loss at the 60GHz carrier frequency.

Despite the low gain of CMOS at mm-wave frequencies, lower manufacturing costs and higher integration can be achieved compared to other semiconductor processes. Integrating the RF with baseband (BB) and Built-In-Self-Test (BIST) circuitry significantly reduces assembly and testing costs, which at mm-wave can dominate the overall costs of the final system.

Although significant progress has been made in the design of silicon-based phased arrays, current implementations at mm-wave frequencies have not yet simultaneously achieved sufficient performance and area/power efficiency. For example, in [68] and [70] an integrated 16-element phased array transmitter and receiver were implemented in a $0.12\mu$m SiGe process with excellent performance, but the power consumption of these designs (which were targeted to support video streaming with the WirelessHD standard) is too high for portable devices. For our particular application of interfacing to IoT sensors, the efficiency of the system is even more important. As another example, [69] demonstrated the lowest power CMOS phased array transceiver to date using an RF phase shifting architecture. However, this low power consumption was obtained at the expense of somewhat reduced per-element performance (e.g. gain, noise figure, and output power).

This chapter describes the design of a scalable, integrated 4-element BB phase shifting transceiver implemented in a 65nm standard CMOS process. This design achieves a high level of per-element performance while maintaining low area and power consumption. By utilizing essentially the entire available bandwidth (BW) at 60GHz as a single channel, this transceiver was designed to allow 10Gbps communication using QPSK modulation.

## 5.2 Theory of Phased-Array Antennas

In a phased array architecture, the phase/delay of the signal feeding the array of antennas is varied so that the effective radiation pattern of the array is reinforced in a desired direction and suppressed in undesired directions (Fig. 5.1). For an N element array with d as the distance between elements, the relative delay between any two consecutive elements is $\frac{d\sin(\theta)}{c}$, where $\theta$ is the incident angle and c is the speed of light in free space.

Array Factor (AF) is an important parameter in phased arrays and is defined as the additional power gain achieved by the phased array architecture over the power gain of a single element. Therefore, the array factor for an N element phased array antenna is given by [73] [74]

$$AF(\theta,\tau) = \left(\frac{\sin\left(\frac{N}{2}\left(\frac{\omega d \sin\theta}{c} - \omega\tau\right)\right)}{\sin\left(\frac{1}{2}\left(\frac{\omega d \sin\theta}{c} - \omega\tau\right)\right)}\right)^2 \tag{5.1}$$

The above equation shows that if $\tau = \frac{d\sin(\theta)}{c}$, then the power of the received signal at the output of the array is increased by a factor of $N^2$ relative to the power of the received signal at an individual antenna element. To calculate the SNR boost at the output, the array factor should be divided by the factor by which the noise power increases at the output. Assuming that the noise at each element is uncorrelated, the noise power at the output is N times the noise of each element, so the SNR of the array improves by $N^2/N = N$, the number of elements.

Figure 5.1: An N-element time-array system.
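To make the $N^2$ power gain and the factor-of-N SNR gain concrete, the sketch below evaluates Eq. (5.1) numerically. It is a minimal illustration under stated assumptions: the function names are hypothetical, and the example uses half-wavelength spacing at 60GHz, which is an assumed value rather than one fixed by the text at this point.

```python
import math

C = 299_792_458.0  # free-space speed of light in m/s


def array_factor(n, freq_hz, d_m, theta_rad, tau_s):
    """Power array factor of Eq. (5.1) for an n-element uniform linear array."""
    omega = 2 * math.pi * freq_hz
    psi = omega * d_m * math.sin(theta_rad) / C - omega * tau_s
    den = math.sin(psi / 2)
    if abs(den) < 1e-12:
        # sin(n*x)/sin(x) tends to +/- n as x approaches a multiple of pi
        return float(n * n)
    return (math.sin(n * psi / 2) / den) ** 2


def snr_gain(n, af):
    """SNR gain: power gain divided by the n-fold uncorrelated noise power."""
    return af / n


if __name__ == "__main__":
    f0 = 60e9
    d = C / f0 / 2                      # assumed half-wavelength spacing
    theta = math.radians(30)
    tau = d * math.sin(theta) / C       # delay matched to the incident angle
    af = array_factor(4, f0, d, theta, tau)
    print(af, snr_gain(4, af))
```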

Since implementing true time delay incurs a large area penalty, most practical systems use phase shifters instead of true time delay [73] [74] [75]. In this case, equation 5.1 can be rewritten as

$$AF(\theta,\phi) = \left(\frac{\sin(\frac{N}{2}(\frac{\omega d \sin \theta}{c} - \phi))}{\sin(\frac{1}{2}(\frac{\omega d \sin \theta}{c} - \phi))}\right)^2 \tag{5.2}$$

where

$$\phi = \frac{\omega_{\circ} d\sin\theta}{c} \tag{5.3}$$

Here,  $\omega_{\circ}$  is the center frequency.

This is a narrowband approximation and fails when the instantaneous bandwidth is large: as the frequency f deviates from the center frequency $f_{\circ}$, the beam angle moves away from the incident angle and becomes $\sin^{-1}(\frac{f_{\circ}}{f}\sin\theta)$. This causes an array-induced inter-symbol interference effect and reduces the output SNR [75]. The maximum BW of uniform arrays under the narrowband approximation is [73] [74]

$$BW = \frac{0.886c}{Nd\sin\theta} \tag{5.4}$$

The other important factor in phased array antenna is the distance between antennas, which essentially determines the spatial sampling frequency. The distance between antennas depends on desired position of the grating lobe with respect to the main lobe. The grating lobe is the unintended strong radiation in an undesired angle and results from aliasing if the spatial sampling frequency is lower than the equivalent Nyquist rate. To avoid grating lobes, the maximum distance between antennas should remain below the following limit [76]

$$d_{max} = \frac{\lambda}{1 + \sin\theta_{max}} \tag{5.5}$$

Here  $\theta_{max}$  is the maximum incident angle. Figure 5.2 shows the maximum antenna spacing as a function of desired incident angles. It shows that to cover  $\pm 90^{\circ}$ , the antenna spacing should be  $\frac{\lambda}{2}$ .
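The spacing limit of Eq. (5.5) is easy to evaluate for a few scan ranges. The short sketch below does so with the spacing normalized to the wavelength; the function name is illustrative.

```python
import math


def max_spacing_over_lambda(theta_max_deg: float) -> float:
    """Largest element spacing (in wavelengths) that avoids grating lobes
    when scanning out to +/- theta_max_deg, following Eq. (5.5)."""
    return 1.0 / (1.0 + abs(math.sin(math.radians(theta_max_deg))))


if __name__ == "__main__":
    for theta_max in (0, 30, 60, 90):
        print(theta_max, round(max_spacing_over_lambda(theta_max), 3))
```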



Figure 5.2: Maximum antenna spacing normalized to wavelength as a function of desired incident angle

## 5.3 mm-Wave Phased-Array Architectural Choices

### 5.3.1 Architecture Choice

Multiple architecture choices are available for the implementation of integrated phased arrays. The three main choices considered in this section are the RF, LO, and BB phase-shifting architectures (Fig. 5.3). It is important to note that all of these topologies require some form of phase-shifting as well as mm-wave combining or splitting. The difference lies in the placement of these components on the RF or LO paths.

Phase shifters at RF tend to be large and can introduce substantial losses on RF signal paths, leading to lower output power at the TX and higher NF in the RX. Besides lower losses [77], phase shifting at baseband can achieve improved phase resolution, leading to better control over null placement and depth [78] [79] as well as the ability to perform more precise calibration in order to correct for various impairments.

In an RF architecture, splitting and combining happens at the carrier frequency on the signal path, whereas the baseband or LO architectures have mm-wave splitters on the LO path and IF frequency combiners/splitters. The loss and variation of this mm-wave combining and splitting network are generally easier to tolerate in the LO distribution due to the lower bandwidth of the signal as well as the observation that the overall performance of the transceiver is less sensitive to the amplitude on the LO path as compared to the signal path (as long as the LO amplitude is sufficiently large). On the other hand, the RF architecture eliminates the multitude of mixers and therefore, the ultimate optimal choice also depends on the number of elements in the array.

On the transmitter side (as shown in Fig. 5.3), the power amplifiers (PAs) themselves are equivalent in all three architectures. However, all else held fixed, the PAs in the RF shifting architecture are provided with lower input power (and hence deliver lower equivalent isotropically radiated power (EIRP) for fixed PA gain) due to the loss of the phase shifters preceding the PA. Also, for a similar power delivered to the PA, the splitter in the RF architecture necessitates larger mixers to deliver more power to the RF network. In the BB or LO architectures, the mixer sizes can be down-scaled proportionally to achieve the same RF power per element.

A similar argument holds for the comparison of the three architectures on the receiver. As shown in Fig. 5.3, similar low noise amplifiers (LNA) are used, but the mixers in the BB/LO architecture can be scaled down in device size since they operate before combining and therefore handle smaller (desired) signal power levels. We will also assume that the RF VGA and BB phase rotators have similar power consumption and limitations. The noise figure of a single path in the RF architecture can be described by



Figure 5.3: Block diagram of 2-element RX, LO distribution, and TX for (a) BB phase shifting (b) RF phase shifting (c) LO phase shifting architectures.

$$F_{RF} = F_{LNA} + \frac{L_{\phi}}{G_{LNA}} + \frac{F_{VGA} - 1}{G_{LNA}L_{\phi}} + \frac{F_{mixer} - 1}{G_{LNA}\underbrace{L_{\phi}G_{VGA}}_{\approx 1}} \tag{5.6}$$

The same path in the BB or LO architectures has a noise figure of

$$F_{IF} = F_{LNA} + \frac{F_{mixer} - 1}{G_{LNA}} + \frac{F_{rot} - 1}{G_{LNA}G_{mixer}} \tag{5.7}$$

$$F_{LO} = F_{LNA} + \frac{F_{mixer} - 1}{G_{LNA}} \tag{5.8}$$

In the calculations it should be noted that the noise of the mixer in the RF architecture directly adds to the RF paths when input referred and is not divided by the number of elements. This is due to the uncorrelated noise signals at the input of the Wilkinson power combiner. The analysis shows that the noise figures of the RF and BB topologies are similar, with the BB topology perhaps being slightly better. While the RF architecture suffers from higher noise due to lossy phase shifters and the compensating VGA (the second and third terms in (5.6)), the mixer noise component is larger in the BB topology due to reduced device sizes. The noise figures of the LO and BB topologies are similar if there is enough gain from the LNA and mixer path.
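The closing remark about the LO and BB topologies can be illustrated by evaluating Eqs. (5.7) and (5.8) numerically. In the sketch below, the LNA gain and noise figure (15dB and 6dB) are the design targets quoted later in this chapter; the mixer noise figure, mixer gain, and phase rotator noise figure are hypothetical placeholder values chosen only to show how the rotator term is suppressed by the preceding gain.

```python
import math


def db_to_lin(x_db: float) -> float:
    return 10 ** (x_db / 10)


def lin_to_db(x: float) -> float:
    return 10 * math.log10(x)


def nf_bb_db(nf_lna_db, g_lna_db, nf_mixer_db, g_mixer_db, nf_rot_db):
    """Eq. (5.7): LNA, mixer, then baseband phase rotator (all inputs in dB)."""
    g_lna = db_to_lin(g_lna_db)
    f = (db_to_lin(nf_lna_db)
         + (db_to_lin(nf_mixer_db) - 1) / g_lna
         + (db_to_lin(nf_rot_db) - 1) / (g_lna * db_to_lin(g_mixer_db)))
    return lin_to_db(f)


def nf_lo_db(nf_lna_db, g_lna_db, nf_mixer_db):
    """Eq. (5.8): LNA followed by the mixer only (all inputs in dB)."""
    f = db_to_lin(nf_lna_db) + (db_to_lin(nf_mixer_db) - 1) / db_to_lin(g_lna_db)
    return lin_to_db(f)


if __name__ == "__main__":
    # Mixer NF 12dB, mixer gain 5dB, rotator NF 15dB: assumed values
    print(nf_bb_db(6, 15, 12, 5, 15))
    print(nf_lo_db(6, 15, 12))
```

With these assumed numbers the two noise figures differ by a fraction of a dB, and the gap shrinks further as the LNA and mixer gain increase.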

Finally, in the receiver, the RF power combiner of the RF architecture is replaced by an RF splitter on the LO path. The splitter requires extra buffers to overcome power split loss in addition to power consumption from large signal drive of the splitting network. Therefore, to get the full advantages of using the BB architecture, the design of a low power LO distribution network including quadrature phase generation is critical.

One of the perceived advantages of the RF phase shifting architecture is superior linearity. Since spatial power combining occurs in the RF domain, the mixer is relieved from handling large signal blockers that generally originate from different spatial directions. Nevertheless, due to the losses of the passive RF phase shifters, maintaining low noise figure requires that buffer amplifiers be placed after the LNA, and these buffer amplifiers will therefore limit the linearity. Similarly, if an active RF phase shifter is used, the variable-gain amplifiers within the active phase shifter will determine the linearity of the front-end. In a BB phase shifting architecture, the mixers in each element can be decomposed into an active block and a passive current (or voltage) commutating block. The active block plays a similar role as the buffers or VGAs in the RF phase shifters, so that the overall linearity is roughly comparable in both architectures.

It should be noted that although an LO phase shifting mm-wave phased-array eliminates phase shifters in the signal path, high frequency phase shifters (or phase rotators) and an LO distribution network are still needed. Furthermore, it also requires quadrature downconversion mixers per path and overall this architecture does not offer a substantial advantage over the alternatives for low power phased-array design.

Given that at a high level and for small to medium sized arrays, a BB phase shifting architecture can achieve performance and power comparable to that of the more traditional RF phase shifting architecture, the transceiver described here utilizes BB phase shifting in order to leverage its improved phase control and accuracy.

Note that Wilkinson power combiners are used since typical voltage or current summation structures will lead to significant coupling and interactions between the elements, making such an approach unattractive from a scalability standpoint.

### 5.3.2 Number of Elements

Besides the method by which phase shifting is achieved, the other architectural decision on the transmitter is related to the choice of the number of elements for a required EIRP. The required EIRP and the effective transmitter efficiency are given by

$$EIRP_{required} = N^2 P_{out/El} \tag{5.9}$$

$$\eta_{overall} = \frac{N^2 P_{out/El}}{N P_{DC/El}} = N \eta_{El} \tag{5.10}$$

where N is the number of elements and  $\eta_{El}$  is the efficiency per element. This first-order analysis does not take routing and distribution losses into account.

For the same EIRP, we can either scale up the number of elements (smaller $P_{out/El}$) or the output power of each PA. Given that, due to spatial power combining, the TX EIRP improves by $N^2$ while the DC power only scales with N, it is often beneficial to use a larger number of elements for better overall efficiency (Eq. (5.10)). However, increasing the number of TX elements is generally limited by one of the following factors: footprint limitations on using larger antenna arrays, the overhead power in the TX path eventually dominating the overall power as the output power of each PA is reduced, and finally the degraded efficiency of each PA as the output power is reduced to the level where the losses from the impedance transformation network (required to interface the high impedance levels of a low-power PA to 50 $\Omega$ antennas) dominate.
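The trade-off between PA power and per-element overhead can be sketched with a simple first-order model built on Eq. (5.9): for a fixed EIRP, each element delivers $EIRP/N^2$, while every element also pays a fixed overhead. The PA efficiency, the overhead power, and the function names below are hypothetical, and the model deliberately ignores the degradation of PA efficiency at low output power mentioned above, so the result only illustrates the shape of the trade-off.

```python
def total_dc_mw(n, eirp_dbm, pa_efficiency, overhead_mw):
    """Total TX DC power (mW) for n elements at a fixed EIRP.

    Assumes a constant PA efficiency and a fixed overhead per element.
    """
    eirp_mw = 10 ** (eirp_dbm / 10)
    p_out_el_mw = eirp_mw / n ** 2          # Eq. (5.9) solved for P_out/El
    return n * (p_out_el_mw / pa_efficiency + overhead_mw)


def best_element_count(eirp_dbm, pa_efficiency, overhead_mw, max_n=32):
    """Element count that minimizes the total DC power in this model."""
    return min(range(1, max_n + 1),
               key=lambda n: total_dc_mw(n, eirp_dbm, pa_efficiency, overhead_mw))


if __name__ == "__main__":
    # 12dBm EIRP, 10% PA efficiency, 10mW overhead per element: assumed values
    for n in (1, 2, 4, 8, 16):
        print(n, round(total_dc_mw(n, 12.0, 0.10, 10.0), 1))
    print(best_element_count(12.0, 0.10, 10.0))
```

With zero overhead the total DC power keeps falling as N grows, which is the ideal behavior of Eq. (5.10); a nonzero overhead produces a minimum at a finite N.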

Also, under the narrowband assumption for phased array architectures (using phase shifters instead of true time delay), the maximum allowable bandwidth for a specific incident angle is a function of the number of elements. The maximum allowable bandwidth is

$$\frac{f_{BW}}{f_{\circ}} < 0.886 \frac{c}{f_{\circ} Nd\sin(\theta_{in})} \tag{5.11}$$

where c is the free-space speed of light,  $f_{\circ}$  is the central frequency, N is the number of elements, d is the space between antennas, and  $\theta_{in}$  is the incident angle.

The maximum allowable bandwidth as a function of incident angle for different numbers of elements is shown in Fig. 5.4 for $f_{\circ} = 60GHz$. As illustrated, to have full coverage for all angles with 7GHz of available bandwidth around 60GHz, the maximum number of elements can be N = 16.

Figure 5.5 demonstrates the same concept but for a fixed incident angle of  $\pm 90^{\circ}$ . The obtainable bandwidth is shown in terms of number of elements.
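The bound of Eq. (5.11) can be evaluated directly. The sketch below assumes half-wavelength element spacing at 60GHz, which is an assumption made for illustration; the function name is likewise illustrative.

```python
import math

C = 299_792_458.0  # free-space speed of light in m/s


def max_bandwidth_hz(n, d_m, theta_in_deg):
    """Upper bound on bandwidth from Eq. (5.11), multiplied through by f0."""
    s = abs(math.sin(math.radians(theta_in_deg)))
    if s == 0.0:
        return math.inf  # broadside: no beam squint in this model
    return 0.886 * C / (n * d_m * s)


if __name__ == "__main__":
    f0 = 60e9
    d = C / f0 / 2  # assumed half-wavelength spacing
    for n in (4, 8, 16, 32):
        print(n, round(max_bandwidth_hz(n, d, 90) / 1e9, 2), "GHz")
```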



Figure 5.4: Maximum allowable bandwidth as a function of incident angle for different number of elements

### 5.3.3 Phase Resolution

In practical situations, the resolution of phase shifters is limited and determined by the digital quantization of phase. If the phase shifter has n bits, its resolution is $\Delta\phi_{res} = \frac{360^{\circ}}{2^n}$ and the beam-steering resolution is $\theta_{res} = \sin^{-1}(\frac{1}{2^{n-1}})$. Since the discrete phases of the phase shifters cannot completely compensate for the phase shift in the path, the array factor, and hence the SNR improvement, degrades. In the receiver, the SNR improvement can be calculated as

$$SNR = SNR_{ideal} \cdot \frac{AF(\Delta\phi, \theta_{in})}{N^2} \tag{5.12}$$

and

$$AF(\Delta\phi,\theta_{in}) = power\left[\sum_{i=1}^{N} \cos(\omega t - i\pi \sin(\theta_{in}) + \Delta\phi_i)\right] \tag{5.13}$$

where N is the number of elements and $SNR_{ideal}$ is the SNR improvement in the case of infinite phase shifter resolution.

Figure 5.6 illustrates the SNR improvement as a function of incident angle for different



Figure 5.5: Maximum allowable bandwidth as a function of number of elements for incident angle of  $\pm 90^{\circ}$ 

phase shifter resolutions when N = 4 and N = 16. As can be seen, the absolute SNR degradation does not change with the number of elements, but the percentage error increases as the number of elements in the system decreases. Fig. 5.6 shows that 5 bits of phase resolution comes close to the ideal SNR improvement.
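The effect of phase quantization can be reproduced with a small numerical experiment. The sketch below uses a phasor form of Eq. (5.13), normalized so that perfect phase alignment gives 1, and assumes half-wavelength spacing (consistent with the $i\pi\sin(\theta_{in})$ term) and round-to-nearest quantization. These modeling choices and the function names are assumptions for illustration, not a description of how Fig. 5.6 was generated.

```python
import cmath
import math


def quantize_phase(phi: float, bits: int) -> float:
    """Round a phase (radians) to the nearest step of a bits-bit phase shifter."""
    step = 2 * math.pi / (2 ** bits)
    return round(phi / step) * step


def snr_fraction(n: int, bits: int, theta_in_deg: float) -> float:
    """Fraction of the ideal SNR improvement kept with quantized phases."""
    s = math.sin(math.radians(theta_in_deg))
    total = 0j
    for i in range(n):
        ideal = i * math.pi * s                    # phase needed at element i
        residual = quantize_phase(ideal, bits) - ideal
        total += cmath.exp(1j * residual)
    return abs(total) ** 2 / n ** 2


if __name__ == "__main__":
    angles = range(-90, 91)
    for bits in (2, 3, 4, 5):
        worst = min(snr_fraction(4, bits, a) for a in angles)
        print(bits, round(10 * math.log10(worst), 3), "dB worst case")
```

Because each residual phase error is at most half a quantization step, the retained fraction is bounded below by $\cos^2(\pi/2^n)$, which for 5 bits is about 0.99.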

## 5.4 Link Budget Calculation

We can use the Friis equation to perform link budget calculations. Based on the Friis equation, for a single element receiver, the received power is a function of the transmit power, transmitter and receiver antenna gains, frequency of operation, and range. The received power can be found as

$$P_{RX} = P_{TX} + G_{TX} + G_{RX} + 20\log(\frac{\lambda}{4\pi R}) \tag{5.14}$$

where $P_{TX}$ and $P_{RX}$ are the transmitted and received power in dBm, $G_{TX}$ and $G_{RX}$ are the transmitter and receiver antenna gains in dBi, $\lambda$ is the wavelength at 60GHz, and R is the range. Based on the above equation, the SNR at the output of the receiver would be



Figure 5.6: SNR improvement as a function of incident angle for multiple phase resolutions (a) N = 4 (b) N = 16.

$$SNR_{out} = P_{TX} + G_{TX} + G_{RX} + 20\log(\frac{\lambda}{4\pi R}) + 174dBm - 10\log(BW) - NF \tag{5.15}$$

where BW is the receiver bandwidth and NF is the receiver noise figure. As discussed in section 5.2, a phased array architecture with N elements (on both the transmitter and the receiver) improves the link budget by a factor of $N^2$ on the transmitter due to coherent spatial combining and by a factor of N on the receiver due to improved signal-to-noise ratio. So the output SNR would be

$$SNR_{out,Array} = P_{TX} + G_{TX} + G_{RX} + 20\log(\frac{\lambda}{4\pi R}) + 174dBm - 10\log(BW) - NF + 30\log(N) \tag{5.16}$$

In this design, by utilizing the entire available bandwidth at 60GHz as a single channel (5GHz), the transceiver was designed to allow 10Gbps communication using QPSK modulation. The BER as a function of output SNR is shown in Fig. 5.7 for BPSK, QPSK, and 16QAM modulations. From Fig. 5.7, the required QPSK SNR for $10^{-3}$ BER is 10dB. Table 5.1 shows the required number of elements for 1m, 2m, and 5m range for a 10Gb/s 60GHz QPSK link. Based on the discussions in the previous section, for a given EIRP specification, it is beneficial to move to a higher number of TX elements instead of increasing the TX power. However, at some point, overhead power levels from the TX chain or distribution start to dominate, and therefore, an optimal power per stage exists. Here, a 0dBm transmitter with a four-element array balances this tradeoff.



Figure 5.7: BER as a function of SNR for BPSK, QPSK, and 16QAM modulations

| Distance                 | 1m     | 2m     | $5\mathrm{m}$       |
|--------------------------|--------|--------|---------------------|
| No. of Elements          | 2      | 4      | 6                   |
| $P_{TX}$                 | 0dBm   | 0dBm   | 0dBm                |
| $G_{TX}$                 | 0dBi   | 0dBi   | 0dBi                |
| EIRP                     | 6dBm   | 12dBm  | $15.5 \mathrm{dBm}$ |
| Path Loss                | 68dB   | 74dB   | 82dB                |
| $G_{RX}$                 | 0dBi   | 0dBi   | 0dBi                |
| $P_{RX}$                 | -62dBm | -62dBm | -66.5dBm            |
| RX BW                    | 5GHz   | 5GHz   | 5GHz                |
| Noise Level              | -77dBm | -77dBm | -77dBm              |
| RX NF                    | 7dB    | 7dB    | 7dB                 |
| SNR <sub>out</sub>       | 8dB    | 8dB    | 3.5dB               |
| SNR <sub>out,Array</sub> | 11dB   | 14dB   | 11.3dB              |

Table 5.1: Link budget analysis
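The entries of Table 5.1 follow from Eqs. (5.14) to (5.16) and can be recomputed with a few lines of code. The sketch below is one way to do this; the function and key names are illustrative, and the default parameter values are the ones listed in the table (0dBm per-element TX power, 0dBi antennas, 5GHz bandwidth, 7dB noise figure).

```python
import math

C = 299_792_458.0  # free-space speed of light in m/s


def path_loss_db(freq_hz: float, range_m: float) -> float:
    """Free-space path loss, i.e. -20*log10(lambda / (4*pi*R))."""
    lam = C / freq_hz
    return 20 * math.log10(4 * math.pi * range_m / lam)


def link_budget(n, range_m, p_tx_dbm=0.0, g_tx_dbi=0.0, g_rx_dbi=0.0,
                freq_hz=60e9, bw_hz=5e9, nf_db=7.0):
    """Recompute one column of Table 5.1 for an n-element TX and RX array."""
    eirp = p_tx_dbm + g_tx_dbi + 20 * math.log10(n)    # N^2 gain on the TX side
    p_rx = eirp - path_loss_db(freq_hz, range_m) + g_rx_dbi
    noise = -174 + 10 * math.log10(bw_hz)              # thermal noise in dBm
    snr_out = p_rx - noise - nf_db
    snr_array = snr_out + 10 * math.log10(n)           # factor-of-N gain on the RX side
    return {"eirp_dbm": eirp, "p_rx_dbm": p_rx, "noise_dbm": noise,
            "snr_out_db": snr_out, "snr_array_db": snr_array}


if __name__ == "__main__":
    for n, r in ((2, 1.0), (4, 2.0), (6, 5.0)):
        row = link_budget(n, r)
        print(n, r, {k: round(v, 1) for k, v in row.items()})
```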

## 5.5 Transceiver

The block diagram of the proposed 4-element phased array transceiver is shown in Fig. 5.8 [80]. The transmitter consists of a 7-bit resolution baseband phase rotator, a double-balanced quadrature Gilbert mixer, and a zero-voltage-switching power amplifier. Here, since each transmitter delivers a relatively low output power, the overall efficiency is often not dominated by the PA, but rather by the other required blocks in the transmitter chain, each of which can contribute a significant portion of the total power consumption. Therefore, optimizing the efficiency of every block in the transmitter is critical to achieving low total per-element power. Maximizing the current efficiency of the mixer is critical not only in reducing the mixer power, but also in maximizing the impedance at the LO ports in order to reduce the required LO buffer current swing. A combined baseband phase rotator and quadrature mixer is used to reuse the bias current. Also, a new baseband phase rotator architecture is proposed which increases the average efficiency of the phase shifter by 40%. Since QPSK modulation has a relatively small peak-to-average power ratio (PAPR), a zero-voltage-switching (ZVS) amplifier can be used for improved efficiency.

The 60GHz LO for both the TX and RX is generated by a fully integrated, integer-N, charge pump based phase-locked loop (PLL) that has been optimized for minimum integrated output phase noise. The core of this type-II 3rd order integer-N PLL consists of a fundamental mode VCO with directly coupled buffers to drive the LO distribution chain. In order to achieve a low-power and robust 60GHz VCO design, a cross-coupled topology was selected for the core.

The LO distribution network was designed to minimize power consumption while maintaining scalability for larger phased-arrays. Maintaining a constant impedance with transmission lines and matched power splitters allowed arbitrary routing of the LO signal, which scales well for larger arrays. In this design, in-phase splitting is performed by Wilkinson dividers to each of the 4 elements, followed by local hybrids to generate the quadrature LO.

The direct conversion receiver consists of a three-stage LNA, a single-balanced quadrature down-conversion mixer, and a BB phase rotator. Details of the transmitter and LO generation can be found in [80]. The following section analyzes the design of the 60GHz receiver.

## 5.6 Receiver

A schematic of each receiver element is shown in Fig. 5.9. Each receiver path consists of an ESD protection structure, a three-stage LNA, a single-balanced quadrature downconversion mixer, and a BB phase rotator. The most effective technique applied to reduce power consumption at the receiver is to scale down the device sizes and hence to scale up the operating impedance levels. At the receiver's input, the antenna interface limits the



Figure 5.8: Block diagram of 4-element phased array transceiver

impedance to  $50\Omega$ , but increasing internal impedances is nonetheless still an effective means of reducing the power consumption in the chain.

### 5.6.1 LNA

In the LNA design, device and current scaling were performed under given noise figure and bandwidth constraints, resulting in a three-stage, inductively degenerated cascode LNA with integrated ESD protection as part of the input matching network (Fig. 5.10). Cascode devices are used for their higher Maximum Stable Gain (MSG) in this particular process, for their unconditional stability at 60GHz, and to reduce sensitivity to process variations. Additionally, lumped matching networks are used to reduce the area of the LNA. The total area of the LNA is $370\mu$m by $270\mu$m; this compact footprint is crucial for scaling the LNA to larger arrays.

The first stage of the LNA is optimized for noise and power matching. The noise figure of the first stage of LNA can be found as

$$F = 1 + \frac{R_g}{R_s} + \left(\frac{\gamma}{\alpha}\right) g_m R_s \left(\frac{\omega}{\omega_T}\right)^2 \left[1 + \frac{\left(\frac{\omega}{\omega_T} + \frac{\omega}{\omega_p}\right)^2}{1 + \left(\frac{\omega}{\omega_T} + \frac{\omega}{\omega_p}\right)^2}\right] \tag{5.17}$$

where $R_s$ is the source impedance, $R_g$ is the gate resistance, $\omega_T$ is the transit frequency of the device, and $\omega_p = \frac{g_m}{C_p}$, where $C_p$ is the capacitance at the source of the cascode device. To design the first stage, the device is biased at the optimum $\omega_T$, which gives the minimum value of noise figure. $NF_{min}$ has only a weak dependence on device size [81]. The device is then sized so that the optimum noise and power input impedances converge to 50 $\Omega$.

Figure 5.9: Schematic of one receiver element

Figure 5.10: Schematic of the 3-stage LNA

The input transformer serves both as a part of the input impedance transformation and also as the ESD protection. At low frequencies (transient ESD frequency), the primary winding of the transformer will short the ESD currents to ground. Due to very low magnetic coupling of the 60GHz transformer at these low frequencies, the secondary side will not see the voltage or current spike. The only consideration then is the current handling capacity of the metal traces. The  $9\mu$ m width chosen in this design was previously shown to be adequate for protection against 400V machine model events [64]. The inner radius of the input transformer is  $20\mu m$  and the insertion loss is about 1.5dB.

A 2-to-1 transformer is used at the output of the three-stage LNA to achieve the required impedance matching between the LNA and the $g_m$ stage of the quadrature mixer. The input admittance of the mixers ($I + Q$ of the $g_m$ stage) is 2.9m + j30mS while the LNA output admittance is 1.3m + j8.1mS. The 2-to-1 transformer transforms the high LNA output impedance down to the low mixer input impedance, while the inductance of the transformer resonates with the total capacitance of the mixers and the LNA. The LNA is designed to provide 15dB of power gain (23dB of voltage gain), 6dB of NF, and -23dBm of input-referred 1dB compression point (high-gain mode) and has a measured power consumption of 10mW (8.2mA from a 1.2V supply) under nominal bias settings. It is also designed to provide gain tunability of 9dB (by utilizing 4-bit programmable bias DACs on each of the three stages) with less than 1dB increase in NF. Figure 5.11 shows the simulated power gain, noise figure, and input $S_{11}$ of the LNA for four different bias currents.

### 5.6.2 Mixer

#### 5.6.2.1 Mixer Architecture

The mixer can be designed as either a passive or an active mixer. Passive mixers provide a low-power option for achieving linearity and eliminating flicker noise. Since the transistors act as switches in a passive mixer, in the ideal case they do not contribute flicker noise. Also, the linearity of passive mixers, in contrast to their active counterparts, is not limited by the nonlinear properties of the transistors; the limitation comes mainly from the switching nonlinearities. On the other hand, passive mixers require a large voltage drive to provide reasonable gain and noise figure, and this can be costly at millimeter-wave frequencies. Also, since the power gain is negative in dB (the mixer is lossy), later stages can contribute to the overall noise figure.

#### 5.6.2.2 Mixer Design

Impedance and device scaling are applied to the mixer stage to reduce the power consumption. The schematic of the active mixer is shown in Fig. 5.12. The gain and noise figure of the mixer



Figure 5.11: Simulated power gain, noise figure, and input  $S_{11}$  of LNA for four different bias currents.



Figure 5.12: Schematic of single balanced mixer

can be found as [82]

$$G_{mixer} = \frac{2}{\pi} g_{m3} R_L \tag{5.18}$$

$$F = \left[1 + \frac{r_{g3}}{R_s} + \frac{\gamma}{g_{m3}R_s}\right] + \left[\frac{4\gamma}{\pi\alpha R_s g_{m3}^2} (\frac{I_B}{V_{LO}})\right] + \left[\frac{4r_{g1}(G^2)}{R_s \alpha g_{m3}^2}\right]$$
(5.19)

$$G(t) = \frac{2g_{m1}g_{m2}}{g_{m1} + g_{m2}}$$
(5.20)

where $I_B$ is the bias current, $V_{LO}$ is the LO swing, and $\alpha$ is 1 if the switching function is a square wave and $1 - (\frac{4}{3})\Delta f_{LO}$ otherwise, where $\Delta$ is the time during which both switches are ON.

By looking at the above equations we see that increasing the bias current and swing on the LO device would increase the gain and reduce noise figure. But both of these methods would increase the power consumption since the power on the LO side is given as

$$P_{LO} = \frac{\pi}{Q_{LO}} C_{sw} V_{LO}^2 f_{LO} \frac{1}{\eta} \tag{5.21}$$

where $Q_{LO}$ is the quality factor of the LO input, $C_{sw}$ is the input capacitance of the switching devices, and $\eta$ is the efficiency of the LO buffer.
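To make the scaling in Eq. 5.21 concrete, the following minimal Python sketch evaluates the LO power for a baseline and two variations. The function name `lo_power` and all numerical values are hypothetical and chosen only for illustration; they are not taken from the fabricated design.

```python
import math

def lo_power(q_lo, c_sw, v_lo, f_lo, eta):
    """LO drive power per Eq. 5.21: (pi / Q_LO) * C_sw * V_LO^2 * f_LO / eta."""
    return (math.pi / q_lo) * c_sw * v_lo ** 2 * f_lo / eta

# Hypothetical operating point (illustrative numbers only).
params = dict(q_lo=5.0, c_sw=10e-15, v_lo=0.35, f_lo=60e9, eta=0.2)

base = lo_power(**params)
half_switch = lo_power(**{**params, "c_sw": params["c_sw"] / 2})   # smaller switches
double_swing = lo_power(**{**params, "v_lo": params["v_lo"] * 2})  # larger LO swing

print(f"baseline LO power   : {base * 1e3:.3f} mW")
print(f"half switch size    : {half_switch / base:.2f} x baseline")
print(f"double LO swing     : {double_swing / base:.2f} x baseline")
```

The sketch only reproduces the proportionalities of the formula: LO power is linear in switch capacitance and quadratic in LO swing, which is why downscaling the switches is attractive until the required overdrive starts to grow.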

On the switch side, to reduce the power, downscaling of device sizes is used. This downscaling is ultimately limited by the quality factor and self-resonant frequency (SRF) of the matching network on the LO side, since progressively smaller switch sizes require larger inductors for matching at the LO port. In addition, further reduction in switch size will result in a higher required overdrive and hence larger LO power for the same gain. To break this limiting tradeoff on device size and LO power, a single balanced mixer with  $10\mu$ m transistors was chosen. This choice will however lead to LO leakage into the mixer output. Therefore, a series LC LO trap has been implemented using a 2-turn inductor in order to reduce the LO feedthrough on the IF side by 25dB. The mixer requires -5dBm of LO power to provide 700mV of differential input swing at the switching transistor gates, which is provided by the mixer buffers in the LO distribution network.

Another effect of device and current scaling is the increase in the impedance of the internal mixer node. A series gain enhancement tuning network was therefore implemented with  $120\mu$ m of  $80\Omega$  Coplanar Waveguide (CPW) transmission line in order to prevent loss on the high-frequency signal [83]. Each I/Q mixer can provide a simulated voltage gain of 6dB, and the total measured power consumption of the quadrature mixer is 5mW (2.1mA from a 1.2V supply for each of the I and Q mixers). The mixer has a simulated input voltage at 1dB compression point and NF (referenced to 50 $\Omega$  input impedance) of 123mV and 9.5dB, respectively.
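As a rough sanity check on how the LNA and mixer figures combine, the sketch below applies the standard Friis cascade formula to the simulated values quoted above (LNA: 15dB power gain, 6dB NF; mixer: 9.5dB NF). It assumes that both noise figures are referenced to the same 50$\Omega$ source, that the 15dB figure can be treated as available power gain, and that baseband stages are negligible; the function names are illustrative.

```python
import math

def db_to_lin(x_db):
    return 10 ** (x_db / 10)

def cascaded_nf_db(stages):
    """Friis formula. `stages` is a list of (nf_db, power_gain_db) in signal order."""
    f_total = 1.0
    gain_so_far = 1.0
    for nf_db, gain_db in stages:
        f_total += (db_to_lin(nf_db) - 1.0) / gain_so_far
        gain_so_far *= db_to_lin(gain_db)
    return 10 * math.log10(f_total)

# Mixer gain does not affect the result because it is the last stage considered.
front_end_nf = cascaded_nf_db([(6.0, 15.0), (9.5, 0.0)])
print(f"estimated front-end NF: {front_end_nf:.2f} dB")
```

Under these assumptions the mixer adds only a few tenths of a dB to the LNA noise figure, which illustrates why the first-stage noise optimization dominates the receiver NF.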

### 5.6.3 Baseband Phase Rotator

The baseband phase shifter consists of a bank of current-summed stages whose polarities and input sources are controlled digitally. The relative gains of the I and Q channels are controlled in steps of 1/8. Combined with the polarity controls that set the phase quadrant, this leads to 11° of phase resolution. To reduce power consumption, a partial I/Q sharing structure [64] is implemented here, resulting in 12 cells in each phase shifter. The output currents from the different elements are summed in the current domain at the center of the chip and then converted to voltage through load resistors. The combined signal is then fed into a baseband buffer that drives the outputs off chip. Each phase rotator achieves a simulated bandwidth of 3.5GHz while consuming 3mW of power. The input-referred noise density and input voltage at the 1dB compression point of the phase rotator are $4nV/\sqrt{Hz}$ and 220mV, respectively.

Thanks to Lingkai Kong for his collaboration in design of the baseband phase rotator.



Figure 5.13: Schematic of the baseband phase rotator

# 5.7 Experimental Results

The transceiver was realized in a 65nm standard CMOS process with no special RF options. The die photo of the 4-element phased array transceiver including on-chip PLL and LO distribution is shown in Fig. 5.14. Due to the use of lumped-element based design, the area of each TX and RX element is  $0.3mm^2$  (TX)/ $0.416mm^2$  (RX) (including mixer buffers and hybrids), and the overall TRX occupies an area of 2.5mm by 3.5mm. All measurements were performed by direct on-chip probing of mm-wave signals in a chip-on-board assembly which allowed DC and IF signals to be bonded out to connectors on a PCB.

Because of the use of a transformer at the input of the RX, the matching is wideband and remains better than -10dB from 56 GHz to 65 GHz. Measured $S_{11}$ for all four elements is shown in Fig. 5.15.

The RX gain and bandwidth measurements were performed by using a 60GHz signal generator and a 25GHz spectrum analyzer. The overall bandwidth was measured by maintaining a fixed LO frequency of 61GHz and sweeping the RF frequency (Fig. 5.16). The overall gain and bandwidth of a single-element receiver were 24dB and 1.8GHz respectively (Fig. 5.17). In addition, the RF bandwidth was measured separately by sweeping the LO and RF signals together at a constant offset of 500MHz resulting in a constant 500MHz IF frequency (Fig. 5.16). As shown in Fig. 5.17, the RF 3-dB bandwidth is higher than 6.5GHz and is limited on the high side at 65GHz by the instrument capabilities (VNA). Given the expected high RF bandwidth and our simulations of the bandwidth of the baseband chain, we believe that the limitation on the overall bandwidth is most likely due to filtering from bond wires



Figure 5.14: Transceiver die photo (left side: RX, center: LO generation and distribution, right: TX).



Figure 5.15: Measured  $S_{11}$  of the 4-element receiver.



Figure 5.16: Single-element RX gain and BW measurement setup (a) Overall BW measurement (b) RF BW measurement

and PCB routing of the baseband signals. These limitations would of course be removed if the baseband were integrated onto the same die as the RF transceiver.

Measurement of a single-element receiver's NF vs. IF frequency is illustrated in Fig. 5.18. The NF measurement was performed using a 60GHz noise source and an Agilent N8974A noise figure meter. An average noise figure of 6.8dB was measured over 2GHz of IF bandwidth, with a minimum NF as low as 6.3dB.

The measured RX phase constellations of the four elements are shown in Fig. 5.19. In the RX, due to the ability to control the I and Q phase shifters independently, direct measurement of I and Q phase and amplitude is performed. The RX achieves $360^{\circ}$ of phase shifting range with worst-case phase steps of $11^{\circ}$. The gain variation across all phase settings and all elements is less than $\pm 0.5$dB.

The two-element phased array normalized gain is shown in Fig. 5.20 as a function of phase shifter rotation angle. This two-element measurement was performed by using dual GSG probes and off-chip power splitters allowing simultaneous probing of both elements. The phase setting of one element was held constant while the phase shift on the other element was swept over its entire range. Due to the high phase resolution and low gain mismatch, the measured peak-to-null ratio on RX is 29dB. This confirms that the on-chip isolation



Figure 5.17: Single-element RX measured gain and BW: Overall BW (red) and RF front end BW (black).



Figure 5.18: Measured receiver noise figure

between elements is sufficient to obtain a good peak-to-null ratio.
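The link between element mismatch and peak-to-null ratio can be illustrated with a simple two-element model. The sketch below is an idealized calculation, not the measurement procedure: it assumes one element has unit amplitude, the other has a gain mismatch and a residual phase error that is the same at the peak and null settings, and it ignores coupling between elements. The function name is hypothetical.

```python
import cmath
import math

def peak_to_null_db(gain_mismatch_db, phase_error_deg):
    """Peak-to-null ratio of two summed signals with amplitude and phase error."""
    a = 10 ** (gain_mismatch_db / 20)                      # relative amplitude
    e = a * cmath.exp(1j * math.radians(phase_error_deg))  # second element phasor
    peak = abs(1 + e)   # elements nominally in phase
    null = abs(1 - e)   # elements nominally 180 degrees apart
    if null == 0:
        return math.inf
    return 20 * math.log10(peak / null)

for mismatch_db, phase_deg in [(0.5, 0.0), (0.5, 2.0), (1.0, 0.0)]:
    ratio = peak_to_null_db(mismatch_db, phase_deg)
    print(f"{mismatch_db:.1f} dB, {phase_deg:.1f} deg -> {ratio:.1f} dB")
```

In this model the null depth is set almost entirely by the residual mismatch, so a deep measured null indicates both fine phase control and small gain error, in addition to sufficient isolation.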




Figure 5.19: Measured RX phase constellations for all four elements



Figure 5.20: Measured RX's two-element pattern

As a further characterization of the performance of the array, 4-element synthesized RX patterns were constructed for $30^{\circ}$ and $-45^{\circ}$ (Fig. 5.21). These patterns are based on measurements of the phase and gain characteristics of each of the four elements and assume a $\lambda/2$ uniform array. In such array patterns, the peak gain is not as sensitive as the nulls to the mismatch in gain and phase between elements.



Figure 5.21: Synthesized array patterns with array steered to: (a) $30^{\circ}$ (b) $-45^{\circ}$
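A pattern synthesis of this kind can be sketched as follows. This is a minimal illustration assuming ideal isotropic elements on a $\lambda/2$ uniform linear array and phase settings rounded to an 11.25° grid (5 bits); the measured per-element gain and phase data used in the thesis are not reproduced here, and the function names are hypothetical.

```python
import cmath
import math

def steering_phases_deg(steer_deg, n=4, step_deg=11.25, spacing_wl=0.5):
    """Per-element phase settings, rounded to the phase-shifter grid."""
    per_element = 360.0 * spacing_wl * math.sin(math.radians(steer_deg))
    ideal = [-i * per_element for i in range(n)]
    return [(round(p / step_deg) * step_deg) % 360.0 for p in ideal]

def array_factor_db(theta_deg, phases_deg, gains=None, spacing_wl=0.5):
    """Normalized array factor (dB) of a uniform linear array at angle theta."""
    gains = gains or [1.0] * len(phases_deg)
    kd = 2 * math.pi * spacing_wl
    total = sum(
        g * cmath.exp(1j * (i * kd * math.sin(math.radians(theta_deg))
                            + math.radians(p)))
        for i, (g, p) in enumerate(zip(gains, phases_deg))
    )
    return 20 * math.log10(max(abs(total) / sum(gains), 1e-12))

phases_30 = steering_phases_deg(30.0)
peak_angle = max(range(-90, 91), key=lambda t: array_factor_db(t, phases_30))
print("phase settings (deg):", phases_30)
print("pattern peak at (deg):", peak_angle)
```

Passing measured gains through the `gains` argument, or perturbing `phases_deg`, shows the behavior described above: the main-lobe level changes little, while the null depths degrade quickly.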

Table 5.2 shows the summary of the transceiver's performance.

| Parameter                    | Value                             |
|------------------------------|-----------------------------------|
| Technology                   | 65nm CMOS                         |
| Array size                   | 4                                 |
| RX Gain/El.                  | 24dB                              |
| RX NF/El.                    | 6.8dB                             |
| Phase Resolution (RX)        | 5 bits                            |
| Phase Resolution (TX)        | 6 bits                            |
| 3dB BW (RX)                  | $> \pm 1.8$GHz                    |
| 3dB Power BW (TX)            | 8GHz                              |
| $IP_{-1dB}$ @ RX Gain (/El.) | -29dBm                            |
| TX Output Power/El.          | -1.5dBm                           |
| Total Power                  | 137mW (TX/RX)                     |
| Synthesizer Power            | 29mW                              |

Table 5.2: Summary of the TRX performance

### 5.8 Hybrid Array

As discussed in section 5.3.1, the different architectures each have their own advantages and disadvantages. For small array sizes, IF phase shifting is a promising choice in terms of power and control over phase resolution. On the other hand, for large array sizes, the quadrature LO generation/distribution required by the IF phase shifting architecture would be area and power hungry. In the RF phase shifting architecture the LO distribution is much more power/area efficient, but the performance and null control are degraded by the phase shifters and combiners in the signal path. To get the benefits of both architectures, a combined RF-IF architecture would be beneficial in terms of performance and area/power consumption. Figure 5.22 shows an example of an RF-IF hybrid architecture for a 16-element array. Here, each group of four elements is phase shifted and combined in the RF domain. The four combined signals are then phase shifted and combined in the IF domain. Another extension of this hybrid architecture is an array-of-sub-arrays topology in which each sub-array is connected to an independent digital baseband chain. In this system, digital beamforming can be performed on a subset of the receiver paths. This array-of-sub-arrays topology can be scaled up to address large-scale arrays with multiple beams.

Low loss, linear and compact phase shifters compatible with nanometer digital CMOS processes are essential in realizing these large-scale mm-wave hybrid phased array systems. Receiver noise figure/linearity as well as transmitter efficiency/ output power are affected by the performance of the phase shifter. We discuss mm-wave phase shifting in the next chapter.



Figure 5.22: Block diagram of a 16-element RF-IF hybrid array

# Chapter 6

# 60GHz Low-Loss Compact Phase Shifters

As discussed in the previous chapter, to enable a large array size we need a hybrid RF-IF phase shifting architecture. Low-loss, linear, and compact phase shifters compatible with nanometer digital CMOS processes are essential in realizing large-scale mm-wave phased array systems. Receiver noise figure/linearity as well as transmitter efficiency/output power are affected by the performance of the phase shifter. This chapter discusses the concepts and challenges of RF phase shifter design, as well as two design examples of low-loss, compact RF phase shifters at 60GHz.

### 6.1 RF Phase Shifters

RF phase shifters can be implemented either as active or passive circuits. The advantage of the active phase shifters is their lower loss and as a result better noise performance. On the other hand, active phase shifters usually have large power consumption and lower linearity compared to their passive counterparts.

Active phase shifters can be implemented as vector modulator-based circuits. Here, the RF path is divided into I and Q paths, which are weighted using two variable gain amplifiers and summed at the end (Fig. 6.1). The phase shift at the output is given as [75]

$$\theta_{out} = tan^{-1} \left(\frac{A_Q}{A_I}\right) \tag{6.1}$$
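The vector-sum relation in Eq. 6.1 can be illustrated with a short sketch. It assumes ideal, perfectly orthogonal I and Q paths and signed weights, so that `atan2` extends the arctangent to all four quadrants; the function names and the constant-magnitude weighting are illustrative choices, not details from the thesis.

```python
import math

def vm_weights(theta_deg):
    """I/Q weights (A_I, A_Q) giving unit output magnitude at the target phase."""
    t = math.radians(theta_deg)
    return math.cos(t), math.sin(t)

def vm_phase_deg(a_i, a_q):
    """Output phase per Eq. 6.1, using atan2 to resolve the quadrant."""
    return math.degrees(math.atan2(a_q, a_i)) % 360.0

for target in (0, 45, 135, 225, 315):
    a_i, a_q = vm_weights(target)
    print(f"target {target:3d} deg: A_I={a_i:+.3f}, A_Q={a_q:+.3f}, "
          f"phase={vm_phase_deg(a_i, a_q):.1f} deg")
```

Any amplitude or quadrature error between the two paths shows up directly as a phase error at the output, which is one reason active vector modulators need accurate I/Q generation.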

The most common passive RF phase shifters are varactor-loaded transmission line and reflective type phase shifters [84] [85] [86].

A varactor-loaded transmission line phase shifter consists of multiple $\Pi$ sections of transmission line, as shown in Fig. 6.2(a). A lumped model of the transmission line using inductors



Figure 6.1: Block diagram of vector modulator-based phase shifter



Figure 6.2: Schematic of varactor loaded transmission line phase shifter (a) Distributed model (b) Lumped model

can be used for implementation (Fig. 6.2(b)). The area of this architecture is large due to the number of inductors (transmission-line sections) required for the phase resolution. Also, since the characteristic impedance of the line depends on the phase shifter state $(Z_{\circ} = \sqrt{\frac{L}{C}})$, keeping a reasonable input match at high phase resolution is difficult. Here, the phase shift per stage is given by

$$\theta_{1stage} = \omega \sqrt{LC} \tag{6.2}$$
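The coupling between phase shift and matching can be seen by evaluating Eq. 6.2 together with $Z_{\circ} = \sqrt{L/C}$ over a varactor range. The sketch below uses hypothetical component values (100pH per section, 30fF to 60fF of shunt capacitance) and assumes the section is operated well below its cutoff frequency, so that the lumped approximation holds.

```python
import math

def tl_section(l_h, c_f, f_hz=60e9):
    """Phase delay (deg) and characteristic impedance (ohm) of one LC section."""
    omega = 2 * math.pi * f_hz
    theta_deg = math.degrees(omega * math.sqrt(l_h * c_f))
    z0 = math.sqrt(l_h / c_f)
    return theta_deg, z0

L_SECTION = 100e-12            # hypothetical series inductance per section
for c_f in (30e-15, 45e-15, 60e-15):
    theta, z0 = tl_section(L_SECTION, c_f)
    print(f"C = {c_f * 1e15:4.0f} fF: phase = {theta:5.1f} deg, Z0 = {z0:5.1f} ohm")
```

Because the phase delay and the impedance both depend on $\sqrt{C}$, any capacitance change that produces phase shift also moves the line impedance away from the system impedance, which is the matching difficulty noted above.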

Reflective type phase shifters (RTPS) use a 90° hybrid and a variable complex load to obtain



Figure 6.3: Block diagram of reflective type phase shifter

the phase shift. Fig. 6.3 shows an RTPS structure with a variable load. The next section describes the RTPS in detail.

### 6.2 Reflective Type Phase Shifter

Since the RF phase shifter resides in the signal path, loss and linearity considerations are extremely important. Noise figure and dynamic range of the receiver front-end are affected by phase shifter performance. On the transmitter, phase shifter losses play an important role in determining the chain efficiency. In addition, we aim to reduce the footprint to allow a scalable solution for large arrays. The total required phase shift is only 180° since phase inversion in the chain can provide another 180° to obtain full 360° range.

Reflective type phase shifters (RTPS) use a 90° hybrid and a variable complex load to obtain the targeted phase shift. The signal is fed to port 1 of the hybrid. The components reflected from the identical reactive loads on ports 2 and 3 are phase shifted by $\theta$. The returned signals at the input (port 1) are out of phase and cancel, while the signals at the output (port 4) combine in phase. The final signal has a total phase shift of $\Delta \theta$ as shown in Fig. 6.3. The output phase shift can be computed as

$$\theta_{out} = -\frac{\pi}{2} - 2\tan^{-1}\left(\frac{X}{Z_{\circ}}\right) \tag{6.3}$$

and

$$\Delta \theta_{out} = 2\left[\tan^{-1}\left(\frac{X_{max}}{Z_{\circ}}\right) - \tan^{-1}\left(\frac{X_{min}}{Z_{\circ}}\right)\right] \tag{6.4}$$

where X is the reactance of the variable load and  $Z_{\circ}$  is the hybrid characteristic impedance.
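Equation 6.4 can be evaluated for a plain varactor load, for which $X = -1/(\omega C)$. The sketch below assumes an ideal lossless varactor, a 50$\Omega$ hybrid at 60GHz, and a hypothetical minimum capacitance of 50fF; the function name is illustrative.

```python
import math

def varactor_phase_range_deg(c_min, tuning_range, z0=50.0, f_hz=60e9):
    """Phase range from Eq. 6.4 for a load tunable from c_min to tuning_range*c_min."""
    omega = 2 * math.pi * f_hz
    x_min = -1.0 / (omega * c_min)                  # most negative reactance
    x_max = -1.0 / (omega * c_min * tuning_range)   # least negative reactance
    return math.degrees(2 * (math.atan(x_max / z0) - math.atan(x_min / z0)))

for tr in (2, 3, 5, 1e6):
    print(f"TR = {tr:>9}: range = {varactor_phase_range_deg(50e-15, tr):6.1f} deg")
```

Even with an extremely large tuning range the result stays below 180°, because each arctangent term is bounded by 90° and the lower limit is only reached as the capacitance approaches zero.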

The amount of phase shift obtained from RTPS depends on the design of the reflective load. The simplest kind of load is a single varactor. Based on Eq.(6.4), to get 180° phase



Figure 6.4: Schematics of different reflective loads (a) Varactor load (b) SRL (c) RLT

shift, the varactor tuning range would have to be infinite. To increase the phase shift range of the varactor, a series inductance can be added to form a resonant circuit (Single Resonated Load or SRL). To further increase the phase shift at the output, a parallel capacitor can be included at the load to add a pole to the impedance transfer function (Resonant Load with L-match Impedance Transformation or RLT) [87] [88]. Figure 6.4 shows different reflective loads for the RTPS. Here, to obtain the full 180° phase shift, two different loads are exploited: a capacitive load and an RLT load.

### 6.3 Hybrid Design

A key component in the design of the RTPS structure is the hybrid. Loss, bandwidth, and area are the three main parameters dictating the hybrid performance. Previous designs have reduced hybrid size by use of coupled-line hybrids and capacitive loading or slow-wave effects [64] [89]. However, even with these techniques, transmission line hybrids require a large area. As an example, in [64] capacitive loading and asymmetric coplanar waveguides are used, but the final area is still not suitable for a large-scale phased array system ($280\mu m \times 370\mu m$). In addition, in many of these hybrids the area-saving techniques result in higher insertion loss and phase/gain imbalance [64].

To reduce the area of the hybrid, a lumped-element hybrid with inductive and capacitive coupling has been used (Fig. 6.5). The hybrid includes a transformer with coupling factor $k$, mutual capacitors $C_1$ and $C_2$, and a phase balancing capacitor $C_3$. To design the hybrid's parameters, we take advantage of the electric and magnetic walls that arise from the symmetries of the structure under specific excitations [90] [91]. With these even- and odd-mode excitations of the ports, the scattering parameters of the hybrid can be found as a superposition of the even/odd mode reflection coefficients as below

$$S_{11} = \frac{1}{4} (\Gamma_{ee} + \Gamma_{eo} + \Gamma_{oe} + \Gamma_{oo}) \tag{6.5}$$



Figure 6.5: Circuit schematic of the transformer-based hybrid.

$$S_{21} = \frac{1}{4} (\Gamma_{ee} - \Gamma_{eo} + \Gamma_{oe} - \Gamma_{oo}) \tag{6.6}$$

$$S_{31} = \frac{1}{4} (\Gamma_{ee} + \Gamma_{eo} - \Gamma_{oe} - \Gamma_{oo}) \tag{6.7}$$

$$S_{41} = \frac{1}{4} (\Gamma_{ee} - \Gamma_{eo} - \Gamma_{oe} + \Gamma_{oo}) \tag{6.8}$$

where $\Gamma_{ee}$ is the reflection coefficient when both $P_1$ and $P_2$ are magnetic walls, $\Gamma_{eo}$ is the reflection coefficient when $P_1$ is a magnetic wall and $P_2$ is an electric wall, $\Gamma_{oe}$ is the reflection coefficient when $P_1$ is an electric wall and $P_2$ is a magnetic wall, and $\Gamma_{oo}$ is the reflection coefficient when both $P_1$ and $P_2$ are electric walls (Fig. 6.6).

The reflection coefficient can be calculated by using the terminating port admittance and the odd/even mode's input admittance. The reflection coefficient and input admittances can be found as [91]

$$\Gamma_{ij} = \frac{Y_0 - Y_{ij}}{Y_0 + Y_{ij}}$$
(6.9)

and

$$Y_{ee} = j\omega_0 C_2 \tag{6.10}$$



Figure 6.6: Equivalent circuit for even/odd mode excitation (a)  $\Gamma_{ee}$  (b)  $\Gamma_{eo}$  (c)  $\Gamma_{oe}$ (d)  $\Gamma_{oo}$ .

$$Y_{eo} = j\omega_0(2C_3 + C_2) + \frac{2}{R + j\omega_0 L(1+k)}$$
(6.11)

$$Y_{oe} = j\omega_0 (C_2 + 2C_1) \tag{6.12}$$

$$Y_{oo} = j\omega_0(2C_3 + 2C_1 + C_2) + \frac{2}{R + j\omega_0 L(1 - k)}$$
(6.13)

Now, for the lossless coupler (R = 0), the following conditions should be satisfied

$$S_{11}(\omega_0) = S_{41}(\omega_0) = 0 \tag{6.14}$$

and

$$\frac{S_{31}(\omega_0)}{S_{21}(\omega_0)} = j \frac{\alpha}{\sqrt{1 - (\alpha)^2}}$$
(6.15)

By using the above equations, the values of L, $C_1$, and $C_2$ can be found as a function of the transformer coupling factor (k) and the phase balancing capacitor $(C_3)$ [90]. Figure 6.7 shows the values of L, $C_1$, and $C_2$ as a function of the transformer coupling factor for different values of $C_3$ for a 3-dB coupler ($\alpha = 0.707$). For a fixed transformer coupling factor, the required transformer size, and therefore the hybrid area, decreases as the phase correcting capacitor is increased. A reasonable minimum limit on transformer size has to be selected to ensure functionality even with the addition of parasitic routing inductances. With these factors in mind, inductors in the range of 70pH to 90pH are reasonable, which dictates phase correcting capacitors in the range of 20-30fF. The transformer coupling factor presents a trade-off between the hybrid's phase and amplitude imbalance: increasing the coupling factor reduces the phase imbalance but increases the amplitude imbalance.
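The even/odd-mode analysis lends itself to a direct numerical check. The sketch below evaluates the mode admittances of Eqs. 6.10-6.13 and the reflection coefficients of Eq. 6.9, then combines them into the four scattering parameters. $S_{11}$, $S_{21}$, and $S_{31}$ use the sign patterns of Eqs. 6.5-6.7; $S_{41}$ is taken as the remaining mutually orthogonal combination $(+,-,-,+)$, which is an assumption of this sketch. The component values are hypothetical and are not tuned to satisfy the 3-dB coupler conditions.

```python
import math

def hybrid_s_params(l_h, k, c1, c2, c3, f_hz=60e9, z0=50.0, r=0.0):
    """Return (S11, S21, S31, S41) from even/odd-mode reflection coefficients."""
    w = 2 * math.pi * f_hz
    y0 = 1.0 / z0
    y_ee = 1j * w * c2
    y_eo = 1j * w * (2 * c3 + c2) + 2 / (r + 1j * w * l_h * (1 + k))
    y_oe = 1j * w * (c2 + 2 * c1)
    y_oo = 1j * w * (2 * c3 + 2 * c1 + c2) + 2 / (r + 1j * w * l_h * (1 - k))
    g_ee, g_eo, g_oe, g_oo = [(y0 - y) / (y0 + y) for y in (y_ee, y_eo, y_oe, y_oo)]
    s11 = (g_ee + g_eo + g_oe + g_oo) / 4
    s21 = (g_ee - g_eo + g_oe - g_oo) / 4
    s31 = (g_ee + g_eo - g_oe - g_oo) / 4
    s41 = (g_ee - g_eo - g_oe + g_oo) / 4   # assumed fourth sign pattern
    return s11, s21, s31, s41

# Hypothetical values inside the ranges discussed in the text.
s_params = hybrid_s_params(l_h=80e-12, k=0.7, c1=20e-15, c2=20e-15, c3=25e-15)
for name, s in zip(("S11", "S21", "S31", "S41"), s_params):
    print(f"{name}: {20 * math.log10(abs(s)):7.2f} dB, {math.degrees(math.atan2(s.imag, s.real)):7.1f} deg")
power_sum = sum(abs(s) ** 2 for s in s_params)
print(f"sum of |S|^2 (lossless case): {power_sum:.6f}")
```

With R = 0 every mode admittance is purely reactive, so each reflection coefficient has unit magnitude and the four output powers sum to one. Sweeping `l_h`, `c1`, and `c2` until $S_{11}$ and $S_{41}$ vanish is one way to reproduce design curves of the kind shown in Fig. 6.7.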

### 6.4 Cascade RTPS

It was previously shown that in order for a capacitive load to provide the full 180° phase shift, the capacitance has to be tunable from zero to infinity. Obviously this is not realizable with on-chip varactors. The phase shift range of a one-stage RTPS with a capacitive reflective load is typically less than 40° due to limitations in the tuning range of CMOS varactors [88]. By appropriate sizing of the varactors, an acceptable compromise between tuning range and losses can be obtained. A 180° phase shift is achieved by cascading RTPS stages with capacitive reflective loads, as shown in Fig. 6.8.

The loss and phase shift of a cascade RTPS can be calculated as

$$IL_{cascade} = n(2IL_H + 20\log\sqrt{\frac{1 + [(R_v - Z_0)C_v\omega]^2}{1 + [(R_v + Z_0)C_v\omega]^2}})$$
(6.16)



Figure 6.7: Simulation results for L,  $C_1$ , and  $C_2$  as a function of  $C_3$  and transformer coupling factor (k)

$$\phi = n\left(-\frac{\pi}{2} - \tan^{-1}\frac{1}{(R_v - Z_0)C_v\omega} + \tan^{-1}\frac{1}{(R_v + Z_0)C_v\omega}\right) \tag{6.17}$$



Figure 6.8: Circuit schematics of the cascaded RTPS

where n is the number of cascaded stages, $R_v$ is the series loss of the varactor, and $IL_H$ is the hybrid insertion loss.

By using Eq. 6.17, we can derive the total phase shift as a function of the number of cascade stages, the minimum capacitor value, and the tuning range (TR) of the varactor.

$$\Delta \phi = 2n\tan^{-1} \left( \frac{Z_{\circ} \omega C_{min} (1 - TR)}{1 + (Z_{\circ} \omega C_{min})^2 TR} \right) \tag{6.18}$$

In Fig. 6.9, the required tuning range of the varactors for $180^{\circ}$ phase shift is shown as a function of the minimum capacitor value for different numbers of cascade stages. Increasing the number of stages reduces the required varactor tuning range, so a varactor with a higher quality factor can be used. Although this reduces the loss per stage, the overhead loss due to the additional stages increases, and hence an optimum exists for the number of stages (Fig. 6.10).
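The trade-off between stage count and varactor tuning range can be reproduced by inverting Eq. 6.18 for TR. The sketch below works with the magnitude of the phase shift (the sign in Eq. 6.18 only reflects the direction of tuning), assumes lossless varactors and a 50$\Omega$ hybrid at 60GHz, and uses a hypothetical $C_{min}$ of 50fF. Writing $a = Z_{\circ}\omega C_{min}$ and $t = \tan(\Delta\phi / 2n)$, solving $t = a(TR-1)/(1 + a^2 TR)$ gives $TR = (t + a)/(a(1 - ta))$.

```python
import math

def total_phase_shift_deg(n_stages, c_min, tuning_range, z0=50.0, f_hz=60e9):
    """Magnitude of Eq. 6.18 in degrees."""
    a = z0 * 2 * math.pi * f_hz * c_min
    per_stage = 2 * math.atan(a * (tuning_range - 1) / (1 + a * a * tuning_range))
    return math.degrees(n_stages * per_stage)

def required_tuning_range(n_stages, c_min, z0=50.0, f_hz=60e9, total_deg=180.0):
    """Varactor tuning range needed for total_deg; inf if it cannot be reached."""
    a = z0 * 2 * math.pi * f_hz * c_min
    t = math.tan(math.radians(total_deg) / (2 * n_stages))
    if 1 - t * a <= 0:
        return math.inf
    return (t + a) / (a * (1 - t * a))

C_MIN = 50e-15  # hypothetical minimum varactor capacitance
for n in (1, 2, 3, 4):
    print(f"n = {n}: required TR = {required_tuning_range(n, C_MIN):.2f}")
```

For a single stage the required tuning range is unbounded, consistent with the earlier observation that one capacitive load cannot reach 180°, and the requirement relaxes quickly as stages are added.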

### 6.5 RLT Reflective-Type Phase Shifter

In RLT loads, the phase shift of a single stage is increased by adding a pole/zero to the reflective load (Fig. 6.11). This pole/zero combination increases the achievable phase shift at the cost of reduced phase linearity across frequency. Additionally, inductors have a larger footprint, and even though this structure uses only a single hybrid, the total area is still larger than that of the cascade RTPS. However, since the number of hybrids and varactors is reduced, a lower insertion loss can be achieved. The phase shift of the RLT load can be calculated as

$$\phi = -\frac{\pi}{2} - 2\tan^{-1} \left(\frac{LC_v \omega^2 - 1}{Z_{\circ}((C_v + C_T)\omega - LC_v C_T \omega^3)}\right) \tag{6.19}$$

To design the RLT load, the inductor is chosen to resonate with $C_{min}$. Adding the capacitor $C_T$ then forms an L-match, which increases the phase shift and reduces the loss for a



Figure 6.9: Required varactor tuning range (TR) as a function of minimum capacitance for different number of cascade stages.



Figure 6.10: Total and per stage loss for different number of cascade stages

fixed varactor tuning range. The phase shift of an RTPS with an RLT load as a function of $C_v$ for different values of $C_T$ is shown in Fig. 6.12(a). As shown in the figure, for a fixed varactor range, the total phase shift increases with increasing $C_T$. In order to achieve a 180° phase shift, the following component values can be used [88].

$$L = \frac{1}{\omega^2 C_{min}} \tag{6.20}$$

$$C_T = \frac{C_{max}C_{min}}{C_{max} - C_{min}} \tag{6.21}$$

Total phase shift as a function of $C_T$ for $C_{min} = 48fF$ and TR = 3.3 is shown in Fig. 6.12(b). The value of $C_T$ required for 180° phase shift is 68fF, which matches the calculations. A TR of 3.3 is realistic with available CMOS varactors.
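The design equations 6.20 and 6.21 can be checked numerically for the quoted operating point ($C_{min} = 48fF$, TR = 3.3, 60GHz). The sketch below also evaluates the load phase, assuming the RLT load is a series $L$-$C_v$ branch in parallel with $C_T$, so that its reactance is $X = (LC_v\omega^2 - 1)/((C_v + C_T)\omega - LC_vC_T\omega^3)$, and assuming a lossless load and a 50$\Omega$ hybrid. Function names are illustrative.

```python
import math

def rlt_components(c_min, tuning_range, f_hz=60e9):
    """Inductor (Eq. 6.20) and L-match capacitor (Eq. 6.21) for 180 deg of range."""
    w = 2 * math.pi * f_hz
    c_max = tuning_range * c_min
    l_h = 1.0 / (w ** 2 * c_min)
    c_t = c_max * c_min / (c_max - c_min)
    return l_h, c_t

def rlt_phase_deg(c_v, l_h, c_t, z0=50.0, f_hz=60e9):
    """RTPS output phase for the assumed series L-C_v branch in parallel with C_T."""
    w = 2 * math.pi * f_hz
    num = l_h * c_v * w ** 2 - 1
    den = z0 * ((c_v + c_t) * w - l_h * c_v * c_t * w ** 3)
    # atan2 equals atan(num/den) for den > 0 and stays finite as den approaches 0.
    return math.degrees(-math.pi / 2 - 2 * math.atan2(num, den))

C_MIN, TR = 48e-15, 3.3
l_h, c_t = rlt_components(C_MIN, TR)
span = rlt_phase_deg(C_MIN, l_h, c_t) - rlt_phase_deg(TR * C_MIN, l_h, c_t)
print(f"L   = {l_h * 1e12:.1f} pH")
print(f"C_T = {c_t * 1e15:.1f} fF")
print(f"phase span over the varactor range = {span:.1f} deg")
```

The computed $C_T$ is close to the 68fF quoted above, and the phase span over the full varactor range comes out at 180° because the load reactance goes from zero at $C_{min}$ to a pole at $C_{max}$.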



Figure 6.11: Circuit schematics of the RLT RTPS

# 6.6 Experimental Results

Two RTPSs based on the transformer hybrid, one with capacitive reflective loads and one with RLT reflective loads, were designed and fabricated in 65nm digital CMOS technology with no RF or ultra-thick metal options. The micrograph of the 60GHz phase shifters is shown in Fig. 6.13. On-wafer measurements were performed up to 65GHz. The pads are de-embedded using two methods, OPEN/SHORT and THRU, with the results being in agreement in the frequency range of interest.

The cascade RTPS is integrated in an area of $140\mu$m by $220\mu$m, which is only slightly larger than a DC pad. In summary, the cascade RTPS achieves an insertion loss of 5-8.3dB while maintaining an $S_{11}$ better than -11dB at 60GHz for all phase shifts. This topology covers the entire quadrant and the total measured phase shift is larger than 180°. Figure 6.14 shows the measurement results for the cascade RTPS. Figures 6.14(a) and (b) demonstrate the phase shift with control voltage and also across the 55-65 GHz frequency range. The measured phase shift remains linear in the frequency range of interest and demonstrates access to the entire 180° range. Figures 6.14(c) and (d) show the phase shifter losses across the frequency span of interest and with varying control voltage. Programming for larger phase shifts incurs higher losses, up to 8.3 dB at the center frequency of 60 GHz. The return loss is plotted in Fig. 6.14(e) and (f) for various control voltages as well as across the 55-65 GHz frequency range. $S_{11}$ remains better than -9dB from 55-65GHz and for the entire programming range.



Figure 6.12: (a) Phase shift as a function of $C_v$ for different values of $C_T$ (b) Phase shift as a function of $C_T$ for $C_{min} = 48fF$ and TR = 3.3

The RLT RTPS is integrated in an area of $140\mu$m by $340\mu$m. This phase shifter achieves an insertion loss of 3.3-5.7dB while maintaining an $S_{11}$ better than -15dB at 60GHz for all phase shifts. The total achievable phase shift is 147°. Figures 6.15(a) and (b) demonstrate the phase shift with control voltage and also across the 55-65 GHz frequency range. Figures 6.15(c) and (d) show the phase shifter losses across the frequency span of interest and with varying control voltage. Worst-case 60 GHz losses occur at control voltages in the range of -0.2V to -0.1V and are below 5.7dB across the range. The return loss is plotted in Fig. 6.15(e) and (f) for various control voltages as well as across the 55-65 GHz frequency range. $S_{11}$ remains better than -13dB from 55-65GHz and for the entire programming range.

| Type of Phase Shifter    | Cascaded RTPS | RLT RTPS      |
|--------------------------|---------------|---------------|
| Frequency (GHz)          | 60            | 60            |
| Phase Shift              | 180°          | 147°          |
| Insertion Loss (dB)      | 5-8.3         | 3.3-5.7       |
| $S_{11}$ (dB) (55-65GHz) | < -9          | < -13         |
| Area $(mm^2)$            | 0.031         | 0.048         |

Table 6.1: Summary of measured chip performance

Table 6.1 summarizes the measurements of these two phase shifters.



Figure 6.13: Chip micrographs (Top: RLT, Bottom: Cascade)



Figure 6.14: Measured cascaded RTPS performance (a) phase shift vs. control voltage, (b) phase shift vs. frequency, (c) Loss vs. control voltage, (d) Loss vs. frequency, (e)  $S_{11}$  vs. control voltage, and (f)  $S_{11}$  vs. frequency.



Figure 6.15: Measured RLT RTPS performance (a) phase shift vs. control voltage, (b) phase shift vs. frequency, (c) Loss vs. control voltage, (d) Loss vs. frequency, (e)  $S_{11}$  vs. control voltage, and (f)  $S_{11}$  vs. frequency.

# Chapter 7

# Conclusion

This thesis outlined the design of next-generation miniaturized and pad-less radio systems. We first presented the design and implementation of a passive mm-wave grain-sized radio in CMOS technology. To enable this design, several novel architectural, system, and circuit level solutions were proposed and the results experimentally verified. In the second step, the design of silicon-based energy-efficient mm-wave phased array systems for mobile devices was demonstrated. The mm-wave beamforming solutions have applications in interfacing with the proposed miniaturized radios as well as in low-power Gb/s wireless links.

A radically different architectural design of highly miniaturized passive radios was presented. By utilizing mm-wave frequencies in the design of passive radios in dimension-constrained settings, the size and range-to-size metrics were significantly improved. Shorter wavelengths allow electronic beamforming on the central base unit (e.g., mobile cell phone), assisting in direction finding and multi-access. The dual frequency approach enabled efficient power delivery, high data rate, and ranging/localization capabilities for the radio. A multi-access algorithm and system design was presented to address highly asymmetric networks and to support communication with a large number of passive radio nodes. New timing and modulation schemes were designed to facilitate aggressive scaling of power and dimensions. The final radio chip consumed 1.5 $\mu$W of standby power and occupied a total area of 4.4 $mm^2$. The radio is fully integrated in a single chip with no external components or pads.

Millimeter-wave phased-array systems that could act as the central unit (reader) for the proposed IoT radio were also discussed. Energy efficiency and scalability of the mm-wave CMOS arrays were the primary concerns. The design and measurements of an efficient 60 GHz phased array in 65nm CMOS technology were presented. The receiver consumed less than 34 mW per element including LO synthesis and distribution. Using a lumped-element design strategy, the overall active area of the array was 3.5 mm by 2.5 mm for the entire TRX array (4 TX elements and 4 RX elements). The LO distribution was enabled by a new design for a compact and low-loss 60 GHz hybrid.

To support larger mm-wave arrays (required to enable larger apertures and higher array gains), RF-baseband hybrid beamforming was proposed. Two lumped-element reflective-type mm-wave phase shifter designs were presented, with measurements showing 5-7 dB loss across the band and phase shifters that approach the size of a DC pad. The low loss and compact nature of the proposed phase shifters play an important role in enabling larger hybrid phased arrays.

Possible future directions include efforts to further enhance the passive radio's efficiency and to extend its range while shrinking its overall dimensions. Demonstrating a system with a larger number of nodes, in environments that closely resemble application scenarios, is another important direction. Multi-band, scalable, and energy-efficient hybrid mm-wave phased arrays are required to support larger networks of nodes covering greater volumes of space. Other forms of energy scavenging can be combined with RF to further enhance the range and functionality of the passive radio. Scaling to finer technology nodes (and perhaps ones that are customized for low power and energy recovery) can help these efforts.

# Bibliography

- [1] CISCO, "Cisco visual networking index: Global mobile data traffic forecast update, 2013–2018." http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white\_paper\_c11-520862.html, 2014.
- [2] D. Evans, "The internet of things: how the next evolution of the internet is changing everything," *CISCO white paper*, vol. 1, 2011.
- [3] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, "Internet of things (iot): A vision, architectural elements, and future directions," *Future Generation Computer Systems*, vol. 29, no. 7, pp. 1645–1660, 2013.
- [4] L. Atzori, A. Iera, and G. Morabito, "The internet of things: A survey," Computer networks, vol. 54, no. 15, pp. 2787–2805, 2010.
- [5] ERICSSON, "Annual mobility report." www.ericsson.com/annualreport2011, 2011.
- [6] M. Murphy and M. Meeker, "Top mobile internet trends," *KPCB Relationship Capital*, 2011.
- [7] MarketsandMarkets, "Millimeter wave technology market by components (sensors, radio, networking), products (mm scanners, mm radars, mm small cells), applications (communications, aerospace, healthcare, automotive, industrial) analysis & forecast to 2020." http://www.marketsandmarkets.com/Market-Reports/millimeter-wave-technology-market-981.html, 2014.
- [8] R. C. Daniels, J. N. Murdock, T. S. Rappaport, and R. W. Heath, "60 ghz wireless: Up close and personal," *Microwave Magazine, IEEE*, vol. 11, no. 7, pp. 44–50, 2010.
- [9] E. Perahia, C. Cordeiro, M. Park, and L. L. Yang, "Ieee 802.11 ad: Defining the next generation multi-gbps wi-fi," in *Consumer Communications and Networking Conference* (CCNC), 2010 7th IEEE, pp. 1–5, IEEE, 2010.
- [10] A. M. Niknejad, "Siliconization of 60 ghz," Microwave Magazine, IEEE, vol. 11, no. 1, pp. 78–85, 2010.

- [11] M. Weiser, "The computer for the 21st century," Scientific american, vol. 265, no. 3, pp. 94–104, 1991.
- [12] J. Sarasohn-Kahn, "The connected patient: Charting the vital signs of remote health monitoring," *California Healthcare Foundation*, 2011.
- [13] J. Bryzek, "Trillion sensors summit." http://www.tsensorssummit.org/Resources/Why%20TSensors%20Roadmap.pdf, 2013.
- [14] J. Manyika, M. Chui, J. Bughin, R. Dobbs, P. Bisson, and A. Marrs, Disruptive technologies: Advances that will transform life, business, and the global economy, vol. 180. McKinsey Global Institute San Francisco, CA, 2013.
- [15] A. Regalado, P. Fairley, D. Talbot, R. Metz, and T. Simonite, "The internet of things," Business Report, 2014.
- [16] J. Bradley, J. Barbier, and D. Handler, "Embracing the internet of everything to capture your share of \$14.4 trillion," CISCO white paper, 2013.
- [17] M. Usami, "An ultra-small rfid chip: μ-chip," in Advanced System Integrated Circuits 2004. Proceedings of 2004 IEEE Asia-Pacific Conference on, pp. 2–5, IEEE, 2004.
- [18] Y.-H. Chee, M. Koplow, M. Mark, N. Pletcher, M. Seeman, F. Burghardt, D. Steingart, J. Rabaey, P. Wright, and S. Sanders, "Picocube: a 1 cm<sup>3</sup> sensor node powered by harvested energy," in *Proceedings of the 45th annual Design Automation Conference*, pp. 114–119, ACM, 2008.
- [19] L. Guo, A. Popov, H. Li, Y. Wang, V. Bliznetsov, G. Lo, N. Balasubramanian, and D.-L. Kwong, "A small oca on a 1 × 0.5-mm<sup>2</sup> 2.45-ghz rfid tag-design and integration based on a cmos-compatible manufacturing technology," *Electron Device Letters, IEEE*, vol. 27, no. 2, pp. 96–98, 2006.
- [20] X. Chen, W. G. Yeoh, Y. B. Choi, H. Li, and R. Singh, "A 2.45-ghz near-field rfid system with passive on-chip antenna tags," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 56, no. 6, pp. 1397–1404, 2008.
- [21] Y. H. Chee, A. M. Niknejad, and J. Rabaey, "A 46% efficient 0.8 dbm transmitter for wireless sensor networks," in VLSI Circuits, 2006. Digest of Technical Papers. 2006 Symposium on, pp. 43–44, IEEE, 2006.
- [22] G. Chen, S. Hanson, D. Blaauw, and D. Sylvester, "Circuit design advances for wireless sensing applications," *Proceedings of the IEEE*, vol. 98, no. 11, pp. 1808–1827, 2010.
- [23] Y. Lee, S. Bang, I. Lee, Y. Kim, G. Kim, M. H. Ghaed, P. Pannuto, P. Dutta, D. Sylvester, and D. Blaauw, "A modular 1 mm<sup>3</sup> die-stacked sensing platform with low power I<sup>2</sup>C inter-die communication and multi-modal energy harvesting," *Solid-State Circuits, IEEE Journal of*, vol. 48, no. 1, pp. 229–243, 2013.

- [24] C. A. Balanis, Antenna theory: analysis and design. John Wiley & Sons, 2012.
- [25] H. A. Wheeler, "Fundamental limitations of small antennas," Proceedings of the IRE, vol. 35, no. 12, pp. 1479–1484, 1947.
- [26] H. A. Wheeler, "Small antennas," Antennas and Propagation, IEEE Transactions on, vol. 23, no. 4, pp. 462–469, 1975.
- [27] J. S. McLean, "A re-examination of the fundamental limits on the radiation q of electrically small antennas," *Antennas and Propagation, IEEE Transactions on*, vol. 44, no. 5, p. 672, 1996.
- [28] R. C. Hansen, Electrically small, superdirective, and superconducting antennas. John Wiley & Sons, 2006.
- [29] R. C. Hansen, "Fundamental limitations in antennas," Proceedings of the IEEE, vol. 69, no. 2, pp. 170–182, 1981.
- [30] L. J. Chu, "Physical limitations of omni-directional antennas," Journal of applied physics, vol. 19, no. 12, pp. 1163–1175, 1948.
- [31] J. L. Volakis, C.-C. Chen, and K. Fujimoto, Small antennas: miniaturization techniques & applications. McGraw-Hill New York, NY, 2010.
- [32] E. Gilbert, "Impedance matching with lossy components," Circuits and Systems, IEEE Transactions on, vol. 22, no. 2, pp. 96–100, 1975.
- [33] Y. Han and D. J. Perreault, "Analysis and design of high efficiency matching networks," *Power Electronics, IEEE Transactions on*, vol. 21, no. 5, pp. 1484–1491, 2006.
- [34] D. F. Sievenpiper, D. C. Dawson, M. M. Jacob, T. Kanar, S. Kim, J. Long, and R. G. Quarfoth, "Experimental validation of performance limits and design guidelines for small antennas," *Antennas and Propagation, IEEE Transactions on*, vol. 60, no. 1, pp. 8–19, 2012.
- [35] H. Liebe, P. Rosenkranz, and G. Hufford, "Atmospheric 60-ghz oxygen spectrum: New laboratory measurements and line parameters," *Journal of quantitative spectroscopy and radiative transfer*, vol. 48, no. 5, pp. 629–643, 1992.
- [36] P. Smulders and L. Correia, "Characterisation of propagation in 60 ghz radio channels," Electronics & communication engineering journal, vol. 9, no. 2, pp. 73–80, 1997.
- [37] K. Finkenzeller and R. Waddington, *RFID handbook: radio-frequency identification fundamentals and applications*. Wiley New York, 1999.
- [38] M. Salehi and J. Proakis, *Digital communications*. McGraw–Hill, New York, 2008.
- [39] H. L. Van Trees, *Detection, estimation, and modulation theory*. John Wiley & Sons, 2004.
- [40] J. F. Dickson, "On-chip high-voltage generation in mnos integrated circuits using an improved voltage multiplier technique," *Solid-State Circuits, IEEE Journal of*, vol. 11, no. 3, pp. 374–378, 1976.
- [41] S. Pellerano, J. Alvarado, and Y. Palaskas, "A mm-wave power-harvesting rfid tag in 90 nm cmos," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 8, pp. 1627–1637, 2010.
- [42] F. Yuan, CMOS circuits for passive wireless microsystems. Springer, 2010.
- [43] P. R. Gray, P. J. Hurst, R. G. Meyer, and S. H. Lewis, Analysis and design of analog integrated circuits. John Wiley & Sons, 2008.
- [44] M. Mark, "Powering mm-size wireless implants for brain-machine interfaces," *PhD Dissertation, UC Berkeley*.
- [45] F. Klaus, "Rfid handbook: Fundamentals and applications in contactless smart cards and identification," *Hardcover*, 2003.
- [46] J.-P. Curty, N. Joehl, C. Dehollain, and M. J. Declercq, "Remotely powered addressable uhf rfid integrated system," *Solid-State Circuits, IEEE Journal of*, vol. 40, no. 11, pp. 2193–2202, 2005.
- [47] D. R. Hush, C. Wood, et al., "Analysis of tree algorithms for rfid arbitration," in IEEE International Symposium on Information Theory, pp. 107–107, Citeseer, 1998.
- [48] J. Capetanakis, "Tree algorithms for packet broadcast channels," *Information Theory, IEEE Transactions on*, vol. 25, no. 5, pp. 505–515, 1979.
- [49] H. Vogt, "Efficient object identification with passive rfid tags," in *Pervasive Computing*, pp. 98–113, Springer, 2002.
- [50] D. K. Klair, K.-W. Chin, and R. Raad, "A survey and tutorial of rfid anti-collision protocols," *Communications Surveys & Tutorials, IEEE*, vol. 12, no. 3, pp. 400–421, 2010.
- [51] Z. Tang and Y. He, "Research of multi-access and anti-collision protocols in rfid systems," in Anti-counterfeiting, Security, Identification, 2007 IEEE International Workshop on, pp. 377–380, IEEE, 2007.
- [52] J. Myung and W. Lee, "Adaptive binary splitting: a rfid tag collision arbitration protocol for tag identification," *Mobile networks and applications*, vol. 11, no. 5, pp. 711–722, 2006.

- [53] A. A. Abidi, "Phase noise and jitter in cmos ring oscillators," Solid-State Circuits, IEEE Journal of, vol. 41, no. 8, pp. 1803–1816, 2006.
- [54] V. P. Nelson, H. T. Nagle, J. D. Irwin, and B. D. Carroll, "Digital logic circuit analysis & design, 1995," *Prentice Hall*, pp. 140–148.
- [55] D. Barras, F. Ellinger, H. Jackel, and W. Hirt, "Low-power ultra-wideband wavelets generator with fast start-up circuit," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 54, no. 5, pp. 2138–2145, 2006.
- [56] N. Alexopoulos, P. Katehi, and D. Rutledge, "Substrate Optimization for Integrated Circuit Antennas," *IEEE Transactions on Microwave Theory and Techniques*, vol. 31, no. 7, pp. 550–557, 1983.
- [57] D. Pozar, "Considerations for millimeter wave printed antennas," *IEEE Transactions on Antennas and Propagation*, 1983.
- [58] A. Arbabian, S. Callender, S. Kang, M. Rangwala, and A. M. Niknejad, "A 94 ghz mm-wave-to-baseband pulsed-radar transceiver with applications in imaging and gesture recognition," *Solid-State Circuits, IEEE Journal of*, vol. 48, no. 4, pp. 1055–1071, 2013.
- [59] D. Rutledge, "Substrate-lens coupled antennas for millimeter and submillimeter waves," IEEE Antennas and Propagation Society Newsletter, vol. 27, no. 4, pp. 4–8, 1985.
- [60] M. Nezhad-Ahmadi and S. Safavi-Naeini, "On-chip antennas for 24, 60, and 77GHz single package transceivers on low resistivity silicon substrate," *IEEE Antennas and Propagation Society International Symposium*, pp. 5059–5062, 2007.
- [61] B. Floyd, S. Reynolds, U. Pfeiffer, T. Beukema, J. Grzyb, and C. Haymes, "A silicon 60ghz receiver and transmitter chipset for broadband communications," in *Solid-State Circuits Conference, 2006. ISSCC 2006. Digest of Technical Papers. IEEE International*, pp. 649–658, Feb 2006.
- [62] C.-H. Wang, H.-Y. Chang, P.-S. Wu, K.-Y. Lin, T.-W. Huang, H. Wang, and C. H. Chen, "A 60ghz low-power six-port transceiver for gigabit software-defined transceiver applications," in 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, pp. 192–596, 2007.
- [63] E. Laskin, M. Khanpour, R. Aroca, K. W. Tang, P. Garcia, and S. P. Voinigescu, "A 95ghz receiver with fundamental-frequency vco and static frequency divider in 65nm digital cmos," in *Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE International*, pp. 180–605, IEEE, 2008.
- [64] C. Marcu, D. Chowdhury, C. Thakkar, J.-D. Park, L.-K. Kong, M. Tabesh, Y. Wang, B. Afshar, A. Gupta, A. Arbabian, et al., "A 90 nm cmos low-power 60 ghz transceiver with integrated baseband circuitry," *Solid-State Circuits, IEEE Journal of*, vol. 44, no. 12, pp. 3434–3447, 2009.

- [65] M. Tanomura, Y. Hamada, S. Kishimoto, M. Ito, N. Orihashi, K. Maruhashi, and H. Shimawaki, "Tx and rx front-ends for 60ghz band in 90nm standard bulk cmos," in *Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE International*, pp. 558–635, IEEE, 2008.
- [66] A. Tomkins, R. A. Aroca, T. Yamamoto, S. T. Nicolson, S. Voinigescu, et al., "A zero-if 60ghz transceiver in 65nm cmos with 3.5 gb/s links," in *Custom Integrated Circuits Conference, 2008. CICC 2008. IEEE*, pp. 471–474, IEEE, 2008.
- [67] S. Emami, R. F. Wiser, E. Ali, M. G. Forbes, M. Q. Gordon, X. Guan, S. Lo, P. T. McElwee, J. Parker, J. R. Tani, et al., "A 60ghz cmos phased-array transceiver pair for multi-gb/s wireless communications," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, pp. 164–166, IEEE, 2011.
- [68] A. Valdes-Garcia, S. T. Nicolson, J.-W. Lai, A. Natarajan, P.-Y. Chen, S. K. Reynolds, J.-H. Zhan, D. G. Kam, D. Liu, and B. Floyd, "A fully integrated 16-element phased-array transmitter in sige bicmos for 60-ghz communications," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 12, pp. 2757–2773, 2010.
- [69] E. Cohen, C. Jakobson, S. Ravid, and D. Ritter, "A thirty two element phased-array transceiver at 60ghz with rf-if conversion block in 90nm flip chip cmos process," in *Radio Frequency Integrated Circuits Symposium (RFIC), 2010 IEEE*, pp. 457–460, IEEE, 2010.
- [70] S. K. Reynolds, A. S. Natarajan, M.-D. Tsai, S. Nicolson, J.-H. Zhan, D. Liu, D. G. Kam, O. Huang, A. Valdes-Garcia, and B. A. Floyd, "A 16-element phased-array receiver ic for 60-ghz communications in sige bicmos," in *Radio Frequency Integrated Circuits Symposium (RFIC)*, 2010 IEEE, pp. 461–464, IEEE, 2010.
- [71] A. Babakhani, X. Guan, A. Komijani, A. Natarajan, and A. Hajimiri, "A 77-ghz phased-array transceiver with on-chip antennas in silicon: Receiver and antennas," *Solid-State Circuits, IEEE Journal of*, vol. 41, no. 12, pp. 2795–2806, 2006.
- [72] K.-J. Koh, J. W. May, and G. M. Rebeiz, "A q-band (40–45 ghz) 16-element phased-array transmitter in 0.18-μm sige bicmos technology," in *Radio Frequency Integrated Circuits Symposium, 2008. RFIC 2008. IEEE*, pp. 225–228, IEEE, 2008.
- [73] R. J. Mailloux, *Phased array antenna handbook*. Artech House Boston, 2005.
- [74] R. C. Hansen, *Phased array antennas*, vol. 213. John Wiley & Sons, 2009.
- [75] A. M. Niknejad and H. Hashemi, *mm-Wave Silicon Technology*. Springer, 2008.

- [76] V. Rabinovich and N. Alexandrov, Antenna Arrays and Automotive Applications. Springer, 2013.
- [77] K. Raczkowski, W. De Raedt, B. Nauwelaers, and P. Wambacq, "A wideband beamformer for a phased-array 60ghz receiver in 40nm digital cmos," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2010 IEEE International*, pp. 40–41, Feb 2010.
- [78] O. Bakr and M. Johnson, "Impact of phase and amplitude errors on array performance," *EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2009-1*, 2009.
- [79] D. Parker and D. C. Zimmermann, "Phased arrays-part i: Theory and architectures," *IEEE transactions on microwave theory and techniques*, vol. 50, no. 3, pp. 678–687, 2002.
- [80] M. Tabesh, J. Chen, C. Marcu, L. Kong, S. Kang, A. M. Niknejad, and E. Alon, "A 65 nm cmos 4-element sub-34 mw/element 60 ghz phased-array transceiver," *Solid-State Circuits, IEEE Journal of*, vol. 46, no. 12, pp. 3018–3032, 2011.
- [81] T.-K. Nguyen, C.-H. Kim, G.-J. Ihm, M.-S. Yang, and S.-G. Lee, "Cmos low-noise amplifier design optimization techniques," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 52, no. 5, pp. 1433–1442, 2004.
- [82] M. T. Terrovitis and R. G. Meyer, "Noise in current-commutating cmos mixers," Solid-State Circuits, IEEE Journal of, vol. 34, no. 6, pp. 772–783, 1999.
- [83] B. Afshar, Y. Wang, and A. M. Niknejad, "A robust 24mw 60ghz receiver in 90nm standard cmos," in Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE International, pp. 182–605, IEEE, 2008.
- [84] C. Zhou, H. Qian, and Z. Yu, "A lumped elements varactor-loaded transmission-line phase shifter at 60ghz," in *Solid-State and Integrated Circuit Technology (ICSICT)*, 2010 10th IEEE International Conference on, pp. 656–658, IEEE, 2010.
- [85] M.-D. Tsai and A. Natarajan, "60ghz passive and active rf-path phase shifters in silicon," in *Radio Frequency Integrated Circuits Symposium*, 2009. RFIC 2009. IEEE, pp. 223–226, IEEE, 2009.
- [86] B. Biglarbegian, M. R. Nezhad-Ahmadi, M. Fakharzadeh, and S. Safavi-Naeini, "Millimeter-wave reflective-type phase shifter in cmos technology," *Microwave and Wireless Components Letters, IEEE*, vol. 19, no. 9, pp. 560–562, 2009.
- [87] K. Hettak and G. Morin, "Compact variable reflective-type sige phase shifter using lumped elements for 5 ghz applications," in *Microwave Integrated Circuits Conference* (EuMIC), 2010 European, pp. 102–105, IEEE, 2010.

- [88] H. Zarei, C. Charles, and D. Allstot, "Reflective-type phase shifters for multiple-antenna transceivers," *Circuits and Systems I: Regular Papers, IEEE Transactions on*, vol. 54, pp. 1647–1656, Aug 2007.
- [89] H. Krishnaswamy, A. Valdes-Garcia, and J.-W. Lai, "A silicon-based, all-passive, 60 ghz, 4-element, phased-array beamformer featuring a differential, reflection-type phase shifter," in *Phased Array Systems and Technology (ARRAY), 2010 IEEE International Symposium on*, pp. 225–232, IEEE, 2010.
- [90] T.-Y. Chin, J.-C. Wu, S.-F. Chang, and C.-C. Chang, "Compact s -/ ka -band cmos quadrature hybrids with high phase balance based on multilayer transformer overcoupling technique," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 57, pp. 708–715, March 2009.
- [91] D. Ozis, J. Paramesh, and D. J. Allstot, "Analysis and design of lumped-element quadrature couplers with lossy passive elements," in *Circuits and Systems, 2006. ISCAS 2006. Proceedings. 2006 IEEE International Symposium on*, pp. 4–pp, IEEE, 2006.