Copyright © 1994, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

#### DIGITAL FILTER DESIGN WITH HIGH PERFORMANCE SUPERCONDUCTING TECHNOLOGY

by

Renu Mehra

Memorandum No. UCB/ERL M94/57

22 August 1994

#### **ELECTRONICS RESEARCH LABORATORY**

College of Engineering University of California, Berkeley 94720

## Digital Filter Design with High Performance Superconducting Technology

By

**Renu Mehra** 

#### ABSTRACT

This report presents a large scale integrated system implemented using Josephson technology. A cell library of MVTL gates, comprising OR and OR-AND gates and a current amplifier, has been designed, simulated, fabricated and tested. Speeds of up to a few tens of gigahertz have been achieved in simulation. High margins of around $\pm 30\%$ have been obtained from the final fabricated gates. A detailed delay-margin analysis of these gates has been performed, and statistical simulations have been carried out to investigate yield problems. These gates have been used to implement macro blocks, such as pipeline registers and adders, which have been demonstrated to work with acceptable margins. A 4-bit adder has been shown to work reliably.

The designed cell library is used to implement a large scale integrated system: a digital FIR filter. The filter was laid out entirely by hand and fabricated using the Hypres fabrication facility. Test results from the filter and the problems encountered are discussed.

#### Acknowledgments

My research advisor, Professor Jan Rabaey, has been greatly responsible for my being in this field. I wish to thank him for his support. I also wish to thank Professor T. Van Duzer for his constant advice and guidance throughout this work.

I am grateful to all my fellow students for their constructive criticism at all times. In particular, I would like to thank Howard Luong and Dave Feld for their invaluable help during testing. Several times when I had no idea how to get around a practical problem, they gave me valuable suggestions. I thank all the members of the CRYO group for their help at different times: for making tapes to send out, reviewing my layouts before they were sent for fabrication, and tons of general advice. I wish to convey my appreciation to Marwan Khalaf and Howard Luong for reviewing this report and giving me very helpful comments.

Like every other project, this project has had its share of setbacks and frustrations. I am indebted to Rajeev Ranjan for his patient encouragement during such times, when his support helped me persist. Special thanks to Rajeev for his continuous help during the project and his repeated reviews and critiques during the writing of this report.

I am always obligated to my family for supporting and motivating me in every way. Without their support I would not be here at Berkeley in the first place.

This project has been supported by the University Research Initiative (ONR) 00014-92-J-1835 grant and I gratefully acknowledge their support.

### **Table of Contents**

- 1 Introduction
  - 1.1 Introduction
  - 1.2 Superconducting Logic Fundamentals
  - 1.3 SQUID
  - 1.4 Logic Styles
  - 1.5 Superconducting Process
  - 1.6 Summary
- 2 MVTL Gates: Design and Analysis
  - 2.1 Introduction
  - 2.2 The OR Gate
    - 2.2.1 Basic Operation
  - 2.6.1 Simulation Results of the AND Gate, MAJORITY Gate and …
- 3 FIR Filter
  - 3.1 Introduction
  - 3.2 FIR Filters
  - 3.3 Design of a 4-Bit, 3-Tap, Low Pass, Josephson FIR Filter
  - 3.4 Implementation
    - 3.4.1 The Full Adder and the Half Adder
    - 3.4.2 The Delay Element
    - 3.4.3 Converting Multiplies to Shift-Adds
    - 3.4.4 Assembling the Filter
  - 3.5 Summary
- 4 Testing and Results
- References

## List of Figures

| Figure      | Caption                                                                                                   |
|-------------|-----------------------------------------------------------------------------------------------------------|
| Figure 1.1  | Equivalent circuit for a nonideal Josephson junction.                                                     |
| Figure 1.2  | I-V curve for a Josephson junction.                                                                       |
| Figure 1.3  | I-V curves for hysteretic and nonhysteretic junctions.                                                    |
| Figure 1.4  | A two-junction SQUID.                                                                                     |
| Figure 1.5  | Threshold curves for a two-junction SQUID. N is the number of flux quanta at the peak of the lobe.        |
| Figure 1.6  | Fanout coupling for (a) magnetic coupling and (b) current injection logic styles.                         |
| Figure 2.7  | The MVTL OR gate                                                                                          |
| Figure 2.8  | Loading considerations.                                                                                   |
| Figure 2.9  | Threshold curves for the MVTL gate without the isolator junction                                          |
| Figure 2.10 | Threshold curves for MVTL gate with the isolator junction                                                 |
| Figure 2.11 | Gate arrangements for (a) margins and delay simulations; (b) maximum clock rate determinations.           |
| Figure 2.12 | Gate output                                                                                               |
| Figure 2.13 | Delay vs. gate current at different current densities                                                     |
| Figure 2.14 | Gate-current margin vs. delay at different current densities                                              |
| Figure 2.15 | Standard deviation of I<sub>c</sub> variations vs. the junction size                                      |
| Figure 2.16 | Layout of an MVTL OR gate                                                                                 |
| Figure 2.17 | (a) The AND gate (b) The MAJORITY gate                                                                    |
| Figure 2.18 | (a) The 2OR-AND cell (b) The 3OR-MAJORITY cell                                                            |
| Figure 2.19 | Threshold curves for the AND gate.                                                                        |
| Figure 2.20 | Threshold curves and their relation to fanout.                                                            |
| Figure 2.21 | The output current amplifier                                                                              |
| Figure 2.22 | Gate response during simulation                                                                           |
| Figure 2.23 | Delay vs. gate current varies in the 2OR-AND cell                                                         |
| Figure 2.24 | Margins vs. delay as gate current varies in the 2OR-AND cell                                              |
| Figure 2.25 | The 2OR-AND cell.                                                                                         |
| Figure 2.26 | The MAJORITY gate (input resistances included).                                                           |
| Figure 2.27 | The output current amplifier                                                                              |
| Figure 2.28 | Reset times for unipolar and bipolar sinusoidal clocks compared                                           |
| Figure 3.1  | Simple, nonpipelined version of a FIR filter                                                              |
| Figure 3.2  | A retimed version of the FIR filter.                                                                      |
| Figure 3.3  | Basic structure of the 3-tap, 4-bit FIR filter.                                                           |
| Figure 3.4  | Frequency response of the 3-tap FIR filter comparing the floating point and fixed point implementations.  |
| Figure 3.5  | Sum and carry generation in a half adder                                                                  |
| Figure 3.6  | Sum generation in the full adder with dual rail logic                                                     |
| Figure 3.7  | Carry generation in the full adder.                                                                       |
| Figure 3.8  | 1-bit full adder layout                                                                                   |
| Figure 3.9  | Layout of a 4-bit adder                                                                                   |
| Figure 3.10 | Implementation of the unit delay.                                                                         |
| Figure 3.11 | Multiplication implemented in add-shifts. Fanout tree and sign extension shown                            |
| Figure 3.12 | Block diagram; 3-phase timing arrangement                                                                 |
| Figure 3.13 | The FIR filter layout (unoptimized)                                                                       |
| Figure 3.14 | Optimized filter layout                                                                                   |
| Figure 3.15 | Floorplan of the optimized filter                                                                         |
| Figure 3.16 | The FIR filter chip                                                                                       |
| Figure 4.1  | I-V characteristic of a 3.3 × 3.3 µm<sup>2</sup> junction                                                 |
| Figure 4.2  | Oscilloscope photograph of the OR gate results                                                            |
| Figure 4.3  | Oscilloscope photograph of the functioning of the 2OR-AND cell                                            |
| Figure 4.4  | Oscilloscope photograph of the results from MAJORITY gate                                                 |
| Figure 4.5  | Oscilloscope photograph of the results from the output current amplifier                                  |
| Figure 4.6  | Test results for a 1-bit delay unit                                                                       |
| Figure 4.7  | Test results of a 1-bit half adder                                                                        |
| Figure 4.8  | Oscilloscope photograph of the results of the full adder                                                  |
| Figure 4.9  | Carry results of the full adder                                                                           |
| Figure 4.10 | 4-bit full adder - inputs                                                                                 |
| Figure 4.11 | 4-bit full adder - outputs. (a) sum bits (b) carry bits                                                   |
| Figure 4.12 | Signals at different parts of the chip                                                                    |
| Figure 4.13 | Three phase clocks                                                                                        |
| Figure 4.14 | Bit patterns measured at internal filter nodes (pin numbers shown). * Indicates signals that were not working correctly |
| Figure 4.15 | Bit patterns obtained on the output nodes of the filter.                                                  |

## List of Tables

| Table      | Caption                                                                                                                           |
|------------|-----------------------------------------------------------------------------------------------------------------------------------|
| Table 1.1  | Hypres niobium process flow                                                                                                       |
| Table 2.2  | Calculated characteristic impedance for M1 and M2 microstrips                                                                     |
| Table 2.3  | Josephson junction parameters used for simulations                                                                                |
| Table 2.4  | Parameter values selected.                                                                                                        |
| Table 2.5  | Important properties of the OR gate measured from simulations                                                                     |
| Table 2.6  | Parameter values, variations allowed, and effect of variations on delay for the OR gate at a current density of 1000 A/cm<sup>2</sup> |
| Table 2.7  | Parameter values, variations allowed, and effect of variations on delay for the OR gate at a current density of 5000 A/cm<sup>2</sup> |
| Table 2.8  | Parameter values used for the different gates in the cell library                                                                 |
| Table 2.9  | Important properties of the gates designed ($I_c = 1000 \text{ A/cm}^2$; Fanout = 3)                                              |
| Table 2.10 | Parameter values, variations allowed, and effect of variations on delay                                                           |
| Table 3.1  | Gate and junction count for 1 bit full and half adders                                                                            |
| Table 3.2  | Gate and junction count of the final implementation                                                                               |
| Table 4.1  | Tested values of junction parameters.                                                                                             |
| Table 4.2  | Maximum and minimum allowable gate currents and margins of the basic gates                                                        |
| Table 4.3  | Limits on the gate currents allowed for a one bit delay unit                                                                      |
| Table 4.4  | Test sequence for the 4-bit adder                                                                                                 |
| Table 4.5  | Signal values at internal/external nodes of the filter with a test input sequence.                                                |
| Table 4.6  | Signal values after accounting for carry bit exchange in adder.                                                                   |
| Table 4.7  | Expected bit patterns at output pins (*: signals that did not work)                                                               |

## 1 Introduction

#### **1.1 Introduction**

Josephson junctions, with their hysteretic behavior and picosecond switching times, opened the way to digital circuit design using superconducting technology with the possibility of low power, ultra-high-speed logic.

Recently, much work has been done in circuit design with Josephson technology. The last decade has seen the emergence of several different logic families of both the voltage state and flux transfer types. Among the most successful voltage state logic families are the *Resistor Coupled Josephson Logic* (RCJL) [Son82], the *Variable Threshold Logic* [Fuj85], the *Modified Variable Threshold Logic* (MVTL) [Fuj89] and the *Four Junction Logic* (4JL) [Nak82]. A detailed comparison between the RCJL, MVTL and 4JL is presented in [Kis93]. Flux based families include the *Quantum Flux Parametron* (QFP) [Hos91], *Phase Mode Logic* and the *Rapid Single Flux Quantum* (RSFQ) logic/memory family [Lik91].

Josephson integrated circuit technology was greatly improved with the discovery of $Nb/AlO_x/Nb$ junctions. These junctions show low leakage currents and good stability with respect to thermal cycling and long term storage. Because the spread in junction characteristics is small, the inherent high performance of Josephson junctions could finally be realized. Among the different logic families, the MVTL logic family has proved to be the most robust while being competitive in terms of speed.

Reasonably complex circuits are now being made in superconducting technology. MVTL gates are among the most popular for complex circuit design. Some of the circuits realized using this design style include an 8-bit shift register [Has88], a 16-bit ALU [Has88], a 4-bit microprocessor [Has88], an ultra-high-speed multiplier [Kot87], a sub-nanosecond 16-bit ALU [Kot88], a sub-nanosecond 4-bit processor [Kot89] and an 8-bit DSP processor [Has91]. Josephson technology has thus been successfully demonstrated in large scale integrated circuits. However, little research effort has gone into the design of high speed digital filters.

The importance of the present work lies in building a reliable cell library of MVTL gates and in demonstrating its use in a reasonably large circuit. It applies a relatively new technology to a very important circuit, an FIR filter.

#### **1.2 Superconducting Logic Fundamentals**

All digital logic is based on the nonlinear behavior of a "key" device, which causes two distinct states, logic "0" and logic "1". The effectiveness of the device depends on the sharpness of its nonlinearity. In semiconductor technology, the transistor performs the function of this "key" device, whereas in superconductor technology, the Josephson junction has the required nonlinear behavior. Before the main issues involved in logic design with Josephson junctions are addressed, a brief discussion of how the junctions operate is presented.

The Josephson junction consists of two superconductors separated by an insulating barrier. When there is no voltage across the junction, the wave functions of the two superconductors couple, leading to a low energy state. This coupling allows current to pass freely through the insulator even in the absence of a voltage. The amount of current depends on the phase difference  $\phi$  between the two wave functions (Equation 1.1).

$$I = I_c \sin \phi \tag{1.1}$$

where the maximum zero-voltage current $I_c$ is determined by the thickness of the insulating barrier. When a voltage V is applied, the wave functions tend to slip with respect to each other at a rate determined by the voltage. Equation 1.2 quantifies this relation.

$$\frac{d\phi}{dt} = \frac{4\pi eV}{h} \tag{1.2}$$

where e is the charge of an electron and h is Planck's constant.

Equations 1.1 and 1.2 specify the behavior of an ideal junction. A real Josephson junction has a capacitance C and a nonlinear conductance G(V) in parallel with the ideal junction as shown in Figure 1.1.



Figure 1.1 Equivalent circuit for a nonideal Josephson junction.

The total current is then given by,

$$I = I_c \sin\phi + G(V)V + C \, dV/dt \tag{1.3}$$

From the characteristic defined in Equation 1.2, the junction functions as a voltage controlled oscillator: at any nonzero voltage there is an oscillating current. These oscillations are called *Josephson Oscillations*. Figure 1.2 shows the dc time-averaged I-V characteristic of a junction. The load line used in typical digital circuits is shown in the figure. It intersects the characteristic at two points: a zero-voltage, high-current superconductive state (logic "0"), and a high-voltage, low-current, resistive state (logic "1"). In the resistive state, the voltage is close to the gap voltage V<sub>g</sub> of the superconductor. For niobium (Nb) the gap voltage is 2.8 mV.
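As an illustration of the voltage controlled oscillator behavior, the phase slip rate in Equation 1.2 corresponds to an oscillation frequency f = 2eV/h (the angular rate divided by 2π). The short sketch below evaluates this at the niobium gap voltage quoted above; the function name is an assumption introduced only for this example. It gives roughly 1.35 THz at 2.8 mV.

```python
# Physical constants (exact SI values).
E_CHARGE = 1.602176634e-19   # electron charge e, in coulombs
PLANCK_H = 6.62607015e-34    # Planck's constant h, in joule-seconds


def josephson_frequency_hz(voltage_v: float) -> float:
    """Oscillation frequency f = 2eV/h implied by Equation 1.2.

    Equation 1.2 gives the angular rate d(phi)/dt = 4*pi*e*V/h;
    dividing by 2*pi yields the frequency in hertz.
    """
    return 2.0 * E_CHARGE * voltage_v / PLANCK_H


if __name__ == "__main__":
    v_gap = 2.8e-3  # niobium gap voltage quoted in the text, in volts
    f = josephson_frequency_hz(v_gap)
    print(f"Josephson frequency at {v_gap * 1e3:.1f} mV: {f / 1e12:.2f} THz")
```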



Figure 1.2 I-V curve for a Josephson junction.

The characteristic shown in Figure 1.2 is for a hysteretic junction. A junction may also be nonhysteretic. The dc I-V characteristic of a nonhysteretic junction is shown in Figure 1.3. The McCumber parameter $\beta_c$ defined in Equation 1.4 is a measure of the amount of hysteresis in the junction.

$$\beta_c = \frac{4\pi e}{h} \frac{I_c C}{G^2} \tag{1.4}$$

As can be seen from Figure 1.3, a higher value of $\beta_c$ increases hysteresis. Junctions must be hysteretic to be used for voltage state digital logic design.
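A minimal sketch of Equation 1.4 follows. It treats the conductance G as a single constant value, which is a simplification of the nonlinear G(V) in Equation 1.3, and the junction values used (100 µA, 0.5 pF, and two shunt conductances) are hypothetical numbers chosen only to show the two regimes; they are not taken from this report.

```python
import math

E_CHARGE = 1.602176634e-19   # electron charge e, in coulombs
PLANCK_H = 6.62607015e-34    # Planck's constant h, in joule-seconds


def mccumber_parameter(i_c: float, capacitance: float, conductance: float) -> float:
    """Evaluate beta_c = (4*pi*e/h) * I_c * C / G**2 (Equation 1.4).

    i_c in amperes, capacitance in farads, conductance in siemens.
    G is assumed constant here, unlike the nonlinear G(V) of a real junction.
    """
    if conductance <= 0.0:
        raise ValueError("conductance must be positive")
    return (4.0 * math.pi * E_CHARGE / PLANCK_H) * i_c * capacitance / conductance**2


if __name__ == "__main__":
    # Hypothetical junction: 100 uA critical current, 0.5 pF capacitance.
    for g in (0.1, 1.0):  # siemens, i.e. 10 ohm and 1 ohm shunts
        beta = mccumber_parameter(100e-6, 0.5e-12, g)
        regime = "hysteretic" if beta > 1.0 else "nonhysteretic"
        print(f"G = {g:.1f} S -> beta_c = {beta:.3f} ({regime})")
```

The threshold of 1 used to label the regimes is the usual rule of thumb, not a value stated in this report.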

In typical digital logic applications, the junction is originally biased in the superconductive or zero-voltage state (logic "0"). By injecting a current into the junction,  $I_c$  can be exceeded, and the circuit gets latched into the resistive state with a voltage close to  $V_g$  (logic "1"). Another technique commonly used in digital logic design is to cause magnetic flux to link the junction when the current flowing through it is a little less than its critical current. This reduces the critical current of the junction, causing it to switch to the resistive state.



Figure 1.3 I-V curves for hysteretic and nonhysteretic junctions.

The Josephson junction features high-speed switching (1-10 ps), low power consumption (microwatts per gate) due to the extremely low gap voltage and moderate current levels, and low-dispersion signal transmission due to the frequency independence of the skin depth and very low line resistances. These three features make it very attractive for designing high speed computing systems. However, there are a few inherent problems.

- There is no input/output isolation. Therefore, a junction is almost never used by itself in a logic circuit. The most common configuration used in digital logic design is the SQUID (Superconducting QUantum Interference Device), discussed in the next section.
- The switching behavior is latching. Once the circuit switches into the resistive (logic "1") state, it can be reset only by removing the bias current. As a result, the bias current needs to be clocked. The clocking of these circuits is therefore critical and needs to be carefully examined. It can be compared to clocking in dynamic CMOS logic.
- The basic logic element, the Josephson junction, is noninverting. As a result it is very difficult to build inverters in this technology. It is, however, possible to make a *timed inverter*, which requires separate clocking.

#### 1.3 SQUID

A SQUID is a superconducting loop broken at one or more points by junctions. Figure 1.4 shows a two junction SQUID. The I-V curve of the SQUID is very similar to that of a single junction, with the maximum critical current being the sum of the critical currents of the two junctions. Switching may be achieved either by injecting current into the loop or by linking magnetic flux through it. When magnetic flux links the superconducting loop, the critical current of the SQUID is reduced.



Figure 1.4 A two-junction SQUID.

Magnetic flux may be linked to the loop by passing a current  $I_{in}$  through a wire (thin film) magnetically coupled to the loop, resulting in the reduction of critical current. The relationship between the externally applied flux,  $\Phi_{ex}$ , (normalized with respect to the flux quantum  $\Phi_0$ ), and the critical current of the SQUID is represented by the threshold curve for the SQUID. The threshold curve defines the boundary under which the junctions are in the superconducting state, and above which they are in the resistive state. The height of the curve therefore represents the critical currents at various values of the external flux. Any threshold characteristic consists of a set of N threshold curves, each representing a different "mode". The N<sup>th</sup> curve represents the mode with N flux quanta linked to the SQUID loop at the peak of the lobe. Figure 1.5 shows the threshold characteristic of a symmetric two junction SQUID for N=-1, 0 and 1. In the case of current injection schemes, the threshold curve is plotted with the control current injected on the x-axis.



Figure 1.5 Threshold curves for a two-junction SQUID. N is the number of flux quanta at the peak of the lobe.
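The shape of such a threshold characteristic can be sketched for the simplest textbook case: a symmetric two-junction SQUID whose loop inductance is negligible, for which the critical current is 2·I₀·|cos(π·Φ_ex/Φ₀)|. This limiting case is an assumption made for illustration only. A gate with finite loop inductance, as discussed in this report, is expected to show overlapping lobes whose minima do not reach zero. The per-junction critical current I₀ used below is a hypothetical value.

```python
import math


def squid_critical_current(flux_ratio: float, i0: float) -> float:
    """Critical current of a symmetric two-junction SQUID.

    flux_ratio is the external flux normalized to the flux quantum
    (Phi_ex / Phi_0); i0 is the critical current of each junction.
    Assumes negligible loop inductance (illustrative limit only).
    """
    return 2.0 * i0 * abs(math.cos(math.pi * flux_ratio))


if __name__ == "__main__":
    i0 = 100e-6  # hypothetical per-junction critical current, in amperes
    for k in range(-4, 5):
        ratio = k / 4.0  # sweep Phi_ex / Phi_0 from -1 to 1
        i_th = squid_critical_current(ratio, i0)
        print(f"Phi_ex/Phi_0 = {ratio:+.2f} -> threshold = {i_th * 1e6:6.1f} uA")
```

In this limit the peaks sit at integer values of the normalized flux, matching the lobes labeled N = -1, 0 and 1, and the peak height equals the sum of the two junction critical currents.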

#### 1.4 Logic Styles

Based on the techniques discussed in the previous sections, Josephson voltage state logic gates can be broadly classified into two groups: *magnetically coupled* gates and *current injection* gates. Magnetically coupled gates are controlled by the magnetic field generated by the input signal passing close to the superconducting loop. For magnetically coupled logic, the output signal can be applied serially to multiple gates at subsequent stages. In a current injection gate, the output can only fan out in parallel to the next stage. Figure 1.6 shows the fanout patterns of both the magnetically coupled and current injection gates. Current injection gates are typically characterized by small area and relative insensitivity to magnetic flux trapping. However, they have lower operating margins than magnetically coupled gates.



Figure 1.6 Fanout coupling for (a) magnetic coupling and (b) current injection logic styles.

The circuit methodologies discussed above are based on latching of the Josephson junction into the gap voltage  $V_g$  and can be grouped into a class called voltage state logic. Another type of logic called flux transfer logic works on an entirely different principle. In this approach, the binary information is represented not by a dc voltage, but by very short voltage pulses, V(t), of quantized area, (Equation 1.5).

$$\int V(t) \, dt = \Phi_0 = \frac{h}{2e} = 2.07 \ \text{mV} \cdot \text{ps} \tag{1.5}$$

These single flux quantum (SFQ) pulses can be produced by circuits with overdamped Josephson junctions. Switching from one state to another is manifested in a voltage pulse much below the gap voltage. This approach is discussed in detail in [Lik91]. The main advantages of this type of logic style over voltage state logic are:

- The immense increase in speed. The simple cells of this logic style have been tested at 100 GHz, and higher speeds are expected with new fabrication technology.
- The punchthrough problem (explained in detail in Section 2.9) is eliminated.
- The voltage pulses consume very low power.

However, the timing of these circuits is very critical and very difficult to meet.
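The quantized pulse area in Equation 1.5 ties the pulse height to its duration. The sketch below computes Φ₀ from h and e and then the mean voltage of a pulse of a given width; the 2 ps width is a hypothetical value chosen for illustration, and the rectangular-average view of the pulse is a simplification.

```python
E_CHARGE = 1.602176634e-19   # electron charge e, in coulombs
PLANCK_H = 6.62607015e-34    # Planck's constant h, in joule-seconds

# Flux quantum Phi_0 = h / (2e), in webers (volt-seconds).
FLUX_QUANTUM_WB = PLANCK_H / (2.0 * E_CHARGE)


def mean_pulse_voltage(pulse_width_s: float) -> float:
    """Mean voltage of an SFQ pulse whose area is one flux quantum."""
    if pulse_width_s <= 0.0:
        raise ValueError("pulse width must be positive")
    return FLUX_QUANTUM_WB / pulse_width_s


if __name__ == "__main__":
    # 1 mV*ps equals 1e-15 V*s, so scale by 1e15 to print in mV*ps.
    print(f"Phi_0 = {FLUX_QUANTUM_WB * 1e15:.2f} mV*ps")
    width = 2e-12  # hypothetical pulse width, in seconds
    print(f"Mean voltage of a {width * 1e12:.0f} ps pulse: "
          f"{mean_pulse_voltage(width) * 1e3:.2f} mV")
```

For the assumed 2 ps width the mean voltage is about 1 mV, which is below the 2.8 mV niobium gap voltage quoted in Section 1.2.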

#### **1.5 Superconducting Process**

Josephson integrated circuits consist of Nb/AlO<sub>x</sub>/Nb junctions, SiO<sub>2</sub> insulators, Mo resistors, a Nb ground plane, and Nb wiring. The process used by Hypres Inc. is briefly described below. For a more detailed description refer to [Hyp92]. The Hypres fabrication process uses refractory materials in every stage, with the exception of Ti/Pd/Au contact metallization. Niobium is used as the superconducting material because of its high critical temperature, electrical and thermal stability, and ability to withstand thermal cycling. Junctions are made by depositing an *in situ* Nb/AlO<sub>x</sub>/Nb trilayer across the entire wafer, subsequently defining the junction areas through etching, and using deposited SiO<sub>2</sub> for isolation.

The Josephson junctions are interconnected into circuit configurations with two superconducting Nb layers, a superconductive ground plane, and two resistive layers. Two resistance layers are used to obtain medium and low value resistors, the respective sheet resistances being 1  $\Omega$ /square and 0.01  $\Omega$ /square. SiO<sub>2</sub> is used to isolate the junctions and the ground plane, and to act as a second insulating layer. SiO<sub>2</sub> allows low capacitance and relatively high impedance microstrip lines. A third Nb superconductive layer is provided as an additional wire-up layer. The different layers, as they appear in the layout tool, MAGIC, are summarized in Table 1.1. The minimum feature sizes for the wiring layers, resistance layers, and the junction definition layer are also included.

| Sequence | Hypres level designation | Minimum width (µm) | Brief description                                                               |
|----------|--------------------------|--------------------|---------------------------------------------------------------------------------|
| 1        | M0                       | 2                  | Ground plane etch (Nb).                                                         |
| 2        | I0                       | 3                  | Ground plane insulation (SiO<sub>2</sub>).                                      |
| 3        | M1                       | 2.5                | Trilayer formation.                                                             |
| 4        | I1A                      | 3                  | Junction definition (counter electrode).                                        |
| 5        |                          |                    | SiO<sub>2</sub> deposition.                                                     |
| 6        | R2                       | 5                  | Medium value resistor.                                                          |
| 7        |                          |                    | SiO<sub>2</sub> deposition.                                                     |
| 8        | I1B                      | 2                  | Via to counter-electrode (I1A) and resistor (R2).                               |
| 9        | M2                       | 2                  | Second metallization, which makes contact to I1A, R2, M1 and M0 (through M1).   |
| 10       | I2                       | 3                  | Insulation layer utilizing deposited SiO<sub>2</sub>.                           |
| 11       | M3                       | 2                  | Third metallization (makes contact to M2).                                      |
| 12       | R3                       | 5                  | Bonding metal, makes contact to M3, and acts as the low value resistance layer. |

Table 1.1: Hypres niobium process flow.
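The sheet resistances and minimum widths above determine how long a drawn resistor must be. The sketch below applies the usual relation R = R_sheet × (length / width), taking R2 as the 1 Ω/square layer and R3 as the 0.01 Ω/square layer as stated in the text. It ignores contact resistance and end effects, and the function and dictionary names are assumptions made for this example only.

```python
# Sheet resistances quoted in the text, in ohms per square.
SHEET_RESISTANCE = {"R2": 1.0, "R3": 0.01}
# Minimum drawn widths from Table 1.1, in micrometers.
MIN_WIDTH_UM = {"R2": 5.0, "R3": 5.0}


def resistor_length_um(layer: str, target_ohms: float, width_um=None) -> float:
    """Length needed for a target resistance on the given layer.

    Uses R = R_sheet * (length / width); contacts and end effects are ignored.
    If no width is given, the minimum width for the layer is used.
    """
    width = MIN_WIDTH_UM[layer] if width_um is None else width_um
    if width < MIN_WIDTH_UM[layer]:
        raise ValueError(f"width below the {layer} minimum of {MIN_WIDTH_UM[layer]} um")
    squares = target_ohms / SHEET_RESISTANCE[layer]
    return squares * width


if __name__ == "__main__":
    # Hypothetical targets: a 10 ohm resistor on R2 and a 0.1 ohm resistor on R3.
    for layer, ohms in (("R2", 10.0), ("R3", 0.1)):
        length = resistor_length_um(layer, ohms)
        print(f"{ohms} ohm on {layer}: {length:.1f} um long at minimum width")
```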

#### 1.6 Summary

Some of the basic design techniques in the superconducting technology have been presented. Design issues concerning voltage state logic have been discussed. Chapter 2 describes how techniques described in this chapter have been used in the design of the MVTL cell library.

## 2 MVTL Gates: Design and Analysis

#### 2.1 Introduction

For system level design in any technology, it is essential to have a stable and reliable cell library, even in a research environment where the system level design possibilities of a new technology are being explored. This part of the current work is an attempt to provide the basic building blocks for further research in digital design in Josephson technology.

The most important decision facing designers of Josephson LSI circuits is the choice of a fast and reliable logic family. In this context, the following facts have been observed.

The gate delay depends on the current density, the bias current level and the capacitance per unit area. The power consumption depends mainly on the current level. Neither the delay nor the power consumed depends strongly on the gate structure being used. The gate area, however, depends on the structure of the gate and increases as the number of magnetically coupled control lines is increased. If there is only one magnetically coupled control line, the area is limited by the size of the electrodes. The margins exhibited by the gate are strongly dependent on the structure of the gate and on its threshold curve.

A comparison of three different emerging logic styles in the Josephson technology has been performed in [Kis93]. The results clearly showed that the Modified Variable Threshold Logic, MVTL, has the highest margins and the maximum three phase clocking speed. The MVTL logic gate, first designed at Fujitsu, has only one magnetically coupled control line. It has been shown to be the most robust as far as margins are concerned. Since the Josephson processing technology has not fully matured, high margins are important for reasonable yields.

In the current work, a cell library of MVTL gates has been designed. The basic cells include the OR gate, the 2OR-AND cell and the 2/3 MAJORITY gate. Each of the gates has been fabricated and tested for functionality and margins. Using these basic blocks, a set of macro blocks has been built. This set includes the *half adder*, the *full adder* and the *delay unit*.

Section 2.2 discusses the basic OR gate and its parameters. Most of the parameter values used are the same as those used by Fujitsu. However, each one was varied and simulated to check for optimality and margins. Simulation results for the OR gate are described in Section 2.3. Section 2.4 discusses the operation of the AND gate and the MAJORITY gate. In Section 2.5 the output current amplifier cell is introduced. Sections 2.6 and 2.7 describe in detail the simulation results and the layouts, respectively, of the AND gate, the MAJORITY gate and the output current amplifier. Sections 2.8 and 2.9 discuss the concept of dual rail logic and the powering system, respectively.

#### 2.2 The OR Gate

The OR gate is the basic gate in the MVTL logic family. Besides performing an important logic function, it acts as an output-to-input isolation device. The other gates in the family do not have output-to-input isolation and need to be buffered by the OR gate. The OR gate therefore is crucial to any MVTL design. Figure 2.7 shows the structure of the MVTL OR gate.



Figure 2.7 The MVTL OR gate.

#### 2.2.1 Basic Operation

The MVTL OR gate is essentially an asymmetric interferometer, with two junctions  $J_1$ ,  $J_2$  and a loop inductance L. The two junctions have different critical currents and the gate current is fed asymmetrically into the interferometer loop. Generic equations governing the asymmetry are given below.

$$I_c(J_1) = pI_{max} \tag{2.1}$$

$$I_c(J_2) = qI_{max} \tag{2.2}$$

$$\text{left inductance} = qL \tag{2.3}$$

$$\text{right inductance} = pL \tag{2.4}$$

$$p + q = 1 \tag{2.5}$$

The input current flowing through $L_x$ magnetically couples to the interferometer loop, causing a reduction of the critical currents of the junctions $J_1$ and $J_2$. This current is also injected into the gate. Both magnetic coupling and current injection are thus used to facilitate fast switching. After the SQUID junctions ($J_1$ and $J_2$) switch, the gate current finds more favorable paths to ground, through the isolation resistance $R_i$ or the load resistance $R_L$. $R_i$ is designed to be smaller than $R_L$ and therefore most of the gate current goes through $J_3$ to $R_i$ (if $J_3$ has not already switched). $J_3$ is designed to switch immediately if this happens, serving two important purposes. First, it ensures that all the bias current flows into $R_L$, giving the maximum possible input to the next stage. A high input current increases the switching speed, maximizing the efficiency of the circuit. Second, it isolates the input current from the loop and hence from the output. Any changes in one stage will therefore propagate forward only, and output-input isolation is complete. The input current finds a low resistance path to ground through $R_i$.

The resistance  $R_d$  shown in Figure 2.7 is used to damp out oscillations in the SQUID loop. The input resistances  $R_{in1}$  and  $R_{in2}$  provide the required load resistance to the previous stage. Also, they present a high impedance to the gate current of the previous stage thus preventing it from switching the interferometer loop of this stage until after the previous one has switched. The load resistance  $R_L$  seen by any gate is the input resistance of the next stage  $R_{in}$  divided by the fanout.

The source resistance  $R_s$  allows the gate current of all gates in the same phase to be fed from a single voltage source. A common voltage source provides much better control than several independent current sources.

#### 2.2.2 Design and Parameter Selection

Optimal loading of an MVTL OR gate is determined by Equation 2.6. It has been found that if this criterion is not followed, the margins of the gate are reduced.

$$V_g = I_{max} R_L \tag{2.6}$$

where $I_{max}$ is the sum of the critical currents of the junctions $J_1$ and $J_2$, and $V_g$ is the gap voltage. Figure 2.8 illustrates the reason for the above criterion. The SQUID characteristic, the load line, and the $I_g$ line (the total current available) are shown in the figure. When the circuit switches, it settles at the intersection of the load line and the I-V characteristic (point "a"), current $I_J$ flows through the junctions and $I_t$ is transferred to the output. For maximum efficiency, it is important that the load line intersects the knee of the curve at $V_g$. If $R_L$ is larger than that required by Equation 2.6, very little of the current is transferred to the output, and if $R_L$ is lower than the optimal value, the junction does not latch close to the gap voltage.

The voltage difference between the two breakpoints of the characteristic, points "a" and "b", is called  $\Delta V$ .



Figure 2.8 Loading considerations.

| width (in µm) | Characteristic<br>Impedance $(\Omega)$ (M1) | Characteristic<br>Impedance (Ω) (M2) |
|---------------|---------------------------------------------|--------------------------------------|
| 4             | 28.42                                       | 75.05                                |
| 8             | 14.21                                       | 37.52                                |
| 12            | 9.47                                        | 25.02                                |
| 16            | 7.11                                        | 18.76                                |
| 20            | 5.68                                        | 15.01                                |
| 25            | 4.55                                        | 12.01                                |
| 30            | 3.79                                        | 10.01                                |

Table 2.2: Calculated characteristic impedance for M1 and M2 microstrips.

As mentioned earlier, the load resistance $R_L$ for any gate is the input resistance of the next gate divided by the fanout. For high speed applications, inter-cell connection lines act as transmission lines and must be appropriately impedance matched. The characteristic impedance of the lines must therefore be equal to $R_{in}$. Table 2.2 shows characteristic impedances of metal-1 and metal-2 microstrip lines of different widths for the Hypres process. Since 8 $\mu$m is a reasonable line width with the current fabrication process and metal-1 microstrips are used for output lines, an impedance of 14 $\Omega$ was selected for the input resistances.

The MVTL OR gate design was targeted for a fanout of 2 (the reason is explained later in Section 2.5) and the load resistance was therefore fixed at 7  $\Omega$  (R<sub>in</sub> being fixed at 14  $\Omega$ ). For a high speed design, it is important that the characteristic impedance of the line that is connected to the output be 7  $\Omega$  (14  $\Omega$ ) for a fanout of one (two). Since the current design is a low frequency design, the interconnect was not designed with impedances that match the input resistance of the gates. However, the gates are designed so that if impedance matching were desired, the interconnect widths could be changed, without changing the gate design.
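The entries of Table 2.2 are, to within rounding, inversely proportional to the line width. The sketch below uses that observation to estimate the microstrip width needed for a target impedance. The inverse-width fit is an inference from the tabulated values, not a formula given in this report, and the function names are illustrative only.

```python
# Table 2.2 data: line width (um) -> characteristic impedance (ohm).
M1 = {4: 28.42, 8: 14.21, 12: 9.47, 16: 7.11, 20: 5.68, 25: 4.55, 30: 3.79}
M2 = {4: 75.05, 8: 37.52, 12: 25.02, 16: 18.76, 20: 15.01, 25: 12.01, 30: 10.01}


def impedance_width_product(table):
    """Average of Z0 * width over the tabulated points (ohm * um)."""
    return sum(w * z for w, z in table.items()) / len(table)


def width_for_impedance(table, z_target):
    """Estimated width (um) for a target impedance, assuming Z0 = k / width."""
    return impedance_width_product(table) / z_target


def load_resistance(r_in, fanout):
    """R_L = R_in / fanout, as stated in the text."""
    return r_in / fanout


if __name__ == "__main__":
    r_load = load_resistance(14.0, 2)
    print("R_L for a fanout of two:", r_load, "ohm")
    print("M1 width for 14 ohm:", round(width_for_impedance(M1, 14.0), 2), "um")
    print("M1 width for R_L:", round(width_for_impedance(M1, r_load), 2), "um")
```

Under this fit, a metal-1 line of roughly 8 $\mu$m corresponds to 14 $\Omega$ and one of roughly 16 $\mu$m to 7 $\Omega$, in line with the tabulated rows.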

The value of the gap voltage  $V_g$  is 2.8 mV. The value of  $I_{max}$  (the maximum zero voltage current through the SQUID) was therefore determined as 400  $\mu$ A from Equation 2.6.

Having fixed the load and the maximum current, all other circuit parameters should be chosen carefully for best performance. The normalized loop inductance  $\lambda$  (defined in Equation 2.7) and the asymmetry ratio q/p are crucial to the design.

$$\lambda = LI_{max} / \Phi_0 \tag{2.7}$$

Fujimaki et al. [Fuj89] have shown that the values of these two parameters for the maximum operating margins are as given in Equation 2.8 and 2.9.

$$q/p = 3 \tag{2.8}$$

$$\lambda = 1 \tag{2.9}$$

These conditions determine the value of the loop inductance at 5.2 pH and the asymmetry factor at 1/3. The left and right loop inductances are therefore 3.9 pH and 1.3 pH respectively. The critical currents of $J_1$ and $J_2$ are $I_{max}/4$ (100 $\mu$A) and $3I_{max}/4$ (300 $\mu$A) respectively. For a current density of 1000 A/cm<sup>2</sup> this resulted in the size of $J_1$ being the smallest (3.2 x 3.2 $\mu$m<sup>2</sup>) that either the UCB or the Hypres processes can support.
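The chain of calculations from Equations 2.1 through 2.9 can be checked numerically. The following is a minimal sketch that takes the values quoted in the text as defaults; the function name `or_gate_design` and the dictionary keys are illustrative and do not come from the original design flow.

```python
import math

PHI_0 = 2.067833848e-15  # magnetic flux quantum h/(2e), in webers


def or_gate_design(v_gap=2.8e-3, r_load=7.0, lam=1.0, q_over_p=3.0, j_c=1000.0):
    """Derive OR-gate parameters (SI units; j_c in A/cm^2, junction side in um)."""
    i_max = v_gap / r_load                 # Equation 2.6
    loop_l = lam * PHI_0 / i_max           # Equation 2.7 solved for L
    p = 1.0 / (1.0 + q_over_p)             # Equations 2.5 and 2.8
    q = 1.0 - p
    area_j1 = p * i_max / (j_c * 1e-8)     # 1 A/cm^2 = 1e-8 A/um^2
    return {
        "i_max": i_max,
        "loop_l": loop_l,
        "left_l": q * loop_l,              # Equation 2.3
        "right_l": p * loop_l,             # Equation 2.4
        "ic_j1": p * i_max,                # Equation 2.1
        "ic_j2": q * i_max,                # Equation 2.2
        "side_j1_um": math.sqrt(area_j1),  # side of a square junction
    }


if __name__ == "__main__":
    for name, value in or_gate_design().items():
        print(name, value)
```

With the default arguments the sketch gives values that round to the figures quoted above: 400 $\mu$A, about 5.2 pH, about 3.9 pH and 1.3 pH, and a $J_1$ side of about 3.2 $\mu$m.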

The actual value of the inductance of the input line is not critical to the performance. This line must couple to the loop inductance as closely as possible.

Laying out the input inductance in M2 metallization and the loop inductance in M1, a magnetic coupling of 0.6 could be reached. As currently laid out, the inductance of the input line is 21 pH. This value does not affect the static operation of the gate; it affects only the dynamic operation.

The isolator junction $J_3$ must have a critical current greater than or equal to that of $J_1$. Otherwise, it may switch before $J_1$, cutting off the input current before the interferometer has switched. Also, to provide good isolation, its critical current must be small so that it switches immediately after the gate does, preventing any of the input current from going to the output. Hence, the critical current of $J_3$ is kept at 100 $\mu$A, the same as that of the small junction, $J_1$. For the isolator junction to be effective in cutting off the input (once the gate has switched), it is important that after the SQUID switches to the resistive state, the gate current is diverted towards the isolator junction rather than to the load. The isolation resistance $R_i$ should therefore be much smaller than the load resistance. In this work it is fixed at 1 $\Omega$ (recall that the load resistance is 7 $\Omega$).

The damping resistance  $R_d$  must be chosen so that the circuit is critically damped. The criterion for critical damping is,

$$R_d = \frac{1}{2} \sqrt{\frac{L}{C}} \tag{2.10}$$

Here L is the loop inductance and C is the series capacitance of the two junctions in the loop. With a capacitance per unit area of 50 fF/$\mu$m<sup>2</sup>, the series capacitance of the two junctions is calculated to be 375 fF. With a loop inductance of 5.2 pH, the value of $R_d$ is calculated to be 1.8 $\Omega$.
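The damping calculation can be reproduced as follows. This is a sketch only: the junction areas are inferred from the critical currents and the 1000 A/cm<sup>2</sup> current density, and the helper names are hypothetical.

```python
import math


def junction_capacitance(i_c, j_c=1000.0, c_spec=50e-15):
    """Capacitance (F) of a junction with critical current i_c (A).

    j_c is the critical current density in A/cm^2 and c_spec the
    specific capacitance in F/um^2.
    """
    area_um2 = i_c / (j_c * 1e-8)  # 1 A/cm^2 = 1e-8 A/um^2
    return c_spec * area_um2


def series_capacitance(c1, c2):
    """Series combination of two capacitances."""
    return c1 * c2 / (c1 + c2)


def damping_resistance(loop_l, c):
    """Equation 2.10: R_d = 0.5 * sqrt(L / C)."""
    return 0.5 * math.sqrt(loop_l / c)


if __name__ == "__main__":
    c_series = series_capacitance(junction_capacitance(100e-6),
                                  junction_capacitance(300e-6))
    print("series capacitance (fF):", c_series * 1e15)
    print("R_d (ohm):", damping_resistance(5.2e-12, c_series))
```

The series capacitance evaluates to 375 fF, and Equation 2.10 then gives a damping resistance of roughly 1.86 $\Omega$, close to the 1.8 $\Omega$ used in the design.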

The source resistance  $R_s$  should be chosen much larger than  $R_{in}$  so that the loading at the source does not change when the gate switches. This will prevent glitches in the power supply due to switching of the gates connected to it. However, a large resistance leads to higher power consumption, more heat dissipation and larger gate area. Considering the above trade-offs, a value of 54  $\Omega$  was chosen for the source resistance  $R_s$ .

#### 2.2.3 Factors Determining the Speed of the MVTL Gate

The delay  $\tau$  of any gate is the sum of the turn on time  $\tau_t$  and the rise time  $\tau_r$ . These are given by the following equations:

$$\tau = \tau_r + \tau_t \tag{2.11}$$

$$\tau_r = \frac{V_g C}{I_{max}} \tag{2.12}$$

$$\tau_t = \frac{A}{\omega_p} \tag{2.13}$$

where:

C is the junction capacitance.

 $I_{max}$  is the maximum gate current at zero voltage state

 $V_g$  is the gap voltage

A is the overdrive factor

 $\omega_p$  is the plasma frequency of the junction defined by,

$$\omega_p = \sqrt{\frac{4\pi e I_{max}}{hC}} \tag{2.14}$$

Both delays can be reduced by decreasing $C/I_{max}$. $C/I_{max}$ depends on the critical current density, the thickness of the barrier, and on the isolation material used (Equation 2.15).

$$\frac{C}{I_{max}} = \frac{\varepsilon}{J_c t} \tag{2.15}$$

where:

 $J_c$  is the critical current density

$\varepsilon$ is the dielectric constant (permittivity) of the tunnel barrier

t is the thickness of the tunnel barrier

 $\varepsilon/t$  is the capacitance per unit area of the junction

Therefore delays can be decreased by increasing the critical current density and reducing the capacitance per unit area.
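To illustrate the scaling expressed by Equations 2.12 through 2.15, the sketch below evaluates the rise time and the plasma frequency at the two current densities considered in this work. The specific capacitance of 50 fF/$\mu$m<sup>2</sup> is the value used elsewhere in this chapter. The overdrive factor A is not specified numerically in the text, so it is left as a free argument, and the function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19  # electron charge, C
PLANCK_H = 6.62607015e-34   # Planck constant, J s


def c_over_imax(j_c, c_spec=50e-15):
    """C / I_max in F/A, from the capacitance per unit area (F/um^2)
    and the critical current density j_c (A/cm^2); see Equation 2.15."""
    return c_spec / (j_c * 1e-8)  # 1 A/cm^2 = 1e-8 A/um^2


def rise_time(v_gap, ratio):
    """Equation 2.12: tau_r = V_g * C / I_max."""
    return v_gap * ratio


def plasma_frequency(ratio):
    """Equation 2.14: omega_p = sqrt(4 pi e I_max / (h C))."""
    return math.sqrt(4.0 * math.pi * E_CHARGE / (PLANCK_H * ratio))


def turn_on_time(overdrive_a, ratio):
    """Equation 2.13: tau_t = A / omega_p (A is a free parameter here)."""
    return overdrive_a / plasma_frequency(ratio)


if __name__ == "__main__":
    for j_c in (1000.0, 5000.0):
        ratio = c_over_imax(j_c)
        print(j_c, "A/cm^2: tau_r =", rise_time(2.8e-3, ratio),
              "s, omega_p =", plasma_frequency(ratio), "rad/s")
```

Raising the current density by a factor of five reduces the rise time by the same factor, while the turn on time falls only as the square root of that factor.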

#### 2.2.4 Threshold Curves

As has been explained before, threshold curves for a given circuit are a set of points, in a gate current - input current (or external magnetic flux) space, representing the boundary at which switching occurs. In this section the threshold curves of the MVTL gate are derived and compared to those of a symmetric two junction SQUID. The currents through $J_1$ and $J_2$ are $pI_{max} \sin \phi_1$ and $qI_{max} \sin \phi_{2}$, respectively. From Kirchhoff's current law we get,

$$I_g + I_c = pI_{max} \sin \phi_1 + qI_{max} \sin \phi_2 \qquad (2.16)$$

Phase quantization around the loop gives,

$$\phi_1 - \phi_2 + (2\pi/\Phi_0) \left[ qL \left( pI_{max} \sin \phi_1 - I_c \right) - pL \left( qI_{max} \sin \phi_2 \right) - MI_c \right] = 2m\pi \tag{2.17}$$

Equations 2.16 and 2.17 determine the threshold curves of an MVTL OR gate without the effect of the isolator junction (Figure 2.9).
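One way to explore Equations 2.16 and 2.17 numerically is to treat the two junction phases as free parameters. Equation 2.17 is linear in $I_c$, so it can be solved directly for the control current, after which Equation 2.16 gives the gate current. The sketch below does this. It is not the procedure used to produce Figure 2.9: it ignores stability of the solutions and the isolator junction, and the mutual inductance is an assumed value, $M = k\sqrt{L_x L}$ with k = 0.6 and $L_x$ = 21 pH. All function names are hypothetical.

```python
import math

PHI_0 = 2.067833848e-15  # magnetic flux quantum, Wb

# Design values quoted in the text.
I_MAX = 400e-6           # A
L_LOOP = 5.2e-12         # H
P, Q = 0.25, 0.75
# Assumption: M = k * sqrt(L_x * L) with k = 0.6 and L_x = 21 pH.
M = 0.6 * math.sqrt(21e-12 * L_LOOP)


def operating_point(phi1, phi2, m=0):
    """Return (I_c, I_g) that satisfy Equations 2.16 and 2.17 for given phases."""
    # Equation 2.17 solved for the control current I_c.
    numerator = ((phi1 - phi2 - 2.0 * math.pi * m) * PHI_0 / (2.0 * math.pi)
                 + P * Q * L_LOOP * I_MAX * (math.sin(phi1) - math.sin(phi2)))
    i_c = numerator / (Q * L_LOOP + M)
    # Equation 2.16 solved for the gate current I_g.
    i_g = P * I_MAX * math.sin(phi1) + Q * I_MAX * math.sin(phi2) - i_c
    return i_c, i_g


def residual_2_17(phi1, phi2, i_c, m=0):
    """Left side minus right side of Equation 2.17 (zero for a valid solution)."""
    flux_term = (Q * L_LOOP * (P * I_MAX * math.sin(phi1) - i_c)
                 - P * L_LOOP * (Q * I_MAX * math.sin(phi2))
                 - M * i_c)
    return phi1 - phi2 + (2.0 * math.pi / PHI_0) * flux_term - 2.0 * math.pi * m


def sweep(n=90, m=0):
    """(I_c, I_g) points on an (n + 1) x (n + 1) grid of phases in [-pi, pi]."""
    step = 2.0 * math.pi / n
    return [operating_point(-math.pi + i * step, -math.pi + j * step, m)
            for i in range(n + 1) for j in range(n + 1)]


if __name__ == "__main__":
    print(operating_point(math.pi / 2, math.pi / 2))
    print(len(sweep()))
```

With both phases at $\pi/2$ the sketch returns zero control current and a gate current equal to $I_{max}$. Plotting the swept points in the $I_c$-$I_g$ plane should give a cloud whose outer boundary approximates the static threshold curve for mode m.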

Margins on the gate current are defined as the positive and negative variation allowed in the gate current about the bias value without any malfunction of the circuit. From now onward, margins will be used to refer to gate current margins, unless otherwise specified.

The unique structure of the MVTL threshold curves results in very high margins. Comparing the threshold curves in Figure 2.9 to those of the symmetrical SQUID shown in Figure 1.5, it can be seen that the use of both current injection and magnetic coupling results in a higher slope of the threshold curves for MVTL gates. Since the two sides of the SQUID in the MVTL gate are not identical, its threshold curves are skewed in two ways. First, the curves for different modes are vertically shifted, allowing larger variations in the gate current. Second, due to the asymmetric nature of the SQUID, the left and right sides of any one threshold curve are not identically shaped. The right side is very steep, reducing the response time of the circuit, which depends on the overdrive (refer to Section 2.2.3), and the left side is almost flat, resulting in higher margins. The threshold curves in Figure 2.9 get modified when the isolator junction is included. For control currents larger than the critical current of $J_3$, the threshold curve is defined mainly by the dynamics of the isolator junction and becomes nearly a horizontal line, as shown in Figure 2.10.



Figure 2.9 Threshold curves for the MVTL gate without the isolator junction.



Figure 2.10 Threshold curves for MVTL gate with the isolator junction.

#### 2.3 OR Gate Simulations and Layout

This section elaborates the simulation results and layouts of the gate designed in the previous section.

#### 2.3.1 Simulation Results

In this section the parameters used for each design are summarized and the layouts and the simulation results for delays, reset times and margins are presented. All the gates were simulated using JSIM (Josephson SIMulator) for delay measurements and parameter variations. The Josephson junction model was also built into HSPICE, which was used for more extensive, large scale simulations. The model parameters used for the junction are given in Table 2.3. The final parameters used for all the gates are summarized in Table 2.4. General parameters are those used in all the gates; the other parameters are specific to the OR gate.

| Parameter                            | Values                        |
|--------------------------------------|-------------------------------|
| Critical current density             | 1000 A/cm <sup>2</sup> †      |
| Normal resistance                    | 14 Ω                          |
| Sub-gap resistance                   | 300 Ω                         |
| Specific capacitance (per unit area) | $50 \text{ fF}/\mu\text{m}^2$ |
| Gap voltage                          | 2.8 mV                        |
| V <sub>m</sub>                       | 30 mV                         |
| ΔV                                   | 0.1 mV                        |

Table 2.3: Josephson junction parameters used for simulations.

Table 2.4: Parameter values selected.

| Parameter name                       | Value used in final design |
|--------------------------------------|----------------------------|
| **General parameters**               |                            |
| Source resistance R<sub>s</sub>      | 54 Ω                       |
| **OR gate parameters**               |                            |
| Input resistances R<sub>in</sub>     | 7 Ω                        |
| Critical current of J<sub>1</sub>    | 100 µA                     |
| Critical current of J<sub>2</sub>    | 300 µA                     |
| Critical current of J<sub>3</sub>    | 100 µA                     |
| Isolation resistance R<sub>i</sub>   | 1.0 Ω                      |
| Damping resistance R<sub>d</sub>     | 1.8 Ω                      |
| Loop inductance L                    | 5.2 pH                     |
| Left loop inductance                 | 3.9 pH                     |
| Right loop inductance                | 1.3 pH                     |
| Input wire inductance L<sub>x</sub>  | 4 pH                       |
| Mutual coupling factor k             | 0.6                        |



Figure 2.11 Gate arrangements for (a) margins and delay simulations; (b) maximum clock rate determinations.

To simulate a realistic situation with gates fanning out to similar gates, each having a fanout of two, the gates were configured as shown in Figure 2.11. Each of the circles represents a single OR gate. All measurements were done on gate 2 since it has realistic inputs and outputs. All simulation results are given for two different current densities, $1000 \text{ A/cm}^2$ and $5000 \text{ A/cm}^2$. Junction areas were appropriately adjusted to correspond to the different current densities. Delays, gate current margins (positive and negative) and the fastest three phase sinusoidal clock at which each of the gates could function correctly for fanouts of one and two are summarized in Table 2.5.

|                               | Current density<br>= $1000 \text{ A/cm}^2$ | Current density<br>= $5000 \text{ A/cm}^2$ |
|-------------------------------|--------------------------------------------|--------------------------------------------|
| Delay (ps)                    | 20.9                                       | 12.6                                       |
| Low speed $I_g$ margin (±)    | 33%                                        | 30.2%                                      |
| Maximum clock rate (fanout=1) | 11 GHz                                     | 30 GHz                                     |
| Maximum clock rate (fanout=2) | 11 GHz                                     | 25 GHz                                     |

Table 2.5: Important properties of the OR gate measured from simulations.

The response of the gate to an input from another gate is shown in Figure 2.12. It is seen that the total output current is twice as much as the input current required to switch the gate. Thus the gate has the capability of fanning out to two similar gates.



Figure 2.12 Gate output.

The delay of the gate and its margins are strongly dependent on the current level at which the gate is biased. The variation of delay with bias current at current densities of $1000 \text{ A/cm}^2$ and $5000 \text{ A/cm}^2$ is shown in Figure 2.13. It is clear that the speed of the circuit increases with the bias current. Though delays can be reduced by increasing the gate current, a price is paid in terms of the margins. The trade-off between delay and margins, achieved by gate current variation, is clearly seen in Figure 2.14.



Figure 2.13 Delay vs. gate current at different current densities



Figure 2.14 Gate-current margin vs. delay at different current densities.

The parameters were varied individually to find the variations allowed on each of them. The maximum possible variations on each, and the delays with these values for current densities of  $1000 \text{ A/cm}^2$  and  $5000 \text{ A/cm}^2$  are shown in Tables 2.6 and 2.7.

| Para-<br>meter  | Maximum<br>allowable<br>decrease (%) | Corresponding<br>delay (ps) | Maximum<br>allowable<br>increase (%) | Corresponding<br>delay (ps) |
|-----------------|--------------------------------------|-----------------------------|--------------------------------------|-----------------------------|
| R <sub>in</sub> | 66.7                                 | 20.6                        | 316.7                                | 57.9                        |
| $I_c(J_1)$      | 40                                   | 16.0                        | 120                                  | 20.9                        |
| $I_c(J_2)$      | 16.7                                 | 15.4                        | 53.7                                 | 65.9                        |
| $I_c(J_{3})$    | 100                                  | 27.37                       | 160                                  | 25.78                       |
| R <sub>i</sub>  | 100                                  | 94.1                        | 800                                  | 13.4                        |
| R <sub>d</sub>  | 100                                  | 23.2                        | >2000                                | 20.8                        |

Table 2.6: Parameter values, variations allowed, and effect of variations on delay for the OR gate at a current density of  $1000 \text{ A/cm}^2$ .

It is interesting that the OR gate functions correctly even when the junction $J_3$ is open circuited. This reduces the gate to a magnetically coupled SQUID with no input-output isolation. It is known that these SQUIDs do not show good enough delay/margin characteristics to be used in LSI systems. Increasing $R_i$ has the same effect as removing $J_3$ and therefore $R_i$ has very high upper margins.

Since the only criterion for  $R_i$  is that it needs to be much less than the value of  $R_L$ ,  $R_i$  can be reduced without affecting the functioning/margins of the gate. However, very small resistances are difficult to lay out. It is also seen that the gate can function irrespective of the value of  $R_d$ . For resetting reasons, however (to damp the Josephson oscillations quickly),  $R_d$  must be chosen so that the circuit is critically damped.

| Para-<br>meter  | Maximum<br>allowable<br>decrease (%) | Corresponding<br>delay (ps) | Maximum<br>allowable<br>increase (%) | Corresponding<br>delay (ps) |
|-----------------|--------------------------------------|-----------------------------|--------------------------------------|-----------------------------|
| R <sub>in</sub> | 28.6                                 | 11.9                        | 614                                  | 48.7                        |
| $I_c(J_1)$      | 55                                   | 10.6                        | 55                                   | 15.7                        |
| $I_c(J_2)$      | 55                                   | 8.5                         | 33.3                                 | 22.3                        |
| $I_c(J_{3})$    | 100                                  | 9.8                         | 80                                   | 16.4                        |
| R <sub>i</sub>  | 50                                   | 15.0                        | 400                                  | 7.7                         |
| R <sub>d</sub>  | 98                                   | 14.0                        | 900                                  | 11.2                        |

**Table 2.7:** Parameter values, variations allowed, and effect of variations on delay for the OR gate at a current density of 5000 A/cm<sup>2</sup>.

#### 2.3.2 Statistical Simulations

The Josephson junction model has been built into HSPICE and Monte Carlo simulations were done. This allows the effect of process variations to be taken into account. The MVTL OR gate was simulated with Gaussian variations on the junction critical currents and resistances. This takes into consideration critical current spreads and resistor variations, giving a realistic idea of the performance of these circuits. The significance of these statistical simulations is embodied in this fact: if no circuits go wrong in 30 simulations, then there is a 99% chance that 80% of the circuits will work.
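The 30-run rule of thumb can be motivated by a simple binomial argument: if the true yield were only 80%, the probability of 30 consecutive passing runs would be $0.8^{30}$, which is well below 1%. The sketch below evaluates this. It assumes independent runs, and reading the rule this way is an interpretation rather than a derivation given in this report; the function names are illustrative.

```python
import math


def confidence(n_runs, min_yield):
    """Confidence that the yield exceeds min_yield after n_runs passes
    with no failure, assuming independent runs."""
    return 1.0 - min_yield ** n_runs


def runs_needed(min_yield, target_confidence):
    """Smallest number of failure-free runs giving the target confidence."""
    return math.ceil(math.log(1.0 - target_confidence) / math.log(min_yield))


if __name__ == "__main__":
    print("confidence after 30 clean runs:", confidence(30, 0.8))
    print("runs needed for 99% at 80% yield:", runs_needed(0.8, 0.99))
```

Under these assumptions, 30 failure-free iterations comfortably exceed the 99% confidence level for an 80% yield.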

Statistical simulations were performed on the five-gate structure shown in Figure 2.11. After looking at typical variations in the Fujitsu, UCB and Hypres processes, the following standard deviations were used. All resistors were assumed to have a standard deviation of 6.66%. The junction critical currents were varied along a Gaussian curve with a $\pm 3\sigma$ value depending on the size of the junctions. A graph showing the standard deviation of the junction critical current vs. junction size is shown in Figure 2.15 (these data have been collected from the literature). Only one out of 100 instances went wrong, establishing the robust nature of the gate.



Figure 2.15 Standard deviation of  $I_c$  variations vs. the junction size.

It does not matter how many similar circuits are simulated. In an attempt to check this, the following experiment was done. Thirty simulation iterations each of a 10 gate configuration and a 15 gate configuration were performed. The $3\sigma$ value used was 20% for the resistors and 12% for the junction critical currents. For 10 gates, 5 instances went wrong, and for 15 gates, 4 went wrong. It is clear that the failures do not depend on the size of the circuit. These simulations were repeated several times to make sure the trends were the same.

Statistical simulations were also performed to find the effect of process variations on the highest frequency that a shift register configuration (cascade of three OR gates) could tolerate with a three phase clock. Recall that with the design values of the parameters, the highest achieved frequency with this setup was 30 GHz and 25 GHz, respectively, for fanouts of 1 and 2. With 30 simulation iterations one wrong iteration appeared at 20 GHz for a fanout of 2 and one at 25 GHz for a fanout of 1. Therefore, the gate is reasonably stable at these frequencies.

#### 2.3.3 Layout of the OR Gate

Layouts of all the cells were done in the UCB and Hypres technologies. Since the test results that are presented have been obtained from the Hypres run, only the Hypres run layouts are included. The critical current density was 1000 A/cm<sup>2</sup> and the sheet resistance was 1  $\Omega$ /square.

The layout of the OR gate is shown in Figure 2.16. The dimensions of the gate with and without the input and source resistances are 56.0 x 75.0 $\mu$m<sup>2</sup> and 26.5 x 36.5 $\mu$m<sup>2</sup> respectively.



Figure 2.16 Layout of an MVTL OR gate.

#### 2.4 The AND Gate and the MAJORITY Gate

A complete logic family would consist of OR gates, AND gates and inverters. In addition, some gates may be included to enhance the utility of the logic family. In digital system design, the adder is a frequently used circuit. The MAJORITY gate serves as a convenient way to generate the carry in an adder circuit. In the MVTL family, the 2/3-MAJORITY gate is obtained by slight modification to the AND gate. Therefore, it seemed reasonable to include it in the cell library. However, the inverter is currently not part of the cell library. The reason for this is explained later (Section 2.8).



Figure 2.17 (a) The AND gate (b) The MAJORITY gate.

The schematics of the AND gate and the 2/3 MAJORITY gate are shown in Figure 2.17(a) and Figure 2.17(b), respectively. The AND gate switches when both its inputs are high; the MAJORITY gate switches when two or three of its inputs are high. The operation of both of these gates is controlled by the switching of a single junction. There is no SQUID, and the input current is injected directly into the junction. As a result, these gates have no input-output isolation. They must therefore always be used with OR gates at their inputs, as shown in Figure 2.18.



Figure 2.18 (a) The 2OR-AND cell (b) The 3OR-MAJORITY cell.

As shown in Figure 2.17 and Figure 2.18, the output current of the OR gates is injected into the junction $J_a$ through resistors $R_a$. $J_a$ should be designed to switch when at least two of the OR gates have latched into the resistive state. The gate current into the unswitched OR gate is $0.83I_{max}$. After the OR gate switches, an extra resistance ($R_a$) is added in series with the gate resistance $R_g$. This reduces the gate current to $(0.83I_{max}) R_g/(R_g+R_a)$ or $0.75I_{max}$. The critical current of $J_a$ is therefore set at $0.83I_{max}$. To prevent $J_a$ from getting switched by leakage current from the output, $J_a$ is accompanied by a resistance $R_p$ and a junction $J_p$. $J_p$ sinks leakage currents from the output, preventing $J_a$ from switching. The resistance $R_p$ (0.75 $\Omega$) ensures that currents from the output are diverted into $J_p$ but currents from the input are diverted into $J_a$ for correct operation. The critical current of $J_p$ is fixed at $pI_{max}$, the critical current of the smaller of the junctions in the OR gate. The threshold curve for the AND gate is shown in Figure 2.19.
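The switching condition for $J_a$ can be illustrated with a simple static current-summing model. This is a sketch under simplifying assumptions: each switched OR gate is taken to deliver $0.75I_{max}$ to $J_a$, the $R_p$/$J_p$ branch and all dynamics are ignored, and the names used are hypothetical.

```python
I_MAX = 400e-6                # A, design value from Section 2.2.2
I_OUT_PER_OR = 0.75 * I_MAX   # current delivered by one switched OR gate
I_C_JA = 0.83 * I_MAX         # critical current of J_a


def ja_switches(or_outputs):
    """True if the summed current from the switched OR gates exceeds I_c(J_a).

    or_outputs is a sequence of booleans, one per OR gate feeding J_a
    (two for the 2OR-AND cell, three for the 3OR-MAJORITY cell).
    """
    injected = sum(I_OUT_PER_OR for high in or_outputs if high)
    return injected > I_C_JA


if __name__ == "__main__":
    for inputs in [(False, False), (True, False), (True, True)]:
        print("AND", inputs, ja_switches(inputs))
    for inputs in [(True, False, False), (True, True, False), (True, True, True)]:
        print("MAJORITY", inputs, ja_switches(inputs))
```

In this model a single switched OR gate supplies less than the critical current of $J_a$, while any two supply more, which reproduces both the AND and the 2/3 MAJORITY functions.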



Figure 2.19 Threshold curves for the AND gate.

#### 2.5 Fanout Considerations: The Output Current Amplifier

The operating point on the threshold characteristic sets the fanout of the gate. Each gate provides its own bias current as its output. To understand the fanout that MVTL gates can provide, consider the portion of the threshold curve near the operating point (Figure 2.20).



Figure 2.20 Threshold curves and their relation to fanout.

When the input current is applied, the operating point shifts along the dotted line shown. The gate switches to the resistive state when the operating point crosses the threshold line (marked as x on the figure). The speed of switching depends on the amount of overdrive. Overdrive refers to the actual input current applied divided by the input current at the point where it crosses the threshold curve. The state of the OR gate is defined by the f/o=1, 2, 3 lines for a fanout of one, two and three, respectively. The point on each line where the circuit settles depends on the gate current used. These points are marked by black dots on all the lines for the current design.

Margins on the gate current are defined as the variation allowed in the gate current without any malfunction of the circuit. The vertical arrows in Figure 2.20 show the variation in  $I_g$  allowed for each of the three cases. It is clear that the margins are the same for fanouts of two and one but are reduced for higher fanouts. As a result the OR gate in the MVTL family is used with a fanout of two.

The AND gate and the MAJORITY gate have a larger current output (they output the gate current of two or more OR gates) and can support a fanout of three.

There are several situations where a fanout of three may be desired for an OR gate. In these cases the output current of the OR gate needs to be increased. This is done by the *output current amplifier* cell shown in Figure 2.21.



Figure 2.21 The output current amplifier

Along with the output current from the OR gate, an extra gate current is fed into an AND-gate-type configuration. Since extra current always flows through $R_{as}$, $J_a$ switches as soon as the output of the OR gate is high. The unit therefore outputs about twice the current of the OR gate and the fanout capability is increased. The value of $R_{as}$ should be greater than the value of $R_s$, the source resistance of the OR gate, so that the current through it is lower than the gate current of the OR gate (recall that the critical current of $J_a$ is $0.83I_{max}$). In this work the value of $R_{as}$ has been chosen to be 68 $\Omega$.
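A rough estimate of the currents in the amplifier cell follows from the resistor ratio. The sketch below assumes that $R_{as}$ and $R_s$ are fed from the same supply voltage and that these source resistances dominate the series path, so that the bias currents scale inversely with resistance. It also takes the OR gate bias as $0.83I_{max}$ and its switched output as $0.75I_{max}$, as quoted in Section 2.4. These are simplifying assumptions, and the names are illustrative.

```python
I_MAX = 400e-6     # A
R_S = 54.0         # ohm, source resistance of the OR gate
R_AS = 68.0        # ohm, source resistance of the amplifier cell

I_GATE_OR = 0.83 * I_MAX   # bias current of the OR gate
I_OUT_OR = 0.75 * I_MAX    # output current of a switched OR gate
I_C_JA = 0.83 * I_MAX      # critical current of J_a


def amplifier_bias(r_s=R_S, r_as=R_AS, i_gate=I_GATE_OR):
    """Bias current through R_as, scaled from the OR gate bias by R_s / R_as."""
    return i_gate * r_s / r_as


def amplifier_summary():
    """Return (bias current, J_a holds at rest, J_a switches, current gain)."""
    bias = amplifier_bias()
    holds_at_rest = bias < I_C_JA
    switches = bias + I_OUT_OR > I_C_JA
    gain = (bias + I_OUT_OR) / I_OUT_OR
    return bias, holds_at_rest, switches, gain


if __name__ == "__main__":
    print(amplifier_summary())
```

Under these assumptions the bias through $R_{as}$ alone stays below the critical current of $J_a$, the bias plus the OR gate output exceeds it, and the total available current is a little under twice the OR gate output, consistent with the description above.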

#### 2.6 Simulations

#### 2.6.1 Simulation Results of the AND Gate, MAJORITY Gate and the Output Current Amplifier

The bias point values for the AND, MAJORITY, and the OUTPUT CURRENT AMPLIFIER are given in Table 2.8. Since the performance of each of these gates is limited by that of an OR gate followed by an AND gate, the delays and margins exhibited by these gates are the same and are summarized in Table 2.9. The output waveform of these gates when they are fed from similar gates is shown in Figure 2.22. It is seen clearly that the output current is approximately three times the input current required to switch the gate, establishing the ability of these gates to support a fanout of three.

| Parameter name                            | Value used in final design |
|-------------------------------------------|----------------------------|
| **AND gate and MAJORITY gate parameters** |                            |
| Critical current of J<sub>a</sub>         | 183 µA                     |
| Critical current of J<sub>p</sub>         | 100 µA                     |
| Resistance R<sub>a</sub>                  | 4.0 Ω                      |
| Resistance R<sub>p</sub>                  | 0.75 Ω                     |
| **Output current amplifier parameters**   |                            |
| R<sub>as</sub>                            | 68 Ω                       |

Table 2.8: Parameter values used for the different gates in the cell library.



Figure 2.22 Gate response during simulation.

Table 2.9: Important properties of the gates designed ( $I_c = 1000 \text{ A/cm}^2$ ; Fanout = 3).

| Parameter                   | Value |
|-----------------------------|-------|
| Delay (ps)                  | 26.0  |
| I<sub>g</sub> margin (±%)   | 26.0  |

Each parameter was varied individually to find its allowed margin. The maximum allowable variation of each parameter, and the delay at that extreme for a current density of 1000 A/cm<sup>2</sup>, are shown in Table 2.10.

Table 2.10: Parameter values, variations allowed, and effect of variations on delay.

| Parameter      | Maximum allowable decrease (%) | Corresponding delay (ps) | Maximum allowable increase (%) | Corresponding delay (ps) |
|----------------|--------------------------------|--------------------------|--------------------------------|--------------------------|
| R<sub>a</sub>  | 90                             | 29.55                    | 300                            | 41.6                     |
| J<sub>a</sub>  | 9.1                            | 27.0                     | 76                             | 63.7                     |

The gate current margins of these gates were studied in detail. As in the case of the OR gate, the delays of the 2OR-AND and the MAJORITY gates can be reduced by increasing the bias level. The variation of delay as a function of the gate current is shown in Figure 2.23. However, the margins tend to shrink as the gate current increases. The trade-off between delay and margins as the gate current varies is shown in Figure 2.24. The optimal design (the design with maximum margins) therefore has a delay of 26 ps at the current density used.



Figure 2.23 Delay vs. gate current in the 2OR-AND cell.



Figure 2.24 Margins vs. delay as gate current varies in the 2OR-AND cell.

#### 2.7 Layouts

The layouts of the AND gate, the MAJORITY gate, and the output current amplifier are shown in Figures 2.25, 2.26 and 2.27, respectively. The dimensions, including the source resistances and the input resistances, are 95.5 x 96.0  $\mu$m<sup>2</sup> for the 2OR-AND cell and 162.5 x 75.0  $\mu$m<sup>2</sup> for the MAJORITY gate. Without the input resistances, the dimensions of the 2OR-AND cell are 95.5 x 64.0  $\mu$m<sup>2</sup>. These dimensions include the OR gates added for isolation purposes. Without the OR gates, the dimensions of the AND/MAJORITY gate are 11 x 48.5  $\mu$m<sup>2</sup>. The output current amplifier is 89  $\mu$m x 72  $\mu$m in size.



Figure 2.25 The 2OR-AND cell.



Figure 2.26 The MAJORITY gate (input resistances included).



Figure 2.27 The output current amplifier.

#### 2.8 Dual Rail Logic

Since, unlike CMOS, switching in Josephson circuits is noninverting, inverters are very difficult to design. The MVTL logic family has a *timed inverter* which needs to be clocked separately in a single stage. This is because its functioning requires that the input arrive before the clock. This introduces difficult timing problems and has very low margins, especially at the high speeds for which the superconducting technology is targeted. Inverters are avoided as much as possible in system designs using MVTL. The inputs are inverted once to obtain signal and signal-inverse. The rest of the logic runs in dual rail. In this scheme, the inverse is always generated along with the true signal by using appropriate logic. Therefore, no inverters are needed on the chip except at the inputs.
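The dual rail convention can be illustrated with a small behavioural sketch. It assumes that every signal is carried as a (true, complement) pair and that only noninverting OR and AND operations are available after the input stage. The function names are illustrative and are not part of the cell library.

```python
from itertools import product

def dual_rail_input(a):
    """Invert once at the chip input to obtain (signal, signal-inverse)."""
    return (a, not a)

def dr_or(x, y):
    """Dual rail OR: true rail is an OR, complement rail is an AND (De Morgan)."""
    return (x[0] or y[0], x[1] and y[1])

def dr_and(x, y):
    """Dual rail AND: true rail is an AND, complement rail is an OR (De Morgan)."""
    return (x[0] and y[0], x[1] or y[1])

def dr_not(x):
    """Logical inversion needs no gate: the two rails are simply exchanged."""
    return (x[1], x[0])

# Every output pair stays complementary, so no inverter is needed downstream.
for a, b in product([False, True], repeat=2):
    A, B = dual_rail_input(a), dual_rail_input(b)
    for out in (dr_or(A, B), dr_and(A, B), dr_not(A)):
        assert out[0] != out[1]
```

The complement rail doubles the gate count of each function, which is the price paid for avoiding the timed inverter inside the logic.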

#### 2.9 Clocking Strategy

The powering scheme in Josephson logic is complicated by its latching nature. If the result of a particular logic evaluation for any gate is high, the gate latches into the resistive state. Before the next evaluation it has to be brought back to the nonresistive state. This is done by reducing the gate current to zero. However, the output of the gate should propagate into the next stage before the result of the present computation is lost by the reduction of the power supply. If the next gate is powered by the same power supply, logic ripples through while the gate current is high. If the next stage is powered by a gate current in a different phase the two consecutive phases of the gate current must overlap sufficiently to allow transfer of logic.

Since the power supply to the circuit (the gate current) is not constant, but switches like a clock, the terms "clock" and "power supply" will be used interchangeably. The circuit receives new data every clock cycle. Consider the case where every gate is clocked in a phase different from its neighbors. The "on" period of each clock phase therefore needs to be equal to the delay of the slowest unit plus some setup/hold time. The "off" period is determined by the reset time of the gate. The reset time is the time taken by the gate to return to the nonresistive state after the gate current has been lowered. This case results in the highest possible throughput. It has been found that the reset time for a gate is larger than its delay. Also, the gate current must be low long enough to avoid punchthrough (the concept of punchthrough will be explained later in this section). The clock period is therefore limited by the "off" period, and the "on" period may not be fully utilized. Because of this, more than one consecutive stage may be powered by the same phase.

As noted above, for correct transfer of logic, the next stage should be triggered before the output of the current stage has been reset. If a single phase powering scheme is used, two types of latches are needed: a master latch, which keeps the data when the power is reset, and a slave latch, which receives the data from the master when the power rises and passes it on to the next logic element. If a two phase powering scheme is used, the two phases must also overlap to enable transfer of logic to subsequent stages. A slave latch is needed to prevent race conditions. For more information on one and two phase powering refer to [Jon82]. In the case of three or more clocks, no latches are necessary between logic stages. Multi-phase clock schemes also help to reduce the ground bounce problem, which is common for large circuits. Due to the simplicity of this scheme, a *three phase* clock has been chosen.

There are different options for the type of waveform on the clock. It may be *sinusoidal* or *trapezoidal*, *unipolar* or *bipolar*. A trapezoidal waveform has the advantage that the bias is constant during the active time. However, at the targeted frequencies, the trapezoidal clock introduces higher harmonic frequency components that require design measures to protect the power bus from resonance. Even though the sinusoidal clock causes a reduction in the margins of the circuit, it is favored because of the absence of harmonics. Another factor in favor of the sinusoidal clock is waveform shaping. For trapezoidal clocks, the waveform is clipped using regulator junctions. These junctions also help in preventing oscillations on the power bus. For a sinusoidal power bus, no devices are required for shaping, but terminal resistors are added to reduce oscillations on the power bus. It has been observed that regulator junctions dissipate 1.5 times as much power as the gates, whereas the terminal resistors dissipate the same power as consumed by the gates and are thus cheaper [Fuj89]. Sinusoidal clocks are thus cheaper in terms of power and safer due to the absence of higher harmonics.



Figure 2.28 Reset times for unipolar and bipolar sinusoidal clocks compared.

The choice between unipolar and bipolar clocks depends on the *punchthrough* probability. Punchthrough is said to occur when a gate in the "1" state returns to a "1" state when the clock is lowered and again raised with no input current. The probability of punchthrough depends on how long the gate current remains below a certain level,  $I_{min}$. This time is called the *reset time*,  $T_r$.  $I_{min}$  is defined as the current below which the junction resets to the zero-voltage state when the gate bias is decreased. Its value depends on the junction capacitance, the critical current density, the sub-gap resistance and the load resistance, and ranges between  $0.25I_{max}$  and  $0.5I_{max}$. The punchthrough probability is an exponentially decreasing function of the reset time, as shown in Equation 2.18.

$$P = e^{-\omega_p T_r} \tag{2.18}$$

Figure 2.28 shows the normalized reset time  $T_r/T_{clk}$  for different values of  $I_{min}/I_{max}$ . It is clear that the reset time available with a unipolar clock is higher. The decision was therefore to use a sinusoidal, three phase, unipolar clock.
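The comparison can be sketched numerically under idealized waveform assumptions that are not taken from Figure 2.28: the unipolar clock is modelled as a raised cosine between 0 and $I_{max}$, the bipolar clock as a pure sinusoid of amplitude $I_{max}$ with one logic cycle per half period, and the reset time is taken as the time per logic cycle during which the magnitude of the gate current stays below $I_{min}$. The function names are illustrative.

```python
import math

def reset_fraction_unipolar(r):
    """T_r / T_clk for I(t) = I_max * (1 - cos(2*pi*t/T_clk)) / 2, r = I_min/I_max."""
    return math.acos(1.0 - 2.0 * r) / math.pi

def reset_fraction_bipolar(r):
    """T_r / T_clk for I(t) = I_max * sin(pi*t/T_clk), one logic cycle per half period."""
    return 2.0 * math.asin(r) / math.pi

def punchthrough_probability(omega_p, t_r):
    """Equation 2.18: P = exp(-omega_p * T_r)."""
    return math.exp(-omega_p * t_r)

# Sweep the range of I_min / I_max quoted in the text.
for r in (0.25, 0.375, 0.5):
    print(r, reset_fraction_unipolar(r), reset_fraction_bipolar(r))
```

Under these assumptions the unipolar waveform spends a larger fraction of each cycle below $I_{min}$ over the whole range $0.25 \le I_{min}/I_{max} \le 0.5$, and by Equation 2.18 a longer $T_r$ gives a lower punchthrough probability.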

#### 2.10 Summary

A cell library of MVTL gates has been designed, simulated and fabricated in the Hypres process. The final results from the fabricated chip are given in Chapter 4. Before examining the fabrication results of the cell library, a system level application of the cell library will be described. In Chapter 3, the design and simulation results of a 3 tap, 4 bit FIR filter are described in detail. The filter was fabricated in the Hypres technology and the results are presented in Chapter 4.

# 3

### **FIR Filter**

#### **3.1 Introduction**

The high speed and low power consumption of superconductive circuits have generated a lot of research interest in superconductive circuit design. The high speeds of these circuits, however, cannot be fully exploited unless they can be interfaced to the real world. Real world data are in a "noisy" form: they are coupled with a large variety of extraneous data or "noise". Signal processing can be used to represent, transform and manipulate signals. Sophisticated signal processing algorithms and hardware are prevalent in a wide range of systems, from highly specialized military systems through industrial applications to low cost, high volume consumer electronics. Probably the most important class of signal processors are filters. They form important front and back ends of many electronic systems.

For the high speed of superconducting circuits to be fully exploited, signal processors that provide them with noise-free data should also operate at the same speed. Therefore, for superconducting technology to be utilized in the present day IC market, signal processing capability is very important.

Low pass filters may be used for several purposes, such as eliminating high frequency noise from incoming data and removing higher harmonics to obtain clean single frequency waveforms. Another important class of signal processing problems is signal interpretation. Input signals may be processed in order to be understood or recognized, as in speech or handwriting recognition systems. Some systems may deliberately combine two or more signals for processing steps. These combined signals may later need to be separated.

A digital filter has been chosen over an analog one for two reasons. First, several techniques exist to convert analog signals to digital signals, and there are several ongoing efforts to make high speed superconducting analog-to-digital converters (ADCs). These form convenient interfaces to the outside world. Second, digital filters are easier to design, cheaper to build and more robust than their analog counterparts.

Section 3.2 presents a brief overview of FIR filters. Section 3.3 gives an overall design description of the targeted FIR filter and Section 3.4 describes its implementation in detail. Finally, a brief summary is presented in Section 3.5.

#### **3.2 FIR Filters**

A finite impulse response filter is a filter whose impulse response has nonzero values only for a finite duration. The transfer function of such a filter is given as

$$H(z) = \sum_{m=0}^{M} a_m z^{-m} \tag{3.1}$$

and the impulse response is,

$$h(n) = \begin{cases} a_n, & n = 0, 1, 2, \dots, M \\ 0, & \text{otherwise} \end{cases} \tag{3.2}$$
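Equations 3.1 and 3.2 can be checked with a minimal direct-form sketch. The function name and the coefficient values below are arbitrary illustrations, not the filter designed in this work. Feeding a unit impulse returns the coefficients themselves, followed by zeros.

```python
def fir_direct(x, a):
    """y[n] = sum over m of a[m] * x[n - m], taking x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for m, coeff in enumerate(a):
            if n - m >= 0:
                acc += coeff * x[n - m]
        y.append(acc)
    return y

coeffs = [0.5, 0.25, 0.125]            # arbitrary illustrative coefficients (M = 2)
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]    # unit impulse
response = fir_direct(impulse, coeffs)
print(response)                        # nonzero only for n = 0 .. M
```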



Figure 3.1 Simple, nonpipelined version of a FIR filter.

The basic structure of an FIR filter is shown in Figure 3.1. There are several possible realizations of this filter, with varying amounts of parallelism and pipelining. Parallelism reduces the time taken by increasing the number of hardware units used. A fully parallel structure has dedicated hardware for each operation. Parallel structures are simple in that they require less control. Pipelining refers to the introduction of latches at different stages of the structure. This allows the inputs to be clocked at a faster rate and keeps the hardware busy all the time. The throughput of the circuit is increased considerably without much penalty in area. The area is increased due to the introduction of the latches, but latches are much cheaper than computational blocks like adders and multipliers. A fully pipelined structure has latches after every operator, data is clocked in on every cycle, and every hardware unit is busy in each cycle.

Simple modifications to the above structure can give better implementations. The filter can be retimed by moving the delays across the multiplications and the additions, reducing the critical path. This gives better performance without any extra cost in hardware. For the 5-tap filter shown, the critical path reduces from five to two, and the throughput is thus increased by applying this simple technique. Instead of supplying the input after every five cycles, the filter is ready for new inputs after every two cycles. A retimed version is shown in Figure 3.2.
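One common retiming of the direct form is the transposed structure, in which a register sits between every pair of adders. Whether Figure 3.2 shows exactly this arrangement is an assumption. The sketch below, with illustrative names and integer test data, checks that the two structures produce the same output. Counting one unit per operation, the direct 5-tap form has a path of one multiplication and four additions, whereas the transposed form has one multiplication and one addition between registers.

```python
def fir_direct(x, a):
    """Direct form: all products are summed in one long adder chain."""
    return [sum(a[m] * x[n - m] for m in range(len(a)) if n - m >= 0)
            for n in range(len(x))]

def fir_transposed(x, a):
    """Transposed form: registers between adders shorten the combinational path."""
    state = [0] * (len(a) - 1)          # one register per delay
    y = []
    for sample in x:
        products = [c * sample for c in a]
        y.append(products[0] + (state[0] if state else 0))
        # Each register captures its own product plus the next register's value.
        state = [products[k + 1] + (state[k + 1] if k + 1 < len(state) else 0)
                 for k in range(len(state))]
    return y

taps = [1, 2, 3, 2, 1]                  # illustrative 5-tap coefficients
data = [3, -1, 4, 1, -5, 9, 2, -6]
assert fir_direct(data, taps) == fir_transposed(data, taps)
```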



Figure 3.2 A retimed version of the FIR filter.

The latching nature of Josephson circuits and the use of a three phase clocking scheme allow pipelining as fine as required without the need for latches. For maximal pipelining, every gate should be clocked at a different phase from the previous gate. In this way the maximal clock speed can be achieved, but the available reset time  $T_r$  becomes limited.

#### 3.3 Design of a 4-Bit, 3-Tap, Low Pass, Josephson FIR Filter

The FIR filter designed as part of this work is a 3-tap, 4-bit FIR filter. The coefficient values used in the design are given below.

$$a_0 = 0.375, \quad a_1 = 1.000, \quad a_2 = 0.375$$
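The low pass character of these coefficients can be verified by evaluating Equation 3.1 on the unit circle, $z = e^{j\omega}$. The following is a minimal floating point sketch with illustrative names; it does not model word length effects.

```python
import cmath
import math

COEFFS = [0.375, 1.0, 0.375]

def magnitude_response(coeffs, omega):
    """|H(e^{j*omega})| obtained from Equation 3.1 with z = e^{j*omega}."""
    h = sum(c * cmath.exp(-1j * omega * m) for m, c in enumerate(coeffs))
    return abs(h)

dc_gain = magnitude_response(COEFFS, 0.0)            # gain at zero frequency
nyquist_gain = magnitude_response(COEFFS, math.pi)   # gain at half the sample rate
print(dc_gain, nyquist_gain)
```

Because the coefficients are symmetric, the magnitude reduces to $1 + 0.75\cos\omega$, which falls monotonically from 1.75 at $\omega = 0$ to 0.25 at $\omega = \pi$.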

For reasons discussed in the previous section, a fully parallel structure with a limited amount of pipelining was used.



Figure 3.3 Basic structure of the 3-tap, 4-bit FIR filter.

Two types of data representation can be used to implement arithmetic functions. The fixed point representation assumes that the position of the decimal point is fixed. Floating point number representation consists of two fixed point numbers, the mantissa m and the exponent e. The floating point number f is given by the following equation.

$$f = m \cdot 2^e \tag{3.3}$$

Digital filters use the fixed point representation. This may introduce finite word length errors for the following reason. The input of the filter (e.g., from an A/D converter) will be a finite length sequence. The results of processing may, however, lead to filter variables that may require additional bits for accurate representation. For example, a b-bit input multiplied by a b-bit coefficient would result in a 2b-bit result. This product is truncated resulting in truncation errors. A high level simulation of the filter was done using the HYPER high level synthesis system. Simulations were done for both floating point and 4-bit, fixed point data representation methods, and the frequency responses obtained in the two cases are shown in Figure 3.4. In this case, the performance of the 4-bit representation closely tracks the floating point representation.
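The truncation error described above can be illustrated with a small sketch. It assumes a 4-bit two's complement word with one sign bit and three fractional bits; this format and the helper names are assumptions made for illustration and are not taken from the filter design.

```python
import math

FRAC_BITS = 3  # assumed format: 1 sign bit + 3 fractional bits

def to_fixed(value, frac_bits=FRAC_BITS):
    """Quantize by dropping low-order bits (rounds toward minus infinity)."""
    return math.floor(value * (1 << frac_bits))

def fixed_mul_truncate(a_fx, b_fx, frac_bits=FRAC_BITS):
    """The full product carries 2 * frac_bits fractional bits; drop the extra ones."""
    return (a_fx * b_fx) >> frac_bits

def to_float(fx, frac_bits=FRAC_BITS):
    return fx / (1 << frac_bits)

x, c = 0.625, 0.375
exact = x * c                              # full precision product
x_fx, c_fx = to_fixed(x), to_fixed(c)      # integer codes of x and c
approx = to_float(fixed_mul_truncate(x_fx, c_fx))
error = exact - approx                     # truncation error, below one LSB
print(exact, approx, error)
```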



Figure 3.4 Frequency response of the 3-tap FIR filter comparing the floating point and fixed point implementations.

Since one of the inputs to each multiplier is constant, given by the filter coefficient, add-shifts can be used instead of multipliers, reducing the amount of hardware needed. For a custom designed filter the shifts can be hard-wired, immensely reducing the gate count needed for multiplication. Delays can be implemented with OR gates clocked carefully to function as latches. In the following sections, the design of the basic elements is discussed, issues concerning fanout constraints and sign extension are explained and an overall picture of the filter is presented.

#### **3.4 Implementation**

#### 3.4.1 The Full Adder and the Half Adder

The half adder has two parts corresponding to the sum and the carry evaluations. The logic equations for the sum and the carry are given below.

$$SUM = A \oplus B \tag{3.4}$$

$$CARRY = AB \tag{3.5}$$

The sum evaluation circuit is essentially a set of XOR gates. Equation 3.6 and Equation 3.7 show *product of sums* forms of an XOR function and its inverse. Since the dual rail logic is used, the signal complements are always available. The product of sums is the natural outcome of the 2OR-AND cell (see Figure 2.18). The carry and its inverse are generated with an AND gate and an OR gate respectively. The logic used for the half adder is shown in Figure 3.5.

$$A \oplus B = \overline{A}B + A\overline{B} = (A + B)(\overline{A} + \overline{B}) \tag{3.6}$$

$$\overline{A \oplus B} = AB + \overline{A}\,\overline{B} = (\overline{A} + B)(A + \overline{B}) \tag{3.7}$$



Figure 3.5 Sum and carry generation in a half adder.
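The product of sums forms can be checked exhaustively with a behavioural model. The sketch below models a 2OR-AND cell as the AND of two 2-input ORs and assumes that both rails of every input are available. The function names are illustrative, and the model says nothing about timing or current levels.

```python
from itertools import product

def or_and_2(p, q, r, s):
    """Behavioural model of a 2OR-AND cell: (p + q)(r + s)."""
    return (p or q) and (r or s)

def half_adder_dual_rail(a, a_bar, b, b_bar):
    sum_ = or_and_2(a, b, a_bar, b_bar)        # product of sums form of A xor B
    sum_bar = or_and_2(a_bar, b, a, b_bar)     # product of sums form of its inverse
    carry = a and b                            # CARRY = AB
    carry_bar = a_bar or b_bar                 # complement of CARRY by De Morgan
    return sum_, sum_bar, carry, carry_bar

for a, b in product([False, True], repeat=2):
    s, s_bar, c, c_bar = half_adder_dual_rail(a, not a, b, not b)
    assert s == (a != b) and s_bar == (not s)
    assert c == (a and b) and c_bar == (not c)
```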

The full adder has an additional input, the carry in from the previous bit. The logic equations for the sum and the carry for a full adder are given below.

$$SUM = A \oplus B \oplus C_{in} \tag{3.8}$$

$$CARRY = AB + BC_{in} + AC_{in} \tag{3.9}$$

As in the half adder, the XOR function is generated with 2OR-AND gates. Since the SUM here is the XOR of three inputs, two stages of 2OR-AND gates are needed. The implementation of the sum generation is shown in Figure 3.6.



Figure 3.6 Sum generation in the full adder with dual rail logic.

The carry function outputs "1" if two or more inputs to it are high. For a full adder, therefore, the carry circuit is basically a MAJORITY gate as shown in Figure 3.7.



Figure 3.7 Carry generation in the full adder.
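The full adder logic described above can be verified with a similar behavioural sketch: two cascaded dual rail XOR stages for the sum, and one MAJORITY function per rail for the carry. The names are illustrative. The gate usage of this model (four 2OR-AND cells and two MAJORITY gates) matches the full adder row of Table 3.1.

```python
from itertools import product

def or_and_2(p, q, r, s):
    """Behavioural model of a 2OR-AND cell: (p + q)(r + s)."""
    return (p or q) and (r or s)

def xor_dual_rail(x, x_bar, y, y_bar):
    """One XOR stage: returns (x xor y, inverse) using two 2OR-AND cells."""
    return or_and_2(x, y, x_bar, y_bar), or_and_2(x_bar, y, x, y_bar)

def majority(p, q, r):
    """High when two or more inputs are high."""
    return (p and q) or (q and r) or (p and r)

def full_adder_dual_rail(a, a_bar, b, b_bar, c, c_bar):
    t, t_bar = xor_dual_rail(a, a_bar, b, b_bar)     # first stage: A xor B
    s, s_bar = xor_dual_rail(t, t_bar, c, c_bar)     # second stage: xor C_in
    carry = majority(a, b, c)
    carry_bar = majority(a_bar, b_bar, c_bar)        # majority is self-dual
    return s, s_bar, carry, carry_bar

for a, b, c in product([False, True], repeat=3):
    s, s_bar, cy, cy_bar = full_adder_dual_rail(a, not a, b, not b, c, not c)
    assert s == (a ^ b ^ c) and s_bar == (not s)
    assert cy == (a + b + c >= 2) and cy_bar == (not cy)
```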

The numbers of gates and junctions needed to generate the sum and carry for one-bit full and half adders are given in Table 3.1.

| Name       | OR gates | Current<br>amplifiers | OR-AND<br>cells | MAJORITY<br>gates | Junctions |
|------------|----------|-----------------------|-----------------|-------------------|-----------|
| Full Adder | -        | -                     | 4               | 2                 | 54        |
| Half Adder | -        | 1                     | 3               | -                 | 29        |

Table 3.1: Gate and junction count for 1 bit full and half adders.

The regularity of the logic enables a very compact layout for the full adder as shown in Figure 3.8. The dimensions of the adder are as follows:

• Full adder =  $(227.0 \times 483.0) \ \mu m^2$

• Half adder =  $(151.5 \times 404.0) \ \mu m^2$

• Four bit full adder =  $(483.5 \times 918.0) \ \mu m^2$

A 4-bit full adder consists of a single half adder followed by three full adders. The carry-out of each bit is the carry in of the next higher bit. A layout of a four bit full adder is given in Figure 3.9.
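The ripple-carry arrangement can be expressed as a short bit-level sketch, with one half adder at the least significant bit followed by three full adders. It is single rail for brevity, and the function names are illustrative.

```python
def half_adder(a, b):
    """Returns (sum, carry) for two input bits."""
    return a ^ b, a & b

def full_adder(a, b, c_in):
    """Returns (sum, carry) for two input bits and a carry in."""
    return a ^ b ^ c_in, (a & b) | (b & c_in) | (a & c_in)

def add4(x, y):
    """4-bit ripple-carry adder; returns (4-bit sum, carry out)."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]
    s0, carry = half_adder(xb[0], yb[0])       # LSB needs no carry in
    sums = [s0]
    for i in range(1, 4):                      # carry out feeds the next higher bit
        s, carry = full_adder(xb[i], yb[i], carry)
        sums.append(s)
    return sum(bit << i for i, bit in enumerate(sums)), carry

for x in range(16):
    for y in range(16):
        total, c_out = add4(x, y)
        assert total + 16 * c_out == x + y
```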





Figure 3.8 1-bit full adder layout.



Figure 3.9 Layout of a 4-bit adder.

#### 3.4.2 The Delay Element

The delay element must make sure that every time it receives an input the corresponding output is fed into the next stage one clock cycle later. The latching property of Josephson circuits reduces the task of latching data to a task of correctly clocking it. In this work the delay element has been implemented by a set of three OR gates clocked in series. To meet fanout requirements, the last gate is an output current amplifier. The basic structure and the timing diagram are shown in Figure 3.10.



Figure 3.10 Implementation of the unit delay.

#### 3.4.3 Converting Multiplies to Shift-Adds

In order to reduce the number of gates, the multipliers are converted to shifts and adds. The two's-complement notation is used for binary number representation. Consider the multiplication of the input with the coefficient 0.375. The two's complement representation of 0.375 is 0.011.

0.011 \* x

The multiplication by 0.375 can therefore be converted into one addition and two shifts. For a right shift in two's complement notation, the last (sign) bit needs to be used for sign extension. The sign extension in the case of one or two shifts is shown below. When a bit is used for sign extension, the fanout increases. An appropriate fanout tree is needed at each sign extension. Figure 3.11 shows how the above multiplication has been split into shifts and adds along with sign extension and the fanout tree.
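A bit-level sketch of the shift-add multiplication follows. It reads 0.011 directly as $2^{-2} + 2^{-3}$ and operates on 4-bit two's complement patterns. The order in which the hardware of Figure 3.11 applies the shifts and the addition may differ, and the function names are illustrative.

```python
BITS = 4
MASK = (1 << BITS) - 1

def asr(pattern, shift, bits=BITS):
    """Arithmetic right shift: vacated high-order bits are copies of the sign bit."""
    sign = (pattern >> (bits - 1)) & 1
    shifted = pattern >> shift
    if sign:
        shifted |= ((1 << shift) - 1) << (bits - shift)   # sign extension
    return shifted & ((1 << bits) - 1)

def to_signed(pattern, bits=BITS):
    """Interpret a bit pattern as a two's complement integer."""
    return pattern - (1 << bits) if pattern & (1 << (bits - 1)) else pattern

def times_0_375(pattern):
    """x * 0.011b computed as (x >> 2) + (x >> 3), truncated to 4 bits."""
    return (asr(pattern, 2) + asr(pattern, 3)) & MASK

# The bit-level result agrees with integer arithmetic on signed values.
for x in range(-8, 8):
    assert to_signed(times_0_375(x & MASK)) == (x >> 2) + (x >> 3)
```

Each replicated sign bit is an extra load on the gate that drives it, which is why a fanout tree accompanies every sign extension.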



Figure 3.11 Multiplication implemented in add-shifts. Fanout tree and sign extension shown.

#### 3.4.4 Assembling the Filter

The input  $x_{in}$ , as seen in the Figure 3.3, feeds into several other gates. Assuming that the input of the filter is the output of some other MVTL gates, it can only handle a fanout of two. At the input, therefore, a large fanout tree is built with output current amplifier cells. The block diagram of the current implementation with the fanout blocks, the sign extension blocks, the full adders and the delay elements is shown in Figure 3.12.



Figure 3.12 Block diagram; 3-phase timing arrangement.

The gate count and the number of junctions of the straightforward implementation are given in Table 3.2. Since this is a custom design, additional reduction of the number of junctions was possible by inspecting the logic and removing those parts that were not used. Some of the optimizations are the following. Two of the filter coefficients are the same, so the number of multiplications could be reduced to one. The carry of the last bit of any of the adders is not used and therefore was not generated. In cases where the outputs of an adder were right-shifted, rendering the lower order sum bits useless, the sum part of the circuitry was removed from the lower order bits. The second half of Table 3.2 shows the gate and junction count after optimization. A 33% reduction in the number of junctions was achieved. The layouts of the unoptimized and optimized versions are shown in Figure 3.13 and Figure 3.14, respectively. For the optimized version, the floorplan of the chip is shown in Figure 3.15. The final chip photo with the pads is shown in Figure 3.16.

**Total number in the filter before optimization**

|                    | Full Adder (12) | Half Adder (4) | Fanout Trees | Delay Units | Total |
|--------------------|-----------------|----------------|--------------|-------------|-------|
| OR gates           | -               | -              | 12           | 24          | 36    |
| Current amplifiers | -               | 4 × 1          | 64           | 16          | 84    |
| 2OR-AND cells      | 12 × 4          | 4 × 3          | -            | -           | 60    |
| MAJORITY gates     | 12 × 2          | -              | -            | -           | 24    |
| Junctions          | 648             | 116            | 356          | 152         | 1272  |

**Total number in the filter after optimization**

|                    | Full Adder (9)  | Half Adder (3) | Fanout Trees | Delay Units | Total |
|--------------------|-----------------|----------------|--------------|-------------|-------|
| OR gates           | -               | -              | 12           | 24          | 36    |
| Current amplifiers | -               | 3              | 50           | 16          | 69    |
| 2OR-AND cells      | 32              | 7              | -            | -           | 39    |
| MAJORITY gates     | 12              | -              | -            | -           | 12    |

Table 3.2: Gate and junction count of the final implementation.



Figure 3.13 The FIR filter layout (unoptimized).



Figure 3.14 Optimized filter layout.



Figure 3.15 Floorplan of the optimized filter.



Figure 3.16 The FIR filter chip.

#### 3.5 Summary

The 3 tap, 4 bit FIR filter has been designed and a complete custom layout has been presented. The final layouts were fabricated in the Hypres foundry. The results from the fabricated chip are elucidated in Chapter 4.

# 4

### **Testing and Results**

#### 4.1 The Test Setup

The MVTL cell library and the FIR filter were fabricated in the UCB and Hypres processes. Due to some problems with the UCB run, however, only the Hypres run could be tested. In this section the test setup is described and general problems in superconducting circuit testing are discussed. Section 4.2 describes the test results in detail.

There are several problems which lead to lower yields in superconducting circuits.

• Fabrication issues

The fabrication process for superconducting circuits is not as mature as that for semiconductor circuits.

• High frequency related problems

Inductances become extremely crucial at the high frequencies at which the Josephson technology is targeted. However there is no accurate theoretical 3D model for the self inductance of lines on a chip. As a result, it is difficult to lay them out exactly.

Mutual inductances are also critical at high frequencies and crosstalk becomes dominant if signal lines are routed close to high current carrying clock lines. Due to gigahertz operating speeds on-chip metal lines behave as transmission lines requiring all connections to be impedance matched. This is difficult since it implies accurate knowledge of the metal thicknesses beforehand.

• Difficulties due to superconducting lines

Flux trapping during testing may drastically change the characteristics of the circuit in general and the critical currents of the junctions in particular.

The test setup for superconductive circuits is complicated by the following facts. First, the temperature of the chips needs to be maintained at 4.2 K. Second, due to the low temperature and the extremely low resistances, superconducting loops are formed. These loops are highly prone to flux trapping. SQUIDs and other circuits relying on the Josephson effect are very sensitive to flux levels and need to be frequently de-fluxed. Third, noise (both internal device noise and external noise from environmental interference) is of great concern due to the sensitive nature of the circuits and the comparatively low margins. Interconnect wires are also a source of noise, especially those that are not rigid like coaxial cables.

For the reasons mentioned above, several precautions need to be taken during test. All experiments are done in an electromagnetically shielded room to reduce RF interference. Multiple magnetic shields are used to minimize flux trapping. The helium dewar and the probe that holds the chip are magnetically shielded. The electromagnetic shields on the probe are de-gaussed before each test.

Low speed test equipment consists of a 40 pin probe designed for testing 5 mm x 5 mm chips. The signals are picked off the pads of the chip by springy fingers on the probe. The end of the probe that holds the chip is immersed inside a liquid helium dewar. Coaxial cables connect the springy fingers to BNC connectors on the other end of the probe. Each I/O line in the probe is filtered and shielded from RF interference. Each one first passes through a 17 kHz low pass filter and then an RF-tight enclosure with EMI filters. For more detail on the probe refer to [Fle93].

Gate currents and signal currents to be supplied to the chip are generated either by a pattern (sine wave) generator which provides high currents, an oscillator, or a word generator (HP 8175A) which can provide 24 different word patterns at 1 kHz. For I-V characteristics and threshold curves, a SQUID threshold curve tracer designed internally has been used. All signals are monitored using oscilloscopes. Connections between the current sources, the probe and the scopes are made with wires joined through SMA or BNC connectors. The maze of connections generates a number of ground loops, making the setup prone to flux trapping. To reduce ground loops, a bank of 20 opto-isolated current sources and filters was designed by David Feld. Another tactic adopted to reduce the amount of current noise in the loops is to increase the impedance around the loops. Superconducting loops on the chip increase the chances of flux trapping.  $1\ \mathrm{k}\Omega$  resistances are added in series with the current sources to reduce current noise.

#### 4.2 Test Results

#### 4.2.1 Basic Gates

The individual gates of the cell library were fabricated at Hypres and tested for functionality and margins at low speed. The I-V characteristic of a 3.3 x 3.3  $\mu$ m<sup>2</sup> junction with a critical current of 100  $\mu$ A is shown in Figure 4.1. The tested values for the critical current and  $\Delta V$  are given below.





Table 4.1 Tested values of junction parameters.

| Property         | Tested value |
|------------------|--------------|
| Critical current | 100 µA       |
| ΔV               | 0.2 mV       |

All gates worked correctly with input currents greater than 70  $\mu$ A. The oscilloscope photographs of the results from the OR gate, the 2OR-AND cell, the 3OR-MAJORITY gate and the output current amplifier are shown in Figures 4.2, 4.3, 4.4 and 4.5, respectively. The minimum and maximum allowable gate currents and the margins for each gate are given in Table 4.2.



Figure 4.2 Oscilloscope photograph of the OR gate results.



Figure 4.3 Oscilloscope photograph of the functioning of the 2OR-AND cell.


Figure 4.4 Oscilloscope photograph of the results from MAJORITY gate.



Figure 4.5 Oscilloscope photograph of the results from the output current amplifier.

| Cell Name                | Max. gate<br>current (µA) | Min. gate<br>current (µA) | Margins<br>(±%) |
|--------------------------|---------------------------|---------------------------|-----------------|
| OR gate                  | 440                       | 200                       | 37.5            |
| 2OR-AND cell             | 700                       | 440                       | 22.8            |
| MAJORITY gate            | 1360                      | 700                       | 31.7            |
| Output current amplifier | 720                       | 420                       | 26.3            |

Table 4.2 Maximum and minimum allowable gate currents and margins of the basic gates.
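The margin figures can be related to the measured current limits with a short calculation. The sketch assumes that a margin is quoted as the ± percentage deviation about the midpoint of the allowed gate current window. This convention is inferred from the tabulated values rather than stated in the text, and the function name is illustrative. It reproduces the OR gate, 2OR-AND cell and output current amplifier entries of Table 4.2 to the quoted precision.

```python
def gate_current_margin(i_max_ua, i_min_ua):
    """± margin in percent about the centre of the [i_min, i_max] window (currents in µA)."""
    return 100.0 * (i_max_ua - i_min_ua) / (i_max_ua + i_min_ua)

or_gate = gate_current_margin(440, 200)      # OR gate limits from Table 4.2
or_and = gate_current_margin(700, 440)       # 2OR-AND cell limits
amplifier = gate_current_margin(720, 420)    # output current amplifier limits
print(round(or_gate, 1), round(or_and, 1), round(amplifier, 1))
```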

A cascade consisting of an OR gate, another OR gate and an output current amplifier was tested with a three phase clock. This structure forms the basis of the delay units used in the final design of the filter and may be considered as a one-bit delay. Figure 4.6 shows the oscilloscope photograph of the results.

| Cell Name                          | Max. gate<br>current (µA) | Min. gate<br>current (µA) | Margins<br>(±%) |
|------------------------------------|--------------------------|--------------------------|--------------------------|
| OR gate (phase 1)                  | 400                      | 160                      | 42.8                     |
| OR gate (phase 2)                  | 400                      | 160                      | 42.8                     |
| Output current amplifier (phase 3) | 720                      | 420                      | 26.3                     |

Table 4.3 Limits on the gate currents allowed for a one-bit delay unit.



Figure 4.6 Test results for a 1-bit delay unit.

#### 4.2.2 1-Bit Half Adder and 1-Bit Full Adder

A one-bit full adder and a one-bit half adder were tested. The functionality of the half adder is shown in Figure 4.7. Figures 4.8 and 4.9 show the functionality test results for the full adder. Both the half adder and the full adder show margins of 20%.



Figure 4.7 Test results of a 1-bit half adder.



Figure 4.8 Oscilloscope photograph of the results of the full adder



Figure 4.9 Carry results of the full adder.

#### 4.2.3 4-Bit Full Adder

The 4-bit full adder was separately fabricated and tested. During testing, however, it was found that the carry and carry-bar from the LSB had been interchanged before being connected to the next bit. Apart from this, the adder worked with margins of 16% and showed robust performance. The response to a test sequence of 16 patterns was examined. Table 4.4 shows the inputs, the correct outputs (rows labeled "Actual") and the outputs expected with the single wrong connection (rows labeled "Expected"). Oscilloscope photographs of the inputs and the adder response obtained are shown in Figures 4.10 and 4.11, respectively. The bit patterns obtained at all the sum and carry outputs are exactly those expected with the wrong connection.

| Signal names   | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    |
|----------------|------|------|------|------|------|------|------|------|
|                |      |      | Set 1 |      |      |      |      |      |
| Input - A      | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 |
| Input - B      | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |
| Actual Sum     | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 |
| Actual Carry   | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |
| Expected Sum   | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000 | 1001 |
| Expected Carry | 0001 | 0001 | 0011 | 0011 | 0001 | 0001 | 0111 | 0111 |

Table 4.4 Test sequence for the 4-bit adder.

| Signal names   | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    |
|----------------|------|------|------|------|------|------|------|------|
|                |      |      | Set 2 |      |      |      |      |      |
| Input - A      | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 |
| Input - B      | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |
| Actual Sum     | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 |
| Actual Carry   | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |
| Expected Sum   | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 | 0000 | 0001 |
| Expected Carry | 0001 | 0001 | 0011 | 0011 | 0001 | 0001 | 1111 | 1111 |

 Table 4.4 Test sequence for the 4-bit adder.
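The "Expected" rows of Table 4.4 can be reproduced with a small behavioural model. The sketch below assumes a ripple-carry structure in which the carry leaving the LSB stage is inverted both at the observed carry output and at the input of the next bit; the function names are illustrative and the model is not the circuit netlist used in the report.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def adder_with_swapped_lsb_carry(a, b, width=4):
    """Ripple-carry adder whose LSB carry is swapped with its complement (assumed model)."""
    sum_word = 0
    carry_word = 0
    carry = 0
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        if i == 0:
            carry ^= 1  # carry and carry-bar interchanged at the LSB
        sum_word |= s << i
        carry_word |= carry << i
    return sum_word, carry_word


# Same stimulus as Table 4.4: A counts from 0000 to 1111 while B stays at 0000.
for a in range(16):
    s, c = adder_with_swapped_lsb_carry(a, 0)
    print(f"A={a:04b}  sum={s:04b}  carry={c:04b}")
```

With B held at zero, the error shows up as the upper three sum bits being incremented by one, which is the pattern seen in the "Expected Sum" rows.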



Figure 4.10 4-bit full adder - inputs



Figure 4.11 4-bit full adder - outputs. (a) sum bits (b) carry bits

#### 4.2.4 FIR Filter Results

Figure 4.12 shows the schematic of the FIR filter with all the variables named. These variable names are used in the discussion that follows. The filter was tested using a sequence of sixteen numbers. Using this test input sequence, signal values at all the internal and external nodes (all variables) were calculated and are shown in Table 4.5. However, because of the layout error in the 4-bit adder reported in Section 4.2.3, the signal values shown in Table 4.5 cannot be obtained. The signal values were therefore recomputed assuming that the carry bit out of the LSB of every adder was interchanged with its inverse (Table 4.6).

Table 4.5 Signal values at internal/external nodes of the filter with a test input sequence.

| Signal name | 1     | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9     | 10    | 11    | 12    | 13    | 14    | 15    | 16    |
|-------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| In          | 0000  | 0001  | 0010  | 0011  | 0100  | 0101  | 0110  | 0111  | 1000  | 1001  | 1010  | 1011  | 1100  | 1101  | 1110  | 1111  |
| C           | 0000  | 0000  | 0001  | 0001  | 0010  | 0010  | 0011  | 0011  | 1100  | 1100  | 1101  | 1101  | 1110  | 1110  | 1111  | 1111  |
| D           | 0000  | 0001  | 0011  | 0100  | 0110  | 0111  | 1001  | 1010  | 10100 | 10101 | 10111 | 11000 | 11010 | 11011 | 11101 | 11110 |
| E           | 0000  | 0000  | 0000  | 0001  | 0001  | 0001  | 0010  | 0010  | 1101  | 1101  | 1101  | 1110  | 1101  | 1110  | 1111  | 1111  |
| Н           | 0000  | 0001  | 0010  | 0011  | 0100  | 0101  | 0110  | 0111  | 1000  | 1001  | 1010  | 1011  | 1100  | 1101  | 1110  | 1111  |
| K           | 0000  | 0000  | 0000  | 0001  | 0001  | 0001  | 0010  | 0010  | 1101  | 1101  | 1101  | 1110  | 1101  | 1110  | 1111  | 1111  |
| G           | 1111  | 0000  | 0000  | 0000  | 0001  | 0001  | 0001  | 0010  | 0010  | 1101  | 1101  | 1101  | 1110  | 1101  | 1110  | 1111  |
| H           | 0000  | 0001  | 0010  | 0011  | 0100  | 0101  | 0110  | 0111  | 1000  | 1001  | 1010  | 1011  | 1100  | 1101  | 1110  | 1111  |
| I (G+H)     | 1111  | 0001  | 0010  | 0011  | 0101  | 0110  | 0111  | 1001  | 1010  | 10110 | 10111 | 11000 | 11010 | 11010 | 11000 | 11110 |
| J           | 11110 | 1111  | 0001  | 0010  | 0011  | 0101  | 0110  | 0111  | 1001  | 1010  | 0110  | 0111  | 1000  | 1010  | 1010  | 1000  |
| K           | 0000  | 0000  | 0000  | 0001  | 0001  | 0001  | 0010  | 0010  | 1101  | 1101  | 1101  | 1110  | 1101  | 1110  | 1111  | 1111  |
| J+K         | 11110 | 01111 | 00001 | 00011 | 00100 | 00110 | 01000 | 01001 | 10110 | 10111 | 10011 | 10101 | 10101 | 11000 | 11001 | 10111 |
| Out         | 1110  | 1111  | 0001  | 0011  | 0100  | 0110  | 1000  | 1001  | 0110  | 0111  | 0011  | 0101  | 0101  | 1000  | 1001  | 0111  |
|             |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |

Table 4.6 Signal values after accounting for carry bit exchange in adder.

| Signal Name | 1     | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9     | 10    | 11    | 12    | 13    | 14    | 15    | 16    |
|-------------|-------|------|------|------|------|------|------|------|-------|-------|-------|-------|-------|-------|-------|-------|
| In          | 0000  | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000  | 1001  | 1010  | 1011  | 1100  | 1101  | 1110  | 1111  |
| C           | 0000  | 0000 | 0001 | 0001 | 0010 | 0010 | 0011 | 0011 | 1100  | 1100  | 1101  | 1101  | 1110  | 1110  | 1111  | 1111  |
| D           | 0010  | 0011 | 0101 | 0010 | 1001 | 1001 | 1011 | 1000 | 10110 | 10111 | 11001 | 10110 | 11100 | 11101 | 11111 | 11100 |
| E           | 0000  | 0000 | 0001 | 0000 | 0010 | 0010 | 0010 | 0010 | 1101  | 1101  | 1110  | 1101  | 1111  | 1111  | 1111  | 1111  |
| Н           | 0000  | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000  | 1001  | 1010  | 1011  | 1100  | 1101  | 1110  | 1111  |
| K           | 0000  | 0000 | 0001 | 0000 | 0010 | 0010 | 0010 | 0010 | 1101  | 1101  | 1110  | 1101  | 1111  | 1111  | 1111  | 1111  |
| G           | 1111  | 0000 | 0000 | 0001 | 0000 | 0010 | 0010 | 0010 | 0010  | 1101  | 1101  | 1110  | 1101  | 1111  | 1111  | 1111  |
| Н           | 0000  | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000  | 1001  | 1010  | 1011  | 1100  | 1101  | 1110  | 1111  |
| I (G+H)     | 10001 | 0011 | 0100 | 0010 | 0110 | 1001 | 1010 | 1011 | 1100  | 10000 | 11001 | 11011 | 11011 | 11010 | 11111 | 11100 |
| J           | 1100  | 0001 | 0011 | 0100 | 0010 | 0110 | 1001 | 1010 | 1011  | 1100  | 0000  | 1001  | 1011  | 1011  | 1010  | 1111  |
| K           | 0000  | 0000 | 0001 | 0000 | 0010 | 0010 | 0010 | 0010 | 1101  | 1101  | 1110  | 1101  | 1111  | 1111  | 1111  | 1111  |
| J+K         | 1110  | 0011 | 0010 | 0110 | 0110 | 1010 | 1101 | 1110 | 10110 | 11011 | 0000  | 10000 | 10000 | 10000 | 11011 | 11100 |
| Out         | 1110  | 0011 | 0010 | 0110 | 0110 | 1010 | 1101 | 1110 | 0110  | 1011  | 0000  | 0000  | 0000  | 0000  | 1011  | 1100  |




Figure 4.12 Signals at different parts of the chip.

Since only 40 pins are available, only a limited number of internal nodes could be brought out to pins for testing. The bit patterns expected at each of the internal nodes that were tapped onto pins and at the output of the filter, along with the pin numbers and corresponding signal names, are shown in Table 4.7. These have been derived directly from Table 4.6. The filter was clocked using a three-phase overlapping clock (Figure 4.13). Figures 4.14 and 4.15 show oscilloscope photographs of the bit patterns obtained at these internal and external nodes of the filter. Correct signal values were observed at pins 13, 14, 16, 17, 18, 22, 23, 25, 26, 27, 28, 29, 36 and 37. However, the internal nodes at pins 32 and 35 did not function correctly, and proper signals could not be obtained at the final outputs.
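The derivation of a per-pin bit pattern from a row of Table 4.6 amounts to picking one bit position out of each word in the test sequence. The following is a minimal sketch using the E row of Table 4.6; the helper name `bit_column` is illustrative and not taken from the report.

```python
def bit_column(words, bit):
    """Return the chosen bit (0 = LSB) of each binary-string word in the sequence."""
    return [(int(word, 2) >> bit) & 1 for word in words]


# Row E of Table 4.6, test steps 1 to 16.
E_ROW = ["0000", "0000", "0001", "0000", "0010", "0010", "0010", "0010",
         "1101", "1101", "1110", "1101", "1111", "1111", "1111", "1111"]

e3 = bit_column(E_ROW, 3)        # compare with pin 37 in Table 4.7
e0 = bit_column(E_ROW, 0)        # compare with pin 23 in Table 4.7
e0_bar = [1 - b for b in e0]     # complemented line, compare with pin 22

print("E3    :", e3)
print("E0    :", e0)
print("E0-bar:", e0_bar)
```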

When all the parts are active, the high currents feeding the circuit cause a large amount of ground bounce. Some signals that showed correct bit patterns when only parts of the circuit were clocked showed wrong or even unsteady signals when other parts of the circuit were activated. This is clearly an example of a ground bounce problem. Another problem encountered was crosstalk. Crosstalk can be caused by ground bounce or simply by routing two signals close to each other. If clock lines carrying high currents run close to signal lines, the possibility of crosstalk increases.


Several problems were encountered during the testing process. Unlike the smaller circuits previously tested, very low margins were observed. Often, parts of the system trapped flux and needed to be de-gaussed. Correct signal values were seen only with the shielded room fully closed.

| pin # | Signal           | 1 | 2 | 3 | 4  | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|-------|------------------|---|---|---|----|---|---|---|---|---|----|----|----|----|----|----|----|
| 13    | $\overline{A}_3$ | 1 | 1 | 1 | 1  | 1 | 1 | 1 | 1 | 0 | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| 14    | A <sub>3</sub>   | 0 | 0 | 0 | 0  | 0 | 0 | 0 | 0 | 1 | 1  | 1  | 1  | 1  | 1  | 1  | 1  |
| 16    | B <sub>3</sub>   | 0 | 0 | 0 | 0  | 0 | 0 | 0 | 0 | 1 | 1  | 1  | 1  | 1  | 1  | 1  | 1  |
| 17    | $\overline{F}_3$ | 1 | 1 | 1 | 1  | 1 | 1 | 1 | 1 | 0 | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| 18    | F <sub>3</sub>   | 0 | 0 | 0 | 0  | 0 | 0 | 0 | 0 | 1 | 1  | 1  | 1  | 1  | 1  | 1  | 1  |
| 22    | $\overline{E}_0$ | 1 | 1 | 0 | 1  | 1 | 1 | 1 | 1 | 0 | 0  | 1  | 0  | 0  | 0  | 0  | 0  |
| 23    | E <sub>0</sub>   | 0 | 0 | 1 | 0  | 0 | 0 | 0 | 0 | 1 | 1  | 0  | 1  | 1  | 1  | 1  | 1  |
| 25    | $\overline{G}_3$ | 0 | 1 | 1 | 1  | 1 | 1 | 1 | 1 | 1 | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| 26    | G <sub>3</sub>   | 1 | 0 | 0 | 0  | 0 | 0 | 0 | 0 | 0 | 1  | 1  | 1  | 1  | 1  | 1  | 1  |
| 27    | H <sub>3</sub>   | 0 | 0 | 0 | 0  | 0 | 0 | 0 | 0 | 1 | 1  | 1  | 1  | 1  | 1  | 1  | 1  |
| 28    | $\overline{I}_4$ | 0 | 1 | 1 | 1  | 1 | 1 | 1 | 1 | 1 | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| 29    | I <sub>4</sub>   | 1 | 0 | 0 | 0  | 0 | 0 | 0 | 0 | 0 | 1  | 1  | 1  | 1  | 1  | 1  | 1  |
| 32*   | I <sub>3</sub>   | 0 | 0 | 0 | 0  | 0 | 1 | 1 | 1 | 1 | 0  | 1  | 1  | 1  | 1  | 1  | 1  |
| 35*   | J <sub>3</sub>   | 1 | 0 | 0 | 0  | 0 | 0 | 1 | 1 | 1 | 1  | 0  | 1  | 1  | 1  | 1  | 1  |
| 36    | $\overline{E}_3$ | 1 | 1 | 1 | 1  | 1 | 1 | 1 | 1 | 0 | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| 37    | E <sub>3</sub>   | 0 | 0 | 0 | 0  | 0 | 0 | 0 | 0 | 1 | 1  | 1  | 1  | 1  | 1  | 1  | 1  |
| 39*   | Out <sub>3</sub> | 1 | 0 | 0 | 0  | 0 | 1 | 1 | 1 | 0 | 1  | 0  | 0  | 1  | 1  | 1  | 1  |
| 40*   | Out <sub>2</sub> | 1 | 0 | 0 | 1  | 1 | 0 | 1 | 1 | 1 | 0  | 0  | 1  | 0  | 0  | 0  | 1  |
| 1*    | Out <sub>1</sub> | 1 | 1 | 1 | 1  | 1 | 1 | 0 | 1 | 1 | 1  | 0  | 0  | 0  | 0  | 1  | 0  |
| 2*    | Out <sub>0</sub> | 0 | 1 | 0 | 0  | 0 | 1 | 0 | 0 | 1 | 0  | 0  | 0  | 0  | 0  | 1  | 0  |

 Table 4.7 Expected bit patterns at output pins (\*: signals that did not work).

One of the biggest problems was the need for low-noise, high-current sources. Each MVTL gate needs bias and maximum currents of 330 and 400 $\mu$A, respectively. Compared to other superconducting logic families, this current requirement is relatively high [Kis93]. Since the circuit implemented is a large-scale circuit with a few hundred gates, the total current needed per clock phase is very high. Some of the clock phases on the filter chip require a few tens of milliamperes of current. The problems caused by this were two-fold. First, good, low-noise, high-current sources were not available. Noise from the current sources increases the probability of flux trapping considerably. The 4-bit full adder was tested with a cleaner, isolated current source and showed reasonable margins. Second, high currents cause increased ground bounce.
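The order of magnitude quoted above follows from simple arithmetic. The sketch below multiplies the 330 µA bias current per gate by a gate count per clock phase; the count of 100 gates is a hypothetical round number chosen only for illustration, since the report does not give the number of gates on each phase.

```python
BIAS_CURRENT_UA = 330  # bias current per MVTL gate, from the text


def phase_current_ma(gates_on_phase, per_gate_ua=BIAS_CURRENT_UA):
    """Total supply current for one clock phase, in milliamperes."""
    return gates_on_phase * per_gate_ua / 1000


# Hypothetical example: 100 gates sharing one clock phase.
print(phase_current_ma(100), "mA")
```

A phase driving on the order of a hundred gates therefore draws a few tens of milliamperes, which is consistent with the figure stated in the text.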



Figure 4.13 Three phase clocks.



Figure 4.14 Bit patterns measured at internal filter nodes (pin numbers shown). \* Indicates signals that were not working correctly.



Figure 4.15 Bit patterns obtained on the output nodes of the filter.

#### **4.3 Conclusions and Future Work**

An MVTL cell library has been designed, fabricated and fully tested. Dependable operation with high margins has been demonstrated. Macro cells such as a one-bit 3-phase register, 1-bit half and full adders and a 4-bit full adder have been shown to work with reasonable margins. The 4-bit adder had a layout error and was shown to give the results expected with this error. The important contribution of this work is the design of a stable and reliable cell library that has shown robust performance in reasonably sized macro blocks.

A digital FIR filter has been designed and fabricated. It has been shown to work partially. It is important to note that a filter design involves large scale integration. Despite this, the design and layout had to be done completely manually. The chips had to be fabricated three times. The first two runs had open circuits on the power supply lines and needed to be re-fabricated. Tools for extraction and for layout-versus-schematic checks can help avoid such errors. Except for simulation tools, there were no supporting CAD tools. Good supporting CAD tools that address the needs of the superconducting community are not yet available.

Crosstalk, flux trapping and the unavailability of low-noise current sources were among the main problems. Crosstalk may be reduced with the help of CAD tools that model this problem during layout [Kha94]. Errors in the layout process can also be reduced with the help of CAD tools. Flux trapping can be reduced with low-noise current supplies and better magnetic shielding. Future work would include more careful design accounting for crosstalk between signal lines. Better current sources may lead to better results.


# 5

## References

- [Fle93] J. Fleischman. "A Flux-Shuttle Shift Register and Computer Architecture for Superconductive Digital Systems", Ph.D. Thesis, University of California, Berkeley, May 1993.
- [Fuj85] N. Fujimaki, H. Hoko, H. Shibayama, S.Hasuo and T. Yamaoka, "Variable Threshold Logic with Superconducting Quantum Interferometers", *IEEE Trans. Magn.*, Vol. Mag-19, pp. 1234-1237, May 1983.
- [Fuj85] N. Fujimaki, S. Kotani, S.Hasuo and T. Imamura, "9 ps Gate Delay Josephson OR Gate with Modified Variable Threshold Logic", Japan J. of Appl. Phys., Vol. 24, pp. L1-L2, Jan. 1985.
- [Fuj87] N. Fujimaki, S. Kotani, T. Imamura and S. Hasuo, "Josephson 8-bit Shift Register", *IEEE J. Solid State Circuits*, Vol. SC-22, pp. 886-891, Oct. 1987.
- [Fuj89] N. Fujimaki, S. Kotani, T. Imamura and S.Hasuo, "Josephson Modified Variable Threshold Logic Gates for use In Ultra-High-Speed LSI", *IEEE Trans. Electron Devices*, Vol.36, No. 2, pp. 433-446, Feb. 1989.
- [Has88] S. Hasuo, "High Speed Josephson Integrated Circuit Technology", *IEEE Trans. Magn.*, Vol. 25, pp. 740-749, Mar. 1989.
- [Has89] S. Hasuo and T. Imamura, "Digital Logic Circuits", Proc. IEEE, Vol. 77, pp. 1177-1193, Aug. 1989.

- [Has91] S. Hasuo, S. Kotani, A. Inoue and N. Fujimaki, "High-Speed Josephson Processor Technology", *IEEE Trans. on Magn.*, Vol.27, pp. 2602-2609, Mar. 1991.
- [Hos91] M. Hosoya, W. Hioe, J. Casas, R. Kamikawai, Y. Harada, Y. Wada, H. Nakane, R. Suda and E. Goto, "Quantum Flux Parametron: A Single Quantum Flux Device for Josephson Supercomputer", *IEEE Trans. Appl. Superconductivity*, Vol. 1, pp. 77-94, Jun. 1991.
- [Hyp92] Hypres Design Rules, Document No. 22-80601
- [Jon82] H.C. Jones, T.R. Gheewala, "AC Powered Josephson Latch Circuits", IEEE J. Solid State Circuits, Vol. SC-17, pp. 1201-1210, Dec. 1982.
- [Kha94] M. Khalaf, "A Computer Aided Design Framework for Superconducting Circuits", Master's Report, EECS Dept. Univ. of Cal., Berkeley, May 1994.
- [Kis93] S. Kishore, A. Marathe, R. Mehra, S. R. Whiteley and T. Van Duzer, "Comparison of Speed and Margins for RCJL, 4JL and MVTL logic families", Fourth International Superconductive Electronics Conference (ISEC '93), Boulder CO, pp. 80-81, Aug. 1993.
- [Kot89] S. Kotani, T. Imamura and S.Hasuo, "A Sub-ns Clock Josephson 4-Bit Processor", Digest of Technical Papers, Symposium VLSI Circuits, New York, pp. 23-24, 1989.
- [Kot87] S. Kotani, N. Fujimaki, S. Morohashi, S.Ohara and S.Hasuo, "Feasibility of an Ultra-High-Speed Josephson Multiplier", *IEEE J. Solid State Circuits*, Vol. SC-22, pp. 98-103, Feb. 1987.
- [Kot88] S. Kotani, N. Fujimaki, T. Imamura and S.Hasuo, "A Subnanosecond Josephson 16-bit ALU", IEEE J. Solid State Circuits, Vol. 23, pp. 591-596, Apr. 1988.
- [Kot88] S. Kotani, T. Imamura and S. Hasuo, "A 1.5 ps Josephson OR gate", Technical Digest, International Electron Devices Meeting (IEDM), San Francisco, pp. 884-885, 1988.

- [Lik91] K. K. Likharev and V. K. Semenov, "RSFQ Logic/Memory Family: A New Josephson-Junction Technology for Sub-Terahertz-Clock-Frequency Digital Systems", IEEE Trans. of Appl. Superconductivity, Vol. 1, pp. 3-28, Mar. 1991.
- [Nak82] H. Nakagawa, E. Sogowa, S. Kosaka, S. Takada and H. Hayakawa, "Operating Characteristics of Josephson Four-Junction Logic", Japan. J. Appl. Phys., Vol. 21, pp. L198-L200, Apr. 1982.
- [Son82] J. Sone, T. Yoshida and H. Abe, "Resistor Coupled Josephson Logic", Appl. Phys. Lett., Vol. 40(8), pp. 741-744, Apr. 1982.
- [Van81] T. Van Duzer and C. W. Turner, Principles of Superconductive Devices and Circuits, Elsevier, New York, 1981.