RF to Millimeter-wave Linear Power Amplifiers in Nanoscale CMOS SOI Technology

Jing-Hwa Chen
Purdue University


Recommended Citation
https://docs.lib.purdue.edu/open_access_dissertations/178

This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact epubs@purdue.edu for additional information.
This is to certify that the thesis/dissertation prepared

By Jing-Hwa Chen

Entitled
RF to Millimeter-wave Linear Power Amplifiers in Nanoscale CMOS SOI Technology

For the degree of Doctor of Philosophy

Is approved by the final examining committee:

SAEED MOHAMMADI
Chair

BYUNGHOO JUNG

DIMITRIOS PEROULIS

KAUSHIK ROY

To the best of my knowledge and as understood by the student in the Research Integrity and Copyright Disclaimer (Graduate School Form 20), this thesis/dissertation adheres to the provisions of Purdue University’s “Policy on Integrity in Research” and the use of copyrighted material.

Approved by Major Professor(s): SAEED MOHAMMADI

Approved by: M. R. Melloch 11-21-2013
Head of the Graduate Program Date
RF TO MILLIMETER-WAVE HIGH POWER LINEAR AMPLIFIER IN 
NANOSCALE CMOS SOI TECHNOLOGY

A Dissertation

Submitted to the Faculty

of

Purdue University

by

Jing-Hwa Chen

In Partial Fulfillment of the

Requirements for the Degree

of

Doctor of Philosophy

December 2013

Purdue University

West Lafayette, Indiana
ACKNOWLEDGEMENTS

The many years at Purdue University have been an adventure full of bitter and sweet moments. To the people who have assisted and supported me during my graduate studies, I would like to express my deepest gratitude and appreciation.

First and foremost, I would like to thank my advisor, Professor Saeed Mohammadi, for offering continuous support throughout the course of this thesis. He not only encouraged me to grow as an engineer but also inspired me to think outside the box as an independent researcher. I would like to thank Professor Byunghoo Jung for getting my graduate study started on the right foot and offering valuable advice. I would like to thank my committee members, Professor Dimitrios Peroulis and Professor Kaushik Roy, for investing their time and energy in this thesis and for their valuable insights.

I would like to thank all my colleagues at Purdue University, Sultan, Alice, and Hossein, who were involved in this project from the beginning. This research would not have been successful without their great contributions. I would like to thank Dan Nobbe, Bryan Hash, Chris Olson, and Joe Schultz from Peregrine Semiconductor for offering all the tapeout opportunities as well as their technical assistance.

I would like to thank all of my friends, Mei-Hung, Julia, Chun-Yu, Yu-Yun, Annie, Tzu-Ying, Yu-Ting, Tsung-Chieh, Yung-Yao and others far too numerous to mention here. I was blessed with so many amazing people supporting me and seeing me through the difficult parts of this journey.

Lastly and most importantly, I would like to express my appreciation and deepest love to my family: my parents, Yan-Ping Chen and Muoi Tang, my brother Pei-Hua, and also Ya-Ping, for their patience, understanding, and unconditional support. Their love for me is what keeps my head above water and gives me the courage to pursue my dreams at every stage of my life.
# TABLE OF CONTENTS

<table>
<thead>
<tr>
<th>Section</th>
</tr>
</thead>
<tbody>
<tr>
<td>LIST OF TABLES</td>
</tr>
<tr>
<td>LIST OF FIGURES</td>
</tr>
<tr>
<td>ABSTRACT</td>
</tr>
<tr>
<td>1. INTRODUCTION</td>
</tr>
<tr>
<td>1.1 An Overview of CMOS Power Amplifiers</td>
</tr>
<tr>
<td>1.2 Motivation</td>
</tr>
<tr>
<td>1.3 Thesis Organization</td>
</tr>
<tr>
<td>2. POWER AMPLIFIER FUNDAMENTALS</td>
</tr>
<tr>
<td>2.1 Load Impedance Termination</td>
</tr>
<tr>
<td>2.2 Efficiency Analysis</td>
</tr>
<tr>
<td>2.2.1 Matching Loss and Bandwidth</td>
</tr>
<tr>
<td>2.2.2 Transistor Knee Voltage</td>
</tr>
<tr>
<td>2.3 Class A and Class AB PA</td>
</tr>
<tr>
<td>2.4 Summary</td>
</tr>
<tr>
<td>3. RF AND MILLIMETER-WAVE CMOS POWER AMPLIFIER</td>
</tr>
<tr>
<td>3.1 The Proposed Power Amplifier in CMOS SOI Technology</td>
</tr>
<tr>
<td>3.1.1 Circuit Topology</td>
</tr>
<tr>
<td>3.1.2 Effect of Parasitic Capacitance</td>
</tr>
<tr>
<td>3.2 Wideband CMOS SOI Power Amplifier</td>
</tr>
<tr>
<td>3.2.1 Introduction</td>
</tr>
<tr>
<td>3.2.2 X-Band Power Amplifier Design</td>
</tr>
<tr>
<td>3.2.3 Measurement Results</td>
</tr>
<tr>
<td>3.2.4 Conclusion</td>
</tr>
<tr>
<td>3.3 MM-Wave PA in Nanoscale CMOS Technology</td>
</tr>
<tr>
<td>3.3.1 Introduction</td>
</tr>
<tr>
<td>3.3.2 CMOS FET RF Characteristics</td>
</tr>
<tr>
<td>3.3.3 Ka-Band PA Design</td>
</tr>
<tr>
<td>3.3.4 Measurement Results</td>
</tr>
<tr>
<td>3.3.5</td>
</tr>
<tr>
<td>3.4</td>
</tr>
<tr>
<td>4.1</td>
</tr>
<tr>
<td>4.2</td>
</tr>
<tr>
<td>4.2.1</td>
</tr>
<tr>
<td>4.2.2</td>
</tr>
<tr>
<td>4.3</td>
</tr>
<tr>
<td>4.4</td>
</tr>
<tr>
<td>5.1</td>
</tr>
<tr>
<td>5.2</td>
</tr>
<tr>
<td>5.3</td>
</tr>
<tr>
<td>5.3.1</td>
</tr>
<tr>
<td>5.3.2</td>
</tr>
<tr>
<td>5.4</td>
</tr>
<tr>
<td>5.4.1</td>
</tr>
<tr>
<td>5.4.2</td>
</tr>
<tr>
<td>5.5</td>
</tr>
<tr>
<td>6.1</td>
</tr>
<tr>
<td>6.2</td>
</tr>
<tr>
<td>6.2.1</td>
</tr>
<tr>
<td>6.2.2</td>
</tr>
<tr>
<td>6.3</td>
</tr>
<tr>
<td>6.4</td>
</tr>
<tr>
<td>6.5</td>
</tr>
<tr>
<td>7.</td>
</tr>
<tr>
<td>8.</td>
</tr>
<tr>
<td>9.</td>
</tr>
<tr>
<td>10.</td>
</tr>
</tbody>
</table>
LIST OF TABLES

<table>
<thead>
<tr>
<th>Table</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>3.1 Design parameters comparison of a unit power amplifier and PAs designed with $n$-parallel and $n$-stacked transistors</td>
<td>19</td>
</tr>
<tr>
<td>3.2 Performance comparison of X-band CMOS PAs</td>
<td>30</td>
</tr>
<tr>
<td>3.3 Performance comparison of mm-wave CMOS PAs</td>
<td>36</td>
</tr>
<tr>
<td>4.1 Performance comparison of broadband PAs</td>
<td>51</td>
</tr>
<tr>
<td>5.1 Performance comparison of Watt-level CMOS PAs</td>
<td>67</td>
</tr>
<tr>
<td>6.1 Performance comparison of high power linear amplifiers</td>
<td>81</td>
</tr>
</tbody>
</table>
# LIST OF FIGURES

<table>
<thead>
<tr>
<th>Figure</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>1.1 The simplified circuit schematic of PAs designed with (a) $n$-parallel CS cells (b) $n$-stacked CS cells with input transformers connected in series (c) dynamically-biased $n$-stacked Cascode cells with a mixture of series and parallel combinations of input transformers and (d) $n$-stacked of multiple CG transistors.</td>
<td>2</td>
</tr>
<tr>
<td>1.2 Power combiners implemented using (a) DAT structure [3] and (b) transformers [4].</td>
<td>3</td>
</tr>
<tr>
<td>2.1 (a) The simplified equivalent circuit schematic of a linear power amplifier and (b) the conceptual loadlines for conjugate match and loadline match.</td>
<td>8</td>
</tr>
<tr>
<td>2.2 (a) Gate oxide breakdown and (b) drain-source reach-through in a MOS transistor [4].</td>
<td>10</td>
</tr>
<tr>
<td>2.3 The simplified circuit schematic of a CS PA with a $1:n$ output impedance transformation network.</td>
<td>11</td>
</tr>
<tr>
<td>2.4 The loss of a single stage LC matching network versus impedance transformation ratio for different inductor quality factors of 5, 10, and 20.</td>
<td>12</td>
</tr>
<tr>
<td>2.5 The $I_D-V_{DS}$ curves of a CMOS SOI transistor when biased under two different current densities.</td>
<td>13</td>
</tr>
<tr>
<td>2.6 The voltage and current waveforms of (a) Class A and (b) Class AB PA.</td>
<td>14</td>
</tr>
<tr>
<td>2.7 Gain, output power, PAE, and DE when the PA is biased under current densities of 3.5 mA/mm (dashed lines) and 15 mA/mm (solid lines).</td>
<td>15</td>
</tr>
<tr>
<td>2.8 ACLR, PAE, and gain versus output power with WCDMA signal when biased under current densities of 3.5 mA/mm (dashed lines) and 15 mA/mm (solid lines).</td>
<td>16</td>
</tr>
<tr>
<td>3.1 The simplified circuit schematic of the proposed transformer-coupled stacked Cascode power amplifier.</td>
<td>17</td>
</tr>
<tr>
<td>3.2 Simplified cross-section of the proposed stacked PA implemented in CMOS SOI technology</td>
<td></td>
</tr>
<tr>
<td>3.3 Simplified equivalent circuit schematic of a stacked power amplifier with internodal parasitic capacitance between stacked cells</td>
<td></td>
</tr>
<tr>
<td>3.4 The circuit schematic of the X-band stacked power amplifier</td>
<td></td>
</tr>
<tr>
<td>3.5 Simulated voltage swings across each stacked transistor at 12 GHz when biased under a supply voltage of 4.8 V</td>
<td></td>
</tr>
<tr>
<td>3.6 The (a) photo and (b) block diagrams of the measurement setups for small-signal S-parameters, large-signal power measurements and two-tone measurements</td>
<td></td>
</tr>
<tr>
<td>3.7 Measured and simulated small-signal S-parameters and measured stability factor from 5 to 20 GHz when biased under $V_{DD}$ = 4.8 V</td>
<td></td>
</tr>
<tr>
<td>3.8 Measured output power, PAE, and gain at 12 GHz when biased under $V_{DD}$ = 3.6 V and 4.8 V</td>
<td></td>
</tr>
<tr>
<td>3.9 Measured $P_{SAT}$, $P_{1dB}$, peak PAE, and corresponding DE at 12 GHz when biased under different supply voltages from 3.6 V to 4.8 V</td>
<td></td>
</tr>
<tr>
<td>3.10 Measured $P_{SAT}$, peak PAE, and OIP3 from 9 to 15 GHz when biased under $V_{DD}$ = 3.6 V and 4.8 V</td>
<td></td>
</tr>
<tr>
<td>3.11 The chip micrograph of the X-band PA</td>
<td></td>
</tr>
<tr>
<td>3.12 The circuit schematics of Ka-band PAs designed with (a) a Cascode cell and (b) two stacked Cascode cells</td>
<td></td>
</tr>
<tr>
<td>3.13 Measured small-signal S-parameters of the Cascode PA and stacked Cascode PA from 30 to 40 GHz when biased under 1.8 V and 3.6 V, respectively</td>
<td></td>
</tr>
<tr>
<td>3.14 Measured output power and PAE of the Cascode PA and the stacked Cascode PA when biased under 1.8 V and 3.6 V, respectively</td>
<td></td>
</tr>
<tr>
<td>3.15 Measured $P_{SAT}$, $P_{1dB}$, and peak PAE of the stacked PA at 37 GHz biased under various supply voltages from 3 to 4.4 V</td>
<td></td>
</tr>
<tr>
<td>4.1 The circuit schematic of the proposed broadband stacked power amplifier</td>
<td></td>
</tr>
</tbody>
</table>
4.2 The simulated output impedance $S_{22}$ from 5 to 30 GHz of NMOS transistors with various widths connected in CS and Cascode configurations. The output impedances of transformer-coupled stacked cells with 6 transistors (320 µm wide) are also shown (6 CS transistors, 3 Cascode cells, and 2 Cascode cells with 2 CS transistors). All transistors are biased under $I_d=0.2$ mA/µm and $V_{ds}=0.75$ V.

4.3 Simulated $V_{gs}$-$V_{ds}$ waveforms of the top transistor in the stack at 18 GHz when biased under a supply voltage of 4.5 V ($V_{DS,DC}$ per transistor is 0.75 V) for output powers of 12, 16, and 20 dBm.

4.4 Simulated load lines of each transistor in the stack at -1 dB compression point when biased under a supply voltage of 4.5 V at (a) 6 GHz ($P_{1dB}=18.6$ dBm), (b) 18 GHz ($P_{1dB}=19.2$ dBm), and (c) 26 GHz ($P_{1dB}=18.1$ dBm).

4.5 The layout of the input transformer and simulated insertion loss and return loss of the transformer-coupled input matching network from 5 to 30 GHz.

4.6 The simulated time-domain voltage waveforms of the broadband PA at -1 dB compression point at (a) 18 GHz ($P_{1dB}=19.2$ dBm) and (b) 23 GHz ($P_{1dB}=18.5$ dBm).

4.7 Measured (solid lines) and simulated (dashed lines) small-signal S-parameters under a supply voltage of 4.5 V.

4.8 Measured and simulated output power, gain, PAE and DE versus input power under a supply voltage of 4.5 V at 18 GHz.

4.9 Measured $P_{SAT}$, $P_{1dB}$, and peak PAE under various supply voltages from 3.6 V to 7.2 V at 18 GHz.

4.10 Measured $P_{SAT}$, $P_{1dB}$, and peak PAE from 6 to 26 GHz when biased under supply voltages of 4.5 V, 5.4 V, and 7.2 V.

4.11 Micrograph of the broadband PA.

5.1 Simulated drain-source voltage swings of each transistor in a PA designed with 4 stacked CS cells (a) with and (b) without the effect of parasitic capacitances.

5.2 The simplified circuit schematic of the RF CMOS SOI PA designed with 8 stacked dynamically-biased Cascode cells (16 transistors).

5.3 Simulated $V_{dc}$-$V_{gs}$ voltage swings of the top transistor in the stack of 16 transistors with input powers of 10, 15, and 20 dBm.

5.4 Measured input (red curves) and output (blue curves) reflection coefficients from 1 to 5 GHz of a 2 mm transistor, one Cascode cell implemented with 2 mm transistors, and the PA designed with 8 stacked Cascode cells.

5.5 (a) Post-processing steps for AlN substrate transfer. Step 1: Bonding the chip to a temporary substrate using photo-resist. Step 2: Etching the backside silicon substrate using XeF₂. Step 3: Releasing the chip from the temporary substrate using acetone. Step 4: Bonding the chip to an AlN substrate using a thin adhesive PMMA layer on a heat plate at 80 °C. (b) The photo and the chip micrograph of the CMOS SOI chip with substrate transferred to AlN.

5.6 Measured transistor (a) $I_D-V_{DS}$ and (b) $I_D-V_{GS}$ characteristics before (red curves) and after (blue curves) substrate transfer.

5.7 Measured transistor RF performance before (red curves) and after (blue curves) substrate transfer.

5.8 Measured small-signal S-parameters of the stacked PA under 9 V and 12 V supply voltages.

5.9 Measured $P_{SAT}$, $P_{1dB}$, and PAE versus input power at 1.8 GHz under 15 V supply voltage before (dashed curves) and after (solid curves) post-processing.

5.10 Measured $P_{SAT}$, $P_{1dB}$, and peak PAE of the PA under various supply voltages at 1.8 GHz before (dashed curves) and after (solid curves) post-processing.

6.1 Simulated optimum load impedance ($Z_{opt}$) and output reflection coefficient ($S_{22}$) for different numbers and sizes of stacked Cascode transistor cells at 1.4 GHz.

6.2 (a) Simulated $P_{SAT}$, peak PAE, and gain of stacked PAs designed with different numbers of stacked cells with a fixed $R_{opt} = 20\ \Omega$. (b) Simulated $P_{SAT}$, peak PAE, and fractional bandwidth of stacked PAs designed with different numbers of stacked cells with the same transistor width of 10 mm.

6.3 (a) Circuit schematic and (b) chip micrograph of the wideband high power linear amplifier.

6.4 The layout and dimensions of the input transformers.

6.5 Measured and simulated small-signal S-parameters of the PA.

6.6 Measured output power, gain, PAE, and DE at 1.4 GHz when biased under two supply voltages of 13.5 V and 16 V.

6.7 Measured $P_{\text{SAT}}$, gain, PAE, and corresponding DE from 1 to 2 GHz when biased under two supply voltages of 13.5 V and 16 V.

6.8 Measured ACLR and DE versus output power using WCDMA signal at 1.4 GHz when biased under two supply voltages of 13.5 V and 16 V.

6.9 Measured linear output power using WCDMA signals under two supply voltages of 13.5 V and 16 V.

6.10 Measured ACLR and DE versus output power using LTE signal at 1.4 GHz when biased under a supply voltage of 13.5 V.

6.11 Measured output spectrum of the PA with LTE signal when biased under a supply voltage of 13.5 V.

6.12 (a) The waveform of the pulse power supply and (b) the photo of the thermal measurement system setup.

6.13 The thermal image at output power of 30 dBm and the temperature profile across each stacked cell under DC and RF excitations with different output power levels.

ABSTRACT

Chen, Jing-Hwa. Ph.D., Purdue University, December 2013. RF to Millimeter-wave Linear Power Amplifiers in Nanoscale CMOS SOI Technology. Major Professor: Saeed Mohammadi.

The low manufacturing cost, integration capability with baseband and digital circuits, and high operating frequency of nanoscale CMOS technologies have propelled their applications into RF and microwave systems. Implementing fully-integrated RF to millimeter-wave (mm-wave) CMOS power amplifiers (PAs), nevertheless, remains challenging due to the low breakdown voltages of CMOS transistors and the loss from on-chip matching networks. These limitations have confined CMOS power amplifiers to narrowband, low-linearity designs, often with insufficient gain, output power, and efficiency.

A new topology for implementing power amplifiers based on stacking of CMOS SOI transistors is proposed. The input RF power is coupled to the transistors using on-chip transformers, while the gate terminal of each transistor is dynamically biased from the output node. The output voltages of the stacked transistors are added constructively to increase the total output voltage swing and output power. Moreover, the stack configuration increases the optimum load impedance of the PA to values close to 50 Ω, leading to power, efficiency, and bandwidth enhancements. Practical design issues such as the limit on the number of stacked transistors, gate-oxide breakdown, stability, the effect of parasitic capacitances on PA performance, and large chip area have also been addressed. Fully-integrated RF to mm-wave frequency CMOS SOI PAs are successfully implemented and measured using the proposed topology.
1. **INTRODUCTION**

1.1 **An Overview of CMOS Power Amplifiers**

Radio Frequency (RF), microwave and millimeter wave power amplifier circuits are often implemented with GaAs, GaN, BiCMOS, and LDMOS technologies that provide high power handling capabilities. The reasons that CMOS technologies are not considered as attractive choices for implementing power amplifiers are summarized in the following paragraphs [1], [2]:

(i) The low gate-oxide breakdown voltage and low drain-source reach-through voltage of nanoscale transistors limit the maximum voltage swings across the transistor terminals. As a result, the power delivered by CMOS PAs falls with the square of the supply voltage ($P_{\text{out}} \propto V_{DD}^2$) as transistor dimensions continue to scale down. To compensate for the low breakdown voltages, large output powers are commonly achieved by using wide-periphery multi-finger transistors with low optimum load impedances, which increases the output current swing.

(ii) The low optimum load impedances, in turn, require matching networks with high impedance transformation ratios to match to the system impedance (typically 50 Ω). Because of the conductive Si substrate and thin metal layers in advanced CMOS technologies, on-chip passive components generally suffer from low quality factors ($Q$). The loss of the matching networks may therefore significantly degrade the PA's performance, especially when the impedance transformation ratio is high. These limitations have confined CMOS power amplifiers to narrowband, low-linearity designs, often with insufficient gain, output power, and efficiency.
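The two points above can be illustrated numerically. The following is a minimal sketch, assuming an ideal class-A stage with zero knee voltage whose drain voltage swings between 0 and $2V_{DD}$, so that $P_{\text{out}} = V_{DD}^2 / (2R)$. The function names, the 1 W target, and the supply voltages are illustrative assumptions, not values taken from this work.

```python
def class_a_load_for_power(p_out_w: float, v_dd: float) -> float:
    """Load resistance (ohm) an ideal class-A stage needs to deliver p_out_w.

    Assumes zero knee voltage and a full swing around V_DD,
    so that P_out = V_DD**2 / (2 * R).
    """
    return v_dd ** 2 / (2.0 * p_out_w)


def transformation_ratio(r_load: float, r_system: float = 50.0) -> float:
    """Impedance transformation ratio needed to reach the system impedance."""
    return r_system / r_load


if __name__ == "__main__":
    target_power_w = 1.0  # hypothetical output power target
    for v_dd in (3.3, 1.8, 1.0):  # hypothetical supply voltages
        r = class_a_load_for_power(target_power_w, v_dd)
        print(f"V_DD = {v_dd:.1f} V: load = {r:.2f} ohm, "
              f"ratio to 50 ohm = {transformation_ratio(r):.1f}:1")
```

Under these assumptions, halving the supply voltage quarters the load resistance needed for the same output power, which is why the matching network ratio grows quickly as the supply scales down.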
Figure 1.1 The simplified circuit schematic of PAs designed with (a) $n$-parallel CS cells (b) $n$-stacked CS cells with input transformers connected in series (c) dynamically-biased $n$-stacked Cascode cells with a mixture of series and parallel combinations of input transformers and (d) $n$-stacked of multiple CG transistors.

The existing topologies for implementing high power amplifiers in CMOS technology are summarized in Figure 1.1. Figure 1.1(a) shows a simplified schematic of a PA implemented with $n$ transistor cells combined in parallel. Without impedance transformation networks, the input and optimum load impedances of the PA are $Z_{\text{in}}/n$ and $Z_{\text{opt}}/n$, where $Z_{\text{in}}$ and $Z_{\text{opt}}$ are the input and optimum load impedances of the unit transistor cell, respectively, and $n$ is the number of transistor cells. The unit transistor cell can be implemented using any single-stage or multi-stage amplifier topology; two common examples are the common-source (CS) configuration shown in Figure 1.1 and the Cascode configuration. The low input and optimum load impedances of PAs designed with parallel combination, however, require matching networks with high impedance transformation ratios to match to 50 $\Omega$.

Another approach to implementing high power amplifiers is to use power combining techniques to combine the outputs of several unit power amplifiers into one single-ended output, as shown in Figure 1.1(b). Common power combiner structures based on the distributed active transformer (DAT) [3] and on transformers [4] are shown in Figure 1.2. The insertion loss of on-chip combiners limits the maximum output power and efficiency that can be practically achieved, especially at high operating frequencies. Nevertheless, this technique has been widely used to implement relatively narrowband PAs.

Figure 1.2 Power combiners implemented using (a) DAT structure [3] and (b) transformers [4].

Figure 1.3 The circuit schematic of PAs implemented using (a) a stacked FET with one CS device [5] and transformer-coupled CS cells with (b) series and (c) parallel input connections [6].

To overcome the bandwidth, efficiency, and output power limitations caused by the output matching networks of PAs with parallel transistor combinations and power combining techniques, stacked PA configurations have been proposed [5-14]. Figure 1.3 summarizes the circuit schematics of PAs implemented using (a) a stack with one CS device [5] and transformer-coupled CS cells with (b) series and (c) parallel input connections [6]. With the series transformer-coupled configuration, the input and optimum load impedances of the PA without impedance transformation networks are $nZ_{in}$ and $nZ_{opt}$. The input transformers may instead be arranged in parallel to achieve an input impedance of $Z_{in}/n$ while the outputs of the transistor cells are still in series. Alternatively, as shown in Figure 1.1(c), a mixture of series and parallel combinations at the input nodes, with an input impedance between $Z_{in}/n$ and $nZ_{in}$, may be arranged to achieve an input impedance matched to 50 Ω, if desired.
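The impedance scaling of the parallel and stacked arrangements can be compared with a short calculation. This is a minimal sketch assuming ideal combination, real-valued impedances, and no parasitics. The unit-cell value of 12.5 Ω and the cell count of 4 are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def parallel_impedance(z_unit: float, n: int) -> float:
    """Impedance of n identical unit cells combined in parallel: Z / n."""
    return z_unit / n


def stacked_impedance(z_unit: float, n: int) -> float:
    """Impedance of n identical unit cells combined in series (stacked): n * Z."""
    return z_unit * n


def ratio_to_system(z: float, z_system: float = 50.0) -> float:
    """Impedance transformation ratio (>= 1) between z and the system impedance."""
    return max(z, z_system) / min(z, z_system)


if __name__ == "__main__":
    z_opt_unit = 12.5  # hypothetical optimum load of one unit cell, in ohm
    n_cells = 4        # hypothetical number of cells
    z_par = parallel_impedance(z_opt_unit, n_cells)
    z_stk = stacked_impedance(z_opt_unit, n_cells)
    print(f"parallel: {z_par:.3f} ohm, ratio {ratio_to_system(z_par):.1f}:1")
    print(f"stacked:  {z_stk:.3f} ohm, ratio {ratio_to_system(z_stk):.1f}:1")
```

With these example numbers the stacked arrangement lands on 50 Ω directly, while the parallel arrangement needs a 16:1 transformation, which is the motivation for stacking described above.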

Practical implementations of the stacked transistor approach have been limited to a maximum of four transistors in the stack. Stacking more than four transistors is challenging for several reasons, discussed below.

(i) First and foremost, if the gate biases of the transistors in the stack are fixed at certain voltages, the top transistor in the stack (closest to the output node) may experience gate-oxide breakdown for large output voltage swings.

(ii) Secondly, for bulk CMOS or BiCMOS processes, transistors may experience substrate breakdown or severe substrate leakage, or both, when the number of transistors in the stack is increased. Note that this limitation only applies to transistors on conducting substrates and excludes CMOS SOI transistors, as well as compound semiconductor FETs with semi-insulating substrates.

(iii) The next mechanism limiting the number of transistors that can be practically stacked stems from the parasitic capacitances between the transistor terminals and the common GND. These parasitics break the symmetry of the stacked PA circuit and introduce phase and amplitude differences among the drain-source voltages of the transistors in the stack. The added phase of the transistor cell farthest from the output is quadratically proportional to the number of transistors in the stack [15]. The phase differences prevent the voltage swings from adding constructively, lowering the voltage combining efficiency and output power as the number of stacked transistors increases. The amplitude variations, on the other hand, result in a substantially higher voltage swing across the top transistor compared to the other transistors in the stack. As a result, the top transistor may experience breakdown before the other transistors contribute any appreciable voltage swing. This effect limits the maximum achievable output power, and instability may occur under high-power operation.
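The loss of constructive addition described in (iii) can be sketched with a phasor sum. This is an illustrative model only: each drain-source voltage is treated as a single-frequency phasor, and the combining efficiency is taken as the magnitude of the phasor sum divided by the sum of the individual amplitudes. The amplitude and phase values below are hypothetical and are not simulated or measured data from this work.

```python
import cmath
import math


def combining_efficiency(amplitudes, phases_deg):
    """Ratio of the combined voltage magnitude to the sum of amplitudes.

    Returns 1.0 when all phasors are in phase and less than 1.0 otherwise.
    """
    total = sum(
        a * cmath.exp(1j * math.radians(p))
        for a, p in zip(amplitudes, phases_deg)
    )
    return abs(total) / sum(amplitudes)


if __name__ == "__main__":
    # Hypothetical 4-cell stack with equal amplitudes and a growing phase
    # offset towards the cell farthest from the output.
    amplitudes = [1.0, 1.0, 1.0, 1.0]
    phases_deg = [0.0, 5.0, 15.0, 30.0]
    eta = combining_efficiency(amplitudes, phases_deg)
    print(f"voltage combining efficiency = {eta:.3f}")
```

The same function accepts unequal amplitudes, which allows the amplitude imbalance mentioned above to be explored alongside the phase error.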

Adding appropriate capacitive loads at the gate terminals has been demonstrated to adjust the voltage swings of PAs designed with a stack of multiple common-gate (CG) transistors, as shown in Figure 1.1(d). The capacitor values required at the top of the stack, however, are substantially lower than the others in order to achieve an equal drain-source voltage swing across each transistor in the stack. The maximum number of stacked transistors is limited to 3–4.

1.2 Motivation

Power amplifiers are considered one of the most important components in a wireless communication system. The performance of a PA not only decides the transmission quality and distance but also determines the overall efficiency and the thermal dissipation requirements. Currently, the high-end 3G and 4G mobile markets are dominated by GaAs PAs due to their superior linearity and efficiency. CMOS PAs, on the other hand, are used for 2G applications, mainly because of the difficulty in finding a balance between cost and performance.

Significant efforts have been made to reduce the manufacturing cost of transceiver chips. One attractive way is to integrate multiple functional blocks into a single chip [16], [17]. With its relatively low manufacturing cost and integration capability with digital and baseband circuits, nanoscale CMOS technology has become an attractive choice for implementing RF systems-on-chip (SoC). Many transceiver building blocks have been successfully demonstrated in advanced CMOS technology nodes. Nevertheless, designing high-performance power amplifiers using nanoscale CMOS transistors remains one of the bottlenecks.

In this dissertation, a transformer-coupled Cascode topology is proposed to overcome the low operating voltages of scaled CMOS transistors while ensuring circuit stability. The drain-source voltages of the stacked transistors are added constructively to increase the total output voltage swing. The buried-oxide layer in the SOI technology allows the drain-source voltage swings of all transistors to be added constructively without causing breakdown or leakage current to the substrate. High optimum load impedances are achieved by selecting the size, number, and topology of the stacked cells, which reduces the loss of on-chip impedance transformation networks and the associated performance degradation. CMOS SOI and SOS PAs implemented using the proposed topology have successfully demonstrated high linear output power and high efficiency while achieving wide operating bandwidth at RF and mm-wave frequencies.

1.3 Thesis Organization

Chapter 2 summarizes the basic design concepts and performance metrics of implementing linear power amplifiers. The tradeoffs between conjugate and loadline output match on PA’s output power and gain are presented. The mechanisms of several limiting factors on the PA’s power added efficiency, including class of operation, output matching network, and transistor knee voltage are discussed. Finally, the tradeoffs between gain, output power, efficiency, and linearity for PAs biased under different classes of operation are compared.

Chapter 3 presents the proposed power amplifier topology using transformer-coupled stacked Cascode cells in CMOS SOI technology. The effects of parasitic capacitance on the stacked transistor configuration are qualitatively discussed. Wideband power amplifiers designed using the proposed topology are implemented in 45 nm CMOS SOI technology for X-band and Ka-band applications. The proposed topology overcomes the low breakdown voltages of nanoscale CMOS transistors and allows the PAs to deliver high output power over wide bandwidths while maintaining high efficiency and linearity at RF and mm-wave frequencies.

Chapter 4 presents a novel broadband PA implemented with stacked transistors in 45 nm CMOS SOI technology. By optimizing the number, size, and topology of transistor cells in the stack, the output impedance is designed to be close to 50 Ω without utilizing an output matching network. As a result, the bandwidth limitation and loss caused by an output matching network are significantly reduced. The presented broadband PA demonstrates high saturated output power (> 20 dBm) over an ultrawide bandwidth (~ 20 GHz).

Chapter 5 presents a fully-integrated wideband power amplifier implemented in 45 nm CMOS SOI technology. Using a simple post-processing technology, the Si substrate is substituted by an AlN substrate to remove the effects of parasitic capacitances and further improve the PA’s performance. A Watt-level output power with a wide operating bandwidth from 1.5 to 2.4 GHz is achieved despite using only low breakdown thin-oxide CMOS transistors in an advanced technology node.

Chapter 6 presents a wideband power amplifier (PA) implemented in 0.25 μm CMOS silicon-on-sapphire (SOS) technology. The insulating substrate in the SOS process significantly suppresses the effect of parasitic capacitance and hence minimizes the amplitude and phase differences among the drain-source voltage waveforms across each transistor. The spatial thermal distribution of the stacked PA under DC and RF excitation is examined using a thermal reflectance imaging technique. The thermal images confirm that the voltage swings are equally distributed across each stacked transistor, with no thermal runaway when operating under high power density.

Chapter 7 summarizes the contribution of this dissertation. The proposed approach demonstrates the feasibility of implementing RF and mm-wave PAs with high linear output power, high efficiency, and wide operating bandwidth in advanced CMOS technology nodes. The limitations as well as the future work of this dissertation are summarized in this chapter.
2. POWER AMPLIFIER FUNDAMENTALS

This chapter summarizes the basic design concepts and performance metrics of implementing linear power amplifiers. The tradeoffs between conjugate and loadline output match are presented in section 2.1. The mechanisms of several limiting factors on the PA’s power added efficiency, including class of operation and transistor knee voltage, are discussed in section 2.2. Finally, the linearity and efficiency of PAs biased under different classes of operation are compared in section 2.3.

2.1 Load Impedance Termination

![Simplified equivalent circuit schematic](a) ![Conceptual loadlines](b)

Figure 2.1 (a) The simplified equivalent circuit schematic of a linear power amplifier and (b) the conceptual loadlines for conjugate match and loadline match.

Figure 2.1(a) depicts the simplified circuit schematic of a linear power amplifier terminated by a load impedance of $R_{\text{load}}$. The transistor is modeled as a current generator whose output impedance is denoted as $R_{\text{gen}}$. The imaginary parts of the impedances can be resonated out by selecting $X_{\text{load}}=-X_{\text{gen}}$. Conjugate matching can be performed to achieve maximum gain by setting the load impedance equal to the transistor’s output impedance ($R_{\text{load}}=R_{\text{gen}}$) such that maximum power transfer is achieved.
The conceptual loadline of conjugate match is plotted using a dashed line in Figure 2.1(b). In reality, however, the maximum allowable output voltage swing ($V_{MAX}$) across the transistor’s drain-source terminals is likely to be limited by its breakdown voltage. As a result, for conjugate matching, the current conducted by the transistor may be much less than its full capacity ($I_{MAX}$) when the output voltage reaches its maximum allowable value.

To deliver high RF power, both a large voltage swing and a large current swing at the load are required. The load impedance in power amplifier designs is often selected to achieve maximum output power by utilizing both the maximum current swing ($I_{MAX}$) and the maximum voltage swing ($V_{MAX}$) of the transistor. The conceptual loadline that achieves maximum output power is plotted using a solid line in Figure 2.1(b), where an optimum load resistance, denoted as $R_{opt}$, is selected to be the ratio of $V_{MAX}$ to $I_{MAX}$. The loadline match represents a real-world compromise that is necessary to extract the maximum power from RF transistors [18].

2.2 Efficiency Analysis

The efficiencies of PAs are commonly evaluated by two different metrics, namely drain efficiency (DE) and power added efficiency (PAE). Drain efficiency is defined as the ratio of the RF output power to the DC power consumption (2.1), whereas power added efficiency is defined as the ratio of the difference between the output and input RF power to the DC power consumption (2.2). For PAs with relatively large power gain, PAE approaches DE.

$$DE = \frac{P_{OUT}}{P_{DC}} \quad (2.1)$$

$$PAE = \frac{P_{out} - P_{in}}{P_{DC}} = \left(1 - \frac{1}{Gain}\right) \cdot DE \quad (2.2)$$
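
As a minimal numerical sketch of (2.1) and (2.2), the following Python snippet evaluates DE and PAE from power levels given in dBm and checks the identity $PAE = (1 - 1/Gain) \cdot DE$. The function names and the operating point are illustrative assumptions, not measured values from this work.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000.0


def drain_efficiency(p_out_w, p_dc_w):
    """Eq. (2.1): RF output power divided by DC power consumption."""
    return p_out_w / p_dc_w


def power_added_efficiency(p_out_w, p_in_w, p_dc_w):
    """Eq. (2.2): (P_out - P_in) divided by DC power consumption."""
    return (p_out_w - p_in_w) / p_dc_w


# Hypothetical operating point (illustrative numbers only).
p_out = dbm_to_watt(20.0)  # 100 mW at the output
p_in = dbm_to_watt(10.0)   # 10 mW at the input, i.e. 10 dB power gain
p_dc = 0.25                # 250 mW drawn from the supply

de = drain_efficiency(p_out, p_dc)
pae = power_added_efficiency(p_out, p_in, p_dc)
gain = p_out / p_in        # linear power gain

print(f"DE = {de:.1%}, PAE = {pae:.1%}")
print(f"(1 - 1/Gain) * DE = {(1 - 1 / gain) * de:.1%}")
```

With 10 dB of gain the PAE is 10% (relative) below the DE; the gap shrinks as the gain increases.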

Various mechanisms limit the efficiency of a power amplifier. The effect of these limitations on the power added efficiency can be described according to the following expression (2.3):

$$PAE = \left(1 - \frac{1}{Gain}\right) \cdot \frac{P_{OUT}}{P_{DC}} < \left(1 - \frac{1}{Gain}\right) \cdot \eta_{class} \cdot \eta_{knee} \cdot \eta_{matching} \quad (2.3)$$
where \( Gain \) is the power gain of the amplifier, \( \eta_{\text{class}} \) denotes the maximum drain efficiency that can be achieved for different operating classes (50% and 78.5% for class A and class B PAs, respectively), \( \eta_{\text{knee}} \) represents the limitation caused by transistor knee voltages, and \( \eta_{\text{matching}} \) denotes the efficiency of the output matching network.

### 2.2.1 Matching Loss and Bandwidth

Nanoscale CMOS transistors are characterized by low gate-oxide breakdown voltage and low drain-source reachthrough voltage. Gate-oxide breakdown is caused by the tunneling current that flows when high voltages are applied at the gate terminal. The tunneling current may create defects trapped in the oxide or at the silicon-oxide interface, and can eventually form an ohmic connection between the gate and the channel. Figure 2.2(a) depicts the cross section of a MOSFET and the stress across the gate oxide. Gate-oxide breakdown permanently damages a MOSFET. Drain-source reachthrough, on the other hand, occurs when a large voltage is applied to the drain terminal: the depletion region at the drain extends and eventually reaches the depletion region of the source-bulk junction. Figure 2.2(b) depicts the cross section of a MOSFET when drain-source reachthrough occurs. Drain-source reachthrough may result in a large current flowing through the device even in the absence of any significant gate bias. From a design perspective, the voltage swings across transistor terminals must remain below certain values (the safe operating voltage) for any given technology to ensure reliable operation.

![Diagram](image)

Figure 2.3 The simplified circuit schematic of a CS PA with a $1:n$ output impedance transformation network.

As the voltage swings across nanoscale CMOS transistors are limited by these two breakdown mechanisms, large output powers in CMOS PAs are commonly achieved by utilizing wide-periphery multi-finger transistors with low optimum load impedances to increase the output current swing. Figure 2.3 plots the simplified circuit schematic of a single-stage common-source (CS) power amplifier with a $1:n$ output matching network interposed between the PA and the 50 Ω load. As an example, a 1:4 impedance transformation translates a 5 $V_{p-p}$ swing at the drain of the CS transistor to 20 $V_p$ at the 50 Ω load. The use of an impedance transformation network, nevertheless, requires the current generated by the PA to be proportionally higher. In this example, the required peak current may exceed 1.6 A. At a given bias current density, such high-current operation requires an enormous multi-finger transistor periphery that often does not provide sufficient gain due to layout parasitic capacitance and inductance. Additionally, the loss of the output matching network can significantly degrade the power transfer efficiency as well as the matching bandwidth of fully-integrated PAs.

The efficiency $\eta_{\text{Trans}}$ and bandwidth $BW$ of a simple single-stage LC matching network are expressed by equations (2.4) and (2.5), respectively [20].

Figure 2.4 The loss of a single-stage LC matching network versus impedance transformation ratio for different inductor quality factors of 5, 10, and 20.

\[
\eta_{\text{Trans}} = \frac{Q_{\text{ind}}}{Q_{\text{ind}} + \sqrt{\frac{R_L}{R_{\text{opt}}} - 1}} \tag{2.4}
\]

\[
BW \approx \frac{f_0}{Q_L} = \frac{f_0}{\sqrt{\frac{R_L}{R_{\text{opt}}} - 1}} \tag{2.5}
\]

where \( R_{\text{opt}} \) is the optimum load impedance, \( R_L = 50 \Omega \) is the load resistance, \( Q_{\text{ind}} \) is the quality factor of the inductor used in the LC match, and \( f_0 \) is the center frequency of the matching network. When large multi-finger power transistors are in parallel combination, the required transformation ratio is large (large \( R_L/R_{\text{opt}} \)), thus \( \eta_{\text{Trans}} \) heavily depends on the quality factor of the output inductor. In addition, the maximum achievable output power of the PA is also degraded due to the additional loss of the output matching network.

As an example, the efficiency of a simple LC matching network is plotted in Figure 2.4 versus the impedance transformation ratio for different inductor quality factors. The efficiency of the LC matching network (\( \eta_{\text{Trans}} \)) depends heavily on the quality factor of the inductor, and the maximum achievable output power of the PA degrades with the additional loss of the matching network. This example illustrates the challenge of achieving high efficiency in standard CMOS PA designs, where wide transistors with small optimum load impedances are commonly used to achieve high output power.
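
The trend described above can be sketched numerically by evaluating (2.4) and (2.5) directly. The Python snippet below is only an illustration: the function names, the list of transformation ratios, and the 2 GHz center frequency are assumptions chosen for the example, while the inductor quality factors of 5, 10, and 20 follow the text.

```python
import math


def l_match_q(r_load, r_opt):
    """Loaded Q of a single-stage LC match, sqrt(R_L / R_opt - 1); needs R_L > R_opt."""
    return math.sqrt(r_load / r_opt - 1.0)


def l_match_efficiency(q_ind, r_load, r_opt):
    """Eq. (2.4): efficiency of the LC matching network."""
    return q_ind / (q_ind + l_match_q(r_load, r_opt))


def l_match_bandwidth(f0, r_load, r_opt):
    """Eq. (2.5): approximate bandwidth of the LC matching network."""
    return f0 / l_match_q(r_load, r_opt)


R_LOAD = 50.0  # load resistance in ohms
F0 = 2.0e9     # hypothetical center frequency in Hz

for ratio in (2, 5, 10, 20):          # impedance transformation ratio R_L / R_opt
    r_opt = R_LOAD / ratio
    bw = l_match_bandwidth(F0, R_LOAD, r_opt)
    effs = [l_match_efficiency(q, R_LOAD, r_opt) for q in (5, 10, 20)]
    eff_text = ", ".join(f"{e:.1%}" for e in effs)
    print(f"ratio {ratio:2d}: BW ~ {bw / 1e9:.2f} GHz, efficiency (Q=5,10,20): {eff_text}")
```

Both the efficiency and the bandwidth fall as the transformation ratio grows, which is the motivation for raising the optimum load impedance of the PA itself.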

### 2.2.2 Transistor Knee Voltage

The efficiency limitation due to transistor knee voltages can be expressed according to (2.6):

\[
\eta_{knee} = \left(\frac{V_{MAX} - V_{knee}}{V_{MAX}}\right)^2 \tag{2.6}
\]

In processes that offer high breakdown voltage, such as GaAs and GaN, \(V_{knee}\) is typically 10% to 15% of the supply voltage. In contrast, the knee voltages of nanoscale CMOS transistors can reach 50% of the supply voltage when the transistors are biased under high current density. Figure 2.5 plots the \(I_D-V_{DS}\) curves of a CMOS transistor biased under two different current densities. As shown in the curves, the high knee voltage under high current density reduces the drain-source voltage swing (plotted as the dashed green curve) and poses significant limitations on the maximum achievable output power as well as the efficiency.
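
To give a feel for the numbers, the sketch below evaluates (2.6) for the knee-voltage fractions quoted above and inserts the result into the upper bound of (2.3). It is an illustration under stated assumptions: the knee voltage is normalized to $V_{MAX}$ rather than to the supply voltage, and the gain and matching efficiency used in the example are hypothetical.

```python
def knee_efficiency(v_max, v_knee):
    """Eq. (2.6): efficiency limitation due to the transistor knee voltage."""
    return ((v_max - v_knee) / v_max) ** 2


def pae_upper_bound(gain_db, eta_class, eta_knee, eta_matching):
    """Right-hand side of Eq. (2.3); gain_db is the power gain in dB."""
    gain = 10 ** (gain_db / 10)
    return (1 - 1 / gain) * eta_class * eta_knee * eta_matching


V_MAX = 1.0  # normalized maximum voltage (assumption for illustration)

for knee_fraction in (0.10, 0.15, 0.50):
    eta_k = knee_efficiency(V_MAX, knee_fraction * V_MAX)
    # Hypothetical class B stage with 10 dB gain and an 80% efficient output match.
    bound = pae_upper_bound(10.0, 0.785, eta_k, 0.80)
    print(f"V_knee = {knee_fraction:.0%} of V_MAX: eta_knee = {eta_k:.1%}, PAE bound = {bound:.1%}")
```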

![Figure 2.5 The \(I_D-V_{DS}\) curves of a CMOS SOI transistor when biased under two different current densities.](image)
2.3 Class A and Class AB PA

Linear power amplifiers can be classified by the conduction angle (denoted as $\theta$) of their drain current. The voltage and current waveforms of Class A and Class AB operation are plotted in Figure 2.6 (a) and (b), respectively.

![Figure 2.6 The voltage and current waveforms of (a) Class A and (b) Class AB PA.](image)

The efficiency and output power for power amplifiers operating in class A, AB, B, or C can be calculated according to (2.9) and (2.10), respectively, where $V_{DD}$ is the supply voltage, $\theta$ is the conduction angle, $V_{knee}$ is the knee voltage of the transistor, and $I_{MAX}$ is the maximum drain current.

$$\eta = \frac{V_{DD} - V_{knee}}{V_{DD}} \frac{\theta - \sin \theta}{4 \left( \sin \frac{\theta}{2} - \frac{\theta}{2} \cos \frac{\theta}{2} \right)} \quad (2.9)$$

$$P_{out} = \frac{1}{2} (V_{DD} - V_{knee}) \frac{I_{MAX}}{2\pi} (\theta - \sin \theta) \quad (2.10)$$
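
As a quick check of (2.9), the following Python sketch evaluates the efficiency as a function of the conduction angle. With $V_{knee} = 0$ it reproduces the 50% and 78.5% limits quoted earlier for class A ($\theta = 2\pi$) and class B ($\theta = \pi$). The function name and the class AB conduction angle of $1.5\pi$ are illustrative choices.

```python
import math


def class_efficiency(theta, v_dd=1.0, v_knee=0.0):
    """Eq. (2.9): drain efficiency for a conduction angle theta (radians)."""
    knee_factor = (v_dd - v_knee) / v_dd
    numerator = theta - math.sin(theta)
    denominator = 4.0 * (math.sin(theta / 2) - (theta / 2) * math.cos(theta / 2))
    return knee_factor * numerator / denominator


for label, theta in (("Class A", 2 * math.pi),
                     ("Class AB", 1.5 * math.pi),
                     ("Class B", math.pi)):
    print(f"{label:8s} theta = {theta / math.pi:.1f} pi: efficiency = {class_efficiency(theta):.1%}")

# A knee voltage equal to half the supply halves the achievable efficiency.
print(f"Class A with V_knee = 0.5 V_DD: {class_efficiency(2 * math.pi, 1.0, 0.5):.1%}")
```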

Class A PAs are biased to conduct over the entire cycle, so the output voltage is an amplified replica of the input signal, leading to high linearity. As a result, Class A power amplifiers are often used in communication systems that require good linearity. The maximum drain efficiency of a Class A PA is limited to 50%. In Class AB operation, on the other hand, the transistor is biased to conduct between 50% and 100% of the cycle. Class AB PAs sacrifice some linearity in comparison with Class A but generally exhibit better efficiency, especially at power back-off.
Figure 2.7 Gain, output power, PAE, and DE when the PA is biased under current densities of 3.5 mA/mm (dashed lines) and 15 mA/mm (solid lines).

Figure 2.7 plots the output power, gain, and efficiency of a CMOS PA biased under two current densities of 3.5 mA/mm (Class AB) and 15 mA/mm (Class A). The source and load impedances are selected to achieve maximum output power (loadline match). As shown in the figure, Class A operation provides ~5 dB higher power gain while delivering slightly higher output power. The peak PAE and DE are similar for the two classes. The linearity of the two bias conditions is examined using modulated signals. Figure 2.8 plots the measured PAE, gain, and ACLR versus output power when applying an uplink WCDMA signal. Class A operation provides higher gain and linearity while delivering ~3 dB higher output power compared to Class AB operation at a given ACLR. On the other hand, Class AB bias provides substantially higher efficiency at a given output power. The measurement results demonstrate the tradeoffs between the classes in terms of linearity and efficiency. The bias current density can be selected according to the requirements of a particular communication system.
Figure 2.8 ACLR, PAE, and gain versus output power with WCDMA signal when biased under current densities of 3.5 mA/mm (dashed lines) and 15 mA/mm (solid lines).

2.4 Summary

Basic design concepts and performance metrics of linear power amplifiers are presented. The tradeoffs between conjugate and loadline output match on PA’s output power and gain are discussed. The mechanisms of several limiting factors on the PA’s power added efficiency, including Class of operation, transistor knee voltage, and finite voltage combining efficiency, are presented. Finally, the tradeoffs between gain, output power, efficiency, and linearity for PAs biased under different Classes of operation are discussed.
3. RF AND MILLIMETER-WAVE CMOS POWER AMPLIFIER

3.1 The Proposed Power Amplifier in CMOS SOI Technology

3.1.1 Circuit Topology

Figure 3.1 The simplified circuit schematic of the proposed transformer–coupled stacked Cascode power amplifier.

Figure 3.1 depicts the simplified circuit schematic of the proposed PA topology. The PA is constructed using stacked Cascode cells where the RF power is coupled to the CS transistors using on-chip transformers. The design methodology and circuit operation are presented in this section.
(i) The proposed circuit employs a dynamically-biased stacked cell approach with an input transformer and a feedback biasing network that allows the gate bias of each transistor cell to float such that its value is always between the corresponding source and drain voltages. Thus, gate-oxide breakdown is prevented despite large voltage swing at the drain of the top transistor cell in the stack. The primary coils of the input transformers may be arranged in a mixture of series and parallel combinations at the input nodes with an input impedance between \( Z_{in}/n \) and \( n \ Z_{in} \) in order to achieve an input impedance matched to 50 \( \Omega \).

Figure 3.2 Simplified cross-section of the proposed stacked PA implemented in CMOS SOI technology.

(ii) Figure 3.2 plots the simplified cross-section of the proposed stacked PA implemented in CMOS SOI technology. Employing SOI technology with a buried oxide (BOX) layer allows each transistor to be electrically isolated from the Si substrate. Additionally, trench oxide (TOX) regions surrounding each transistor electrically isolate the SOI transistors from each other. Therefore, substrate breakdown and leakage currents are prevented. The maximum drain voltage of the top transistor cell in the stack is limited by the breakdown voltage of the BOX layer, which is typically in excess of 80 V in most commercial CMOS SOI technologies.

(iii) By using a CMOS SOI platform, parasitic capacitances to the substrate can be significantly reduced such that their adverse effects on phase imbalance, voltage swing imbalance, and instability are minimized. Note that as the frequency increases, even small parasitic capacitances may deteriorate the performance and limit the maximum number of transistor cells that can be stacked. One way to eliminate their adverse effects is to resonate the parasitic capacitance out using parallel inductors (in series with large bypass capacitors to maintain the DC bias). This technique has its own disadvantages, namely the large area allocated to the inductors and bypass capacitors, and the extra parasitics, loss, and associated performance degradation introduced by these components. Furthermore, the narrowband nature of the resonator is not suitable for wideband operation.

(iv) Using dynamically-biased Cascode cells achieves a compact PA design that is stable over a wide range of frequencies. The Cascode configuration suppresses the amount of feedback from the output node to the input node through gate-drain capacitors $C_{gd}$'s and helps with the stability of the power amplifier. An additional advantage of Cascode configuration is that for each transformer in the unit cell, there are two transistors that withstand twice as much maximum drain-source voltage swing. The overall output voltage swing and output power are increased with only slight increase in the area occupied by an additional transistor.

Table 3.1 compares some important design parameters of a unit power amplifier and of PAs designed with $n$-parallel and $n$-stacked transistors. In summary, the stacked topology provides high breakdown voltage together with high input and optimum load impedances, which reduces the loss from matching networks and facilitates the implementation of fully-integrated CMOS PAs that deliver high output power over a wide bandwidth.

Table 3.1 Design parameters comparison of a unit power amplifier and PAs designed with $n$-parallel and $n$-stacked transistors.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Unit PA</th>
<th>$n$-parallel</th>
<th>$n$-stacked</th>
</tr>
</thead>
<tbody>
<tr>
<td>Breakdown voltage</td>
<td>$V_{BK}$</td>
<td>$V_{BK}$</td>
<td>$nV_{BK}$</td>
</tr>
<tr>
<td>Output current</td>
<td>$I_{MAX}$</td>
<td>$nI_{MAX}$</td>
<td>$I_{MAX}$</td>
</tr>
<tr>
<td>Output power</td>
<td>$P_{out}$</td>
<td>$nP_{out}$</td>
<td>$nP_{out}$</td>
</tr>
<tr>
<td>Optimum load impedance</td>
<td>$Z_{opt}$</td>
<td>$Z_{opt}/n$</td>
<td>$nZ_{opt}$</td>
</tr>
<tr>
<td>Input impedance</td>
<td>$Z_{in}$</td>
<td>$Z_{in}/n$</td>
<td>$nZ_{in}$</td>
</tr>
</tbody>
</table>
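
The scaling rules of Table 3.1 can be sketched in a few lines of Python. The unit-PA values below (including the 12.5 Ω optimum load impedance) and all names are hypothetical and serve only to show how the required transformation ratio to a 50 Ω load differs between the two topologies.

```python
def scale_unit_pa(unit, n, topology):
    """Apply the Table 3.1 scaling rules to a unit PA for n parallel or n stacked cells."""
    if topology == "parallel":
        v_scale, i_scale, z_scale = 1, n, 1.0 / n
    elif topology == "stacked":
        v_scale, i_scale, z_scale = n, 1, float(n)
    else:
        raise ValueError("topology must be 'parallel' or 'stacked'")
    return {
        "v_bk": unit["v_bk"] * v_scale,    # breakdown voltage
        "i_max": unit["i_max"] * i_scale,  # output current
        "p_out": unit["p_out"] * n,        # output power scales with n in both cases
        "z_opt": unit["z_opt"] * z_scale,  # optimum load impedance
        "z_in": unit["z_in"] * z_scale,    # input impedance
    }


def transformation_ratio(z_opt, r_load=50.0):
    """Impedance transformation ratio needed between z_opt and the load."""
    return max(r_load / z_opt, z_opt / r_load)


# Hypothetical unit PA (illustrative values only).
unit_pa = {"v_bk": 1.0, "i_max": 0.1, "p_out": 0.0125, "z_opt": 12.5, "z_in": 12.5}

for topology in ("parallel", "stacked"):
    pa = scale_unit_pa(unit_pa, 4, topology)
    ratio = transformation_ratio(pa["z_opt"])
    print(f"4-{topology}: Z_opt = {pa['z_opt']:.3f} ohm, transformation ratio to 50 ohm = {ratio:.0f}")
```

For this example both topologies deliver the same power, but the parallel design needs a 16:1 transformation while the stacked design is already at 50 Ω.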
3.1.2 Effect of Parasitic Capacitance

Figure 3.3 Simplified equivalent circuit schematic of a stacked power amplifier with internodal parasitic capacitance between stacked cells.

Figure 3.3 shows a simplified small-signal equivalent circuit of an $n$-stage stacked PA. In this figure, $G_{m,n}$ denotes the effective transconductance of each stage, $C_{GND,n}$ is the parasitic capacitance between each stacked cell and GND, $R_{in,n}$ is the input resistance of each stage, and $R_L$ and $R_S$ are the load and source impedance, respectively. Each cell in the stack can be designed using Common Source (CS), Cascode, or a combination of these topologies. It is assumed that the capacitive part of the input impedance of each cell is resonated out with the inductance of the secondary winding of each input transformer at the designated frequency. Moreover, the feedback capacitance is ignored in the following derivation for simplicity. Under the assumption that all transistors in the circuit have the same dimensions (thus identical $G_{m,n}$, $C_{GND,n}$), the input voltage swing for each cell would be identical and may be designed to achieve maximum power transfer. By solving the KCL at the output of the $k$-th cell ($1 \leq k \leq (n-1)$), the relation among voltages can be expressed according to:

\[ V_{out,k+1} = V_{out,k} \cdot (2 + j\alpha) - V_{out,k-1} \]  \hspace{1cm} (3.1)

where

\[ \alpha = \omega C_{GND} \frac{R_L}{n} \]  \hspace{1cm} (3.2)

The ratio of voltage signal amplitudes and phase differences across Cell 2 and Cell 1 can be calculated according to:

\[ \frac{V_{out,2} - V_{out,1}}{V_{out,1}} = \sqrt{1 + \alpha^2} \angle tan^{-1}(\alpha) \]  \hspace{1cm} (3.3)

Applying similar analysis to Cell 3, the amplitude and phase of voltage swing across Cell 3 with respect to those of Cell 1 are:

\[ \frac{V_{out,3} - V_{out,2}}{V_{out,1}} = \sqrt{1 + 7\alpha^2 + \alpha^4} \angle tan^{-1}\left(\frac{3\alpha}{1 - \alpha^2}\right) \]  \hspace{1cm} (3.4)

Similarly, one can derive this complex ratio for the 4th Cell:

\[ \frac{V_{out,4} - V_{out,3}}{V_{out,1}} = \sqrt{1 + 26\alpha^2 + 13\alpha^4 + \alpha^6} \angle tan^{-1}\left(\frac{6\alpha - \alpha^3}{1 - 5\alpha^2}\right) \]  \hspace{1cm} (3.5)

Here, one may calculate the voltage combining efficiency of a 4-cell stacked PA with uneven amplitudes and varying phase shifts across stacked cells. Using Eqs. (3.1)-(3.5), one can write:

\[ \eta_{combine} = \frac{1}{4} \times \frac{\sqrt{16 + 52\alpha^2 + 16\alpha^4 + \alpha^6}}{\sqrt{1 + 26\alpha^2 + 13\alpha^4 + \alpha^6}} \]  \hspace{1cm} (3.6)
Note that the above analysis of a stacked PA can be easily extended to any number of cells. For the CMOS SOS technology used in this work or CMOS SOI technologies with high resistivity Si substrate, the parasitic capacitance $C_{GND}$ is very small, leading to minuscule values of $\alpha$. For instance, for a 4-cell stacked PA on SOS substrate, $\alpha = 0.016$ is calculated for 10 mm wide and 0.25 $\mu$m long transistors with an estimated $C_{GND} = 150$ fF at an operating frequency of 1.4 GHz and $R_L = 50 \Omega$. Given this low value of $\alpha$, the voltage swings across all cells are essentially the same with almost zero relative phase shift, leading to $\sim$100% voltage combining efficiency. If, instead of SOS technology, an SOI technology with a conductive Si substrate is utilized, an estimated $\alpha = 0.25$ is calculated for 10 mm wide transistors. The relatively large value of $\alpha$ for the SOI technology leads to amplitude and phase variations across the stacked cells and a reduced voltage combining efficiency of 67%. The corresponding power combining efficiency is an unacceptably low 45%. Note that the real scenario is more complicated than the above analytical approach and requires an in-depth analysis that should include the effect of parasitic capacitance at the gate terminals, the internal drain-source and gate-drain capacitances, and transmission-line effects from the interconnection lines, as well as the input signal delay. Nevertheless, this simple analysis points to the importance of suppressing the internodal parasitic capacitance of a stacked PA in order to achieve high output power and high PAE.
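
The numbers quoted above can be reproduced with a short numerical sketch of the recurrence (3.1). The Python code below iterates (3.1) with $V_{out,0} = 0$ and $V_{out,1}$ normalized to 1, and takes the voltage combining efficiency as $|V_{out,n}|$ divided by $n$ times the largest cell swing, an interpretation that matches the 67% figure in the text. Function names are illustrative, and the power combining efficiency is assumed here to be the square of the voltage combining efficiency.

```python
import math


def node_voltages(n_cells, alpha):
    """Node voltages V_out,1..V_out,n from recurrence (3.1), normalized to V_out,1."""
    v = [0j, 1 + 0j]  # V_out,0 (ground) and V_out,1
    for _ in range(n_cells - 1):
        v.append(v[-1] * (2 + 1j * alpha) - v[-2])
    return v[1:]


def cell_swings(n_cells, alpha):
    """Complex voltage swing across each cell, V_out,k - V_out,k-1."""
    v = [0j] + node_voltages(n_cells, alpha)
    return [v[k] - v[k - 1] for k in range(1, n_cells + 1)]


def voltage_combining_efficiency(n_cells, alpha):
    """|V_out,n| relative to n times the largest cell swing."""
    v_top = node_voltages(n_cells, alpha)[-1]
    largest = max(abs(s) for s in cell_swings(n_cells, alpha))
    return abs(v_top) / (n_cells * largest)


def alpha_from_circuit(freq_hz, c_gnd_f, r_load, n_cells):
    """Eq. (3.2): alpha = omega * C_GND * R_L / n."""
    return 2 * math.pi * freq_hz * c_gnd_f * r_load / n_cells


alpha_sos = alpha_from_circuit(1.4e9, 150e-15, 50.0, 4)  # about 0.016
for name, alpha in (("SOS", alpha_sos), ("SOI, conductive substrate", 0.25)):
    eta_v = voltage_combining_efficiency(4, alpha)
    print(f"{name}: alpha = {alpha:.3f}, voltage combining = {eta_v:.1%}, "
          f"power combining = {eta_v ** 2:.1%}")
    for k, s in enumerate(cell_swings(4, alpha), start=1):
        print(f"  cell {k}: |swing| = {abs(s):.3f}, phase = {math.degrees(math.atan2(s.imag, s.real)):.1f} deg")
```

Because the recurrence is iterated rather than expanded by hand, the same functions apply to any number of stacked cells.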

3.2 Wideband CMOS SOI Power Amplifier

3.2.1 Introduction

The cutoff frequency $f_T$ and maximum oscillation frequency $f_{MAX}$ of advanced nanoscale CMOS transistors have surpassed the 200 GHz mark, enabling microwave and mm-wave circuit operation. Nevertheless, implementing fully-integrated wideband CMOS power amplifiers (PAs) remains challenging due to the following: (i) Low gate-oxide breakdown voltage and low drain-source reachthrough voltage of scaled CMOS transistors limit the maximum voltage swings across transistor terminals. As a result, the supply voltage must be lowered adequately to satisfy the breakdown requirements. (ii) Lossy on-chip matching networks or lossy power combiners often used in PA design degrade its efficiency and bandwidth. As a result, large operating bandwidths are difficult to achieve, especially when the impedance transformation ratio is high. (iii) The $f_{\text{MAX}}$ of a wide multi-finger transistor degrades rapidly as the overall transistor width increases due to layout parasitics. Therefore, the maximum transistor widths that can be practically used at high operating frequencies are limited to a few hundred microns. As a result, the output current swing is limited, leading to reduced power delivered by a single PA.

Fully-integrated PAs operating in X-band have been demonstrated in CMOS technology [21-25]. The bandwidths of these designs are often limited by their output matching networks, while their power-added efficiencies (PAEs) are generally degraded when the operating frequency deviates from the center frequency. Stacked CMOS SOI transistors, on the other hand, have been proposed to provide high output impedance and high output power. In this section, a fully-integrated X-band PA designed with 3 stacked dynamically-biased Cascode cells implemented in 45 nm CMOS SOI technology is presented. A high output impedance, relatively close to 50 $\Omega$, allows the PA to deliver high output power while maintaining high efficiency and high linearity over a wide bandwidth from 9 to 15 GHz.

### 3.2.2 X-Band Power Amplifier Design

The circuit schematic of the stacked PA designed with 3 stacked Cascode cells (6 transistors) is shown in Figure 3.4. The buried oxide (BOX) layer in CMOS SOI technology electrically isolates the transistors from the Si substrate, and trench oxide (TOX) regions surrounding each transistor electrically isolate the transistors from each other. Therefore, substrate breakdown and leakage currents are prevented when high supply voltages are used. The gate terminal of each transistor in the stack is dynamically biased using a resistor ladder to set the gate voltage between the corresponding drain and source voltages. The dynamic biasing scheme prevents gate-oxide breakdown. The Cascode configuration ensures circuit stability and provides the advantages of high gain, high output impedance, and small chip area (one transformer for two transistors). Larger transistor widths are used at the top of the stack to obtain a more or less uniform drain-source voltage swing across each transistor. Figure 3.5 shows the simulated voltage swings across each transistor at 12 GHz. Capacitors $C_1 = 0.3 \, \text{pF}$ provide a short circuit between the gates of the common-gate (CG) transistors and the sources of the CS transistors. Capacitors $C_2 = 0.6 \, \text{pF}$ provide DC blocking for the secondary coils of the input transformers and form voltage dividers with the gate-source capacitors of the common-source (CS) transistors. The capacitor values are optimized to achieve high gain and equal drain-source voltage swings across each transistor while maintaining stable operation. The PA is biased in the Class AB operating region, where the maximum voltage at the output node swings to roughly twice the DC supply voltage. A supply voltage $V_{DD} \leq 4.8 \, \text{V}$ ($V_{DS} \leq 0.8 \, \text{V}$) and a low current density of $\leq 0.25 \, \text{mA}/\mu\text{m}$ are chosen to ensure high short- and long-term reliability of the nanoscale CMOS transistors.

Figure 3.4 The circuit schematic of the X-band stacked power amplifier.
Figure 3.5 Simulated voltage swings across each stacked transistor at 12 GHz when biased under a supply voltage of 4.8V.

The input matching network is implemented using on-chip transformers. The Cascode configuration halves the number of transformers compared with a design using stacked CS cells. The primary coils of the transformers are connected in series to provide an input impedance close to 50 Ω. The output impedance is designed to be close to 50 Ω by optimizing the transistor widths in the stack. As a result, no lossy and bandwidth-limiting output matching network or power combining technique is needed, leading to enhanced linear and saturated output power, PAE, and operating bandwidth.
3.2.3 Measurement Results

Figure 3.6 The (a) photo and (b) block diagrams of the measurement setups for small-signal S-parameters, large-signal power measurements and two tone measurements.
Figure 3.7 Measured and simulated small-signal S-parameters and measured stability factor from 5 to 20 GHz when biased under $V_{DD}=4.8\text{V}$.

Figure 3.8 Measured output power, PAE, and gain at 12 GHz when biased under $V_{DD}=3.6\text{ V}$ and 4.8 V.

The photo and block diagrams of the measurement setup are shown in Figure 3.6. Figure 3.7 shows simulated and measured small-signal S-parameters and the measured stability factor ($k$) of the PA from 9 to 15 GHz when biased under a supply voltage of 4.8 V. The PA is unconditionally stable ($k \geq 1.8$) and measures a peak gain of 9.8 dB at 12 GHz with a -3 dB bandwidth from 9 to 15 GHz. The lower measured gain and bandwidth compared to the simulated values are mainly due to parasitic inductances that are not captured in the post-layout simulation. Figure 3.8 plots the measured output power, power gain, drain efficiency (DE), and PAE at 12 GHz. Under a 3.6 V supply voltage, the measured linear output power $P_{1dB}$ and saturated output power $P_{SAT}$ are 16.2 dBm and 19.8 dBm, respectively. The peak PAE and the corresponding DE are 25.7% and 40.7%, respectively. With a higher supply voltage of 4.8 V, $P_{1dB}$ and $P_{SAT}$ increase to 19.2 dBm and 22.5 dBm, respectively, while the peak PAE is reduced to 19.2% due to the dynamic-biasing characteristic. Figure 3.9 plots the measured performance under different supply voltages from 3.6 to 4.8 V. The bias current of the stacked PA is set by the resistor ladder, leading to an output power proportional to $\sim V_{DD}^2$.

Figure 3.9 Measured $P_{SAT}$, $P_{1dB}$, peak PAE, and corresponding DE at 12 GHz when biased under different supply voltages from 3.6 V to 4.8 V.
Figure 3.10 Measured $P_{\text{SAT}}$, peak PAE, and OIP3 from 9 to 15 GHz when biased under $V_{\text{DD}}$=3.6 V and 4.8 V.

Figure 3.10 plots the measured $P_{\text{SAT}}$, peak PAE, and OIP3 from 9 to 15 GHz. The $P_{\text{SAT}}$ over the entire frequency band is above 18.5 dBm and 21.4 dBm when biased under supply voltages of 3.6 V and 4.8 V, respectively, with peak PAE above 16.8% and 12.3%. The OIP3 of the PA is measured using two-tone signals with 5 MHz offset. The OIP3 is above 22.1 dBm and 24.5 dBm from 9 to 15 GHz when biased under 3.6 V and 4.8 V, respectively. The measured power performance indicates no significant degradation when the operating frequency is varied from 9 to 15 GHz.

3.2.4 Conclusion

The performance of the presented wideband PA is summarized in Table 3.2 in comparison with other X-band CMOS PAs reported in the literature. The presented PA delivers high output power while maintaining high efficiency and linearity from 9 to 15 GHz despite the fact that each transistor is biased under low drain-source voltage and low drain current density. Figure 3.11 shows the chip micrograph of the presented PA. The PA occupies a compact chip area of 0.22 mm$^2$ including pads.
Table 3.2 Performance comparison of X-band CMOS PAs.

<table>
<thead>
<tr>
<th>Ref.</th>
<th>Tech.</th>
<th>Freq. (GHz)</th>
<th>$P_{\text{SAT}}$ (dBm)</th>
<th>Peak PAE (%)</th>
<th>$V_{\text{DD}}$ (V)</th>
<th>Area (mm$^2$)</th>
</tr>
</thead>
<tbody>
<tr>
<td>[21]</td>
<td>0.18 µm CMOS</td>
<td>8.5-10 @ 8.5</td>
<td>23.5</td>
<td>19</td>
<td>3.3</td>
<td>1.28</td>
</tr>
<tr>
<td>[22]</td>
<td>0.18 µm CMOS</td>
<td>6-10 @ OP$_{1\text{dB}}$</td>
<td>5</td>
<td>14.4 @ OP$_{1\text{dB}}$</td>
<td>1.5</td>
<td>1.08</td>
</tr>
<tr>
<td>[23]</td>
<td>0.18 µm CMOS</td>
<td>7-12 @ 10</td>
<td>23.8</td>
<td>25.8</td>
<td>3.6</td>
<td>0.47</td>
</tr>
<tr>
<td>[24]</td>
<td>0.18 µm CMOS</td>
<td>8.6-10.3 @ 10</td>
<td>24.5</td>
<td>18</td>
<td>3</td>
<td>1.2</td>
</tr>
<tr>
<td>[25]</td>
<td>0.18 µm CMOS</td>
<td>6.5-13 @ 9.5</td>
<td>21.5</td>
<td>20.3</td>
<td>3.6</td>
<td>0.63</td>
</tr>
<tr>
<td><strong>This Work</strong></td>
<td><strong>45 nm CMOS SOI</strong></td>
<td><strong>9-15</strong></td>
<td><strong>18.5-20.1</strong></td>
<td><strong>16.8-25.7</strong></td>
<td><strong>3.6</strong></td>
<td><strong>0.22</strong></td>
</tr>
<tr>
<td></td>
<td></td>
<td><strong>9-15</strong></td>
<td><strong>above 21.4</strong></td>
<td><strong>above 12.3</strong></td>
<td><strong>4.8</strong></td>
<td></td>
</tr>
</tbody>
</table>

Figure 3.11 The chip micrograph of the X-band PA.
3.3 MM-Wave PA in Nanoscale CMOS Technology

3.3.1 Introduction

The demand for high data-rate wireless communication has motivated the use of millimeter-wave bands with large channel bandwidths. Fully-integrated millimeter-wave radios have been demonstrated in CMOS technologies. Nevertheless, designing a high performance PA in an advanced CMOS technology node remains challenging [26-31]. To date, CMOS mm-wave PAs with output powers close to 20 dBm (100 mW) have been demonstrated using power combining techniques and stacked transistors biased under relatively high drain-source voltages. Further improvement in the power performance of power-combined mm-wave PAs is limited by the power delivered by the combining cells and by the loss and bandwidth limitations of the power combining network.

In this section, a fully-integrated PA designed with 2-stacked Cascode cells operating in 35 to 40 GHz range is implemented in a standard 45 nm CMOS SOI technology. The stacked configuration overcomes some of the limitations of power-combined mm-wave CMOS PAs and allows high linear output power at mm-wave frequencies.

3.3.2 CMOS FET RF Characteristics

The two figures of merit that are commonly used to characterize transistors’ RF performance across various technological processes are the cutoff frequency \( f_T \) and the maximum oscillation frequency \( f_{MAX} \). The cutoff frequency is the frequency at which the transistor's current gain becomes unity. The \( f_T \) of a CMOS FET can be expressed as:

\[
 f_T = \frac{g_m}{C_{gs} + C_{gd}} = \frac{\sqrt{2\mu_n C_{ox} W I_D}}{L W (C_{ox} + C_{ov})} \tag{3.7}
\]

where \( C_{gs} \) and \( C_{gd} \) are the gate-source and gate-drain capacitances of the transistor, \( \mu_n \) is the electron mobility, \( W \) and \( L \) are the width and length of the transistor, \( I_D \) is the drain current, and \( C_{ox} \) and \( C_{ov} \) are the capacitance densities of the gate oxide capacitance and overlap capacitance, respectively. The maximum oscillation frequency is the frequency at which the maximum power gain \( (G_{\text{MAX}}) \) becomes unity:

\[
 f_{\text{MAX}} = \sqrt{\frac{f_T}{2\pi r_g C_{gd}}} \tag{3.8}
\]

where \( r_g \) is the gate resistance. At mm-wave frequencies, \( f_T \) and \( f_{\text{MAX}} \) depend heavily on the internodal parasitic capacitance and inductance introduced by the transistor layout. Therefore, the selection of the total transistor width often leads to a trade-off between power gain and output power. The maximum transistor width that can be practically used at mm-wave frequencies is generally limited to a few hundred microns in order to provide sufficient gain.

### 3.3.3 K_a-Band PA Design


Figure 3.12 The circuit schematics of $K_a$-band PAs designed with (a) a Cascode cell and (b) 2-stacked Cascode cells.

A Cascode amplifier and a 2-stacked Cascode amplifier, shown in Figure 3.12(a) and (b), respectively, are designed and implemented in a standard 45 nm CMOS SOI technology to operate in the $K_a$-band frequency range. The stacked PA is designed with twice the transistor width of the single Cascode PA (320 µm compared to 160 µm) to achieve a similar output impedance. The power delivered by the two PAs can be calculated according to the following equations, where a similar loss from the output L-matching network is taken into account:

\[ P_{\text{out}, Z_{\text{opt}}} = \frac{V_{\text{p-p, total}}^2}{8 \cdot |Z_{\text{opt}}|} = \frac{n^2 \cdot (V_{\text{MAX}} - V_{\text{Knee}})^2}{8 \cdot |Z_{\text{opt}}|} \tag{3.9} \]

\[ P_{\text{out}, 50\,\Omega} = P_{\text{out}, Z_{\text{opt}}} - \text{Matching Loss} \tag{3.10} \]

where \( n \) is the number of stacked transistors and \( Z_{\text{opt}} \) is the optimum load impedance designed for maximum power. \( V_{\text{MAX}} \) is the maximum drain-source voltage across one transistor, which is chosen below the drain-source breakdown voltage, and \( V_{\text{Knee}} \) represents the knee voltage of the transistor.

In order to achieve maximum output power, i.e. maximum voltage \( (V_{\text{MAX}}) \) and maximum current \( (I_{\text{MAX}}) \) swings, the real part of the load impedance is designed to satisfy the following equation:

\[ \frac{I_{\text{MAX}}}{V_{\text{MAX}} - V_{\text{Knee}}} = \frac{1}{R_{\text{opt}}} - \frac{1}{R_{DS}} \tag{3.11} \]

where \( R_{\text{opt}} \) is the real part of the optimum load impedance \( Z_{\text{opt}} \) and \( R_{DS} \) is the finite output resistance of the power transistor operating under a high current. The transistors in the presented PAs are biased in Class AB amplification mode, where the peak-to-peak voltage at the output is roughly \( 2V_{\text{DD}} \). To ensure reliable operation, each transistor has to operate below the safe operating voltage recommended for the technology. As a result, the supply voltages must be set low enough that the voltage swings across the transistor terminals remain within the safe operating voltage. The stacked PA is biased with twice the DC supply of the Cascode PA and is expected to deliver roughly 6 dB (4 times) higher output power.
All transistors in the stacked PA are dynamically biased by using feedback resistors and input transformers to avoid gate-oxide breakdown. The output impedances of the PAs are matched to 50 Ω using on-chip L-matching networks. By optimizing transistor widths, optimum load impedances close to 50 Ω are designed in order to reduce the loss from the output matching networks.
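The expected 6 dB improvement follows from the $n^2$ term in Eq. (3.9) when $|Z_{\text{opt}}|$ is held constant. The short Python sketch below illustrates this scaling; the per-transistor voltages and the load value are hypothetical numbers chosen only for illustration, not values taken from the design.

```python
import math

def pout_at_zopt(n, v_max, v_knee, z_opt_mag):
    """Eq. (3.9): power delivered to the optimum load, in watts."""
    v_pp_total = n * (v_max - v_knee)  # total peak-to-peak swing of the stack
    return v_pp_total ** 2 / (8.0 * z_opt_mag)

def watts_to_dbm(p_watts):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(p_watts / 1e-3)

# Hypothetical per-transistor values (assumptions, not from the text).
V_MAX, V_KNEE, Z_OPT = 1.5, 0.3, 50.0

p_cascode = pout_at_zopt(2, V_MAX, V_KNEE, Z_OPT)  # one Cascode cell: 2 transistors
p_stacked = pout_at_zopt(4, V_MAX, V_KNEE, Z_OPT)  # 2 stacked Cascode cells: 4 transistors

gain_db = watts_to_dbm(p_stacked) - watts_to_dbm(p_cascode)
print(f"power ratio = {p_stacked / p_cascode:.1f}x ({gain_db:.2f} dB)")
```

Because both PAs see the same $|Z_{\text{opt}}|$ in this sketch, the ratio depends only on $n$; the matching loss of Eq. (3.10) is assumed equal for both and therefore cancels.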

3.3.4 Measurement Results

On-wafer small-signal S-parameter measurements from 30 to 40 GHz for both amplifiers are plotted in Figure 3.13. The stacked PA provides a peak small-signal $S_{21}$ gain of 5.2 dB at 37 GHz when biased under a supply voltage of 3.6 V. Compared to the Cascode PA, the smaller power gain of the stacked PA is due to its larger transistor widths and the additional loss of an extra input transformer.

Figure 3.13 Measured small-signal S-parameters of the Cascode PA and stacked Cascode PA from 30 to 40 GHz when biased under 1.8 V and 3.6 V, respectively.
Figure 3.14 Measured output power and PAE of the Cascode PA and the stacked Cascode PA when biased under 1.8 V and 3.6 V, respectively.

Figure 3.15 Measured $P_{\text{SAT}}$, $P_{1\text{dB}}$, and peak PAE of the stacked PA at 37 GHz biased under various supply voltages from 3 to 4.4 V.

Figure 3.14 compares the measured large-signal performance of the two amplifiers. The stacked PA delivers a $P_{\text{SAT}}$ and $P_{1\text{dB}}$ of 20.2 dBm and 14.5 dBm, respectively, with a peak PAE of 11.2% when biased under 3.6 V. The stacked PA delivers ~5 dB higher output power and a slightly lower PAE compared to the Cascode PA. The degradation in PAE is mainly due to the lower power gain and a finite voltage combining efficiency caused by parasitic capacitances of the circuit. The effect of varying the supply voltage on the power performance of the stacked Cascode PA is shown in Figure 3.15, where the measured $P_{\text{SAT}}$, $P_{1\text{dB}}$, and peak PAE are plotted versus supply voltage. By increasing the supply voltage from 3 to 4.4 V, the measured $P_{\text{SAT}}$ increases from 16.7 dBm to 21.4 dBm (~140 mW).
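For readers who want to cross-check the dBm and mW figures quoted above, a minimal conversion helper is sketched below. The function name is an illustrative assumption; the input values are the measured $P_{\text{SAT}}$ numbers from this section.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

# Measured P_SAT values of the stacked PA quoted in the text.
measured = [
    ("P_SAT at 3.0 V", 16.7),
    ("P_SAT at 3.6 V", 20.2),
    ("P_SAT at 4.4 V", 21.4),
]

for label, p_dbm in measured:
    print(f"{label}: {p_dbm} dBm = {dbm_to_mw(p_dbm):.0f} mW")
```

The 4.4 V result converts to roughly 138 mW, consistent with the ~140 mW stated above.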

### 3.3.5 Conclusion

Table 3.3 summarizes the performance of the presented $K_a$-band PAs in comparison with other mm-wave PAs reported in the literature. The stacked configuration overcomes the low operating voltages of scaled transistors and allows the PA to deliver the highest saturated output power among the silicon-based (CMOS and SiGe) mm-wave PAs listed.

Table 3.3 Performance comparison of mm-wave CMOS PAs

<table>
<thead>
<tr>
<th>Ref</th>
<th>Technology</th>
<th>Freq. (GHz)</th>
<th>$P_{1\text{dB}}$ (dBm)</th>
<th>$P_{\text{SAT}}$ (dBm)</th>
<th>Peak PAE (%)</th>
<th>$V_{\text{DD}}$ (V)</th>
<th>Topology</th>
</tr>
</thead>
<tbody>
<tr>
<td>[26]</td>
<td>0.15μm GaAs HEMT</td>
<td>42</td>
<td>N/A</td>
<td>21.8</td>
<td>25 (DE)</td>
<td>5</td>
<td>Doherty CS</td>
</tr>
<tr>
<td>[27]</td>
<td>0.12μm SiGe</td>
<td>42</td>
<td>16.4</td>
<td>19.4</td>
<td>14.4</td>
<td>2.4</td>
<td>Class E</td>
</tr>
<tr>
<td>[30]</td>
<td>45nm CMOS SOI</td>
<td>45</td>
<td>N/A</td>
<td>18.2</td>
<td>23</td>
<td>4</td>
<td>3 stacked SOI transistor</td>
</tr>
<tr>
<td>[31]</td>
<td>45nm CMOS SOI</td>
<td>45</td>
<td>N/A</td>
<td>18</td>
<td>23</td>
<td>2.5</td>
<td>Doherty Cascode</td>
</tr>
<tr>
<td>[28]</td>
<td>90nm CMOS</td>
<td>60</td>
<td>18.2</td>
<td>19.9</td>
<td>14.2</td>
<td>1.2</td>
<td>4-way Power Combined</td>
</tr>
<tr>
<td>[29]</td>
<td>65nm CMOS</td>
<td>60</td>
<td>15</td>
<td>18.6</td>
<td>15.1</td>
<td>1</td>
<td>Power Combined</td>
</tr>
<tr>
<td>This work</td>
<td>45nm CMOS SOI</td>
<td>35</td>
<td>11.5</td>
<td>14.8</td>
<td>16.2</td>
<td>1.8</td>
<td>Cascode</td>
</tr>
<tr>
<td></td>
<td></td>
<td>37</td>
<td>14.5</td>
<td>20.2</td>
<td>11.2</td>
<td>3.6</td>
<td>4 dynamically-biased SOI transistors in stack</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>17.5</td>
<td>21.4</td>
<td>7</td>
<td>4.4</td>
<td></td>
</tr>
</tbody>
</table>

### 3.4 Summary

Chapter 3 presents the proposed power amplifier topology using transformer-coupled stacked Cascode cells in CMOS SOI technology. The effect of parasitic capacitance on the stacked transistor configuration is qualitatively discussed. Wideband power amplifiers designed using the proposed topology are implemented in 45 nm CMOS SOI technology for X-band and $K_a$-band applications. The proposed topology overcomes the low breakdown voltages of nanoscale CMOS transistors and allows the PAs to deliver high output power over wide bandwidths while maintaining high efficiency and linearity at RF and mm-wave frequencies.
4. A BROADBAND PA IN CMOS SOI TECHNOLOGY

4.1 Introduction

The low manufacturing cost, integration capability with baseband and digital circuits, and high operating frequency of nanoscale CMOS technologies have propelled their application into RF and microwave systems. A single-chip radio nevertheless remains elusive, as implementing high-performance multi-mode, multi-band, and linear power amplifiers in CMOS technology is very challenging.

The distributed amplifier (DA) is one of the most common techniques for implementing broadband amplifiers [32-35]. In DA designs, the input and output capacitances of the transistors are absorbed into the transmission lines connected to the input and output terminals of each transistor, such that a substantially wideband frequency response is achieved. Broadband DAs, nevertheless, are inefficient due to the power lost in the dummy 50 Ω load as well as the varying signal swing across each stage, which reduce the overall efficiency to less than half. Owing to the high breakdown voltages of GaAs and GaN technologies, medium-power DAs with relatively low efficiencies have been reported using III-V technologies [32], [33]. CMOS DAs [34], [35], on the other hand, are generally characterized by low output power levels and poor efficiencies due to the small acceptable voltage swings across individual DA cells designed with either common source (CS) or Cascode configurations.

Other approaches to implementing wideband CMOS PAs are reported in the literature, including (i) the inverter-based Class D PA [36] and (ii) PAs based on a high-order broadband matching network [37], [38]. In the inverter-based Class D PA design, the non-linear Class D operation facilitates a high efficiency with extremely wideband operation from 4 to 50 GHz. The PA, however, may not be suitable for variable-envelope modulation schemes. Moreover, as the operating frequency increases, the power dissipated by the driver amplifier stage that provides the square-wave input to the Class D PA increases, degrading the overall efficiency. Stacked PAs with dynamic biasing as proposed in this work, on the other hand, have been demonstrated to deliver high linear output power with non-constant envelope input signals [36]. For PAs implemented with high-order broadband on-chip matching networks, the output power and efficiency suffer from the low quality factor of on-chip inductors. Therefore, high-order matching networks are often implemented using off-chip components at relatively low operating frequencies.

In this work, a novel broadband PA is implemented in a standard 45 nm CMOS SOI technology. The PA is based on dynamically-biased electrically-isolated CMOS SOI transistors stacked on top of each other to achieve a large output voltage swing. The buried-oxide layer in the SOI technology allows the drain-source voltage swings of all transistors to be added constructively without causing breakdown to the substrate. The dynamic biasing scheme, on the other hand, prevents gate oxide breakdown of transistors at the top of the stack. A large bandwidth is achieved by selecting the size, number, and topology of stacked cells to provide an output impedance close to 50 Ω which eliminates the need for a lossy on-chip impedance transformation network and its associated bandwidth, power, and efficiency degradations. The distributed transmission line effect of interconnections partially resonates with a relatively low output capacitance (due to stacking of transistors), and leads to a high output power and acceptable overall efficiency over an ultrawide bandwidth.

4.2 Broadband Stacked PA Design

4.2.1 Stacked SOI Transistors

Figure 4.1 plots the circuit schematic of the proposed broadband stacked PA. The targeted specification is a saturated output RF power of at least 100 mW over a wide frequency range from X-Band to K-Band. The PA design is realized using dynamically-biased stacked transistor cells, which eliminates the need for an output impedance matching network. The design is carried out by optimizing three main parameters:

(i) The width of each transistor, which affects the gain, the input and optimum output impedances, and the current drive capability.

(ii) The number of stacked transistor cells, which determines both input and optimum output impedances and the maximum output voltage swing as well as the chip area. 

(iii) The topology used for each transistor cell. For design simplicity, in this work, the cell topology has been limited to single-stage dynamically-biased Cascode or dynamically-biased common source (CS) configurations. Multi-stage dynamically-biased cells may be utilized for higher power gain, if desired.

Figure 4.1 The circuit schematic of the proposed broadband stacked power amplifier.
Figure 4.2 The simulated output impedance $S_{22}$ from 5 to 30 GHz of NMOS transistors with various widths connected in CS and Cascode configurations. The output impedances of transformer-coupled stacked cells with 6 transistors (320 µm wide) are also shown (6 CS transistors, 3 Cascode cells, and 2 Cascode cells with 2 CS transistors). All transistors are biased under $I_d=0.2$ (mA/µm) and $V_{ds}=0.75$ V.

Figure 4.2 plots the simulated output reflection coefficient ($S_{22}$) from 5 to 30 GHz for NMOS transistors with three different widths, where the inputs are terminated in a 50 Ω load. For high frequency operation, all transistors are thin-oxide floating-body devices with a gate oxide thickness of slightly more than 1 nm. Both CS and Cascode configurations are simulated. In the Cascode configuration, the widths of the two transistors are kept the same; varying the widths of these transistors would provide an additional degree of freedom in designing stacked PAs. The transistors are biased under a drain current density $I_d = 0.2$ mA/µm and a drain-source voltage $V_{ds} = 0.75$ V. Compared to the CS topology, the Cascode topology provides higher voltage gain and higher output impedance, especially at low frequencies. It also provides better isolation between the input and output ports, which improves the PA’s stability. The output impedance of the Cascode cell, however, decreases rapidly as the frequency increases. On the other hand, the output impedance of the CS cell is relatively constant over frequency, which makes the CS cell suitable for wideband operation. Figure 4.2 also shows the simulated output reflection coefficient $S_{22}$ of 6 stacked transistors, each 320 µm wide, implemented with different combinations of Cascode and CS cells. The design with 6 stacked CS cells provides an output impedance close to 50 Ω over the entire frequency band of 5 to 30 GHz. Unfortunately, the circuit becomes unstable for power gains exceeding 3 dB. Compared to the 6 stacked CS design, a combination of stacked CS and Cascode cells can achieve higher gain without stability concerns. For instance, the design with 3 stacked Cascode cells is stable ($k > 1.4$) over the entire operating frequency band but becomes narrowband due to the large variation of its output reflection coefficient, as depicted in Figure 4.2. A design with 2 Cascode cells (top two cells) and 2 CS cells (bottom two cells), as shown in Figure 4.1, is stable over the entire frequency band with $k > 1.2$ and offers a good tradeoff between bandwidth and stability; it is the design implemented in this work.
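The stability factor $k$ quoted above is presumably the Rollett factor computed from two-port S-parameters. As a rough illustration of how such a check can be carried out at a single frequency, the Python sketch below evaluates $k$ and $|\Delta|$; the S-parameter values are hypothetical placeholders, not simulated or measured data from this design.

```python
import cmath
import math

def polar(mag, angle_deg):
    """Build a complex number from magnitude and angle in degrees."""
    return cmath.rect(mag, math.radians(angle_deg))

def rollett_k(s11, s12, s21, s22):
    """Return (k, |delta|) for a two-port described by its S-parameters."""
    delta = s11 * s22 - s12 * s21
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))
    return k, abs(delta)

# Hypothetical single-frequency S-parameters (assumed for illustration only).
s11 = polar(0.30, -60.0)
s12 = polar(0.05, 30.0)
s21 = polar(2.00, 90.0)
s22 = polar(0.20, -45.0)

k, delta_mag = rollett_k(s11, s12, s21, s22)
# Unconditional stability requires k > 1 together with |delta| < 1.
print(f"k = {k:.2f}, |delta| = {delta_mag:.3f}, stable = {k > 1 and delta_mag < 1}")
```

In practice the check would be repeated at every frequency point across the band, since stability must hold over the whole range and not only in the passband.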

The transistors in this design operate in Class AB amplification mode, where the peak RF voltage at the top of the stack is roughly twice the DC supply voltage. The supply voltage is chosen to ensure that the drain-source voltage swings stay within the safe operating region. Figure 4.3 shows simulated $V_{gs}$-$V_{ds}$ waveforms of the top transistor at 18 GHz when the PA is biased with a supply voltage of 4.5 V ($V_{DS,DC}$ per transistor is 0.75 V) for output powers of 12, 16, and 20 dBm. The waveforms confirm that neither gate-oxide breakdown nor drain-source reach-through occurs, despite the fact that the drain of the top transistor swings from $\sim 6 \times V_{knee}$ (1.2 V) to $\sim 2 \times V_{DD}$ (9 V). If one transistor in the stack experiences a large voltage swing beyond its drain-source reach-through voltage, the PA still operates properly because the overall current is limited by the other transistors in the stack. This safety feature is not available in PAs based on a parallel combination of transistors, where voltage swings beyond the drain-source reach-through voltage cause excessive current and catastrophic failure. The power delivered by the stacked PA to a 50 Ω load can be calculated according to:

$$P_{OUT,50\Omega} = \frac{1}{8 \times 50} \cdot \left( \sum_{i=1}^{n} V_{ds,i} \right)^2 = \frac{n^2 (V_{MAX} - V_{knee})^2}{8 \times 50} \tag{4.1}$$

where $n$ is the number of transistors in the stack, $V_{MAX}$ is the maximum drain-source voltage across each transistor, and $V_{knee}$ is the transistor knee voltage (drain-source voltage at maximum drain current). The estimated saturated output power under a 4.5 V supply voltage is 130 mW.
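The 130 mW estimate can be reproduced from Eq. (4.1) as a quick sanity check. The sketch below assumes six transistors in the stack, each contributing an in-phase peak-to-peak swing of 1.2 V, which corresponds to the total swing of about 7.2 V used in this section; the function name is illustrative.

```python
def pout_from_swings(v_ds_pp, r_load=50.0):
    """Summation form of Eq. (4.1).

    v_ds_pp: per-transistor peak-to-peak drain-source swings in volts,
             assumed to add perfectly in phase.
    Returns the power delivered to r_load in watts.
    """
    v_total = sum(v_ds_pp)
    return v_total ** 2 / (8.0 * r_load)

# Six stacked transistors, 1.2 V swing each (7.2 V total), 50-ohm load.
swings = [1.2] * 6
p_watts = pout_from_swings(swings)
print(f"Estimated P_SAT = {p_watts * 1e3:.1f} mW")
```

Passing a list of swings rather than a single value makes it easy to explore unequal voltage sharing among the transistors, although any phase mismatch would reduce the result further.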

Figure 4.3 Simulated $V_{\text{gs}}$-$V_{\text{ds}}$ waveforms of the top transistor in the stack at 18 GHz when biased under a supply voltage of 4.5 V ($V_{\text{DS,DC}}$ per transistor is 0.75 V) for output powers of 12, 16, and 20 dBm.
Figure 4.4 Simulated load lines of each transistor in the stack at -1 dB compression point when biased under supply voltage of 4.5 V at (a) 6 GHz ($P_{1dB} = 18.6$ dBm) (b) 18 GHz ($P_{1dB} = 19.2$ dBm) and (c) 26 GHz ($P_{1dB} = 18.1$ dBm).

Figure 4.4 shows the simulated load lines of each transistor in the stack at the -1 dB compression point when biased under a supply voltage of 4.5 V at 6 GHz ($P_{1dB} = 18.6$ dBm), 18 GHz ($P_{1dB} = 19.2$ dBm), and 26 GHz ($P_{1dB} = 18.1$ dBm). With the additional degree of freedom brought by the stacked configuration, the size and number of transistors are selected to satisfy the condition $R_L \approx n(V_{\text{max}} - V_{\text{min}}) / I_{\text{max}}$ for delivering maximum output power, where $n$ is the number of stacked transistors, $n(V_{\text{max}} - V_{\text{min}}) \approx 7.2$ V, and $I_{\text{max}}$ for a 320 μm transistor is roughly 140 mA, so that $R_L \approx 50$ Ω. The simulated load lines of each transistor shown in Figure 4.4 verify the maximum power condition for the CMOS PA. At the same time, optimizing the size and number of transistors leads to an output impedance with a real part close to 50 Ω, facilitating a high power gain if the output capacitance is partially cancelled by the distributed transmission line effects of the interconnections.
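The load-line condition above is simple arithmetic, sketched here in Python using the values quoted in the text (a total swing of about 7.2 V and $I_{\text{max}}$ of roughly 140 mA). The function names are illustrative assumptions.

```python
def optimum_load_resistance(total_swing_v, i_max_a):
    """R_L ~ n * (V_max - V_min) / I_max for maximum output power."""
    return total_swing_v / i_max_a

def reflection_magnitude(r_load, z0=50.0):
    """|Gamma| of a purely resistive load against the reference impedance z0."""
    return abs((r_load - z0) / (r_load + z0))

r_l = optimum_load_resistance(7.2, 0.140)
print(f"R_L = {r_l:.1f} ohm, |Gamma| vs 50 ohm = {reflection_magnitude(r_l):.3f}")
```

The result of about 51 Ω is close enough to 50 Ω that, for the real part of the impedance, no transformation network is needed; the sketch ignores the reactive part of the output impedance.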

4.2.2 Passive Network Design


Figure 4.5 The layout of the input transformer and simulated insertion loss and return loss of the transformer-coupled input matching network from 5 to 30 GHz.

The input matching network of the proposed PA is designed using on-chip transformers. The transformers are necessary to couple the input RF power to the CS transistors in the stack while facilitating electrical isolation of each stacked cell. The primary coils of the transformers are connected in series to increase the overall input impedance. With four transistor cells and thus four input transformers, the impedance of each transistor cell should be designed to match to 12.5 Ω. The transformers are designed in Ansoft HFSS using the top two copper layers each with a thickness of roughly 1.2 μm.
The layout of one transformer and its simulated insertion loss and return loss from 5 to 30 GHz are plotted in Figure 4.5. In the simulation, the primary coil is terminated to 12.5 Ω and the secondary coil is terminated using the input impedance of a 320 μm transistor used in the stacked PA design. The dimensions of the input transformers are optimized to achieve broadband matching despite additional insertion loss. As shown in Figure 4.5, the simulated insertion loss is below 4.3 dB from 6 to 26.5 GHz. The loss of transformers, if used as power combiners at the output node, would severely degrade the saturated and linear output power and the bandwidth of the power combined PA. On the other hand, the transformers in this work are used as input matching networks where the losses impact the overall power gain and the PAE (if the gain is severely degraded) of the PA but their effect on the saturated output power is less severe.

In the stacked PA design, the distributed transmission line effects of the relatively long interconnections among the stacked cells and output pads partially resonate with the inter-nodal and output capacitances, facilitating optimum operation of the PA at the center frequency (18 GHz) with a slight reduction in bandwidth. The interconnections, on the other hand, may also cause phase imbalance among cells, which deteriorates the PA’s efficiency and output power. In order to capture these effects, the interconnections are modeled as microstrip lines and simulated using Ansoft HFSS. The minimum distance between the stacked cells is mainly set by the dimensions of the input transformers, which have diameters of around 60 μm. The interconnections are implemented using thick copper layers to handle high current densities with minimum conductive loss. The simulated losses lie between 0.15 and 0.33 dB/mm for frequencies from 5 to 30 GHz. To counteract the phase imbalance caused by the small electrical length of each interconnect segment, the input signal is fed from the bottom of the stack while the output signal is collected from the top of the stack (opposite side).

Time-domain voltage waveforms of each transistor in the stacked PA are shown in Figure 4.6(a) and (b) at the -1 dB compression point at 18 GHz ($P_{1dB}=19.2$ dBm) and 23 GHz ($P_{1dB}=18.5$ dBm), respectively. The arrows indicate the drain-source voltages of individual transistors and the overall peak-to-peak output voltage swings. The simulated time-domain waveforms confirm that the drain-source voltage swings of individual transistors are added constructively. Slight phase shifts among the drain-source voltage swings are caused by layout parasitics. As the frequency increases, the phase differences increase, leading to output power and efficiency degradations.

Figure 4.6 The simulated time-domain voltage waveforms of the broadband PA at -1 dB compression point at (a) 18 GHz (P_{1dB}=19.2 dBm) and (b) 23 GHz (P_{1dB}=18.5 dBm).
4.3 Measurement Results

Figure 4.7 Measured (solid lines) and simulated (dashed lines) small-signal S-parameters under a supply voltage of 4.5 V.

Small-signal measurements are performed using an Agilent E8361A 67 GHz PNA from 5 GHz to 30 GHz. Figure 4.7 shows the measured (solid lines) and simulated (dashed lines) small-signal S-parameters. Interconnections among stacked cells modeled as microstrip lines and input transformers are simulated in Ansoft HFSS and then added to the circuit simulation. A good match between modeled and measured S-parameters is observed in Figure 4.7. Higher transistor gain at lower frequencies compensates for the input mismatch at lower frequencies, leading to a flat gain of 6 dB with -1 dB bandwidth ranging from 6 GHz to 26.5 GHz when the PA is biased with a 4.5 V supply voltage.
Figure 4.8 Measured and simulated output power, gain, PAE and DE versus input power under a supply voltage of 4.5 V at 18 GHz.

Figure 4.9 Measured $P_{\text{SAT}}$, $P_{1\text{dB}}$, and peak PAE under various supply voltages from 3.6 V to 7.2 V at 18 GHz.
Figure 4.10 Measured $P_{\text{SAT}}$, $P_{1\text{dB}}$, and peak PAE from 6 to 26 GHz when biased under supply voltages of 4.5 V and 5.4 V, and 7.2 V.

The large-signal performance is measured using an Agilent E4448A spectrum analyzer with input power provided by an Agilent 83640L CW signal generator and a Gigatronics GT-1050A power amplifier driver. Figure 4.8 shows the simulated and measured output power and power gain of the stacked PA versus input power at 18 GHz under a 4.5 V supply voltage. As shown in the figure, the PA delivers a saturated output power $P_{\text{SAT}}$ of 21.7 dBm and a linear output power $P_{1\text{dB}}$ of 18.5 dBm. The power-added efficiency (PAE) and the drain efficiency (DE) at 18 GHz are also plotted in Figure 4.8. The peak PAE is 20.5% and the corresponding DE reaches 40%. The measured $P_{\text{SAT}}$, $P_{1\text{dB}}$, and PAE versus supply voltage at 18 GHz are shown in Figure 4.9. By increasing the supply voltage to 7.2 V (1.2 V per transistor), $P_{\text{SAT}}$ and $P_{1\text{dB}}$ increase to 26.1 dBm (~400 mW) and 22.5 dBm, respectively, while the peak PAE is reduced to 11%. The dynamic biasing scheme adjusts the bias current of the PA when the supply voltage changes and ensures that performance does not degrade as the supply voltage varies. Because of the dynamic biasing scheme, variation in the transistor threshold voltage may lead to a variation in the bias current as well as in the output impedance at low input RF power levels. The effect is examined using corner simulations, which indicate a relatively small variation of ±3% in peak PAE and ±0.35 dB in the saturated output power at 18 GHz. Although no degradation was observed during measurements, the supply voltage of 7.2 V (1.2 V per transistor) may not comply with long-term reliability requirements. The measured $P_{\text{SAT}}$, $P_{1\text{dB}}$, and PAE versus frequency from 6 to 26 GHz under supply voltages of 4.5 V, 5.4 V, and 7.2 V are shown in Figure 4.10. The measured $P_{\text{SAT}}$ and $P_{1\text{dB}}$ are above 21.2 dBm and 16.5 dBm, respectively, with peak PAE above 17% for several measured frequencies in the range of 6 to 20 GHz when biased under $V_{\text{DD}}=4.5$ V. The chip micrograph of the fabricated stacked PA is shown in Figure 4.11; the PA occupies a compact active area of 0.16 mm$^2$, with the input and output terminals placed on opposite sides.
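PAE and DE are linked through the power gain by the standard definitions $\text{DE} = P_{out}/P_{DC}$ and $\text{PAE} = (P_{out} - P_{in})/P_{DC} = \text{DE}\,(1 - 1/G)$. The Python sketch below applies this relation to the reported 18 GHz figures (peak PAE of 20.5% with a DE of 40%) to back out the gain they imply. The implied gain is an inference from these two numbers under the standard definitions, not a value reported in the text.

```python
import math

def pae_from_de(de, gain_db):
    """PAE = DE * (1 - 1/G), with G the linear power gain."""
    g_lin = 10.0 ** (gain_db / 10.0)
    return de * (1.0 - 1.0 / g_lin)

def implied_gain_db(pae, de):
    """Invert PAE = DE * (1 - 1/G) to recover the gain in dB."""
    g_lin = 1.0 / (1.0 - pae / de)
    return 10.0 * math.log10(g_lin)

gain_db = implied_gain_db(pae=0.205, de=0.40)
print(f"Implied power gain at peak PAE: {gain_db:.2f} dB")
print(f"Round-trip PAE check: {pae_from_de(0.40, gain_db) * 100:.1f} %")
```

A gain of a few dB at the peak-PAE point would be consistent with a 6 dB small-signal gain that has compressed near saturation, which illustrates why a low power gain penalizes PAE much more than DE.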

Table 4.1 summarizes the performance of the stacked PA under three different supply biases in comparison with other reported broadband power amplifiers. The PA achieves the highest saturated output power among the reported broadband CMOS PAs at X-Band and K-Band. The high output power is achieved despite using low breakdown voltage transistors, in a topology that ensures neither drain-source reach-through nor gate-oxide breakdown occurs. Relatively good efficiency is maintained over the entire operating frequency range from X-Band to K-Band.

Figure 4.11 Micrograph of the broadband PA.
Table 4.1 Performance comparison of broadband PAs.

<table>
<thead>
<tr>
<th>Ref</th>
<th>Technology</th>
<th>Frequency (GHz)</th>
<th>P_{1dB} (dBm)</th>
<th>P_{SAT} (dBm)</th>
<th>Peak PAE (%)</th>
<th>V_{DD} / #Device (V)</th>
<th>Topology</th>
</tr>
</thead>
<tbody>
<tr>
<td>[39]</td>
<td>SiGe 0.35 μm</td>
<td>7-18 @17.2GHz</td>
<td>N/A</td>
<td>17.5</td>
<td>11.2</td>
<td>1.2</td>
<td>Push-pull</td>
</tr>
<tr>
<td>[34]</td>
<td>CMOS 0.13 μm</td>
<td>2-8</td>
<td>3.5</td>
<td>N/A</td>
<td>N/A</td>
<td>2</td>
<td>Distributed amplifier</td>
</tr>
<tr>
<td>[32]</td>
<td>GaAs HEMT 0.15 μm</td>
<td>15-50</td>
<td>15.1-19.1</td>
<td>18-22</td>
<td>N/A</td>
<td>4</td>
<td>Distributed amplifier</td>
</tr>
<tr>
<td>[40]</td>
<td>CMOS 0.18 μm</td>
<td>4-17</td>
<td>15-17</td>
<td>16-18</td>
<td>11-16</td>
<td>1.8</td>
<td>Darlington cascode</td>
</tr>
<tr>
<td>[41]</td>
<td>CMOS 90 nm</td>
<td>5.2-13 @ 8 GHz</td>
<td>22.6</td>
<td>25.2</td>
<td>21.6</td>
<td>1.4</td>
<td>Push-pull with transformer</td>
</tr>
<tr>
<td>[35]</td>
<td>CMOS 0.18μm</td>
<td>35 @5GHz</td>
<td>8.6</td>
<td>12.4</td>
<td>N/A</td>
<td>1.4</td>
<td>Distributed amplifier</td>
</tr>
<tr>
<td>[25]</td>
<td>CMOS 0.18 μm</td>
<td>6.5-13 @ 9.5GHz</td>
<td>20.2</td>
<td>21.5</td>
<td>20.3</td>
<td>1.8</td>
<td>Push-pull with transformer</td>
</tr>
<tr>
<td>[42]</td>
<td>GaAs HEMT 0.15 μm</td>
<td>17-35</td>
<td>22-22</td>
<td>22.5-23.5</td>
<td>30-40</td>
<td>4</td>
<td>Synthesized transformer matching</td>
</tr>
<tr>
<td>[36]</td>
<td>CMOS 45 nm SOI</td>
<td>4-50 @ 15GHz</td>
<td>N/A</td>
<td>22.5</td>
<td>24.2</td>
<td>1.1</td>
<td>Stacked Class D Differential output</td>
</tr>
<tr>
<td>This work</td>
<td>CMOS 45 nm SOI</td>
<td>6-26 @ 18GHz</td>
<td>18.5</td>
<td>21.7</td>
<td>20.5</td>
<td>0.75</td>
<td>Dynamically-biased stacked PA</td>
</tr>
</tbody>
</table>

4.4 Summary

A novel broadband PA is proposed using dynamically-biased stacked transistors and is implemented in a 45 nm CMOS SOI technology. By optimizing the number, size, and topology of transistor cells in the stack, the output impedance is designed to be close to 50 Ω without an on-chip output matching network, which eliminates the bandwidth, efficiency, and output power degradations that such a network generally causes. As a result, high saturated output powers (> 20 dBm) over an ultrawide bandwidth (~ 20 GHz) are achieved. The high supply voltage used in the stacked PA design is a disadvantage of this approach for battery-powered mobile applications. Nevertheless, a high supply voltage may be readily available in radar and satellite systems.
5. A WIDEBAND CMOS PA WITH ALN SUBSTRATE

5.1 Introduction

Most CMOS RF PAs reported to date have been demonstrated using high breakdown voltage CMOS transistors with thick gate oxide and long gate lengths \((L_{\text{gate}} > 0.18 \, \mu\text{m})\). Either power combining circuits or off-chip matching networks are used at the output of these PAs in order to reach Watt-level performance at RF frequencies. The high insertion loss and relatively large parasitics of output matching networks or power combiners generally make CMOS PAs narrowband and degrade their linearity, which makes them unsuitable for multi-band, multi-mode applications. To overcome some of these limitations, especially the low breakdown voltage of the transistors, and to boost the output voltage swing as well as the output impedance of CMOS PAs, stacked power amplifiers have been proposed. With the exception of our previous work [1], the number of stacked transistors in stacked PAs has been limited to four. If this number is increased, the output voltage swing and output power increase further. At the same time, by directly matching the stacked PA to a 50 \, \Omega load impedance, the output matching network or output power combiners may be eliminated from the PA circuit, and higher efficiencies, higher output powers, and wider bandwidths can be achieved. Three simple modifications to previously reported designs are proposed to enable PAs that stack more than four transistors efficiently:

(i) The gate of each transistor in the stack is dynamically biased from the corresponding drain and source terminals such that its voltage follows those of drain and source. 

(ii) Parasitic capacitances to the substrate at all internal nodes are reduced or eliminated. 

(iii) The stack design includes Cascode transistor cells to ensure circuit stability.
In this Chapter, a wideband RF PA is implemented using a stack of 16 dynamically-biased thin-oxide ($t_{ox} \sim 1\text{nm}$) CMOS SOI transistors (8 Cascode cells) to achieve a large output voltage swing and output impedance close to 50 $\Omega$ without using any output impedance matching network. The Si substrate is etched away and replaced by a semi-insulating AlN substrate in order to eliminate the adverse effects of parasitic capacitances and further improve the PA performance including its output power, efficiency and linearity.

### 5.2 The Effect of Parasitic Capacitance

In an ideal situation, the drain-source voltages of all transistors in an $n$-stacked PA design are identical with no phase difference. The output signal swings between $\sim 0$ V and a maximum value of $n \times V_{\text{breakdown}}$, where $V_{\text{breakdown}}$ is the breakdown voltage of a CMOS transistor. Parasitic capacitors between the internal nodes and the common GND break the symmetry of the stacked PA design and cause both amplitude and phase variations in the drain-source voltages of individual transistors. As an example, simulated time-domain drain-source voltages of a 4-stacked CS PA with identical CMOS SOI transistors ($L = 45 \text{nm}$, $W = 2 \text{mm}$) with and without parasitic capacitance to GND (estimated by the post-layout parasitic extractor) are shown in Figure 5.1 (a) and (b), respectively. When the effect of the parasitic capacitors is taken into account, the top transistor in the stack has a substantially higher voltage swing ($V_{\text{DS,4}}$) than the others and may experience premature breakdown before the other transistors can contribute significantly to the output voltage swing. Additionally, the phase variations caused by these parasitic capacitors reduce the combining efficiency of the drain-source voltage swings. The overall voltage swing can be expressed by taking the effect of phase variations into account:

$$V_{\text{total}} = V_{\text{DS,1}} + V_{\text{DS,2}}\,e^{j\phi_1} + \cdots + V_{\text{DS,n}}\,e^{j\phi_n} \quad (5.1)$$

If all transistor cells are conducting with the same phase, a combining efficiency of 100% is achieved. With phase differences among the drain-source voltages of stacked transistors, however, the combining efficiency drops rapidly and limits the power-added efficiency. While in principle stacking more transistors should allow a larger voltage swing and thus a larger output power, in practice the amplitude and phase variations of the drain-source voltages caused by parasitic capacitors limit the output power and efficiency of stacked PAs with a large number of transistors.
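The effect of phase misalignment on the combined swing can be illustrated numerically. The following is a minimal sketch, assuming the combining efficiency is taken as the magnitude of the phasor sum divided by the sum of the individual amplitudes (one common definition, not necessarily the one used in the simulations above). The function names and the 1 V amplitudes and 15-degree phase steps are hypothetical values chosen only for illustration.

```python
import cmath
import math


def total_swing(amplitudes, phases_deg):
    """Phasor sum of the drain-source swings (volts, degrees)."""
    return sum(a * cmath.exp(1j * math.radians(p))
               for a, p in zip(amplitudes, phases_deg))


def combining_efficiency(amplitudes, phases_deg):
    """Magnitude of the phasor sum divided by the in-phase (ideal) sum."""
    return abs(total_swing(amplitudes, phases_deg)) / sum(amplitudes)


# Hypothetical 4-stack with equal 1 V swings (illustrative values only).
aligned = combining_efficiency([1.0] * 4, [0.0, 0.0, 0.0, 0.0])
skewed = combining_efficiency([1.0] * 4, [0.0, 15.0, 30.0, 45.0])
print(f"in-phase: {aligned:.3f}, 15-degree steps: {skewed:.3f}")
```

Even with equal amplitudes, a progressive phase shift along the stack lowers the combined swing, and the penalty grows as more cells are added with the same per-cell phase step.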

Figure 5.1 Simulated drain-source voltage swings of each transistor in a PA designed with 4 stacked CS cells (a) with and (b) without the effect of parasitic capacitances.
5.3 Wideband Stacked Power Amplifier Design

5.3.1 Dynamically-Biased SOI Transistors

Figure 5.2 The simplified circuit schematic of the RF CMOS SOI PA designed with 8 stacked dynamically-biased Cascode cells (16 transistors).

The simplified circuit schematic of the proposed PA is shown in Figure 5.2, where 16 electrically-isolated thin-oxide ($t_{ox} \sim 1\text{nm}$) floating-body CMOS SOI transistors are stacked to increase the total output voltage swing as well as the output impedance. If gate-oxide breakdown does not occur and the output voltage signal is distributed evenly among all transistors, it can swing to as high as the sum of the drain-source reach-through voltages of the individual transistors, $n \times V_{\text{breakdown}}$. Capacitors $C_1$ and $C_2$ are bypass capacitors. Their values are optimized to achieve high gain and stable operation. In addition to drain-source reach-through, transistors in a stacked PA are also susceptible to gate-oxide breakdown. In the presented PA, gate-oxide breakdown is prevented by a dynamic biasing scheme in which each transistor in the stack is self-biased with a resistor feedback network. Feedback resistors ($R_1$ to $R_3$) set the gate voltage of each transistor within its source and drain voltages, $V_{G,n} = \frac{R_3}{R_1+R_3} \cdot (V_{D,n} - V_{S,n})$, preventing gate-oxide breakdown if the drain-source voltage is limited to values less than the reach-through voltage.

Figure 5.3 Simulated \(V_{ds}-V_{gs}\) voltage swings of the top transistor in the stack of 16 transistors with input powers of 10, 15, and 20 dBm.

Figure 5.3 plots the simulated \(V_{ds}-V_{gs}\) waveforms of the top transistor in the stack for input powers of 10, 15, and 20 dBm. Although the drain of this transistor swings between \(\sim 0\,\text{V}\) and \(\sim 2\times V_{DD}\), the other terminal voltages swing along with the drain voltage such that neither gate-oxide breakdown nor drain-source reach-through occurs. Other transistors in the stack behave similarly.

The output power of a stacked PA delivered to a 50 Ω load impedance is expressed as:

\[
P_{out} = \frac{(n \cdot V_{ds})^2}{8 \times R_L} = \frac{n^2 (V_{MAX} - V_{knee})^2}{8 \times R_L} \quad (5.2)
\]

where \(n\) is the number of transistors in the stack, \(R_L\) is the load resistance, \(V_{MAX}\) is the maximum drain-source voltage across one transistor, typically close to the breakdown voltage of the transistor \(V_{\text{breakdown}}\), and \(V_{knee}\) is the minimum drain-source voltage of the transistor under its maximum current. An effective method to reduce the knee voltage is to reduce the current density of the transistor. In the proposed PA design, the stacked transistor configuration allows the use of a high-voltage power supply (15 V) while the transistors are biased at a low current density of \(\sim 0.2\) mA/\(\mu\)m, which suppresses the knee voltage of each transistor for improved power performance and long-term reliability.
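As a quick numerical illustration of the output power expression, the sketch below evaluates \((n(V_{MAX}-V_{knee}))^2/(8R_L)\) for a 16-transistor stack driving 50 Ω. The values of `v_max` and `v_knee` are hypothetical placeholders chosen for round numbers, not values taken from the measurements, and the function names are illustrative.

```python
import math


def stacked_pout_watts(n, v_max, v_knee, r_load=50.0):
    """Output power of n stacked transistors, each swinging v_max - v_knee."""
    return (n * (v_max - v_knee)) ** 2 / (8.0 * r_load)


def watts_to_dbm(p_watts):
    """Convert power in watts to dBm."""
    return 10.0 * math.log10(p_watts / 1e-3)


# v_max and v_knee are hypothetical placeholders, not measured values.
p_w = stacked_pout_watts(n=16, v_max=1.2, v_knee=0.2)
print(f"{p_w:.2f} W = {watts_to_dbm(p_w):.1f} dBm")
```

Because the power scales with \(n^2\), halving the stack to 8 transistors with the same per-device swing would reduce this estimate by 6 dB.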

With the dynamic biasing scheme, the limiting mechanism in stacking a large number of transistors is the buried oxide breakdown (\( \sim 80 \) V in this technology). As the number of stacked transistors increases, however, parasitic capacitance and inductance to the substrate stemming from the transistor layout and interconnections, together with the distributed effects of the transformers, cause significant phase variations across the drain-source voltages of individual transistors. This phase imbalance becomes more significant as the frequency of operation increases and leads to inefficient power combining and hence degradation of output power and PAE.

The CMOS SOI PA is designed using 8 stacked Cascode cells as opposed to 16 stacked CS transistors. The overall gain of the presented PA is equal to the gain of a single Cascode cell. Compared to 2 stacked CS transistors, each Cascode cell offers higher voltage gain, higher output impedance, better isolation between input and output and hence better stability, and a more compact area since only one transformer is required to drive two transistors. The input RF power is coupled to the PA using on-chip transformers. The primary coils of input transformers are connected in series to increase the overall input impedance. The transformers are designed using the top two copper layers provided in the process with thicknesses of \( \sim 1.2 \) \( \mu \)m. Using HFSS simulation, we have confirmed that the coupling coefficient of each transformer is above 0.7 in the frequency range of 1 to 3 GHz. The loss from input transformers degrades the gain of the PA (\( \text{Loss} \sim 3 \) dB) but does not affect the maximum output power and efficiency as long as the power gain is not significantly degraded.
Figure 5.4 Measured input (red curves) and output (blue curves) reflection coefficients from 1 to 5 GHz of a 2 mm transistor, one Cascode cell implemented with 2 mm transistors, and the PA designed with 8 stacked Cascode cells.

The interconnections between stacked cells are modeled as microstrip lines and simulated in Ansoft HFSS to capture their distributed effects. The simulated loss is 0.1 dB/mm at frequencies around 2 GHz. The relatively low loss confirms that the design is not significantly degraded by the interconnections at the operating frequencies of 1.5 to 2.4 GHz. The output impedance of the stacked PA is then calculated as the sum of the output impedances of the individual cells in the stack.

\[ Z_{out} = n \cdot Z_{cell} = 8 \cdot Z_{out, cascode} \quad (5.3) \]

Figure 5.4 plots the measured input (red curves) and output (blue curves) reflection coefficients from 1 to 5 GHz of a 2 mm NMOS transistor, one Cascode cell based on two 2 mm NMOS transistors, and the PA designed with 8 Cascode cells in the 45nm CMOS SOI technology. As shown in the figure, by optimizing the size (2 mm) and number of the transistors (8 Cascode cells) in the stacked PA, an output impedance close to 50 Ω over a wide range of frequencies is achieved. Therefore, no output matching circuit is required and output power, PAE, and bandwidth degradations caused by lossy on-chip impedance transformation networks are prevented.
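The link between the per-cell impedance and the output match can be sketched as follows. This is an illustrative calculation only: the per-cell impedance `z_cell` is a hypothetical value, not one extracted from Figure 5.4, and the stack impedance is taken as the simple series sum of the cell impedances.

```python
import math


def stack_output_impedance(z_cell, n_cells):
    """Series sum of identical cell output impedances."""
    return n_cells * z_cell


def reflection_coefficient(z, z0=50.0):
    """Reflection coefficient of impedance z in a z0 system."""
    return (z - z0) / (z + z0)


def return_loss_db(z, z0=50.0):
    """Return loss in dB (positive number, larger is better)."""
    gamma = abs(reflection_coefficient(z, z0))
    return float("inf") if gamma == 0 else -20.0 * math.log10(gamma)


z_cell = complex(6.0, -1.0)                  # hypothetical per-cell impedance (ohms)
z_out = stack_output_impedance(z_cell, 8)    # 8 Cascode cells in series
print(f"Z_out = {z_out} ohm, return loss = {return_loss_db(z_out):.1f} dB")
```

With these assumed numbers, eight cells bring the output close to 50 Ω, whereas a single cell alone would present a large mismatch and would need an impedance transformation network.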
5.3.2 Substrate Transfer Technology

The PA is fabricated in a standard CMOS SOI technology with a minimum gate length of 45 nm. In order to demonstrate the importance of eliminating internodal parasitic capacitances to GND in improving the performance, a few samples are post-processed to remove the Si substrate and replace it with an Aluminum Nitride (AlN) substrate. The higher thermal conductivity (\(k=285 \text{ W/m} \cdot \text{K}\)) and slightly lower dielectric constant (\(\varepsilon_r=8.9\)) of the AlN substrate compared to silicon (\(k=145 \text{ W/m} \cdot \text{K}, \varepsilon_r=11.68\)), combined with its semi-insulating characteristics, make it well suited to RF power circuits. By substituting the conductive Si substrate with the semi-insulating AlN substrate, all parasitic capacitances are placed in series with very large resistances, effectively eliminating their adverse effects on the PA circuit.
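The benefit of placing a parasitic capacitance in series with a large substrate resistance can be seen from the impedance of the resulting shunt branch. The sketch below models each parasitic path as a series R-C branch to ground. The capacitance and both resistance values are hypothetical and serve only to contrast a lossy, conductive path with a semi-insulating one.

```python
import math


def shunt_branch_impedance(c_par, r_sub, freq_hz):
    """Impedance of a series R-C branch from an internal node to ground."""
    omega = 2.0 * math.pi * freq_hz
    return complex(r_sub, -1.0 / (omega * c_par))


FREQ = 1.8e9      # Hz, frequency used for the power measurements
C_PAR = 100e-15   # F, hypothetical node-to-substrate capacitance

# Hypothetical substrate resistances: conductive Si-like vs semi-insulating AlN-like.
z_si = shunt_branch_impedance(C_PAR, r_sub=50.0, freq_hz=FREQ)
z_aln = shunt_branch_impedance(C_PAR, r_sub=1e6, freq_hz=FREQ)
print(f"|Z| Si-like: {abs(z_si):.0f} ohm, AlN-like: {abs(z_aln):.3g} ohm")
```

Under these assumptions the shunt branch on the semi-insulating substrate is roughly three orders of magnitude higher in impedance, so far less RF current is diverted from the stack.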

The post-processing technology that transfers the device layer, including the buried oxide layer of the SOI chips, to an AlN substrate is shown in Figure 5.5(a); no photolithography step is necessary. The backside Si substrate is completely etched using a Xenon Difluoride (XeF\(_2\)) silicon dry etching process with an etch rate of 5 \(\mu\text{m/min}\) at room temperature. The process does not generate plasma, a possible cause of transistor performance alteration during etching. The process also has high selectivity between Si and silicon dioxide (1000:1); hence, the etching stops at the SOI buried oxide. After the etching, the SOI flake with a thickness of \(\sim 10\,\mu\text{m}\) is bonded at \(80\,^\circ\text{C}\) to an AlN substrate by applying a thin adhesive layer (100 nm) of PMMA. Care must be taken that no air gap is created between the thin SOI flake and the AlN substrate during the bonding process. Figure 5.5 (b) shows the chip micrograph and photo of the RF CMOS SOI PA and the SOI flake bonded to the AlN substrate. The PA including its pads occupies a chip area of 1.2 \(\text{mm}^2\).
Figure 5.5 (a) Post-processing steps for AlN substrate transfer. Step 1: Bonding the chip to a temporary substrate using photo-resist. Step 2: Etching the backside silicon substrate using XeF$_2$. Step 3: Releasing the chip from the temporary substrate using acetone. Step 4: Bonding the chip to an AlN substrate using a thin adhesive PMMA layer on a heat plate at 80$^\circ$C. (b) The photo and the chip micrograph of the CMOS SOI chip with substrate transferred to AlN.
5.4 Measurement Results

5.4.1 Active Device Measurement

Figure 5.6(a) and (b) show the $I_D-V_{DS}$ and $I_D-V_{GS}$ characteristics, respectively, of a 640 µm NMOS transistor with a finger width of 0.5 µm implemented in the 45nm CMOS SOI technology before and after substrate transfer. No performance degradation of the active device is observed after substrate transfer to AlN. Figure 5.7 compares the measured $H_{21}$ and maximum available gain (MAG) before and after substrate transfer. $H_{21}$ and MAG were calculated from measured S-parameters.

Figure 5.6 Measured transistor (a) $I_D-V_{DS}$ and (b) $I_D-V_{GS}$ characteristics before (red curves) and after (blue curves) substrate transfer.
5.4.2 Power Amplifier Measurement

A major concern in implementing power amplifiers in CMOS SOI technology is the transistor self-heating effect caused by the power dissipated in the transistor and the low thermal conductivity of the buried oxide (SiO$_2$) layer. The transistor power gain degrades over time when the device operates at high bias currents and high drain voltages. In this particular CMOS SOI technology, the maximum voltage difference permitted across drain-source terminals is about 1.2 V at 105°C. For long-term reliability, a maximum supply voltage of 15 V is selected to ensure that the RMS voltages across transistor terminals are within the safe operating range. Additionally, the circuit is designed to operate at a low current density of 0.2 mA/μm to keep the junction temperature of each transistor below 105°C. On-wafer small-signal S-parameter measurements are performed using a 67 GHz Agilent E8361A network analyzer with SOLT calibration from 1 to 5 GHz. As shown in Figure 5.8, the PA provides a small-signal power gain of 12.2 dB at 1.8 GHz with a -3 dB bandwidth from 1.5 to 2.6 GHz when biased under $V_{DD} = 12$ V ($I_D=0.15$ mA/μm). The gain is slightly smaller under a 9 V power supply ($I_D=0.1$ mA/μm), mainly due to the smaller drain current flowing in each transistor (the current is set by the self-bias mechanism). The PA is unconditionally stable over the entire operating frequency range, as indicated by the stability factor ($k > 1.2$).
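The supply-voltage choice can be checked with simple arithmetic. The sketch below assumes, as an idealization, that the DC supply divides equally across the 16 stacked transistors; it does not model the RF swing superimposed on this DC share. The 1.2 V limit is the drain-source limit quoted above, and the function name is illustrative.

```python
def dc_share_per_transistor(v_supply, n_transistors):
    """DC drain-source voltage per device, assuming an equal split."""
    return v_supply / n_transistors


V_DS_LIMIT = 1.2   # V, drain-source limit quoted in the text
N_STACK = 16       # transistors in the stack

for v_dd in (9.0, 12.0, 15.0):
    share = dc_share_per_transistor(v_dd, N_STACK)
    print(f"V_DD = {v_dd:4.1f} V -> {share:.3f} V per transistor "
          f"(limit {V_DS_LIMIT} V)")
```

Under the equal-split assumption, all three supply voltages used in the measurements keep the DC share per transistor below the quoted limit.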
Figure 5.8 Measured small signal S-parameters of the stacked PA under 9 V and 12 V supply voltage.

Figure 5.9 Measured $P_{\text{SAT}}$, $P_{1\text{dB}}$, and PAE versus input power at 1.8 GHz under 15 V supply voltage before (dash curves) and after (solid curves) post-processing.
The large signal performance is measured using an Agilent E4448A spectrum analyzer with input power provided from an Agilent 83640L CW signal generator. Figure 5.9 compares the power measurement results for CMOS SOI PAs on both Si and AlN substrates. Transferring the substrate to AlN reduces the adverse effects of parasitic capacitance and thus boosts both output power and power-added efficiency. The substrate transferred PA delivers a $P_{\text{SAT}}$ of 30.2 dBm, a $P_{\text{1dB}}$ of 27.8 dBm and a peak PAE of 23.8% at 1.8 GHz.
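For readers who prefer absolute power, the reported dBm figures convert to watts as sketched below. The PAE helper uses the standard definition, (output power minus input power) divided by DC power; the operating point passed to it is hypothetical and is included only to exercise the formula, since the input and DC powers are not listed in the text.

```python
def dbm_to_watts(p_dbm):
    """Convert power in dBm to watts."""
    return 1e-3 * 10.0 ** (p_dbm / 10.0)


def pae(p_out_w, p_in_w, p_dc_w):
    """Power-added efficiency as a fraction."""
    return (p_out_w - p_in_w) / p_dc_w


p_sat_w = dbm_to_watts(30.2)   # reported P_SAT at 1.8 GHz
p_1db_w = dbm_to_watts(27.8)   # reported P_1dB at 1.8 GHz
print(f"P_SAT = {p_sat_w:.2f} W, P_1dB = {p_1db_w:.2f} W")

# Hypothetical operating point, used only to illustrate the PAE definition.
example_pae = pae(p_out_w=1.0, p_in_w=0.1, p_dc_w=3.6)
print(f"example PAE = {100 * example_pae:.1f} %")
```

The conversion shows that the reported saturated output power is slightly above 1 W.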

Figure 5.10 Measured $P_{\text{SAT}}$, $P_{\text{1dB}}$, and peak PAE of the PA under various supply voltages at 1.8 GHz before (dash curves) and after (solid curves) post-processing.

The effect of varying the supply voltage on the PA performance is shown in Figure 5.10 where measured $P_{\text{SAT}}$, $P_{\text{1dB}}$, and peak PAE of the PA on both Si and AlN substrates at 1.8 GHz are plotted vs. supply voltage. By increasing the supply voltage from 9 to 15 V, $P_{\text{SAT}}$ increases from 25.5 to 30.2 dBm while the peak PAE is between 23.5% and 25.7%. Similar observations are made for the PA on Si substrate with slightly degraded performance compared to the one on AlN substrate. Also note that the bias current is controlled by the dynamic-biasing scheme which ensures that both linearity and efficiency of the PA do not degrade as the supply voltage varies.
Figure 5.11 Measured $P_{\text{SAT}}$, $P_{\text{1dB}}$, and peak PAE under 12 V supply voltage at various frequencies from 1.5 to 2.4 GHz before (dash curves) and after (solid curves) post-processing.

The measured $P_{\text{SAT}}$, $P_{\text{1dB}}$, and PAE versus frequency from 1.5 to 2.4 GHz under $V_{\text{DD}}=12$ V are shown in Figure 5.11 for PAs on both Si and AlN substrates. The amplifier demonstrates wideband power performance, as the measured $P_{\text{SAT}}$, $P_{\text{1dB}}$, and PAE remain nearly constant over the measured frequency range. The wideband power performance is attributed to the stacked design, which achieves matched input and output impedances over a wide range of frequencies. For the PA on AlN substrate, $P_{\text{SAT}}$ and $P_{\text{1dB}}$ are above 27.9 dBm and 24.8 dBm, respectively, with peak PAE above 20% from 1.5 to 2.4 GHz. Similar observations are made for the amplifier on the Si substrate, with slightly degraded performance attributed to amplitude and phase differences of the drain-source voltages caused by internodal parasitic capacitances.
Figure 5.12 Measured WCDMA output spectra at 1.8 GHz before (red curve) and after post-processing (blue curve).

The PA is measured using a WCDMA signal with a chip rate of 3.84 Mcps provided by an Agilent E4433B signal generator. The ACLR is measured at 5 and 10 MHz offsets from the center frequency. Figure 5.12 compares the measured ACLR before and after substrate transfer at the saturated output power for each amplifier. Transferring the substrate to AlN reduces the effect of parasitic capacitances to GND and thus improves the overall linearity of the PA. For the substrate-transferred PA, ACLRs of -40.6 and -54.2 dBc are measured at 5 and 10 MHz offsets, respectively. An improvement of 2 to 4 dB is observed compared to the PA on the conductive Si substrate.

Table 5.1 compares the performance of the presented PA with other reported RF PAs implemented in various CMOS technologies. The PA achieves a large bandwidth and comparable power performance among the reported CMOS RF PAs.
Table 5.1 Performance comparison of Watt-level CMOS PAs

<table>
<thead>
<tr>
<th>Ref.</th>
<th>Technology</th>
<th>Frequency (GHz)</th>
<th>P_{SAT} (dBm)</th>
<th>P_{1dB} (dBm)</th>
<th>Peak PAE (%)</th>
<th>V_{DD} (V)</th>
<th>Topology</th>
</tr>
</thead>
<tbody>
<tr>
<td>[3]</td>
<td>0.35 μm CMOS (L_{gate}=0.35 μm)</td>
<td>2.4 BW=0.51</td>
<td>33.4</td>
<td>N/A</td>
<td>31</td>
<td>2</td>
<td>Power combined</td>
</tr>
<tr>
<td>[12]</td>
<td>0.25 μm CMOS SOS (L_{gate}=0.25 μm)</td>
<td>1.88</td>
<td>21</td>
<td>18.3</td>
<td>44</td>
<td>3.9</td>
<td>Stack of 3 SOS transistors</td>
</tr>
<tr>
<td>[5]</td>
<td>0.13 μm CMOS SOI I/O transistor (L_{gate}=0.28 μm)</td>
<td>1.9</td>
<td>32.4</td>
<td>30.8</td>
<td>47</td>
<td>6.5</td>
<td>Stack of 4 SOI transistors w/ off-chip matching</td>
</tr>
<tr>
<td>[43]</td>
<td>65 nm CMOS I/O thick-oxide transistor (L_{gate}=0.25 μm)</td>
<td>6.5</td>
<td>27.4</td>
<td>N/A</td>
<td>16.5</td>
<td>3.6</td>
<td>Stack of 3 transistors w/ off-chip matching</td>
</tr>
<tr>
<td>[44]</td>
<td>65 nm CMOS Thick-oxide transistor (L_{gate}=0.23 μm)</td>
<td>1.78 BW=1.64-1.95</td>
<td>29.4</td>
<td>N/A</td>
<td>51</td>
<td>3.4</td>
<td>Stack of 4 transistors w/ off-chip matching</td>
</tr>
<tr>
<td>[45]</td>
<td>32 nm CMOS Thick-oxide transistor (L_{gate}=0.18 μm)</td>
<td>2.75</td>
<td>28</td>
<td>26.5</td>
<td>31.9</td>
<td>1.8</td>
<td>Power combined</td>
</tr>
<tr>
<td>[46]</td>
<td>0.18 μm CMOS Thick-oxide transistor (L_{gate}=0.18 μm)</td>
<td>2.5 900 MHz</td>
<td>30.8</td>
<td>29.1</td>
<td>30.6</td>
<td>3.3</td>
<td>Power combined</td>
</tr>
<tr>
<td>[47]</td>
<td>90 nm CMOS Thick oxide transistor (L_{gate}=0.25 μm)</td>
<td>0.93</td>
<td>29.4</td>
<td>27.7</td>
<td>25.8</td>
<td>2</td>
<td>Power combined</td>
</tr>
<tr>
<td>[48]</td>
<td>90 nm CMOS (2.5-V thick-oxide transistor)</td>
<td>1.97 BW=1.6-2.6</td>
<td>27.1</td>
<td>N/A</td>
<td>43</td>
<td>2.8</td>
<td>Inverse class-F PA</td>
</tr>
<tr>
<td>This work</td>
<td>45 nm CMOS SOI Thin-oxide transistor (L_{gate}=40 nm)</td>
<td>1.8 BW=1.5-2.4</td>
<td>28.4</td>
<td>26.9</td>
<td>25.7</td>
<td>12</td>
<td>16 dynamically-biased CMOS SOI on AlN substrate</td>
</tr>
</tbody>
</table>

5.5 Summary

A fully-integrated wideband power amplifier is implemented using dynamically-biased stacked Cascode cells in a 45 nm CMOS SOI technology. By optimizing the number and size of the transistors in the stack, the output impedance is directly matched to 50 Ω without an output matching network. As a result, the PA achieves good power performance and a wide bandwidth suitable for multi-mode multiband applications. Using a simple post-processing technology, the Si substrate is replaced by an AlN substrate to remove the effects of parasitic capacitances and further improve the PA performance. A Watt-level output power with a wide operating bandwidth from 1.5 to 2.4 GHz is achieved despite using low-breakdown thin-oxide CMOS transistors in an advanced technology node.
6. A HIGH POWER LINEAR AMPLIFIER IN CMOS SOS TECHNOLOGY

6.1 Introduction

Integrated CMOS PAs for multi-band multi-mode applications often require high linear output power, high efficiency, and wide operating bandwidth among other performance metrics. Implementing CMOS PAs with several Watts of output power, however, remains challenging, mainly due to the low breakdown voltages of scaled CMOS transistors. Stacked PAs have been demonstrated to deliver high output power over wide bandwidths. The maximum number of stacked transistors, however, has generally been limited to 3–4 due to the effect of parasitic capacitance.

This Chapter presents a wideband power amplifier (PA) implemented in 0.25 μm CMOS silicon-on-sapphire (SOS) technology. The PA is designed with 4 stacked dynamically-biased Cascode cells to increase the overall output voltage swing as well as the optimum load impedance. The insulating substrate in the SOS process significantly suppresses the effect of parasitic capacitance and hence minimizes the amplitude and phase differences among the drain-source voltage waveforms of the transistors. The spatial thermal distribution of the stacked PA under DC and RF excitation is examined using a thermoreflectance imaging technique. The thermal images confirm that the voltage swings are equally distributed across the stacked transistors, with no thermal runaway when operating under high power density.
6.2 High Power Wideband PA design

6.2.1 Number and Size of Stacked FETs

Figure 6.1 shows the simulated optimum load impedance ($Z_{opt}$) and output reflection coefficient ($S_{22}$) at 1.4 GHz for different numbers of stacked Cascode cells with different transistor widths. When more cells of the same width are stacked (red curves), the optimum load impedance increases, the real part of the output impedance increases, and the overall output capacitance is reduced. Increasing the widths of the transistors (blue curves), on the other hand, has the opposite effect on the optimum load impedance as well as on the real and imaginary parts of the output impedance. Therefore, selecting the number and size of the stacked transistors involves trade-offs between output power, efficiency, and bandwidth.

Figure 6.1 Simulated optimum load impedance ($Z_{opt}$) and output reflection coefficient ($S_{22}$) for different number and size of stacked Cascode transistor cells at 1.4 GHz.
Two simulations are performed to optimize the design of the stacked PA. Figure 6.2(a) shows the simulated $P_{\text{SAT}}$, peak PAE, and Gain of stacked PAs designed with different numbers of stacked Cascode cells and a fixed optimum load impedance $Z_{\text{opt}}$ of 20 $\Omega$. Because the optimum load is fixed, the matching bandwidth and matching loss in this simulation are similar for all the PAs. Note that in order to maintain the fixed optimum load impedance as the number of stacked cells is increased, the widths of the transistors are also increased. As shown in the figure, $P_{\text{SAT}}$ increases quadratically with the number of stacked cells (proportional to the square of the supply voltage). The power gain, however, decreases due to the increase in the widths of the transistors (lower $f_{\text{max}}$ due to layout parasitics).

Figure 6.2(b), on the other hand, shows the simulated $P_{\text{SAT}}$, peak PAE, and -3 dB output power fractional bandwidth (FBW) of PAs designed with different numbers of stacked Cascode cells with the same transistor width of 10 mm. In the simulation, the output matching networks are designed to achieve maximum output power using a single-stage LC section. As shown in the figure, the peak PAE and FBW for one Cascode cell are dominated by the loss of the output matching network, which has a high impedance transformation ratio. By increasing the number of stacked cells, the FBW and PAE improve due to reduced loss in the output matching network. The output power increases in proportion to $V_{\text{DD}}$, since the supply voltage doubles as the number of stacked cells doubles while the bias current is kept constant. For 8 stacked cells, however, the voltage combining efficiency of the stacked cells drops, leading to reduced overall efficiency. The drop in combining efficiency is attributed to a finite phase delay through the transformer ladder, which introduces small phase shifts between the drain-source voltages of the transistors. In this work, 4 stacked cells with 10 mm width are selected to achieve the highest PAE of $\sim$40% and a high output power above 34 dBm.
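The dependence of matching loss on the impedance transformation ratio can be sketched with the textbook relations for a single L-section: the loaded Q is $\sqrt{R_{high}/R_{low}-1}$, and a common first-order efficiency estimate, assuming the loss is dominated by the inductor, is $1/(1+Q/Q_{ind})$. This is an illustration of the trend only, not the simulation behind Figure 6.2(b); the inductor quality factor and the two optimum load resistances are hypothetical.

```python
import math


def l_match_q(r_low, r_high):
    """Loaded Q of a single L-section transforming r_low to r_high."""
    return math.sqrt(r_high / r_low - 1.0)


def l_match_efficiency(r_low, r_high, q_inductor):
    """First-order efficiency estimate with loss dominated by the inductor."""
    return 1.0 / (1.0 + l_match_q(r_low, r_high) / q_inductor)


Q_IND = 10.0  # hypothetical on-chip inductor quality factor
for r_opt in (5.0, 20.0):  # hypothetical optimum load resistances (ohms)
    q = l_match_q(r_opt, 50.0)
    eta = l_match_efficiency(r_opt, 50.0, Q_IND)
    print(f"R_opt = {r_opt:4.1f} ohm: Q = {q:.2f}, efficiency ~ {eta:.2f}")
```

A higher optimum load resistance lowers the required Q, which both reduces the matching loss and, since the bandwidth of an L-section is roughly inversely proportional to its Q, widens the usable bandwidth.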
Figure 6.2 (a) Simulated $P_{\text{SAT}}$, peak PAE, and Gain of stacked PAs designed with different number of stacked cells with a fixed $R_{\text{opt}} = 20 \, \Omega$. (b) Simulated $P_{\text{SAT}}$, peak PAE, and fractional bandwidth of stacked PAs designed with different number of stacked cells with the same transistor width of 10 mm.
6.2.2 Stacked PA design

The circuit schematic and chip micrograph of the fully-integrated CMOS PA designed with 4 stacked transformer-coupled Cascode cells are depicted in Figure 6.3. The PA occupies a compact chip area of 2.2 mm² including pads. The transistors are biased in Class AB amplification mode, where the voltage at the top of the stack swings to \(2 \times V_{DD}\). To prevent gate oxide breakdown, the gate terminal of each transistor is dynamically biased through resistor dividers (R₁ to R₃), which set the gate voltage between the corresponding drain and source voltages. Compared to designs using stacked CS cells, the Cascode cells provide high power gain and high output impedance and ensure stability. The transistors are implemented with a total width of 10 mm to provide sufficient gain as well as the necessary output impedance for operation in the 1 to 2 GHz frequency range. The high output impedance of the stacked Cascode cells minimizes the loss of the output matching network and facilitates wideband impedance matching to a 50 Ω load using a single-stage on-chip LC matching network. The input power is coupled to the Cascode cells using an array of on-chip transformers with primary coils connected in series to provide a high input impedance close to 50 Ω.

The input transformers couple the RF signal to each stacked cell while providing input matching. The inductance of the secondary winding of each transformer is designed to resonate with the input capacitance of the CS transistor. The input is matched to 50 Ω by connecting the primary coils of the four input transformers in series. The transformers and their interconnecting transmission lines are simulated using Ansoft HFSS. The layout and dimensions are shown in Figure 6.4. Each transformer occupies a chip area of 157×157 µm² and provides a coupling coefficient above 0.75 from 1 to 2 GHz.
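The resonance condition for the secondary winding follows from $L = 1/((2\pi f)^2 C)$. The sketch below sizes the inductance for a given input capacitance; the 10 pF capacitance is a hypothetical placeholder, since the actual input capacitance of the CS transistor is not stated in the text, and the function names are illustrative.

```python
import math


def resonant_inductance(c_farads, freq_hz):
    """Inductance that resonates with c_farads at freq_hz."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / (omega ** 2 * c_farads)


def resonant_frequency(l_henries, c_farads):
    """Resonant frequency of an ideal L-C pair."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henries * c_farads))


C_IN = 10e-12   # F, hypothetical input capacitance of the CS transistor
F0 = 1.4e9      # Hz, frequency used for the large-signal measurements

l_sec = resonant_inductance(C_IN, F0)
print(f"secondary inductance ~ {l_sec * 1e9:.2f} nH")
```

Because the required inductance scales as $1/f^2$, a fixed transformer resonates exactly at only one frequency, so the input match over 1 to 2 GHz relies on the resonance being sufficiently low-Q.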
6.3 Measurement Results

Figure 6.5 shows the simulated and measured S-parameters of the stacked PA. The PA achieves a gain of 11.8 dB at 1.8 GHz. The difference between the post-layout simulation and the measurement is mainly attributed to the inductance of the interconnections, which is not accounted for in the simulation.

Figure 6.5 Measured and simulated small-signal S-parameters of the PA.

Figure 6.6 Measured output power, gain, PAE, and DE at 1.4 GHz when biased under two supply voltages of 13.5 and 16V.
Figure 6.6 plots the measured output power, gain, PAE, and DE of the stacked PA under two supply voltages under CW at 1.4 GHz. The current densities of $I_D = 17$ mA/mm and $I_D = 8$ mA/mm at 13.5 V and 16 V, respectively, are selected to achieve highest peak PAE. The PA under a supply voltage of 16 V measures a saturated output power of 34.4 dBm (2.75 Watts) with peak PAE and corresponding DE of 38% and 48%, respectively. Figure 6.7 plots the measured $P_{SAT}$, gain, peak PAE, and corresponding DE from 1 to 2 GHz with the two supply voltages. When the PA is biased under $V_D = 16$ V, the measured $P_{SAT}$ is above 33 dBm from 1 to 1.8 GHz with peak PAE and corresponding DE above 24.5% and 33%, respectively.
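The reported PAE and DE pairs also imply the large-signal gain at the peak-efficiency point, since by definition PAE = DE × (1 − 1/G), where G is the linear power gain. The sketch below performs this arithmetic on the 16 V figures quoted above; the result is a derived estimate, not a separately reported measurement.

```python
import math


def implied_gain_db(pae_percent, de_percent):
    """Power gain implied by PAE = DE * (1 - 1/G), returned in dB."""
    gain_linear = 1.0 / (1.0 - pae_percent / de_percent)
    return 10.0 * math.log10(gain_linear)


# Peak PAE and corresponding DE at 1.4 GHz under a 16 V supply.
gain_at_peak_pae = implied_gain_db(pae_percent=38.0, de_percent=48.0)
print(f"implied gain at peak PAE ~ {gain_at_peak_pae:.1f} dB")
```

The implied value is several dB below the small-signal gain, which is expected because peak PAE occurs in gain compression.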

The stacked PA is measured using an uplink WCDMA signal with a chip rate of 3.84 Mcps. The adjacent channel leakage ratio (ACLR) is measured at a 5 MHz offset from the center frequency. Figure 6.8 plots the measured ACLR and DE versus output power at 1.4 GHz under two supply voltages. The PA delivers an output power of 29.2 dBm and a DE of 29.1% at an ACLR of -33 dBc when biased under $V_D = 16$ V. Figure 6.9 shows that the measured linear output power is above 26.8 dBm from 1 to 2 GHz with a WCDMA input signal at an ACLR of -33 dBc. The measurement results confirm that the stacked PA maintains good linearity across the entire bandwidth.
Figure 6.8 Measured ACLR and DE versus output power using WCDMA signal at 1.4 GHz when biased under two supply voltages of 13.5 and 16V.

Figure 6.9 Measured linear output power using WCDMA signals under two supply voltages of 13.5 and 16V.

Figure 6.10 plots the measured ACLR and DE at 1.4 GHz using an uplink 10 MHz QPSK LTE signal with a peak-to-average ratio of 7.2 dB. Under a supply voltage of 13.5 V, the measured output power and DE are 26.3 dBm and 16.7%, respectively, while satisfying the ACLR requirement of -33 dBc with an error vector magnitude (EVM) of 4.48%. Figure 6.11 plots the output spectrum of the LTE signal with 10 MHz bandwidth. At an output power of 26.3 dBm, the ACLR and ALPR achieved are -33 dBc and -59.4 dBc, respectively. As indicated by these measurements, the linearity of the PA satisfies the requirements of the WCDMA and LTE standards.

Figure 6.10 Measured ACLR and DE versus output power using LTE signal at 1.4 GHz when biased under a supply voltage of 13.5 V.

Figure 6.11 Measured output spectrum of the PA with LTE signal when biased under a supply voltage of 13.5 V.
6.4 Thermal Imaging Technique

Figure 6.12 (a) The waveform of the pulse power supply and (b) the photo of the thermal measurement system setup.

The time-domain simulation waveforms of the stacked PA confirm that the high-resistivity substrate eliminates the effect of parasitic capacitance and that the overall voltage swing is equally distributed across the stacked transistors. However, direct measurement of the voltage signal is difficult to perform at RF frequencies without interfering with the operation of the circuit. An alternative approach that indirectly observes the power dissipation across each cell through thermal imaging is proposed [49]. An even thermal distribution would be an indication of identical voltage swings across the stacked transistors, since they conduct the same current.

A thermoreflectance technique is utilized in this work to examine the spatial thermal distribution. The thermal measurement system (Microsanj NT 410A) consists of a pulse generator, pulsed LEDs, a microscope, and an image sensor. The time-domain waveforms of the pulse generator and the photo of the measurement setup are shown in Figure 6.12. Accurate temperature can be extracted by monitoring the change in surface reflection coefficient between hot and cold frames of the thermal images. A temperature resolution of ~0.1 °C has been demonstrated in [50].
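The temperature extraction can be sketched with the usual first-order thermoreflectance relation, in which the relative change in reflectance is proportional to the temperature change: $\Delta R / R = C_{TR}\,\Delta T$. The calibration coefficient `C_TR` below is a hypothetical value of a plausible order of magnitude; in practice it depends on the surface material and the illumination wavelength and must be calibrated. The function name and the reflectance change are likewise illustrative.

```python
def temperature_rise(delta_r_over_r, c_tr):
    """Temperature rise (K) from the relative reflectance change."""
    return delta_r_over_r / c_tr


C_TR = 1e-4   # 1/K, hypothetical thermoreflectance coefficient

# Hypothetical relative reflectance change between hot and cold frames.
rise = temperature_rise(delta_r_over_r=2e-3, c_tr=C_TR)
print(f"temperature rise ~ {rise:.1f} K")

# Reflectance change that would correspond to a 0.1 K temperature step.
min_signal = 0.1 * C_TR
print(f"relative reflectance change for 0.1 K: {min_signal:.1e}")
```

With a coefficient of this order, resolving 0.1 °C requires detecting relative reflectance changes near $10^{-5}$, which is why the measurement averages many pulsed hot and cold frames.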

Figure 6.13 The thermal image at an output power of 30 dBm and the temperature profile across each stacked cell under DC and RF excitations with different output power levels.
The thermal measurements are performed at DC with no input RF power applied as well as at different output power levels (10 dBm, 20 dBm and 30 dBm). The thermal image measured at output power of 30 dBm is shown in Figure 6.13(a). Figure 6.13(b) plots the distribution of average temperature across each transistor in the stacked PA for both DC and RF excitations. Under DC excitation, there is a slight variation of the average temperature across each transistor. The small temperature difference within each Cascode cell is likely due to the fact that Common Source transistors in the Cascode cell have slightly larger drain-source voltage compared to Common Gate transistors (all transistors have the same DC drain current). Transistors 1 and 8 (and transistors 2 and 7 to some degree) are slightly cooler than others due to their vicinity to the probe pads and probes, which help remove the heat from the stacked PA. Note that the Sapphire substrate is not a very good heat conductor, thus the cooling effect of probes is observed. When the PA is excited with an input RF signal, the shape of the thermal distribution across the PA remains more or less the same with a slight hump around the center of the stack. Even under high output RF power of 30 dBm, the shape of the temperature distribution is exactly the same as other cases with average temperature.
6.5 Summary

Table 6.1 Performance comparison of high power linear amplifiers

<table>
<thead>
<tr>
<th>Ref</th>
<th>Technology</th>
<th>Frequency (GHz)</th>
<th>$P_{\text{SAT}}$ (dBm)</th>
<th>Peak PAE (%)</th>
<th>Supply Voltage (V)</th>
<th>WCDMA $P_{\text{OUT}}$ (dBm)</th>
<th>LTE/WLAN $P_{\text{OUT}}$ (dBm)</th>
</tr>
</thead>
<tbody>
<tr>
<td>[51]</td>
<td>CMOS 130 nm</td>
<td>1.85</td>
<td>32</td>
<td>15.3</td>
<td>5.5</td>
<td>28 @ -38.7 dBc ACLR</td>
<td>LTE 24.9 @ -34.9 dBc ACLR</td>
</tr>
<tr>
<td>[52]</td>
<td>CMOS 180 nm</td>
<td>1.95</td>
<td>30.5</td>
<td>42.1</td>
<td>3.4</td>
<td>28 @ -35 dBc ACLR</td>
<td>N/A</td>
</tr>
<tr>
<td>[53]</td>
<td>CMOS 180 nm</td>
<td>2.4</td>
<td>34</td>
<td>34.9</td>
<td>3.3</td>
<td>N/A</td>
<td>WLAN 23.5 @ -25 dB EVM</td>
</tr>
<tr>
<td>[47]</td>
<td>CMOS 90 nm</td>
<td>0.93</td>
<td>29.7</td>
<td>25.8</td>
<td>2</td>
<td>N/A</td>
<td>LTE w/DPD 26 @ -25 dB EVM</td>
</tr>
<tr>
<td>[54]</td>
<td>0.8μm GaN</td>
<td>2.25-3.075</td>
<td>24-27</td>
<td>DE 50-58</td>
<td>15</td>
<td>N/A</td>
<td>N/A</td>
</tr>
<tr>
<td>[1]</td>
<td>CMOS 45 nm</td>
<td>1.5-2.4</td>
<td>30.2</td>
<td>23.8</td>
<td>16</td>
<td>25.1 @ -40.6 dBc ACLR</td>
<td>N/A</td>
</tr>
<tr>
<td>[5]</td>
<td>CMOS SOI 130 nm</td>
<td>1.9</td>
<td>32.4</td>
<td>47</td>
<td>6.5</td>
<td>29.4 @ -33 dBc ACLR</td>
<td>N/A</td>
</tr>
<tr>
<td>[12]</td>
<td>CMOS SOS 0.25 μm</td>
<td>1.88</td>
<td>21</td>
<td>44</td>
<td>3.9</td>
<td>IS-95 16.3dBm @ -42 dBc</td>
<td>N/A</td>
</tr>
<tr>
<td>[55]</td>
<td>0.8μm GaN</td>
<td>1-6</td>
<td>33±0.8</td>
<td>DE 37</td>
<td>20</td>
<td>N/A</td>
<td>N/A</td>
</tr>
<tr>
<td>This work</td>
<td>CMOS SOS 0.25 μm</td>
<td>1 - 1.8*</td>
<td>34.4</td>
<td>PAE 38</td>
<td>16</td>
<td>29.2 @ -33 dBc ACLR</td>
<td>LTE 26.3 @ -33 dBc ACLR</td>
</tr>
</tbody>
</table>

Table 6.1 summarizes the performance of the presented stacked PA implemented in CMOS SOS technology in comparison with other fully-integrated PAs reported in the literature. The insulating Sapphire substrate effectively eliminates the internodal parasitic capacitance of the stacked PA, leading to identical signal amplitudes and zero phase differences across the stages of the stacked PA. The high input impedance and high optimum load impedance of the stacked PA facilitate a large operating bandwidth. The thermal imaging performed on the stacked PA verifies that the design achieves a balanced amplitude across the stacked cells. Among the PAs compared, the PA achieves the highest output power in the GHz operating range (34.4 dBm, or 2.75 W), while also achieving the widest bandwidth, delivering more than 2 W of output power from 1 to 1.8 GHz. The stacked scheme eliminates the large area required for on-chip power combiners and hence achieves a compact design with a relatively high power density (1.27 W/mm$^2$) for CMOS PAs.
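The output powers quoted in the summary above are given both in dBm and in watts. The following minimal Python sketch shows the conversion between the two units; the function names are illustrative and do not come from the dissertation.

```python
import math


def dbm_to_watts(p_dbm: float) -> float:
    """Convert a power level in dBm to watts (0 dBm = 1 mW)."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0


def watts_to_dbm(p_watts: float) -> float:
    """Convert a power level in watts to dBm."""
    return 10.0 * math.log10(p_watts * 1000.0)


if __name__ == "__main__":
    # Saturated output power of the presented PA, taken from Table 6.1.
    print(f"34.4 dBm = {dbm_to_watts(34.4):.2f} W")
    # The 2 W level used in the bandwidth statement, expressed in dBm.
    print(f"2 W = {watts_to_dbm(2.0):.2f} dBm")
```

With these definitions, the 34.4 dBm saturated output power corresponds to roughly 2.75 W, and the 2 W threshold corresponds to roughly 33 dBm.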
7. CONCLUSION

The increasing demand for high data-rate communication has recently drawn attention to the large available spectrum in the mm-wave frequency band as a potential candidate for achieving orders of magnitude higher capacity than the existing 3G and 4G standards. Implementing high-performance broadband PAs in nanoscale CMOS technology, however, remains the main challenge in achieving highly integrated SoCs.

In this dissertation, a novel topology is proposed that uses transformer-coupled, dynamically-biased stacked transistors in CMOS SOI and SOS technologies. The drain-source voltages of the stacked transistor cells are added constructively to increase the total output voltage swing. The buried-oxide layer in the SOI technology and the insulating substrate in the SOS technology allow the drain-source voltage swings of all transistors to be added constructively without causing breakdown or leakage current to the substrate. High optimum load impedances are achieved by selecting the size, number, and topology of the stacked cells, which minimizes the loss from on-chip impedance transformation networks and the associated performance degradation. Fully-integrated PAs implemented using the proposed topologies have been successfully demonstrated and achieve high linear output powers, high efficiencies, and wide operating bandwidths at RF and mm-wave frequencies.
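The scaling argument behind the stacked topology can be illustrated with a simple load-line estimate. The sketch below assumes an idealized class-A stack in which every cell contributes the same peak drain-source voltage amplitude and all cells carry the same peak RF current; the function name and the numerical values are hypothetical and are not taken from the measured designs.

```python
def stacked_pa_estimate(n_cells: int, v_peak_per_cell: float, i_peak: float):
    """Idealized load-line estimate for a stack of identical cells.

    Assumes the per-cell drain-source voltage amplitudes add in phase and
    the same RF current flows through every cell (illustrative model only).
    Returns (total peak voltage in V, optimum load in ohms, output power in W).
    """
    v_total = n_cells * v_peak_per_cell   # voltage swings add constructively
    r_opt = v_total / i_peak              # optimum load grows with the stack height
    p_out = 0.5 * v_total * i_peak        # sinusoidal output power
    return v_total, r_opt, p_out


if __name__ == "__main__":
    # Hypothetical values: 1 V peak swing per cell, 0.5 A peak current.
    for n in (1, 2, 4, 8):
        v, r, p = stacked_pa_estimate(n, 1.0, 0.5)
        print(f"N={n}: V_peak={v:.1f} V, R_opt={r:.1f} ohm, P_out={p:.2f} W")
```

Under these assumptions, both the optimum load resistance and the output power scale linearly with the number of stacked cells, which is why a taller stack can relax the on-chip impedance transformation ratio.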

Future work based on this dissertation may include the following areas:

(i) **Implement high efficiency DC-DC converters:** Although high-voltage power supplies may be readily available in base stations, radars, and satellite systems, generating high supply voltages efficiently in mobile and battery-powered applications remains challenging. Future work may include implementing high efficiency DC-DC converters integrated with the proposed stacked power amplifier.

(ii) **Modify circuit topology and architecture:** The stack topology has successfully demonstrated high performance at RF and mm-wave frequencies. Further increasing the number of stacked cells, however, is limited by several mechanisms, including the loss from additional input transformers, the input signal delay as the signal propagates through the transformer ladder, the loss from additional interconnections between stacked cells, and the $f_{\text{MAX}}$ of large-periphery transistors. These degradations, however, can be minimized by modifying the design of each transistor cell. For example, each cell can be designed with 2 or 3 CG transistors. An additional CG transistor does not require an additional transformer, and the penalty in layout area is much less severe. Another potential way to further increase the output power is to combine the stacked PA design with a power-combining architecture. High output power in the range of several tens of watts may be achieved by combining the output power of several unit stacked PAs. Nevertheless, the power combiners would require careful design to minimize the efficiency and bandwidth degradation.

(iii) **Improve efficiency at power back-off:** Linear power amplifiers often achieve their peak efficiency near the saturated output power level. Hence, PAs often suffer from low average efficiencies when transmitting signals with a high peak-to-average ratio (PAR). Various techniques such as envelope tracking (ET) and polar transmitter architectures have been proposed to improve the efficiency at power back-off. The average efficiency may be significantly improved by combining such techniques with the proposed PA. Linearization techniques such as digital pre-distortion (DPD) can also be employed in the PA measurements to satisfy the stringent ACPR and EVM requirements of communication standards with complex modulation schemes.

(iv) **Improve the substrate transfer technology:** The post-processing procedure presented in Chapter 5 demonstrated a simple technique to remove the effects of parasitic capacitances and further improve the PA performance. Additionally, the high-resistivity substrate may reduce both substrate loss and substrate coupling. So far, the technique has been performed at the chip level. A more sophisticated procedure may be developed to perform the substrate transfer at the wafer level.
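The efficiency penalty at power back-off mentioned in item (iii) can be illustrated with a textbook model. The sketch below assumes an ideal class-B amplifier with a fixed supply, whose drain efficiency is $\pi/4$ at full output and falls in proportion to the output voltage amplitude; this is an illustrative approximation, not a model of the measured PAs in this dissertation, and the function name is hypothetical.

```python
import math


def ideal_class_b_efficiency(backoff_db: float) -> float:
    """Drain efficiency of an ideal class-B PA at a given output back-off.

    Assumes a fixed supply and no knee voltage, so efficiency scales with
    the output voltage amplitude: eta = (pi / 4) * 10 ** (-backoff_db / 20).
    """
    return (math.pi / 4.0) * 10.0 ** (-backoff_db / 20.0)


if __name__ == "__main__":
    # Back-off levels chosen for illustration only.
    for backoff in (0.0, 3.0, 6.0, 10.0):
        eta = ideal_class_b_efficiency(backoff)
        print(f"{backoff:4.1f} dB back-off: efficiency = {100.0 * eta:.1f} %")
```

In this idealized model the efficiency roughly halves at 6 dB back-off, which is the motivation for supply-modulation approaches such as envelope tracking when transmitting high-PAR signals.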
LIST OF REFERENCES


VITA

Jing-Hwa Chen received the B.S. degree in Electrical Engineering from National Central University, Taiwan, in June 2007, and the Ph.D. degree in Electrical and Computer Engineering from Purdue University in December 2013. In Fall 2012, he worked as an intern engineer at Peregrine Semiconductor, responsible for power amplifier design in CMOS technology. In Fall 2013, he worked as an intern engineer at Broadcom Corporation, responsible for low-dropout regulator (LDO) design in SiGe BiCMOS technology. His research interests include analog and RF circuit design, and CMOS power amplifier design.
PUBLICATIONS

**Patent**


**Journals**


**Conference Proceedings**


