Joint Beamforming Design and Bit Allocation in Massive MIMO with Resolution-Adaptive ADCs
Abstract
Low-resolution analog-to-digital converters (ADCs) have emerged as a promising technology for reducing power consumption and complexity in massive multiple-input multiple-output (MIMO) systems while maintaining satisfactory spectral and energy efficiencies (SE/EE). In this work, we first identify the essential properties of optimal quantization and leverage them to derive a closed-form approximation of the covariance matrix of the quantization distortion. The theoretical finding facilitates the system SE analysis in the presence of low-resolution ADCs. We then focus on the joint optimization of the transmit-receive beamforming and bit allocation to maximize the SE under constraints on the transmit power and the total number of active ADC bits. To solve the resulting mixed-integer problem, we first develop an efficient beamforming design for fixed ADC resolutions. Then, we propose a low-complexity heuristic algorithm to iteratively optimize the ADC resolutions and beamforming matrices. Numerical results for a MIMO system demonstrate that the proposed design offers improvement in both SE and EE with fewer active ADC bits compared with the uniform bit allocation. Furthermore, we numerically show that receiving more data streams with low-resolution ADCs can achieve higher SE and EE compared to receiving fewer data streams with high-resolution ADCs.
Index Terms:
Beamforming, bit allocation, massive MIMO, low-resolution ADCs, spectral efficiency, energy efficiency
I Introduction
Massive multiple-input multiple-output (MIMO) is a crucial physical-layer technology for wireless communications at both sub-6GHz and millimeter wave (mmWave) frequencies [heath2016overview], addressing the increasing demand for high data rates [jiang2021road]. The large number of antenna elements in massive MIMO significantly improves spatial multiplexing gain through beamforming techniques. Digital beamforming (DBF) architectures, which deploy a dedicated radio-frequency (RF) chain for each antenna element, can enable high spectral efficiency (SE) but incur substantial energy costs due to power-intensive RF components, especially analog-to-digital converters (ADCs). For instance, a high-speed ADC operating at Gsample/s with high resolution (e.g., – bits) can consume several Watts [li2017channel]. Furthermore, its power consumption increases linearly with the signal bandwidth and exponentially with the number of resolution bits [murmann2015race, Atz21_hw], posing a significant challenge to the system’s energy efficiency (EE). Consequently, the integration of low-resolution ADCs and DBF has emerged as an effective strategy to curtail power consumption without unduly compromising the SE [liu2019low].
Another attractive solution in this regard is to utilize hybrid beamforming (HBF) architectures, where a small number of RF chains is connected to the antenna array through a network of phase shifters or switches [mendez2016hybrid, ma2021closed, ma2022switch]. However, HBF architectures have limited multiplexing capabilities and strongly depend on the calibration of the analog components [roth2018comparison]. Consequently, DBF requires less circuit cost to achieve a SE similar to HBF [yan2019performance], which makes the former more energy efficient, especially when using low-resolution ADCs [roth2018comparison, castaneda2021resolution]. The water-filling (WF) power allocation achieves the capacity of a full-resolution MIMO system with perfect channel state information (CSI) at both the transmitter and receiver [tse2005fundamentals]. However, it becomes suboptimal in the presence of quantization, necessitating a more efficient design. Furthermore, adopting resolution-adaptive ADCs can enable order-of-magnitude power savings for realistic mmWave channels [castaneda2021spawc]. These considerations motivate us to focus on the design and analysis of fully digital architectures with resolution-adaptive ADCs.
I-A Prior Works
Recent years have witnessed a proliferation of studies on low-resolution massive MIMO transceivers, exploring various quantization techniques including one-bit, multi-bit, mixed-bit, and variable-bit quantization. One-bit quantized systems have been extensively investigated in the literature due to their simplicity and tractability [singh2009limits, mo2015capacity, mezghani2008analysis, mezghani2007ultra, li2017channel, atzeni2021channel]. Specifically, Mo et al. [mo2015capacity] derived the exact channel capacity with perfect CSI at both the transmitter and receiver of a multi-input single-output system. It was shown in [mezghani2008analysis] that with only receive CSI, quadrature phase-shift keying (QPSK) signaling is the capacity-achieving distribution in single-input single-output systems, unlike in the full-resolution case where a Gaussian codebook is optimal. At low signal-to-noise ratio (SNR), the mutual information of a one-bit quantized MIMO system decreases by a factor of compared with a full-resolution one [mezghani2007ultra]. Although one-bit quantization has low power consumption and hardware cost, it significantly limits the SE performance [orhan2015low, li2017channel]. In this regard, it was shown in [li2017channel] that an error floor exists at high SNR for the channel estimator due to coarse quantization and that at least – times the number of antennas is required to attain an SE comparable to that of a full-resolution system.
The limitations of one-bit quantization have sparked widespread interest and research on low-resolution systems with multi-bit (– bits) quantization. It was shown in [singh2009limits, jacobsson2017throughput] that a system using very few bits can approach the performance of a full-resolution one. Mezghani et al. [mezghani2012capacity] derived a closed-form lower bound for the capacity of a point-to-point MIMO system. More recent works focused on beamforming designs [mezghani2009transmit, jacobsson2017quantized, ling2019performance]. Furthermore, mixed-ADC systems, which simultaneously deploy one-bit and high-resolution ADCs, are shown to perform better than fixed-resolution architectures, especially at high SNR [zhang2016mixed, zhang2017performance, pirzadeh2018spectral]. On the other hand, variable-resolution ADCs have been studied in [bai2013optimization, ahmed2017joint, choi2017resolution, nguyen2020energy, prasad2020optimizing, castaneda2021resolution, castaneda2021spawc] to flexibly balance the SE-EE tradeoff of low-resolution systems. For instance, it was shown in [bai2013optimization, ahmed2017joint, choi2017resolution, nguyen2020energy] that efficient bit allocation strategies can offer a higher EE compared with uniform-resolution architectures. Castañeda et al. [castaneda2021resolution] developed a resolution-adaptive fully digital receiver within an application-specific integrated circuit (ASIC). Furthermore, they demonstrated that a -antenna base station with resolution-adaptive ADCs serving users allows to reduce the power consumption by times compared with a traditional fixed-resolution design [castaneda2021spawc]. Additionally, the gain in SE can be achieved by jointly optimizing the transmit power and ADC resolutions [prasad2020optimizing].
Many of the aforementioned works utilize the arcsine law [jacovitti1994estimation] to facilitate the analysis and design of one-bit systems. For systems with a few bits, two primary methods are used to model quantization, i.e., the additive quantization noise model (AQNM) [gersho2012vector, fletcher2007robust, orhan2015low] and the Bussgang decomposition [bussgang1952crosscorrelation]. Both approaches approximate the (nonlinear) quantization function with a linear model. However, in the literature, there are two distinct linear approximations referred to as the AQNM. The first is [gersho2012vector]
$$Q(x) = x + e, \tag{1}$$
where $x$ is the quantizer input, and $Q(\cdot)$ and $e$ denote the quantization function and the quantization error, respectively. The second is [fletcher2007robust]
$$Q(x) = \alpha x + d, \tag{2}$$
where $\alpha$ is a constant depending on the quantizers and on the distribution of $x$, and $d$ represents the quantization distortion (QD). Both (1) and (2) can be employed to analyze the worst-case system performance [diggavi2001worst, hassibi2003much] assuming that $e$ or $d$ is a Gaussian variable uncorrelated with $x$. Model (2) was first derived in [fletcher2007robust] and named AQNM later in [orhan2015low]; it was also called the pseudo-quantization noise model in [zhang2016mixed]. Although (2) and the Bussgang decomposition were developed from separate technical lineages, it was shown in [demir2020bussgang] that the former is nothing but the latter tailored for the case of quantization. Therefore, we refer to the model in (2) as the Bussgang-based AQNM (BAQNM) and to (1) simply as the AQNM for distinction. The AQNM is typically less accurate than the BAQNM because the assumption that $e$ is uncorrelated with $x$ is generally not satisfied. In contrast, $d$ is uncorrelated with $x$ by the properties of the Bussgang decomposition. Furthermore, the QD covariance is a key ingredient for the performance analysis and optimization with the BAQNM. A diagonal approximation of the QD covariance matrix was derived in [mezghani2012capacity, bai2013optimization] and has since been widely used in the literature. However, the error arising from this diagonal approximation can be substantial in some cases, raising a major concern about the reliability of the corresponding results [demir2020bussgang, prasad2020optimizing].
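To illustrate the difference between the two linear models, the following minimal Python sketch applies one-bit quantization to a zero-mean, unit-variance Gaussian input and estimates the correlation between the input and each residual term. It is only an illustration under stated assumptions: the function names `one_bit_quantizer` and `compare_models`, the sample size, and the seed are hypothetical choices, and the quantizer output levels are set to the centroids of the two half-lines.

```python
import math
import random


def one_bit_quantizer(x):
    """One-bit quantizer for a zero-mean, unit-variance Gaussian input.

    The output level sqrt(2/pi) equals E[|x|], i.e., the centroid of each half-line.
    """
    level = math.sqrt(2.0 / math.pi)
    return level if x >= 0.0 else -level


def compare_models(num_samples=200_000, seed=0):
    """Return (empirical Bussgang gain, E[x e] for the AQNM, E[x d] for the BAQNM)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    qs = [one_bit_quantizer(x) for x in xs]
    power = sum(x * x for x in xs) / num_samples            # estimate of E[x^2]
    cross = sum(x * q for x, q in zip(xs, qs)) / num_samples  # estimate of E[x Q(x)]
    alpha = cross / power                 # empirical Bussgang gain
    corr_aqnm = cross - power             # E[x e] with e = Q(x) - x
    corr_baqnm = cross - alpha * power    # E[x d] with d = Q(x) - alpha * x
    return alpha, corr_aqnm, corr_baqnm


if __name__ == "__main__":
    alpha, corr_aqnm, corr_baqnm = compare_models()
    print(f"gain = {alpha:.4f}, E[x e] = {corr_aqnm:.4f}, E[x d] = {corr_baqnm:.2e}")
```

In this setting the estimated gain should be close to $2/\pi$, the AQNM residual remains clearly correlated with the input, and the BAQNM residual is uncorrelated with it by construction of the gain.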
I-B Contributions
Previous works [mezghani2009transmit, jacobsson2017quantized, ling2019performance, bai2013optimization, choi2017resolution, nguyen2020energy] focus on either beamforming design [mezghani2009transmit, jacobsson2017quantized, ling2019performance] or bit allocation [bai2013optimization, choi2017resolution, nguyen2020energy]. The joint optimization of the two aspects is promising to achieve higher SE and provide deeper insights into the SE-EE tradeoff, as shown in [ahmed2017joint, prasad2020optimizing]. However, the transmitter design was not considered in [ahmed2017joint], whereas in [prasad2020optimizing] the receive beamforming was omitted. Unlike previous studies, this paper focuses on analyzing the BAQNM and the quantization distortion, alongside the joint design of the transmit-receive beamforming and bit allocation for point-to-point MIMO systems utilizing resolution-adaptive ADCs. The specific contributions of this paper are summarized as follows:
• We first identify the essential properties of optimal quantization. Leveraging these properties and the Bussgang decomposition, we reestablish the BAQNM and the diagonal approximation of the QD covariance matrix, offering a new perspective compared to [mezghani2012capacity, bai2013optimization]. The analysis shows that the BAQNM and the QD covariance approximation typically hold under the assumption of Gaussian signals undergoing optimal quantization. Furthermore, we examine the connections between applying BAQNM and the arcsine law to one-bit quantization. The consistency in results obtained from these two methods validates our findings.
• Building upon the above theoretical findings, we consider the joint transmit-receive beamforming design and bit allocation problem to maximize the SE subject to the constraints on the transmit power budget and total active bits of ADCs. This design problem is inherently complex due to its mixed-integer nature. We address this by first determining the beamformer under fixed ADC resolutions. Subsequently, we propose a low-complexity algorithm to iteratively optimize the ADC resolutions and the beamforming matrices.
• Extensive numerical simulations verify the superiority of the proposed schemes. Specifically, the results show that the proposed beamforming design significantly outperforms conventional WF solutions in low-resolution systems, especially with one-bit quantization and high SNR. Furthermore, the benefit from bit allocation is clearly demonstrated. For example, in a MIMO system, the proposed design offers improvement in both SE and EE, while requiring fewer active ADC bits compared with uniform bit allocation. When using a total of bits over the RF chains, the former achieves improvements of in SE and in EE compared to the latter. Moreover, the SE-EE comparison shows that receiving more data streams with low-resolution ADCs can achieve higher SE and EE than receiving fewer data streams with high-resolution ADCs.
I-C Organization and Notations
The rest of this paper is organized as follows. In Section II, we present the signal model and quantization model. The BAQNM and the approximation of the QD covariance are then derived in Section III. We then delve into the joint transmit-receive beamforming and bit allocation design. Finally, we provide simulation results and conclusions.
Scalars, vectors, and matrices are denoted by lowercase, boldface lowercase, and boldface uppercase letters, respectively. Furthermore, we use $(\cdot)^*$, $(\cdot)^T$, $(\cdot)^H$, and $(\cdot)^{-1}$ to represent the conjugate, transpose, conjugate transpose, and matrix inverse operators, respectively. $\|\cdot\|_F$ signifies the Frobenius norm for matrices. In addition, the expectation and trace operators are represented by $\mathbb{E}[\cdot]$ and $\mathrm{tr}(\cdot)$. We use $|a|$ and $|\mathbf{A}|$ to denote the absolute value of the scalar $a$ and the determinant of matrix $\mathbf{A}$, respectively. The real and imaginary part operators are denoted by $\Re\{\cdot\}$ and $\Im\{\cdot\}$, respectively. Moreover, $\mathrm{diag}(\mathbf{a})$ yields a diagonal matrix with its diagonal entries being the elements of $\mathbf{a}$, while $\mathrm{diag}(\mathbf{A})$ returns a vector with its elements being the diagonal entries of $\mathbf{A}$. Finally, we use $\mathbf{C}_{\mathbf{xy}}$ and $\mathbf{C}_{\mathbf{x}}$ to represent the cross-covariance matrix between $\mathbf{x}$ and $\mathbf{y}$ and the auto-covariance matrix of $\mathbf{x}$, respectively.
II System Model
II-A System Model
We consider a point-to-point MIMO system where a transmitter (Tx) with $N_t$ antennas communicates with a receiver (Rx) with $N_r$ antennas. We assume that the Tx is equipped with high-resolution digital-to-analog converters while low-resolution ADCs are deployed at the Rx. Let $\mathbf{s} \in \mathbb{C}^{N_s}$ ($N_s \le \min(N_t, N_r)$) be the transmitted signal vector. We assume that $\mathbf{s}$ follows the Gaussian distribution and $\mathbb{E}[\mathbf{s}\mathbf{s}^H] = \mathbf{I}$. Furthermore, let $\mathbf{F} \in \mathbb{C}^{N_t \times N_s}$ be the precoding matrix with the power constraint $\|\mathbf{F}\|_F^2 \le P_t$. Here, $P_t$ denotes the transmit power budget of the Tx. The received signal (without quantization) at the Rx can be written as
$$\mathbf{y} = \mathbf{H}\mathbf{F}\mathbf{s} + \mathbf{n}, \tag{3}$$
where $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ denotes the channel between the Tx and the Rx, and $\mathbf{n}$ denotes the additive white Gaussian noise (AWGN) vector, $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I})$, with $\sigma_n^2$ being the noise power. Here, we assume that $\mathbf{H}$ is quasi-flat during each coherence time. Furthermore, to characterize the system performance bound, we assume the availability of perfect CSI at both the Rx and the Tx [mo2015capacity, ling2019performance]. Channel estimation with adaptive-resolution ADCs was studied in [wang2022channel]. Furthermore, an ASIC receiver integrating both resolution-adaptive ADCs and a channel estimation module was developed in [castaneda2021resolution].
II-B Signal Model with Quantization
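We denote the codebook of a scalar quantizer of $b$ bits as $\mathcal{C} = \{c_0, \ldots, c_{N_q - 1}\}$, where $N_q = 2^b$ is the number of output levels of the quantizer. The set of quantization thresholds is $\mathcal{T} = \{t_0, \ldots, t_{N_q}\}$, where $t_0 = -\infty$ and $t_{N_q} = \infty$ allow inputs with arbitrary power (see Footnote 1). Let $Q(\cdot)$ denote the quantization function associated with $\mathcal{C}$ and $\mathcal{T}$. For a complex signal $x$, we have $Q(x) = Q(\Re\{x\}) + jQ(\Im\{x\})$, with $Q(\Re\{x\}) = c_i$, where $\Re\{x\} \in [t_i, t_{i+1})$ for $i \in \{0, \ldots, N_q - 1\}$. $Q(\Im\{x\})$ is obtained in a similar way.
Footnote 1: In practice, the input signal of ADCs outside the supported range can be clipped into that range, whose limit is an adjustable parameter depending on the constraints of hardware components, e.g., the automatic gain control (AGC).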
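The Bussgang decomposition applied to a vector space in the complex domain is presented in [demir2020bussgang]. Specifically, let $Q(\cdot)$ denote a scalar quantization function applied element-wise and $\mathbf{z}$ be the quantized output of $\mathbf{y}$. We can write $\mathbf{z} = Q(\mathbf{y})$ or equivalently $z_i = Q_i(y_i)$, where $z_i$ and $y_i$ denote the $i$-th element of $\mathbf{z}$ and $\mathbf{y}$, respectively; $Q_i(\cdot)$ represents the associated quantization function. For the circular-symmetric Gaussian random vector $\mathbf{y}$, the Bussgang decomposition implies
$$\mathbf{z} = \mathbf{B}\mathbf{y} + \boldsymbol{\eta}, \tag{4}$$
where $\mathbf{B} = \mathbf{C}_{\mathbf{zy}}\mathbf{C}_{\mathbf{y}}^{-1}$ denotes the Bussgang gain, and the distortion term $\boldsymbol{\eta}$ is uncorrelated with $\mathbf{y}$. In (4), $\boldsymbol{\eta}$ represents the QD vector with its covariance matrix given by
$$\mathbf{C}_{\boldsymbol{\eta}} = \mathbf{C}_{\mathbf{z}} - \mathbf{B}\mathbf{C}_{\mathbf{y}}\mathbf{B}^H. \tag{5}$$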
Furthermore, under some mild assumptions, the Bussgang gain is shown to be diagonal, as detailed in the following lemma.
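Lemma 1 ([jacobsson2017quantized, bjornson2018hardware, demir2020bussgang])
Consider a jointly circularly symmetric Gaussian random vector $\mathbf{y}$ fed into scalar quantizers. With (4) modeling the quantization, we have $\mathbf{B} = \mathrm{diag}(b_1, \ldots, b_{N_r})$, where $b_i = \mathbb{E}[Q_i(y_i) y_i^*] / \mathbb{E}[|y_i|^2]$, with $y_i$ being the $i$-th element of $\mathbf{y}$.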
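The diagonal structure stated in Lemma 1 can be checked numerically. The sketch below is a simplified, real-valued analogue rather than the complex setting of the lemma: it draws two correlated, unit-variance Gaussian variables, applies a sign quantizer to each, and estimates the gain as the cross-covariance of output and input multiplied by the inverse input covariance. The function name `bussgang_gain_one_bit`, the correlation value, the sample size, and the seed are hypothetical choices made for illustration.

```python
import math
import random


def bussgang_gain_one_bit(rho=0.6, num_samples=200_000, seed=1):
    """Estimate the 2x2 Bussgang gain of sign quantizers fed by correlated Gaussians."""
    rng = random.Random(seed)
    c_y = [[0.0, 0.0], [0.0, 0.0]]   # sample covariance of the input
    c_zy = [[0.0, 0.0], [0.0, 0.0]]  # sample cross-covariance of output and input
    for _ in range(num_samples):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y = (u, rho * u + math.sqrt(1.0 - rho * rho) * v)  # correlation rho, unit variances
        z = tuple(math.copysign(1.0, yi) for yi in y)       # one-bit outputs in {-1, +1}
        for i in range(2):
            for j in range(2):
                c_y[i][j] += y[i] * y[j] / num_samples
                c_zy[i][j] += z[i] * y[j] / num_samples
    # Closed-form inverse of the 2x2 input covariance
    det = c_y[0][0] * c_y[1][1] - c_y[0][1] * c_y[1][0]
    inv = [[c_y[1][1] / det, -c_y[0][1] / det],
           [-c_y[1][0] / det, c_y[0][0] / det]]
    # Gain = cross-covariance times inverse input covariance
    return [[sum(c_zy[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    for row in bussgang_gain_one_bit():
        print([round(entry, 3) for entry in row])
```

Although the two inputs are strongly correlated, the estimated off-diagonal entries should be close to zero, while the diagonal entries should be close to $\sqrt{2/\pi}$ for unit-level sign quantizers and unit-variance inputs.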
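Substituting (3) into (4), we obtain the quantized version of the signal received at the Rx, expressed as
$$\mathbf{z} = \mathbf{B}\mathbf{H}\mathbf{F}\mathbf{s} + \mathbf{e}, \tag{6}$$
where $\mathbf{B}$ and $\boldsymbol{\eta}$ are the Bussgang gain and the QD vector in (4), and $\mathbf{e} = \mathbf{B}\mathbf{n} + \boldsymbol{\eta}$ represents the effective noise with covariance matrix $\mathbf{C}_{\mathbf{e}} = \sigma_n^2 \mathbf{B}\mathbf{B}^H + \mathbf{C}_{\boldsymbol{\eta}}$. The post-combined signal at the Rx is expressed as
$$\hat{\mathbf{s}} = \mathbf{W}^H \mathbf{z} = \mathbf{W}^H \mathbf{B}\mathbf{H}\mathbf{F}\mathbf{s} + \mathbf{W}^H \mathbf{e}, \tag{7}$$
where $\mathbf{W} \in \mathbb{C}^{N_r \times N_s}$ denotes the combining matrix. Although $\mathbf{n}$ is Gaussian distributed, $\mathbf{e}$ does not follow a Gaussian distribution because of the non-linear quantization distortion. However, we can treat the effective noise vector $\mathbf{e}$ as a Gaussian variable and obtain a lower bound of the SE as [hassibi2003much]
$$R = \log_2 \left| \mathbf{I} + \left(\mathbf{W}^H \mathbf{C}_{\mathbf{e}} \mathbf{W}\right)^{-1} \mathbf{W}^H \mathbf{B}\mathbf{H}\mathbf{F}\mathbf{F}^H \mathbf{H}^H \mathbf{B}^H \mathbf{W} \right|. \tag{8}$$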
It is observed that the Bussgang gain and the QD covariance matrix are necessary for further analysis and optimization of the SE performance. For one-bit quantization, closed-form expressions for both can be derived based on the arcsine law [li2017channel]. However, obtaining those for multi-bit quantization is significantly more challenging. A closed-form expression of the Bussgang gain and a diagonal approximation of the QD covariance matrix were developed in [mezghani2012capacity, bai2013optimization] under the assumption that the quantizer satisfies the following properties:
(9) | |||
(10) |
where . However, the validity of these assumptions remains unclear, and thus the applicability of these results to general signal distributions and quantizers is uncertain. In the next section, we derive the BAQNM and diagonal approximation from a new perspective, aiming to clarify this uncertainty.
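To give a feel for how the Bussgang gain and the QD covariance enter the SE lower bound, the following Python sketch evaluates the bound for a single-antenna link. It rests on assumptions that are not part of the derivation above: the quantizer satisfies the centroid condition, so that the scalar gain equals one minus the distortion factor and the QD variance equals the distortion factor times the gain times the received power; the noise power is normalized to one; and the function name `se_lower_bound_siso` is hypothetical.

```python
import math


def se_lower_bound_siso(snr, gamma):
    """Gaussian-noise SE lower bound (bits/s/Hz) of a scalar quantized link.

    snr   : unquantized SNR |h|^2 P / sigma^2 on a linear scale
    gamma : distortion factor of the quantizer (0 means infinite resolution)
    """
    gain = 1.0 - gamma                        # scalar Bussgang gain (assumed form)
    signal = gain ** 2 * snr                  # useful power after quantization
    noise = gain ** 2                         # thermal noise scaled by the gain
    distortion = gamma * gain * (snr + 1.0)   # QD variance (assumed form)
    return math.log2(1.0 + signal / (noise + distortion))


if __name__ == "__main__":
    gamma_one_bit = 1.0 - 2.0 / math.pi  # distortion factor of the one-bit quantizer
    for snr_db in (0, 10, 20, 40):
        snr = 10.0 ** (snr_db / 10.0)
        print(snr_db, round(se_lower_bound_siso(snr, gamma_one_bit), 3))
```

Under these assumptions the bound reduces to the unquantized rate when the distortion factor is zero and saturates at the base-2 logarithm of the inverse distortion factor as the SNR grows, which reflects the distortion-limited regime of coarse quantization.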
III BAQNM and Approximation of the QD Covariance
In this section, we first identify the fundamental properties of optimal quantizers in Lemma 2 and Lemma 3 and then leverage them to obtain the BAQNM and the approximation of the QD covariance. Furthermore, we elaborate on the nuances between applying the BAQNM and the arcsine law to one-bit quantization.
III-A Properties of Optimal Quantizers
We first recall the definition of the optimal quantizer [max1960quantizing] below.
Definition 1 ([max1960quantizing])
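Consider a real-valued random variable $x$. Let $f(x)$ denote its probability density function (PDF), and let $Q(x)$ be its quantized approximation, where $Q(x) = c_i$ whenever $t_i \le x < t_{i+1}$. The mean square error (MSE) for the quantization can be expressed as
$$D = \mathbb{E}\left[(x - Q(x))^2\right] = \sum_{i=0}^{N_q - 1} \int_{t_i}^{t_{i+1}} (x - c_i)^2 f(x)\, dx. \tag{11}$$
The optimal quantizer is the one that minimizes $D$.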
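By setting the derivatives of the MSE in (11) with respect to the thresholds $t_i$ and the codewords $c_i$ to zero for a quantizer with $N_q$ output levels, we obtain
$$t_i = \frac{c_{i-1} + c_i}{2}, \quad i = 1, \ldots, N_q - 1, \tag{12}$$
$$c_i = \frac{\int_{t_i}^{t_{i+1}} x f(x)\, dx}{\int_{t_i}^{t_{i+1}} f(x)\, dx}, \quad i = 0, \ldots, N_q - 1, \tag{13}$$
where $f(x)$ is the PDF of the input. These are referred to as the nearest neighbor condition and the centroid condition, respectively [gersho2012vector, Chapter 6]. They are necessary for the optimal quantizer, also known as the Lloyd-Max quantizer [max1960quantizing] or the optimal non-uniform quantizer. The latter term follows from the fact that the optimal quantizer is generally non-uniform. The uniform quantizer that minimizes the MSE in (11) is referred to as the optimal uniform quantizer.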
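The two necessary conditions suggest a simple fixed-point iteration. The following Python sketch alternates the nearest neighbor and centroid updates for a zero-mean, unit-variance Gaussian input, using the closed-form Gaussian integrals for the centroids. It is a minimal illustration, not the procedure used in this paper: the function names, the uniform initialization on $[-3, 3]$, and the fixed iteration count are assumptions.

```python
import math


def gaussian_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def gaussian_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lloyd_max_gaussian(bits, iterations=500):
    """Return (thresholds, levels, mse) for a standard Gaussian input."""
    num_levels = 2 ** bits
    # Initial codebook: uniform grid on [-3, 3] (an arbitrary starting point)
    levels = [-3.0 + 6.0 * (i + 0.5) / num_levels for i in range(num_levels)]
    for _ in range(iterations):
        # Nearest neighbor condition: thresholds are midpoints of adjacent levels
        inner = [0.5 * (levels[i] + levels[i + 1]) for i in range(num_levels - 1)]
        thresholds = [-math.inf] + inner + [math.inf]
        # Centroid condition: each level is the conditional mean of its interval
        levels = [
            (gaussian_pdf(lo) - gaussian_pdf(hi)) / (gaussian_cdf(hi) - gaussian_cdf(lo))
            for lo, hi in zip(thresholds[:-1], thresholds[1:])
        ]
    probs = [gaussian_cdf(hi) - gaussian_cdf(lo)
             for lo, hi in zip(thresholds[:-1], thresholds[1:])]
    # With centroid levels, E[(x - Q(x))^2] = E[x^2] - E[Q(x)^2]
    mse = 1.0 - sum(p * c * c for p, c in zip(probs, levels))
    return thresholds, levels, mse


if __name__ == "__main__":
    for bits in (1, 2, 3):
        _, levels, mse = lloyd_max_gaussian(bits)
        print(bits, [round(c, 4) for c in levels], round(mse, 4))
```

For one and two bits, the resulting MSE values should agree with the tabulated values of about 0.3634 and 0.1175 reported for the Gaussian case in [max1960quantizing].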
Remark 1
The centroid condition requires that the output of the quantization for each interval is its mean value. This condition can also be written as [gersho2012vector]
(14) |
which was used in [fletcher2007robust] as a basic assumption for deriving the model (2). Therefore, the BAQNM is limited to the optimal quantizer.
The Lloyd-Max algorithm [max1960quantizing] iteratively updates the thresholds and the codewords based on (12) and (13) to find the optimal quantizer for a specific input signal. However, this iterative method requires a long run time, especially for high-resolution quantization. In what follows, we show how to obtain the optimal quantization without running the Lloyd-Max algorithm. To this end, we begin by identifying the fundamental properties of the optimal quantization of Gaussian signals in the following lemma.
Lemma 2
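Let $x$ be a real-valued, zero-mean, and unit-variance random variable, and let $y = \sigma x$. Then, we have
$$y_q = \sigma x_q, \tag{15}$$
$$\frac{\mathbb{E}\left[(y - y_q)^2\right]}{\mathbb{E}\left[y^2\right]} = \mathbb{E}\left[(x - x_q)^2\right] \triangleq \gamma, \tag{16}$$
where $x_q$ and $y_q$ denote the optimal quantized output of $x$ and $y$, respectively.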
Proof:
See the Appendix. ∎
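We refer to the normalized MSE $\gamma$ in (16) as the distortion factor and the properties in (15) and (16) as the scaling property and distortion invariance, respectively. Utilizing the scaling property and the optimal quantizer for the standard Gaussian signal [max1960quantizing], we can derive the optimal quantization for any Gaussian signal with a known variance. For example, we can obtain the optimal element-wise quantization of the received signal vector $\mathbf{y}$ in (3) with covariance matrix
$$\mathbf{C}_{\mathbf{y}} = \mathbf{H}\mathbf{F}\mathbf{F}^H \mathbf{H}^H + \sigma_n^2 \mathbf{I}, \tag{17}$$
where $\mathbf{H}$, $\mathbf{F}$, and $\sigma_n^2$ are the channel matrix, the precoding matrix, and the noise power, respectively.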
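The scaling property and the distortion invariance can be illustrated with a short Monte Carlo experiment. The Python sketch below starts from the tabulated 2-bit optimal quantizer for a standard Gaussian input [max1960quantizing], scales its thresholds and codewords by the standard deviation of the input, and estimates the normalized MSE. The function names, the chosen standard deviations, the sample size, and the seeds are hypothetical and serve only as an example.

```python
import bisect
import random

# Tabulated 2-bit optimal quantizer for a zero-mean, unit-variance Gaussian input
UNIT_THRESHOLDS = [-0.9816, 0.0, 0.9816]        # interior thresholds
UNIT_LEVELS = [-1.510, -0.4528, 0.4528, 1.510]  # codewords


def scaled_quantizer(sigma):
    """Scaling property: scale the unit-variance quantizer by the standard deviation."""
    return [sigma * t for t in UNIT_THRESHOLDS], [sigma * c for c in UNIT_LEVELS]


def quantize(y, thresholds, levels):
    """Map y to the codeword of the interval that contains it."""
    return levels[bisect.bisect_right(thresholds, y)]


def distortion_factor(sigma, num_samples=100_000, seed=0):
    """Estimate E[(y - y_q)^2] / E[y^2] for a zero-mean Gaussian with std sigma."""
    rng = random.Random(seed)
    thresholds, levels = scaled_quantizer(sigma)
    error_power = 0.0
    signal_power = 0.0
    for _ in range(num_samples):
        y = rng.gauss(0.0, sigma)
        error_power += (y - quantize(y, thresholds, levels)) ** 2
        signal_power += y * y
    return error_power / signal_power


if __name__ == "__main__":
    for sigma, seed in ((0.5, 1), (3.0, 2)):
        print(sigma, round(distortion_factor(sigma, seed=seed), 4))
```

Both estimates should be close to the tabulated 2-bit MSE of about 0.1175, independently of the input variance. For a complex received signal, the same scaling would be applied separately to the real and imaginary parts of each element, each carrying half of the corresponding diagonal entry of the covariance matrix.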
Regarding the distortion factor, we note the following property.
Lemma 3
For a zero-mean complex random variable with variance , assume that and are independent and identically distributed (i.i.d.) with the same variance and are independently quantized by two identical Lloyd-Max quantizers . With , we obtain
(18) | |||
(19) | |||
(20) |