BEAMS

Graduate Wireless Networks Final Project - Aditya Tummala and Arjun Damerla, Prof. Parekh

Beamforming and Enhanced Algorithms for MmWave Systems

GitHub Slides

Project Inspiration: https://ieeexplore.ieee.org/document/9024543

Abstract

Millimeter-wave (mmWave) small cell networks play an important role in 5G wireless communication systems. By densely deploying a large number of mmWave small cell base stations (SBSs), thousands of connections and high transmission rates can be supported for a variety of local services. Each SBS provides short-range communication to mobile terminals (MTs), reducing the propagation loss of signal transmission. Thanks to mmWave, multiple SBSs can each use a large number of antennas to form directional analog beams toward MTs and serve them concurrently. However, as the numbers of SBSs and MTs increase, it becomes increasingly difficult to improve performance with traditional signal processing methods.

In the original paper, the authors follow a 3-step process:

This process runs faster than traditional channel estimation algorithms on average.

For our project, we use machine learning (ML) to build a novel ML-based method for concurrent transmission in mmWave small cell networks and compare several methods with one another.

Terminology

Data Generation

The first (and typically most important) step of any ML-based project is to collect a large amount of high-quality data.

Problem is... we don't have access to this data, and most papers on beamforming do not share theirs. As a result, we had to implement our own data generation script. This meant curating a large dataset whose SBS placements follow a homogeneous Poisson point process (HPPP). Here are the system parameters:

In addition to that, we can now derive that the average number of SBSs in a circular area of radius $R$ around our MT is

$N_{\mathrm{S}}=\lfloor\lambda_{\mathrm{S}}\pi R^{2}\rfloor$
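
For example, with an SBS density of $\lambda_{\mathrm{S}} = 10^{-3}$ SBSs/m$^2$ and a radius of $R = 100$ m (values chosen purely for illustration, not our run configuration), this gives $N_{\mathrm{S}} = \lfloor 10^{-3} \cdot \pi \cdot 100^2 \rfloor = 31$ SBSs.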

Furthermore, if we define the data stream from the $k$th SBS to our MT as $d_{S, k}$, where $1 \leq k \leq N_S$, and the transmit power of our SBS as $P_{S, k}$, then our downlink signal of our SBS can be derived as

$s_{\mathrm{SBS},k} = \sqrt{P_{S, k}}\,c_{S, k}d_{S, k}, \text{ where } c_{S, k} \in \mathbb{C}^{N_{\mathrm{SBS}}\times 1}$

where $c_{S, k}$ is the $k$th SBS's analog beam (which points directly at the MT through the use of phase shifters). The channel propagation for the $k$th SBS is based on the Saleh-Valenzuela model, a narrowband clustered channel model:

$\mathbf{H}_{\mathrm{S},k}=\gamma\sum_{l=1}^{L}\alpha_{\mathrm{S},k,l}\,\mathbf{a}_{\mathrm{MT}}(\phi_{\mathrm{MT},k,l})\,[\mathbf{a}_{\mathrm{S},k}(\phi_{\mathrm{S},k,l})]^{H}$

where $\gamma = \sqrt{\frac{N_{SBS}N_{MT}}{L}}$, and $\alpha_{\mathrm{S},k,l}$ is the complex gain of the $l$th path.

Algorithm 1: mmWave Dataset Generation with Beam Selection
1. Calculate the number of SBSs:
   $N_S = \lfloor\lambda_S \pi R^2\rfloor$
2. Generate the DFT-based codebook $\mathbf{C} \in \mathbb{C}^{N_{SBS} \times N_C}$:
   For $i = 0$ to $N_C - 1$:
     $\theta_i = -\pi/2 + (i \cdot \pi)/N_C$
     $\mathbf{c}_i = [1, e^{jkD_{SBS}\sin(\theta_i)}, \ldots, e^{jkD_{SBS}(N_{SBS}-1)\sin(\theta_i)}]^T/\sqrt{N_{SBS}}$
3. Generate the path parameters for each sample:
   For each path $l = 1$ to $L$:
     $\phi_{AOA,l} \sim \mathcal{U}(-\pi/2, \pi/2)$ // angle of arrival
     $\phi_{AOD,l} \sim \mathcal{U}(-\pi/2, \pi/2)$ // angle of departure
     $\alpha_l \sim \mathcal{CN}(0, 1/\sqrt{2})$ // complex path gain
     $\mathbf{a}_{MT,l} = [1, e^{jkD_{MT}\sin(\phi_{AOA,l})}, \ldots, e^{jkD_{MT}(N_{MT}-1)\sin(\phi_{AOA,l})}]^T/\sqrt{N_{MT}}$
     $\mathbf{a}_{SBS,l} = [1, e^{jkD_{SBS}\sin(\phi_{AOD,l})}, \ldots, e^{jkD_{SBS}(N_{SBS}-1)\sin(\phi_{AOD,l})}]^T/\sqrt{N_{SBS}}$
4. Construct the complete channel matrix $\mathbf{H}$:
   $\gamma = \sqrt{N_{SBS} \cdot N_{MT}/L}$
   $\mathbf{H} = \gamma \sum_{l=1}^L \alpha_l \cdot \mathbf{a}_{MT,l} \cdot \mathbf{a}_{SBS,l}^H$
5. Calculate the optimal beam index for each channel:
   For each beam $i$ in the codebook:
     $SNR_i = (P_S \cdot \|\mathbf{H} \mathbf{c}_i\|^2)/N_0$
   $optimal\_beam = \arg\max_i(SNR_i)$
6. Construct the feature vector for each sample:
   $\mathbf{x} = [ \phi_{AOA,1}, \ldots, \phi_{AOA,L}, \quad$ // AoA features
   $\quad\quad\;\; \phi_{AOD,1}, \ldots, \phi_{AOD,L}, \quad$ // AoD features
   $\quad\quad\;\; \Re\{\alpha_1\}, \ldots, \Re\{\alpha_L\}, \quad$ // real path gains
   $\quad\quad\;\; \Im\{\alpha_1\}, \ldots, \Im\{\alpha_L\}, \quad$ // imaginary path gains
   $\quad\quad\;\; \mathrm{vec}(\Re\{\mathbf{H}\}), \mathrm{vec}(\Im\{\mathbf{H}\}) ]$ // vectorized channel matrix
Output:
   Feature matrix $\mathbf{X} \in \mathbb{R}^{N_{samples} \times N_{features}}$
   Label vector $\mathbf{y} \in \mathbb{Z}^{N_{samples}}$, where $y_i \in [0, N_C-1]$
Notes:
1. The noise power $N_0$ is set to $10^{-13}$ W
2. The channel matrix $\mathbf{H} \in \mathbb{C}^{N_{MT} \times N_{SBS}}$
3. Each codebook vector $\mathbf{c}_i$ is normalized: $\|\mathbf{c}_i\|^2 = 1$
4. The path gains $\alpha_l$ follow a complex normal distribution
5. Feature vector dimension: $N_{features} = 2L + 2L + 2(N_{MT} \cdot N_{SBS})$
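
To make Algorithm 1 concrete, here is a minimal NumPy sketch of one pass through it for a single channel sample. The parameter values below (antenna counts, path count, codebook size, powers, spacing) are illustrative placeholders, not the exact configuration from our runs.

```python
import numpy as np

# Illustrative parameters (placeholders, not our exact run configuration)
N_SBS, N_MT = 32, 4      # antennas at the SBS and at the MT
L = 3                    # number of propagation paths
N_C = 64                 # codebook size
P_S = 1.0                # transmit power (W)
N_0 = 1e-13              # noise power (W)
kD = np.pi               # k * D for half-wavelength antenna spacing

def steering(n_ant, angle):
    """Uniform-linear-array steering vector of length n_ant, normalized so ||a||^2 = 1."""
    return np.exp(1j * kD * np.arange(n_ant) * np.sin(angle)) / np.sqrt(n_ant)

# Step 2: DFT-based codebook C in C^{N_SBS x N_C}
thetas = -np.pi / 2 + np.arange(N_C) * np.pi / N_C
C = np.stack([steering(N_SBS, t) for t in thetas], axis=1)

# Step 3: draw the path parameters for one sample
aoa = np.random.uniform(-np.pi / 2, np.pi / 2, L)    # angles of arrival
aod = np.random.uniform(-np.pi / 2, np.pi / 2, L)    # angles of departure
alpha = (np.random.randn(L) + 1j * np.random.randn(L)) / np.sqrt(2)

# Step 4: Saleh-Valenzuela channel H in C^{N_MT x N_SBS}
gamma = np.sqrt(N_SBS * N_MT / L)
H = gamma * sum(alpha[l] * np.outer(steering(N_MT, aoa[l]),
                                    steering(N_SBS, aod[l]).conj())
                for l in range(L))

# Step 5: SNR of every codebook beam; the label is the argmax
snr = P_S * np.linalg.norm(H @ C, axis=0) ** 2 / N_0
optimal_beam = int(np.argmax(snr))

# Step 6: feature vector [AoA, AoD, Re(alpha), Im(alpha), vec(Re H), vec(Im H)]
x = np.concatenate([aoa, aod, alpha.real, alpha.imag,
                    H.real.ravel(), H.imag.ravel()])
```

Repeating this loop $N_{samples}$ times and stacking the $(\mathbf{x}, optimal\_beam)$ pairs yields the feature matrix $\mathbf{X}$ and label vector $\mathbf{y}$ above.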

Many of these parameters are tunable and can be adjusted to form easier or harder problems. Furthermore, we could curate a variety of channel conditions for the SBSs to form a more diverse dataset. However, we keep these parameters fixed to simplify the setup and focus on the beam selection problem.

Why use SMO?

One of the main problems we face here is that small cell base stations are very densely deployed, and their placements change with every snapshot.

This leads us to create training samples from multiple snapshots of SBSs with the same density, which increases the variability captured in our samples.

Each training sample will contain the following, assuming we have $L$ propagation paths:

We normalize our samples to prevent inconsistencies arising from differing value ranges. For instance, all angles are mapped to between $0$ and $2\pi$, all powers are measured in dB, etc.

In total, every sample will be formatted as a vector of dimension $1 \times (4L + 2)$.
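
As a small sketch of this preprocessing step (the choice of scikit-learn's MinMaxScaler here is ours; nothing above prescribes a specific scaler), assuming `X_train` and `X_test` are the sample matrices just described:

```python
from sklearn.preprocessing import MinMaxScaler

# Map every feature into a common [0, 1] range. Fit on the training
# samples only, so the test set reuses the same statistics.
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```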

If we were to use a traditional SVM, we would end up using $N_C$ separate classifiers to find hyperplanes that cleanly separate our data. However, our data can be heavily imbalanced (some beams have far fewer samples than others), which can introduce strong bias and inaccurate predictions.

Therefore, by using a data-driven iterative SVM classifier, we can address the imbalanced-data issue and achieve higher prediction accuracy for analog beam selection.

Specifically, we resample/reweight our samples at each learning step, making the process iterative and refining the classifier step by step.

Data-Driven Iterative SVM classifier

We make use of the Sequential Minimal Optimization (SMO) algorithm, which solves the quadratic programming problem that arises when training SVMs.

We first choose two candidate vectors from our set of $N_C$ candidate vectors and use them to classify our training samples into two groups.

After each classification, one of the two chosen vectors is replaced by a new candidate vector that has not been chosen before, and we repeat this process. It continues for $N_C - 1$ iterations (because by then each of our $N_C$ candidate vectors will have been chosen), with the SMO algorithm training the SVM at each iteration, as sketched below.
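
Below is a minimal sketch of how we read this tournament-style procedure, using scikit-learn's SVC (whose underlying libsvm solver is SMO-based). The class structure, the lazy pair training, and the use of `class_weight="balanced"` as the reweighting step are our interpretation for illustration, not the paper's code; it also assumes every beam index appears at least once in the training labels.

```python
import numpy as np
from sklearn.svm import SVC

class IterativeBeamSVM:
    """Tournament-style beam selection via N_C - 1 binary SVM rounds.

    Two candidate beams are compared by a binary SVM, the per-sample
    winner survives, and a fresh candidate enters the next round until
    all N_C beams have been considered.
    """

    def __init__(self, n_beams, C=1.0):
        self.n_beams = n_beams
        self.C = C
        self.pair_svms = {}   # (i, j) -> fitted binary SVC, trained lazily

    def fit(self, X, y):
        self.X_, self.y_ = X, y
        return self

    def _svm_for(self, i, j):
        """Fit (and cache) the binary SVM separating beams i and j."""
        key = (min(i, j), max(i, j))
        if key not in self.pair_svms:
            mask = np.isin(self.y_, key)
            # class_weight="balanced" is one simple reweighting step
            # against the beam-imbalance problem described above.
            svm = SVC(kernel="rbf", C=self.C, class_weight="balanced")
            svm.fit(self.X_[mask], self.y_[mask])
            self.pair_svms[key] = svm
        return self.pair_svms[key]

    def predict_one(self, x):
        winner = 0
        for challenger in range(1, self.n_beams):   # N_C - 1 rounds
            svm = self._svm_for(winner, challenger)
            winner = int(svm.predict(x.reshape(1, -1))[0])
        return winner
```

With $N_C$ beams, this makes at most $N_C - 1$ binary decisions per sample; in the worst case it trains the same pairwise classifiers as one-vs-one, but in practice only the pairs the tournament actually visits ever get fitted.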

Naive NN

The MmWaveNN is a fully connected feedforward neural network designed for beam selection in mmWave communication systems. It uses a series of linear layers, activation functions, and dropout to prevent overfitting and enhance generalization.
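
Since the exact layer sizes are not reproduced here, the following is a minimal PyTorch sketch of this kind of network; the hidden widths and dropout rate are placeholder choices, not our tuned values.

```python
import torch.nn as nn

class MmWaveNN(nn.Module):
    """Fully connected beam classifier: feature vector in, N_C beam logits out."""

    def __init__(self, n_features, n_beams, hidden=(256, 128), p_drop=0.3):
        super().__init__()
        layers, width = [], n_features
        for h in hidden:   # stack Linear -> ReLU -> Dropout blocks
            layers += [nn.Linear(width, h), nn.ReLU(), nn.Dropout(p_drop)]
            width = h
        layers.append(nn.Linear(width, n_beams))  # beam logits
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```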

Training Process

The model is trained using the following process:

Overall, this is a very naive NN with some basic features.

Advanced NN

The AdvancedMmWaveNN is a more advanced neural network model designed for beam selection in mmWave communication systems. It combines feature extraction, attention mechanisms, and residual learning to improve classification performance.
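
A minimal PyTorch sketch of how these pieces can fit together is below; the gating-style attention, the two-layer residual block, and all layer sizes are illustrative assumptions rather than the exact architecture we trained.

```python
import torch
import torch.nn as nn

class AdvancedMmWaveNN(nn.Module):
    """Sketch: feature extractor, attention gate, residual block, classifier head."""

    def __init__(self, n_features, n_beams, d=256, p_drop=0.3):
        super().__init__()
        self.extract = nn.Sequential(nn.Linear(n_features, d), nn.ReLU())
        # Attention gate: learns a per-dimension weight for the features.
        self.attn = nn.Sequential(nn.Linear(d, d), nn.Sigmoid())
        # Residual block: two linear layers wrapped by a skip connection.
        self.res = nn.Sequential(nn.Linear(d, d), nn.ReLU(),
                                 nn.Dropout(p_drop), nn.Linear(d, d))
        self.head = nn.Linear(d, n_beams)

    def forward(self, x):
        h = self.extract(x)
        h = h * self.attn(h)             # attention-weighted features
        h = torch.relu(h + self.res(h))  # residual connection
        return self.head(h)
```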

Model Components

Training Process

The training procedure includes:

Training Results

Below are the training histories for the two NN variants, the normal SVM classifier, and the data-driven iterative SVM classifier:

Figure 1: Training History for the Naive NN (Best Acc: 90.47%)
Figure 2: Training History for the Advanced NN (Best Acc: 90.89%)
Figure 3: Training History for the normal SVM classifier (Best Acc: 91.18%)
Figure 4: Training History for the Data-Driven Iterative SVM classifier (Best Acc: 94.01%)

Final Thoughts and Future Modifications

From the data, we see that the iterative data-driven SVM classifier with SMO had the best accuracy. Furthermore, this version would scale the best of the four: its iterative nature allows more candidate vectors to be added to the network (via an HPPP distribution) without the runtime blowing up. Our normal SVM implementation, since it uses the one-vs-one approach, increases dramatically in runtime with every new candidate vector that gets added.

However, one of the main hurdles in our project was generating data that stayed true to the data used in the original paper. We went through many different types of datasets before we finally landed on a script that gave us the most faithful dataset (relative to the paper's). It would have been very helpful if the paper had provided the dataset used for its simulations.

In terms of future improvements, one idea we could implement is PCA, as this would help reduce the dimensionality problem (these models juggle many features), thereby reducing runtime while keeping accuracy relatively high.

Furthermore, the paper considered many more mobile terminals than just one, resulting in a more realistic network in terms of concurrent transmission. When we attempted to use multiple MTs, runtime increased dramatically, making our models unusable. With access to more powerful GPUs for faster computation, we could extend our models to support multiple MTs.