The Federal University of Pará (UFPA), the Federal University of Southern and Southeastern Pará (UNIFESSPA) and The University of Texas (UT), Austin, invite you to participate in the ITU Artificial Intelligence/Machine Learning in 5G Challenge.
Detailed information about it can be found on the Challenge website, which includes the document “ITU AI/ML 5G Challenge – Applying AI/ML in 5G networks. A Primer”.
The companion channel estimation challenge using Raymobtime datasets is described in the “Site-specific channel estimation with hybrid MIMO architectures challenge (ML5G-PHY@NCSU [channel estimation])”.
In the subsequent sections, we present the “Machine Learning Applied to the Physical Layer of Millimeter-Wave MIMO Systems” (ML5G-PHY [beam selection]) task, which is part of the “ITU Artificial Intelligence/Machine Learning in 5G Challenge” and is based on Raymobtime datasets.
The ML5G-PHY [beam selection] task consists of selecting a set of k good beams (top-k classification) given information about the environment. Participants are encouraged to design their own feature extraction module to process the “raw” data, provided as Raymobtime datasets. Alternatively, if the participant wants to focus only on the machine learning module, he/she can use feature sets obtained with baseline feature extraction modules.
The ML5G-PHY [beam-selection] challenge assumes a millimeter-wave (mmWave) MIMO system operating at 60 GHz on the downlink. The scenario is a vehicle-to-infrastructure (V2I) network where a single transmitter (Tx) is located at a base station on the street curb and the receivers (Rx) are positioned in vehicles, with their antennas on top of the vehicle. A single-user analog MIMO architecture is adopted. The Tx and all Rx’s are equipped with uniform linear arrays (ULAs) with Nt=32 and Nr=8 antennas, respectively. The MIMO communication channel is assumed to be narrowband (it is not frequency selective). Therefore, each channel H corresponds to a complex-valued matrix of dimension Nr x Nt.
There are two pre-defined DFT (discrete Fourier transform) codebooks (sets of vectors), one for the Tx with Nt vectors and another for the Rx with Nr vectors. There is no noise involved (equivalent to an infinite signal-to-noise ratio). Among all possible 32 x 8 = 256 pairs of indices, the top-k beam-selection task is to select a subset S of k pairs (i,j) for communication, where i indexes the i-th vector of the Tx codebook and j indexes the j-th vector of the Rx codebook, such that the optimum index pair (i*,j*), the one that leads to the strongest combined channel, is contained in S. When k=1, the problem can be interpreted as conventional classification.
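As an illustration of the selection rule above, the following Python sketch ranks all 256 beam pairs for one channel matrix. It assumes that the combined channel for the pair (i,j) is |w_j^H H f_i|, with f_i and w_j taken from unit-norm DFT codebooks. The exact codebook definition, the function names and the random placeholder channel are assumptions made here for illustration only; the precise system model is the one given in the references. Only the Python standard library is used.

```python
import cmath
import math
import random

NT, NR = 32, 8  # Tx and Rx antennas, as in the challenge setup


def dft_codebook(n):
    """Return n unit-norm DFT vectors of length n (assumed codebook form)."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * cmath.exp(-2j * math.pi * m * q / n) for m in range(n)]
            for q in range(n)]


def combined_gain(h, f, w):
    """Magnitude of w^H H f for an NR x NT channel h (list of rows)."""
    total = 0j
    for r in range(len(w)):
        row_sum = sum(h[r][t] * f[t] for t in range(len(f)))
        total += w[r].conjugate() * row_sum
    return abs(total)


def rank_beam_pairs(h, tx_codebook, rx_codebook):
    """Return all (i, j) pairs sorted from strongest to weakest gain."""
    gains = {(i, j): combined_gain(h, f, w)
             for i, f in enumerate(tx_codebook)
             for j, w in enumerate(rx_codebook)}
    return sorted(gains, key=gains.get, reverse=True)


# Placeholder channel: random complex entries, NOT ray-tracing data.
rng = random.Random(0)
H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(NT)]
     for _ in range(NR)]

ranking = rank_beam_pairs(H, dft_codebook(NT), dft_codebook(NR))
best_pair = ranking[0]  # (i*, j*)
S = ranking[:10]        # an ideal top-10 subset always contains (i*, j*)
```

In the challenge itself the channel is not available at test time, so the ML model has to approximate this ranking from the sensor data.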
Details of the communication system model are provided in references [1-3] and references therein.
The ML model to be developed (e.g., a neural network or decision tree) takes input features extracted from the so-called s008 Raymobtime multimodal dataset, composed of LIDAR point clouds, images from cameras and positions of vehicles, as indicated in Figure 2. Some baseline features are provided, but the participants are encouraged to design their own feature extraction system using a single modality (e.g., LIDAR data) or fusing data from distinct modalities. The ML model must output a probability distribution (for example, using a softmax activation in the output layer) over all pairs of indices (i,j). This distribution allows one to rank the pairs and select the k with the highest probability. A top-k error is observed when the best pair (i*, j*) is not in the top-k subset suggested by the ML model.
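The ranking step described above can be sketched in a few lines of Python. This is only an illustration: the function names and the score values are hypothetical, and the classes are treated as plain indices from 0 to 255 without assuming any particular mapping to pairs (i,j).

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_classes(probs, k):
    """Indices of the k most probable classes, most probable first."""
    return sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)[:k]


def is_top_k_error(probs, true_class, k):
    """True when the best beam pair is missing from the top-k candidates."""
    return true_class not in top_k_classes(probs, k)


# Hypothetical scores for the 256 beam-pair classes of one example.
scores = [0.0] * 256
scores[37], scores[200], scores[5] = 4.0, 3.0, 2.5
probs = softmax(scores)

candidates = top_k_classes(probs, k=3)  # classes 37, 200 and 5, in that order
```

With these scores, a true class of 5 counts as an error for k=2 but not for k=3.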
We provide deep neural networks as baselines. You will find open-source implementations in Python for the baselines here.
We provide data at two distinct “levels”: the s008 Raymobtime dataset, which can be seen as raw data, and an easier-to-use set of baseline features that were extracted from s008 and are compatible with the baseline ML models.
The adopted s008 dataset can be downloaded from here and some of its main characteristics are listed at the Raymobtime site. Note that other Raymobtime datasets are available (you can use them for your research, work, etc.), but this challenge is restricted to using s008 for training (you cannot use s009 or any other dataset).
The participants can aim at improved performance by creating their own feature extraction modules (or frontends) and generating customized feature sets for the ML stage. For that, one needs to write feature extraction code (for example, to convert LIDAR point clouds into sensible features) and work directly with raw files. Note that the s008 raw files together amount to more than 100 GB (the compressed LIDAR point clouds are the largest file).
Alternatively, the participant can stick with the provided baseline features and not implement his/her own feature extraction module. The baseline features for the test set will be provided. These features will be obtained with the same baseline frontend used to generate the training set baseline features. Therefore, when sticking to the baseline features / frontend, the participant can focus on ML only, avoiding both the download of large raw files and the writing of frontend code.
The test dataset (s009) can be downloaded from here, and some of its main characteristics are listed at the Raymobtime site. Recall that one should use s009 as a “validation” set, to evaluate the model and perform model selection (choose the model hyperparameters, such as the number of neurons, layers, etc.), but not use s009 data directly for training.
For assessing beam selection we will use the top-k classification error with k=10 beam pairs (Keras, for example, reports the complementary metric as top_k_categorical_accuracy). In other words, instead of requiring the ML model to always indicate the best pair of beams, we allow it to output a list of k candidates and require only that the best pair is among them. The top-k classification was used in , while  and  adopted regular classification (equivalent to k=1).
Along with the code to reproduce the result, the participant must provide the predicted output for the test set as a CSV (comma-separated values) file. Each row corresponds to the ML output for the given input. Because there are 256 possible beam pairs, each row corresponds to a distribution over the 256 “output classes” (that is, each row has 256 real numbers between 0 and 1, and summing up to one).
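A minimal sketch of how such a file could be written and sanity-checked is shown below. The file name, the helper names and the uniform placeholder predictions are assumptions for illustration; the official requirements are those given in the submission instructions.

```python
import csv
import os
import tempfile

NUM_CLASSES = 256  # 32 Tx beams x 8 Rx beams


def validate_row(row, tol=1e-6):
    """Raise ValueError unless row is a distribution over NUM_CLASSES classes."""
    if len(row) != NUM_CLASSES:
        raise ValueError(f"expected {NUM_CLASSES} values, got {len(row)}")
    if any(p < 0.0 or p > 1.0 for p in row):
        raise ValueError("probabilities must lie in [0, 1]")
    if abs(sum(row) - 1.0) > tol:
        raise ValueError("probabilities must sum to one")


def write_predictions(path, rows):
    """Write one CSV line per test example after validating each row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for row in rows:
            validate_row(row)
            writer.writerow(f"{p:.8g}" for p in row)


# Placeholder predictions: two uniform distributions, not real model output.
rows = [[1.0 / NUM_CLASSES] * NUM_CLASSES for _ in range(2)]
out_path = os.path.join(tempfile.mkdtemp(), "beam_test_pred.csv")  # hypothetical name
write_predictions(out_path, rows)

# Read the file back to confirm its shape: one row per example, 256 columns.
with open(out_path, newline="") as handle:
    read_back = list(csv.reader(handle))
```

Validating each row before writing catches the most common mistakes (wrong number of classes, unnormalized scores) before the file is uploaded.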
The participants will be provided a test set in two distinct formats: raw datasets and baseline features. The latter is aimed at participants who do not want to explore feature extraction and will use the provided baseline features to explore distinct ML models (different deep neural network architectures, etc.). Alternatively, the participant can deal directly with the raw data and design his/her own feature extraction module. In the test stage, the participant can assume the input data uses either representation: raw data or baseline features.
The test dataset is not labeled. The dataset s009 is disjoint from the provided training set s008 and incorporates not only the data used to generate the inputs to the ML model, but also the ray-tracing data. With the ray-tracing data, the participants can generate the correct labels and assess / tune the models trained with s008. The labeled training set s008 can be split by the participant into training and validation sets, such that the validation set can be used for model selection. Alternatively, s009 can be used for validation.
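One simple way to split s008 into training and validation sets is a seeded random split of the example indices, sketched below. The 80/20 proportion, the function name and the example count are assumptions for illustration, not recommendations from the organizers.

```python
import random


def split_indices(num_examples, validation_fraction=0.2, seed=0):
    """Shuffle example indices and split them into train / validation lists."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    num_val = int(round(num_examples * validation_fraction))
    return indices[num_val:], indices[:num_val]


# 1000 is a placeholder count, not the actual size of s008.
train_idx, val_idx = split_indices(1000)
```

Fixing the seed keeps the split reproducible, which helps when the training code has to be rerun by someone else.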
In case of a draw among distinct teams with respect to top-10 classification, we will break it using top-1 classification.
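The scoring and tie-break rules can be sketched as follows. This is not the organizers' evaluation script: the function names, the toy data (4 classes and k=2 standing in for 256 classes and k=10) and the team results are hypothetical.

```python
def top_k_accuracy(prob_rows, labels, k):
    """Fraction of examples whose true class is among the k most probable."""
    hits = 0
    for probs, label in zip(prob_rows, labels):
        ranked = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


def rank_teams(results):
    """Sort team names by top-10 accuracy, breaking draws with top-1."""
    return sorted(results, key=lambda name: results[name], reverse=True)


# Toy example with 4 classes and k=2 standing in for 256 classes and k=10.
prob_rows = [[0.1, 0.6, 0.2, 0.1],
             [0.4, 0.3, 0.2, 0.1],
             [0.1, 0.2, 0.3, 0.4]]
labels = [2, 0, 0]
top2 = top_k_accuracy(prob_rows, labels, k=2)  # 2 of 3 examples are hits
top1 = top_k_accuracy(prob_rows, labels, k=1)  # 1 of 3 examples is a hit

# Hypothetical (top-10, top-1) accuracies for three made-up teams.
results = {"team_a": (0.90, 0.55), "team_b": (0.90, 0.60), "team_c": (0.85, 0.70)}
order = rank_teams(results)  # team_b wins the draw with team_a on top-1
```

Comparing the (top-10, top-1) tuples directly gives the tie-break for free, because Python compares tuples element by element.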
Models must be trained only with data included in the provided s008 datasets. Using additional data extracted from other datasets, such as s009, is not allowed, but automatic data augmentation techniques may be used. Previously trained models (for example, VGGNet, ResNet, Inception, etc.) can be incorporated into the solution only if they are publicly available and their URLs are properly cited in the submitted report.
You can participate in teams. The team members should be announced at the enrollment stage and will be considered to have an equal contribution.
The enrolled participants will receive access to upload files to a cloud storage server. Each team is required to upload (see extra information and requirements here):
- CSV file with the model output for the provided test set. This ASCII file will have Ne rows with 256 values each, where Ne is the number of examples in the test set. The ordering scheme of these 256 values is discussed here. In case the participant adopts a row-wise ordering, the organizers will try to convert to the correct ordering, but the participants are strongly encouraged to use the correct ordering.
- Their own ML model, which can consist of a neural network, decision trees, or any other model trained solely with the provided data.
- Report describing the proposed solution (up to a maximum of 3 pages), written preferably in English, although Portuguese, Spanish or French are also allowed. This report will not be disclosed, so that the participant can publish it elsewhere later if desired.
- Source code of the proposed solution for the test stage, including both frontend and ML modules in case of adopting customized input features, or only the ML module if adopting the baseline frontend.
- Source code of the proposed solution for the training stage, including both frontend and ML modules in case of adopting customized input features, or only the ML module if adopting the baseline frontend.
In summary, the provided code and ML model must allow us to reproduce the reported results for the test set s010 and also to regenerate (retrain) the ML model based on s008.
To actually rank the models we will use the s010 dataset, which will not be disclosed to the participants and was generated in a similar way to s009 and s008 (same 3D scenario, traffic pattern, etc.).
All participants of the ML5G-PHY [beam selection] task are required to register at the ITU website and also enroll their teams using the following email: email@example.com. In the email, please provide the team name, the name of each participant (recall that each one must have registered individually at the mentioned ITU website) and a contact email (if not the one used for enrollment). We will then send a confirmation email of team enrollment in less than four days.
[NEW] Extra and important information about the submission process can be found here.
[NEW] An example of submission is available here (on GitHub), in the folder “submission_baseline_example”. This example uses the baseline features (there is no customized frontend involved), and the scripts to obtain the frontend features are not included. In the folder “evaluation_example” of the same GitHub repository, one can find scripts that indicate how the submissions will be evaluated. This minimizes problems with details such as the method to reshape matrices into vectors, as discussed in the script made available here.
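To see why the reshaping method matters, the sketch below flattens the same 32 x 8 matrix of beam pairs in two ways. It assumes the matrix is indexed as (Tx index i, Rx index j); the indexing convention and the function names are assumptions made here, and the ordering actually expected by the organizers is the one defined in the linked script.

```python
NT, NR = 32, 8  # Tx and Rx codebook sizes


def row_major_index(i, j):
    """Class index when the (NT x NR) matrix is flattened row by row."""
    return i * NR + j


def column_major_index(i, j):
    """Class index when the same matrix is flattened column by column."""
    return j * NT + i


def row_to_column_major(row_major_probs):
    """Reorder a 256-value vector from row-major to column-major order."""
    reordered = [0.0] * (NT * NR)
    for i in range(NT):
        for j in range(NR):
            reordered[column_major_index(i, j)] = row_major_probs[row_major_index(i, j)]
    return reordered


# The same beam pair lands on different class indices under each convention.
pair = (3, 5)
idx_row = row_major_index(*pair)     # 3 * 8 + 5 = 29
idx_col = column_major_index(*pair)  # 5 * 32 + 3 = 163

one_hot = [0.0] * (NT * NR)
one_hot[idx_row] = 1.0
converted = row_to_column_major(one_hot)
```

A submission written with the wrong ordering would still contain valid distributions, but each probability would be attributed to a different beam pair, so checking the ordering against the evaluation scripts before uploading is worthwhile.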
Training datasets → June 15, 2020
Baseline datasets and code → June 20, 2020
Test datasets → September 30, 2020
Submission (Global round) → October 15, 2020 at 23:59 UTC-10 (https://time.is/en/UTC-10).
Award (Global round) → October 2020, to be defined by ITU
Instead of an email list, the ML5G-PHY [beam selection] problem will be discussed at https://itu-challenge.slack.com. Instructions to join the Slack channel are available at https://join.slack.com/t/itu-challenge/shared_invite/zt-eql00z05-CXelo7_aL0nHGM7xDDvTmA.
Webmasters and support
Carlos Eduardo Dias
When using Raymobtime datasets/codes or any (modified) part of them, please cite this first paper:
[1] A. Klautau, P. Batista, N. González-Prelcic, Y. Wang and R. W. Heath Jr., “5G MIMO Data for Machine Learning: Application to Beam-Selection using Deep Learning,” in Information Theory and Applications Workshop (ITA), Feb. 2018. DOI: 10.1109/ITA.2018.8503086. PDF preprint is also available here.
[2] Y. Wang, A. Klautau, M. Ribero, M. Narasimha and R. W. Heath, “MmWave Vehicular Beam Training with Situational Awareness by Machine Learning,” 2018 IEEE Globecom Workshops, Abu Dhabi, United Arab Emirates, 2018.
[3] A. Klautau, N. González-Prelcic and R. W. Heath, “LIDAR Data for Deep Learning-Based mmWave Beam-Selection,” in IEEE Wireless Communications Letters, vol. 8, no. 3, pp. 909-912, June 2019.
[4] M. Y. Takeda, A. Klautau, A. Mezghani and R. W. Heath, “MIMO Channel Estimation with Non-Ideal ADCS: Deep Learning Versus GAMP,” 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), Pittsburgh, PA, USA, 2019, pp. 1-6.