Artificial Intelligence – The Announced Revolution in Communications

Mehdi Bennis

  (1) University of Oulu, Finland

Rui Campos

  (2) INESC TEC, Faculty of Engineering of the University of Porto, Portugal

Artificial Intelligence (AI) is already changing our everyday life in multiple domains, be it in the form of personal assistants, autonomous cars, or smart texting. This powerful technology will now be embedded in the communications networks that form the nervous system of the digital world in which we are immersed. AI will indeed be crucial for next-generation networks. It will be there by design, enabling network self-management and self-control towards fully autonomous networks that operate much like other autonomous systems, such as self-driving cars and autonomous aircraft. For example, through AI the creation of so-called network slices in 5G (i.e., application-oriented virtual networks running over the physical infrastructure) will become fully autonomous, paving the way to the intent-based networking paradigm envisioned for 6G.

The need for AI in next-generation networks stems from the ever-increasing complexity of the underlying technologies, including an ever-growing number of controllable parameters whose optimization according to the networking context remains largely unexplored. Networks must also be efficient and scalable from multiple perspectives, including performance, energy consumption, and privacy. This complexity calls for holistic solutions that leverage AI techniques and the available computational capacity, whether in the cloud or at the edge, as the only viable means to self-optimize network operation dynamically and in real time.

To solve this massive scalability challenge while addressing privacy, latency, reliability, energy, and bandwidth efficiency, machine learning at the edge is of utmost importance! To date, progress in AI/Machine Learning (ML) has been driven by the availability of huge amounts of data and computing power, where a single powerful server, typically located in a centralized and remote data center, has access to a global dataset. However, the new breed of intelligent devices and mission-critical applications cannot rely on cloud-based AI/ML due to their real-time nature, intolerance to long latency, and stringent reliability requirements. This has led to a huge interest in Federated Learning (FL), a new paradigm in which training data remains stored across a large number of geographically dispersed devices.

Federated Learning (FL) is a distributed machine learning training framework, originally proposed by Google, with use cases spanning a whole range of fields and applications, such as healthcare, intelligent transportation, industrial automation, and telecommunication networks. In FL, devices periodically upload their model parameters (e.g., neural network weights) during local training to a parameter server, which performs model averaging and broadcasts the resulting global model back to all devices. FL is thus about training a high-quality centralized model in a decentralized manner: training data is unevenly distributed, never leaves the device, and every device holds only a tiny fraction of it, all while taking into account constraints in terms of bandwidth, convergence time (latency), energy, and so forth.
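To make this training loop concrete, below is a minimal sketch of federated averaging in Python with NumPy. The toy linear model, the function names, and the synchronous round structure are illustrative assumptions made for exposition, not the exact formulation of any particular FL system.

```python
import numpy as np

def local_update(weights, data, lr=0.1, epochs=5):
    # Hypothetical local training step: the device refines the global
    # weights on its private data (a toy linear model trained by
    # gradient descent on the mean squared error).
    w = weights.copy()
    X, y = data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_averaging(global_w, device_data, rounds=20):
    # Each round: the server broadcasts the global model, devices train
    # locally on data that never leaves them, and the server averages
    # the returned weights, weighted by local dataset size.
    for _ in range(rounds):
        local_ws = [local_update(global_w, d) for d in device_data]
        sizes = [len(d[1]) for d in device_data]
        total = sum(sizes)
        global_w = sum(w * (n / total) for w, n in zip(local_ws, sizes))
    return global_w

# Toy usage: three devices with unevenly sized private datasets.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
device_data = []
for n in (20, 50, 5):  # uneven data distribution across devices
    X = rng.normal(size=(n, 2))
    device_data.append((X, X @ true_w + 0.1 * rng.normal(size=n)))
print(federated_averaging(np.zeros(2), device_data))  # close to [2.0, -1.0]
```

In a real deployment, the averaging step would run at a base station or other infrastructure node, and the upload/broadcast would traverse the wireless links discussed below.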

There are several advantages to this approach:

1. local inference at the network edge is bandwidth-efficient, reduces latency, and enhances reliability. Rather than sending data to the cloud, inference runs directly on the device, and data is sent to the cloud only when additional processing is required (see the sketch after this list);

2. making inference accurate and fast is instrumental in swiftly responding to local critical events;

3. device-generated private data can be utilized for training while keeping the data local to the device; the global model is learned by aggregating locally computed updates among devices or via a parameter server (a base station, access point, or any infrastructure node).
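As a minimal illustration of the first advantage, the sketch below runs inference on-device and falls back to the cloud only for low-confidence samples. The confidence-threshold policy, the function names, and the toy model are assumptions made for exposition, not a standard API.

```python
def edge_inference(model, sample, cloud_submit, confidence_threshold=0.8):
    # Run the model on-device; escalate to the cloud only when the
    # local prediction is not confident enough (the threshold is an
    # illustrative tuning knob, not a standard value).
    label, confidence = model(sample)
    if confidence >= confidence_threshold:
        return label                 # bandwidth saved, low latency
    return cloud_submit(sample)      # rare fallback for hard samples

def toy_model(sample):
    # Hypothetical on-device model: label by sign, confidence by magnitude.
    return ("positive" if sample > 0 else "negative", min(abs(sample), 1.0))

def toy_cloud(sample):
    return "cloud-refined label"

print(edge_inference(toy_model, 0.9, toy_cloud))  # handled on-device
print(edge_inference(toy_model, 0.1, toy_cloud))  # offloaded to the cloud
```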



FL faces a myriad of challenges that must be addressed in terms of training algorithms, architectures, network topologies, and conflicting requirements. For instance, a learning model may have a million parameters (as in self-driving vehicles), so a single model update can be bandwidth-consuming, especially when thousands of IoT devices are involved. Slow devices can undermine the training process due to poor computing capabilities or high path loss towards the federating server. While vanilla FL is essentially about training a global model, adapting to local dynamics and generalizing to other tasks is key. Moreover, FL is essentially a semi-centralized solution with a single point of failure, calling for a fully distributed and scalable learning framework. Finally, since devices are resource-constrained, on-device FL must go beyond accuracy maximization and explore the entire gamut of accuracy, energy, robustness, and privacy tradeoffs.
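To illustrate how the single point of failure could be removed, the sketch below replaces the parameter server with serverless, gossip-style model mixing over a device graph. The ring topology and uniform mixing weights are illustrative assumptions, not a specific protocol from the literature.

```python
import numpy as np

def gossip_round(models, neighbors):
    # One serverless mixing round: each device averages its model with
    # its neighbors' models, so the devices drift towards consensus
    # without any central aggregator (uniform mixing weights assumed).
    return [
        sum([models[i]] + [models[j] for j in neighbors[i]])
        / (1 + len(neighbors[i]))
        for i in range(len(models))
    ]

# Toy usage: four devices on a ring, starting from diverged local models.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
models = [np.array([float(i)]) for i in range(4)]
for _ in range(10):
    models = gossip_round(models, neighbors)
print(models)  # all close to the global average, 1.5
```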

Unlike centralized AI/ML, in edge AI/ML training the data samples are generated and privately owned by each device. In this case, direct exchange of data samples may be prohibited; hence, each device's current model needs to be shared over stochastic and intermittent wireless links, subject to the devices' resource constraints and privacy guarantees. Furthermore, in contrast to centralized ML, in which the training dataset is independent and identically distributed (IID), the data is likely to be non-IID, which may hinder convergence. Owing to the distributed nature of the data, the wireless connectivity between the parameter server and the devices and/or among the devices themselves may be intermittent, bandwidth-limited, or interference-limited, hindering training convergence and overall system performance. To address such problems, devices can be jointly scheduled, only informative model updates can be transmitted instead of uploading an update at every time slot (as sketched below), peer-to-peer learning can be explored instead of relying solely on centralized topologies, and the list goes on. These are the challenges the community will continue addressing in the future.
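As a simple instance of transmitting only informative updates, a device can upload its model only when it has changed sufficiently since the last transmission. The relative-change rule and the threshold value below are hedged illustrations of this event-triggered idea, not a scheme taken from the literature.

```python
import numpy as np

def should_transmit(new_w, last_sent_w, threshold=0.05):
    # Event-triggered uplink: skip the transmission when the local model
    # has barely moved since the last upload, saving wireless bandwidth
    # (the relative-change threshold is an illustrative tuning knob).
    change = np.linalg.norm(new_w - last_sent_w)
    return change / (np.linalg.norm(last_sent_w) + 1e-12) > threshold

# Toy usage: only sufficiently large model changes reach the server.
last_sent = np.array([1.0, 1.0])
for new in (np.array([1.01, 1.0]), np.array([1.3, 0.8])):
    if should_transmit(new, last_sent):
        print("transmit", new)
        last_sent = new
    else:
        print("skip (uninformative update)")
```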

Edge AI will play a pivotal role in 5G and 6G, particularly in the distributed, autonomous management and control of the network, towards the intent-based networking paradigm. Yet cloud AI, and the upcoming quantum AI, will remain crucial to complement edge AI and to solve problems that we cannot solve today. To that end, a holistic approach is needed, one that does not focus exclusively on maximizing learning accuracy and prediction quality, but also considers the impact on energy efficiency, resiliency, privacy, scalability, and accessibility. This will be of utmost importance to bring new impactful players to AI/ML applied to communications networks and to reduce the time and energy consumed in training AI models.