Cognitive network slice management

The SliceNet approach

                               

Salvatore Spadaro, Technical University of Catalonia, spadaro(at)tsc.upc.edu
Kenneth Nagin, IBM Israel, nagin(at)il.ibm.com

One of the enablers of 5G and beyond networks is the provisioning of network slices with proper Quality of Experience (QoE) guarantees to meet the requirements of vertical use cases. Providing such guarantees poses several challenges for the management of network slices. These challenges are particularly acute for multi-domain 5G slices, in which several network service providers must cooperate to provision an end-to-end slice.

Traditional manual network management techniques are not adequate for handling the demands of these more complex and time-sensitive scenarios. It is thus mandatory for network slice providers to adopt cognitive network slice management that leverages machine-learning techniques to proactively maintain the network infrastructures and assure the end-to-end QoE of slice users.

Architecture for machine learning-aided slice management

5G networks have undergone a major paradigm shift by adopting softwarization, virtualization and cloud computing technologies. While this new paradigm leads to important benefits, such as reduced capital expenditure, reduced operating expense and improved flexibility, managing such networks involves major technical challenges due to the significant complexity introduced. Among others, the challenges include estimating QoE Key Performance Indicators (KPIs) from monitored metrics and determining the reconfiguration operations (remedial actuations) required to support and maintain the desired quality levels. Consequently, conventional network management models dominated by human intervention have become prohibitively expensive and even unviable in many cases. The new trend in managing 5G networks, and softwarized/virtualized networks in general, is to leverage Artificial Intelligence (AI) and Machine Learning (ML) techniques to achieve network automation and autonomous network management.

In this context, the 5G PPP project SliceNet [1] has focused on the research and development of AI/ML-based network management for 5G networks, to meet the challenges presented by 5G use cases such as Smart Grid, eHealth, Smart City, and others [2].

SliceNet is devoted to the provisioning and management of network slices with QoE guarantees on top of a shared 5G network infrastructure. It contemplates a multi-role scenario in which multiple Network Service Providers (NSPs) offer their infrastructure capabilities to the upper-layer Digital Service Provider (DSP). The DSP interfaces with the verticals and brokers among the NSPs in order to materialize the end-to-end (E2E) network services that the verticals request. To this end, SliceNet has defined a Cognition Plane, a framework that gives 5G control and management systems QoE awareness in slice provisioning and life-cycle management. In particular, it covers the functions needed to monitor, estimate and predict relevant metrics that affect the QoE of the provisioned network services, as well as the functions and modules that govern runtime (re-)configurations of the underlying physical and virtual infrastructure to maintain the required QoE levels.

Cognition Plane

In general, the Cognition Plane follows the monitor-analyse-plan-execute process over a shared knowledge base (MAPE-K [3]) for automated and autonomic network management. In particular, it has been designed to support machine learning in the monitoring and analysis steps, as well as in the creation of new knowledge (see figure). QoE monitoring separates the acquisition of monitoring data from its processing, which transforms the data into slice QoE metrics. The analysis step uses the acquired knowledge to assess the slice QoE and the possible impact of corrective actions. The planning and execution steps (called Actuation Framework in the figure) are governed through a Policy Framework, which states declarative rather than imperative rules that need to be applied to maintain the expected QoE levels. All in all, the Cognition Plane defines a holistic solution that leverages machine learning for the optimised management of slices that support 5G vertical services.
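To make the loop concrete, the following is a minimal sketch of one MAPE-K iteration, not SliceNet code. The class and function names, the latency-to-score mapping, the 1 to 5 QoE scale and the target value are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Shared store read and written by every phase (hypothetical)."""
    samples: list = field(default_factory=list)  # slice QoE scores
    qoe_target: float = 4.0                      # assumed target, 1-5 scale


def monitor(raw_latency_ms: float, kb: KnowledgeBase) -> None:
    # Monitor: transform a raw metric into a slice QoE score.
    # The linear mapping below is a toy assumption.
    score = max(1.0, min(5.0, 5.0 - raw_latency_ms / 50.0))
    kb.samples.append(score)


def analyse(kb: KnowledgeBase) -> bool:
    # Analyse: report a violation when the mean of the last three
    # scores falls below the target held in the knowledge base.
    recent = kb.samples[-3:]
    return sum(recent) / len(recent) < kb.qoe_target


def plan(violation: bool) -> list:
    # Plan: a declarative outcome that names the goal, not the procedure.
    return ["restore_qoe"] if violation else []


def execute(actions: list) -> list:
    # Execute: a real system would hand these requests to an orchestrator.
    return [f"requested:{action}" for action in actions]


def mape_k_step(raw_latency_ms: float, kb: KnowledgeBase) -> list:
    """Run one monitor-analyse-plan-execute pass over the shared knowledge."""
    monitor(raw_latency_ms, kb)
    return execute(plan(analyse(kb)))
```

Each phase communicates only through the knowledge base or through its return value, which is the property that lets the monitoring and analysis functions be replaced by ML models without touching planning and execution.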


The figure depicts a schematic of the Cognition Plane along with its main relationships with other SliceNet sub-systems.

The central piece of the Cognition Plane is the data acquisition system. In this regard, a Data Lake approach is followed [4], in which the Data Lake acts as the Knowledge-base (KB) of the MAPE-K loop. All data sources are logically merged into one data store, and analysis outcomes are shared through it. Monitoring functions extract data from the underlying 5G infrastructure through the capabilities of the SliceNet Control Plane (NSP level); the data is then normalized and aggregated for storage in the shared Data Lake (DSP level). This paradigm is used at the NSP level to manage the slices deployed on its underlying infrastructure and to offer a Network Slice as a Service (NSaaS) towards the DSP level. The DSP, in turn, constructs and manages E2E slices and offers an NSaaS to its vertical customers. Infrastructure metrics are collected and persisted to support traditional monitoring, ML model training and the extraction of Quality of Service (QoS) metrics.
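The normalization and aggregation step can be sketched as follows. This is an illustrative sketch only: the record layouts, field names, source labels and the DataLake class are assumptions, since the article does not specify the schema used in SliceNet.

```python
from statistics import mean

# Hypothetical raw records from two NSPs with different field names and units.
NSP_A_RECORDS = [{"slice": "s1", "latency_ms": 12.0},
                 {"slice": "s1", "latency_ms": 18.0}]
NSP_B_RECORDS = [{"slice_id": "s1", "latency_s": 0.030}]


def normalize(record: dict, source: str) -> dict:
    """Map a source-specific record onto one common schema (assumed)."""
    if source == "nsp_a":
        return {"slice": record["slice"],
                "latency_ms": record["latency_ms"],
                "source": source}
    if source == "nsp_b":
        # Convert seconds to milliseconds so all records share one unit.
        return {"slice": record["slice_id"],
                "latency_ms": record["latency_s"] * 1000.0,
                "source": source}
    raise ValueError(f"unknown source: {source}")


class DataLake:
    """Single logical store for all normalized records (in-memory stand-in)."""

    def __init__(self) -> None:
        self._records = []

    def ingest(self, records: list, source: str) -> None:
        # Normalize on the way in, so consumers never see source formats.
        self._records.extend(normalize(r, source) for r in records)

    def mean_latency_ms(self, slice_id: str) -> float:
        # Aggregate one slice metric across every contributing source.
        return mean(r["latency_ms"] for r in self._records
                    if r["slice"] == slice_id)
```

Normalizing at ingestion time is one possible design choice; it keeps every consumer of the store independent of how each NSP reports its metrics.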

External feedback from verticals

Moreover, the Cognition Plane supports a feedback mechanism that allows the vertical customers of the deployed slices to express their experience with the provisioned infrastructure. This support for external feedback from the verticals is among the innovations of the SliceNet project, and it is a step towards including the verticals in the overall management and control of deployed slices. In fact, this vertical feedback approach is under discussion with the ITU-T Focus Group on Machine Learning for Future Networks including 5G.

The feedback from the verticals is combined with internal QoS metrics to allow the data processing applications to assume the role of QoE sensors, learning and estimating the verticals’ perception. Data-operations applications may be deployed for each slice to filter relevant data, aggregate slice metrics, etc. As such, flexible QoE sensors may be employed, ranging from simple aggregation and transformation tasks to inference of elaborate ML models.
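As a hedged illustration of a QoE sensor, the sketch below learns a linear mapping from one internal QoS metric to the score reported by the vertical, then uses it to estimate QoE for new measurements. The sample data, the choice of latency as the only input, the 1 to 5 scale and the least-squares fit are assumptions; the article does not state which models SliceNet uses.

```python
def fit_line(xs: list, ys: list) -> tuple:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training pairs: measured latency (ms) alongside the
# feedback score (1-5) that the vertical gave for the same period.
LATENCY_MS = [10.0, 20.0, 30.0, 40.0]
VERTICAL_FEEDBACK = [5.0, 4.0, 3.0, 2.0]

SLOPE, INTERCEPT = fit_line(LATENCY_MS, VERTICAL_FEEDBACK)


def estimate_qoe(latency_ms: float) -> float:
    """Estimate the vertical's perceived QoE from a QoS metric alone."""
    raw = SLOPE * latency_ms + INTERCEPT
    return max(1.0, min(5.0, raw))  # clamp to the assumed 1-5 scale
```

Once fitted, such a sensor can estimate the vertical's perception continuously, even in periods for which the vertical supplied no feedback.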

The Data Lake enables the loose coupling of the analysis and reaction functions. The Analyzer component holds all the trained ML models used to gain insights about the underlying physical/virtual infrastructure and the provisioned slices. The multiple ML models implement analytical functions that serve as advanced monitoring functions for the Actuation Framework. As such, the ML models poll the data stored at the centralized Data Lake and, after their analysis and learning, insert their insights and predictions as elaborated data back to the Data Lake.

Actuation Framework

This elaborated data is then employed as the stimulus for the Actuation Framework. The Actuation Framework is the part of the Cognition Plane responsible for planning the required (re-)configurations of the network infrastructure, as well as the deployment of new elements and functions on the configured services, to remedy undesired situations such as faults or underperformance. To this end, the Actuation Framework determines the changes to E2E slices required to support the verticals’ QoE and communicates the required (re-)configurations to the SliceNet Orchestration Plane, which in turn enforces all desired actions on the SliceNet Control Plane.

The Actuation Framework is implemented through two main components: i) a Policy Framework, which implements rules that define what actions are executed in response to system and network slice events. Policies follow the Event-Condition-Action (ECA) approach (e.g. [5]), which indicates what actions must be enforced for which events and conditions. These policies are then disseminated to multiple decision points across the layered infrastructure; ii) a QoE Optimizer component, one per deployed end-to-end slice, that is responsible for all (re-)configurations necessary to maintain the QoE of a specific E2E slice. Thus, given the rules specified by the Policy Framework and the monitoring data gathered from the Data Lake (raw monitoring, ML model outputs or verticals’ feedback), the QoE Optimizer triggers the desired remedial actuations by engaging with the Orchestration Plane. Such an approach allows for a clean separation of responsibilities: the Analyzer defines “when” the QoE Optimizer is triggered, the QoE Optimizer determines “what” must be done, and the Orchestration Plane determines “how” to reconfigure.
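The interplay between ECA policies and a per-slice QoE Optimizer can be sketched as below. This is a simplified illustration under stated assumptions: the event names, the threshold, the action names and the callable standing in for the Orchestration Plane are all hypothetical and are not taken from SliceNet or from the YANG model in [5].

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EcaPolicy:
    """Declarative rule: on `event`, if `condition` holds, request `action`."""
    event: str
    condition: Callable[[dict], bool]
    action: str


# Hypothetical policies; names and the 3.0 threshold are assumptions.
POLICIES = [
    EcaPolicy("qoe_prediction", lambda d: d["predicted_qoe"] < 3.0,
              "scale_out_vnf"),
    EcaPolicy("link_fault", lambda d: True, "reroute_traffic"),
]


class QoeOptimizer:
    """One instance per E2E slice; decides what to do, not how to do it."""

    def __init__(self, slice_id: str, policies: list,
                 orchestrator: Callable[[str, str], None]) -> None:
        self.slice_id = slice_id
        self.policies = policies
        self.orchestrator = orchestrator  # stand-in for the Orchestration Plane

    def on_event(self, event: str, data: dict) -> list:
        """Evaluate every matching policy and forward the resulting actions."""
        requested = []
        for policy in self.policies:
            if policy.event == event and policy.condition(data):
                # The orchestrator decides how the action is carried out.
                self.orchestrator(self.slice_id, policy.action)
                requested.append(policy.action)
        return requested
```

The optimizer holds no knowledge of how an action is carried out; it only names the action and the slice, which mirrors the split between “what” and “how” described above.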

The Cognition Plane is a flexible framework for QoE-aware management of network services that may suit the requirements of multiple roles and administrative entities. Indeed, the components of the Cognition Plane may be instantiated at both NSP and DSP levels independently, potentially articulating one MAPE-K loop per administrative role. This is possible thanks to its modularity and replicability, which allow selected components of the plane to be instantiated per administrative role.

Conclusions

This article describes the Cognition Plane within SliceNet’s management layer as an architecture that allows for the QoE/QoS ML-aided management of 5G networks and vertical services. To this end, we described the elements that constitute the phases of the full MAPE-K-based cognition loop. The main goal of the Cognition Plane is to enable the automated and QoE-aware management and control of end-to-end network slices as offered by DSPs to their vertical customers. In this regard, it is essential to provide the means to analyse the underlying network slice and its components in order to determine its quality levels, as well as to apply (re-)configurations when needed to maintain the desired quality levels.

References

[1] SliceNet website – https://slicenet.eu/

[2] SliceNet’s 5G Use Cases – https://slicenet.eu/5g-use-cases/

[3] J. O. Kephart and D. M. Chess, “The vision of autonomic computing”, Computer, vol. 36, pp. 41–50, January 2003

[4] N. Miloslavskaya, A. Tolstoy, “Big Data, Fast Data and Data Lake Concepts”, Elsevier Procedia Computer Science, vol. 88, pp. 300-305, October 2016

[5] IETF Network Modeling Working Group, “A YANG Data model for ECA Policy Management”, Internet Draft, November 2019