Saturday, November 7, 2015

Performance Management in ETSI MANO

The ETSI NFV ISG has defined an NFV architectural framework for the CSP (communication service provider) environment. The ETSI NFV architecture includes MANO (management & orchestration) components that provide VM lifecycle management capabilities. MANO consists of the VNF (virtual network function) manager, the virtualized infrastructure manager (VIM) and the traditional EMS (element management system). The VNF manager is responsible for application virtualization layer events, the VIM is responsible for virtual infrastructure layer events, and the EMS monitors application performance.
Figure 1 shows the major management segments of ETSI MANO:

Figure 1

ETSI MANO Correlation Requirement

In the NFV domain, fault and performance management functions are distributed across the EMS, VNF manager and VIM. The EMS collects application-related counters, the VNF manager collects VNF service-related counters and the VIM collects virtual and physical infrastructure-related counters.

To derive end-to-end performance insights such as those listed below, correlation among the VNF manager, EMS and VIM is essential, as shown in figure 2:
  • Call drops per VM
  • Application performance impact due to failure of a particular CPU
  • Utilization ratio of virtual CPU to physical CPU, etc.


Figure 2


Correlation challenges
In a traditional telco network, the OSS/BSS platform captures data directly from the downstream EMS. Being tightly coupled with the hardware, the EMS has an end-to-end view of the underlying application and hardware.
In an NFV environment, the application layer, VNF layer and virtual infrastructure layer are based on different technologies and thus have different monitoring systems, different measurement and analytics tools and different ownership, as shown in figure 3.

Figure 3 


Global VM ID as Correlation Key
The challenge in correlating cloud performance data (VNFM & VIM) with telecom measurements (EMS) is finding common parameters that can serve as correlation keys.
The following two attributes can be used for correlation across the NFV environment:

1) Event timestamp: the time of event occurrence
2) VM_ID (virtual machine ID): distributed in the VNFD (VNF descriptor)
To use the VM_ID as a correlation key, it must be unique across the entire NFV deployment.
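As a rough illustration (the field names, record shapes and time window below are assumptions, not part of any ETSI schema), EMS and VIM records that share the same VM_ID and fall within a small time window could be paired as follows:

from datetime import datetime, timedelta

# Hypothetical counter/alarm records collected by the EMS and the VIM.
ems_events = [
    {"vm_id": "ABCD", "ts": datetime(2015, 11, 7, 10, 0, 5), "call_drops": 12},
]
vim_events = [
    {"vm_id": "ABCD", "ts": datetime(2015, 11, 7, 10, 0, 2), "alarm": "cpu_scheduler_fault"},
]

def correlate(ems, vim, window=timedelta(minutes=5)):
    """Pair EMS and VIM records that share a VM_ID and lie within the time window."""
    matches = []
    for e in ems:
        for v in vim:
            if e["vm_id"] == v["vm_id"] and abs(e["ts"] - v["ts"]) <= window:
                matches.append((e, v))
    return matches

for ems_rec, vim_rec in correlate(ems_events, vim_events):
    print(ems_rec["vm_id"], vim_rec["alarm"], "->", ems_rec["call_drops"], "call drops")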

The CSP should enforce a policy of unique VM_IDs for the entire NFV deployment, including the NFV orchestration system, VNF on-boarding, EMS systems, SDN controller, VIM and all other involved tools and systems.

At the time of VM instantiation, the NFV orchestrator should obtain the VM_ID from global inventory management and distribute it to the NFV MANO elements and the downstream SDN controller as part of the VNFD (VNF descriptor).
As part of network policy, the NFV MANO elements should be able to change the VM_ID in scenarios such as inter/intra-host live migration, VM evacuation, etc. Thereafter, the NFV elements use the unique VM_ID throughout VM lifecycle management.
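A minimal sketch of this flow, assuming a hypothetical inventory interface and a simplified VNFD fragment (neither reflects the actual ETSI VNFD structure):

import uuid

class GlobalInventory:
    """Stand-in for global inventory management; allocated IDs are never reused."""
    def __init__(self):
        self._allocated = set()

    def allocate_vm_id(self, location: str) -> str:
        vm_id = f"{location}-{uuid.uuid4().hex[:8]}"
        self._allocated.add(vm_id)
        return vm_id

def build_vnfd_fragment(inventory: GlobalInventory, location: str) -> dict:
    # The orchestrator embeds the globally unique VM_ID in the VNFD so that the
    # VNFM, VIM, EMS and SDN controller all report against the same key.
    vm_id = inventory.allocate_vm_id(location)
    return {"vdu": {"vm_id": vm_id, "location": location}}

print(build_vnfd_fragment(GlobalInventory(), "dc1-rack3"))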
Figure 4 shows the VM_ID distribution flow; the user request can be a manual request from a dashboard or an API call from another system.
Figure 4



Use Case: VM_ID-based Fault Management Correlation

The following use case describes the need for correlation between the application EMS and the VIM to assess the impact of a physical CPU scheduler failure on VM application performance.
As shown in figure 5 (a minimal sketch of steps 1-5 follows the list):


  1. The application EMS sends call events to the analytics manager. The report IE (information element) contains VM_ID=ABCD, timestamp, Application ID=vMME, release code=Drop, etc. The analytics manager calculates the KPI and finds that call drops for this VM_ID exceed 0.1% (the KPI threshold) per hour.
  2. The VNFI manager forwards virtualization-layer and hardware-related alerts to the analytics manager.
  3. The correlation engine at the analytics manager correlates the EMS alerts with the VNFI alerts and finds that VM_ID ABCD is affected by a physical CPU scheduler fault, which is causing the increased call drops.
  4. The analytics manager coordinates with the policy manager for resolution.
  5. The policy manager forwards a rule to migrate the VM with VM_ID ABCD to a new location.
  6. The analytics manager coordinates with inventory management to get hardware details for the new VM. These include the new VM location (node, line card & VM number) and the RAM, CPU & memory details described in the VM affinity rules in the VNFD. The new VM_ID will be based on the new location.
  7. The analytics manager forwards the details to the VNFI manager.
  8. The VNFI manager instructs the hypervisor to spawn a new VM with VM_ID XYWZ.
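A minimal sketch of steps 1-5 (the 0.1% threshold and VM_ID come from the use case above; everything else, including field names, is illustrative):

DROP_THRESHOLD = 0.001  # 0.1 % call drops per hour, as in step 1

def drop_ratio(calls_attempted: int, calls_dropped: int) -> float:
    return calls_dropped / calls_attempted if calls_attempted else 0.0

def evaluate(vm_id, calls_attempted, calls_dropped, vnfi_alarms):
    """Analytics-manager logic: flag the VM and propose a migration action."""
    kpi_breached = drop_ratio(calls_attempted, calls_dropped) > DROP_THRESHOLD
    cpu_fault = any(a["vm_id"] == vm_id and a["alarm"] == "cpu_scheduler_fault"
                    for a in vnfi_alarms)
    if kpi_breached and cpu_fault:
        return {"action": "migrate_vm", "vm_id": vm_id}  # handed to the policy manager
    return {"action": "none", "vm_id": vm_id}

alarms = [{"vm_id": "ABCD", "alarm": "cpu_scheduler_fault"}]
print(evaluate("ABCD", calls_attempted=100000, calls_dropped=150, vnfi_alarms=alarms))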

Reference
  • Network Functions Virtualization (NFV); Infrastructure Overview (GS NFV-INF 001)
  • Network Functions Virtualization (NFV); Architectural Framework (GS NFV 002)
  • Network Functions Virtualization (NFV); Management and Orchestration (GS NFV-MAN 001)
  • Network Functions Virtualization (NFV); Virtual Network Functions Architecture (GS NFV-SWA 001)

This blog represents personal understanding of subject matter.

What is a VNF Silo?

A VNF (virtual network function) is a composition of one or more VMs that realizes a telecom network function on a virtualized platform.

The definitions of VM, VNF and virtual service are given in ETSI GS NFV 002, as shown in Figure 1:
Virtual Machine (VM): a virtualized computation environment that behaves very much like a physical computer/server. A VM has all the ingredients of a physical computer/server, e.g. processor, memory/storage, interfaces/ports, and is generated by a hypervisor.

Virtual Network Function (VNF): a virtualization of a network function, e.g. EPC functions such as the Mobility Management Entity (MME) and Serving/Packet Gateway (S/P-GW), and conventional network functions such as DHCP servers, firewalls, etc. VNF lifecycle events are managed by the VNF manager.

Virtual Service: a combination of VNFs forms a virtual service, e.g. virtual VoLTE, created by integrating IMS VNFs and EPC VNFs.
Figure 1:


VNF Architecture
The VNF architecture depends on the VNF provider's strategy. For example, one VNF provider may implement a VNF as a monolithic, vertically integrated single VM, while another may implement the same VNF by decomposing application functions into separate VMs, as shown in figure 2.
Figure 2:
 

Monolithic VNFs are easy to deploy since fewer VMs need to be instantiated, which is a simpler task for the NFV orchestrator. While decomposition adds complexity to VNF instantiation, it also provides an opportunity to introduce open source elements into the VNF architecture, e.g. using a NoSQL database instead of the telco application's proprietary database to preserve application state, once state persistence is decoupled from the application logic.

Decomposition brings software modularity and provides an opportunity for VNF reusability.
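As an illustration of the state-decoupling point above, the sketch below (all names are hypothetical; an in-memory dict stands in for an open NoSQL backend) shows application logic writing session state through a small storage interface rather than a proprietary database:

from abc import ABC, abstractmethod
from typing import Optional

class StateStore(ABC):
    """Persistence interface the application depends on, not a concrete database."""
    @abstractmethod
    def put(self, key: str, value: dict) -> None: ...
    @abstractmethod
    def get(self, key: str) -> Optional[dict]: ...

class InMemoryStore(StateStore):
    """Stand-in backend; a real deployment could plug in a NoSQL store instead."""
    def __init__(self):
        self._data = {}
    def put(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data.get(key)

def handle_attach(store: StateStore, session_id: str, subscriber: str):
    # Application logic preserves state only through the interface.
    store.put(session_id, {"subscriber": subscriber, "state": "ATTACHED"})

store = InMemoryStore()
handle_attach(store, "sess-42", "imsi-001010123456789")
print(store.get("sess-42"))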

Decomposed VNF Architecture
The objective in designing a VNF is to retain the software functionality while decomposing the software into manageable modular blocks and decoupling it from the hardware. Figure 3 shows one example of VNF decomposition.
Figure 3
In the legacy world, telco software components are deployed on proprietary line cards (LCs) installed in a hardware shelf, and the LCs are interconnected by backplane switches for internal communication. In the NFV world, software is deployed on virtual machines, the VMs are interconnected by a virtual switch (e.g. Open vSwitch, vRouter), and chaining the VMs forms the VNF that realizes the element's functionality, as shown in the figure above.

VNF Reusability
In the legacy world, an application's software is written for particular hardware, and reusing its components on other hardware requires time-consuming customization. As a result, the CSP's network turned into a plethora of hardware boxes, each running a specific application to offer specific functionality.

In the virtual world, VNF decomposition offers an opportunity for VNF reusability. As software becomes more modular and decoupled from hardware, its components can run on industry-standard hardware with little or no customization. This makes service deployment faster.

As shown in the figure below, the GTP handling VM is reused in the EPC core with minimal customization.
Figure 4



Further down the line, we can present software modular blocks as a service catalogue, and application developers can pick and choose the necessary functions to design an application.

VNF On-Boarding
VNF on-boarding refers to the procedures for instantiating a VNF in the cloud environment.
The following points should be considered when designing VNF on-boarding:
  1. VM instantiation flow (booting order), e.g. sequential or parallel instantiation of VMs (a sketch follows this list).
  2. Service chaining of VMs to realize the VNF functionality; the VNF architect should have a good understanding of packet traversal in the VNF chain and of each VM's functionality in order to create the service chain.
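A minimal sketch of the booting-order point, assuming a Heat (HOT) template expressed here as a Python dict for brevity; resource, image and flavor names are placeholders. Adding depends_on forces sequential instantiation, while omitting it lets the VMs come up in parallel:

import yaml  # PyYAML

template = {
    "heat_template_version": "2015-04-30",
    "resources": {
        "db_vm": {
            "type": "OS::Nova::Server",
            "properties": {"name": "db_vm", "image": "vnf-db-image", "flavor": "m1.large"},
        },
        "app_vm": {
            "type": "OS::Nova::Server",
            "depends_on": "db_vm",  # app_vm boots only after db_vm (sequential order)
            "properties": {"name": "app_vm", "image": "vnf-app-image", "flavor": "m1.medium"},
        },
    },
}
print(yaml.safe_dump(template, default_flow_style=False))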

Figure 5 shows the high-level steps for VNF on-boarding:

  1. The user logs in to the VNF catalog GUI and raises a request for a VNF.
  2. The template generator generates a Heat or TOSCA template based on the request. This is also called the VNFD (VNF descriptor), as defined by ETSI. A TOSCA template eventually needs to be converted into a Heat template (HOT).
  3. The cloud orchestrator, e.g. OpenStack, instantiates the VNF by coordinating with the virtual infrastructure manager for the compute, network and storage requirements described in the HOT. The template also defines affinity rules, such as VM placement on physical hosts for HA requirements.
  4. The orchestrator also coordinates with the VNF manager to service-chain the VNF as described in the HOT.
  5. Once the VNF is instantiated successfully with the required resources and network, the EMS configures the application hosted on the VMs.
The VNF's compute (RAM, CPU, etc.), storage, networking (vNIC ports), affinity rules (physical host selection for VMs), auto-healing mechanisms, service chaining details, etc. are prescribed in the VNF template. The cloud orchestrator instantiates the VNF based on this template, which can be a HOT (Heat Orchestration Template) or a TOSCA (Topology and Orchestration Specification for Cloud Applications) template. A TOSCA template can be converted into a HOT for OpenStack-based cloud orchestration (https://github.com/openstack/heat-translator).

VNF Silo
The VNF manager is primarily responsible for managing the VNF lifecycle. A VNF manager can be part of a VNF provider's solution, such as Contrail for Juniper products, Ericsson Cloud Manager, etc. The problem with this approach is the VNF silo.

Consider a service such as VoLTE built from VNFs of multiple providers, e.g. Ericsson, Alcatel-Lucent, Cisco and Juniper, each of which comes with its own VNF manager. This creates VNF silos, as shown in the figure below.

Figure 6

The concept of an open VNF manager is one solution, where a single VNF manager interacts with the cloud orchestrator using Heat and Tacker APIs. The open VNF manager framework includes vendor plugins to manage each vendor's VNFs. Tacker is a generic VNF manager service for OpenStack-managed clouds. More details: https://wiki.openstack.org/wiki/Tacker



Reference
ETSI GS NFV 003 V1.2.1 (2014-12): NFV Terminology
ETSI GS NFV 002 V1.2.1 (2014-12): NFV Architectural Framework
ETSI GS NFV-SWA 001 V1.1.1 (2014-12): NFV Virtual Network Functions Architecture.

(This blog represents my personal understanding of the subject matter).