Most media streaming services are composed by different virtualized processing functions such as encoding, packaging, encryption, content stitching etc. Deployment of these functions in the cloud is attractive as it enables flexibility in deployment options and resource allocation for the different functions. Yet, most of the time overprovisioning of cloud resources is necessary in order to meet demand variability. This can be costly, especially for large scale deployments. Prior art proposes resource allocation based on analytical models that minimize the costs of cloud deployments under a quality of service (QoS) constraint.
However, these models do not sufficiently capture the underlying complexity of services composed of multiple processing functions. Instead, we introduce a novel methodology based on full-stack telemetry and machine learning to profile virtualized or cloud native media processing functions individually. The basis of the approach consists of investigating 4 categories of performance metrics: throughput, anomaly, latency and entropy (TALE) in offline (stress tests) and online setups using cloud telemetry.
Machine learning is then used to profile the media processing function in the targeted cloud/NFV environment and to extract the most relevant cloud level Key Performance Indicators (KPIs) that relate to the final perceived quality and known client side performance indicators. The results enable more efficient monitoring, as only KPI related metrics need to be collected, stored and analyzed, reducing the storage and communication footprints by over 85%. In addition a detailed overview of the functions behavior was obtained, enabling optimized initial configuration and deployment, and more fine-grained dynamic online resource allocation reducing overprovisioning and avoiding function collapse. We further highlight the next steps towards cloud native carrier grade virtualized processing functions relevant for future network architectures such as in emerging 5G architectures.