Improving Unified Origin’s performance with cloud storage using Apache subrequests

February 1, 2021

As streaming service providers move to replace local storage solutions with cloud storage services, Unified Streaming is increasingly asked for best practices to optimize performance, reduce cost, and intelligently scale cloud-based video-on-demand (VOD) deployments.

To better address these needs, we developed a new feature that delivers significant performance improvements and increased flexibility for setups that rely on cloud storage services.

Historically, Unified Origin has relied on libcurl to manage all HTTP(S) requests to cloud storage. This is a robust solution, but it has one disadvantage: each request from Origin to cloud storage requires a new connection. In terms of network latency, establishing that connection is the most performance-limiting step of the request process.
After our decision in 2018 to focus support on Apache over other web servers, we saw an opportunity to optimize this approach.

One option was to update the existing libcurl implementation, but doing so posed a significant technical challenge and a costly commitment to ongoing support. We researched alternatives and found one in Apache’s internal subrequests, which can be configured to keep established connections alive between Origin and cloud storage. This reduces how often a connection has to be set up and so improves performance.
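The effect of keeping connections alive can be illustrated with a small, self-contained sketch. It is not Unified Origin or Apache code: it assumes a local stand-in server built with Python's standard library, and the names `SegmentHandler`, `count_connections` and the request path are hypothetical. The server counts how many TCP connections it accepts when a client opens a new connection per request versus when it reuses one persistent connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SegmentHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a storage backend; counts accepted TCP connections."""

    protocol_version = "HTTP/1.1"  # allows the server to keep connections alive
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # Called once per accepted TCP connection, not once per request.
        super().setup()
        with SegmentHandler.lock:
            SegmentHandler.connections += 1

    def do_GET(self):
        body = b"segment-bytes"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_connections(num_requests, reuse):
    """Issue num_requests GETs and return how many TCP connections the server accepted."""
    with SegmentHandler.lock:
        SegmentHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), SegmentHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    shared = http.client.HTTPConnection(host, port) if reuse else None
    try:
        for _ in range(num_requests):
            conn = shared if reuse else http.client.HTTPConnection(host, port)
            conn.request("GET", "/video/segment.m4s")
            conn.getresponse().read()  # drain the body so the socket can be reused
            if not reuse:
                conn.close()
    finally:
        if shared is not None:
            shared.close()
        server.shutdown()
        server.server_close()
    with SegmentHandler.lock:
        return SegmentHandler.connections


if __name__ == "__main__":
    print("new connection per request:", count_connections(5, reuse=False))
    print("one kept-alive connection: ", count_connections(5, reuse=True))
```

On a real network, each avoided connection also saves the TCP (and, for HTTPS, TLS) handshake round trips, which is where the latency benefit comes from.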

For insight into performance gains, we built a test setup to compare Apache subrequest and libcurl configurations. This article provides a summary of the setup and the observed results. A future publication will cover the research in detail.

Test Setup

The test setup we built comprised the following common elements:

Unified Origin running on an AWS EC2 instance [C5n.large]
Cloud-based storage (e.g. S3) holding the files as fragmented MP4 [CMAF]
Many clients requesting segments directly from Unified Origin

To isolate the load from Origin to the storage backend, we left out the following:

A Content Delivery Network (CDN)
Kernel-level or web-server-level tuning of the computing nodes (servers)
A shield cache between the client and Origin

Cloud Storage Configuration

We selected the Amazon instance C5n.large for Unified Origin deployment, primarily because of its high network bandwidth and its lightweight virtualization technology provided by AWS Nitro.

A C5n.large instance has 2 vCPUs (roughly 1 physical core), 5.25 GiB of RAM, and up to 25 Gbps of network bandwidth.

Methodology

The testbed included our load generation tool running on one or more cloud instances, which allowed for the emulation of requests from a large and variable number of users. We used a workflow manager to deploy the workflow configuration and run experiments in an automated manner.

In total, 250 tests were performed. In each test the load (the number of emulated users) was increased in steps of 5, 10, 20, or 50 every 5 or 10 seconds, up to a maximum of between 50 and 600 emulated users.
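As a rough illustration of such a stepped ramp, the sketch below generates the schedule for one test run. It is not our load generation tool: the function name `ramp_schedule` is hypothetical, and it assumes the first step is applied at time zero and the final step is capped at the maximum.

```python
def ramp_schedule(step_users, step_seconds, max_users):
    """Return (elapsed_seconds, active_users) pairs for a stepped load ramp.

    Assumption: the first step is applied at t=0 and the last step is
    capped so that the user count never exceeds max_users.
    """
    if step_users <= 0 or step_seconds <= 0 or max_users <= 0:
        raise ValueError("all arguments must be positive")
    schedule = []
    users, elapsed = 0, 0
    while users < max_users:
        users = min(users + step_users, max_users)
        schedule.append((elapsed, users))
        elapsed += step_seconds
    return schedule


if __name__ == "__main__":
    # Example: add 50 users every 10 seconds up to 200 users.
    for elapsed, users in ramp_schedule(50, 10, 200):
        print(f"t={elapsed:>3}s  users={users}")
```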

Performance evaluations were performed for both DASH and HLS outputs.

Media Source

Each track of the test content’s ABR ladder was packaged in a CMAF media container and stored on Amazon S3. The media assets had a frame rate of 24 fps and a total content duration of 4891 seconds.

Tests Performed

Two cache-less configurations were A/B tested:

libcurl
Subrequests

Tested Advantages of Apache Subrequests

Overall, the evaluation of Unified Origin in combination with Apache subrequests delivered gains in three key areas:

Higher throughput
Latency reduction
Higher stability of TCP connections

Throughput

Apache subrequests provided throughput gains of up to 24% in comparison to libcurl configurations, which helps alleviate causes of poor user experience such as drops in bitrate quality, repeated rebuffering, and stalled playback.

The following figure illustrates the average throughput towards the client by the four tested configurations for DASH and HLS outputs:

Response Time

Test results demonstrated a latency reduction of up to 27% for subrequests compared to libcurl.

The following figure presents an HLS client emulation of the configurations (results for DASH show a similar difference):

Note: given the load conditions of the experiment and the computing power available from the chosen instances, the average response time reported by the clients is higher than what can be achieved in a production setup.
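For readers who want to run a similar comparison on their own measurements, relative figures like the ones above can be derived as a percentage change against the libcurl baseline. The sketch below is one plausible way to do so. The function names are hypothetical, and the input values are made-up placeholders, not our measured data.

```python
def throughput_gain_pct(baseline, candidate):
    """Percentage increase of candidate throughput over the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline * 100.0


def latency_reduction_pct(baseline, candidate):
    """Percentage decrease of candidate response time relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline * 100.0


if __name__ == "__main__":
    # Placeholder numbers for illustration only (not measured results).
    libcurl_mbps, subreq_mbps = 100.0, 124.0
    libcurl_ms, subreq_ms = 200.0, 146.0
    print(f"throughput gain:   {throughput_gain_pct(libcurl_mbps, subreq_mbps):.1f}%")
    print(f"latency reduction: {latency_reduction_pct(libcurl_ms, subreq_ms):.1f}%")
```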

TCP Connections

The following figure presents the average number of active TCP connections by the tested configurations for HLS (results for DASH show a similar difference):

In this performance evaluation of Unified Origin, subrequests used fewer open TCP connections than libcurl. They also demonstrated greater stability and higher throughput towards clients. This improvement comes from subrequests keeping connections alive and reusing them.

Closing Thoughts

Taking all the results from our research into account, we observed that Apache internal subrequests provide significant performance gains over libcurl by:

Increasing throughput towards the client
Reducing the response time for client manifest and media segment requests
Reducing the number of TCP/IP connections opened by Origin
Providing higher stability of Origin
Giving more flexibility in the setup configuration
Reducing hardware resource consumption such as memory and CPU usage

Combined, these gains allow each Unified Origin to serve more simultaneous client requests and therefore, in the long term, decrease infrastructure costs. This should be true for all setups that switch to subrequests, but even greater performance gains can be achieved by fully tuning your setup. The methodologies and tuning techniques involved will be discussed in a future publication.

Overall, we advise everyone using Unified Origin to enable the new subrequests functionality. This can be done in a matter of minutes.

Apache subrequests are available from release 1.10.28 onwards. For further information on installation and configuration, please consult our documentation.
