Ensuring QoS for interactive services over the Internet by coding at rates that compensate for casca
An interesting fact about some of the most pervasive forms of digital telephony is that, instead of transmitting the raw bits derived from the digitisation of the speech signal, a reduced set of (coded) parameters representing the speech production model is transmitted and used to reconstruct the analogue speech signal at the receiver in (near) real time [Link to A 12 Week Course in Speech Coding and Recognition, on Code-Excited Linear Predictive (CELP) coders]. Such an elaborate scheme for reducing the transmission bit rate is necessitated by requirements such as making efficient use of limited network resources (e.g., spectrum in a mobile network) and reducing transmission power in battery-powered (mobile) devices. The CELP parametrisation process works by processing a latched 20 ms segment of speech whilst the next 20 ms is in the process of digitisation via sampling and quantisation. This in turn means that the received speech has an inherent lag of, say, 20 ms due to coding (at either end of the interactive application), a lag which is anecdotally imperceptible to the human ear. It is therefore entirely conceivable that an effective strategy can be adopted where a fixed lag is introduced during audio/video coding to compensate for an anticipated data transfer delay (e.g., by coding arbitrarily larger segments of the payload). If that coded information can then be transferred reliably at a predetermined minimum rate, then the unpredictability associated with statistical multiplexing (and specifically probabilistic queuing) can effectively be eliminated at the application level. Of course, such an elaborate scheme is necessitated by the need to meet Quality of Service (QoS) requirements on the Internet, particularly for current (and future) interactive applications such as voice, video and virtual reality.
Despite some initial misgivings about the suitability of the TCP-IP model for all data services [Link to the Paper: Is IP going to take over the world (of communications)], it appears that the TCP-IP model is poised to converge all of the typical communications services (voice, data and TV/streaming video, a.k.a. triple play services) on the Internet. An interesting article [Link to Vox Blog: 40 Maps that explain the Internet] describes the development of the Internet from the Advanced Research Projects Agency Network (ARPANET), which was developed to test the then-novel technology called packet switching. From an initial four nodes in 1969, ARPANET grew rapidly to around 100 nodes in 1982, at which point the military asked the computer scientists Robert Kahn and Vint Cerf to develop a new standard that would allow ARPANET to be reorganised into a decentralised “network of networks.” On January 1st 1983, ARPANET switched to the newly developed TCP-IP standard, marking the birth of the modern Internet.
In essence, packet switching allows a greater number of subscribers to share a resource such as a network link, simply by allowing queues to develop (as a simple analogy, one can imagine the difference between managing traffic via traffic lights and via roundabouts, where packet switching is the latter). In the traditional circuit switching paradigm, a 1 Mbps link can be shared by a maximum of 10 subscribers if each subscriber requires a 100 kbps data transfer rate. But if each subscriber is actively sending information only 10% of the time, and subscribers are active independently of one another, then the probability that more than 10 out of 40 subscribers are active at any given time is roughly 0.001 [Link to Computer Networking a Top Down Approach, Chapter 1]. In this case, 40 subscribers can share the same link with a high probability (roughly 0.999 or 99.9%) that the demand will not exceed the maximum 1 Mbps link capacity. In the (rare) event that the demand does exceed 1 Mbps, buffer occupancy (contention) and, in extreme cases, buffer overflow (congestion) will result. This method of probabilistically sharing a limited resource (and thus increasing its efficient use) is referred to as statistical multiplexing.
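The statistical multiplexing figure above can be checked with a few lines of Python. This is a minimal sketch which assumes that each subscriber is active independently with probability 0.1, so that the number of simultaneously active subscribers follows a binomial distribution; the function and variable names are illustrative only. Under that assumption, the probability that demand exceeds the link capacity comes out on the order of 0.001, in line with the figure quoted above.

```python
from math import comb


def prob_more_than(k, n, p):
    """P(X > k) for X ~ Binomial(n, p): more than k of n subscribers active at once."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))


link_kbps = 1000       # 1 Mbps shared link
per_user_kbps = 100    # rate needed by each active subscriber
max_simultaneous = link_kbps // per_user_kbps  # subscribers the link can carry at once

# Assumption: 40 subscribers, each independently active 10% of the time.
overload = prob_more_than(max_simultaneous, n=40, p=0.1)
print(f"P(demand exceeds link capacity) = {overload:.4f}")
```

Changing n shows how quickly the overload probability grows as more subscribers are multiplexed onto the same link.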
An important solution to the congestion problem was put forward by Van Jacobson in a 1988 publication, in which he proposed a set of algorithms that allow TCP to recover from packet loss in a congested network without collapsing the network (by which we mean a vicious cycle in which re-transmitted packets further congest the network, resulting in further losses). The only credible incarnation of this problem was reported in 1986, when the data throughput from the Lawrence Berkeley Laboratory to UC Berkeley dropped from the then prodigious 32 kbps to a truly glacial 40 bps [Link to Van Jacobson Paper on Congestion Avoidance and Control]. The resulting TCP algorithms (Reno as well as other types of TCP Foo) are not only credited with preventing a congestion collapse of the Internet to date (anecdotally, at the very least), but also demonstrably lead to extremely high and verifiable resource utilization ratios of greater than 98% in network simulations [Link to Previous Blog on over-provisioning the Backbone].
Had such an Armageddonic congestion collapse occurred since then, data transfer rates would have dropped precipitously during busy hour as a result of failures in the TCP-IP model, rather than as a result of physical limitations such as the lack of adequate network resources (specifically, router line speed or capacity). These two distinct causes of congestion are frequently (and unhelpfully) conflated when discussing performance problems arising from the convergence of services on the Internet and network congestion. As it happens, and as discussed in a previous blog [Link to Previous Blog on Solving Performance Problems], all it took to solve a major network congestion event in 2014 was for Comcast (the ISP) and Netflix (the content service provider) to reach an agreement on the (re)distribution of servers. This simple step resulted in a dramatic increase in video service quality, notably without requiring any modifications to the TCP-IP model as implemented by Netflix. The remaining paragraphs of this Blog will explain how the physical dimensions of the network are both necessary and sufficient for ensuring service reliability in the TCP-IP model.
Given that the distributed content or “pull model” dominates the Internet, with streaming services such as Netflix and YouTube accounting for 70% of busy hour traffic in the US as of 2015 [Link to the VB Blog with Sandvine Analysis], it is unsurprising that many comments received in response to a Federal Communications Commission (FCC) Public Notice on defining broadband [Link to FCC Document, released August 20, 2009] described data transfer speed as one of the defining characteristics of broadband. It is the author’s opinion that ensuring the reliability of Internet data transfer rates (speeds) has (until recently) been made difficult by the complexity of the analysis required when many TCP connections share a bottleneck link [Link to the Book on Internet Congestion by Subir Varma, Chapter 2]. This is an important analytical consideration, as the overarching objective for distributed content is that access to remote content (e.g., a TV series episode via the Open Connect Netflix server network) should be (indistinguishably) as fast as accessing the same content via a local source such as a docked pen drive.
The trouble with the connection/data transfer speed is that it can be a variable quantity for any given connection as TCP has to gradually increase the data rate in order to fully utilize its share of the network capacity resource. Apart from network architecture, link and node capacities and traffic management design, the sustainable connection speed is subject to usage-related variables including: total traffic demand placed on the network, the distribution of the average demand between different users, the mix of applications in use and the protocols over which they run. In fact, by correctly characterising these variables and reducing the complexity of the analysis by replacing all the nodes along a path by a single bottleneck link, it is possible to accurately predict the dimensions of the network that ensure a reliable and sustainable data transfer speed.
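The single bottleneck simplification can be sketched as follows. This is an idealised illustration, not the model used in the simulations: it assumes that the slowest link on the path sets the bottleneck capacity and that long-lived flows share it equally (real TCP shares also depend on round-trip times and loss). The function names and the link capacities are hypothetical.

```python
def bottleneck_mbps(link_capacities_mbps):
    """Replace a path of links by its slowest link."""
    return min(link_capacities_mbps)


def per_flow_share_mbps(link_capacities_mbps, n_flows):
    """Idealised equal share of the bottleneck among n_flows long-lived flows."""
    return bottleneck_mbps(link_capacities_mbps) / n_flows


def required_bottleneck_mbps(n_flows, target_mbps):
    """Bottleneck capacity needed so that each flow can sustain target_mbps."""
    return n_flows * target_mbps


# Hypothetical path: Gigabit Ethernet everywhere except one 200 Mbps router port.
path = [1000, 1000, 200, 1000]
print(per_flow_share_mbps(path, n_flows=8))                 # 25.0
print(required_bottleneck_mbps(n_flows=8, target_mbps=25))  # 200
```

Read in reverse, the last function is the dimensioning question: given a speed tier and a number of concurrently active subscribers, what line speed must the bottleneck have?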
The efficacy of this reduced complexity approach is easily verified by network simulations, which demonstrate unequivocally that even when an additional bottleneck induces large queuing delays (e.g., queuing delays that are roughly equivalent to increasing the propagation delay by more than the distance from New York to San Francisco!), the average data transfer speeds are indeed predictable, stable and sustainable. Starting with queues at R1 (please refer to the figure below), the effect of cascaded queuing delays was evaluated by reducing the sub-optimal line speed (capacity) at R2 by between 15% and 30% for three different speed tiers. This resulted in a predictable increase in the cascaded aggregated packet delays at both R1 and R2, but (interestingly) not in a significant deterioration in the minimum rate at which information was transferred between the end nodes (results in the table below). A further decrease just below the 30% maximum in the sub-optimal line speed at R2 resulted in the localisation of the bottleneck entirely at R2, but again with no significant deterioration in the required data transfer rate. Please note that we have assumed Gigabit Ethernet everywhere apart from at the router output ports. It should also be noted that queuing and packet drops can occur at both the input and output ports of the router, depending on the traffic load, the relative speed of the switching fabric and the line speed.
Results obtained from network simulations suggest that TCP's self-clocking and dynamic window operation have the effect of filling all the pipes along the path (in other words, distributing the data over several network nodes along the path) in such a way that the user does not perceive a change in the rate at which information is delivered to their device. In this sense, each network node along the path functions as a “cascaded distribution node” by caching a sufficient amount of the payload such that the content can continuously stream between end nodes. This process is called pipelining and it ensures a very high link utilization (a utilization that was calculated at above 98% in a previous Blog [Link to Previous Blog on over-provisioning the Backbone]).
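A standard way to quantify how much data it takes to “fill the pipe” is the bandwidth-delay product: the rate of the path multiplied by its round-trip time. The sketch below is a back-of-the-envelope calculation only; the 100 Mbps rate and 40 ms round-trip time are assumed values, not results from the simulations, and the function names are illustrative.

```python
def bandwidth_delay_product_bytes(rate_mbps, rtt_ms):
    """Bytes that must be in flight to keep a path of rate_mbps busy for one RTT.

    Mbps * ms = 1000 bits, and 1000 bits / 8 = 125 bytes.
    """
    return rate_mbps * rtt_ms * 125


def window_limited_rate_mbps(window_bytes, rtt_ms):
    """Rate achievable when at most window_bytes can be in flight per RTT."""
    return window_bytes * 8 / (rtt_ms * 1000)


# Assumed values: a 100 Mbps path with a 40 ms round-trip time.
bdp = bandwidth_delay_product_bytes(100, 40)
print(bdp)                                # 500000 bytes in flight
print(window_limited_rate_mbps(bdp, 40))  # 100.0 Mbps
```

Queuing delay at intermediate nodes lengthens the round-trip time, so the same rate is sustained by holding more data in flight along the path, which is consistent with the pipelining behaviour described above.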
As far as the TCP-IP application is concerned, the network is an un-intelligent entity that delays and occasionally loses packets. In order to endow the network with "intelligence," we need Quality of Service (QoS) mechanisms. Given that dropping data from interactive services such as voice results in poor service quality, whilst dropping packets from batch/non-interactive services such as streaming video results in a reduced data rate (and ultimately reduced service quality), QoS is implemented to determine how a router or switch deals with different types of packets. This objective can be achieved by “controlling” several network parameters, namely bandwidth, delay, jitter and loss. In order to deal with these parameters, several mechanisms are implemented, namely classification & marking, queuing, shaping/policing and congestion avoidance. Classification & marking is required to distinguish different traffic types; queuing packets in different queues allows for differential services such as greater bandwidth allocation; shaping/policing enforces the bandwidth allocation; and congestion avoidance controls the extent to which packet loss occurs when TCP attempts to fill a pipe.
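Of the mechanisms listed above, policing is perhaps the easiest to illustrate in code. The token bucket below is a common textbook way of enforcing a rate limit with a bounded burst; it is offered as a minimal sketch rather than as a description of how any particular router implements policing. The class name, the rate and burst values and the packet arrival times are all hypothetical.

```python
class TokenBucket:
    """Minimal token-bucket policer: rate in bytes per second, burst in bytes."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # the bucket starts full
        self.last = 0.0            # time of the previous arrival, in seconds

    def allow(self, packet_bytes, now):
        """Return True if the packet conforms to the profile, False if it is dropped."""
        # Refill tokens for the time elapsed, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


# Hypothetical profile: 1000 bytes/s with a 1500-byte burst, 1000-byte packets.
bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
decisions = [bucket.allow(1000, t) for t in (0.0, 0.1, 1.0, 1.2)]
print(decisions)  # [True, False, True, False]
```

A shaper would typically queue the non-conforming packets until enough tokens accumulate, whereas a policer drops (or re-marks) them as shown here.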
The results obtained in this Blog illustrate that it is necessary and sufficient to dimension the network in order to ensure service reliability for arbitrary data services running over the Internet via the TCP-IP model. This assertion is based on the observation that cascaded queuing delays need not have a perceptible cumulative effect on interactive services such as voice, video conferencing or virtual reality. Such a conclusion supports the industry trend towards the convergence of services in super-fast networks, as envisioned, for example, in fifth generation networks (the much anticipated 5G).
Some Interesting Historical Facts:
On October 29 1969, the first message "Lo" was sent between the first two ARPANET nodes; the SDS Sigma 7 Host Computer at the University of California - Los Angeles and the SDS 940 host computer at the Stanford Research Institute. The sending of this first message crashed the system! [Link to the Article: The Day the Infant Internet Uttered its First Words]
Each ARPANET site had a router known as an Interface Message Processor. These cost $82,200, or half a million dollars in today's money. [Link to the Blog: 40 Maps that explain the Internet]