###### Recent Posts

###### Featured Posts

Typically, the cost of higher data rate links is less (unit cost wise) than the cost of a lower data rate links – for example; the bandwidth of a T3-link is 28 times that of a T1-link, but the cost of a T3-link is much less than 28 times the cost of a T1-link. However, the requirement for an acceptable link utilisation threshold places a restriction on the demand (and hence the revenues) that can be supported (respectively generated) by a given network link… if Quality-of-Service (QoS) constraints such as a predefined maximum packet delay are to be met. Clearly, the relaxation of QoS constraints without compromising the performance (and specifically the Internet connection speed) has significant implications for the business case for broadband provisioning.

In 2014, Netflix reached a deal with Comcast by which Netflix servers that were otherwise sitting in third party data centres were instead directly connected to the Comcast network in an initial total of 10 out of 18 co-location facilities (a.k.a. Point of Presences/Core Network Nodes – see the figure below) [Link to the Blog on the Comcast Netflix Deal with Data and Numbers] & [Link to Youtube Tutorial on ISP PoP Design]. The reduced propagation delay between user devices and Netflix servers had a dramatic impact on the subscriber Internet connection speed and hence the video quality [Link to Article on CDNs and the Effect of Round Trip Time on Performance]. In addition, server load balancing algorithms [Link to the Akamai Graphic on Load Balancing] that boost application performance by routing client requests away from congested servers must have also had significant implications for the flow of the large volumes of video data in the Comcast core network.

The design of core networks includes determining the location of routers, their inter-connectivity (i.e., the network topology) and link capacity such that packets can be moved efficiently through the network. Determining the capacity to install for demands arising from multiple network nodes (expressed as a traffic matrix) is a crucial problem that can be tackled in multiple stages [Link to CISCO Paper on Backbone Capacity Design].

Before tackling multiple nodes, it is instructive to consider the requirements when two network nodes are connected via a single link. If it is assumed for the moment that packet arrivals follow a Poisson process, then the link can be thought of as the famous M/M/1 queuing system which happens to have a simple analytical formula for computing the average amount of time that a packet spends in a queue (buffer). Specifically, if the average packet size is given by Kp bits, and the link capacity is given by C bits per second (e.g., T1-rate: 1.54 Mbps), then the average transmission/service rate per packet is μp = C/Kp pps. If the average arrival rate is denoted by λp pps, then the average delay in seconds D(λp, μp) is given by D(λp, μp) = 1/( μp ̶ λp). Note that this is the total amount of time the packet is in the queue/buffer, not the total time the packet spends in the system (at the link) which would include transmission time. This simple relation can provide very useful insights into what happens when the transmission or service rate is increased to meet the demand from an increased arrival rate. For example, if the average packet size is 1 KiloByte (i.e., 8 Kilobits), then an arrival rate of 100 pps (800 kbps) and service rate of 190 pps (approx. 1.54 Mpbs, the T1-link rate) results in an average queuing delay of 11ms. If there is a ten-fold increase in the demand and a matching ten-fold increase in the service rate (note that there is no actual communication link speed at this rate), then the average delay reduces to one-tenths of its original value since D’(λp, μp) = 1/( 10μp ̶ 10λp) = 0.1 D(λp, μp).

The above analysis suggests that there is a performance gain often referred to as a statistical multiplexing gain which arises from aggregating traffic. Unfortunately, measurements from the Internet indicate that the packet arrival process does not follow the Poisson process and the delay is worse than the one calculated using the Poisson assumption. In fact, network simulations conducted by the author of this Blog show that there is no significant statistical multiplexing gain when TCP reaches a steady-state i.e., as the arrival rate increases, a corresponding increase in the service rate does not significantly decrease the average packet delay D(τ) for fixed-range, end-to-end path propagation delay τ. Note that we will use D(τ) instead of D(λp, μp) to denote the average aggregated packet delay for Internet traffic due to the implications of the results in the table below.

Based on Little’s theorem (the average number of packets in the system is equal to the average packet arrival rate × the average packet delay), it can be expected that the buffer occupancy is higher for higher speed tiers but that the network dimensions increase the transmission/service time resulting in a higher perceived internet connection speed for the end-user – and hence the significance of a subscriber-centric definition of network congestion. Additionally, several key observations can be made about the results from the network simulation:

The store and forwarding process has a transmission and queuing component and it is the former rather than the latter which affects the subscriber’s perceived performance

It is possible to increase the dimensions (capacity) of the network for a maximum demand between any pair of nodes such that the connection speed is increased, despite stagnating aggregated average packet delays

Statistical-multiplexing gain is almost non-existent for TCP in steady-state, meaning that if the per-subscriber demand for a given speed tier is defined as the network capacity divided by the number of subscribers, it is possible to estimate the demand for different levels of subscriber aggregation

In the case of a network of nodes, the traffic demand between any two pair of nodes is determined by the way packets are forwarded on a per-hop basis. In a traditional IP-based network, each router along the path from source to destination performs a look-up of the packet’s destination IP address in the routing table and selects the lowest cost path to forward the packets. This method has a significant disadvantage: if a given link happens to lie on several optimal paths, numerous routers in the network will "prefer" to use that link when forwarding packets even when there are several other routes available with links that are relatively underutilised. Due to reduced complexity and efficiency, Interior Gateway Protocols (IGPs) are designed to choose the least cost path to forward packets so that when the cost of a link (i.e., the link metric used in the routing protocol) is too low, a link will be on the shortest path causing certain delayed transmission of packets for a given source-destination pair and in the worst case, buffer overflows and packet loss (which would then cause TCP to drastically throttle back on the sending rate via a multiplicative decrease rule) [Link to the Packet Design E-Book]. Recalling the average delay formula, the typical objective of solving Network Design Problems (NDPs) for IP-based networks is to: determine the metric of the links so that the shortest paths are determined in such a way as to minimise the maximum aggregated packet delay across the network [Link to Routing, Flow and Capacity Design Book by Pióro et. al. 2004]. However, and as it has already been pointed out, packet arrival process does not follow the Poisson process and self-similarity makes it difficult to model packet delays using a closed-form expression.

When Traffic Engineering (TE), rather than per-hop routing decisions are required, the network operator’s headend ingress router controls the path taken by traffic between a source and a destination. Instead of sending traffic through a least-cost (but congested path), TE directs traffic through “tunnels” which provide mechanisms for creating paths with the certain characteristics, such as paths with adequate resources, say in terms of bandwidth. Once tunnels are created, packets are assigned MPLS labels and forwarded across the network based on these labels (Multi-protocol Label Switching). In the TE/MPLS lexicon, Label Edge Routers (LERs) are routers that operate at the edge of an MPLS network whilst Label Switching Routers (LSRs) are routers that lie along the switching path from source to destination [Link to the Packet Design E-Book]. The NDP for TE/MPLS-networks therefore consists of determining the minimum cost network (number and location of core LSRs) and the dimensions required for the links along different Label Switched Paths [Link to Routing, Flow and Capacity Design Book by Pióro et. al. 2004]. Whereas the first objective can be met on a network equipment cost-basis, the second objective (determining how much bandwidth is required for a specific performance level) is a non-trivial challenge when closed-loop congestion control mechanisms such as TCP are implemented at the layer above the network layer (i.e., the data transport layer).

It is worth noting that NDPs can cover a wider scope of objectives and include restoration design where link failures are explicitly take into account in the solution of the NDP. Also, communications networks are configured in a multi-layer fashion forming a hierarchical structure with each layer being a proper network on its own. For example, in the NDP problems that will be discussed subsequently, paths for particular demands between nodes in the upper (demand) layer are formed using physical links and nodes in the lower (network) layer.

The distinction between Demand Volume Units (DVUs) and Link Capacity Units (LCUs) is strictly necessitated by the spatial multiplexing gain as discussed above. But because the utilization is very high (greater than 98% - i.e., demand is approx. equal to capacity) and there is no significant spatial multiplexing gain when TCP reaches a steady-state (i.e., the average packet delays stagnate for increasing arrival and transmission rates), the two units (DVU/LCU) can effectively be conflated for aggregated traffic flows resulting in a significant simplification of the NDPs, as well as maximising utilisation (revenue generation from the available capacity). The DVU (or LCU) per subscriber per speed tier can be estimated effectively as the ratio of the link dimensions to the level of subscriber aggregation.

If multiple shortest paths are available for IP-based routing, then the demand is split among all the shortest paths according to the “equal cost multi-path rule (ECMP)” rule [Link to Routing, Flow and Capacity Design Book by Pióro et. al. 2004, Chapter 7]. However, because IP networks carry TCP traffic from end computers, it is important to ensure that a particular TCP session is not drastically affected if its packets take two different paths through an IP-based network – in other words, it is conceivable that different paths could have different latencies, even if both paths are marked as ECMP paths in the routing table. For this reason, the ECMP split may be decided based on a per-TCP session basis rather than on a packet-by-packet basis. If it assumed for simplicity that ECMP means an equal split in terms of the traffic volume, then the task when solving NDPs in the IP-based scenario is to: determine link metrics such that the resulting demand flows minimise the maximum demand within the network (and thus the associated data transport/equipment costs). The NDP that solves for this min-max demand objective function traffic can be summarised as follows [Link to Routing, Flow and Capacity Design Book by Pióro et. al. 2004, Adapted from Section 3.1].

Because of the implicit relationship between the link load and routing link metrics, the NDP is not a typical linear programming problem, but there exists a variety of meta-heuristic algorithms such as Simulated Annealing which can be employed to determine the optimal link weights and the resultant maximum traffic volume demand for the network. The figure below shows the results generated by the author for an illustrative example of aggregating 3 demands where the aggregated traffic volumes represent several subscribers at different speed tiers with all measurements in Gbps. Note that the optimised result P11 = {1 4 2}, P21 = {1 3}, P31 = {2 3}, P32 = {2 4 3} (where Pdp denotes a feasible path p for demand d, refer to [Link to Routing, Flow and Capacity Design Book by Pióro et. al. 2004], Section 2.4 for an explanation of the NDP notation) resulting in a min-max demand of 3 Gbps on the link between node 1 and 3 can only be determined by a rigorous analysis, particularly for complex networks.

Linear programming solutions are applicable when considering MPLS over Wavelength Division Multiplexing (WDM) systems for which the network design problem consists of determining the number and location of LSRs, and the number and route of light-paths or lambdas. In a Dense WDM (DWDM) network, one wavelength (light-path or lambda) can support a data rate of 10 Gbps (although technologies exists that can deliver 100 Gbps per light-path), a single fibre strands can support anywhere between 32 and 192 lambdas (we can assume 64 lambdas), and the number of strands per cable can be anywhere between 24 and 192 (we will assume 128 strands). In this scenario, a single fibre optic cable can support a staggering 82 thousand Gbps (i.e., 128 × 64 × 10 Mbps) making the DWDM technology a robustly future-proof option. Light-paths or lambdas have a limitation on their physical extent due to the various transmission impairments (e.g., attenuation, crosstalk, dispersion and nonlinearities). The nodes in a WDM network are called Optical Cross-Connects (OXCs) and core LSRs are collocated with OXC wherever two or more LSPs intersect [Link to Routing, Flow and Capacity Design Book by Pióro et. al. 2004, Adapted from Section 3.7].

The MIP problem is complex as all the feasible paths between LSRs must be considered, where the least cost path in terms of the number of OXC hops is not necessarily the least distance-based cost network. The figure below depicts an illustrative example which was solved using the GLPK Library to determine the aggregation of 3 demands between network LERs.

Concluding remarks – tradition network planning has centred on minimising the queuing packet delay at a given node as one of the three delay components leading to performance degradation (the other two being the propagation and the node processing/transmission delay). However, in the subscriber-centric view of congestion, performance is defined in terms of the arrival rate of packets at the end-user device (i.e., the perceived internet connection speed) which is a function of the network dimensions (and the fixed propagation delay). In the economics of Content and Application Provision, providers such as Netflix currently pursue a strategy of letting the demand drive the capacity – in other words, subscribers in a new market will access content in the Open Connect Netflix server system until the demand increases to a point where Netflix deploys servers closer to the subscriber eyeballs [Link to the Article on The Counterintuitive Technology Behind Netflix's Worldwide Launch]. Understanding how the end-to-end propagation delay affects capacity is an effective strategy of determining how the reliable data transfer & congestion control mechanisms implemented in the transport layer can be supported by the underlying network layer with implications for the cost of network build-out and revenues generated.

“This (universal) language will be the great instrument of reason for when there is a dispute amongst persons, we can simply say: ‘Let us calculate, without further ado and see who is right.’”

Liebniz - The Art of Discovery (1685)

###### Follow Us

I'm busy working on my blog posts. Watch this space!