
Short papers

Towards SmartFlow: Case Studies on Enhanced Programmable Forwarding in OpenFlow Switches

Problems
The limited capabilities of OpenFlow switches make it hard to implement unorthodox routing and forwarding mechanisms => the goal is to explore the possibilities of slightly smartening up the OpenFlow switches.

Case studies
Add new features (matching mechanism, extra action) to flow tables

  • Using a Bloom filter for stateless multicast (see the sketch after this list)
  • Greedy routing => performed at switch rather than at controller
  • Network coding
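
A minimal sketch of the in-packet Bloom filter idea (hypothetical encoding, not the paper's design): the source hashes every link of the multicast tree into a small bit vector carried in the packet, and each switch forwards on the local links whose bits are all set, so no per-group state is needed in the switches.

```python
import hashlib

FILTER_BITS = 64   # size of the in-packet Bloom filter (assumed)
NUM_HASHES = 3     # number of hash functions (assumed)

def _hash_positions(link_id: str):
    """Derive NUM_HASHES bit positions for a link ID (assumed scheme)."""
    digest = hashlib.sha256(link_id.encode()).digest()
    return [digest[i] % FILTER_BITS for i in range(NUM_HASHES)]

def build_filter(tree_links):
    """Source encodes all links of the multicast tree into one Bloom filter."""
    bloom = 0
    for link in tree_links:
        for pos in _hash_positions(link):
            bloom |= 1 << pos
    return bloom

def forwarding_links(bloom, local_links):
    """A switch forwards on every local link whose bits are all set.
    False positives may cause extra (harmless but wasteful) copies."""
    return [link for link in local_links
            if all((bloom >> pos) & 1 for pos in _hash_positions(link))]

# Example: s1-s2 and s1-s3 always match; s1-s4 only on a false positive.
f = build_filter(["s1-s2", "s1-s3"])
print(forwarding_links(f, ["s1-s2", "s1-s3", "s1-s4"]))
```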
An OpenFlow-Based Energy-Efficient Data Center Approach

Problem
IaaS providers suffer from the inherent heterogeneity of systems and applications from different customers => different load and traffic patterns have to be handled

Solutions
  1. Overprovision to sustain constant service quality => only feasible with a huge budget and a lot of resources
  2. Smart resource management => ECDC = machine information + network devices + environment data
Plug-n-Serve: Load-Balancing Web Traffic using OpenFlow

Problems
In a data center or a dedicated web-hosting service, the HTTP servers are connected by a regular, over-provisioned network, and the load balancer usually does not consider the network state when load-balancing across servers => this assumption does not hold for unstructured networks (enterprise, campus) => traffic affects load-balancing performance and increases the response time of HTTP requests

Solutions - Plug-n-Serve
Load-balances over arbitrary unstructured networks and minimizes the average response time by taking into account the congestion of the network and the load on the network and servers.
  • It determines the current state of the network and the servers, including the network topology, network congestion, and load on the servers.
  • It chooses the appropriate server to direct requests to, and controls the path taken by packets in the network, so as to minimize the response time (a hypothetical sketch follows this list)
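
A purely hypothetical illustration of the selection step (the function and cost model are my own, not Plug-n-Serve's actual algorithm): pick the replica that minimizes an estimated response time combining server load with congestion on the path to it.

```python
# Hypothetical illustration: choose the replica minimizing an estimated
# response time that combines server load and network congestion.
def choose_server(servers, paths):
    """servers: {name: load in [0,1]}, paths: {name: path congestion in [0,1]}."""
    def estimated_response_time(name):
        # Simple additive cost model; a real controller would use measured
        # queue lengths and link utilization instead.
        return servers[name] + paths[name]
    return min(servers, key=estimated_response_time)

# Example: s2 wins because its combined load + path congestion is lowest.
print(choose_server({"s1": 0.9, "s2": 0.3, "s3": 0.5},
                    {"s1": 0.2, "s2": 0.4, "s3": 0.7}))
```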
OpenFlow-Based Server Load Balancing Gone Wild

Problem
The switch in an SDN network that is used for load balancing gets overloaded by a huge number of forwarding rules if a rule is installed for each connection.

The Plug-n-Serve approach intercepts the first packet of each connection and uses network topology + load to determine the target replica before forwarding the traffic => many rules (as said above) and delay (since the controller is involved for each connection). This approach is called "reactive".

Solutions
Use wildcard rules to direct requests from larger groups of clients instead of installing a rule per client/connection.
  • Based on the share of requests among replicas (called weight), the authors propose a partition algorithm to divide client traffic efficiently (see the sketch after this list).
    • Build an IP-prefix tree whose height is the log of the sum of replica weights
    • Assign leaf nodes to replicas in proportion to their weights.
    • Reduce the number of rules by using a wildcard form (for example, 01* can replace the two leaf nodes 010* and 011* to create a single rule for a replica)
  • How to handle moving traffic from one replica to another => note: existing connections should complete at the original replica => two ways:
    • The controller inspects the incoming packet; if it is a non-SYN packet, it keeps being sent to the old replica. Otherwise, the controller installs a rule to forward the packet (and the ones belonging to the same connection) to the new replica
    • The controller installs high-priority rules on the switch to forward traffic to the old replica, and low-priority ones to forward traffic to the new replica. After a soft timeout with no traffic, the high-priority rule at the switch is deleted.
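
A minimal sketch of the partition idea described above (my own simplified Python, not the paper's algorithm or code): split the client IP space into 2^height leaf prefixes, assign consecutive leaves to replicas in proportion to their weights, then merge sibling leaves that point at the same replica into shorter wildcard rules.

```python
import math

def wildcard_rules(replica_weights):
    """Simplified sketch: split the client IP space into 2^depth leaf prefixes,
    give each replica a share of leaves proportional to its weight, then merge
    sibling leaves that point at the same replica into shorter wildcards."""
    total = sum(replica_weights.values())
    depth = math.ceil(math.log2(total))              # height of the prefix tree
    leaves = [format(i, f"0{depth}b") + "*" for i in range(2 ** depth)]

    # Assign consecutive leaves to replicas in proportion to their weights.
    assignment, i = {}, 0
    for replica, weight in replica_weights.items():
        share = round(weight / total * len(leaves))
        for leaf in leaves[i:i + share]:
            assignment[leaf] = replica
        i += share
    for leaf in leaves[i:]:                           # any leftover from rounding
        assignment[leaf] = list(replica_weights)[-1]

    # Merge siblings: e.g. 010* and 011* on the same replica become 01*.
    merged = True
    while merged:
        merged = False
        for prefix, replica in list(assignment.items()):
            if len(prefix) < 2:                       # already the full wildcard "*"
                continue
            sibling = prefix[:-2] + ("1" if prefix[-2] == "0" else "0") + "*"
            if assignment.get(sibling) == replica:
                del assignment[prefix], assignment[sibling]
                assignment[prefix[:-2] + "*"] = replica
                merged = True
                break
    return assignment

# Example with weights 3:1 over a prefix tree of height log2(4) = 2.
print(wildcard_rules({"R1": 3, "R2": 1}))   # {'10*': 'R1', '11*': 'R2', '0*': 'R1'}
```

With weights 3:1, the four leaves collapse into three wildcard rules, matching the 01* example above.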
The authors also consider non-uniform traffic, as well as the case in which the network is composed of multiple switches rather than two (one gateway switch receiving client traffic and one doing the load balancing).

SDN-based Application-Aware Networking on the Example of YouTube Video Streaming

Problem
The northbound API enables information exchange between applications and the network plane => determine how different kinds of information (such as per-flow parameters, application signatures, application quality parameters) can support more effective network management in an SDN-enabled network

Solution
Conduct 6 experiments: pure, with interfering traffic, round-robin path selection (the controller has no external information and just changes switch ports sequentially for incoming packets), DPI experiments (network information), and application-aware path selection (application state).

VL2: A Scalable and Flexible Data Center Network

General Problems
  • Cloud services require agility from the data center
  • Data centers with a conventional network architecture can't fulfill that demand
    • Different branches of the network tree get different capacity (switches at the core layer are oversubscribed by a factor of 1:80 to 1:240, while those at lower layers are oversubscribed 1:5 or more)
    • It does not prevent a traffic flood from one service from affecting the others (services commonly suffer collateral damage)
    • Conventional networks achieve scale by assigning servers IP addresses and dividing them into VLANs => migrating VMs requires reconfiguration and human involvement => limits the speed of deployment
Realizing this vision concretely translates into building a network that meets the following three objectives:
  • Uniform high capacity
  • Performance isolation
  • Layer-2 semantics
For compatibility, changes to current network hardware are avoided; only the software and operating systems on data center servers are changed.

Using a layer-2.5 shim in the server's network stack to work around limitations of network devices.

VL2 consists of a network built from low-cost switch ASICs arranged into a Clos topology [2] that provides extensive path diversity between servers. To cope with traffic volatility, VL2 adopts Valiant Load Balancing (VLB) to spread traffic across all available paths without any centralized coordination or traffic engineering.

Problems in production data centers

To limit overheads (packet flooding, ARP broadcasts) => use VLANs for servers. However, this suffers from 3 limitations:
  • Limited server-to-server capacity (because servers are located in different VLANs): idle servers cannot be assigned to overloaded services
  • Fragmentation of resources: spreading a service outside a single layer-2 domain frequently requires reconfiguring IP addresses and VLAN trunks => avoided by reserving resources for each service to handle overload cases (demand spikes, failures), which in turn incurs significant cost and disruption
  • Poor reliability and utilization: there must be sufficient remaining idle capacity on a counterpart device to carry the load if an aggregation switch or access router fails => each device and link can be run at no more than 50% of its maximum utilization
Analysis and Comments

Traffic: 1) The ratio of traffic volume between servers inside the data center to traffic entering/leaving the data center is about 4:1. 2) Computation is focused where high-speed access to data is fast and cheap, even though data is distributed across multiple data centers (due to the cost of long-haul links). 3) Demand for bandwidth between servers inside a data center is growing faster than the demand for bandwidth to external hosts. 4) The network is a bottleneck to computation.

Flow distribution: Flow sizes are around 100 MB even though the total data transferred can be in the GB range, because files are broken into chunks and stored across multiple servers. About 5% of the time a machine has more than 80 concurrent flows, and more than 50% of the time it has about 10.

Traffic matrix: N/A

Failure Characteristics: a failure is defined as an event logged when a function remains pending for more than 30 s. Most failures are small in size (involve only a few devices), but downtime can be significant (95% of failures are resolved within 10 minutes, but 0.09% last more than 10 days). VL2 moves from 1:1 redundancy to n:m redundancy.

VL2

Design principles:
  • Randomize to cope with volatility: using VLB to do destination-independent (e.g. random) traffic spreading across multiple intermediate nodes
  • Building on proven networking technology: using ECMP forwarding with anycast address to enable VLB with minimal control plane messaging or churn.
  • Separate names from locators: same as Portland
  • Embracing end systems

Scale-out topology
- Add intermediate nodes between aggregation switches => increases the bandwidth. This is an example of a Clos network.

- VLB: take a random path up to a random intermediate switch and then a random path down to the destination ToR switch (see the toy sketch below)
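
A toy sketch of the VLB idea (made-up topology and names, not VL2 code): each flow is bounced off a randomly chosen intermediate switch, so traffic spreads over all paths without central coordination; hashing the flow keeps all packets of one flow on one path, roughly what ECMP over a flow hash provides.

```python
import hashlib
import random

INTERMEDIATE_SWITCHES = ["int1", "int2", "int3", "int4"]  # assumed toy topology

def vlb_path(src_tor, dst_tor, flow_id=None):
    """VLB in miniature: go up to a randomly chosen intermediate switch and
    then down to the destination ToR, independent of the destination.
    If a flow ID is given, hash it so that all packets of the flow take the
    same path (roughly what ECMP on a flow hash gives)."""
    if flow_id is None:
        bounce = random.choice(INTERMEDIATE_SWITCHES)   # destination-independent
    else:
        h = int(hashlib.md5(flow_id.encode()).hexdigest(), 16)
        bounce = INTERMEDIATE_SWITCHES[h % len(INTERMEDIATE_SWITCHES)]
    return [src_tor, bounce, dst_tor]

# All packets of this flow bounce off the same intermediate switch.
print(vlb_path("tor1", "tor9", flow_id="10.0.0.1:4321->10.0.1.2:80"))
```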

VL2 Addressing and Routing
  • Packet forwarding, Address resolution, Access control via the directory service
  • Random traffic spreading over multiple paths: VLB distributes traffic across a set of intermediate nodes and ECMP distributes across equal-cost paths
    • ECMP problems: splitting is commonly limited to 16-way => define several anycast addresses; also, a switch cannot retrieve the five-tuple when a packet is encapsulated with multiple IP headers => the sender computes a hash of the five-tuple and places it in the header for the switch to use (see the sketch after this list)
  • Backwards compatibility
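
A rough sketch of the encapsulation step (hypothetical representation, not VL2's wire format): the layer-2.5 shim wraps the AA packet in a header addressed to the destination ToR's LA, and then in an outer header addressed to an anycast intermediate-switch LA, so the packet first bounces off an intermediate switch.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    """Very rough stand-in for a packet with nested IP headers."""
    dst: str
    payload: object

def encapsulate(aa_packet, dst_tor_la, intermediate_anycast_la):
    """Sketch of what the layer-2.5 shim does: inner header to the destination
    ToR's LA, outer header to an (anycast) intermediate switch's LA."""
    return Packet(dst=intermediate_anycast_la,
                  payload=Packet(dst=dst_tor_la, payload=aa_packet))

inner = Packet(dst="10.0.0.7", payload=b"app data")   # AA of the destination server
wire = encapsulate(inner, dst_tor_la="20.1.3.1", intermediate_anycast_la="20.255.0.1")
print(wire)
```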
VL2 directory system
Stores, looks up, and updates AA-to-LA mappings (sketched below).
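
A minimal sketch of that role (hypothetical interface, not VL2's actual directory protocol): agents look up the LA for a destination AA before encapsulating, and the mapping is updated when a VM migrates.

```python
class DirectoryService:
    """Hypothetical sketch of the AA-to-LA directory: store, look up, update."""

    def __init__(self):
        self._aa_to_la = {}                 # application address -> locator address

    def register(self, aa, la):
        self._aa_to_la[aa] = la             # store a new mapping

    def lookup(self, aa):
        return self._aa_to_la.get(aa)       # agent resolves AA before encapsulating

    def update(self, aa, new_la):
        self._aa_to_la[aa] = new_la         # e.g. after a VM migrates to another ToR

# Example: a server keeps its AA while its locator changes after migration.
d = DirectoryService()
d.register("10.0.0.5", "20.1.1.1")          # AA stays with the VM, LA names the ToR
d.update("10.0.0.5", "20.2.2.2")
print(d.lookup("10.0.0.5"))
```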

Evaluation
  • Uniform high bandwidth: using goodput, efficiency of goodput
  • VLB fairness: evaluate effectiveness of VL2's implementation of VLB in splitting traffic evenly across the network.