OpenFlow – Can It Scale?


The real question is whether OpenFlow can be used to build large, scalable networks. To answer it, we first need to understand OpenFlow. The Open Networking Foundation (ONF) is responsible for the OpenFlow specification, which defines the following components:

  • Remote Controller
  • Flow-based Switches
  • Protocol between the Controller and Switches

The following figure shows the components of an OpenFlow 1.3 switch.

 

Figure 1 – Components of an OpenFlow 1.3 Switch

The specification does not dictate the number of switches a controller manages or the proximity of switch and controller. In fact, it does not even tell us how to set up flows across a network. These are important points: OpenFlow does not dictate network design. Rather, it is a tool for enabling SDN (i.e., the separation of the control and data planes).

OpenFlow detractors like to highlight the following items as reasons OpenFlow doesn’t scale:

  • Number of switches managed by a controller
  • Number of flows a switch can support

What’s needed is a review of each of these items to determine whether they are real issues and, if so, how they can be solved.

Number of Switches Managed by a Controller

 


The number of switches that can be controlled by a single OpenFlow controller is determined by the number of TCP sessions the controller can maintain and by its CPU performance.

Given today’s server architecture, we can realistically expect a server to handle tens of thousands of TCP connections, which means a single server could connect with many thousands of switches. If more connectivity is needed, additional servers can be added to form a cluster. With a good connection architecture, the number of connections should not be an issue.
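
To make the connection-scaling argument concrete, here is a minimal sketch (not a real controller) of how a single Python process can multiplex thousands of switch TCP sessions with asyncio. The port number is the IANA-assigned OpenFlow port; everything else is illustrative.

    import asyncio

    async def handle_switch(reader, writer):
        # A real controller would parse OpenFlow messages here (HELLO,
        # FEATURES_REQUEST, PACKET_IN, ...); this stub just drains the socket.
        while data := await reader.read(4096):
            pass
        writer.close()
        await writer.wait_closed()

    async def main():
        # 6653 is the IANA-assigned OpenFlow controller port.
        server = await asyncio.start_server(handle_switch, "0.0.0.0", 6653)
        async with server:
            await server.serve_forever()

    if __name__ == "__main__":
        asyncio.run(main())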

Current-generation Intel Xeon class servers support two or more processor sockets with up to 16 cores per socket, providing massive amounts of processor cycles. Intel claims a single server can handle 160 million packets per second, which translates to 100+ Gbps of performance. That should be sufficient to cover thousands of switches. And when the controller is architected and implemented correctly, it will be capable of scaling up as servers are added to the network.

For the sake of argument, let’s assume a single controller could only manage a thousand switches. We can then take a page from either Network Management or IP/MPLS network designs, which must also scale to large numbers of nodes. In the case of Network Management, large networks are controlled by a “Manager of Managers” design (i.e. hierarchical design). The same design could be leveraged for OpenFlow (i.e. “Controller of Controllers” model) to scale out a network as shown below.

Figure 2 – Controller of Controllers
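
As a rough illustration of the idea, the following hypothetical sketch shows a parent controller that tracks which domain controller owns each switch. The class and method names are invented for this example and come from no specification.

    class DomainController:
        def __init__(self, name, switches):
            self.name = name
            self.switches = set(switches)  # datapath IDs managed locally

    class ControllerOfControllers:
        def __init__(self):
            self.domains = []

        def register(self, domain):
            self.domains.append(domain)

        def locate(self, dpid):
            # Resolve which domain controller owns a given switch.
            for domain in self.domains:
                if dpid in domain.switches:
                    return domain
            return None

    root = ControllerOfControllers()
    root.register(DomainController("east", {1, 2, 3}))
    root.register(DomainController("west", {4, 5, 6}))
    assert root.locate(5).name == "west"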

Another approach is to take a lesson from IP/MPLS networks, where Autonomous Systems become peers, as shown in the following diagram.

Figure 3 – Autonomous Systems

 

In this case, the OpenFlow controllers would peer with each other using a peering protocol. This type of design is used today to interconnect worldwide IP networks.
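
OpenFlow itself defines no controller-peering protocol, so the following is an illustrative-only sketch of the BGP-like idea: each controller advertises the subnets its domain can reach, and peers merge those advertisements. The private-range ASN values, class, and method names are all assumptions for this example.

    class PeerController:
        def __init__(self, asn):
            self.asn = asn              # private-range ASN for illustration
            self.local_prefixes = set()
            self.learned = {}           # prefix -> ASN that advertised it

        def advertise(self, peer):
            # Push this domain's reachable prefixes to a peer controller.
            for prefix in self.local_prefixes:
                peer.learned[prefix] = self.asn

    east, west = PeerController(64512), PeerController(64513)
    east.local_prefixes.add("10.1.0.0/16")
    west.local_prefixes.add("10.2.0.0/16")
    east.advertise(west)
    west.advertise(east)
    assert east.learned["10.2.0.0/16"] == 64513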

While hierarchy and peer-to-peer networking can provide scalability, approaches similar to current web-scale database technology could also be used to scale out an OpenFlow controller. These are proven techniques that allow enterprises to support millions of customers today. The point is that proper network and controller architecture is the key factor for scalability, which is not, in fact, limited by the OpenFlow protocol.
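
One such web-scale technique is sharding: assign each switch to a controller instance by hashing its datapath ID, so capacity grows roughly linearly with the number of instances. Here is a toy sketch with invented instance names; a production design would likely use consistent hashing so that adding or removing instances reshuffles as few switches as possible.

    import hashlib

    def owner(dpid, instances):
        # Map a switch's datapath ID to one controller instance.
        digest = hashlib.sha256(str(dpid).encode()).digest()
        return instances[int.from_bytes(digest[:8], "big") % len(instances)]

    instances = ["ctrl-a", "ctrl-b", "ctrl-c"]
    print(owner(0x00000000000000FE, instances))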

Number of Flows Handled by a Switch

The number of flows a switch can handle is limited by the size of its flow table. OpenFlow 1.0 specifies a single flow table per switch that must match on 12 fields. Because of this requirement, most early implementations used ternary content addressable memory (TCAM) for the flow table. These TCAM-based tables were limited to just a few thousand entries. If the table overflowed, packets would be handled by the switch’s software, which severely limited the scalability and performance of early OpenFlow switches. Many of these limitations have since been resolved by using existing lookup tables when matching strictly at L2 and L3.

The ONF recognized the scalability issues imposed by a single flow table and, with OpenFlow 1.1, introduced the concept of multiple flow tables. The specification went further and allowed each flow table to match on different fields. This flexibility provides a pathway to more flows per switch, as TCAMs can be replaced with larger, lower-cost RAM devices.
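
As a hedged illustration of the multi-table pipeline, here is a minimal sketch using the open-source Ryu controller framework with OpenFlow 1.3 (assuming Ryu is installed; the MAC address and table layout are placeholders). Table 0 matches only an L2 field and hands the packet to table 1, where L3 rules backed by a different memory type could live.

    from ryu.base import app_manager
    from ryu.controller import ofp_event
    from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
    from ryu.ofproto import ofproto_v1_3

    class MultiTableApp(app_manager.RyuApp):
        OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

        @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
        def features_handler(self, ev):
            dp = ev.msg.datapath
            parser = dp.ofproto_parser
            # Table 0: match a destination MAC, then continue to table 1.
            match = parser.OFPMatch(eth_dst="00:00:00:00:00:01")
            inst = [parser.OFPInstructionGotoTable(1)]
            dp.send_msg(parser.OFPFlowMod(datapath=dp, table_id=0,
                                          priority=10, match=match,
                                          instructions=inst))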

The ability to move away from the TCAM limitation lets switch designers scale their data plane for targeted applications while still supporting OpenFlow control. For example, a switch used to create MEF Carrier Ethernet services must be able to match packets on ingress port and VLAN ID. This can now be implemented using OpenFlow 1.1 with direct lookups in large RAMs, which is consistent with how today’s Carrier Ethernet platforms operate.
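
A sketch of that classification step, again assuming Ryu with OpenFlow 1.3 (the function name and arguments are invented for illustration): note that OpenFlow 1.3 requires the OFPVID_PRESENT bit to be set in vlan_vid to mean “a VLAN tag is present.”

    def match_service(dp, port_no, vlan_id):
        ofp, parser = dp.ofproto, dp.ofproto_parser
        # OFPVID_PRESENT (0x1000) flags that a VLAN tag must be present.
        return parser.OFPMatch(in_port=port_no,
                               vlan_vid=(ofp.OFPVID_PRESENT | vlan_id))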

Modern switches and routers support forwarding information bases (FIBs) in the range of 64K to 512K entries. With OpenFlow 1.1, a switch can use these same FIB tables for flows managed by an off-board controller, which alleviates the second concern: the number of flows a switch can support.
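
For example, a FIB-style entry can be expressed as an ordinary OpenFlow flow. The sketch below, assuming Ryu and OpenFlow 1.3 with placeholder addresses and ports, installs an exact IPv4 destination match with a single output action.

    def install_route(dp, dst_ip, out_port):
        ofp, parser = dp.ofproto, dp.ofproto_parser
        # eth_type 0x0800 selects IPv4; ipv4_dst is an exact host match here,
        # the kind of entry routing hardware can hold in RAM rather than TCAM.
        match = parser.OFPMatch(eth_type=0x0800, ipv4_dst=dst_ip)
        actions = [parser.OFPActionOutput(out_port)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                      match=match, instructions=inst))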

OpenFlow – It Can Scale!

So the answer is “yes!” With proper network design and intelligent controllers, OpenFlow can be used to build very large networks.

Guide to Acronyms:

FIB – Forwarding Information Base

MEF – Metro Ethernet Forum

MPLS – Multi-Protocol Label Switching

ONF – Open Networking Foundation

SDN – Software-Defined Networking

TCAM – Ternary Content Addressable Memory



Comments

  1. zfwise@gwu.edu says

    “The specification went even further and allowed each different flow table to match different fields. … as TCAMs could be replaced with large lower cost RAM devices.”

    I agree.
    I think the fact that the OpenFlow table has so many columns results in some fields of a flow entry being empty. Usually, when we do the matching, the empty fields are treated as wildcards, and I read somewhere that the wildcards in a flow entry are the major consumers of TCAM space.
    If we have flow tables with fewer columns, we have fewer wildcards, and so we may be able to replace the TCAMs with RAM.

  2. simhon.doctori@ecitele.com says

    To enlarge the scale of an OpenFlow-switch-based network, there is also the option of using the resources (i.e. flow table entries) of other network elements when some switches reach their limit of entries.

    This of course requires that the controller have a full view of the network elements’ resources, as it should, and be able to shift traffic from more ‘stressful’ network areas to elements that can currently handle it.

    This approach brings a new understanding of network-wide resource management, rather than monotonic per-switch table resource behavior.

    Simhon Doctori

  3. steve@iveson.eu says

    Hey Mike,

    I’m far from having a full understanding of OpenFlow, but regardless, I think I can say the following with some confidence:

    1) Just in general, look at something like http://www.bro.org and the model they use to handle huge amounts of throughput and inspect it all the way up to layer 7 (a 12-tuple sounds easy by comparison).
    2) Even ignoring that, some relatively simple load balancing at the controller or even interconnect-switch level should make this a walk in the park. It’s an application, right? Let’s apply some application delivery magic, even if it’s a bit circular (the controller cluster controls the load balancing to itself).

  4. ian.tivey@citihub.com says

    I have little doubt that OpenFlow-style SDN implementations can be made to scale for normal day-to-day operation; for me, however, the edge case of firing up a switch and immediately having to handle many millions of packets per second of flows at the controller level is where the biggest scalability challenges lie.
