Maximizing Serviceability & Service Resiliency at the Branch Office

A NetDevices White Paper



This white paper outlines the key requirements for maximizing serviceability and service resiliency for branch office networks and elaborates how the NetDevices unified services gateway solutions comprehensively address these requirements.

Driven by the demands of today's globally connected marketplace, enterprises are rapidly becoming more decentralized and distributing more and more business applications to branch offices. Support for critical business functions, such as CRM and ERP, as well as business applications, such as email and storage, is now often required at every branch office. This trend is being further accelerated by the "webification" of applications, which is midway through its current cycle.

Branch office networking requirements are changing at a rapid pace in order to support these business needs. Multiple services like security, networking, voice and access are required at most locations. With mission-critical applications being distributed to branch offices, the branch network infrastructure must be robust, scalable and remotely manageable, with easy extensibility to support new services & applications.

Traditional branch office solutions have relied on multiple cascaded point devices to support multiple services. With this approach, support for new services requires the disruptive addition of new devices. The increasing complexity of multiple layers results in several significant problems for businesses. Complex management, with little or no support for remote management, leads to a non-linear increase in operational costs. As the number of devices increases, the number of discrete points of failure rises with it, which may result in unacceptable levels of downtime. Furthermore, the traditional approach to layered security has not been very effective in meeting new security threats in the open, Internet-based network environment.

Due to these inherent shortcomings, the evolution path offered by the multi-device model is unsustainable. Hence, enterprises need next-generation branch office solutions that enable them to fundamentally re-architect the way their branch office networks are built & managed.

To address these challenges, converging multiple services on a single platform appears attractive. However, in today's intensely networked environment, integrating multiple services on the same platform presents its own set of serious challenges for service availability and manageability, as outlined below:

  • Risk of wide service outage as one service impacts others and services become
    frequently unavailable
  • Constant infrastructure and service upgrades/changes, impacting all services on the
    platform
  • Configuration errors and conflicts between services leading to unplanned downtime
  • Denial of service (DoS) attacks and other security threats bringing down mission-critical
    applications
  • Lack of expertise at the branch and lack of resources at the NOC to manage multiple
    branch office services, resulting in delayed response to problems

All the scenarios outlined above can lead to significant downtime and substantially impact the business bottom line in the form of lost productivity, lost revenues and increased costs. Hence, in order to overcome these problems and fully enjoy the benefits of branch service convergence, a purpose-built system is required to unify multiple branch networking services on a single platform and meet the most stringent requirements for serviceability and availability. Such a system needs to be architected from the ground-up to meet the following key requirements:

  • Always available management access to the system independent of the state of the
    system
  • High resiliency via fast recovery from both hardware and software failures
  • Fast response to even the most sophisticated security threats through a new approach to
    layered security
  • Non-disruptive servicing through in-service upgrades, configuration changes and new
    service additions
  • Unified management of multiple services in order to enhance efficiency and minimize
    conflicts
  • Support for comprehensive remote management

It is important to note that there are other ways to achieve high resiliency such as the use of multiple redundant devices and redundancy for all components in the system. While these alternatives may be appropriate for networking devices in the core of the network (e.g. core router), they tend to be too expensive for the branch office. On the other hand, the traditional branch office solutions do not meet the serviceability and availability requirements of today's distributed enterprise. Hence, the approach outlined above and elaborated in the following sections of this paper is aimed at providing an optimal choice given the specific constraints and requirements of the branch office.

A New Approach to Branch Network Serviceability and Availability:
Unified Services Gateways

The NetDevices unified services gateways are a family of unique, purpose-built branch office solutions designed to address the requirements outlined above. The following sections elaborate on the key features of the NetDevices unified services gateways and how they enable enterprises to distribute multiple services to branch offices with unparalleled levels of serviceability and availability.

Always Available System Management

With critical business applications being distributed to branch offices, it is vital for businesses to be able to respond rapidly to any branch office issues that could impact the availability of applications. In addition, the increasing sophistication of applications and networking services at the branch requires skilled, expensive IT resources to operate and manage branch office IT infrastructure. To overcome this challenge efficiently and cost-effectively, enterprise IT staff require always available, fully remote access to all system management functions.

The NetDevices system provides always available access to system management through patent-pending architectural innovations in its Lifeline™ management framework. A key aspect of the Lifeline management framework is a dedicated management plane that is separate from the data and control planes. As shown in Figure 1 below, this is a significant differentiator from other solutions available in the market today and enables highly resilient management access to the system independent of the state of the system.


[Figure 1]

The Lifeline management framework is equipped with dedicated resources including an independent management plane, separate processors and separate management software processes. This enables complete isolation of system management functions from packet processing and control plane functions. As a result, management access to the system is unaffected under conditions such as failure of a data plane function (like routing or firewall), or high main processor utilization caused by high load or a denial of service (DoS) attack. In contrast, traditional solutions offer no guarantee of access to the device when the main processing resource is unavailable.

Another key aspect of the Lifeline management framework is its built-in support for a "rescue" mode of operation. This mode plays a critical role in ensuring in-band management access and full management functionality under different types of failure modes. In the NetDevices system, typically all packets, including management data packets, are forwarded through the Services Engine (SE), which is the packet processing core of the system. Under the Lifeline management framework, there are multiple active instances of the management process running on each line card. Hence, if there is a problem with the SE, the "rescue" mode of operation is automatically initiated to ensure uninterrupted management access through processes running on a different line card (as illustrated in Figure 2 below). Full management functionality is available for rapid trouble-shooting and corrective action. In traditional solutions, such a scenario would have led to a complete loss of management access and functionality.


[Figure 2]

The NetDevices unified services gateways support multiple access mechanisms including in-band (primary) and out-of-band (secondary) access modes. While some traditional devices also support out-of-band access, by leveraging the dedicated management plane and intelligent software processes, NetDevices delivers the following unique advantages: (1) full in-band management functionality under a wide range of failure modes and (2) out-of-band access even if the data and control planes are not accessible.

Recovery from Failures

In all systems, failures occur from time to time. What makes a system resilient is its ability to rapidly detect failures when they occur, automatically initiate corrective action if required and recover from the failures in a predictable manner. How this is achieved in the NetDevices system is illustrated using two example scenarios:
Software/feature failure:
If a software component or feature fails, the feature health monitor in the management plane detects the failure and automatically restarts the process. In most cases, the restart resolves the issue without any manual intervention. If the feature continues to fail within a short interval of time (typically two minutes), an alarm is raised to trigger manual intervention for trouble-shooting and restart of the feature. As noted in the previous section, NetDevices' Lifeline management framework ensures that remote management access is always available for rapid and efficient manual intervention.
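
The restart-then-escalate behaviour described above can be sketched as follows. This is a minimal illustration and not the NetDevices implementation: the class name, the `max_failures` threshold and the reading of "extended failure" as repeated failures inside the window are all assumptions; only the two-minute interval comes from the text.

```python
from collections import deque


class FeatureMonitor:
    """Hypothetical health monitor: restart on failure, alarm on repeated failure."""

    def __init__(self, window_seconds=120.0, max_failures=3):
        # window_seconds mirrors the "typically two minutes" interval in the text.
        # max_failures is an assumed threshold, not a documented value.
        self.window_seconds = window_seconds
        self.max_failures = max_failures
        self.failure_times = deque()

    def on_failure(self, now):
        """Record a failure at time `now` (in seconds) and return the action to take."""
        self.failure_times.append(now)
        # Forget failures that are older than the sliding window.
        while now - self.failure_times[0] > self.window_seconds:
            self.failure_times.popleft()
        if len(self.failure_times) >= self.max_failures:
            return "alarm"      # escalate to manual intervention
        return "restart"        # automatic restart of the feature
```

With these assumed defaults, isolated failures each trigger an automatic restart, while a third failure inside the same two-minute window raises the alarm.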
Hardware failure:
We will first look at how the NetDevices system deals with a line card failure. Under the NetDevices Lifeline management framework, there are multiple active instances of the management process running on each line card. Hence, if there is a problem with just the data plane functionality on the line card, this will be detected by the management process on that card. The data plane can be re-initialized (automatically) or re-configured to fix the problem through the management plane processor on that card. On the other hand, if there is a failure of the line card itself, full management access to the card is still available through the management plane, which can be used to remotely power off/power on the card or do further trouble-shooting.

If there is a failure in the Services Engine, which normally forwards management packets to the management plane, this is detected by the Lifeline Manager. A special "rescue" mode of operation is automatically initiated in order to ensure that management data continues to be forwarded to the management plane through parallel management processes running on a different line card. This ensures continued in-band access and full management functionality for rapid trouble-shooting from a remote location. In contrast, traditional branch office solutions do not have the ability to automatically check for failures and initiate recovery procedures. Furthermore, a hardware failure will, in most cases, result in loss of management access since there are no separate resources for the management plane.
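
The path selection behind the "rescue" mode can be illustrated with a short sketch. The function name, the card names and the final out-of-band fallback are assumptions made for illustration; the text only states that management traffic moves to a management process on a different line card when the Services Engine fails.

```python
def select_management_path(se_healthy, line_cards):
    """Return the path management traffic should take (hypothetical model).

    se_healthy: True if the Services Engine is forwarding packets.
    line_cards: mapping of card name -> True if its management process is alive.
    """
    if se_healthy:
        # Normal operation: management packets flow through the Services Engine.
        return "services-engine"
    # Rescue mode: use a management process on the first healthy line card.
    for name, alive in line_cards.items():
        if alive:
            return f"rescue:{name}"
    # Assumed last resort: the secondary, out-of-band access mode.
    return "out-of-band"
```

For example, `select_management_path(False, {"lc1": False, "lc2": True})` selects the management process on `lc2`.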

In addition to rapid detection of and recovery from hardware failures, a key aspect of highly resilient systems is the ability to prevent failures through proactive monitoring and, if required, correction. This is achieved in the NetDevices system using the Chassis Manager sub-system. The Chassis Manager proactively monitors the health of each line card using detailed environmental information and automatically alerts the management system if corrective action is required.

Fast Response to Security Threats

In modern enterprises, it is normal for far-flung branch offices and remote workers to connect to the corporate network over different types of networks. In such an environment, enterprise networks have to guard against a wide array of security threats such as DoS attacks, viruses, worms, intrusions and illicit content. The traditional approach to layered security has not been fully effective in meeting these new security threats. By leveraging its unified multi-service architecture, NetDevices has adopted an innovative new approach to multi-layer security that has been designed to guard against the most sophisticated security threats.

With all networking services supported on the same device, the NetDevices system has the ability to apply security checks at the correct points in the data path, as illustrated in Figure 3 below. The least compute-intensive checks are done first and packets are forwarded to the routing service module only after all checks are complete. This ensures quick and efficient detection of security threats, while enhancing overall performance. In addition, a common classification methodology is used through which all packets are classified once at the ingress filter for appropriate treatment. This eliminates the need for content to be processed multiple times in sub-systems such as IDS/IPS and web filter.
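
A minimal sketch of the two ideas in this paragraph, classify once and run the cheapest checks first, is given below. The `Check` type, the relative `cost` values and the example rules are hypothetical and chosen only to make the ordering visible; they do not describe the actual NetDevices data path.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    cost: int                        # assumed relative compute cost
    passes: Callable[[dict], bool]   # receives the one-time classification


def classify(packet):
    """Classify the packet once at ingress; every later check reuses the result."""
    return {
        "src": packet["src"],
        "port": packet["port"],
        "is_web": packet["port"] in (80, 443),
    }


def inspect(packet, checks):
    """Run checks in ascending cost order and stop at the first failure."""
    flow = classify(packet)
    for check in sorted(checks, key=lambda c: c.cost):
        if not check.passes(flow):
            return f"drop:{check.name}"
    return "forward-to-routing"


# Illustrative rules only: a cheap address check and a costlier inspection step.
CHECKS = [
    Check("ids", 10, lambda f: f["port"] != 23),
    Check("acl", 1, lambda f: f["src"] != "10.0.0.66"),
]
```

A packet that fails the cheap `acl` check is dropped before the more expensive `ids` check runs, and only packets that pass every check reach the routing stage.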


[Figure 3]

NetDevices provides a wide range of pre-defined DoS attack filters out of the box. Furthermore, the NetDevices system allows enterprise IT staff to write their own customized DoS filters by leveraging the extensive set of granular classifiers available to them.
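
As an illustration of what a customized DoS filter might look like, the sketch below limits the packet rate per source address. It is written in Python for readability and does not reflect the NetDevices filter syntax, which this paper does not describe; the class name, the choice of source address as the classifier and the limits are assumptions.

```python
from collections import defaultdict, deque


class RateFilter:
    """Hypothetical DoS filter: allow at most max_packets per source per window."""

    def __init__(self, max_packets, window_seconds):
        self.max_packets = max_packets
        self.window_seconds = window_seconds
        self.seen = defaultdict(deque)   # source address -> recent arrival times

    def allow(self, src, now):
        """Return True if a packet from `src` arriving at `now` should pass."""
        times = self.seen[src]
        # Discard arrivals that have aged out of the window.
        while times and now - times[0] >= self.window_seconds:
            times.popleft()
        if len(times) >= self.max_packets:
            return False                 # over the limit: drop the packet
        times.append(now)
        return True
```

Each source is tracked independently, so a flood from one address does not affect traffic from other addresses.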

If the system is under a DoS attack and the main processor utilization is very high, thanks to the separate management plane and dedicated management processors, remote management access will be unaffected. Hence, system administrators can rapidly initiate corrective measures, such as the addition of a new DoS filter, to prevent further attacks. In the new security environment, there is a constant need for addition of new filters and security updates. Through its support for always available remote management access, the NetDevices system allows such preventive & corrective measures to be efficiently propagated to all branch office locations.

Non-disruptive Servicing

With a multi-service branch office network, there will be a constant need for service upgrades, configuration changes and new service additions. If an upgrade or configuration change in one service brings down other services, this by itself can be a major contributor to system downtime and loss of productivity. Hence, it is critical that an integrated multi-service system be architected to overcome this challenge. NetDevices solves this problem through its ModuLive™ operating system, which is a fully modular, always live software base that provides levels of availability and serviceability previously unavailable in enterprise branch office products.

[Figure 4]

The ModuLive OS architecture enables granular in-service software upgrades. Each service can be enabled or disabled with the ability to upgrade, rollback, fix or re-configure that service. New services can be initiated with no disruption to services in operation. The system allows new services to be dynamically inserted in the packet flow path. The ModuLive Service Manager supports comprehensive revision control and tracking of each service module with the ability to roll back to an older version if required.
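
Per-service revision tracking with rollback can be sketched as below. This is an assumed, simplified model of the behaviour described above, not the ModuLive Service Manager itself; the class and method names are hypothetical.

```python
class ServiceVersions:
    """Hypothetical per-service revision history with rollback."""

    def __init__(self):
        self.history = {}   # service name -> installed versions, newest last

    def upgrade(self, service, version):
        """Install a new version of one service without touching the others."""
        self.history.setdefault(service, []).append(version)

    def rollback(self, service):
        """Discard the newest version of a service and return the one restored."""
        versions = self.history[service]
        if len(versions) < 2:
            raise ValueError(f"no earlier version of {service} to roll back to")
        versions.pop()
        return versions[-1]

    def active(self, service):
        """Return the version of the service currently in use."""
        return self.history[service][-1]
```

Because each service keeps its own history, rolling back one service leaves the active version of every other service unchanged.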

The hardware architecture of the NetDevices unified services gateways supports live "plug & play" insertion and removal of cards. The ModuLive Chassis Manager supports dynamic detection of new hardware modules and configuration changes, thereby enabling seamless service continuity during hardware upgrades.

Due to the fully modular system design, failure of a service causes minimal or no disruption to other services in operation. The failed service can be restarted without impacting other services. This is in contrast to systems based on a monolithic software base, in which a failure of one service can impact all other services and lead to a complete loss of service.

Unified Management of Multiple Services

To deliver the full set of operational benefits resulting from service convergence, a multi-service platform has to go beyond simple integration to support unified management of all services using a common management system. The NetDevices management system provides a comprehensive, unified management system, accessible through a web-based interface or a CLI, to remotely manage all NetDevices-enabled branch office services. All services are managed via a common interface with granular, detailed instrumentation and control provided for all components and modules in the system. A sample screen shot of the unified NetDevices management system is included below.


[Figure 5: Sample screen shot of the unified NetDevices management system]

In a branch office network with multiple services, it is possible to have inadvertent conflicts between different services. For instance, the firewall access control policies may conflict with the routing policies. Since the NetDevices management system has full visibility over all services and built-in knowledge of the interactions between services, it supports an application-aware configuration process. A wizard-based approach is used to drive towards the right configuration for each service and automatically detect & resolve configuration conflicts. Furthermore, the common classification approach described earlier allows a uniform view of classification across all services, thereby facilitating consistency and ease of configuration.
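
The firewall-versus-routing conflict mentioned above can be illustrated with a simple overlap check. This is a sketch under stated assumptions: routes and firewall deny rules are reduced to CIDR prefixes, and the function name is hypothetical. The actual configuration process in the NetDevices management system is not documented in this paper.

```python
import ipaddress


def find_conflicts(routes, deny_rules):
    """Return (route, rule) pairs where a routed prefix overlaps a denied prefix."""
    conflicts = []
    for route in routes:
        route_net = ipaddress.ip_network(route)
        for rule in deny_rules:
            # A route towards a network that the firewall denies is flagged
            # so that it can be reviewed before the configuration is applied.
            if route_net.overlaps(ipaddress.ip_network(rule)):
                conflicts.append((route, rule))
    return conflicts
```

For instance, a route to `10.1.0.0/16` is flagged against a deny rule for `10.1.2.0/24`, while a route to `192.168.5.0/24` is not.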

A recent Sage Research study found that as many as 39% of network outages are caused by configuration errors. Given that one hour of downtime can cost up to $4.5 million (source: Yankee Group research), branch office networking solutions that avoid or minimize configuration errors and conflicts can deliver significant cost savings.

Comprehensive Remote Management

The distribution of sophisticated applications and services to branch offices requires skilled IT staff to manage branch office networks. This poses a serious problem for enterprises due to the difficulty of finding skilled resources and the cost of staffing each branch office with such resources. To overcome this challenge, enterprises need the ability to efficiently perform all management functions remotely from a central Network Operations Center (NOC) facility.

The NetDevices system enables enterprises to meet this requirement by providing 100% remote manageability. With its Lifeline management framework, all the sophisticated management capabilities described in earlier sections can be performed remotely while appearing to be local, thereby eliminating the need for on-site intervention and delivering substantial operational cost savings.

Granular visibility and control are provided for remotely performing all critical management functions such as system monitoring, trouble-shooting, service provisioning, configuration management and software upgrades. This enables the centralized NOC staff to manage a complete remote office network and multiple services without the need for truck rolls or on-site administration.

Summary

Enterprises are becoming more decentralized and distributing more and more applications to regional and branch offices. As a result, branch office networking requirements are changing and traditional solutions are unable to satisfy the new requirements. A purpose-built unified services platform is required to overcome the limitations of traditional point products and fully deliver the benefits of integrating multiple network and security services. NetDevices Unified Services Gateways are unique, purpose-built solutions that enable enterprises to distribute multiple services to branch offices and meet the most demanding requirements for serviceability and service resiliency.


 

Copyright © 2005-2008, NetDevices Inc. All rights reserved. NetD, NetDevices, the NetDevices logo,
ModuLive, LifeLine & OnePass are trademarks of NetDevices, Inc.