Distributed Testing Platform: design concept

Author: Stanislav Sinyagin
Document status: concept draft

UPD: the project name is now Mooxu

Introduction

In many network environments, especially in those of large ISPs or carriers, there’s a need to periodically execute some network tests. For example, an IPTV transport provider would need to make sure that all important multicast streams are available in every part of its edge network. Or a customer support engineer would need to collect byte and packet counters from a particular network port every 5 seconds.

The new software system (project name: Mooxu) is designed to provide an open-source framework that lets network operators build a testing environment tailored to their needs. A number of open-source testing probe modules will also be available.

Distributed architecture: the system consists of a central dispatcher, data aggregators, and the probes, and each of them can be located anywhere in the network on as many devices as necessary.

HTTP API: all system components communicate through a well-defined data protocol on top of HTTP transport.

Secure communication: all API clients authenticate themselves using MD5 digests and shared passwords. SSL may be used for content encryption if data privacy is required and enough CPU resources are available.
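The draft does not define the exact digest scheme. As one possible reading, the sketch below signs a request body with HMAC-MD5 keyed by the shared password and verifies it on the receiving side. Python is used purely for illustration; the function names and the choice of HMAC are assumptions, not part of the design.

```python
import hashlib
import hmac


def sign_request(shared_password: str, body: bytes) -> str:
    """Return a hex HMAC-MD5 digest of the request body (assumed scheme)."""
    key = shared_password.encode("utf-8")
    return hmac.new(key, body, hashlib.md5).hexdigest()


def verify_request(shared_password: str, body: bytes, digest: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    expected = sign_request(shared_password, body)
    return hmac.compare_digest(expected, digest)
```

In such a scheme the client would send the digest alongside the request, and the receiver would look up the shared password by component name before calling `verify_request`.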

Non-restrictive licensing: all components are distributed under a non-restrictive license (MIT?) that allows proprietary extensions without any limits.

Modular software design: the core software distribution provides the API and the tools for controlling test execution and transporting the result data. In addition, a number of open-source plugin packages implement particular tests for various network technologies.

Dispatcher

The central dispatcher is the only component that is not distributed (although the design must not exclude redundancy and failover schemes).

The dispatcher is responsible for controlling the test execution: configuring, starting, and aborting tests, reporting their status, and pointing to the data store.

The dispatcher provides two different APIs: one for the distributed system components, and another for northbound operations support systems (OSS).

The dispatcher integrates with any standard HTTP server software, such as Apache.

The dispatcher maintains a database of all remote components and their properties. Each remote component is identified by a unique name and a shared secret.

Each test consists of a set of attributes:

  • core attributes: unique test ID, test classname, start time, duration, owner, probe name, data store name, …
  • class-specific attributes that define the particular test
  • custom attributes that the local system administrator assigns depending on the requirements
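The three attribute groups above could be modeled roughly as follows. This is only a sketch in Python for illustration: the field names, the units, and the `PortCounters` class name are assumptions, and the draft leaves the full list of core attributes open.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class NetworkTest:
    # Core attributes named in the draft
    test_id: str
    classname: str
    start_time: int           # assumed: UNIX timestamp in seconds
    duration: int             # assumed: seconds
    owner: str
    probe_name: str
    data_store_name: str
    # Class-specific attributes that define the particular test
    class_attrs: Dict[str, Any] = field(default_factory=dict)
    # Custom attributes assigned by the local system administrator
    custom_attrs: Dict[str, Any] = field(default_factory=dict)

    def end_time(self) -> int:
        """Time at which the test is expected to finish."""
        return self.start_time + self.duration


# Hypothetical example: port counters collected every 5 seconds for 5 minutes
example = NetworkTest(
    test_id="t-001",
    classname="PortCounters",
    start_time=1317485760,
    duration=300,
    owner="support",
    probe_name="probe-1",
    data_store_name="store-1",
    class_attrs={"interval": 5, "port": "ge-0/0/1"},
    custom_attrs={"ticket": "12345"},
)
```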

Component initialization and configuration

Each remote component (a probe or a data store) contacts the dispatcher and retrieves its configuration.

At the time of roll-out, each component is preconfigured with its unique name, shared password, and the URL of the dispatcher.

Each component is a permanently running process that controls other processes and tasks. The components contact the dispatcher and refresh their configuration periodically (the refresh period is part of the configuration supplied by the dispatcher).

Each component reports its local inventory to the dispatcher: the OS type and version, core software and the plugin versions, etc.
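A minimal sketch of what such an inventory report might contain, written in Python for illustration. The field names, the JSON encoding, and the component and plugin names are assumptions; the draft does not specify the wire format.

```python
import json
import platform


def build_inventory(component_name, core_version, plugins):
    """Collect local inventory data for reporting to the dispatcher."""
    return {
        "name": component_name,
        "os_type": platform.system(),       # e.g. "Linux"
        "os_version": platform.release(),   # kernel or OS release string
        "core_version": core_version,
        "plugins": dict(plugins),           # plugin name -> version
    }


# Hypothetical component and plugin names
inventory = build_inventory("probe-1", "0.1", {"port-counters": "0.1"})
payload = json.dumps(inventory)
```

The resulting `payload` string is what a component would submit to the dispatcher over the HTTP API.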

Data store

The system requires at least one data store component to save the test result data. Multiple data stores can be distributed across the network, depending on its topology, available bandwidth, and the system requirements.

Data stores also allow data replication and aggregation. For example, a test collects data every 2 seconds and saves the results every 60 seconds to a local data store. The data is then transported to the central location only every 5 minutes, because the end user does not need it there more frequently, while the local data store still gives a quick response time for better interactivity.
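The arithmetic behind this example can be sketched as follows (Python, for illustration). The draft does not say how data would be aggregated, so averaging each 60-second batch is an assumption, as are all names used here.

```python
from statistics import mean

SAMPLE_INTERVAL = 2      # seconds between measurements
SAVE_INTERVAL = 60       # seconds between writes to the local data store
FORWARD_INTERVAL = 300   # seconds between transfers to the central location

samples_per_save = SAVE_INTERVAL // SAMPLE_INTERVAL     # 30 samples per write
saves_per_forward = FORWARD_INTERVAL // SAVE_INTERVAL   # 5 writes per transfer


def aggregate(samples, group_size):
    """Average consecutive groups of `group_size` samples (assumed method)."""
    return [
        mean(samples[i:i + group_size])
        for i in range(0, len(samples), group_size)
    ]


# Five minutes of 2-second samples: 150 values reduced to 5 averages
raw = list(range(150))
forwarded = aggregate(raw, samples_per_save)
```

With these intervals, each transfer to the central location covers 150 raw samples; if only the averages are forwarded, that shrinks to 5 values per transfer.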

Data stores also receive event notifications from the probes and forward them to an external event aggregator, such as a syslog server or an SNMP trap collector.

The data store exposes an HTTP-based data access API which allows external programs to retrieve the test results.

Probes

A probe is a process that runs on a probe machine. It does not perform any tests itself, but spawns the test processes and controls their execution. A probe is identified by a unique name, and in theory, multiple probes may run on the same server.
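The spawn-and-control role of the probe could look roughly like this. It is a sketch in Python for illustration only: the function name and the timeout-based abort are assumptions, and a real probe would supervise many test processes at once.

```python
import subprocess
import sys


def run_test_process(argv, max_seconds):
    """Spawn one test process and return its exit code.

    The process is killed if it runs longer than `max_seconds`.
    """
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=max_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()          # abort a test that overruns its duration
        return proc.wait()   # reap the killed process


if __name__ == "__main__":
    # Stand-in for a real test executable: a child interpreter that exits at once
    code = run_test_process([sys.executable, "-c", "raise SystemExit(0)"], 10)
    print("test process exited with", code)
```

The probe only tracks the exit status here; it does not handle any result data.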

The test processes are also responsible for sending the result data to the data store.

The test processes may generate events which would be dispatched with higher priority to the event management system. This enables the platform to run long-term or permanent tests and generate alarms.
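One way to give events precedence over ordinary result data is a priority queue for outgoing messages. The sketch below (Python, for illustration) is an assumption about how this might be done; the draft does not describe the mechanism, and all names are hypothetical.

```python
import heapq
import itertools

EVENT, RESULT = 0, 1   # a lower number is sent first


class OutgoingQueue:
    """Outgoing message queue in which events overtake result data."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # keeps FIFO order within a priority

    def put(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._seq), message))

    def get(self):
        _priority, _seq, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


queue = OutgoingQueue()
queue.put(RESULT, "counters batch 1")
queue.put(EVENT, "stream lost")
queue.put(RESULT, "counters batch 2")
order = [queue.get() for _ in range(len(queue))]
```

In this example the event is sent first, followed by the two result batches in their original order.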

The probe installation consists of the core package, as well as at least one plugin that implements particular tests.

Software implementation

All software components are distributed as GNU Autoconf-based packages, and are installable with the standard Autoconf procedure (./configure && make && make install). This guarantees a uniform installation procedure and easy packaging for any standard UNIX-like OS.

The main programming language for all components is planned to be Perl. The common API can also be implemented in C and other programming languages, depending on customer needs.

  1. #1 by Martin Egloff on October 1, 2011 - 4:16 pm

    Sounds very interesting indeed. What kind of tests is the software supposed to perform? Roundtrip, Jitter, Delay?

    • #2 by txlab on October 1, 2011 - 4:25 pm

      It’s a general framework for any kind of test or monitoring that a particular implementation requires. Roundtrip, jitter, and delay tests, and even SIP call emulation, could easily be integrated too. At the moment I have a customer who wants to see the port traffic at 5-second intervals.
