IATM: ISDN Automatic Test Manager

1. Brief overview of IATM

The ISDN Automatic Test Manager (IATM) was developed by City Computing to manage the fault-diagnostic process for the British Telecom integrated services digital network (ISDN).

It performs four primary functions:

  • It provides a unified view of all United Kingdom ISDN faults.
  • It allows users to observe automatically updating lists of fault reports and to distribute the workload flexibly across a workforce located at multiple "technical centres".
  • It provides a variety of "add-on" tools to assist and accelerate the diagnostic process.
  • It incorporates sophisticated automatic diagnostic and dispatch processes which work in unison with the user community. These processes are capable of dealing with the majority of faults, allowing the users to focus their efforts on the work that is too complex or unusual for an automated process to complete.

IATM uses the City Computing Task Browser technology to provide a coherent, flexible view of a variety of data. Users have found this straightforward to adapt to, and it has brought improved speed of operation and the ability to manage multiple fault reports without losing context.

The initial business case was focused on automatic diagnostics ("zero-touch" or "one-touch" customer fault handling). In this respect IATM has been very successful: it now automatically handles approximately 80% of all ISDN2 (Basic Rate ISDN) and ISDN30 (Primary Rate ISDN) fault reports without the need for manual diagnostics. The automation runs alongside the manual users, with their co-operation and interaction: they can resubmit fault reports for automatic analysis and can easily continue diagnostics from the conclusions the automation has reached. This has given the manual users a sense of ownership of the automatic processes and confidence in the outcomes. IATM has enabled the "de-skilling" of some routine ISDN diagnostic activities and has allowed more experienced users to progress difficult ISDN30 fault reports rapidly without the need to repeat the automatic analysis.

Since a 2004 upgrade to IATM, the automation process can be flexibly extended by means of on-line tables, which users with appropriate authority amend through the Task Browser interface.
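
As a rough illustration of table-driven automation, the sketch below looks up the next action in a table of rows rather than in program logic. The row layout, the field names and the function `choose_action` are assumptions made for this example; the actual structure of the IATM tables is not described here.

```python
# Hypothetical rule table: each row maps a test outcome to an action.
# None of these field names or values are taken from IATM itself.
RULES = [
    {"product": "ISDN2", "test_result": "line_open",
     "action": "dispatch_field_engineer"},
    {"product": "ISDN30", "test_result": "exchange_alarm",
     "action": "transfer_to_exchange_team"},
]


def choose_action(rules, product, test_result,
                  default="pass_to_technical_centre"):
    """Return the action of the first matching row, or the default."""
    for row in rules:
        if row["product"] == product and row["test_result"] == test_result:
            return row["action"]
    return default
```

In a scheme of this kind, adding a row changes the behaviour of the automation without any change to the code that reads the table.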

It is well known that BT is committed to developing and deploying its 21C network. ISDN30 will remain an important component within the 21C environment. BT has decided that IATM will continue to perform a significant automated testing role within the 21C network. Design work to make IATM ready for the 21C network rollout has already started.

2. Overview of IATM system and BT infrastructure

IATM determines that there is a fault on a particular ISDN directory number from either of two events. The fault may be reported manually, by the customer talking to a BT fault receptionist who enters the data onto the fault reception and dispatch computer system (known within BT as CSS, Customer Support System). Alternatively, or additionally, the exchange may detect that the directory number is faulty and deliver fault information (known as EFI - Exchange Fault Information) to IATM.

In order to test an ISDN circuit, IATM makes use of the same diagnostic (ISDN-testing) systems that are used by the staff in the technical centres. Primary diagnostics are conducted using a BT system called SwitchManager. There are approximately 20 of these systems spread throughout the United Kingdom, each of which handles specific exchanges and specific directory numbers.

When IATM has obtained the details of a fault, it identifies the particular exchange and SwitchManager system that will be able to examine the state of the circuit. It does this by checking a local database, which is updated regularly from SwitchManager. Once this information has been retrieved, IATM executes a complex series of test-scripts on the circuit in order to identify the fault. Once a fault has been tested, the conclusions are recorded back on the fault reception system (CSS). If IATM has been able to identify the repair agency that is responsible for fixing the fault, the fault is transferred to that agency by updating the records on CSS and, if appropriate, IATM can raise a "work request" using a field-engineer work-issuing system called WorkManager. If IATM has not been able to isolate the fault, the fault is passed through to the technical centre for manual progression – this is done via the IATM "task-browser" user interface.
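
The sequence described above can be outlined in code. This is a sketch only: the names `ROUTING_TABLE`, `FaultReport`, `handle_fault` and `run_test_scripts`, the directory numbers and the status values are all hypothetical and are not taken from IATM.

```python
from dataclasses import dataclass, field

# Hypothetical local routing table:
# directory number -> (exchange, SwitchManager system).
# It stands in for the local database refreshed from SwitchManager.
ROUTING_TABLE = {
    "DN-0001": ("EXCHANGE-A", "SWITCHMANAGER-01"),
    "DN-0002": ("EXCHANGE-B", "SWITCHMANAGER-07"),
}


@dataclass
class FaultReport:
    directory_number: str
    source: str                      # "CSS" (customer) or "EFI" (exchange)
    notes: list = field(default_factory=list)
    status: str = "new"


def handle_fault(report, run_test_scripts):
    """Route one fault report through the steps described in the text.

    run_test_scripts is a stand-in for the real test-scripts. It takes
    (exchange, switchmanager, directory_number) and returns the name of
    the responsible repair agency, or None if the fault was not isolated.
    """
    route = ROUTING_TABLE.get(report.directory_number)
    if route is None:
        report.notes.append("no routing entry; passed to technical centre")
        report.status = "manual"
        return report

    exchange, switchmanager = route
    agency = run_test_scripts(exchange, switchmanager, report.directory_number)

    if agency is None:
        # Fault not isolated: hand over for manual progression.
        report.notes.append("not isolated; passed to technical centre")
        report.status = "manual"
    else:
        # Conclusions recorded and the fault transferred to the agency.
        report.notes.append(f"tested via {switchmanager}; agency {agency}")
        report.status = f"transferred:{agency}"
    return report
```

For example, `handle_fault(FaultReport("DN-0001", "CSS"), lambda e, s, d: None)` leaves the report with status `"manual"`, which corresponds to the hand-over to the technical centre.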

Historically, most technical centre staff have had direct access to the CSS fault reception systems, SwitchManager testing systems and WorkManager systems. IATM assists these users with write-back and update tools for these systems, plus the ability to view IATM information through a number of windows-based tools within the City Computing Task Browser.

In addition to automatic diagnostics, IATM facilities include:

  • Monitoring and distribution of fault reports.
  • Managing ("job controlling") the progression of fault reports, their escalation and the updating of customers by email and SMS.
  • Bulk-handling of certain categories of associated faults.
  • A database giving the relationship of directory numbers to exchanges and SwitchManager systems.
  • A database giving the relationship between bearer-id and directory number.

Within the first year of implementation, IATM automatically and correctly diagnosed, dispatched or cleared approximately 40% of ISDN2 (Basic Rate ISDN) fault reports.

Over subsequent years these techniques have been improved and also extended to ISDN30 (Primary Rate ISDN). IATM now automatically handles approximately 80% of all ISDN faults without technical centre user intervention. IATM's sister system ("IFETS" – ISDN Front End Test System) was developed to make IATM script automation available to field engineers over a touch-tone telephone menu. This greatly reduced the need for the technical centre to assist field engineers with additional diagnostics.

3. IATM Resilient Architecture

City Computing have built complete systems on processors ranging in size from dedicated digital-signal-processing boards to micro, mini and mainframe systems. The company has a track record in multi-processor systems and in dimensioning the hardware to meet the requirements. IATM is implemented using Sun Solaris on a network of 4 identical hardware-resilient machines. The IATM server software is implemented in such a way that it will run on any Unix-compliant system, and the design is such that more processors can be included to scale up the system if needed. The 4-machine solution has proved a cost-effective, reliable and resilient way of implementing the IATM requirements.

The IATM system stores all its information in a distributed database. A high degree of resilience is achieved by the innovative technique of each software component racing to handle aspects of the work progression. The first component to take ownership of the task carries the task through. This means that as long as any one of the systems has access to the resources it needs, the work will progress – it does not matter if any of the systems is partially compromised, as the affected components on that system will not enter the "race". During normal operation this technique achieves automatic load balancing – the system doing more work becomes more loaded, slows down in the "race" and allows other systems to win their share of the work. In the event of any major or minor failure, the system adapts and rebalances to carry on the work. The system ensures that alerts are generated to key support staff, but it has been demonstrated that IATM can gracefully reduce from a system of 4 machines to a single machine supporting the entire user community without the users being aware of any changes or needing to perform any reconfiguration. When systems rejoin the network, they automatically align to the latest data and resume participation.
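
The racing technique can be illustrated with a small sketch. It assumes a shared store that offers an atomic "claim if unowned" operation; here a lock-protected dictionary stands in for the distributed database, and threads stand in for machines. The names `TaskStore`, `try_claim`, `worker` and `run_race` are hypothetical and do not describe the IATM implementation.

```python
import threading


class TaskStore:
    """Stand-in for the shared database: task id -> owning component."""

    def __init__(self, task_ids):
        self._owner = {task_id: None for task_id in task_ids}
        self._lock = threading.Lock()

    def try_claim(self, task_id, component):
        """Atomically take ownership; only the first caller for a task wins."""
        with self._lock:
            if self._owner[task_id] is None:
                self._owner[task_id] = component
                return True
            return False

    def owners(self):
        with self._lock:
            return dict(self._owner)


def worker(name, store, task_ids, healthy=True):
    """Race for every task; a compromised component stays out of the race."""
    won = []
    if not healthy:
        return won
    for task_id in task_ids:
        if store.try_claim(task_id, name):
            won.append(task_id)  # the winner carries the task through
    return won


def run_race(component_health, task_ids):
    """Start one thread per component and return the final task owners."""
    store = TaskStore(task_ids)
    threads = [
        threading.Thread(target=worker, args=(name, store, task_ids, healthy))
        for name, healthy in component_health.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return store.owners()
```

Calling `run_race({"m1": True, "m2": False, "m3": False, "m4": False}, range(10))` shows the degraded case: every task is still claimed, all by `m1`. In this toy version the split between healthy components depends on thread scheduling; it does not model the effect described above, in which a heavily loaded system slows down and wins less work.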

4. Internal Software Design of IATM

IATM is designed in an extremely modular fashion, using CityNet networking software to communicate between modules, both within a single machine and between machines. The modular decomposition of IATM is illustrated below.

IATM has a central management module which takes care of starting and shutting down the IATM system. This also allows for the on-line substitution of modules with updated versions and can re-launch modules that terminate prematurely (up to a restart limit).
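
The relaunch behaviour can be sketched as a simple supervision loop. The function `supervise` and the convention that a module reports a clean or premature exit through its return value are assumptions made for this example only.

```python
def supervise(start_module, restart_limit):
    """Run a module, relaunching it after premature termination.

    start_module is a hypothetical callable that runs the module and
    returns True on a clean shutdown or False if it ended prematurely.
    Returns (number of relaunches performed, whether it ended cleanly).
    """
    restarts = 0
    while True:
        if start_module():
            return restarts, True       # clean shutdown, stop supervising
        if restarts >= restart_limit:
            return restarts, False      # limit reached, give up
        restarts += 1                   # relaunch the module
```

With a limit of 3, a module that always fails is started four times in total (the initial launch plus three relaunches) before the supervisor gives up.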

There is a central error-logging facility by which any problems with the system can be recorded and analysed. Up to 10 days of error history is maintained, so that any live-site problems can be fully diagnosed.
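
A bounded error history of this kind might look like the following sketch. Only the 10-day figure comes from the text; the class `ErrorLog`, its methods and the entry layout are hypothetical.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=10)  # the text states up to 10 days of history


class ErrorLog:
    """Hypothetical central log that discards entries older than RETENTION."""

    def __init__(self):
        self._entries = []  # list of (timestamp, module, message)

    def record(self, module, message, now):
        """Store an entry, then drop anything outside the retention window."""
        self._entries.append((now, module, message))
        cutoff = now - RETENTION
        self._entries = [e for e in self._entries if e[0] >= cutoff]

    def history(self, module=None):
        """Return retained entries, optionally for a single module."""
        return [e for e in self._entries if module is None or e[1] == module]
```

Passing the current time in as `now` keeps the sketch easy to exercise with fixed dates, for example `log.record("tester", "no route", datetime(2005, 1, 6))`.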
