2000/2001 Hons reports

Autonomous Agents for Systems Management within a Microsoft Windows NT Environment

Jim Holmes [2000/01]

Stop Press: Final Report

Initial Specification

My company uses an IBM AS/400 as its main enterprise platform but, as with most companies these days, more and more of our applications are being moved from the AS/400 to NT Server. As part of this move I have built, and am continuing to evolve, our company intranet services running on IIS 4.0. We use the intranet for traditional web-based applications such as on-line documentation, but I have also used it as a delivery mechanism for a less traditional client-server application.

This application uses Microsoft COM technologies, such as remotely distributed ActiveX controls, to enter and display SPC information in the form of charts, and Remote Data Services to collect data from both the AS/400 and a Microsoft SQL Server 7.0 database server. The system also uses remote mail via the SMTP server installed as part of Microsoft's Windows NT 4.0 Option Pack.

The AS/400 runs numerous jobs, one of which shuts down the subsystem that handles client interaction and ODBC support. When the intranet server attempts to access data stored on the AS/400 under these conditions, the IBM-provided ODBC driver causes a general protection error and falls over. Because the intranet server uses in-process COM components, the IIS service stops, preventing access to the intranet site.

For my project I am considering writing an application that would run as an NT service and monitor other NT services. The user would be able to select NT services of interest, either on their local machine or on remote machines, enable monitoring of those services, and specify actions to be taken if a service changes from its desired state. To facilitate this, the project would require a distributed architecture in the form of independent agents. This would cut down on network traffic and would allow an agent to continue monitoring even when the administration console was unavailable.

Interim Report

Introduction

Computer systems have evolved considerably since the early days of the first microprocessor. With the advent of networked computer systems and the various architectures that evolved from them, the maintenance of these systems and the demand for higher availability have placed greater responsibility on systems administrators.

With the widespread adoption of Microsoft's Windows NT and Windows 2000 operating systems in areas traditionally served by mainframes and Unix, the opportunity exists to explore a systems management architecture that aids the administrator in the day-to-day monitoring of these systems.

Since the release of Windows NT 4.0, a large number of organisations have deployed NT as their operating system of choice. They have done so for many different reasons, not least Microsoft's component technology, the "Component Object Model" or COM. COM allows applications to integrate easily with one another, both locally and across a network. This ability to share application data has proved to be a powerful driving force behind the success of the NT platform.

Microsoft has further provided enterprise tools in various forms: the NT Option Pack, which provides Web services, transaction services and message delivery services, and the BackOffice suite of applications, which provides enterprise database, e-mail and systems management services.

Project Background

This project has been born out of a need to provide high availability for a mission-critical Web-based enterprise system. The system is based on Microsoft's Internet Information Server, which runs as a service on Microsoft's Windows NT and 2000 Server platforms.

The project is a vehicle for researching various topics, including componentware and distributed architectures, but its main aim is to demonstrate the possibility of using an autonomous agent that the systems administrator can install on one machine and that will then wander around the domain, monitoring the services it has been instructed to monitor.

The notion of using an agent in systems management software is nothing new. Most of these "agents", however, depend on some type of installation procedure, whether the agent is installed manually, via a logon script or by a software deployment system. The drawbacks are obvious:

·Most servers run in a lights-out condition. If the agent depends on a logon script, it will not be run until someone logs on.

·If the agent needs to be manually installed, the system administrator must remember to install the agent on every system to be monitored.

·If new systems are installed, or existing systems are reconfigured to run new services, the system administrator must remember to install or modify the agent.

An autonomous agent on the other hand can continuously wander the Domain looking for systems that have services running on them that it has been configured to monitor. If it finds such a system it can then deploy itself on that system, spawning a new agent to wander the rest of the Domain looking for another suitable host.

In such a scenario the system administrator can introduce new servers or reconfigure existing servers in the knowledge that the agent will detect whether each server is running a service that needs to be monitored and, if so, deploy itself onto the server and begin monitoring.


Project Objectives

This project hopes to determine the possibility of creating an "Administration Agent" that is autonomous in nature, having the ability to wander round a Windows NT domain, determining what services are running and monitoring services that it is interested in. The project also aims to demonstrate the use of componentware in a distributed architecture using Microsoft's Component Object Model.

Operational Specification

The aim is to allow the system administrator to introduce an agent into an NT Domain, configure it to look for any service deemed necessary for the proper operation of the system or systems, and then allow the agent to "wander" around the Domain seeking out any services that it should monitor.

The agent should have the ability not only to monitor the required services but also to re-start the service in the event of failure or stop a service from running if so desired. This ability should be coupled with the ability to notify the relevant administration personnel that the service has failed.

It is also desirable to have some means to control the number of re-start attempts before the agent either stops attempting to re-start the service or pauses for a pre-determined duration before attempting another re-start.
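The restart limit described above can be sketched as a small piece of bookkeeping. This is a minimal illustration only, not the agent's actual design: the names RestartPolicy, RestartTracker and RestartAction are hypothetical, and the real agent would act on the returned decision through the SCM (for example with the Win32 StartService call) rather than merely reporting it.

```cpp
#include <chrono>

// Hypothetical restart policy; the names and fields are illustrative only.
struct RestartPolicy {
    int maxAttempts;                     // restarts allowed before backing off
    std::chrono::seconds pauseDuration;  // wait applied once the limit is hit
};

enum class RestartAction { RestartNow, Pause, GiveUp };

class RestartTracker {
public:
    RestartTracker(RestartPolicy policy, bool retryAfterPause)
        : policy_(policy), retryAfterPause_(retryAfterPause) {}

    // Called each time the monitored service is found stopped.
    RestartAction onServiceStopped() {
        if (attempts_ < policy_.maxAttempts) {
            ++attempts_;
            return RestartAction::RestartNow;
        }
        if (retryAfterPause_) {
            attempts_ = 0;  // a fresh round of attempts follows the pause
            return RestartAction::Pause;
        }
        return RestartAction::GiveUp;
    }

    // Called when the service is seen running again.
    void onServiceRunning() { attempts_ = 0; }

    std::chrono::seconds pauseDuration() const { return policy_.pauseDuration; }

private:
    RestartPolicy policy_;
    bool retryAfterPause_;
    int attempts_ = 0;
};
```

With a limit of two attempts and pausing enabled, a service that keeps failing would produce two RestartNow decisions, then a Pause, after which the count starts again.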

In order to achieve this the agent must be able to be deployed remotely by another agent. A typical scenario would be for a deployed agent to detect a machine on the network that is running a service that has been selected for monitoring. The existing agent would then copy its executable file over to the machine and remotely start the new agent.

To allow the agent to restart after the system has been rebooted, there must be some mechanism for starting the agent automatically. There are a number of ways to start an executable file automatically, but most of these require an interactive user to be logged on to the system.

To perform its duties as a background process, the agent must run as an NT service. An NT service is just a standard executable with the addition of some control interfaces that allow the Service Control Manager (SCM) to control and interrogate the service.

There are also some security considerations to be taken into account to allow the service to access the network. This entails creating a new account with the correct security rights for the service to run under. The default account is the system account. While this account gives full access to the local machine, it will not allow a service running under it to access network resources.

The agent will therefore contain a number of objects that provide various functions. These functions are as follows:

The agent's primary task is to monitor predetermined services, restarting them if they stop or stopping them if they are required not to run. The agent will therefore require a service-monitoring component. This component will be required to keep a list of all services that the agent is interested in, and to interact with the Service Control Manager (SCM) in order to track the status of, and control, the desired services.

In order to reduce network traffic and agent collisions on the network some form of control resolution must be put in place. This would allow one of the deployed agents to become the master agent allowing it to co-ordinate the other agents in the domain.

In order to track the appearance of new systems within the domain the agent will require some way of determining what systems are available and if the target services are running on those machines. This will require a machine-tracking component. The component should be able to determine the address of any Windows NT / 2000 machines that are members of the domain and then interrogate them to see if any target services are running on those machines. If the agent is considered to be the master agent then the component should keep track of all known machines and be able to identify machines that have agents running on them.

Provision for control of the agent to allow the addition of services and the removal of services to be monitored.

Provision for specific machines to be excluded from the agent deployment list. This would allow the administrator to stop deployment of the agent on specific machines.
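The decision made by the machine-tracking component, combined with the exclusion list, can be sketched as a simple filter. This is an illustration under stated assumptions: the MachineInfo structure and the selectDeploymentTargets function are hypothetical, and the sketch takes the list of machines and their running services as given rather than discovering them on the network.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical snapshot of one machine found in the domain.
struct MachineInfo {
    std::string name;
    std::set<std::string> runningServices;
    bool hasAgent;  // true if an agent is already deployed there
};

// Returns the names of machines an agent should deploy to: those running
// at least one target service, not already hosting an agent, and not on
// the administrator's exclusion list.
std::vector<std::string> selectDeploymentTargets(
        const std::vector<MachineInfo>& machines,
        const std::set<std::string>& targetServices,
        const std::set<std::string>& excludedMachines) {
    std::vector<std::string> targets;
    for (const MachineInfo& m : machines) {
        if (m.hasAgent || excludedMachines.count(m.name) != 0) {
            continue;  // nothing to do on this machine
        }
        for (const std::string& svc : m.runningServices) {
            if (targetServices.count(svc) != 0) {
                targets.push_back(m.name);
                break;  // one matching service is enough
            }
        }
    }
    return targets;
}
```

The comparison here is case-sensitive for simplicity; a real implementation would need to follow whatever naming rules the SCM applies to service names.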


Implementation

In order to implement the agent's functionality, various low-level API calls will be required. If the agent is to run as an NT service, it must integrate with the Service Control Manager (SCM). By its very nature, a service must run as a multithreaded application in order to comply with the requirements of the SCM. To make the agent more efficient, it will also be designed to supply its own functionality on multiple threads.

This effectively rules out Visual Basic as the programming language of choice. To meet the multithreading requirements of the agent, only C/C++ and Java remain as viable choices.

The drawback of using Java is its dependency on the JVM. If a server had target services running on it but no JVM, the agent would not be able to deploy. With this in mind, the only choice capable of deploying to any Win32 platform is C/C++, and in this case C++. The choice of programming language is also based on my experience and on the ability to utilise Visual C++'s Active Template Library (ATL).

ATL provides a lightweight framework for COM development based on paradigms similar to those used in the Standard Template Library, where a framework of template classes is used to provide additional functionality with a low overhead. Microsoft expanded on this functionality by providing a suite of C/C++ macros that allow the developer to tailor ATL to their specific requirements. Functionality such as the type of threading model and how the object is instantiated can easily be modified.

This document is not intended to provide detailed information on the use of ATL, but where relevant the use of ATL will be discussed. Version 6 of Microsoft's Visual C++ provides a variety of wizards designed to make the use of ATL for standard projects easier for the developer. One of these wizards provides a skeleton framework for creating an NT service as a COM server. The initial implementation of the agent will be provided by using this wizard.

The wizard-generated code has a number of failings, some of them minor and some more serious. Starting with the smaller problems, the wizard uses the same name for both the service name and the service display name. This is a minor inconvenience, but it would be more professional to have the Service Display Name provide a more meaningful description of the service that can be viewed from the Service Control Manager.

On a more serious note, the wizard by default uses a single thread for COM services. It also serialises access to all objects through this one thread. In order for COM to serialise access to objects through a single thread a Windows message loop must be employed. This requires the service to use unnecessary resources to process the loop. This has an effect on overall performance and scalability of the service. It would be far better to utilise a kernel event object and allow the service to become blocked whilst waiting for the event to become signalled.
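The blocking behaviour proposed above can be sketched in portable C++. The class below is an assumption made for illustration: it uses std::condition_variable as a stand-in for a Win32 kernel event object, which in the real service would be created with CreateEvent, signalled with SetEvent and waited on with WaitForSingleObject. The name ManualResetEvent is hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Portable stand-in for a manual-reset kernel event; not the Win32 API itself.
class ManualResetEvent {
public:
    // Marks the event as signalled and wakes every waiting thread.
    void set() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signalled_ = true;
        }
        cv_.notify_all();
    }

    // Returns the event to the non-signalled state.
    void reset() {
        std::lock_guard<std::mutex> lock(mutex_);
        signalled_ = false;
    }

    // Blocks the calling thread until set() is called or the timeout
    // expires. Returns true if the event was signalled.
    bool waitFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return signalled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signalled_ = false;
};
```

In a service built this way, the main thread would typically wait on a stop event of this kind, and the handler that receives the stop request from the SCM would call set(). The waiting thread consumes no processor time while blocked, which is the advantage over a message loop argued for above.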

Most of the above criteria will require the use of the Win32® API. This will provide the necessary functions required in order to achieve the above functionality. This coupled with the use of COM/DCOM should provide a robust scalable agent capable of carrying out the above tasks.

It is intended that the agent be configurable using COM/DCOM. The agent will house a COM server whose interface will allow target services to be added to and removed from the agent. The COM server will implement automation interfaces, allowing the COM object to be accessed from Visual Basic, VBScript, JScript, C/C++, Java or any other COM-compliant language.

By implementing the interface in this manner the agent can be configured by any object that knows about the COM object that the service provides. This allows any application that is capable of manipulating COM objects to interact with the Agent whether the application is located on the same machine or on a remote machine across the network.
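The shape of such a configuration interface can be sketched in plain C++. This is not COM code and not the agent's actual interface: the names IAgentConfig and AgentConfig and their methods are hypothetical. A real COM version would be declared in IDL, would derive from IDispatch to support automation clients, and would return HRESULT values rather than bool.

```cpp
#include <set>
#include <string>

// Hypothetical configuration interface, shown as an abstract C++ class.
class IAgentConfig {
public:
    virtual ~IAgentConfig() = default;
    virtual bool addTargetService(const std::string& serviceName) = 0;
    virtual bool removeTargetService(const std::string& serviceName) = 0;
    virtual bool isTargetService(const std::string& serviceName) const = 0;
};

// Simple in-memory implementation holding the list of monitored services.
class AgentConfig : public IAgentConfig {
public:
    // Returns false if the service was already being monitored.
    bool addTargetService(const std::string& serviceName) override {
        return services_.insert(serviceName).second;
    }

    // Returns false if the service was not in the list.
    bool removeTargetService(const std::string& serviceName) override {
        return services_.erase(serviceName) == 1;
    }

    bool isTargetService(const std::string& serviceName) const override {
        return services_.count(serviceName) != 0;
    }

private:
    std::set<std::string> services_;
};
```

Keeping the interface separate from its implementation mirrors the COM approach, where clients such as an administration console see only the interface and never the object behind it.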

