1 Introduction
At the ULPGC's Network Division we administer several servers that run
critical network services such as DNS, DHCP and network monitoring. Because
these services are critical, we run a number of scripts on every server that
check the health of these services and try to fix basic error conditions.
We needed a way to distribute all these scripts, along with the configuration
files of the important services, from a centralized location, with small
differences to adapt them to each host. We also needed a way to record every
change to the configuration files, so that we could detect when a particular
error was introduced (and who introduced it ;-)). We also wanted to centralize
all network incident notifications and alarm management.
To meet all these needs, we developed PICA. With PICA we have a central
repository of configuration files and alarm scripts. This repository is
managed using CVS, so we can recover old versions, see changelogs, and let
several admins work concurrently. In practice, every sysadmin has a local copy
of the working tree, and CVS does all the dirty work.
In this scenario PICA is used to distribute the configuration files and alarm
scripts to the various servers. PICA uses SSH to establish secure connections
to the remote servers, which is very convenient, since we were already using
SSH with RSA authentication to access all remote servers.
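
As a rough illustration of what pushing a file over SSH with key-based
authentication involves, here is a minimal Python sketch. It is not PICA's own
code: the function name, the host names, the file paths and the default "root"
user are all hypothetical. It only builds the scp command line; BatchMode=yes
makes scp fail instead of prompting for a password, so it relies on the key
authentication already in place.

```python
import shlex


def build_scp_command(local_path, host, remote_path, user="root", identity_file=None):
    """Return the argv list that copies one file to a remote host over SSH.

    All names and defaults here are illustrative assumptions, not PICA's API.
    """
    # BatchMode=yes disables interactive prompts, so only key auth can succeed.
    cmd = ["scp", "-o", "BatchMode=yes"]
    if identity_file is not None:
        # Optional private key to use instead of the default identity.
        cmd += ["-i", identity_file]
    cmd += [local_path, f"{user}@{host}:{remote_path}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical hosts; print the commands rather than running them.
    for host in ["dns1.example.org", "dhcp1.example.org"]:
        print(shlex.join(build_scp_command("named.conf", host, "/etc/named.conf")))
```

Building the command as a list, rather than as a single shell string, avoids
quoting problems when paths contain spaces.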
The alarm scripts send incident notifications and service status reports to
our central NetSaint server using asynchronous checks (see the NetSaint
documentation). If a critical error is detected, an alarm is also sent via
e-mail and as an SMS message to the sysadmin's mobile phone.
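
The notification rule described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the function
name, the channel labels and the numeric status values (0, 1, 2) are
hypothetical and are not taken from PICA or from the alarm scripts themselves.

```python
# Assumed status values for illustration only.
OK, WARNING, CRITICAL = 0, 1, 2


def channels_for(status):
    """Return the notification channels an alarm script would use for a status."""
    # Every result is reported to the central monitoring server.
    channels = ["netsaint"]
    if status == CRITICAL:
        # Critical errors additionally trigger e-mail and SMS alarms.
        channels += ["email", "sms"]
    return channels
```

With this rule, channels_for(WARNING) reports only to the central server, while
channels_for(CRITICAL) also includes the e-mail and SMS channels.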