Imagine a hard day when your web server goes down for some unknown reason.
How helpful would it be if someone (or something) could tell you, "OK, big guy, this was the error, and I've corrected it"?
NAGIOS is a network monitoring system with a web-based interface that tracks the health of servers and the services they provide. It does this by periodically polling the server/service with a health-checking script. If it detects what it believes is a failure state based on repeated health-check failures, it will note the specific server and take actions such as paging and emailing system administrators.
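The "repeated health-check failures" idea can be sketched in a few lines of Python. This is only an illustration of the counting logic, not NAGIOS code: the class name `FailureTracker` and the threshold of three attempts are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FailureTracker:
    """Counts consecutive failed polls for one server/service (hypothetical helper)."""

    max_attempts: int = 3          # assumed threshold, not a NAGIOS default
    consecutive_failures: int = 0

    def record(self, check_passed: bool) -> bool:
        """Record one poll result; return True once the failure is confirmed."""
        if check_passed:
            # A single successful check clears the streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.max_attempts


tracker = FailureTracker(max_attempts=3)
results = [tracker.record(ok) for ok in [True, False, False, False, True]]
# results == [False, False, False, True, False]
```

Requiring several failures in a row keeps a single dropped poll from paging anyone; only the third consecutive failure is treated as a real outage here.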
Cfengine is a policy engine that detects the delta (difference) between a system's current configuration state and the optimal configuration state defined by policy. It was developed by Mark Burgess of Oslo University College. Cfengine has many functions that facilitate self-healing. However, Cfengine runs only periodically because its delta detection process is too computationally intensive to run continuously. In most deployments, Cfengine runs once an hour.
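To make the notion of a delta concrete, here is a minimal Python sketch that compares a current state against a desired state. It is not how Cfengine is implemented; the function name `config_delta` and the dictionary keys are invented for illustration.

```python
def config_delta(current: dict, desired: dict) -> dict:
    """Return {key: (current_value, desired_value)} for each key that deviates from policy.

    Keys missing from `current` are reported with a current value of None.
    """
    delta = {}
    for key, want in desired.items():
        have = current.get(key)
        if have != want:
            delta[key] = (have, want)
    return delta


# Hypothetical state for a web server; the keys are made up for this example.
current_state = {"httpd_running": False, "httpd_conf_mode": "0644"}
desired_state = {"httpd_running": True, "httpd_conf_mode": "0644"}

delta = config_delta(current_state, desired_state)
# delta == {"httpd_running": (False, True)}
```

A repair step would then act only on the entries in `delta`, leaving everything that already matches policy untouched.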
By combining these two software packages, you can create a self-healing capability on your network: configure NAGIOS to perform health checks on a server and, in the event of a failure, to invoke Cfengine on the remote server to repair the fault. The system will operate in a secure manner with little system or network overhead.
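One way to picture the glue between the two packages is a small handler script that NAGIOS calls when a service changes state. The Python sketch below rests on several assumptions: that the handler is given the service state, the state type, and the host address (NAGIOS exposes these as the `$SERVICESTATE$`, `$SERVICESTATETYPE$`, and `$HOSTADDRESS$` macros), and that Cfengine can be triggered remotely with a command such as `cfrun <host>`. The function names and the host name `web01` are hypothetical, and the remote command should be swapped for whatever your Cfengine version provides.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def should_repair(state: str, state_type: str) -> bool:
    """Repair only confirmed failures, not the first failed check."""
    return state == "CRITICAL" and state_type == "HARD"


def build_repair_command(host: str, runner: Sequence[str] = ("cfrun",)) -> List[str]:
    """Build the remote-run command; `cfrun` is an assumed default, adjust as needed."""
    return [*runner, host]


def handle_event(
    state: str,
    state_type: str,
    host: str,
    run: Callable[[List[str]], int] = subprocess.call,
) -> Optional[int]:
    """Invoke the repair command if warranted.

    Returns the command's exit code, or None when no repair was attempted.
    """
    if not should_repair(state, state_type):
        return None
    return run(build_repair_command(host))
```

Calling `handle_event("CRITICAL", "HARD", "web01")` would attempt the remote run, while a `SOFT` failure returns `None` without touching the server. Passing `run` as a parameter makes the decision logic testable without a live Cfengine installation.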