Dependency handling problem

The dependency section of OpManager does not appear to handle faults in a WAN well where dependencies are declared.

To expand: we have a network with several offices (with dozens of remote servers) and multiple remote sites, all linked via a wireless / VPN WAN solution.

OpManager monitors the network well, no problem, but the alerting is a disaster.

Basically a simplified example:

Our main office is Local Office - (this hosts OpManager)

Local office Wireless point
|
|
Wireless Repeater
|
|
Remote Office Wireless point
|
|
Remote office - (servers, printers, switches etc)

So I set each point on the link to be dependent on the point immediately prior (closer to OpManager). That is, ALL of the remote office equipment is dependent on the "remote office wireless point", which in turn is dependent on the wireless repeater, and so on all the way back to the local office.

OK, this is fine. The problem is when a fault actually occurs, say at the wireless repeater. In this case ALL equipment further away than that point becomes unavailable; that is, we lose contact with all equipment at the remote office.

If OpManager polls the equipment at the remote office at this point in time but has NOT yet polled the failed wireless repeater, it declares that the equipment is faulty and we receive an email alert of the failure. OpManager does not check the status of the dependency before changing the status of the equipment. Instead it assumes that the status of every dependency is the same as it was the last time it was tested, which of course is not true.

Of course we then proceed to receive dozens of email and SMS alerts until the failed point is finally checked. We then receive an alert that it (the wireless repeater, in this example) has failed, and from then on no further alerts occur for the remote equipment.

This appears to be a big fault in the logic. For dependencies to truly work, you cannot presume that the status of a dependency has not changed since the last check.

I would strongly suggest that when a failure occurs on a device with a declared dependency, this should automatically initiate a recheck of the dependency to verify the "real" location of the failure before the status is changed. We would then receive only 1 alert, for the real failed point. This would be much better and more accurate.
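
To make the suggestion concrete, here is a rough sketch in Python of the recheck logic I have in mind. This is NOT OpManager code; the function names, the device names and the fake probe are all made up for illustration, and a real probe would be a ping or SNMP poll.

```python
from typing import Callable, Dict, Optional, Set

def find_root_failure(device: str,
                      parent_of: Dict[str, Optional[str]],
                      probe: Callable[[str], bool]) -> str:
    """Re-probe the dependency chain of a failed device.

    Walks from the failed device towards the monitoring station and
    returns the failed point closest to the station. The walk stops at
    the first dependency that answers, since everything between it and
    the station must then be reachable.
    """
    root = device
    current = parent_of.get(device)
    while current is not None and not probe(current):
        root = current
        current = parent_of.get(current)
    return root

# Hypothetical dependency map for the example link: child -> dependency.
PARENT_OF: Dict[str, Optional[str]] = {
    "remote-server": "remote-wireless-point",
    "remote-wireless-point": "wireless-repeater",
    "wireless-repeater": "local-wireless-point",
    "local-wireless-point": None,
}

def make_probe(failed: Set[str]) -> Callable[[str], bool]:
    """Fake probe: a device answers only if nothing on its path has failed."""
    def probe(device: str) -> bool:
        node: Optional[str] = device
        while node is not None:
            if node in failed:
                return False
            node = PARENT_OF.get(node)
        return True
    return probe

# A remote server stops answering while the repeater is actually the fault.
print(find_root_failure("remote-server", PARENT_OF,
                        make_probe({"wireless-repeater"})))
```

With something like this, the poll of the remote server would trigger a fresh check of its dependencies, and the single alert raised would name the wireless repeater rather than the server.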

In addition, I would suggest another useful change for future consideration. When a test point that has a dependency declared fails, it is flagged as down. In fact it is not necessarily down; it is unreachable from the testing point. It really should have a third state of "unknown", as its dependency point is down and thus its true status cannot be determined.
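
Again only as an illustration of the idea, a three-state status could be decided roughly like this. The `Status` enum, the `classify` function and the device names are hypothetical and do not correspond to anything in OpManager.

```python
from enum import Enum
from typing import Dict, Optional

class Status(Enum):
    UP = "up"
    DOWN = "down"
    UNKNOWN = "unknown"

def classify(device: str,
             parent_of: Dict[str, Optional[str]],
             responded: Dict[str, bool]) -> Status:
    """Decide a device's status from fresh poll results.

    `responded` maps each device to whether it answered the latest poll.
    """
    if responded[device]:
        return Status.UP
    parent = parent_of.get(device)
    if parent is not None and not responded[parent]:
        # The dependency is unreachable too, so the device's real
        # state cannot be determined from the monitoring station.
        return Status.UNKNOWN
    return Status.DOWN

# Hypothetical example: the repeater has failed, so nothing behind it answers.
parent_of = {
    "remote-server": "remote-wireless-point",
    "remote-wireless-point": "wireless-repeater",
    "wireless-repeater": None,
}
responded = {
    "remote-server": False,
    "remote-wireless-point": False,
    "wireless-repeater": False,
}
for name in parent_of:
    print(name, classify(name, parent_of, responded).value)
```

Only the repeater would be reported as down here; the equipment behind it would show as unknown until the link is restored.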































