Script monitors stop executing
I am a production Applications Manager customer running build 8200 on Linux, and I also run a test environment on Windows. The issue below was first discovered in our Linux installation, but I reproduced it in a much smaller Windows test environment.
We run several remote script monitors at 5-minute intervals over SSH (port 22), using username/password authentication. After running for a relatively short time (it seems to be between 2 hours and several days), the exceptions in the attached exceptions.txt file occur.
After the exceptions, the associated script monitor effectively freezes and is never run again by Applications Manager. In other words, the "Last Polled At" and "Next Poll At" times stop changing.
After restarting Applications Manager, the affected script monitor runs OK for a while (from a few minutes to several days). But eventually it always enters a state, with no Java exceptions that I can find in the logs, in which the "Last Polled At" time is continuously reported as "-". The "Next Poll At" time is always reported as 1 to 3 minutes in the future, and it keeps moving forward as the clock advances. The net effect is that the script monitor is never executed again.
Attached you'll see a MonitorImages.jpg file that shows two snapshots of the Monitor Information area of a script monitor's main screen. Notice the "-" in "Last Polled At" and notice how the time in "Next Poll At" just keeps incrementing (never executing the remote script).
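For anyone who wants to check their own monitors for the same state, here is a minimal Python sketch of the condition I am describing. It only compares two sets of values read off the monitor page at different times. The names Snapshot and looks_stalled are my own hypothetical names, not part of Applications Manager, and how you collect the values is up to you.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Snapshot:
    """Values read from a script monitor's Monitor Information area (hypothetical type)."""
    taken_at: datetime                   # when the snapshot was taken
    last_polled_at: Optional[datetime]   # None stands for the "-" shown in the UI
    next_poll_at: datetime               # the "Next Poll At" value


def looks_stalled(earlier: Snapshot, later: Snapshot) -> bool:
    """Return True if two snapshots match the stalled state described above."""
    # "Last Polled At" shows "-" in both snapshots.
    never_polled = earlier.last_polled_at is None and later.last_polled_at is None
    # The poll promised in the earlier snapshot should already have happened.
    earlier_deadline_passed = later.taken_at >= earlier.next_poll_at
    # Instead, "Next Poll At" has simply moved further into the future.
    deadline_moved = later.next_poll_at > earlier.next_poll_at
    return never_polled and earlier_deadline_passed and deadline_moved
```

The two snapshots should be taken far enough apart that the earlier "Next Poll At" time has passed; a few minutes is enough given the 1 to 3 minute offset I see.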
I will send an email to support referencing this post and including a support information file (i.e., zipped logs) from my small Windows test network in which I reproduced the problem.
As a result, all of our production script monitors end up in this state no matter how often we restart Applications Manager. Some take a few days to reach the bad state, and some do so in 10 to 30 minutes. Once there, they never run again. This is a serious problem for us.
I would appreciate a fix for this bug in the next service pack. Also, if there is something I can change in the database to work around this in the interim, I'd really appreciate hearing about it.
Thanks!
Brett Peterson
VisionShare, Inc.