A Simple Complex Scenario !!

Thought I'd share a simple yet complex scenario my colleague and I faced recently while troubleshooting an 'SNMP Failed' issue in OpManager.

 

Scenario: OpManager could not monitor the performance metrics of a Juniper Netscreen SSG firewall because SNMP authentication was failing.

 

Severity: A prospect from the UK was evaluating SDP and OpManager and was happy with both applications. He needed the above scenario to be addressed in OpManager before he would purchase the application.

 

To troubleshoot this issue, we followed the 'Pick every different cookie out of the jar' approach. We connected to the prospect's server over a ZOHO remote meeting session. On debugging the SNMP v2c queries in OpManager, we found that the SNMP GET requests were showing 'Timed out' and 'No response for polling' in the logs. The firewall was the next hop from OpManager, so there seemed to be no question of packets being filtered by another device. We created a firewall policy on the Netscreen firewall to allow all packets from OpManager's IP address, and moved this policy to the top of the list, since policies are matched top to bottom by default. The issue persisted.
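
For readers wondering why the position of the policy mattered, here is a minimal sketch of top-to-bottom, first-match policy evaluation. It is not Netscreen's actual implementation; the `Policy` class, the `first_match` helper and the IP addresses are all made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class Policy:
    name: str
    source: str  # source network in CIDR notation (hypothetical values)
    action: str  # "permit" or "deny"


def first_match(policies: List[Policy], src_ip: str) -> Optional[Policy]:
    """Return the first policy, top to bottom, whose source network contains src_ip."""
    addr = ip_address(src_ip)
    for policy in policies:
        if addr in ip_network(policy.source):
            return policy
    return None  # no policy matched


# Hypothetical rule set: a broad deny sits above the specific permit.
deny_lan = Policy("deny-lan", "10.0.0.0/8", "deny")
allow_nms = Policy("allow-nms", "10.0.0.5/32", "permit")

before_move = first_match([deny_lan, allow_nms], "10.0.0.5")
after_move = first_match([allow_nms, deny_lan], "10.0.0.5")
```

With the permit rule below the broad deny, the deny wins; once the permit rule is moved to the top, it matches first. That is the only reason we bothered reordering the policies.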

 

We went dumpster diving in the Wireshark packet trace from OpManager, which showed traps from the Juniper firewall but no responses to the SNMP queries. On checking the Netscreen firewall logs, we found no mention of OpManager's IP address anywhere. Instead, the logs showed SNMP-related entries sourced from another IP address, and reported that the community string was an 'Unknown' community string.

 

The prospect was upset that they had encountered the same issue while evaluating a competitor's application and couldn't figure it out.

 

We looked inwards and started checking our application database and logs to verify that SNMP was enabled, that the correct SNMP version was recorded in our database tables, and so on. After deleting and re-adding the device several times, we were convinced that the issue lay with the packets exchanged between OpManager and the firewall.

 

We checked the firewall interfaces facing the internal network, and verified that OpManager was reachable and that the IP addresses were correct. We then went back to the firewall logs and started investigating the unknown IP address that appeared there whenever OpManager queried the device. On analysis, we found that it belonged to Cyberroam, an internet content filtering tool. The prospect insisted that it was meant to filter only the packets going out to the internet, not all packets. We weren't convinced.

 

To isolate this particular scenario, we asked the prospect to exclude OpManager's IP address in the Cyberroam device and save the settings.

 

BINGO!! OpManager now showed that the SNMP credential was working, and the performance monitor dials started appearing. The prospect was excited and exclaimed that it was brilliant troubleshooting, as he had been breaking his head over this for the previous 2 weeks. The Cyberroam device (the root cause) was improperly configured to intercept and filter all packets leaving the OpManager server's interface, not just the ones headed for the internet.

 

Glad this simple but complex issue was resolved in time, paving the way for his purchase. Yay! Another fist-pump moment for us!!