Troubleshoot - Duplicate APMInsight Monitors

Introduction:

Duplicate monitors can be created in APMInsight for several reasons. This article helps you identify and resolve them. Common causes include:
      -      Changes in the monitor configuration (apminsight.conf file).
      -      Multiple connection requests during application startup.
      -      Cache issues or other internal server issues.
 
To troubleshoot duplicate monitors in APMInsight, check each of the following cases:

1) Changes in Monitor Configuration (apminsight.conf file)
- Application Name Change
                 Any change to the application.name property in the apminsight.conf file leads to the creation of a new monitor.
                 If the change was unintentional, revert it. If the new name is intended, delete the old monitor.
- Host Name / Machine Name Change
                  If the host name or machine name of the monitored application changes, a new monitor might be created at the next startup.

During startup, the agent sends the host name in its connection request. APM validates this host name against the existing monitor data. If it differs from the host name of the existing monitor, the request is treated as a new monitor request and a new monitor is created.
This can happen in cloud, container-based, VM, or similar environments.

2) Multiple Connection Requests from Same Application

      If the application sends multiple connection requests during startup, duplicate monitors can be created. This occurs in certain environments where multiple processes are triggered during startup.
This was common in build versions earlier than v16610, in which this case was handled. It can still occur in rare cases.
More than one monitor is created for the same instance: one is in the UP state and the others are in the DOWN state.
Use the queries below to identify and fix this case.
  1.       SELECT instancename, COUNT(resourceid) AS monitor_count FROM apm_instances GROUP BY instancename, host, port, applicationid ORDER BY monitor_count DESC;
      If monitor_count is greater than 1 for any row returned by the above query, duplicates exist.
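Before renaming anything, it can help to see exactly which monitors share the same host, port, and application. The read-only query below is a sketch that uses only the tables and columns that appear elsewhere in this article (apm_instances and am_managedobject). It assumes that resourceid identifies the same monitor in both tables, as the update queries imply, and it is written to avoid backend-specific syntax so that it should run on both PGSQL and MSSQL.

```sql
-- Read-only preview: list every monitor that shares host, port and
-- applicationid with at least one other monitor. Nothing is modified.
SELECT i.resourceid,
       mo.displayname,
       i.host,
       i.port,
       i.applicationid
FROM apm_instances i
JOIN am_managedobject mo
  ON mo.resourceid = i.resourceid
JOIN (
    -- Groups that contain more than one instance are the duplicates.
    SELECT host, port, applicationid
    FROM apm_instances
    GROUP BY host, port, applicationid
    HAVING COUNT(*) > 1
) d
  ON d.host = i.host
 AND d.port = i.port
 AND d.applicationid = i.applicationid
ORDER BY i.host, i.port, i.applicationid, i.resourceid;
```

Rows that come back together for the same host, port, and applicationid are the candidates that the update queries act on.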
      Use the update query below that matches your backend database to identify and rename the duplicate monitors.
PGSQL
WITH DuplicateInstances AS (
    SELECT resourceid, host, port, applicationid
    FROM apm_instances
    WHERE (host, port, applicationid) IN (
        SELECT host, port, applicationid
        FROM apm_instances
        GROUP BY host, port, applicationid
        HAVING COUNT(*) > 1
    )
),
OldestCommunication AS (
    SELECT resourceid, agentcommunicationtime
    FROM apm_instances_ext
),
AlertStatus AS (
    SELECT source AS resourceid,
           CASE WHEN category = '20005' AND severity = 1 THEN 1 ELSE 0 END AS is_down,
           CASE WHEN category = '20005' AND severity = 5 THEN 1 ELSE 0 END AS is_up
    FROM alert
),
FilteredResources AS (
    SELECT dg.resourceid, dg.host, dg.port, dg.applicationid,
           oc.agentcommunicationtime, alert_status.is_down, alert_status.is_up
    FROM DuplicateInstances dg
    JOIN OldestCommunication oc ON dg.resourceid = oc.resourceid
    JOIN AlertStatus alert_status ON dg.resourceid = alert_status.resourceid
),
FinalSelection AS (
    SELECT resourceid, agentcommunicationtime
    FROM FilteredResources
    WHERE is_up = 1
    GROUP BY resourceid, agentcommunicationtime
    HAVING COUNT(*) = 1
    UNION ALL
    SELECT resourceid, agentcommunicationtime
    FROM FilteredResources
    WHERE is_up = 0
       OR (SELECT COUNT(*) FROM FilteredResources WHERE is_up = 1) != 1
    ORDER BY agentcommunicationtime DESC
    LIMIT 1
)
UPDATE am_managedobject
SET displayname = 'DUPLICATE_' || displayname
WHERE resourceid IN (
    SELECT resourceid
    FROM DuplicateInstances
    WHERE resourceid NOT IN (SELECT resourceid FROM FinalSelection)
);

MSSQL
WITH DuplicateInstances AS (
    -- SQL Server does not accept a multi-column (host, port, applicationid) IN (...),
    -- so the duplicate groups are matched with a join instead.
    SELECT i.resourceid, i.host, i.port, i.applicationid
    FROM apm_instances i
    JOIN (
        SELECT host, port, applicationid
        FROM apm_instances
        GROUP BY host, port, applicationid
        HAVING COUNT(*) > 1
    ) d ON d.host = i.host AND d.port = i.port AND d.applicationid = i.applicationid
),
OldestCommunication AS (
    SELECT resourceid, agentcommunicationtime
    FROM apm_instances_ext
),
AlertStatus AS (
    SELECT source AS resourceid,
           CASE WHEN category = '20005' AND severity = 1 THEN 1 ELSE 0 END AS is_down,
           CASE WHEN category = '20005' AND severity = 5 THEN 1 ELSE 0 END AS is_up
    FROM alert
),
FilteredResources AS (
    SELECT dg.resourceid, dg.host, dg.port, dg.applicationid,
           oc.agentcommunicationtime, alert_status.is_down, alert_status.is_up
    FROM DuplicateInstances dg
    JOIN OldestCommunication oc ON dg.resourceid = oc.resourceid
    JOIN AlertStatus alert_status ON dg.resourceid = alert_status.resourceid
),
FinalSelection AS (
    SELECT resourceid, agentcommunicationtime
    FROM FilteredResources
    WHERE is_up = 1
    GROUP BY resourceid, agentcommunicationtime
    HAVING COUNT(*) = 1
    UNION ALL
    SELECT resourceid, agentcommunicationtime
    FROM FilteredResources
    WHERE is_up = 0
       OR (SELECT COUNT(*) FROM FilteredResources WHERE is_up = 1) != 1
    ORDER BY agentcommunicationtime DESC
    OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY
)
UPDATE am_managedobject
SET displayname = 'DUPLICATE_' + displayname
WHERE resourceid IN (
    SELECT resourceid
    FROM DuplicateInstances
    WHERE resourceid NOT IN (SELECT resourceid FROM FinalSelection)
);
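
After the update query for your backend has run, the renamed monitors can be listed for review before any of them are deleted. This is a minimal sketch that assumes the DUPLICATE_ prefix was applied to displayname by the update query. The ESCAPE clause makes the underscore match literally instead of acting as a single-character wildcard, and the same syntax is accepted by both PGSQL and MSSQL.

```sql
-- Read-only check: list monitors whose display name now starts with DUPLICATE_.
-- '!' is declared as the escape character so that '_' is matched literally.
SELECT resourceid, displayname
FROM am_managedobject
WHERE displayname LIKE 'DUPLICATE!_%' ESCAPE '!'
ORDER BY displayname;
```

If the list contains a monitor that should be kept, its display name can be corrected before any cleanup is done.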

3) Cache Issues / Other Internal Server Issues

      In rare cases, duplicates are created because of internal cache failures or other server issues. In such cases, validation might be skipped, resulting in the creation of a new monitor. We are working on fixing this issue in upcoming releases and will address any related stray entries and older monitors during the upgrade process.


Note: For now, follow the steps described in Step 2 to identify this issue and rename the duplicate monitors.
Contact support for further assistance, and include a screenshot of the Bulk Config view page.

