Introduction:
Duplicate monitors can be created in APMInsight for several reasons. This article helps you identify and resolve them. Common causes include:
- Changes in monitor configuration (apminsight.conf file).
- Multiple connection requests during application startup.
- Cache issues or other internal server issues.
To troubleshoot duplicate monitors in APMInsight, work through the following cases:
1) Changes in Monitor Configuration (apminsight.conf file)
- Application Name Change
Any change to the application.name property in the apminsight.conf file leads to the creation of a new monitor.
If the change was unintentional, revert it. If the new name is intended, delete the old monitor.
- Host Name / Machine Name Change
If the host name or machine name of the monitored application changes, a new monitor might be created during the next product startup.
During startup, the agent sends the host name in its connection request. APM validates this host name against the existing monitor data. If it differs from the host name of the existing monitor, the request is treated as a new monitor request and a new monitor is created.
This can happen in cloud, container-based, VM-hosted, or similar environments, where the host name can change between restarts.
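The validation described in this case can be pictured with a small Python sketch. It is illustrative only and is not the APMInsight server code: the `Monitor` class, the `find_existing_monitor` function, and the sample names are hypothetical, and treating the port as part of the identity is an assumption based on the columns used by the queries in this article.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Monitor:
    """Hypothetical stand-in for a stored monitor record."""
    resource_id: int
    application_name: str
    host: str
    port: int


def find_existing_monitor(
    monitors: Iterable[Monitor], application_name: str, host: str, port: int
) -> Optional[Monitor]:
    """Return the monitor matching the connection request, or None.

    Assumption: identity is the exact (application name, host, port) triple.
    A None result corresponds to "create a new monitor".
    """
    for monitor in monitors:
        if (
            monitor.application_name == application_name
            and monitor.host == host
            and monitor.port == port
        ):
            return monitor
    return None


# Made-up sample data for illustration.
existing = [Monitor(101, "orders-service", "app-host-01", 8080)]

same = find_existing_monitor(existing, "orders-service", "app-host-01", 8080)
renamed_host = find_existing_monitor(existing, "orders-service", "app-host-02", 8080)
renamed_app = find_existing_monitor(existing, "orders-api", "app-host-01", 8080)

print(same is not None)      # unchanged identity: existing monitor is reused
print(renamed_host is None)  # host name changed: no match, new monitor
print(renamed_app is None)   # application.name changed: no match, new monitor
```

Under these assumptions, changing either application.name or the host name makes the lookup miss, which is why both changes show up as a new monitor.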
2) Multiple Connection Requests from Same Application
If the application sends multiple connection requests during startup, it can lead to the creation of duplicate monitors. This occurs in certain environments where multiple processes are triggered during startup.
This is more common in builds earlier than v16610, the build in which this case was handled. It can still occur in rare cases.
More than one monitor is created for the same instance; one is in the UP state and the others are in the DOWN state.
Use the queries below to identify and fix this case.
- SELECT instancename, COUNT(resourceid) AS monitor_count FROM apm_instances GROUP BY instancename, host, port, applicationid ORDER BY monitor_count DESC;
If monitor_count is greater than 1 for any row returned by the above query, duplicates exist.
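To show how to read the result, the following Python sketch runs the same detection query against an in-memory SQLite table. Everything except the query is an assumption made for illustration: the column types, the sample rows, and the `find_duplicates` helper are hypothetical, and SQLite stands in for the product database only so the example is self-contained.

```python
import sqlite3

# The detection query from this article, unchanged apart from line breaks.
DETECT_DUPLICATES = """
SELECT instancename, COUNT(resourceid) AS monitor_count
FROM apm_instances
GROUP BY instancename, host, port, applicationid
ORDER BY monitor_count DESC
"""


def find_duplicates(conn):
    """Return (instancename, monitor_count) for groups with more than one monitor."""
    rows = conn.execute(DETECT_DUPLICATES).fetchall()
    return [(name, count) for name, count in rows if count > 1]


conn = sqlite3.connect(":memory:")
# Assumed column types; only the column names come from the query above.
conn.execute(
    "CREATE TABLE apm_instances ("
    "resourceid INTEGER, instancename TEXT, host TEXT, "
    "port INTEGER, applicationid INTEGER)"
)
# Made-up rows: two monitors share the same instance, host, port and application.
conn.executemany(
    "INSERT INTO apm_instances VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "orders-service", "app-host-01", 8080, 1),
        (1002, "orders-service", "app-host-01", 8080, 1),
        (1003, "billing-service", "app-host-02", 9090, 2),
    ],
)

duplicates = find_duplicates(conn)
print(duplicates)  # [('orders-service', 2)]
```

Here orders-service is reported with a monitor_count of 2, so it has one duplicate; billing-service has a count of 1 and is not reported.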
Use the update queries below to rename the duplicate monitors. Each query prefixes the display name of a duplicate monitor with DUPLICATE_.
PGSQL
WITH DuplicateInstances AS (
    SELECT resourceid, host, port, applicationid
    FROM apm_instances
    WHERE (host, port, applicationid) IN (
        SELECT host, port, applicationid
        FROM apm_instances
        GROUP BY host, port, applicationid
        HAVING COUNT(*) > 1
    )
),
OldestCommunication AS (
    SELECT resourceid, agentcommunicationtime
    FROM apm_instances_ext
),
AlertStatus AS (
    SELECT source AS resourceid,
        CASE WHEN category = '20005' AND severity = 1 THEN 1 ELSE 0 END AS is_down,
        CASE WHEN category = '20005' AND severity = 5 THEN 1 ELSE 0 END AS is_up
    FROM alert
),
FilteredResources AS (
    SELECT dg.resourceid, dg.host, dg.port, dg.applicationid,
        oc.agentcommunicationtime, alert_status.is_down, alert_status.is_up
    FROM DuplicateInstances dg
    JOIN OldestCommunication oc ON dg.resourceid = oc.resourceid
    JOIN AlertStatus alert_status ON dg.resourceid = alert_status.resourceid
),
FinalSelection AS (
    SELECT resourceid, agentcommunicationtime
    FROM FilteredResources
    WHERE is_up = 1
    GROUP BY resourceid, agentcommunicationtime
    HAVING COUNT(*) = 1
    UNION ALL
    SELECT resourceid, agentcommunicationtime
    FROM FilteredResources
    WHERE is_up = 0
        OR (SELECT COUNT(*) FROM FilteredResources WHERE is_up = 1) != 1
    ORDER BY agentcommunicationtime DESC
    LIMIT 1
)
UPDATE am_managedobject
SET displayname = 'DUPLICATE_' || displayname
WHERE resourceid IN (
    SELECT resourceid
    FROM DuplicateInstances
    WHERE resourceid NOT IN (SELECT resourceid FROM FinalSelection)
);
MSSQL
WITH DuplicateInstances AS (
    -- Match each instance to its duplicate group (SQL Server has no row-value IN)
    SELECT i.resourceid, i.host, i.port, i.applicationid
    FROM apm_instances i
    JOIN (
        SELECT host, port, applicationid
        FROM apm_instances
        GROUP BY host, port, applicationid
        HAVING COUNT(*) > 1
    ) d ON i.host = d.host AND i.port = d.port AND i.applicationid = d.applicationid
),
OldestCommunication AS (
    SELECT resourceid, agentcommunicationtime
    FROM apm_instances_ext
),
AlertStatus AS (
    SELECT source AS resourceid,
        CASE WHEN category = '20005' AND severity = 1 THEN 1 ELSE 0 END AS is_down,
        CASE WHEN category = '20005' AND severity = 5 THEN 1 ELSE 0 END AS is_up
    FROM alert
),
FilteredResources AS (
    SELECT dg.resourceid, dg.host, dg.port, dg.applicationid,
        oc.agentcommunicationtime, alert_status.is_down, alert_status.is_up
    FROM DuplicateInstances dg
    JOIN OldestCommunication oc ON dg.resourceid = oc.resourceid
    JOIN AlertStatus alert_status ON dg.resourceid = alert_status.resourceid
),
FinalSelection AS (
    SELECT resourceid, agentcommunicationtime
    FROM FilteredResources
    WHERE is_up = 1
    GROUP BY resourceid, agentcommunicationtime
    HAVING COUNT(*) = 1
    UNION ALL
    SELECT resourceid, agentcommunicationtime
    FROM FilteredResources
    WHERE is_up = 0
        OR (SELECT COUNT(*) FROM FilteredResources WHERE is_up = 1) != 1
    ORDER BY agentcommunicationtime DESC
    OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY
)
UPDATE am_managedobject
SET displayname = 'DUPLICATE_' + displayname
WHERE resourceid IN (
    SELECT resourceid
    FROM DuplicateInstances
    WHERE resourceid NOT IN (SELECT resourceid FROM FinalSelection)
);
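As a reading aid, the selection rule the update queries appear to aim for can be sketched in Python for a single duplicate group: keep the one monitor that is UP, otherwise keep the monitor with the most recent agent communication, and rename the rest. This is an interpretation rather than a restatement of the exact query semantics, and the `Instance` class, the `split_duplicates` function, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Instance:
    """Hypothetical view of one monitor in a duplicate group."""
    resource_id: int
    is_up: bool
    agent_communication_time: int  # assumed to be a sortable timestamp


def split_duplicates(group: Sequence[Instance]) -> Tuple[int, List[int]]:
    """Return (resource id to keep, sorted resource ids to rename)."""
    up = [m for m in group if m.is_up]
    if len(up) == 1:
        # Exactly one monitor is UP: that one is kept.
        keep = up[0]
    else:
        # None or several are UP: fall back to the most recent communication.
        keep = max(group, key=lambda m: m.agent_communication_time)
    rename = sorted(m.resource_id for m in group if m.resource_id != keep.resource_id)
    return keep.resource_id, rename


# Made-up sample groups for illustration.
one_up = [
    Instance(1001, False, 100),
    Instance(1002, True, 200),
    Instance(1003, False, 300),
]
none_up = [
    Instance(2001, False, 100),
    Instance(2002, False, 250),
]

print(split_duplicates(one_up))   # (1002, [1001, 1003])
print(split_duplicates(none_up))  # (2002, [2001])
```

The monitors listed for renaming are the ones that would receive the DUPLICATE_ prefix. Checking the renamed monitors in the product before deleting anything is a sensible precaution, since the queries remain the reference for what is actually changed.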
3) Cache Issues / Other Internal Server Issues
In rare cases, duplicates are created because of internal cache failures or other server issues. In such cases, validation might be skipped, resulting in the creation of a new monitor. We are working on a fix for upcoming releases and will address any related stray entries and older monitors during the upgrade process.
Note: For now, follow the steps in case 2 to identify this issue and rename the duplicate monitors.
For further assistance, contact support and include a screenshot of the Bulk Config view page.