Ceph storage monitoring troubleshooting steps

Ceph Storage Monitoring:

By default, the Ceph monitor is added over SSH, so SSH access to the server must be working if the customer wants to monitor Ceph storage on that server.

Supported Ceph Storage version: v0.66 or later
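To confirm that a cluster meets this minimum, the version string can be compared numerically. The following is a minimal Python sketch, assuming the string was obtained by running `ceph --version` on the remote machine and has the usual `ceph version X.Y.Z (...)` form. The function names are hypothetical and are not part of the product.

```python
import re

# Minimum Ceph version stated in this article (v0.66).
MIN_SUPPORTED = (0, 66)


def parse_ceph_version(version_output):
    """Extract the numeric version as a tuple of ints.

    Returns None when no version number can be found in the text.
    """
    match = re.search(r"ceph version v?(\d+(?:\.\d+)*)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_supported(version_output):
    """Return True when the reported version is v0.66 or later."""
    version = parse_ceph_version(version_output)
    # Tuples compare element by element, so (0, 80, 7) >= (0, 66) holds.
    return version is not None and version >= MIN_SUPPORTED
```

If the version cannot be parsed at all, the sketch reports the cluster as unsupported instead of guessing.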

Command used to collect all performance data: ceph -s -f json-pretty

For any data collection or authentication issue:

First, ask the customer to run ServerSSHTroubleshoot.sh/bat and check whether the SSH connection works properly. If SSH works, run the command below on the remote machine and check its output. The output should be in the format shown below; otherwise, the UI shows the error message "No performance data collected, as Ceph status command returns no output".

ceph -s -f json-pretty

Sample output:

{ "health": { "health": { "health_services": [
                { "mons": [
                        { "name": "deis-cc4a9dd1-ed85-4958-9fcd-51444b43cfb3.novalocal",
                          "kb_total": 39514092,
                          "kb_used": 11348408,
                          "kb_avail": 27876392,
                          "avail_percent": 70,
                          "last_updated": "2014-11-15 21:06:50.920246",
                          "store_stats": { "bytes_total": 28439194,
                              "bytes_sst": 0,
                              "bytes_log": 1592115,
                              "bytes_misc": 26847079,
                              "last_updated": "0.000000"},
                          "health": "HEALTH_OK"},
                        { "name": "deis-a4a81c52-ac05-44d1-880c-5949c4777ba3.novalocal",
                          "kb_total": 39514092,
                          "kb_used": 12296928,
                          "kb_avail": 26675680,
                          "avail_percent": 67,
                          "last_updated": "2014-11-15 21:06:51.557871",
                          "store_stats": { "bytes_total": 28970803,
                              "bytes_sst": 0,
                              "bytes_log": 2381808,
                              "bytes_misc": 26588995,
                              "last_updated": "0.000000"},
                          "health": "HEALTH_OK"},
                        { "name": "deis-0b8f4cec-a9f3-4ce8-9b1b-b2a9a17ff2dd.novalocal",
                          "kb_total": 39514092,
                          "kb_used": 12030260,
                          "kb_avail": 27192460,
                          "avail_percent": 68,
                          "last_updated": "2014-11-15 21:06:44.210473",
                          "store_stats": { "bytes_total": 29148495,
                              "bytes_sst": 0,
                              "bytes_log": 2295888,
                              "bytes_misc": 26852607,
                              "last_updated": "0.000000"},
                          "health": "HEALTH_OK"}]}]},
      "summary": [
            { "severity": "HEALTH_WARN",
              "summary": "1536 pgs degraded"},
            { "severity": "HEALTH_WARN",
              "summary": "1536 pgs stuck degraded"},
            { "severity": "HEALTH_WARN",
              "summary": "1536 pgs stuck unclean"},
            { "severity": "HEALTH_WARN",
              "summary": "1536 pgs stuck undersized"},
            { "severity": "HEALTH_WARN",
              "summary": "1536 pgs undersized"},
            { "severity": "HEALTH_WARN",
              "summary": "recovery 634\/1896 objects degraded (33.439%)"}],
      "timechecks": { "epoch": 252,
          "round": 4,
          "round_status": "finished",
          "mons": [
                { "name": "deis-cc4a9dd1-ed85-4958-9fcd-51444b43cfb3.novalocal",
                  "skew": "0.000000",
                  "latency": "0.000000",
                  "health": "HEALTH_OK"},
                { "name": "deis-a4a81c52-ac05-44d1-880c-5949c4777ba3.novalocal",
                  "skew": "0.000812",
                  "latency": "0.273973",
                  "health": "HEALTH_OK"},
                { "name": "deis-0b8f4cec-a9f3-4ce8-9b1b-b2a9a17ff2dd.novalocal",
                  "skew": "0.000000",
                  "latency": "1.771649",
                  "health": "HEALTH_OK"}]},
      "overall_status": "HEALTH_WARN",
      "detail": []},
  "fsid": "7500c071-e3ee-4c5b-abb4-ddbd02759a46",
  "election_epoch": 252,
  "quorum": [
        0,
        1,
        2],
  "quorum_names": [
        "deis-cc4a9dd1-ed85-4958-9fcd-51444b43cfb3.novalocal",
        "deis-a4a81c52-ac05-44d1-880c-5949c4777ba3.novalocal",
        "deis-0b8f4cec-a9f3-4ce8-9b1b-b2a9a17ff2dd.novalocal"],
  "monmap": { "epoch": 3,
      "fsid": "7500c071-e3ee-4c5b-abb4-ddbd02759a46",
      "modified": "2014-11-14 13:42:35.275754",
      "created": "2014-11-14 13:41:52.048963",
      "mons": [
            { "rank": 0,
              "name": "deis-cc4a9dd1-ed85-4958-9fcd-51444b43cfb3.novalocal",
              "addr": "10.21.12.27:6789\/0"},
            { "rank": 1,
              "name": "deis-a4a81c52-ac05-44d1-880c-5949c4777ba3.novalocal",
              "addr": "10.21.12.28:6789\/0"},
            { "rank": 2,
              "name": "deis-0b8f4cec-a9f3-4ce8-9b1b-b2a9a17ff2dd.novalocal",
              "addr": "10.21.12.29:6789\/0"}]},
  "osdmap": { "osdmap": { "epoch": 88,
          "num_osds": 5,
          "num_up_osds": 2,
          "num_in_osds": 2,
          "full": false,
          "nearfull": false}},
  "pgmap": { "pgs_by_state": [
            { "state_name": "active+undersized+degraded",
              "count": 1536}],
      "version": 11577,
      "num_pgs": 1536,
      "data_bytes": 773294495,
      "bytes_used": 23943340032,
      "bytes_avail": 56388460544,
      "bytes_total": 80924860416,
      "degraded_objects": 634,
      "degraded_total": 1896,
      "degraded_ratio": "33.439",
      "read_bytes_sec": 24849,
      "write_bytes_sec": 2935,
      "op_per_sec": 8},
  "mdsmap": { "epoch": 50,
      "up": 1,
      "in": 1,
      "max": 1,
      "by_rank": [
            { "rank": 0,
              "name": "deis-cc4a9dd1-ed85-4958-9fcd-51444b43cfb3.novalocal",
              "status": "up:active"}],
      "up:standby": 2}}
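
When checking the command output by hand is impractical, the same check can be sketched in Python. This is a minimal sketch, not the product's own collection logic: it assumes the output was captured as text, and it reads the field names (`overall_status`, `osdmap`, `pgmap`) exactly as they appear in the sample output above. Other Ceph releases may lay out the JSON differently, so missing fields are returned as None. The function name is hypothetical.

```python
import json

# Error text shown on the UI, as quoted in this article.
EMPTY_OUTPUT_ERROR = (
    "No performance data collected, as Ceph status command returns no output"
)


def summarize_ceph_status(raw_output):
    """Check text captured from `ceph -s -f json-pretty`.

    Returns (True, details_dict) when the text is a JSON object,
    or (False, reason_string) when it is empty or not usable.
    """
    if not raw_output or not raw_output.strip():
        return False, EMPTY_OUTPUT_ERROR
    try:
        status = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        # Typical cause: a shell error message was printed instead of JSON.
        return False, "Output is not valid JSON: %s" % exc
    if not isinstance(status, dict):
        return False, "Output is JSON but not an object"

    # Field layout assumed from the sample output in this article.
    health = status.get("health") or {}
    osdmap = (status.get("osdmap") or {}).get("osdmap") or {}
    pgmap = status.get("pgmap") or {}
    details = {
        "overall_status": health.get("overall_status"),
        "num_osds": osdmap.get("num_osds"),
        "num_up_osds": osdmap.get("num_up_osds"),
        "num_in_osds": osdmap.get("num_in_osds"),
        "num_pgs": pgmap.get("num_pgs"),
        "bytes_used": pgmap.get("bytes_used"),
        "bytes_avail": pgmap.get("bytes_avail"),
    }
    return True, details
```

For example, after saving the command output to a file (the name status.json is a placeholder), `summarize_ceph_status(open("status.json").read())` returns either the extracted values or the reason the output could not be used.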