
CloudWatch, Azure Monitor, and Stackdriver

Overview

  • Amazon CloudWatch is Amazon’s monitoring service for Amazon Web Services (AWS) resources and applications.
  • Azure Monitor is Microsoft’s built-in monitoring service for the performance and health of Azure resources. At its most basic level, the model is similar to CloudWatch: Azure Monitor consumes the telemetry data (performance and log data) that all Azure services generate and allows the user to visualize, query, route, archive, and take actions on the data.
  • Stackdriver, Google’s offering for delivering cloud monitoring capabilities, differs from both CloudWatch and Azure Monitor in a number of ways. First, Stackdriver covers not only Google Cloud Platform (GCP) but also AWS, providing unified monitoring of the two cloud platforms. Google touts Stackdriver’s multi-cloud strategy and, given Amazon’s prominent standing, it certainly broadens Stackdriver’s appeal.

Stackdriver

Stackdriver also includes a development (DevOps) component in addition to IT monitoring. However, while the IT operations functionality spans both AWS and GCP, the DevOps functionality is Google-centric: Stackdriver can troubleshoot deployments on the Google platform with tracing and debugging functionality. Its components include:

  • Stackdriver Monitoring measures the health of cloud resources and applications by providing visibility into metrics such as CPU usage, disk I/O, memory, network traffic and uptime. It is based on collectd, an open source daemon that collects system and application performance metrics. Users can receive customizable alerts when Stackdriver Monitoring discovers performance issues. It is used to monitor Google Compute Engine and Amazon EC2 VMs.
  • Stackdriver Error Reporting identifies and analyzes cloud application errors. A centralized error management interface provides IT teams with real-time visibility into production errors with cloud applications, as well as the ability to sort and filter content based on the number of error occurrences, when the error was first and last seen, and where the error is located.
  • Stackdriver Debugger inspects the state of an application, deployed in Google App Engine or Google Compute Engine, using production data and source code. During production, snapshots can be taken of an application’s state and linked back to a specific line location in the source code, without having to add logging statements. This inspection can occur without affecting the performance of the production application.
  • Stackdriver Trace collects network latency data from applications deployed in Google App Engine. Trace data is gathered, analyzed and used to create performance reports to identify network bottlenecks. The Trace API and Trace SDK can also be used to trace, analyze and optimize custom workloads.
  • Stackdriver Logging provides real-time log management and analysis for cloud applications. Log data can be kept for longer periods of time by archiving it with Google Cloud Storage. The service works with both Google and AWS, and can gather logs from Google Compute Engine, Google App Engine and Amazon EC2.

Stackdriver Monitoring

Stackdriver Monitoring collects metrics, events, and metadata from Google Cloud Platform, Amazon Web Services (AWS), hosted uptime probes, application instrumentation, and a variety of common application components including Cassandra, Nginx, Apache Web Server, Elasticsearch and many others. Stackdriver ingests that data and generates insights via dashboards, charts, and alerts.

Monitoring Agent

The Monitoring agent is a collectd-based daemon that gathers system and application metrics from virtual machine instances and sends them to Stackdriver Monitoring. By default, the Monitoring agent collects disk, CPU, network, and process metrics. You can also configure the Monitoring agent to monitor third-party applications; the full list of agent metrics is given in the Stackdriver documentation.

Using the Monitoring agent is optional but recommended. Stackdriver Monitoring can access some metrics without the Monitoring agent, including CPU utilization, some disk traffic metrics, network traffic, and uptime information. Stackdriver Monitoring uses the Monitoring agent to access additional system resources and application services in virtual machine (VM) instances. If you want these additional capabilities, you should install the Monitoring agent.
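
To make the agent's role concrete, here is a minimal sketch of the collect-and-send loop that such a daemon performs. It is not the Monitoring agent or collectd: the metric names, the collect_system_metrics and run_agent functions, and the send callback are all hypothetical, and only the Python standard library is used.

```python
import os
import shutil
import time
from typing import Callable, Dict


def collect_system_metrics(path: str = "/") -> Dict[str, float]:
    """Gather a few host metrics (hypothetical names, stdlib only)."""
    usage = shutil.disk_usage(path)
    metrics = {
        "timestamp": time.time(),
        # Fraction of the filesystem at `path` that is in use.
        "disk.used_fraction": usage.used / usage.total if usage.total else 0.0,
        "cpu.count": float(os.cpu_count() or 1),
    }
    # The load average is only available on Unix-like systems.
    if hasattr(os, "getloadavg"):
        try:
            metrics["cpu.load_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return metrics


def run_agent(send: Callable[[Dict[str, float]], None],
              interval: float = 60.0,
              iterations: int = 1) -> None:
    """Collect metrics every `interval` seconds and hand them to `send`.

    `send` stands in for whatever transport ships data to the
    monitoring backend; it is an assumption of this sketch.
    """
    for i in range(iterations):
        send(collect_system_metrics())
        if i < iterations - 1:
            time.sleep(interval)


if __name__ == "__main__":
    run_agent(print, interval=1.0, iterations=2)
```

Keeping collection separate from sending lets the same collection code feed any backend, which mirrors the split between a collector daemon and the monitoring service it reports to.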

Uptime Checks

Stackdriver can verify the availability of your service by accessing it from locations around the world. You can use the results from these uptime checks in your alerting policies, or you can directly monitor the results in the Stackdriver Monitoring uptime-check dashboards.
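
The idea behind an uptime check can be sketched in a few lines. This is an illustration only, not Stackdriver's implementation or API: the ProbeResult type, the run_uptime_check and is_available functions, and the location names are hypothetical, and a real service would run each probe from a different geographic region rather than from one process.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Probe = Callable[[str], Tuple[bool, str]]


@dataclass
class ProbeResult:
    location: str
    ok: bool
    detail: str


def http_probe(url: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Issue one HTTP GET; any 2xx response counts as 'up'."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300, f"HTTP {resp.status}"
    except (urllib.error.URLError, OSError) as exc:
        return False, str(exc)


def run_uptime_check(url: str, probes: Dict[str, Probe]) -> List[ProbeResult]:
    """Run every probe against `url` and record one result per location."""
    results = []
    for location, probe in probes.items():
        ok, detail = probe(url)
        results.append(ProbeResult(location, ok, detail))
    return results


def is_available(results: List[ProbeResult], min_passing: int = 1) -> bool:
    """Treat the service as up if at least `min_passing` probes succeeded."""
    return sum(r.ok for r in results) >= min_passing


if __name__ == "__main__":
    # Both "locations" use the same local probe here; this is a stand-in
    # for probes that would run in different regions.
    checks = run_uptime_check(
        "https://example.com/",
        {"location-a": http_probe, "location-b": http_probe},
    )
    print(checks, is_available(checks))
```

The min_passing threshold is a design choice of the sketch: requiring more than one passing location reduces false alarms caused by a single probe's network trouble, at the cost of slower detection of regional outages.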

Alerting

Alerting gives you timely awareness of problems in your cloud applications so you can resolve them quickly.

You use the Stackdriver Monitoring Console to set up alerting policies. Each policy specifies the following:

  • Conditions that identify an unhealthy state for a resource or a group of resources.
  • Optional notifications sent through email, SMS, or other channels to let your support team know a resource is unhealthy.
  • Optional documentation that can be included in some types of notifications to help your support team resolve the issue.

When events trigger conditions in one of your alerting policies, Stackdriver Monitoring creates and displays an incident in the Stackdriver Monitoring Console. If you set up notifications, Stackdriver Monitoring also sends notifications to people or third-party notification services. Responders can acknowledge receipt of the notification, but the incident remains open until resources are no longer in an unhealthy state.
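
The incident lifecycle described above can be modeled as a small state machine. The sketch below is an assumption-laden illustration, not the Stackdriver Monitoring API: AlertingPolicy, Incident, and PolicyEvaluator are hypothetical names, and the condition is a simple predicate on a single metric value.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AlertingPolicy:
    name: str
    condition: Callable[[float], bool]               # True when the value is unhealthy
    notify: Optional[Callable[[str], None]] = None   # optional notification channel
    documentation: str = ""                          # optional text added to notifications


@dataclass
class Incident:
    policy_name: str
    acknowledged: bool = False
    is_open: bool = True


class PolicyEvaluator:
    """Tracks at most one open incident for a single policy."""

    def __init__(self, policy: AlertingPolicy) -> None:
        self.policy = policy
        self.incident: Optional[Incident] = None

    def observe(self, value: float) -> Optional[Incident]:
        unhealthy = self.policy.condition(value)
        if unhealthy and self.incident is None:
            # Condition triggered: open an incident and notify once.
            self.incident = Incident(self.policy.name)
            if self.policy.notify is not None:
                message = f"{self.policy.name}: unhealthy (value={value})"
                if self.policy.documentation:
                    message += "\n" + self.policy.documentation
                self.policy.notify(message)
        elif not unhealthy and self.incident is not None:
            # Resource is healthy again: only now does the incident close.
            closed, self.incident = self.incident, None
            closed.is_open = False
            return closed
        return self.incident

    def acknowledge(self) -> None:
        # Acknowledging records receipt but does not close the incident.
        if self.incident is not None:
            self.incident.acknowledged = True
```

A short usage example: with a policy such as AlertingPolicy("high-cpu", lambda v: v > 0.9, notify=print), calling observe(0.95) opens an incident and prints one notification, acknowledge() marks it as seen, and the incident closes only when a later observe call reports a healthy value such as 0.4.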