Alerts for Runtime Failure
This section outlines how to configure a Monitor Resource that watches for a change in the status of a DataOS Resource (e.g., from active to deleted) and a Pager Resource that sends alerts when such a change is detected. This allows teams to respond quickly to unexpected state transitions, such as accidental deletions or misconfigurations.
-
Create a Monitor Resource manifest file. This manifest uses a Report Monitor to detect when a specific Resource's status changes from active to deleted.

    # Resource meta section
    name: runtime-monitor
    version: v1alpha
    type: monitor
    tags:
      - dataos:type:resource
      - dataos:layer:user
    description: Attention! instance secret is deleted
    layer: user
    monitor:
      # Monitor-specific section
      schedule: '*/2 * * * *'
      incident:
        name: depotincident
        severity: high
        incidentType: depotincident
      type: report_monitor
      # Report Monitor specification
      report:
        source:
          dataOsInstance:
            path: /collated/api/v1/reports/resources/status?id=depot:v2alpha:bigquery-depot
        conditions:
          - valueComparison:
              observationType: state
              valueJqFilter: '.value'
              operator: equals
              value: deleted
-
Validate the monitor logic by testing the condition before applying the resource.
Expected output when the condition is not met:

    INFO[0000] 🔮 develop observability...
    INFO[0000] 🔮 develop observability...monitor tcp-stream...starting
    INFO[0001] 🔮 develop observability...monitor tcp-stream...running
    INFO[0002] 🔮 develop observability...monitor tcp-stream...stopping
    INFO[0002] 🔮 context cancelled, monitor tcp-stream is closing.
    INFO[0003] 🔮 develop observability...complete

    RESULT (maxRows: 10, totalRows:0): 🟧 monitor condition not met

Expected output when the condition is met:

    INFO[0000] 🔮 develop observability...
    INFO[0000] 🔮 develop observability...monitor tcp-stream...starting
    INFO[0001] 🔮 develop observability...monitor tcp-stream...running
    INFO[0001] 🔮 develop observability...monitor tcp-stream...stopping
    INFO[0001] 🔮 context cancelled, monitor tcp-stream is closing.
    INFO[0002] 🔮 develop observability...complete

    RESULT (maxRows: 10, totalRows:1): 🟩 monitor condition met
-
Apply the Monitor Resource using the following command.
-
Verify the Monitor runtime to confirm that the monitor is running and scheduled correctly.
-
Create a Pager Resource manifest file. This file defines how and where alerts will be sent when a Resource status change is detected.
    name: status-alert-pager
    version: v1alpha
    type: pager
    description: Alert, Resource is deleted!
    workspace: <your-workspace>
    pager:
      conditions:
        - valueJqFilter: .properties.name
          operator: equals
          value: resource-status-alert
      output:
        msTeams:
          webHookUrl: https://rubikdatasolutions.webhook.office.com/webhookb2/09239cd8-92a8--9621-9bf6eec2-43f5-bf92-78e9f35a44fb/IncomingWebhook/92dcd2acdaee4e6cac125ac4a729e48f/631bd149-c89d--8979-8e364f62b419/V23AwNxCZx9JkwfToWpqDSYeRkDZ-cPn74p0HTqg1
-
Apply the Pager Resource using the command below.
-
Get notified! When the Resource status changes from active to deleted, the Monitor triggers an incident, and the Pager sends a notification to the configured destination, as shown below.
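To make the flow in the steps above concrete, here is a minimal Python sketch of how the two conditions might be evaluated. It is not DataOS code. It assumes that the status report is a JSON object with a top-level `value` field and that an incident carries the Monitor's incident name at `properties.name`; neither payload shape is shown in this guide. The helper names (`jq_get`, `condition_met`, `monitor_tick`, `pager_routes`) are hypothetical, and `jq_get` only handles simple dotted paths rather than full jq syntax.

```python
# Conditions copied from the two manifests in this guide.
MONITOR_CONDITION = {"valueJqFilter": ".value", "operator": "equals", "value": "deleted"}
PAGER_CONDITION = {
    "valueJqFilter": ".properties.name",
    "operator": "equals",
    "value": "resource-status-alert",
}


def jq_get(doc, jq_filter):
    """Resolve a simple dotted path such as '.value' or '.properties.name'.

    Stand-in for a real jq filter; returns None when the path is missing.
    """
    current = doc
    for key in filter(None, jq_filter.split(".")):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def condition_met(doc, condition):
    """Evaluate one condition; only the 'equals' operator is sketched here."""
    if condition["operator"] != "equals":
        raise ValueError(f"unsupported operator: {condition['operator']}")
    return jq_get(doc, condition["valueJqFilter"]) == condition["value"]


def monitor_tick(report, incident_name):
    """One scheduled Monitor run: return an incident if the condition is met."""
    if not condition_met(report, MONITOR_CONDITION):
        return None
    # Assumed incident shape: the incident name is exposed at properties.name.
    return {"properties": {"name": incident_name, "severity": "high"}}


def pager_routes(incident, pager_condition):
    """Return True when the Pager condition matches the incident."""
    return incident is not None and condition_met(incident, pager_condition)


if __name__ == "__main__":
    # Hypothetical status reports for the watched Resource.
    print(monitor_tick({"value": "active"}, "depotincident"))  # None: condition not met

    incident = monitor_tick({"value": "deleted"}, "depotincident")
    print(pager_routes(incident, PAGER_CONDITION))  # False: names differ

    aligned = monitor_tick({"value": "deleted"}, "resource-status-alert")
    print(pager_routes(aligned, PAGER_CONDITION))  # True: names match
```

Under the stated assumption about `properties.name`, the incident name in the Monitor manifest (`depotincident`) and the value in the Pager condition (`resource-status-alert`) would not match as written. If alerts do not arrive, comparing these two values is a reasonable first check.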