Skip to content

Freshness checks

Ensuring data freshness is essential for maintaining the reliability and accuracy of datasets. In Soda, freshness checks can be defined to monitor and enforce data timeliness. Below are explanations and sample configurations for various types of freshness checks.

1. Basic freshness check: This check ensures that the dataset is not older than a specified duration.

In this example, the freshness(test) < 10d condition verifies that the dataset's most recent record is less than 10 days old.

- freshness(test) < 10d:
    name: Data should not be older than 10 days
    attributes:
      category: Freshness
      title: Data should not be older than 10 days

2. Freshness check with alerting: Configure alerts to notify stakeholders when the data freshness check fails to meet the specified criteria.

The following manifest file checks that the ingestion_time column indicates data ingested within the last 12 hours and sends alerts via email and Slack if the condition is not met.

- freshness(test) < 12h:
    column: ingestion_time
    name: Data ingestion should occur within the last 12 hours
    attributes:
      category: Freshness
      title: Data ingestion should occur within the last 12 hours
    alert:
      email: data-team@example.com
      slack: '#data-alerts'

Integrating freshness checks ensures proactive monitoring and maintenance of dataset timeliness, supporting compliance with organizational data quality standards.