Introduction
Data validation lets you define rules that detect anomalies or unexpected results after a workflow run. Use it to detect missing values, outliers, and schema issues before downstream use.
Prerequisites
- A completed workflow run with preview data
Configure validation
- Open the workflow in your dashboard.
- In the sidebar, select Issues. You will see three tabs there:
  - Key columns: Define how rows are tracked across runs.
  - Rules: Create, view, suggest, and delete validation rules.
  - Results: Review issues found by rules for the selected run.
- Select the Rules tab.
- Choose one of the following:
  - Add new Rule: Manually choose a target column, define a condition, and (optionally) add domain hints for better precision.
  - Suggest Rules: Kadoa auto-suggests rules from your schema and sample data.
Manual rules often provide the most precise results when you know the domain.
Rule execution
- Rules run at the end of each subsequent pipeline run.
- Adding or removing a rule takes effect on the next run.
- If data shape changes make a rule invalid, Kadoa disables it automatically.
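One way to picture the auto-disable behavior described above is a check that each rule's target columns still exist in the latest schema. This is only a sketch under that assumption: the `Rule` class, its fields, and `refresh_rules` are hypothetical names, and Kadoa's actual criteria for an invalid rule are not documented here.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule record; not Kadoa's actual data model."""
    name: str
    columns: list
    enabled: bool = True
    disabled_reason: str = ""


def refresh_rules(rules, schema_columns):
    """Disable rules whose target columns are missing from the schema.

    Returns the rules that remain enabled. Assumes "invalid" means
    "references a column that no longer exists", which is a guess.
    """
    available = set(schema_columns)
    for rule in rules:
        missing = [c for c in rule.columns if c not in available]
        if missing:
            rule.enabled = False
            rule.disabled_reason = "missing columns: " + ", ".join(missing)
    return [r for r in rules if r.enabled]


rules = [Rule("price_positive", ["price"]), Rule("rating_range", ["rating"])]
active = refresh_rules(rules, ["sku", "price"])
```

Recording a reason alongside the disabled flag mirrors the note that a reason is shown for disabled rules when one is available.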
Working with rules
- Add rule: Select target columns and describe the rule in natural language. Kadoa generates the rule.
- Suggest rules: Auto-generate common rules based on data types.
- Delete rules: Remove a single rule or delete all rules.
- Disabled rules: Rules can be auto-disabled when they no longer apply. A reason is shown when available.
- Raw rule code: Rules expose raw SQL for transparency.
- Historical runs: When viewing a past run, you’ll see the rules that were in effect at that time (read-only).
Example rules
Here are examples of natural language inputs and their generated SQL validation rules:
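As an illustrative stand-in (not actual Kadoa output), the sketch below pairs a natural-language rule with a hand-written SQL query and runs it against an in-memory SQLite table. The `products` table, its columns, the sample rows, and the convention that a rule query returns the violating rows are all assumptions made for this example.

```python
import sqlite3

# Hypothetical extracted rows; table and column names are illustrative only.
ROWS = [
    ("sku-1", "Desk lamp", 24.99),
    ("sku-2", "Office chair", None),    # missing price
    ("sku-3", "Monitor stand", -5.00),  # negative price
]

# Hypothetical natural-language input and a hand-written SQL counterpart.
# The rule is expressed as a query that selects rows violating it.
RULE_TEXT = "price must be present and greater than zero"
RULE_SQL = """
    SELECT sku, price
    FROM products
    WHERE price IS NULL OR price <= 0
    ORDER BY sku
"""


def find_violations(rows, rule_sql):
    """Load rows into an in-memory table and return the rows the rule flags."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE products (sku TEXT, name TEXT, price REAL)")
        conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
        return conn.execute(rule_sql).fetchall()
    finally:
        conn.close()


violations = find_violations(ROWS, RULE_SQL)
```

Writing the rule as "select the offending rows" keeps each result tied to a specific row, which fits a report that lists issues per rule and per row.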
Validation report
After a run finishes, go to Issues → Report.
- See issues grouped by rule.
- Click an issue to open the row detail and view all issues associated with that row.
Results view details
- Filter by rule: Use the filter to focus on specific rules when multiple are present.
- Row details: Click an item to open the row and see the offending value and related context.
- Status indicators:
  - NEW: first time the issue appears
  - RESOLVED: issue no longer present
- Summary chips show the change since the previous run: +n new issues, -n resolved
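The NEW and RESOLVED statuses and the +n / -n chips can be thought of as set differences between the issues of two consecutive runs. The sketch below assumes an issue is identified by a (row key, rule name) pair; that identifier format and the `diff_issues` helper are hypothetical, chosen only to make the arithmetic concrete.

```python
def diff_issues(previous, current):
    """Compare issue identifiers from the previous and the current run.

    Each issue is assumed to be a hashable identifier such as a
    (row_key, rule_name) tuple. Returns the NEW and RESOLVED sets plus
    a chip-style summary string.
    """
    new = current - previous       # present now, absent before
    resolved = previous - current  # present before, absent now
    return {
        "new": new,
        "resolved": resolved,
        "chip": f"+{len(new)} / -{len(resolved)}",
    }


previous_run = {("sku-2", "price_present"), ("sku-3", "price_positive")}
current_run = {("sku-3", "price_positive"), ("sku-4", "name_not_empty")}
summary = diff_issues(previous_run, current_run)
```

In this example one issue persists across both runs, so it is neither NEW nor RESOLVED and does not contribute to either count.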
Key columns (optional)
Defining key columns lets Kadoa track rows across runs for richer insights.
- Configure in Issues → Key columns.
- Pick one or more columns used to match the same row across runs.
- Requirements: values should be present for most rows and unique per row (no duplicates).
How to pick key columns
- Prefer stable identifiers (e.g., product ID, URL, SKU).
- If a row cannot be matched via the key, it is treated as a new row.
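Before settling on key columns, it can help to check a candidate against the two requirements above (mostly present, no duplicates) on a sample of your data. The sketch below is a local sanity check under assumed inputs: rows as dictionaries, with the hypothetical helpers `assess_key` and `split_by_match`. It does not call any Kadoa API.

```python
from collections import Counter


def key_of(row, key_columns):
    """Build the composite key tuple for one row."""
    return tuple(row.get(col) for col in key_columns)


def assess_key(rows, key_columns):
    """Report the share of rows with a complete key and any duplicate keys."""
    keys = [key_of(row, key_columns) for row in rows]
    complete = [k for k in keys if all(v not in (None, "") for v in k)]
    counts = Counter(complete)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    presence = len(complete) / len(keys) if keys else 0.0
    return {"presence": presence, "duplicates": duplicates}


def split_by_match(previous_rows, current_rows, key_columns):
    """Split current rows into those matched to a previous row and new ones."""
    seen = {key_of(row, key_columns) for row in previous_rows}
    matched = [r for r in current_rows if key_of(r, key_columns) in seen]
    new = [r for r in current_rows if key_of(r, key_columns) not in seen]
    return matched, new


sample = [
    {"sku": "a", "url": "u1"},
    {"sku": "b", "url": "u2"},
    {"sku": "b", "url": "u3"},   # duplicate sku
    {"sku": None, "url": "u4"},  # missing sku
]
sku_report = assess_key(sample, ["sku"])
url_report = assess_key(sample, ["url"])
```

In this sample, `url` is the better candidate: it is present on every row and has no duplicates, while `sku` is missing once and repeated once.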
Key‑based insights
When key columns are set, the report shows change indicators between runs:
- +n: new issues discovered since the previous run
- -n: issues resolved since the previous run
Validate now
Use the Validate now button to schedule validation for the current workflow’s latest data. This is available when no specific past run is selected.