Schemas define the structure of data you want to extract from websites. They specify field names, data types, and descriptions. Schemas can be reused across multiple workflows.
For example, if you need to extract store locations from 5 different websites, you can create one schema and use it for all of them.
Managing Schemas
Create a New Schema
- Go to My Schemas and click ‘Create Schema’
- Choose how to start:
  - Create your own - Define your own data structure with custom fields
  - Copy from an existing workflow - Use the schema from one of your workflows as a starting point
  - Copy from an existing schema - Duplicate and modify one of your existing schemas
- Add, remove, or modify fields to match your data extraction needs
- Save your schema to use it in future workflows
Using Schemas in Workflows
When you create a new workflow, you can select one of your saved schemas to ensure consistent data extraction across different sources. This saves time and ensures your data always follows the same structure, making it easier to work with your extracted information.
Learn more about UI workflow creation →
Define Schemas Inline
Define schemas directly in your workflow creation code:

    const workflow = await client
      .extract({
        urls: ['https://example.com'],
        extraction: builder => builder
          .entity('Product')
          .field('name', 'Product name', 'STRING', { example: 'Laptop' })
          .field('price', 'Price in USD', 'MONEY')
          .field('inStock', 'Availability', 'BOOLEAN')
      })
      .create();

Create Reusable Schemas
Create and manage schemas separately for reuse:

    // Create a schema
    const schema = await client.schemas.create({
      entity: 'Product',
      fields: [
        { name: 'name', description: 'Product name', dataType: 'STRING' },
        { name: 'price', description: 'Price', dataType: 'MONEY' },
        { name: 'inStock', description: 'In stock', dataType: 'BOOLEAN' }
      ]
    });

    // Use schema in workflow
    const workflow = await client
      .extract({
        urls: ['https://example.com'],
        schemaId: schema.id
      })
      .create();

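To illustrate reusing one saved schema across several sites, here is a minimal sketch. It assumes only the `extract({ urls, schemaId }).create()` call shape shown above. The `ExtractClient` and `ExtractOptions` interfaces and the `createWorkflowsForSites` helper are hypothetical names introduced for this example and are not part of the SDK.

```typescript
// Structural types mirroring the call shape used in the example above.
// Assumption: these are illustrative, not the SDK's published typings.
interface ExtractOptions {
  urls: string[];
  schemaId: string;
}

interface ExtractClient<W> {
  extract(options: ExtractOptions): { create(): Promise<W> };
}

// Create one workflow per site, all sharing the same saved schema.
async function createWorkflowsForSites<W>(
  client: ExtractClient<W>,
  schemaId: string,
  siteUrls: string[],
): Promise<W[]> {
  const workflows: W[] = [];
  for (const url of siteUrls) {
    // Sequential on purpose: a failure stops at the site that caused it.
    workflows.push(await client.extract({ urls: [url], schemaId }).create());
  }
  return workflows;
}
```

With a real client this might be called as `createWorkflowsForSites(client, schema.id, storeLocatorUrls)`, where `storeLocatorUrls` is your own list of site URLs. Because every workflow points at the same schema, all the results share one structure.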
Learn more about SDK schema management →
Data Types
When defining schemas, you specify the data type for each field to ensure accurate extraction and validation. Kadoa supports the following data types:
| Data Type | Description | Example Use Cases |
|---|---|---|
| STRING | String/text content | Product names, descriptions, article headlines |
| NUMBER | Numeric values (integers, decimals) | Quantities, ratings, scores, counts |
| BOOLEAN | True/false values | Availability status, feature flags, yes/no indicators |
| DATE | Date values | Publication dates, deadlines, event dates |
| DATETIME | Date and time values | Timestamps, scheduled times, last updated |
| MONEY | Currency and monetary values | Prices, costs, revenue, discounts |
| IMAGE | Image URLs and references | Product photos, thumbnails, profile pictures |
| LINK | URLs and hyperlinks | Product pages, external links, social media |
| OBJECT | Nested/complex JSON structures | Structured metadata, complex configurations |
| ARRAY | Lists/arrays of values | Tags, categories, multiple images, feature lists |
Choose the appropriate data type to ensure your data is extracted and validated correctly.
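As a rough illustration, the data types in the table can be captured as a TypeScript union so that a typo in a field definition is caught before the schema is submitted. This is a client-side sketch only. `DATA_TYPES`, `FieldInput`, `isDataType`, and `validateFields` are hypothetical names, and the field shape (`name`, `description`, `dataType`) is taken from the schema creation example earlier on this page.

```typescript
// The ten data types listed in the table above.
const DATA_TYPES = [
  'STRING', 'NUMBER', 'BOOLEAN', 'DATE', 'DATETIME',
  'MONEY', 'IMAGE', 'LINK', 'OBJECT', 'ARRAY',
] as const;

type DataType = (typeof DATA_TYPES)[number];

// Field definition as it might arrive from config or user input,
// before the data type has been checked.
interface FieldInput {
  name: string;
  description: string;
  dataType: string;
}

// Type guard: narrows an arbitrary string to a known data type.
function isDataType(value: string): value is DataType {
  return (DATA_TYPES as readonly string[]).includes(value);
}

// Returns a list of problems; an empty list means no issues were found.
function validateFields(fields: FieldInput[]): string[] {
  const problems: string[] = [];
  const seenNames = new Set<string>();
  for (const field of fields) {
    if (!isDataType(field.dataType)) {
      problems.push(`${field.name}: unknown data type "${field.dataType}"`);
    }
    if (seenNames.has(field.name)) {
      problems.push(`${field.name}: duplicate field name`);
    }
    seenNames.add(field.name);
  }
  return problems;
}
```

The duplicate-name check is an extra assumption of this sketch; the page does not say whether field names must be unique.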
Some data types return structured values:
| Data Type | Format | Example |
|---|---|---|
| MONEY | {"amount": number, "currencyCode": string} | $124.50 → {"amount": 12450, "currencyCode": "USD"} |
The amount field is always in the smallest currency unit (e.g., cents for USD, pence for GBP).
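Because `amount` is in the smallest currency unit, it usually needs converting before display. The sketch below shows one way to do that with the standard `Intl.NumberFormat` API. `MoneyValue`, `minorUnitDigits`, `toMajorUnits`, and `formatMoney` are hypothetical names for this example. The sketch also assumes that `currencyCode` is an ISO 4217 code and that the runtime's currency data uses the same minor-unit convention as the extracted values.

```typescript
// Shape of a MONEY value as described in the table above.
interface MoneyValue {
  amount: number;       // count of the smallest currency unit, e.g. cents
  currencyCode: string; // e.g. "USD" (assumed to be an ISO 4217 code)
}

// Ask the runtime how many fraction digits the currency normally uses
// (2 for USD, 0 for JPY). Falls back to 2 if the runtime does not report it.
function minorUnitDigits(currencyCode: string): number {
  const resolved = new Intl.NumberFormat('en-US', {
    style: 'currency',
    currency: currencyCode,
  }).resolvedOptions();
  return resolved.maximumFractionDigits ?? 2;
}

// Convert smallest-unit amounts to major units (12450 cents becomes 124.5).
function toMajorUnits(value: MoneyValue): number {
  return value.amount / 10 ** minorUnitDigits(value.currencyCode);
}

// Format for display in the given locale.
function formatMoney(value: MoneyValue, locale: string = 'en-US'): string {
  return new Intl.NumberFormat(locale, {
    style: 'currency',
    currency: value.currencyCode,
  }).format(toMajorUnits(value));
}
```

For the example in the table, `formatMoney({ amount: 12450, currencyCode: 'USD' })` returns `"$124.50"`. For arithmetic such as totals, it is generally safer to add the integer `amount` values and convert only for display, since that avoids floating-point rounding.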
Special Field Types
Beyond regular data fields, Kadoa supports special field types for advanced use cases:
Classification Fields
Automatically categorize content into predefined labels. Useful for:
- Sentiment analysis (Positive/Negative/Neutral)
- Content categorization (Technology/Business/Sports)
- Priority classification (High/Medium/Low)
Learn more about classification in the SDK →
Metadata Fields (Raw Content)
Extract raw page content in different formats:
- HTML - Raw HTML source code
- MARKDOWN - Markdown formatted content
- PAGE_URL - Page URL
Learn more about metadata fields in the SDK →
Need help creating a custom schema? Contact our support team for assistance.