Data Caterer is a metadata-driven data generation and testing tool that helps create production-like data across both batch and event data systems. Run data validations to ensure your systems have ingested the data as expected, then clean it up afterwards.
Simplify your data testing
Take away the pain and complexity of your data landscape and let Data Caterer handle it
Data testing is difficult and fragmented
- Data sent via messages, HTTP requests or files, then stored in databases, file systems, etc.
- Maintaining and updating tests with the latest schemas and business definitions
- Different testing tools for services, jobs or data sources
- Complex relationships between datasets and fields
- Different scenarios, permutations, combinations and edge cases to cover
Current solutions only cover half the story
- Specific testing frameworks that support only one or a limited number of data sources or transport protocols
- Underutilizing metadata from data catalogs or metadata discovery services
- Testing teams having difficulty understanding when failures occur
- Integration tests relying on external teams/services
- Manually generating data, or worse, copying/masking production data into lower environments
- Observability pushes towards being reactive rather than proactive
What you need is a reliable tool that can handle changes to your data landscape
With Data Caterer, you get:
- Ability to connect to any type of data source: files, SQL or NoSQL databases, messaging systems, HTTP
- Discover metadata from your existing infrastructure and services
- Gain confidence that bugs do not propagate to production
- Be proactive in ensuring changes do not affect other data producers or consumers
- Configurability to run the way you want
Tech Summary
Use the Java or Scala API, or YAML files, for setup and customisation; everything runs via a Docker image. Want to get into the details? Check out the setup pages here for code examples and guides that take you through scenarios and data sources.
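For example, a minimal Scala plan class might look like the sketch below. It assumes the documented Scala API (the PlanRun base class and the csv, field, count and execute builders) and the io.github.datacatering package name; exact method and package names can differ between versions, so treat this as an illustrative sketch rather than a definitive example.

```scala
import io.github.datacatering.datacaterer.api.PlanRun
import io.github.datacatering.datacaterer.api.model.DoubleType

// Sketch of a data generation plan, assuming the documented Scala API
// (PlanRun, csv, field, count, execute); names may differ between versions.
class MyDataGenerationPlan extends PlanRun {

  // CSV data source with a few generated fields
  val accountTask = csv("accounts", "/opt/app/data/accounts", Map("header" -> "true"))
    .fields(
      field.name("account_id").regex("ACC[0-9]{8}"),      // pattern-based IDs
      field.name("name").expression("#{Name.name}"),      // Faker-style expression
      field.name("balance").`type`(DoubleType).min(0).max(10000)
    )
    .count(count.records(1000))

  // Run the plan; the class is picked up and executed via the Docker image
  execute(accountTask)
}
```

The same plan can equally be expressed in Java or YAML and run through the Docker image.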
Main features include:
- Metadata discovery
- Batch and event data generation
- Maintain referential integrity across any dataset (see the sketch after this list)
- Create custom data generation scenarios
- Clean up generated data
- Validate data
- Suggest data validations
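To illustrate how several of these features combine, the sketch below generates two related datasets, keeps them referentially consistent, attaches a validation and cleans up afterwards. The builder names used here (plan.addForeignKeyRelationship, validation.field, configuration.enableDeleteGeneratedRecords) are assumptions based on the project's documented Scala API and may differ between versions.

```scala
import io.github.datacatering.datacaterer.api.PlanRun
import io.github.datacatering.datacaterer.api.model.DoubleType

// Sketch combining referential integrity, validation and clean-up,
// assuming the documented Scala API; builder names may vary by version.
class AccountsAndTransactionsPlan extends PlanRun {

  val accounts = csv("accounts", "/opt/app/data/accounts", Map("header" -> "true"))
    .fields(field.name("account_id").regex("ACC[0-9]{8}"))
    .count(count.records(200))

  val transactions = csv("transactions", "/opt/app/data/transactions", Map("header" -> "true"))
    .fields(
      field.name("account_id"),
      field.name("amount").`type`(DoubleType).min(1).max(1000)
    )
    .count(count.records(1000))
    .validations(validation.field("amount").greaterThan(0)) // checked after ingestion

  // Foreign key: every transactions.account_id comes from accounts.account_id
  val relationships = plan.addForeignKeyRelationship(
    accounts, List("account_id"),
    List(transactions -> List("account_id"))
  )

  // Remove the generated records once validation has finished
  val config = configuration.enableDeleteGeneratedRecords(true)

  execute(relationships, config, accounts, transactions)
}
```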
Check other run configurations here.
What is it
- Data generation and testing tool: Generate production-like data to be consumed and validated.
- Designed for any data source: We aim to support pushing data to any data source, in any format.
- Low/no code solution: Use the tool via Scala, Java or YAML. Connect to data or metadata sources to generate and validate data.
- Developer productivity tool: Whether you are a new developer or a seasoned veteran, cut down your feedback loop when developing with data.
What it is not
- Metadata storage/platform: You could store and use metadata within the data generation/validation tasks, but this is not the recommended approach. Rather, this metadata should be gathered from existing services that handle metadata on behalf of Data Caterer.
- Data contract: Data Caterer focuses on data generation and testing, which can include details about what the data looks like and how it behaves. It does not encompass the additional metadata that comes with a data contract, such as SLAs, security, etc.
- Metrics from load testing: Although millions of records can be generated, there are limited capabilities for capturing metrics.
Data Catering vs Other tools vs In-house
|  | Data Catering | Other tools | In-house |
|---|---|---|---|
| Data flow | Batch and event generation with validation | Batch generation only or validation only | Depends on architecture and design |
| Time to results | 1 day | 1+ month to integrate, deploy and onboard | 1+ month to build and deploy |
| Solution | Connect with your existing data ecosystem, automatic generation and validation | Manual UI data entry or via SDK | Depends on engineer(s) building it |