Roadmap
- Support for other data sources
- AWS, GCP and Azure related data services (cloud storage)
- Delta Lake
- RabbitMQ
- ActiveMQ
- MongoDB
- Elasticsearch
- Snowflake
- Databricks
- Pulsar
- Further support for metadata discovery
- HTTP (OpenAPI spec)
- JMS
- Read from samples
- API for developers and testers
- Scala
- Java
- UI Portal for metadata and data generation
- Metadata stored in database
- Store data generation/validation run information in file/database
- Report for data generated and validation rules
- Integration with existing metadata services
- Populate metadata back to metadata services
- OpenLineage metadata (Marquez)
- OpenMetadata
- ODCS (Open Data Contract Standard)
- Amundsen
- DataHub
- Solace Event Portal
- Airflow
- dbt
- Integration with existing data validations
- Suggest data validations
- Data dictionary
- Business definitions of fields that can be referenced for metadata across all data sources
- Verification rules after data generation
- Validation waiting conditions
- Webhook
- File exists
- Data exists via SQL expression
- Pause
- Extend validation types
- Aggregates (sum of amount per account is > 500)
- Ordering (transactions are ordered by date)
- Relationship (at least one account entry in history table per account in accounts table)
- Data profile (how closely the generated data profile matches the expected data profile)
- Extend count
- Cover all possible cases (e.g. a record for each combination of oneOf values, positive/negative values)
- Similar to edge cases
- Ability to override edge cases
- Alerting
- Slack
- Overriding tasks
- Customise tasks without copying whole schema definitions, making it easier to create scenarios
- Gradle plugin
- Metadata improvements
- PII detection (can integrate with Presidio)
- Relationship detection across data sources
- SQL generation
- Ordering information
- Code generation
- Schema generation from Scala/Java class
- Ordering within data sources that support order for insertion
- Further data cleanup
- Clean up data in consumer data sinks
- Clean up data from real-time sources (e.g. DELETE HTTP endpoint, delete events in JMS)
- Trial app to try out all features
- HTTP response data validation
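
The three example rules listed under "Extend validation types" (aggregates, ordering, relationship) can be illustrated with a minimal sketch in plain Python. This is not the project's API: the function names, record layout and field names (`account_id`, `amount`, `date`) are all hypothetical, chosen only to show what each planned rule would check.

```python
from collections import defaultdict

# Hypothetical sample records; field names are assumptions for illustration.
accounts = [{"account_id": "A1"}, {"account_id": "A2"}]
transactions = [
    {"account_id": "A1", "amount": 300, "date": "2024-01-01"},
    {"account_id": "A1", "amount": 250, "date": "2024-01-02"},
    {"account_id": "A2", "amount": 600, "date": "2024-01-03"},
]
history = [{"account_id": "A1"}, {"account_id": "A2"}, {"account_id": "A2"}]


def aggregate_ok(rows, threshold=500):
    """Aggregate rule: the sum of amount per account is greater than threshold."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["account_id"]] += row["amount"]
    return all(total > threshold for total in totals.values())


def ordering_ok(rows):
    """Ordering rule: transactions appear in non-decreasing date order."""
    dates = [row["date"] for row in rows]  # ISO 8601 dates compare correctly as strings
    return all(a <= b for a, b in zip(dates, dates[1:]))


def relationship_ok(parents, children):
    """Relationship rule: every account has at least one entry in the history table."""
    child_ids = {row["account_id"] for row in children}
    return all(row["account_id"] in child_ids for row in parents)


if __name__ == "__main__":
    print("aggregate:", aggregate_ok(transactions))
    print("ordering:", ordering_ok(transactions))
    print("relationship:", relationship_ok(accounts, history))
```

Each function returns a single boolean, so a failing rule could feed directly into a report or alert. A real implementation would more likely express these checks as SQL or Spark expressions against the generated data.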