Roadmap

  • Support for other data sources
    • AWS, GCP and Azure related data services (✅ cloud storage)
    • Delta Lake
    • RabbitMQ
    • ActiveMQ
    • MongoDB
    • Elasticsearch
    • Snowflake
    • Databricks
    • Pulsar
  • Further support for metadata discovery
  • ✅ API for developers and testers
    • ✅ Scala
    • ✅ Java
  • UI Portal for metadata and data generation
    • Metadata stored in database
    • Store data generation/validation run information in file/database
  • ✅ Report for data generated and validation rules
  • Integration with existing metadata services
  • Integration with existing data validations
  • ✅ Suggest data validations
  • Data dictionary
    • Business definitions of fields that can be referenced for metadata across all data sources
  • ✅ Verification rules after data generation
  • ✅ Validation waiting conditions
    • ✅ Webhook
    • ✅ File exists
    • ✅ Data exists via SQL expression
    • ✅ Pause
  • Extend validation types
    • ✅ Aggregates (sum of amount per account is > 500)
    • Ordering (transactions are ordered by date)
    • ✅ Relationship (at least one account entry in history table per account in accounts table)
    • Data profile (how closely the generated data profile matches the expected data profile)
  • Extend count
    • Cover all possible cases (e.g. a record for each combination of oneOf values, positive/negative values, etc.)
    • Similar to edge cases
    • Ability to override edge cases
  • Alerting
    • Slack
    • Email
  • Overriding tasks
    • Customise tasks without copying whole schema definitions, making scenarios easier to create
  • Gradle plugin
  • Metadata improvements
    • PII detection (can integrate with Presidio)
    • Relationship detection across data sources
    • SQL generation
    • Ordering information
  • Code generation
  • Schema generation from Scala/Java class
  • Ordered insertion for data sources that support insertion order
  • Further data cleanup
    • Clean up data in consumer data sinks
    • Clean up data from real-time sources (e.g. DELETE HTTP endpoint, delete events in JMS)
  • ✅ Trial app to try out all features
  • HTTP response data validation
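
To make the examples under "Extend validation types" concrete, here is a minimal sketch of the aggregate, ordering and relationship checks written as plain Java over in-memory rows. It does not use the project's API: the class `ValidationSketch`, the `Txn` row type and all method names are hypothetical and exist only for illustration.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from the project's API.
final class ValidationSketch {

    // A transaction row: account id, amount and transaction date.
    static final class Txn {
        final String accountId;
        final double amount;
        final LocalDate date;

        Txn(String accountId, double amount, LocalDate date) {
            this.accountId = accountId;
            this.amount = amount;
            this.date = date;
        }
    }

    // Aggregate: the sum of amount per account must be greater than the threshold.
    static boolean sumPerAccountAbove(List<Txn> txns, double threshold) {
        Map<String, Double> totals = new HashMap<>();
        for (Txn t : txns) {
            totals.merge(t.accountId, t.amount, Double::sum);
        }
        return totals.values().stream().allMatch(total -> total > threshold);
    }

    // Ordering: transactions must appear in non-decreasing date order.
    static boolean orderedByDate(List<Txn> txns) {
        for (int i = 1; i < txns.size(); i++) {
            if (txns.get(i).date.isBefore(txns.get(i - 1).date)) {
                return false;
            }
        }
        return true;
    }

    // Relationship: every account must have at least one entry in the history rows.
    static boolean everyAccountHasHistory(Set<String> accountIds, List<Txn> history) {
        Set<String> seen = new HashSet<>();
        for (Txn t : history) {
            seen.add(t.accountId);
        }
        return seen.containsAll(accountIds);
    }

    public static void main(String[] args) {
        List<Txn> history = List.of(
            new Txn("acc1", 300.0, LocalDate.of(2024, 1, 1)),
            new Txn("acc1", 250.0, LocalDate.of(2024, 1, 2)),
            new Txn("acc2", 600.0, LocalDate.of(2024, 1, 3)));

        System.out.println("aggregate: " + sumPerAccountAbove(history, 500.0));
        System.out.println("ordering: " + orderedByDate(history));
        System.out.println("relationship: "
            + everyAccountHasHistory(Set.of("acc1", "acc2"), history));
    }
}
```

Each check returns a single pass/fail boolean to keep the sketch short; a real validation would more likely report which accounts or rows failed.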