Hazy synthetic data platform product update

Hazy product update - Q1 24

In the latest release to the Hazy platform, we’ve taken a big step in operationalising synthetic data across businesses, ensuring that synthetic data is generated safely, at scale.

Synthetic data marketplace

We’ve upgraded our synthetic data library to be the first Synthetic Data Marketplace, enabling teams - regardless of technical skill - to benefit from synthetic data. Data producers can generate data and models and share them seamlessly with data consumers, who in turn can use the assets across various use cases. Find out more in our deep dive blog.

Out-of-core sampling

With our enhanced database subsetting pipeline, customers can harness the capabilities of Dask to read and downsample extensive datasets easily. This feature guarantees the retention of referential integrity and vital insights, making previously unwieldy datasets accessible and usable. Find out more.

Database subsetting within the Hazy platform
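To illustrate the idea behind subsetting with referential integrity, here is a minimal sketch. It is not Hazy's pipeline and it uses plain Python iterators rather than Dask; the table names, the `customer_id` column and the helper functions are all hypothetical. The assumption is that rows are selected by hashing the shared key, so every table can be filtered in a single streaming pass without loading it fully into memory.

```python
import hashlib
from typing import Dict, Iterable, Iterator

Row = Dict[str, str]


def keep_key(key: str, fraction: float) -> bool:
    """Deterministically decide whether a key belongs to the sample."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < fraction


def subset(rows: Iterable[Row], key_column: str, fraction: float) -> Iterator[Row]:
    """Stream rows, yielding only those whose key falls in the sample."""
    for row in rows:
        if keep_key(row[key_column], fraction):
            yield row


# Hypothetical parent and child tables linked by customer_id.
customers = [{"customer_id": str(i)} for i in range(1000)]
orders = [{"order_id": str(i), "customer_id": str(i % 1000)} for i in range(5000)]

# The same key-based rule is applied to both tables independently.
sampled_customers = list(subset(customers, "customer_id", 0.1))
sampled_orders = list(subset(orders, "customer_id", 0.1))
```

Because the decision depends only on the key value, every sampled order refers to a sampled customer. In principle the same per-row rule could be applied partition by partition to a larger-than-memory dataset.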

New database connectors

We've added four new data connectors to the Hazy product: Databricks, Postgres, Oracle and BigQuery. These connectors give our customers more options to connect and use their data assets, and complement our existing library of connectors (S3, GCS, Azure Blob Storage, IBM DB2, MSSQL and Snowflake).

Additional connectors are currently in development and are scheduled for our next release.

Database connectors available in the Hazy platform

Multi-zone developments

We work directly with our customers to seamlessly integrate Hazy into their data workflows, frequently navigating complex environment and architecture requirements to deploy a synthetic data platform that is safe and scalable.


Hazy now offers a ‘multi-zone’ deployment approach for customers with multiple security contexts. We support standalone and distributed architectures, both of which can be leveraged to adopt a multi-zone strategy and network access. Multi-hub deployment means only the model moves between production and analytics or development environments without any loss of functionality from a synthetic data usage perspective. Read the full blog.

Additional differentially private models

We have added two additional differentially private (DP) generative models into our latest product release: Ryan McKenna’s MST and AIM.


Both models have consistently demonstrated top-tier performance in academic benchmarking studies and offer strong utility-privacy tradeoffs. Read more about them.
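Both MST and AIM work by measuring low-dimensional marginals of the data with added noise and then fitting a model to those noisy measurements. The sketch below shows only that measurement step for a single column. It is an illustration rather than Hazy's or the original authors' implementation: the function name, the example column and the privacy parameter `rho` are assumptions, and the noise scale assumes a histogram with L2 sensitivity 1 under zero-concentrated differential privacy.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Sequence


def noisy_marginal(
    values: Sequence[str], domain: List[str], rho: float, rng: random.Random
) -> Dict[str, float]:
    """Return a one-way marginal (histogram) with Gaussian noise added."""
    # Gaussian mechanism under zCDP: for sensitivity 1, sigma = sqrt(1 / (2 * rho)).
    sigma = math.sqrt(1.0 / (2.0 * rho))
    counts = Counter(values)
    # Noise is added to every cell of the domain, including empty ones.
    return {v: counts.get(v, 0) + rng.gauss(0.0, sigma) for v in domain}


rng = random.Random(0)
column = ["a"] * 60 + ["b"] * 30 + ["c"] * 10  # hypothetical categorical column
measured = noisy_marginal(column, ["a", "b", "c"], rho=0.5, rng=rng)
```

A smaller `rho` means stronger privacy and noisier counts. Choosing which marginals to measure and fitting a generative model to them are the parts where MST and AIM differ, and they are omitted here.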

Advanced handler developments

We have continued to develop our data type handling capabilities. One example is Hazy’s Split ID function, which lets users preserve the real-world context of string data types whilst protecting the privacy of customers’ information.

The Split ID feature allows the user to specify a certain modelling behaviour for sections of a string column, which is particularly useful when sections of a string field have utility for downstream analytics. Find out more.

Split ID functionality in the Hazy platform
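As a rough illustration of treating sections of a string differently, the sketch below splits a hypothetical identifier such as `GB-2023-004217` into a prefix that is kept for analytics and a serial number that is replaced. This is not Hazy's API or configuration format: the `Section` class, the handler functions and the ID layout are all assumptions, and a real handler would model each section rather than simply copy or randomise it.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Section:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    handler: Callable[[str, random.Random], str]


def preserve(value: str, rng: random.Random) -> str:
    """Keep the section as-is because it carries analytical meaning."""
    return value


def random_digits(value: str, rng: random.Random) -> str:
    """Replace the section with random digits of the same length."""
    return "".join(rng.choice("0123456789") for _ in value)


def synthesise_id(value: str, sections: List[Section], rng: random.Random) -> str:
    """Apply each section's handler and join the results back together."""
    return "".join(s.handler(value[s.start:s.end], rng) for s in sections)


# Hypothetical layout: country and year prefix (8 chars), then a 6-digit serial.
sections = [Section(0, 8, preserve), Section(8, 14, random_digits)]
rng = random.Random(0)
synthetic = synthesise_id("GB-2023-004217", sections, rng)
```

The output keeps the `GB-2023-` prefix, so downstream analysis by country or year still works, while the serial number no longer points at a real record.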

Live on two marketplaces

Hazy is now live on AWS Marketplace as the most secure and cost-effective synthetic data platform available for customers to deploy effortlessly in their private cloud.

We’re also live on Snowflake Marketplace - alongside next-generation rewards app Unbanx - where we’ve launched the first ethical data cooperative comprising synthetically generated financial transaction data.

Live on Snowflake Marketplace

Hazy and Unbanx have launched the first ethical synthetically generated transaction data on Snowflake.

Live on AWS Marketplace

Deploy Hazy in minutes to accelerate your data workflows with the most secure synthetic data platform available on AWS.

