Generators are the most important entity within the Hub. Generators encapsulate the Generator Models produced from a particular data set.
Generators belong to an Organisation and are accessed through the API by Users belonging to Teams within that Organisation.
Each Generator has one or more Versions, each version being a set of one or more Generator Models trained on a particular snapshot of the source data.
In order to add Generator Models to a Generator, you will need to create a Version and upload the model files.
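The relationship between Generators, Versions and Generator Models described above can be sketched as a small data model. This is only an illustration and not the Hub's actual schema or the Hazy Client API: every class, field and file name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneratorModel:
    filename: str  # hypothetical: name of an uploaded model file


@dataclass
class Version:
    snapshot: str  # hypothetical label for the source data snapshot
    models: List[GeneratorModel] = field(default_factory=list)


@dataclass
class Generator:
    name: str
    organisation: str  # a Generator belongs to one Organisation
    versions: List[Version] = field(default_factory=list)

    def add_version(self, snapshot: str, filenames: List[str]) -> Version:
        """Create a Version and attach one Generator Model per file."""
        if not filenames:
            raise ValueError("a Version needs at least one Generator Model")
        version = Version(snapshot, [GeneratorModel(f) for f in filenames])
        self.versions.append(version)
        return version


# Illustrative values only.
generator = Generator(name="customers", organisation="acme")
generator.add_version("2020-01", ["model_a.bin", "model_b.bin"])
```

The sketch mirrors the rule that models are never attached to a Generator directly: they always arrive as part of a Version.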
Active vs Draft¶
Only "Active" versions are available through the API.
"Draft" versions are private and not available for data generation until activated.
Clicking "Make draft" on a currently active version will make it unavailable through the API.
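The visibility rule can be pictured as a simple status flag on each version. The sketch below is a hypothetical model of that behaviour, not Hub code. It assumes that new versions start as drafts, and the `Status`, `Version` and `api_visible` names are invented for illustration.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    ACTIVE = "active"


class Version:
    def __init__(self, label):
        self.label = label
        self.status = Status.DRAFT  # assumption: new versions start as drafts

    def activate(self):
        self.status = Status.ACTIVE

    def make_draft(self):
        self.status = Status.DRAFT


def api_visible(versions):
    """Return only the versions that would be reachable through the API."""
    return [v for v in versions if v.status is Status.ACTIVE]


v1, v2 = Version("v1"), Version("v2")
v1.activate()
visible_labels = [v.label for v in api_visible([v1, v2])]

# Moving the active version back to draft hides it again.
v1.make_draft()
visible_after_draft = api_visible([v1, v2])
```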
Clicking on your Generator in the Organisation homepage will show the Generator's overview page.
In a populated Generator - one with an active Version - the overview page will show:
- Information on the currently active Version
- Details of Teams with access
- The Synthesiser used to train the Generator Models
- The training parameters used
- Schema information and sample synthetic data
- How to use the Hazy Client to generate synthetic data from the selected Generator Model
Create a new Generator¶
From the Organisation homepage, click on "Add new generator".
The only required field is a name for the generator, but you can also add a description.
Click "Save" and you will be taken back to your Organisation's homepage, which will now show your newly created Generator.
Create a new Generator Version¶
Click on "Versions" from the Generator overview page, then click "Add version".
This will take you to the Version management page.
Uploading Generator Models¶
To upload Generator Models, click on "Choose Files" within a Version. This will open a file selection dialogue. Alternatively, you can drag and drop model files from your file system onto the file input.
Now click on "Upload" to attach the models to the current version.
The Hub will introspect the model files and determine the synthesiser used to create them and their differential privacy setting (their epsilon value). A single version can have multiple synthesisers each with multiple epsilon values.
Once the version is ready, hit the "Activate" button to make it available via the API.
The Generator's overview page will now show the active version and will allow you to download the attached model files for a specific synthesiser and epsilon combination.
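Since a single version can hold several synthesisers, each with several epsilon values, it can help to think of the attached models as a two-level lookup keyed by synthesiser and then by epsilon. The sketch below illustrates that idea only. The metadata fields, synthesiser names and file names are hypothetical, and it does not reflect how the Hub actually inspects model files.

```python
from collections import defaultdict

# Hypothetical metadata, standing in for what is read from each uploaded file.
uploaded = [
    {"file": "a_01.model", "synthesiser": "synth_a", "epsilon": 0.1},
    {"file": "a_10.model", "synthesiser": "synth_a", "epsilon": 1.0},
    {"file": "b_01.model", "synthesiser": "synth_b", "epsilon": 0.1},
]


def group_models(models):
    """Index model files by synthesiser, then by epsilon value."""
    grouped = defaultdict(dict)
    for model in models:
        grouped[model["synthesiser"]][model["epsilon"]] = model["file"]
    return dict(grouped)


grouped = group_models(uploaded)

# Pick the file for one synthesiser and epsilon combination.
selected = grouped["synth_a"][1.0]
```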
Generator Model Metrics¶
Each model also includes metrics for an example set of synthetic data generated from it. Use the "Metrics" tab to view them.
Hazy provides custom metrics for similarity, utility and privacy based on the use case and data source. These metrics are also interactive, allowing you to evaluate the trade-offs between privacy and utility.
Models only show relevant metrics, but may include: