Python SDK

The Python client library lets users train models and generate synthetic data programmatically. Python 3.8 is the minimum recommended version.
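As a minimal sketch, a script can check the interpreter version before importing the SDK. The helper name python_is_supported is hypothetical and is not part of the Hazy SDK; it only encodes the recommended minimum stated above.

```python
import sys

# Recommended minimum Python version stated in this guide.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the recommended minimum.

    Hypothetical helper for illustration; not part of the Hazy SDK.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum
```

Calling python_is_supported() with no arguments checks the running interpreter, so a script can stop early with a clear message instead of failing later on an import.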

Structure

There are two main components of the SDK:

  • hazy_client2 (docs): Programmatic entry point to a running Hazy Hub instance.
  • hazy_configurator (docs): Library resources and objects for configuring a Hazy pipeline.

Getting started

In this short example, we'll:

  1. Set up a Hazy TrainingConfig with a data schema, data locations and specific evaluation configs,
  2. Train the model via train() which outputs a Hazy model file (.hmf extension). This file holds all the information to generate synthetic data,
  3. Set up a Hazy GenerationConfig to configure synthetic data generation,
  4. Generate synthetic data via generate().

Please refer to our SDK to learn about advanced configuration.

There are two primary methods to interact with Hazy:

  • SynthAPI - Initiate training and generation jobs using the Hazy API, specifying the Hazy UI host URL and a personal API key for authentication.
  • SynthDocker - Initiate training and generation tasks inside Docker containers using the SynthDocker class. Configurable parameters include the working directory, the local Docker daemon URL, container user settings, and file handling preferences.

Both methods are outlined below.
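Since SynthAPI needs a host URL and a personal API key, one option is to keep both out of source code and read them from the environment. The sketch below assumes two hypothetical environment variable names, HAZY_HUB_HOST and HAZY_API_KEY; neither the names nor the HubSettings class come from the Hazy SDK.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HubSettings:
    """Connection settings for a Hazy Hub instance (illustrative container)."""

    host: str
    api_key: str


def load_hub_settings(env=None):
    """Read connection settings from environment variables.

    The variable names are assumptions made for this example.
    """
    env = os.environ if env is None else env
    required = ("HAZY_HUB_HOST", "HAZY_API_KEY")
    # Collect every missing name so the error reports them all at once.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return HubSettings(host=env["HAZY_HUB_HOST"], api_key=env["HAZY_API_KEY"])
```

The returned values can then be passed on as the host and api_key arguments when the client object is created.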

Training and generation via SynthAPI

Setup

Copy the data below into a file called children.csv and save it locally or remotely. For more details on the file storage options we support, refer to the data sources section of our user guide.

Create a new project or use an existing one, and add a data source that points to the location of children.csv.

first_name,last_name,age,height
Megan,Chang,8,131.0
Robert,Green,2,87.0
William,Sullivan,10,146.0
Kristen,Turner,8,127.0
Thomas,Silva,9,135.5
Rebecca,Wagner,5,114.5
Juan,Campos,4,101.0
Christine,King,4,95.0
Renee,Mcgrath,6,122.0
Lisa,Barrera,4,101.0
Kyle,Blair,3,87.5
Rachel,Sutton,7,126.5
Thomas,Garcia,10,134.0
Ryan,Carr,7,124.5
Robin,Levy,7,130.5
Thomas,Grimes,5,115.5
Jorge,Trujillo,9,138.5
Ana,Smith,10,139.0
Jennifer,Ross,2,96.0
Mallory,Barnett,2,81.0
Aaron,Snyder,8,138.0
Mikayla,Sanchez,2,98.0
Mark,Harrell,9,134.5
James,Bradley,5,108.5
John,Ponce,3,91.5
Linda,West,5,105.5
Christopher,Flores,4,109.0
William,Cantu,9,126.5
Daniel,Arnold,3,95.5
Jasmine,Kelley,10,146.0
Lisa,Fernandez,3,94.5
Tamara,Morrison,10,140.0
Briana,Wallace,3,102.5
Caitlyn,Cruz,7,128.5
Barbara,Roberts,5,117.5
Jaime,Lopez,10,149.0
Chloe,Douglas,6,119.0
Thomas,Davis,3,104.5
Katherine,Mcdowell,8,128.0
Sandra,Kirby,5,107.5
Rachael,Leblanc,4,98.0
Amber,Myers,4,93.0
Janet,Hill,6,120.0
Lisa,Atkinson,3,87.5
Patty,Lawrence,4,96.0
Stephanie,Riley,2,81.0
Shannon,Keller,10,143.0
Wendy,Stark,10,139.0
Laura,Miller,10,138.0
Chloe,Tucker,5,116.5
Crystal,Bruce,8,136.0
John,Dennis,6,119.0
Dave,Robinson,9,144.5
Laura,Cook,7,113.5
Lisa,Garcia,7,130.5
Dustin,Wolfe,3,100.5
Brandon,Berry,7,117.5
Renee,Ferguson,5,98.5
Erin,Johnson,6,108.0
Cynthia,Obrien,5,109.5
Barbara,Myers,4,102.0
Mitchell,Hooper,8,119.0
Benjamin,Smith,3,89.5
Susan,Lopez,5,99.5
David,Clark,10,150.0
Lauren,Giles,3,85.5
Andrew,Coleman,3,105.5
Craig,Green,5,117.5
Jeffrey,Lucas,3,97.5
Michael,White,3,96.5
William,Williams,3,86.5
Craig,Mcneil,2,85.0
James,Howard,4,95.0
Jessica,Massey,9,130.5
Samantha,Jackson,2,79.0
Emily,Levy,10,144.0
Brian,Lowe,3,93.5
Megan,Peterson,3,92.5
John,Carlson,3,105.5
Scott,Thompson,6,116.0
Thomas,Ortiz,8,123.0
Ashley,Romero,2,95.0
Larry,Howard,9,125.5
Mary,King,3,97.5
Ann,Smith,5,106.5
Judith,Rogers,7,126.5
Brandon,Campbell,4,98.0
John,Benton,2,84.0
Michael,Roberts,4,102.0
Michael,Arroyo,10,139.0
Cynthia,Oliver,3,104.5
Jennifer,Hughes,9,129.5
Robert,Curtis,2,94.0
Aaron,Lee,8,136.0
Matthew,Allen,10,140.0
Dana,Gray,7,123.5
Nancy,Carroll,6,109.0
Robert,Morales,10,131.0
Jacqueline,Barnes,9,126.5
Eileen,Williams,7,112.5
Sean,Green,10,139.0
Eric,Rose,4,99.0
Tony,Hoffman,9,135.5
Karla,Henson,6,116.0
Troy,Collins,4,101.0
Steven,Lamb,8,131.0
Nancy,Burnett,3,85.5
Jacob,Key,5,108.5
Cynthia,Miller,4,99.0
Jessica,Hatfield,5,118.5
Richard,Gregory,9,136.5
Leslie,Lewis,8,119.0
Jennifer,Smith,8,136.0
Mackenzie,Rice,8,119.0
Connor,Wilson,4,106.0
Debra,Russo,3,93.5
Joshua,Good,4,106.0
Craig,Nash,10,146.0
Randy,Miller,10,150.0
Joshua,Chavez,2,80.0
Laura,Callahan,9,134.5
Dennis,Meyer,6,119.0
Debra,Reed,2,92.0
Monica,Ramirez,5,115.5
Andrew,Williams,3,89.5
Erin,Grant,2,91.0
Stacey,Mays,8,128.0
Renee,Williams,2,85.0
Kara,Miles,2,79.0
Diana,Joseph,10,150.0
Raven,Bowman,3,91.5
Nathan,Medina,3,104.5
Jared,Matthews,5,107.5
Alan,Hernandez,6,110.0
Mathew,Clarke,3,100.5
Jennifer,Morgan,8,138.0
Christine,Williams,3,85.5
Frank,Holden,6,119.0
Keith,Foster,3,93.5
Amy,Carter,4,112.0
Timothy,Allen,10,151.0
Brandon,White,7,114.5
Alexandra,Jones,4,100.0
Richard,Murphy,2,80.0
Robert,Garcia,2,85.0
Regina,Wells,6,122.0
Mary,Cherry,7,122.5
Matthew,Mendoza,2,98.0
Holly,Simmons,9,144.5
Kevin,Navarro,9,144.5
Patricia,Gillespie,8,129.0
Courtney,Bennett,10,136.0
Terri,Fowler,5,110.5
Cameron,Miller,6,105.0
Kara,Brown,4,96.0
Alan,Long,6,115.0
April,West,7,122.5
Tracy,Richards,3,95.5
Erin,Henderson,2,80.0
Micheal,Hinton,6,110.0
Jose,Waters,4,110.0
Ryan,Howard,6,116.0
Caleb,Boyer,8,135.0
Jacqueline,Leach,4,101.0
Shannon,Rhodes,3,100.5
David,Sanders,5,99.5
Jared,Williams,6,110.0
Stacy,Lewis,10,133.0
Dustin,Gonzalez,6,117.0
Nicholas,Payne,7,120.5
Edward,Hinton,8,121.0
Tonya,Hernandez,3,102.5
Richard,Frazier,9,139.5
Natalie,Simpson,7,121.5
Sally,Morris,3,100.5
Vernon,Jimenez,3,100.5
Elizabeth,Harris,8,119.0
Chelsea,Robinson,6,115.0
Matthew,Estes,4,97.0
Rachel,Meyers,8,138.0
Austin,Hernandez,3,87.5
Jonathan,Mueller,3,91.5
Megan,Aguilar,5,99.5
Jennifer,Roman,8,118.0
Carl,Miller,3,97.5
Misty,Williams,10,147.0
Jeffrey,Williams,6,119.0
Alexis,Anthony,9,142.5
Mark,Martin,5,111.5
Eduardo,Douglas,3,96.5
Tanya,Wagner,5,106.5
Rachel,Shaw,4,105.0
Audrey,Gregory,5,109.5
Linda,Chang,3,87.5
Vicki,Burgess,2,95.0
Rebecca,Harris,9,130.5
Amanda,George,3,100.5
Margaret,Olson,8,126.0
Kylie,Price,5,118.5
Brenda,York,2,85.0
Lauren,Sandoval,4,95.0
Aaron,White,5,112.5
William,Scott,8,129.0
Cameron,Heath,10,135.0
Sherri,Turner,3,104.5
Ricky,Patrick,9,128.5
Bryan,Davidson,8,138.0
David,Mitchell,8,134.0
Maria,Brown,9,134.5
Barry,Butler,9,139.5
Travis,Boyer,5,115.5
Jennifer,Nunez,5,98.5
Edward,Hatfield,7,121.5
Robert,Carr,7,112.5
Paul,Williams,10,135.0
Thomas,Hernandez,6,124.0
Antonio,Williamson,4,104.0
Crystal,Garcia,6,120.0
Andrea,Reed,3,87.5
Patrick,Frank,10,132.0
Tracy,Ibarra,3,92.5
Chelsea,Mcdonald,4,93.0
Cynthia,Morgan,6,105.0
David,Fleming,9,134.5
Christy,Kramer,4,96.0
David,Buck,9,135.5
Lauren,Stark,10,143.0
Monique,Becker,10,147.0
Lisa,Stone,2,97.0
Kristen,Lopez,3,101.5
Kimberly,Wallace,3,98.5
Katherine,Gibson,5,107.5
Kristine,Jones,10,150.0
Bradley,Villa,8,133.0
Todd,Santana,8,137.0
Shirley,Estrada,5,98.5
Ashley,Robinson,2,84.0
Clayton,Weiss,6,121.0
Pamela,Chan,6,115.0
Holly,Fisher,3,100.5
Kevin,Wilson,6,114.0
Ronald,Knight,8,130.0
Sandra,Walls,8,119.0
Robert,Garcia,4,112.0
Kim,Navarro,4,99.0
Anthony,Griffin,6,115.0
Gina,Johnson,2,80.0
Samantha,Rivers,9,137.5
Jennifer,Miller,4,107.0
Chad,Howard,3,89.5
Anthony,Bailey,7,124.5
Alejandro,Mccann,2,98.0
Lori,Jones,9,136.5
Patricia,Clark,9,125.5
Jamie,Nunez,3,100.5
Shawna,Martinez,4,92.0
Adrian,Wood,2,98.0
Angel,Jacobs,4,112.0
Michele,Lopez,7,114.5
Daniel,Cooper,10,151.0
Susan,Anderson,7,117.5
Tammy,Cox,8,133.0
Thomas,Carter,3,86.5
Sharon,Rubio,9,143.5
Cynthia,White,7,131.5
Victoria,Garcia,3,104.5
Beverly,Moore,6,109.0
Rachael,Bautista,8,127.0
Linda,Stewart,3,101.5
John,Fischer,5,99.5
Kelly,Barnes,8,132.0
Brandon,Anderson,7,117.5
Andrew,Miller,9,135.5
Charles,Fisher,3,86.5
Andrea,Yang,2,94.0
Douglas,Henderson,6,105.0
Dana,Miller,10,149.0
Sean,Wood,5,105.5
Stacy,Brown,3,105.5
Ricky,Butler,10,147.0
Jessica,Flores,8,134.0
James,Carter,6,108.0
Thomas,Clements,4,105.0
Laura,Hill,8,120.0
Angela,Watts,3,98.5
Laura,Griffin,3,88.5
Raymond,Saunders,8,122.0
Ryan,Wright,2,93.0
Shawn,Giles,8,131.0
Douglas,Ford,2,94.0
Dana,Webb,7,119.5
James,Smith,3,96.5
Holly,Montgomery,3,88.5
Alan,Evans,7,111.5
Kayla,Fuller,7,122.5
Amy,Moore,4,92.0
Jasmine,Ruiz,5,109.5
Erika,Wolf,3,104.5
Jesse,Gill,4,98.0
Joshua,Riggs,2,85.0
David,Stephenson,3,85.5
Billy,Scott,6,116.0
Alicia,Perkins,2,98.0
Randy,Garcia,5,102.5
Johnny,Campbell,4,106.0
Karina,Stout,3,100.5
Caitlin,Johnson,7,119.5
Laura,Torres,4,92.0
Matthew,Moreno,5,109.5
John,Mora,7,126.5
Frank,Perry,6,114.0
Keith,Meyer,10,151.0
Audrey,Burton,7,116.5
Amanda,Jenkins,3,88.5
Cynthia,Powell,10,149.0
Kimberly,French,6,110.0
Kelly,Watson,8,122.0
Courtney,Moore,4,99.0
Heidi,James,7,127.5
Brittany,Taylor,5,105.5
Elizabeth,Gomez,4,101.0
Thomas,Perry,7,124.5
Thomas,Neal,2,83.0
Lucas,Pearson,2,91.0
Brian,Evans,3,87.5
Julie,Williams,4,105.0
Christine,Johnson,6,122.0
Thomas,Oneal,8,122.0
Alejandro,Rose,8,127.0
Carl,Camacho,7,113.5
Gina,Harmon,5,112.5
Elizabeth,Smith,7,131.5
Blake,Oliver,10,132.0
Yvonne,Marks,8,131.0
Holly,Acosta,2,92.0
Jeremy,Walton,7,125.5
Keith,Garcia,5,109.5
Steven,Rivera,6,120.0
Gary,Fisher,3,90.5
Phyllis,Graham,3,93.5
Seth,Fletcher,3,102.5
Alexandria,Anderson,4,106.0
Renee,Wallace,8,123.0
Kristina,Price,8,131.0
Lindsay,Price,4,99.0
Jeffrey,Gonzalez,9,134.5
Shelby,Willis,10,135.0
Brandon,Price,7,125.5
Jim,Miller,3,100.5
Jacob,Brown,5,107.5
Danielle,Thompson,2,93.0
Cheryl,Salazar,9,124.5
Janet,Hunt,5,107.5
Justin,Rich,3,105.5
Michael,Ellis,6,122.0
Crystal,Black,4,105.0
Stephanie,Blevins,9,126.5
Sarah,Villa,9,131.5
Bianca,Henry,10,143.0
Janet,Lewis,6,125.0
Joseph,Williams,2,82.0
Francisco,Smith,6,106.0
Diamond,Taylor,2,87.0
Kristin,Becker,8,134.0
Tara,Sanders,8,132.0
Sandra,Chavez,3,93.5
Matthew,Garcia,7,120.5
Nicole,Norton,5,117.5
Marcus,Bryant,3,86.5
Mark,Johnson,3,93.5
Bradley,Wood,6,122.0
Jason,Warren,7,114.5
Jacob,Harris,10,138.0
Emily,Fitzgerald,4,94.0
Larry,Heath,8,127.0
Jonathan,Cooper,6,121.0
Jennifer,Williams,4,110.0
Sarah,Jones,10,151.0
Steven,Hardy,5,115.5
Brandon,Lamb,3,98.5
Tiffany,Stevens,10,143.0
David,Miller,6,114.0
Corey,Cannon,9,135.5
Robert,Calhoun,4,97.0
Scott,Jones,3,88.5
Ronald,Fischer,8,130.0
Maria,Williams,9,128.5
Henry,Burns,10,140.0
David,Pitts,7,131.5
Sarah,Flores,9,137.5
Ryan,Hawkins,5,113.5
Justin,Weaver,9,140.5
James,Phillips,7,126.5
Scott,Jacobs,2,93.0
Amanda,Green,6,109.0
Jesse,Wilson,9,125.5
Kristen,Garcia,5,98.5
Jessica,Wright,7,126.5
Justin,Fitzgerald,8,118.0
Donna,Harmon,10,133.0
Zachary,Trevino,3,97.5
David,Brewer,2,90.0
David,Mclaughlin,2,82.0
Michael,Leonard,2,87.0
Jade,Guerrero,6,112.0
Jeffrey,Decker,4,110.0
John,Holmes,6,111.0
Emily,Hall,3,98.5
Courtney,Mitchell,9,134.5
George,Pacheco,8,123.0
Angela,Murphy,7,124.5
Brenda,Johnson,8,122.0
David,Foster,9,128.5
Theresa,Dixon,10,141.0
Charles,Hubbard,4,98.0
Kimberly,Hampton,4,106.0
Ashley,Dominguez,7,123.5
Jason,Oconnor,8,133.0
Lisa,Wolf,8,125.0
Michael,Garza,5,112.5
Cristina,Lester,5,116.5
Daniel,Flowers,2,91.0
Alicia,Howard,2,86.0
Stanley,Smith,3,90.5
Jeffrey,Delgado,7,112.5
Donna,Simpson,4,99.0
Stephanie,Castillo,6,124.0
Lee,Abbott,3,101.5
Jonathon,Munoz,6,116.0
Jade,Underwood,8,132.0
Kristen,Cruz,2,99.0
Brian,Johnson,10,151.0
Jessica,Nixon,10,144.0
Matthew,Alexander,9,139.5
Jessica,Barrett,6,120.0
Jordan,Powers,5,108.5
Kevin,Kelly,6,106.0
Amber,Oconnor,2,80.0
Edward,Johnson,4,103.0
Courtney,Johnson,2,88.0
Brady,Perez,2,83.0
Desiree,Jones,3,98.5
Bryan,Stanton,5,117.5
Roberto,Stafford,8,135.0
Jay,Graham,5,112.5
Michael,Thompson,5,108.5
Mark,Hatfield,3,104.5
Julie,Reyes,3,95.5
Robert,Baker,7,128.5
Amanda,Fitzgerald,9,134.5
Gloria,Ford,6,105.0
Bobby,Dorsey,10,132.0
Robert,Myers,5,109.5
Aaron,Taylor,3,91.5
Misty,Palmer,10,142.0
Jessica,Hernandez,5,104.5
Mark,Larsen,6,114.0
Lori,Wright,6,121.0
Andrew,Drake,8,126.0
Gary,French,9,135.5
Gabriela,Jackson,5,99.5
Christian,Pennington,6,122.0
Carla,Oliver,3,85.5
Stephen,Hernandez,9,139.5
Carolyn,King,9,125.5
Linda,Oconnor,8,133.0
Anthony,Randolph,9,138.5
Barry,Davis,3,87.5
Kelly,Hernandez,3,92.5
Joel,Kelly,3,89.5
Jill,Sullivan,8,124.0
Joseph,Vasquez,9,143.5
Philip,Collins,3,98.5
Rachel,Pierce,10,143.0
Tiffany,Mejia,2,84.0
Ashley,Baker,5,113.5
Gloria,Bryant,5,102.5
Jennifer,Powell,6,116.0
Tyler,Smith,7,124.5
Jody,Johnson,3,102.5
Erin,Perkins,6,124.0
Kimberly,Roberts,10,137.0
Gina,Clark,6,119.0
Allison,Peterson,10,150.0
Tiffany,Bonilla,9,141.5
Jason,Knight,6,113.0
Mark,Schmitt,5,98.5
Sandra,Willis,3,104.5
Jennifer,Maldonado,3,90.5
Michael,Hendricks,8,125.0
Rachel,Rivera,5,107.5
Julie,Smith,2,96.0
Lisa,Cunningham,10,144.0
Stephanie,Cox,2,82.0
Connie,Morris,8,138.0
Kirsten,Burke,6,108.0
Vanessa,Smith,7,118.5
Donald,Williams,10,140.0
David,Noble,5,105.5
Stacy,Castillo,3,101.5
Stacey,Cardenas,6,115.0
Kimberly,Burgess,5,109.5
Jacob,Dunn,9,133.5
Rodney,Dodson,4,96.0
Glenn,Jackson,2,96.0
Donna,Moore,10,141.0
Daniel,Gordon,7,129.5
Laura,Jacobs,2,83.0
Karen,Baker,8,122.0
Justin,Patterson,4,108.0
Vicki,Robbins,3,89.5
Sophia,Medina,5,113.5
Angela,Branch,5,105.5
James,Patton,4,99.0
Latasha,Kirk,8,129.0
Karen,Moore,4,112.0
Donna,Bradshaw,9,127.5
Anna,Ward,2,95.0
Stefanie,Hoffman,7,126.5
Robert,Mendez,9,133.5
Linda,Perez,2,86.0
Alfred,Rice,10,151.0
Shelly,Frazier,4,107.0
Crystal,Burton,9,141.5
Tyler,Simon,7,113.5
John,Snow,6,109.0
Joshua,Duffy,8,124.0
Sara,Miller,7,120.5
Shane,Manning,8,119.0
Shannon,Hicks,5,99.5
Lindsay,Bush,7,118.5
Susan,Martin,7,125.5
Sandra,Reilly,5,106.5
Cynthia,Shepard,7,116.5
Johnny,Macias,6,105.0
Kiara,Lynch,7,129.5
Christopher,Johnson,10,132.0
Alex,King,4,103.0
Kathryn,Hughes,2,94.0
Kimberly,Garrett,2,79.0
Paul,Beard,5,99.5
Benjamin,Marshall,2,86.0
Maria,Martinez,7,113.5
Jennifer,Murphy,2,90.0
Andrew,Wells,8,122.0
Aimee,Williams,5,112.5
Ashlee,Reed,8,122.0
Adam,Lee,7,120.5
Daniel,Fernandez,4,112.0
Bradley,Hebert,7,124.5
Megan,Landry,8,118.0
Leroy,Whitehead,8,126.0
Tracey,Hubbard,10,148.0
Joshua,Lambert,9,125.5
Caitlin,Powell,3,98.5
Kevin,Brown,8,123.0
David,Anderson,2,95.0
Timothy,May,4,111.0
Kathryn,Williams,10,135.0
Madison,Williams,3,95.5
Angela,Manning,5,103.5
Crystal,Herring,5,98.5
Melvin,Willis,4,109.0
Christopher,Castro,4,94.0
Lisa,Harris,8,137.0
Jose,Graves,3,104.5
Jon,Green,9,128.5
Selena,Lutz,2,87.0
Eric,Mccullough,7,123.5
Angela,Johnson,2,99.0
Brenda,May,2,94.0
Arthur,Todd,3,96.5
Lisa,Jones,6,109.0
Deborah,Bryant,9,131.5
John,Ellis,10,142.0
Erik,Cook,4,104.0
Diana,Alvarez,7,119.5
Angela,Stephens,9,136.5
Lori,Cooper,2,88.0
Lisa,Miller,10,140.0
Ronald,Gomez,10,146.0
Shannon,Bass,2,96.0
William,Archer,10,139.0
Michelle,Wilson,2,93.0
Logan,Johnson,8,121.0
Jessica,Smith,8,129.0
Norma,Lee,9,125.5
Robert,Brown,2,87.0
Larry,Price,2,87.0
Brett,Saunders,6,111.0
Jennifer,Howard,10,147.0
Mary,Jones,7,123.5
Brian,Kelly,6,111.0
Rachel,Avila,3,103.5
Emily,Hart,7,118.5
Linda,Gutierrez,10,142.0
Anthony,Lang,4,96.0
James,Hughes,7,111.5
Michael,Martinez,2,97.0
Michael,Thompson,4,103.0
Christina,Valdez,7,120.5
Alexander,Bryant,6,115.0
Angela,Savage,9,136.5
Tyler,Miller,8,123.0
Brett,Atkins,2,83.0
Krystal,Garrison,2,93.0
Jose,Wong,4,102.0
Cody,Serrano,2,94.0
Matthew,Friedman,6,124.0
Thomas,Johnson,5,100.5
Regina,Garrett,10,144.0
Justin,Johnson,6,110.0
Nicholas,Moore,10,136.0
Amy,Thomas,3,105.5
Greg,Mccall,4,110.0
Danielle,Sanchez,3,101.5
Natasha,Weber,10,150.0
Sonya,Webb,8,131.0
Pamela,Gregory,6,114.0
Bradley,Allen,6,105.0
Juan,Jackson,8,126.0
John,Perez,6,122.0
Natalie,Ford,10,148.0
Nancy,Taylor,7,121.5
Annette,Smith,5,111.5
Sarah,Smith,4,92.0
Peter,Solis,10,135.0
Zoe,Smith,8,129.0
Madison,Hicks,9,125.5
Benjamin,Waters,10,144.0
Daniel,Carr,5,98.5
Kimberly,Nunez,7,127.5
Noah,Johnson,4,98.0
Samuel,Mcintyre,7,131.5
Mason,Wright,9,124.5
Devin,Dixon,5,116.5
Amanda,Jones,5,106.5
Ricky,Hopkins,4,105.0
Tammy,Reynolds,3,103.5
Ryan,Ruiz,9,131.5
Jeffrey,Foster,9,140.5
Deanna,Sanders,3,91.5
Bonnie,Houston,4,106.0
Whitney,Dyer,3,98.5
Nathan,Johnson,8,126.0
Cheryl,Wells,6,118.0
Joel,Williams,7,130.5
Marc,Yates,7,113.5
Tamara,Rodriguez,6,105.0
Natalie,Williams,9,124.5
Jennifer,Johnson,6,111.0
Robert,Berg,8,130.0
Phillip,Middleton,8,138.0
Jennifer,Munoz,8,119.0
Evan,Peterson,9,135.5
Robert,Lawrence,4,110.0
Wendy,Campbell,6,115.0
Wesley,Mahoney,2,91.0
Michael,Fuller,9,140.5
Katie,Mccoy,4,93.0
Karen,Gonzalez,3,103.5
Susan,Thompson,7,122.5
Brandy,Phillips,2,81.0
Cynthia,Carter,5,101.5
Ryan,Fleming,10,146.0
Joseph,Luna,2,89.0
Michael,Anderson,2,89.0
Christie,Martin,8,122.0
Russell,Ross,6,118.0
Charles,Page,4,111.0
Erin,Strickland,4,104.0
Amy,Spencer,6,121.0
Gary,Clark,2,84.0
Jeremy,Fox,4,96.0
Lori,Kelly,9,144.5
Nicole,Fitzgerald,2,95.0
Jennifer,Rogers,2,96.0
David,Estes,8,123.0
Melissa,Ortiz,7,129.5
Michelle,Nolan,3,87.5
Matthew,Mason,10,136.0
Martin,Neal,6,111.0
Rhonda,Rollins,6,115.0
Julia,Torres,6,113.0
Nicole,Riddle,10,145.0
Michael,Fry,4,106.0
William,Oconnell,10,135.0
Wendy,Hess,2,99.0
Frances,Moore,4,112.0
Adam,Larson,10,132.0
Janet,Walls,7,113.5
Zachary,Terry,5,118.5
Deborah,Harris,9,143.5
Dawn,Holden,5,112.5
Daniel,Barker,10,136.0
Christina,Bennett,7,131.5
Laura,Smith,4,107.0
Patricia,Roth,10,132.0
Timothy,Rodriguez,10,133.0
Shawn,Silva,10,141.0
Jon,Tucker,2,81.0
Kimberly,Livingston,3,98.5
Anna,Wilcox,7,129.5
Christian,Gates,9,134.5
Samantha,Jackson,8,134.0
Maria,Atkinson,7,131.5
Natalie,Holmes,3,89.5
Charlene,Clark,7,111.5
Jean,Sullivan,4,96.0
Andrew,Taylor,2,89.0
Paul,Mcclure,5,99.5
Annette,Hendricks,8,138.0
Sarah,Miller,2,88.0
Brianna,Cook,8,119.0
William,Gibson,4,103.0
Timothy,Garcia,3,98.5
Marissa,Henry,2,93.0
Deanna,Kennedy,7,130.5
Herbert,Weaver,6,114.0
Erik,Phelps,9,137.5
Marie,Thomas,4,92.0
Casey,Jones,9,132.5
Phillip,Benton,5,110.5
Angela,Baker,3,96.5
Jerry,Rodriguez,3,88.5
Donald,Cain,2,90.0
Dillon,Shields,2,84.0
Mackenzie,Taylor,8,137.0
Angelica,Smith,2,89.0
Michelle,Grant,9,141.5
Karina,Henry,9,139.5
Hannah,Velazquez,3,86.5
Anita,Baxter,10,143.0
Matthew,Davis,6,105.0
Adam,Perez,10,134.0
Mary,Collins,3,95.5
Jeffrey,Simpson,7,114.5
Stacey,Hicks,9,125.5
Matthew,Jones,4,108.0
Ashley,Perez,6,106.0
Michael,James,2,91.0
Katherine,Hall,7,116.5
Sharon,Newton,10,135.0
Timothy,Gilmore,4,97.0
Michael,Cruz,4,112.0
David,Osborne,5,117.5
Richard,Mason,7,111.5
Thomas,Douglas,9,144.5
Nathaniel,Moon,8,119.0
Ryan,Howard,5,105.5
Richard,Miranda,6,115.0
Victoria,Delacruz,4,99.0
Timothy,Carter,7,118.5
Sean,Cooper,4,105.0
John,Lopez,9,135.5
Stephanie,Porter,4,104.0
Daniel,Reyes,2,84.0
Paul,Palmer,2,91.0
Tina,Bray,4,96.0
Ariel,Montgomery,2,79.0
Emma,Shaw,7,127.5
Tommy,Edwards,2,80.0
Gabriella,Davis,2,82.0
Logan,Macdonald,4,96.0
Jeremy,Ponce,8,118.0
Tracey,Johnston,8,131.0
Billy,Davis,7,118.5
James,Murphy,4,103.0
Daniel,Robles,10,137.0
Michael,Hall,10,143.0
Bradley,Reyes,3,89.5
Adam,Wilson,8,136.0
Cassandra,Donovan,7,114.5
James,Figueroa,8,131.0
James,Smith,5,113.5
Peggy,Mathis,8,125.0
Carl,Thompson,8,125.0
Jimmy,Hebert,9,136.5
Patrick,Aguirre,3,93.5
Ryan,James,6,121.0
Kathy,Gilbert,7,128.5
Joseph,Aguilar,2,98.0
Jane,Wilcox,9,131.5
Erica,Davidson,6,106.0
Caitlin,Davis,7,123.5
Sierra,Espinoza,3,102.5
Joshua,Schwartz,2,83.0
Christine,Hopkins,8,118.0
Nicholas,Berry,8,130.0
Donna,Harris,8,121.0
William,Martinez,9,143.5
Daniel,Bartlett,9,129.5
David,Barry,4,102.0
Terry,Gibson,9,137.5
Matthew,Butler,4,110.0
Nicholas,Cunningham,6,121.0
Matthew,Short,3,96.5
Jesus,Wright,7,115.5
Timothy,Mcconnell,7,126.5
Andrew,Cummings,10,132.0
Adam,Sullivan,5,106.5
Kathryn,Griffin,4,110.0
Ruben,King,7,120.5
Thomas,Perez,8,138.0
Justin,Valdez,2,88.0
Carol,Noble,10,144.0
Tanya,Boyd,2,92.0
Christopher,Evans,6,117.0
April,Nunez,5,109.5
Sophia,Moore,4,96.0
Benjamin,Berg,3,104.5
Linda,Cannon,7,116.5
Taylor,Byrd,2,92.0
Robert,Lee,8,132.0
Peter,Cobb,3,105.5
Laurie,Woodward,3,98.5
Pam,Fleming,10,148.0
Abigail,Ramsey,4,97.0
Isaac,Leblanc,4,98.0
Jennifer,Wheeler,4,99.0
Ryan,Schultz,2,95.0
Christine,Nolan,4,107.0
Ashley,Lewis,7,130.5
Scott,Sanchez,6,115.0
Rebecca,Bridges,3,98.5
Angela,Gonzalez,6,110.0
Bradley,Anderson,7,131.5
Scott,Craig,10,141.0
Jeffrey,Cohen,10,135.0
Michelle,Todd,8,135.0
Ronald,Brewer,6,112.0
Vanessa,Mcclure,8,129.0
Matthew,Jennings,8,133.0
Frank,Grant,10,140.0
Michael,Kim,8,131.0
Scott,Alvarado,3,89.5
Meghan,Mcdonald,4,92.0
Dwayne,Webster,10,134.0
Karen,Finley,5,117.5
Phillip,Marshall,10,134.0
Amber,Mcclain,6,124.0
Janet,Smith,4,104.0
Jared,Nash,3,86.5
Brenda,Russo,2,82.0
Robert,Clark,7,126.5
Adam,Ferrell,7,114.5
Derek,Day,9,135.5
Tracy,James,6,120.0
Amanda,Miller,5,103.5
Tara,Vasquez,10,148.0
Alexis,Barnes,3,101.5
Katherine,Collins,4,92.0
Julie,Saunders,4,108.0
John,Faulkner,8,137.0
Anthony,Boyer,5,112.5
William,Hernandez,8,126.0
Jonathan,Watkins,2,97.0
Joshua,Roberson,4,104.0
Jeffrey,Cruz,4,106.0
William,Morgan,2,91.0
Madison,Baker,3,105.5
Christopher,Moore,8,128.0
Courtney,Elliott,5,114.5
Bryan,Kaiser,9,125.5
Timothy,Flores,9,143.5
Jessica,Johnson,3,93.5
Ryan,Beck,10,146.0
Suzanne,Gill,10,151.0
Nicole,Wilson,8,133.0
Russell,Johnson,6,110.0
Joseph,Cruz,5,115.5
Maurice,Brooks,7,116.5
Leah,Lopez,6,124.0
Makayla,Weaver,4,106.0
Matthew,Tran,3,87.5
Evan,Simpson,9,136.5
Andrew,Roy,8,135.0
Martin,Cooper,3,93.5
Wanda,Austin,9,131.5
Robert,Gibbs,3,94.5
Carlos,Duncan,4,103.0
Abigail,Callahan,3,89.5
Brenda,Washington,2,83.0
Abigail,Casey,10,137.0
Michael,Grant,2,80.0
Jasmine,Cowan,8,135.0
Heather,Hayden,9,127.5
Raymond,Lynch,9,141.5
Julie,Bailey,7,121.5
Joseph,Kaufman,3,85.5
Michael,Cooke,5,105.5
Melissa,Robinson,9,133.5
Maria,Terrell,6,112.0
Joshua,Beck,2,94.0
Kayla,Miller,7,127.5
Lori,Parker,7,113.5
Jasmine,Clements,3,94.5
Austin,Carr,8,125.0
Kelly,Wilson,7,123.5
Jennifer,Werner,4,99.0
William,Reed,6,111.0
Shane,Perry,9,135.5
Amanda,Ortiz,6,117.0
Christopher,Krause,4,95.0
Wendy,Thompson,8,129.0
John,Kim,10,146.0
Holly,Johnson,5,118.5
Mary,Little,7,131.5
Joseph,Pitts,7,124.5
Donna,Stewart,6,116.0
Amy,Krause,8,127.0
Brandon,Erickson,3,100.5
Melissa,Schwartz,6,108.0
Donald,Harper,9,128.5
Kristi,Barnes,7,118.5
Brandi,King,4,102.0
Lawrence,Stokes,9,131.5
Tracy,Cole,3,97.5
David,Brooks,8,132.0
Abigail,Hall,10,145.0
Lindsey,Reyes,5,110.5
John,Parker,10,140.0
Madison,Strong,9,131.5
Susan,Smith,7,127.5
Karen,Fox,2,81.0
David,Wright,9,134.5
Juan,Powell,8,125.0
Thomas,Zuniga,8,119.0
Brian,Fletcher,2,92.0
Melissa,Howard,3,93.5
John,Walker,5,108.5
Kevin,Thomas,4,95.0
Andrea,Wright,4,103.0
Anthony,Hall,2,86.0
Lori,Butler,2,79.0
Nicole,Acevedo,8,135.0
John,Castro,2,83.0
Nicole,Perkins,3,104.5
Gary,Wright,5,100.5
Vanessa,Evans,9,130.5
Mindy,Norton,2,95.0
Judy,Bowen,8,120.0
Kristy,Boone,10,136.0
Bryan,Jackson,5,105.5
William,Lewis,8,130.0
Amanda,Snyder,9,124.5
David,Miller,8,124.0
Alexander,Bryan,8,119.0
Christopher,Nixon,6,105.0
Tonya,Reese,7,122.5
Shannon,Hill,7,125.5
Robert,Reed,4,111.0
Randy,Barber,10,133.0
Patricia,Moore,6,108.0
John,Clark,3,93.5
Brandon,Dickerson,2,83.0
William,Jones,4,104.0
Sean,Hayes,5,116.5
Kimberly,Juarez,7,117.5
Dennis,Sims,8,134.0
William,Smith,10,134.0
Dylan,Estrada,10,134.0
Michael,Stuart,9,127.5
Warren,Barker,10,145.0
Dennis,Mendoza,9,129.5
Jessica,Bass,9,141.5
James,Sanders,7,115.5
Thomas,Alexander,8,126.0
Thomas,Phillips,8,120.0
Lindsey,Gentry,10,141.0
Michael,Little,5,112.5
Oscar,Riley,5,109.5
Heather,Mason,9,137.5
Emily,Sherman,2,93.0
Megan,Lopez,2,96.0
Javier,Robertson,8,132.0
William,Morton,5,111.5
Zachary,Mccullough,5,106.5
Kimberly,Hunter,9,139.5
Margaret,Alvarez,4,99.0
Matthew,Hamilton,9,133.5
Stephen,Santos,7,131.5
Brett,Blair,9,143.5
Tammy,Ellis,4,108.0
Casey,Harris,3,91.5
Dakota,Scott,6,121.0
Laura,Smith,3,86.5
Morgan,Ayers,4,102.0
Jackson,Thompson,2,89.0
Tammy,Ward,4,106.0
Lisa,Smith,8,124.0
Vicki,Smith,8,133.0
Cathy,Hebert,5,103.5
Marie,Collins,8,119.0
Lynn,Long,7,127.5
Vincent,Cox,5,108.5
Michael,Perkins,4,108.0
Kimberly,Lyons,10,150.0
Kayla,Smith,8,122.0
Amanda,Gomez,9,142.5
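
Before attaching the file to a data source, it can be useful to confirm that it was copied intact. The sketch below uses only the Python standard library (it is not part of the Hazy SDK) and assumes the four columns shown above: two text columns, an integer age and a floating-point height.

```python
import csv
import io

# Expected columns, in order, with the Python type each value should parse as.
EXPECTED_COLUMNS = {"first_name": str, "last_name": str, "age": int, "height": float}


def validate_children_csv(handle):
    """Check the header and parse every row; return the number of data rows."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != list(EXPECTED_COLUMNS):
        raise ValueError(f"Unexpected header: {reader.fieldnames}")
    count = 0
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        for column, cast in EXPECTED_COLUMNS.items():
            try:
                cast(row[column])
            except (TypeError, ValueError) as exc:
                raise ValueError(
                    f"Line {line_no}: bad {column!r} value {row[column]!r}"
                ) from exc
        count += 1
    return count


# Two rows taken from the data above, used as an in-memory sample.
sample = io.StringIO(
    "first_name,last_name,age,height\n"
    "Megan,Chang,8,131.0\n"
    "Robert,Green,2,87.0\n"
)
print(validate_children_csv(sample))  # 2
```

For the real file, open it with open("children.csv", newline="") and pass the handle to validate_children_csv.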

Set up training via TrainingConfig

A Hazy TrainingConfig contains the following required parameters when used to train via SynthAPI:

  • A DataSchema - defines the data and structure of the table(s).
  • A list of SecretDataSource - data sources for input. This will include the data source which contains children.csv.
  • A list of DataLocationInput - data locations for input. PathReadTableConfig is used to locate children.csv; it requires the id of the data source and the relative path to the file inside that data source.
from hazy_configurator import (
    CategoryType,
    DataLocationInput,
    DataSchema,
    FloatType,
    IntType,
    PathReadTableConfig,
    SecretDataSource,
    TabularTable,
    TrainingConfig,
)

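# Replace with the id of the data source that contains children.csv.
INPUT_DATA_SOURCE_ID = "YOUR_INPUT_DATA_SOURCE_ID"

training_config = TrainingConfig(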
    data_schema=DataSchema(
        tables=[
            TabularTable(
                name="children",
                dtypes=[
                    CategoryType(
                        col="first_name",
                    ),
                    CategoryType(
                        col="last_name",
                    ),
                    FloatType(
                        col="height",
                    ),
                    IntType(
                        col="age",
                    ),
                ],
            ),
        ],
    ),
    data_input=[
        DataLocationInput(
            name="children",
            location=PathReadTableConfig(
                connection=INPUT_DATA_SOURCE_ID, rel_path="children.csv"
            ),
        ),
    ],
    data_sources=[SecretDataSource(id=INPUT_DATA_SOURCE_ID)],
)

Key Notes

  • Hazy's DataSchema contains a list of DataTable. Each DataTable contains dtypes, which define the Hazy DataTypes for each column, and a name, which links the table's schema to its location.
  • Hazy's DataLocation currently supports .csv, .csv.gz, .parquet and .avro paths, as well as SQL Server and IBM Db2 locations.
  • The model_output parameter is not needed to train via SynthAPI. Models will be saved to a pre-configured storage folder.
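
The file formats listed above can be checked before a configuration is built. This is a small pre-flight sketch, not SDK behaviour: the helper name has_supported_suffix is hypothetical, and the comparison is case-sensitive because this guide does not say how differently cased extensions are treated.

```python
# File suffixes listed in the key notes above.
SUPPORTED_SUFFIXES = (".csv", ".csv.gz", ".parquet", ".avro")


def has_supported_suffix(path):
    """Return True if the path ends with one of the listed file suffixes."""
    # str.endswith accepts a tuple and matches any of its members.
    return path.endswith(SUPPORTED_SUFFIXES)
```

For example, has_supported_suffix("children.csv") returns True, while a spreadsheet path such as "children.xlsx" returns False.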

Note: Additional configuration (e.g. EvaluationConfig) used to tweak the evaluation of the trained model has been omitted for simplicity.

Please refer to our SDK for further TrainingConfig configuration options.

Training the model

The SynthAPI class is an interface to Hazy services over the API. Instantiate it with a Hazy UI host URL and an API authentication key. For more information about authentication with a Keycloak setup, see https://hazy.com/docs/python_sdk/tutorials/auth-synth-api-keycloak/.

Below, training takes a project id and a TrainingConfig object as input. It assumes a project with the provided id is already set up. This project should have the children.csv data source attached. Please see SynthAPI for more information about how to train on a predefined configuration instead.

import os

from hazy_client2 import SynthAPI

MODELS_FOLDER = os.environ["MODELS_FOLDER"]

# Please see https://hazy.com/docs/python_sdk/tutorials/auth-synth-api-keycloak/
API_KEY = "YOUR_API_KEY"
HAZY_HUB_HOST = "https://your/hazy/hub"
PROJECT_ID = "YOUR_PROJECT_ID"
INPUT_DATA_SOURCE_ID = "YOUR_INPUT_DATA_SOURCE_ID"


synth = SynthAPI(host=HAZY_HUB_HOST, api_key=API_KEY)

train_response = synth.train(cfg=training_config, project_id=PROJECT_ID)

# wait for training to finish

model_id = train_response["model_id"]
assert os.path.exists(
    f"{MODELS_FOLDER}/{PROJECT_ID}/{model_id}.hmf"
), "Synthesiser should generate .hmf model file!"
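
The "wait for training to finish" step is left open in the example above. One possible approach, sketched here with the standard library only, is to poll for the model file until it appears or a timeout passes. This assumes, as the assertion above does, that the models folder is readable from the machine running the script. The helper wait_for_file is hypothetical and is not part of hazy_client2.

```python
import os
import time


def wait_for_file(path, timeout=600.0, poll_interval=5.0):
    """Poll until `path` exists or `timeout` seconds pass; return True if found."""
    # time.monotonic is unaffected by system clock changes.
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A call such as wait_for_file(f"{MODELS_FOLDER}/{PROJECT_ID}/{model_id}.hmf") could stand in for the placeholder comment. Note that a file can exist before it has been fully written, so this check is only a rough signal that training has completed.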

Setting up generation via GenerationConfig

The GenerationConfig uses the .hmf file to generate synthetic data at the desired output location.

The following are required when generating via SynthAPI:

  • A list of SecretDataSource - data sources for output.
  • A list of DataLocationOutput - data locations for output. PathWriteTableConfig is used to connect to the data source that the generated data is written to, using the data source id and a relative path inside that data source.
from hazy_configurator import (
    DataLocationOutput,
    GenerationConfig,
    PathWriteTableConfig,
    SecretDataSource,
)

OUTPUT_DATA_SOURCE_ID = "YOUR_OUTPUT_DATA_SOURCE_ID"
RELATIVE_OUTPUT_LOCATION = "output/children.csv"


generation_config = GenerationConfig(
    model="children.hmf",
    data_output=[
        DataLocationOutput(
            name="children",
            location=PathWriteTableConfig(
                connection=OUTPUT_DATA_SOURCE_ID, rel_path=RELATIVE_OUTPUT_LOCATION
            ),
        ),
    ],
    data_sources=[SecretDataSource(id=OUTPUT_DATA_SOURCE_ID)],
)

Note: Additional fields (e.g. GenSampleParams, used to configure the amount of synthetic data to be generated) have been omitted for simplicity. The default magnitude of generated synthetic data is 1.0, which means the same amount of data is generated as was trained on.
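
To illustrate the arithmetic behind the magnitude setting: a magnitude of 1.0 yields as many rows as the training data, so other values can be read as a multiplier on the training row count. The function below is only a sketch of that relationship. How the synthesiser rounds fractional results is not stated here, so the use of round() is an assumption.

```python
def expected_row_count(training_rows, magnitude=1.0):
    """Approximate number of synthetic rows for a given magnitude.

    Illustrative arithmetic only; rounding behaviour is an assumption.
    """
    if magnitude < 0:
        raise ValueError("magnitude must be non-negative")
    return round(training_rows * magnitude)


print(expected_row_count(1000))       # 1000
print(expected_row_count(1000, 0.5))  # 500
```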

Please refer to our SDK for further GenerationConfig configuration options.

Generating synthetic data

Using the id of the model generated during training, we can generate synthetic data for the given GenerationConfig.

synth.generate(cfg=generation_config, model_id=model_id)

# wait for generation to finish

Combining all this together

import os

from hazy_client2 import SynthAPI
from hazy_configurator import (
    CategoryType,
    DataLocationInput,
    DataLocationOutput,
    DataSchema,
    FloatType,
    GenerationConfig,
    IntType,
    PathReadTableConfig,
    PathWriteTableConfig,
    SecretDataSource,
    TabularTable,
    TrainingConfig,
)

MODELS_FOLDER = os.environ["MODELS_FOLDER"]

# Please see https://hazy.com/docs/python_sdk/tutorials/auth-synth-api-keycloak/
API_KEY = "YOUR_API_KEY"
HAZY_HUB_HOST = "https://your/hazy/hub"
PROJECT_ID = "YOUR_PROJECT_ID"
INPUT_DATA_SOURCE_ID = "YOUR_INPUT_DATA_SOURCE_ID"
OUTPUT_DATA_SOURCE_ID = "YOUR_OUTPUT_DATA_SOURCE_ID"
RELATIVE_OUTPUT_LOCATION = "output/children.csv"


synth = SynthAPI(host=HAZY_HUB_HOST, api_key=API_KEY)

training_config = TrainingConfig(
    data_schema=DataSchema(
        tables=[
            TabularTable(
                name="children",
                dtypes=[
                    CategoryType(
                        col="first_name",
                    ),
                    CategoryType(
                        col="last_name",
                    ),
                    FloatType(
                        col="height",
                    ),
                    IntType(
                        col="age",
                    ),
                ],
            ),
        ],
    ),
    data_input=[
        DataLocationInput(
            name="children",
            location=PathReadTableConfig(
                connection=INPUT_DATA_SOURCE_ID, rel_path="children.csv"
            ),
        ),
    ],
    data_sources=[SecretDataSource(id=INPUT_DATA_SOURCE_ID)],
)

generation_config = GenerationConfig(
    model="children.hmf",
    data_output=[
        DataLocationOutput(
            name="children",
            location=PathWriteTableConfig(
                connection=OUTPUT_DATA_SOURCE_ID, rel_path=RELATIVE_OUTPUT_LOCATION
            ),
        ),
    ],
    data_sources=[SecretDataSource(id=OUTPUT_DATA_SOURCE_ID)],
)

train_response = synth.train(cfg=training_config, project_id=PROJECT_ID)

# wait for training to finish

model_id = train_response["model_id"]
assert os.path.exists(
    f"{MODELS_FOLDER}/{PROJECT_ID}/{model_id}.hmf"
), "Synthesiser should generate .hmf model file!"

synth.generate(cfg=generation_config, model_id=model_id)

# wait for generation to finish

Training and generation via SynthDocker

Set up training via TrainingConfig

A Hazy TrainingConfig contains the following required parameters:

  • A DataSchema - defines the data and structure of the table(s).
  • A list of DataLocationInput - data sources for input.
  • A model_output path for the Hazy model file (.hmf) - where the trained model will be saved.
from hazy_configurator import (
    CategoryType,
    DataLocationInput,
    DataSchema,
    FloatType,
    IntType,
    TabularTable,
    TrainingConfig,
)

training_config = TrainingConfig(
    model_output="children.hmf",
    data_schema=DataSchema(
        tables=[
            TabularTable(
                name="children",
                dtypes=[
                    CategoryType(
                        col="first_name",
                    ),
                    CategoryType(
                        col="last_name",
                    ),
                    FloatType(
                        col="height",
                    ),
                    IntType(
                        col="age",
                    ),
                ],
            ),
        ],
    ),
    data_input=[
        DataLocationInput(name="children", location="children.csv"),
    ],
)

Key Notes

  • Hazy's DataSchema contains a list of DataTable. Each DataTable contains dtypes, which define the Hazy DataTypes for each column, and a name, which links the table's schema to its location.
  • Hazy's DataLocation currently supports .csv, .csv.gz, .parquet and .avro paths, as well as SQL Server and IBM Db2 locations.
  • The model_output path is where the Hazy model file (.hmf) will be stored.
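
As a small illustration of the file formats listed above, a path could be checked before it is passed to DataLocationInput. The sketch below uses only the standard library. has_supported_suffix and SUPPORTED_SUFFIXES are hypothetical names, the suffix list is copied from the note above rather than read from the SDK, and the check is case-sensitive by assumption.

```python
# Suffixes copied from the Key Notes above; not queried from the Hazy SDK.
SUPPORTED_SUFFIXES = (".csv", ".csv.gz", ".parquet", ".avro")


def has_supported_suffix(path: str) -> bool:
    """Return True if `path` ends with one of the listed file suffixes.

    Hypothetical helper for illustration; database locations are not covered.
    """
    # str.endswith accepts a tuple and matches any of its entries.
    return path.endswith(SUPPORTED_SUFFIXES)
```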

Note: Additional configuration (e.g. EvaluationConfig) used to tweak the evaluation of the trained model has been omitted for simplicity.

Please refer to our SDK for further TrainingConfig configuration options.

Training the model

The SynthDocker class provides an abstraction for working with a local Hazy synthesiser. An object of this class takes a TrainingConfig as input and writes a .hmf model file to the location specified in the training_config.

import os
from os.path import exists

from hazy_client2 import SynthDocker

DOCKER_IMAGE = "docker_image:tag"

# replace these with specific IDs if required
DOCKER_USER_ID = os.getuid()
DOCKER_GROUP_ID = os.getgid()

synth = SynthDocker(
    image=DOCKER_IMAGE,
    container_user_default=f"{DOCKER_USER_ID}:{DOCKER_GROUP_ID}",
    features_file="/path/to/your/features.json",
    features_sig="/path/to/your/features.sig.json",
)
synth.train(cfg=training_config)

assert exists("children.hmf"), "Synthesiser should generate .hmf model file!"

Tip: Change the log level via export HAZY_LOGLEVEL=INFO to increase visibility in the logs (default: WARNING).
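
If exporting the variable in the shell is inconvenient, the same variable might be set from Python before the client is created. This is only a sketch: it assumes the SDK reads HAZY_LOGLEVEL from the process environment after this point, which the tip above does not confirm.

```python
import os

# Assumption: HAZY_LOGLEVEL is read from the process environment later on.
# setdefault keeps any value already exported in the shell.
os.environ.setdefault("HAZY_LOGLEVEL", "INFO")
```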

Setting up generation via GenerationConfig

The GenerationConfig uses the .hmf file to generate synthetic data at the desired output location.

from hazy_configurator import DataLocationOutput, GenerationConfig

generation_config = GenerationConfig(
    model="children.hmf",
    data_output=[
        DataLocationOutput(
            name="children",
            location="output/children.csv",
        ),
    ],
)

Note: Additional fields (e.g. GenSampleParams, used to configure the amount of synthetic data to be generated) have been omitted for simplicity. The default magnitude of generated synthetic data is 1.0, which means the same amount of data is generated as was trained on.
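
The magnitude described above acts as a multiplier on the number of training rows. The arithmetic can be sketched as follows. expected_rows is a hypothetical helper, and the rounding to a whole number of rows is an assumption, since the note does not say how fractional results are handled.

```python
def expected_rows(n_training_rows: int, magnitude: float = 1.0) -> int:
    """Estimate the number of generated rows for a given magnitude.

    Hypothetical helper; rounding to the nearest integer is an assumption.
    """
    return round(n_training_rows * magnitude)
```

With the default magnitude of 1.0, a 10-row training table gives expected_rows(10) == 10.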

Please refer to our SDK for further GenerationConfig configuration options.

Generating synthetic data

Using a Docker image of the Hazy synthesiser, we can generate synthetic data for the given GenerationConfig.

from os.path import exists

synth.generate(cfg=generation_config)  # same synth as previous snippet

assert exists("output/children.csv"), "Synthesiser should generate synthetic data!"
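
The existence check above confirms only that a file was written. A slightly stronger check, sketched below with the standard csv module, compares the header of the generated file with the columns declared in the schema. read_header and EXPECTED_COLUMNS are hypothetical names, and the sketch assumes the output CSV starts with a header row that uses the same column names as the training table.

```python
import csv
from typing import List

# Column names taken from the "children" table schema used in this guide.
EXPECTED_COLUMNS = {"first_name", "last_name", "age", "height"}


def read_header(path: str) -> List[str]:
    """Return the first row of a CSV file, or an empty list if the file is empty.

    Hypothetical helper for illustration; not part of the Hazy SDK.
    """
    with open(path, newline="") as f:
        return next(csv.reader(f), [])
```

After generation, set(read_header("output/children.csv")) == EXPECTED_COLUMNS could then be asserted alongside the existence check.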

Combining all this together

import os
from os.path import exists

from hazy_client2 import SynthDocker
from hazy_configurator import (
    CategoryType,
    DataLocationInput,
    DataLocationOutput,
    DataSchema,
    FloatType,
    GenerationConfig,
    IntType,
    TabularTable,
    TrainingConfig,
)

DOCKER_IMAGE = "docker_image:tag"
DOCKER_USER_ID = os.getuid()
DOCKER_GROUP_ID = os.getgid()

synth = SynthDocker(
    image=DOCKER_IMAGE,
    container_user_default=f"{DOCKER_USER_ID}:{DOCKER_GROUP_ID}",
    features_file="/path/to/your/features.json",
    features_sig="/path/to/your/features.sig.json",
)

training_config = TrainingConfig(
    model_output="children.hmf",
    data_schema=DataSchema(
        tables=[
            TabularTable(
                name="children",
                dtypes=[
                    CategoryType(
                        col="first_name",
                    ),
                    CategoryType(
                        col="last_name",
                    ),
                    FloatType(
                        col="height",
                    ),
                    IntType(
                        col="age",
                    ),
                ],
            ),
        ],
    ),
    data_input=[
        DataLocationInput(
            name="children",
            location="children.csv",
        ),
    ],
)

generation_config = GenerationConfig(
    model="children.hmf",
    data_output=[
        DataLocationOutput(
            name="children",
            location="output/children.csv",
        ),
    ],
)

synth.train(cfg=training_config)
assert exists("children.hmf"), "Synthesiser should generate .hmf model file!"

synth.generate(cfg=generation_config)
assert exists("output/children.csv"), "Synthesiser should generate synthetic data!"

Further Reading

The above provides a very simple guide for generating synthetic data via Hazy's SDK.

Advanced features include:

Furthermore, our Complex Examples are provided for more detailed configuration requirements.