Hazy Release 0.0.0 Upgrade Guide

Authentication

In order to complete the upgrade, you will need a set of login credentials. These will have been provided separately. The following tokens are used in this document to indicate where the installation credentials should be entered.

  • [Username]
  • [Password]

Update Python Client

If Python is used to run training or generation, whether from Jupyter notebooks or a script file, then the Python client needs to be updated to the latest version.

The Python client can be installed from Hazy's private Python package repository.

In your ~/.pip/pip.conf file, or inside a virtual environment at $VIRTUAL_ENV/pip.conf, you can add the Hazy package repository as follows. You may need to create the .pip directory and the pip.conf file if they do not already exist.

# ~/.pip/pip.conf
[global]
extra-index-url = https://[Username]:[Password]@release.hazy.com/simple
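
To confirm that pip has picked up the extra index, you can list the active configuration. This is a quick check using standard pip commands rather than a Hazy-specific step:

# The output should include the extra-index-url entry added above
pip config list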

Then install the Python SDK packages:

pip install hazy-client2==0.0.0
pip install hazy-configurator==0.0.0

This needs to be run in every virtual environment and on every machine used for training and generation. If JupyterLab is being used, the environment will need to be restarted to pick up the updated client.
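
To verify that the expected versions are installed in a given environment, standard pip commands can be used, for example:

# Both packages should report the version installed above
pip show hazy-client2 hazy-configurator | grep -E '^(Name|Version)'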

New Synthesiser Images

The version number of the synthesiser needs to be changed to “0.0.0”, and it should be confirmed that the path to the synthesiser is hazy/… This should be done by the owners of the Jupyter notebooks.

For example:

image = "hazy/multi-table:0.0.0"

The Hazy license files now need to be provided explicitly to the docker containers and can be passed to the client as arguments to the SynthDocker constructor. The license files will be provided by Hazy.

Following the getting started guide in the docs, this can be used as:

synth = SynthDocker(
    image=image,
    features_file="features.json",
    features_sig="sig.json"
)
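
Before starting training, it can be worth checking that the license files exist and are readable from the directory in which the Python client runs. This assumes, as in the example above, that relative paths are resolved against the client's working directory; the file names are those from the example and may differ in your installation:

# Both license files should be listed; adjust the names if yours differ
ls -l features.json sig.json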

Backing up the hub database

The hub database can be backed up from the command line. An example is below:

docker run --rm --name hazy-configurator-backup                                    `# see note 1.` \
    -v /host/data/volume:/configurator-app-data                                    `# see note 2.` \
    --entrypoint sqlite3                                                           `# see note 3.` \
    hazy/multi-table:0.0.0                                                         `# see note 4.` \
    /configurator-app-data/db ".backup /configurator-app-data/db.bak_<timestamp>"  `# see note 5.`

Notes:

  • [1] A separate container is started to perform the backup.
  • [2] The host volume on which the sqlite3 database is stored must be mounted into the backup container.
  • [3] The sqlite3 command line tool is used to safely perform an online backup.
  • [4] The same synth image is used to manage backups.
  • [5] The sqlite3 tool provides a .backup command, which is used in this example. Apply a naming convention, as in the example, so that backups can be taken in line with internal policy (for example, daily); a sketch of a timestamped backup script follows these notes.
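
As a sketch of note 5, the backup command can be wrapped in a small script that generates the timestamp automatically; the script can then be scheduled (for example with cron) in line with internal policy. The host path and image tag are the same placeholders used in the example above:

#!/bin/sh
# Timestamped online backup of the hub database (same command as the example above)
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
docker run --rm --name hazy-configurator-backup \
    -v /host/data/volume:/configurator-app-data \
    --entrypoint sqlite3 \
    hazy/multi-table:0.0.0 \
    /configurator-app-data/db ".backup /configurator-app-data/db.bak_${TIMESTAMP}"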

Migrating the hub database

The hub database can be migrated from the command line. An example is below:

docker run --rm --name hazy-configurator-migrate                      `# see note 1.` \
    -v /host/data/volume:/configurator-app-data                       `# see note 2.` \
    -e CONFIGURATOR_DB_URI=sqlite+aiosqlite:////configurator-app-data/db                        \
    hazy/multi-table:0.0.0                                            `# see note 3.` \
    migrate upgrade head                                              `# see note 4.`

Notes:

  • [1] A separate container is started to perform the migration.
  • [2] The host volume on which the sqlite3 database is stored must be mounted into the migration container.
  • [3] The same synth image is used to manage migrations.
  • [4] Migrate to the latest version of the database.
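
Because the migration modifies the database in place, a sensible sequence is to take a backup first and then apply the migration. The sketch below simply combines the two commands shown above; whether a pre-migration backup is required is a matter of internal policy:

# 1. Take an online backup before migrating (see the backup section above)
docker run --rm --name hazy-configurator-backup \
    -v /host/data/volume:/configurator-app-data \
    --entrypoint sqlite3 \
    hazy/multi-table:0.0.0 \
    /configurator-app-data/db ".backup /configurator-app-data/db.bak_pre-migration"

# 2. Migrate the database to the latest version
docker run --rm --name hazy-configurator-migrate \
    -v /host/data/volume:/configurator-app-data \
    -e CONFIGURATOR_DB_URI=sqlite+aiosqlite:////configurator-app-data/db \
    hazy/multi-table:0.0.0 \
    migrate upgrade head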

Running the hub UI

Previously you would have run something like the following:

# start-configurator-ui
CONFIGURATOR_INPUT=/host/input/volume
docker run                                         \
  --detach                                         \
  --name hazy-configurator                         \
  -p 5001:5001                                     \
  -v $CONFIGURATOR_INPUT:/configurator-input:ro    \
  -v /host/data/volume:/data                       \
  -e CONFIGURATOR_INPUT=$CONFIGURATOR_INPUT        \
  -e CONFIGURATOR_DB_URI=sqlite+aiosqlite:////data/db        \
  release.hazy.com/hazy/multi-table:0.0.0  \
  configure

You will now run something like:

# start-configurator-ui
CONFIGURATOR_INPUT=/host/input/volume                      # see note 1.
CONFIGURATOR_OUTPUT=/host/output/volume                    # see note 2.
CONFIGURATOR_APP_DATA=/host/data/volume                    # see note 3.
CONFIGURATOR_ENV_FILE=$(pwd)/.env                          # see note 4.
CONFIGURATOR_DOCKER_IMAGE="hazy/multi-table:0.0.0" # see note 5.
CONFIGURATOR_PORT=5001                                     # see note 6.
CONFIGURATOR_FEATURES=$(pwd)/features.json                 # see note 7.
CONFIGURATOR_FEATURES_SIG=$(pwd)/features.sig.json         # see note 7.
docker run                                                           \
  --detach                                                           \
  --name hazy-configurator                                           \
  -p $CONFIGURATOR_PORT:5001                                         \
  -v $CONFIGURATOR_APP_DATA:/configurator-app-data                   \
  -v $CONFIGURATOR_INPUT:/configurator-input:ro                      \
  -v $CONFIGURATOR_OUTPUT:/configurator-output                       \
  -v $CONFIGURATOR_FEATURES:/var/lib/hazy/features.json              \
  -v $CONFIGURATOR_FEATURES_SIG:/var/lib/hazy/features.sig.json      \
  -e CONFIGURATOR_INPUT=$CONFIGURATOR_INPUT                          \
  -e CONFIGURATOR_OUTPUT=$CONFIGURATOR_OUTPUT                        \
  -e CONFIGURATOR_APP_DATA=$CONFIGURATOR_APP_DATA                    \
  -e CONFIGURATOR_DB_URI=sqlite+aiosqlite:////configurator-app-data/db         \
  -e CONFIGURATOR_FEATURES=$CONFIGURATOR_FEATURES                    \
  -e CONFIGURATOR_FEATURES_SIG=$CONFIGURATOR_FEATURES_SIG            \
  --env-file $CONFIGURATOR_ENV_FILE                                  \
  $CONFIGURATOR_DOCKER_IMAGE                                         \
  configure

Notes:

  • [1] A volume must be mounted into the docker container that enables the configurator to access sample source data for analysis. The host volume can be anywhere on the filesystem as long as permissions are set such that it can be mounted readable by the container. Here we specify the path as an environment variable and later provide it to the container as both a bind mount and an environment variable.
  • [2] A volume must be mounted into the docker container that enables the configurator to dump synthetic data. The host volume can be anywhere on the filesystem as long as permissions are set such that it can be mounted writeable by the container. Here we specify the path as an environment variable and later provide it to the container as both a bind mount and an environment variable.
  • [3] A host volume must be mounted that can be used to store the application's data. An sqlite3 database will be created in that folder by the application when it is first run. If the user wishes to connect to external data sources, both for reading input data and for writing output, the connection details must be placed in a data_sources.json file.
  • [4] This .env file is for storing secrets as key-value pairs. data_sources.json should contain no secrets, only references to environment variables whose values are stored in this file. The file should be protected with appropriate permissions so that only authorised users can read its contents; an example layout is sketched after these notes.
  • [5] This is the standard synth image which also houses the configurator. Hazy will inform the client which docker image to use.
  • [6] The configurator web server runs on port 5001; for the web UI to be accessible from the docker host, the port must be mapped. The example maps container port 5001 to the same port number on the host. The host port number (the left-hand side of the mapping) can be modified if necessary.
  • [7] As of version 2.3.0, licensing files are distributed separately and must be mounted into the container at launch in order to activate correctly.
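
As a sketch of note 4, the file passed via --env-file contains plain KEY=VALUE pairs. The variable names below are hypothetical examples of secrets that entries in data_sources.json might reference by name; use whatever names your data source definitions expect:

# .env - keep permissions restrictive, for example: chmod 600 .env
# Hypothetical secret names referenced from data_sources.json
SOURCE_DB_PASSWORD=change-me
OUTPUT_DB_PASSWORD=change-me

Once the container is running, docker logs hazy-configurator will show the start-up output, and the UI should be reachable on the mapped host port (5001 in the example above).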

Running Hazy Cloud

Please refer to the Hazy Docs for Hazy Cloud installation instructions.