Setting up a tutorial Db2 database
The following sets up an IBM Db2 Community Edition database instance for the Using a database for storage tutorial.
- Fetch the
ibmcom/db2
image and start a Docker container:
docker run -itd --name hazy-tutorial-db2 \
-e LICENSE=accept -e DB2INST1_PASSWORD=Hazy123! -e DBNAME=Master \
-e ARCHIVE_LOGS=false -e AUTOCONFIG=false \
--privileged=true \
-p 50000:50000 \
ibmcom/db2
Disclaimer: By using pulling the ibmcom/db2
image and providing LICENSE=accept
in the above command, the user agrees to the license terms for IBM Db2 Community Edition (Container Format).
- The container takes approximately 15 minutes to set up. You can run the following command to check if the setup is complete. When finished, you should see a
(*) Setup has completed
message in your terminal.
docker logs -f hazy-tutorial-db2 | grep '(*) Setup has completed.'
- Install Python packages required for seeding the database. You can store the following in a
requirements.txt
file and runpip install -r requirements.txt
:
numpy==1.23.*
pandas==1.5.*
SQLAlchemy==2.0.*
SQLAlchemy-Utils==0.38.*
pyodbc==4.0.*
ibm-db-sa==0.4.*
Faker==15.0.*
- Seed the
children
dataset into the Db2 instance via the following snippet:
import random
import pandas as pd
from faker.factory import Factory
from sqlalchemy.engine import URL, create_engine
from sqlalchemy.types import Float, SmallInteger, String
random.seed(0)
def build_children() -> pd.DataFrame:
fake = Factory.create()
fake.seed(0)
res = []
for _ in range(1000):
age = random.randint(2, 10)
res.append(
{
"first_name": fake.first_name(),
"last_name": fake.last_name(),
"age": age,
"height": age * 6.5 + 76 + random.randint(-10, 10),
}
)
return pd.DataFrame.from_records(res)
def upsert_dataset(df: pd.DataFrame, conn: URL):
df.to_sql(
name="children",
con=create_engine(conn, echo=True),
if_exists="append",
index=True,
dtype={
"first_name": String(length=255),
"last_name": String(length=255),
"age": SmallInteger(),
"height": Float(),
},
)
def build_connection_string() -> str:
return URL.create(
drivername="db2",
database="Master",
host="localhost",
port=50000,
username="db2inst1",
password="Hazy123!",
)
def main():
conn = build_connection_string()
df = build_children()
upsert_dataset(df, conn)
if "__main__" == __name__:
main()