Setting up a tutorial SQL Server database
The following sets up a SQL Server (Azure SQL Edge) database instance for the Using a database for storage tutorial.
- Fetch the
mcr.microsoft.com/azure-sql-edge
image and start a Docker container:
docker run -itd --cap-add SYS_PTRACE -e 'ACCEPT_EULA=1' -e 'MSSQL_SA_PASSWORD=Hazy123!' -p 1433:1433 --name hazy-tutorial-mssql -d mcr.microsoft.com/azure-sql-edge
Disclaimer: By using pulling the mcr.microsoft.com/azure-sql-edge
image and providing ACCEPT_EULA=1
in the above command, the user agrees to the license terms for Azure SQL Edge which can be found on Docker Hub.
- The container takes approximately 2 minutes to set up. You can run the following command to check if the setup is complete:
docker logs -f hazy-tutorial-mssql | grep 'Recovery is complete. This is an informational message only. No user action is required.'
- Install Python packages required for seeding the database. You can store the following in a
requirements.txt
file and runpip install -r requirements.txt
:
numpy==1.23.*
pandas==1.5.*
SQLAlchemy==2.0.*
SQLAlchemy-Utils==0.38.*
pyodbc==4.0.*
ibm-db-sa==0.4.*
Faker==15.0.*
- Seed the
children
dataset into the SQL Server instance via the following snippet:
Note: The following code requires unixodbc
as well as the ODBC Driver 17 for SQL Server
installed on your machine and correctly configured in obdcinst.ini
.
import random
import pandas as pd
from faker.factory import Factory
from sqlalchemy.engine import URL, create_engine
from sqlalchemy.types import Float, SmallInteger, String
random.seed(0)
def build_children() -> pd.DataFrame:
fake = Factory.create()
fake.seed(0)
res = []
for _ in range(1000):
age = random.randint(2, 10)
res.append(
{
"first_name": fake.first_name(),
"last_name": fake.last_name(),
"age": age,
"height": age * 6.5 + 76 + random.randint(-10, 10),
}
)
return pd.DataFrame.from_records(res)
def upsert_dataset(df: pd.DataFrame, conn: URL):
df.to_sql(
name="children",
con=create_engine(conn, echo=True),
if_exists="append",
index=True,
)
def build_connection_string() -> str:
server = "localhost"
username = "SA"
password = "Hazy123!"
database = "Master"
driver = "ODBC Driver 17 for SQL Server"
return f"mssql+pyodbc://{username}:{password}@{server}/{database}?driver={driver}"
def main():
conn = build_connection_string()
df = build_children()
upsert_dataset(df, conn)
if "__main__" == __name__:
main()