Day 24 of 50 Days of Python: Using SQLAlchemy or PYODBC to Interact with Databases.
Part of Week 4: Python for Data Engineering
Part of Day 24 is working with Python package that allow you to interact with database management systems. Whether you’re building pipelines, managing app data or performing analytics, knowing how to connect and query databases via Python is a must-have skill.
Why use Python to Interact with Databases?
As everyone data professional knows, databases are the backbone of most if not all applications. They mostly store tabular data in the form of tables and dependant on your modelling strategy you can store raw, enriched and curated data (bronze, silver, gold). When using Python you have some powerful packages at you fingertips to manage and interact with databases. Allowing you to:
Query data
INSERT, UPDATE, or DELETE records
Build ETL or ELT pipelines to move data between systems (API to DB).
With all that being said, let’s move onto some working examples with SQLAlchemy and PYODBC.
SQLAlchemy
SQLAlchemy is a great package which allows you to interact with databases using Python Objects instead of writing raw SQL queries. It supports multiple databases systems like SQLite, PostgreSQL, MySQL and many more.
Installing SQLAlchemy is the same as what we’ve discussed in other Days:
pip install sqlalchemy
Connecting to a database requires a connection string as well as the creation of an engine which is essential for using SQLAlchemy, I’ll be using SQLite as the example:
from sqlalchemy import create_engine
# SQLite connection string
connection_string = "sqlite:///mydatabase.db"
# Create an engine
engine = create_engine(connection_string)
I’ll give you some high-level examples on how to interact with a DB using this package to define tables and insert data as well as query it.
Defining a Table
from sqlalchemy import Column, Integer, String, MetaData, Table
metadata = MetaData()
# Define the table
users = Table(
"users",
metadata,
Column("id", Integer, primary_key=True),
Column("name", String),
Column("age", Integer),
)
# Create the table in the database
metadata.create_all(engine)
Inserting Data
from sqlalchemy import insert
# Inserting a record
with engine.connect() as connection:
statement = insert(users).values(name="Alice", age=30)
connection.execute(statement)
connection.commit()
Querying the Table
from sqlalchemy import select
# Query all users
with engine.connect() as connection:
query = select(users)
result = connection.execute(query)
for row in result:
print(row)
PYODBC
If you’re SQL is up to scratch and you prefer to write your own queries then PYODBC is for you, especially if you want to connect to databases using ODBC like when using MS SQL Server.
Installing PYODBC
pip install pyodbc
Connecting to a SQL Server DB
import pyodbc
# Connection string
connection_string = (
"DRIVER={SQL Server};"
"SERVER=your_server_name;"
"DATABASE=your_database_name;"
"UID=your_username;"
"PWD=your_password;"
)
# Establish a connection
conn = pyodbc.connect(connection_string)
Querying Tables
# Create a cursor
cursor = conn.cursor()
# Execute a query
cursor.execute("SELECT * FROM users")
# Fetch results
for row in cursor:
print(row)
Inserting Data into a Table
cursor.execute("INSERT INTO users (name, age) VALUES (?, ?)", "Bob", 25)
conn.commit() # This is needed to actually push changes, so don't forget it!!
SQLAlchemy vs. PYODBC: Which Should You Use?
SQLAlchemy: Use it if you want a high-level, “Pythonic” way to interact with databases. It’s great for when you want to avoid writing raw SQL.
PYODBC: Use it if you need lightweight, direct access to databases using ODBC or prefer writing raw SQL queries.
Data Engineering Week Task 01
Install SQLAlchemy or PYODBC (or both!).
Connect to a database (e.g., SQLite, PostgreSQL, or SQL Server).
Create a table, insert some data, and query it.
Share your code in a branch of the git repo or any questions in the comments below!
GitHub Reminder
Each time I publish an additional day, I’ll be updating the repo with example code and resources. If you’re familiar with Git then you’ll know you’ll need to pull the latest changes each time.
However, if you’re just downloading it now as part of this series then you’ll have to run a command in the terminal to keep your branch up to date:
git pull origin master
Next Up: Day 25 - Building ETL Pipelines in Python
Day 25, we’ll delve into building ETL pipelines in python. We’ll go through how to extract, transform, and load data using Python, building on today’s database interaction skills.
Let me know how you get on… happy coding!
Hi Jonathon, thanks again for the post.
Just wanted to note in case someone else experiences the same issue, that when I was trying to query the users table using SQLAlchemy package, it wasn't returning anything, but it also wan't throwing any errors. I had trouble with this for a while and went through my script line by line, commented lines out, used repr and keys methods to confirm the table existed.
I even created a SQLAlchemy table from a sqlite file I found on Datacamp and finally the select script worked.
This confirmed the issue was with the insert even though that also wasn't returning any error messages.
It turns out, after applying echo=True in the create_engine call, that the insert statement was being rolledback. So similar to what you've described in the PYODBC section, you also have to use the commit method in SQLAlchemy to push the insert.
Finally I can move on :)