This extends @edd's answer, which works in a limited capacity.
@edd provided:
import sqlalchemy
engine = sqlalchemy.create_engine(...)
sqlalchemy.String('').literal_processor(dialect=engine.dialect)(value="untrusted value")
If your "untrusted value" is a query you want to execute, this produces a double-quoted string wrapping a single-quoted string, i.e. "'SELECT ...'", which you can't execute directly without stripping the quotes.
You can use sqlalchemy.Integer().literal_processor to do the same thing, but the result will not have the extra inner quotes, because it is intended to create an integer like 5 instead of a string like '5'. So your result will only be quoted once: "SELECT ...".
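To make the quoting behaviour concrete, here is a minimal sketch of what the String literal processor does to a value. It assumes a standalone PostgreSQL dialect object (postgresql.dialect()) in place of engine.dialect, so that no database connection is needed; the sample value is made up for illustration.

```python
import sqlalchemy
from sqlalchemy.dialects import postgresql

# Assumption: a standalone dialect object stands in for engine.dialect,
# so this runs without connecting to a database.
dialect = postgresql.dialect()

to_literal = sqlalchemy.String("").literal_processor(dialect=dialect)

# Embedded single quotes are doubled and the whole value is wrapped in
# single quotes, so the result is a SQL string literal, not runnable SQL.
quoted = to_literal("SELECT 'a'")
print(quoted)
```

The printed value is a string literal containing the query text, which is why it cannot be executed as a query without stripping the outer quotes.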
I found this Integer approach a little sketchy: will the person who reads my code know why I'm doing this? For psycopg2 at least, there is a more direct and clearer approach.
If your underlying driver is psycopg2, you can use SQLAlchemy to reach down into the driver, get the cursor, then use psycopg2's cursor.mogrify to bind and escape your query:
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)
session = Session()
cursor = session.connection().connection.cursor()
processed_query = cursor.mogrify([mogrify args, see docs]).decode("UTF-8")
I got how to grab the cursor from this answer: SQLAlchemy, Psycopg2 and Postgresql COPY
And mogrify from this answer: psycopg2 equivalent of mysqldb.escape_string?
My use case was building a query, then wrapping it in parentheses and aliasing it like (SELECT ...) AS temp_some_table, in order to pass it to PySpark JDBC read. When SQLAlchemy builds the queries, it minimizes the parentheses, and so I could only get SELECT ... AS temp_some_table. I used the above approach to get what I needed:
cursor = session.connection().connection.cursor()
aliased_query = cursor.mogrify(
f"({query}) AS temp_{model.__tablename__}"
).decode("UTF-8")
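Note that in the snippet above the f-string does the interpolation before mogrify ever sees the text, so the wrapping step itself is plain string formatting. As a rough sketch of just that step, without a database connection: the helper name as_jdbc_subquery, the alias check, and the sample query are all hypothetical and not part of SQLAlchemy or psycopg2. It does no value escaping, so any parameters in the query should already be bound.

```python
import re


def as_jdbc_subquery(sql: str, alias: str) -> str:
    """Wrap a SQL string in parentheses and alias it (hypothetical helper)."""
    # The alias is interpolated directly, so only accept a plain identifier.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", alias):
        raise ValueError(f"unsafe alias: {alias!r}")
    # A trailing semicolon would be invalid inside a parenthesized subquery.
    body = sql.strip().rstrip(";").strip()
    return f"({body}) AS {alias}"


aliased = as_jdbc_subquery("SELECT id, name FROM users;", "temp_users")
print(aliased)
```

The resulting string has the (SELECT ...) AS temp_some_table shape described above and could be handed to the JDBC read in place of a table name.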