25

I am using SQLAlchemy to connect to different databases in Python, but not with the ORM support as this cannot be implemented due to several reasons.

Mainly I do build a complex SQL query using things like

sql += "AND fieldname = '%s'" % myvar

In my case is not a problem of SQL injection as the data is always from a trusted source but even if the source is trusted it could contain characters that could break the query like ', % or _.

Mainly, I need to escape them, and I wonder if there is an already existing escape function that I could re-use.

6 Answers 6

23

You should not try to implement your own escaping, but should instead use SQLAlchemy's builtin method:

sql = 'select * from foo where fieldname = :name'
result = connection.execute(sql, name = myvar)
7
  • 38
    @BrtH it does not matter if it was a second or a year ago, the answer is still being read by other users.
    – SimonT
    Commented Jul 11, 2016 at 21:00
  • 6
    Well I'm still reading this today and I have no idea what you guys were talking about when you are saying the 'sql expression layer'
    – Majiick
    Commented Nov 18, 2018 at 22:47
  • 3
    Yeah, this solution doesn't allow someone to build an aggregate insert using the INSERT INTO table_name (col1, col2) VALUES ('col1', 'col2) format. =P For this, you really need to genuinely escape values prior to collecting them.
    – tpartee
    Commented Apr 27, 2019 at 1:19
  • 2
    @tpartee And for that as well there are better methods than manually escaping. Commented Oct 17, 2020 at 7:26
  • I found this looking for an answer to a question "How to escape SQL strings manually" and the top answer is "don't"... It can by good advice but you could nevertheless write how to do it.
    – user14611663
    Commented Jan 27, 2022 at 17:29
18

In cases where one must explicitly escape a string, and the standard tools don't align with the requirement, you can ask SQLAlchemy to escape using an engine's dialect.

import sqlalchemy


engine = sqlalchemy.create_engine(...)
sqlalchemy.String('').literal_processor(dialect=engine.dialect)(value="untrusted value")

In my case, I needed to dynamically create a database (sqlalchemy-utils has this functionality but it failed in my case) according to user input.

1
  • 5
    Good answer but incomplete. Example doesn't actually do anything. it just puts "untrusted value" into " 'untrusted value' " with quotes. Pretty sure you need to clarify with a proper example not to mention doesn't explain how you would write a 'SELECT * FROM :table_dynamic" type thing. Commented Jan 25, 2021 at 21:39
4

To extend @edd 's answer, which works in a limited capacity.

@edd provided:

import sqlalchemy

engine = sqlalchemy.create_engine(...)
sqlalchemy.String('').literal_processor(dialect=engine.dialect)(value="untrusted value")

If your "untrusted value" is a query you want to execute, this will end up a double-quoted string wrapping a single-quoted string, which you can't directly execute without stripping the quotes, i.e. "'SELECT ...'".

You can used sqlalchemy.Integer().literal_processor to do the same thing, but the result will not have the extra inner quotes, because it is intended to create an integer like 5 instead of a string like '5'. So your result will only be quoted once: "SELECT ...".

I found this Integer approach a little sketchy - is the person that reads my code going to know why I'm doing this? For psycopg2 at least, there is a more direct and clear approach.

If your underlying driver is psycopg2, you can use sqlalchemy to reach down into the driver, get the cursor, then use psycopg2's cursor.mogrify to bind & escape your query

from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)
session = Session()
cursor = session.connection().connection.cursor()
processed_query = cursor.mogrify([mogrify args, see docs]).decode("UTF-8")

I got how to grab the cursor from this answer: SQLAlchemy, Psycopg2 and Postgresql COPY

And mogrify from this answer: psycopg2 equivalent of mysqldb.escape_string?

My use case was building a query, then wrapping it in parantheses and aliasing like (SELECT ...) AS temp_some_table, in order to pass it to PySpark JDBC read. When SQLAlchemy builds the queries, it minimizes the parentheses, and so I could only get SELECT ... AS temp_some_table. I used the above approch to get what I need:

cursor = session.connection().connection.cursor()
aliased_query = cursor.mogrify(
    f"({query}) AS temp_{model.__tablename__}"
).decode("UTF-8")
4

there also exists escape_like

from sqlalchemy_utils import escape_like


User.name.ilike(escape_like('John'))

So you can call escape_like for all your strings before formatting them to sql.

Or if you don't want to add extra therdparty lib, you can do it like

def escape_sql_search_text(text: str) -> str:
    return text.replace("%", "\\%").replace("\\", "\\\\").replace("_", "\\_")

User.name.ilike(escape_sql_search_text('John')) 

if you want to use f-string in your sql (don't recommend to do it), then you elso need .replace("'", "''")

def _escape_sql_search_text_for_fstring(text: str) -> str:
    return text.replace("%", "\\%").replace("\\", "\\\\").replace("_", "\\_").replace("'", "''")

ilike_text = _escape_sql_search_text_for_fstring(text)
sql = f"name::varchar ilike '{ilike_text}'"
1
  • 2
    Note this is from SQAlchemy-utils, which is a separate library.
    – bfontaine
    Commented Feb 11, 2022 at 20:30
2

Compiling and completing other contributors' answers.

Writing direct SQL strings is often a bad solution, because each database system supports its own SQL dialect, and hence SQL strings are often not portable across databases.

To abstract this issue away from the users, SQLAlchemy invites you to write your SQL queries at a slightly higher level, in a more object-oriented fashion. It's called the SQL Expression language, and it's documented here: https://docs.sqlalchemy.org/en/13/core/tutorial.html

Basically, you can build expressions in Python that describe what you want to do, and SQLAlchemy will produce the corresponding SQL string for you, using the appropriate dialect for the database you're using.

Since you are familiar with SQL, you can learn this "mini-language" in a matter of hour(s) (I'm hesitant to put an "s" here).

And if I believe @BrtH, using this system will also escape strings for you, transparently. Escaping is hard to get right, and so it's always better to leave it to a mature system than trying to do it yourself.

Here's a random example of a select clause:

from sqlalchemy import select
...
ham_table = meta.tables['ham']
sel = select([ham_table.c.size, ham_table.c.weight]).where(ham_table.c.taste == 'yummy')
result = meta.bind.execute(sel)

Don't be thrown off by the ".c.", it's just a convention that helps them automate things nicely for you. They basically generate column descriptors for you, and store them under the .c field of your table object.

2

You can use escape_string method from pymysql and then escape : so SQLAlchemy won't try to bind parameter for that variable, here is the sample

import MySQLdb
query = """ insert into.... values("{}"...) """.format(MySQLdb.escape_string(item).replace(':','\:'))

Note that if you use this way your code vulnerable to SQL Injection

to install pymysql

pip3 install pymysql
1
  • 1
    Note that this may not be sufficient for all SQL dialects. For example, escape_string uses a backslash to escape quotes which doesn't work with SQLite.
    – totalhack
    Commented Sep 11, 2020 at 12:57

Not the answer you're looking for? Browse other questions tagged or ask your own question.