
My Flask API has a small memory leak that, over a number of API calls, causes my application to hit its memory limit and crash. I've been trying to figure out why some memory isn't being released, with no success so far, though I believe I know the sources. I'd appreciate any help!

Unfortunately I can't share the code, but to describe it in English, my Flask app provides an API endpoint that lets a user do the following (all in one call):

  1. Pull some data from MongoDB based on an ID provided.
  2. From what's returned, build a document object using the python-docx library and save that to disk.
  3. Finally, I take what was saved to disk and upload it to an S3 bucket then delete what was on disk.
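Since the real code isn't shown, the flow above can only be sketched roughly. In the sketch below every name is hypothetical: `export_document`, and the `fetch`, `build` and `upload` callables that stand in for the MongoDB query, the python-docx build step and the S3 upload. It only illustrates one way to guarantee that the on-disk file is removed on every call, even when a step fails.

```python
import os
import tempfile


def export_document(doc_id, fetch, build, upload):
    """Run the fetch -> build -> save -> upload -> delete flow for one request.

    fetch, build and upload are hypothetical callables standing in for the
    MongoDB query, the python-docx build step and the S3 upload.
    """
    data = fetch(doc_id)
    # TemporaryDirectory deletes the file even if build or upload raises.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, f"{doc_id}.docx")
        build(data, path)   # writes the document to `path`
        upload(path)        # reads the file back and sends it to the bucket
    return path             # returned only so callers can check it is gone


if __name__ == "__main__":
    uploaded = []

    def fake_fetch(doc_id):
        return {"id": doc_id, "title": "demo"}

    def fake_build(data, path):
        with open(path, "w") as fh:
            fh.write(data["title"])

    def fake_upload(path):
        with open(path) as fh:
            uploaded.append(fh.read())

    old_path = export_document("42", fake_fetch, fake_build, fake_upload)
    print(uploaded, os.path.exists(old_path))
```

Keeping the three steps as separate callables also makes it easy to measure each one in isolation when hunting for the step that retains memory.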

From what I can tell using the memory_profiler library, the two areas where I see the most memory usage are the initialization of the Document object and connecting/saving to S3 (7MB and 4.8MB respectively).

To monitor the memory usage of my Python process, I have psutil print the RSS memory used at certain key points (example code below).

import os
import psutil

process = psutil.Process(os.getpid())
mem0 = process.memory_info().rss
print('Memory Usage Before Action',mem0/(1024**2),'MB')

## Perform some action

mem1 = process.memory_info().rss
print('Memory Usage After Action',mem1/(1024**2),'MB')
print('Memory Increase After Action',(mem1-mem0)/(1024**2),'MB')
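The same before/after pattern can be wrapped in a context manager so it doesn't have to be repeated around every action. This is only a sketch: `memory_delta` and `traced_bytes` are made-up names, and the default reader uses the standard library's tracemalloc (Python-level allocations only) rather than psutil's RSS, so the numbers will differ from the ones printed above. Passing `lambda: process.memory_info().rss` as `read_bytes` should reproduce the psutil readings.

```python
import tracemalloc
from contextlib import contextmanager


def traced_bytes():
    """Bytes currently allocated by Python, as seen by tracemalloc."""
    return tracemalloc.get_traced_memory()[0]


@contextmanager
def memory_delta(label, read_bytes=traced_bytes, report=print):
    """Report memory use before and after the wrapped block.

    read_bytes is any zero-argument callable returning bytes in use.
    """
    before = read_bytes()
    try:
        yield
    finally:
        after = read_bytes()
        mb = 1024 ** 2
        report(f"{label}: {before / mb:.2f} MB -> {after / mb:.2f} MB "
               f"({(after - before) / mb:+.2f} MB)")


if __name__ == "__main__":
    tracemalloc.start()
    with memory_delta("build list"):
        data = [0] * 1_000_000
```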

The console image provided is after I've called the app three times while hosting it locally.

console image provided

What's concerning is that each successive API call seems to start at or above the memory level where the previous call ended, and then adds to it. The app starts at 93MB (see yellow highlights), but the first call ends at 103.79MB, the second starts at 103.87MB and ends at 105.39MB, and the third starts at 105.46MB and ends at 106MB. The increases diminish, but after 100 calls I still see incremental memory usage. The red and blue lines show the memory changes at various points during the API call. The red lines are after the document build and the blue lines are after the S3 upload.

Please note that my test program is calling the API with the same parameters every time.

I have tested, among other things, the following:

  1. Calling gc.collect().
  2. Explicitly deleting variable/object references using 'del'.
  3. Ensuring that the mongo connection is closed (since I'm using the ibm_boto3 library for the S3 connection, I don't know if there's a way to explicitly close that connection).
  4. Making sure there are no global variables that I save to with each API call (app is the only global variable).
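One way to check whether 'del' and gc.collect() actually release an object is to keep a weak reference to it and see whether that reference dies. The sketch below is an illustration, not the code from this app: `Report`, `_cache` and `handle_request` are hypothetical, and `_cache` simulates a long-lived reference hidden somewhere else, for example inside a library.

```python
import gc
import weakref


class Report:
    """Stand-in for a large per-request object (hypothetical)."""

    def __init__(self):
        self.payload = bytearray(1024 * 1024)


_cache = []  # simulates a hidden long-lived reference


def handle_request(leak):
    report = Report()
    probe = weakref.ref(report)   # a weak reference does not keep it alive
    if leak:
        _cache.append(report)     # the extra reference that causes the leak
    del report                    # drops only the local name
    gc.collect()                  # cannot free objects that are still referenced
    return probe


if __name__ == "__main__":
    print("freed without extra reference:", handle_request(leak=False)() is None)
    print("freed with extra reference:", handle_request(leak=True)() is None)
```

If the probe is still alive after the request, something outside the function holds a reference, and neither 'del' nor gc.collect() will help until that reference is found.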

I know that since I can't provide code there may not be much to go on here, but if there are no ideas, I was wondering whether there's a best-practice way to handle Flask memory usage, or a way to clear out memory after the Flask function returns something. Right now my Flask functions are relatively standard Python functions (so I'd expect local variables inside them to be garbage collected afterwards).

I am using Python 3.6, Flask 0.11.1, and pymongo 3.6.1. My tests are currently on a Windows 7 machine, but my IBM Cloud server is seeing the same issue.


3 Answers


Important Note

Since this question was asked, Sanket Patel gave a talk at PyCon India 2019 about how to fix memory leaks in Flask. This is a summary of his strategy.

Minimal Example

Suppose you have a simple stateless Flask app with only one endpoint named 'foo'. Note that the other endpoints 'memory' and 'snapshot' aren't part of the original app. We need them later to find the memory leak.

import gc
import os
import tracemalloc

import psutil
from flask import Flask

app = Flask(__name__)
global_var = []
process = psutil.Process(os.getpid())
tracemalloc.start()
s = None


def _get_foo():
    global global_var
    global_var.append([1, "a", 3, True] * 10000)  # This is our (amplified) memory leak
    return {'foo': True}


@app.route('/foo')
def get_foo():
    gc.collect()  # does not help
    return _get_foo()


@app.route('/memory')
def print_memory():
    return {'memory': process.memory_info().rss}


@app.route("/snapshot")
def snap():
    global s
    if not s:
        s = tracemalloc.take_snapshot()
        return "taken snapshot\n"
    else:
        lines = []
        top_stats = tracemalloc.take_snapshot().compare_to(s, 'lineno')
        for stat in top_stats[:5]:
            lines.append(str(stat))
        return "\n".join(lines)


if __name__ == '__main__':
    app.run()

The memory leak is in line 17 and is indicated by a comment. Unfortunately, real leaks are seldom this easy to spot. ;)

As you can see I have tried to fix the memory leak by calling garbage collection manually, i.e. gc.collect(), before returning a value at the endpoint 'foo'. But this doesn't solve the problem.

Finding the Memory Leak

To find out if there is a memory leak, we call the endpoint 'foo' multiple times and measure the memory usage before and after the API calls. Also, we will take two tracemalloc snapshots. tracemalloc is a debug tool to trace memory blocks allocated by Python. It is in the standard library if you use Python 3.4+.

The following script should clarify the strategy:

    import requests
    from pprint import pprint

    # Warm up, so you don't measure flask internal memory usage
    for _ in range(10):
        requests.get('http://127.0.0.1:5000/foo')

    # Memory usage before API calls
    resp = requests.get('http://127.0.0.1:5000/memory')
    print(f'Memory before API call {int(resp.json().get("memory"))}')

    # Take first memory usage snapshot
    resp = requests.get('http://127.0.0.1:5000/snapshot')

    # Start some API Calls
    for _ in range(50):
        requests.get('http://127.0.0.1:5000/foo')

    # Memory usage after
    resp = requests.get('http://127.0.0.1:5000/memory')
    print(f'Memory after API call: {int(resp.json().get("memory"))}')

    # Take 2nd snapshot and print result
    resp = requests.get('http://127.0.0.1:5000/snapshot')
    pprint(resp.text)

Output:

Memory before API call 35328000
Memory after API call: 52076544
('.../stackoverflow/flask_memory_leak.py:17: '
 'size=18.3 MiB (+15.3 MiB), count=124 (+100), average=151 KiB\n'
 '...\\lib\\tracemalloc.py:387: '
 'size=536 B (+536 B), count=3 (+3), average=179 B\n'
 '...\\lib\\site-packages\\werkzeug\\wrappers\\base_response.py:190: '
 'size=512 B (+512 B), count=1 (+1), average=512 B\n'
 '...\\lib\\tracemalloc.py:524: '
 'size=504 B (+504 B), count=2 (+2), average=252 B\n'
 '...\\lib\\site-packages\\werkzeug\\datastructures.py:1140: '
 'size=480 B (+480 B), count=1 (+1), average=480 B')

There is a large difference in memory usage before versus after the API calls, i.e. a memory leak. The second call of the snapshot endpoint returns the five highest memory usage differences. The first result locates the memory leak correctly in line 17.

If the memory leak hides deeper in the code, you may have to adapt the strategy. I have only scratched the surface of tracemalloc's capabilities, but this strategy gives you a good starting point.
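One possible adaptation for deeper leaks is to group allocations by full traceback instead of by single line, and to filter out tracemalloc's own bookkeeping. The sketch below is standalone and not part of the Flask app above: `leaky`, `handler` and `top_growth` are illustrative names, and the frame depth of 25 is an arbitrary choice.

```python
import tracemalloc

_leak = []


def leaky():
    _leak.append([0] * 100_000)   # deliberate leak for the demo


def handler():
    leaky()


def top_growth(before, after, limit=3):
    """Return the allocation sites that grew most, grouped by traceback."""
    # Exclude allocations made by tracemalloc itself.
    filters = [tracemalloc.Filter(False, tracemalloc.__file__)]
    before = before.filter_traces(filters)
    after = after.filter_traces(filters)
    return after.compare_to(before, "traceback")[:limit]


if __name__ == "__main__":
    tracemalloc.start(25)          # keep up to 25 frames per allocation
    first = tracemalloc.take_snapshot()
    for _ in range(20):
        handler()
    second = tracemalloc.take_snapshot()
    for stat in top_growth(first, second):
        print(f"{stat.size_diff / 1024:.0f} KiB in {stat.count_diff} new blocks")
        print("\n".join(stat.traceback.format()))
```

With the "traceback" key, each result shows the whole call chain that led to the allocation, which helps when the leaking line sits inside a helper that is called from many places.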

  • This is a good general strategy for finding memory leaks, but the fundamental problem with Flask is that it loads a fresh copy of the main script in memory every time the API is called. Thus, you can minimize the problem by not having the main script do much (e.g., moving data loading to an import--which will also speed things up, esp. with the "preload_app" flag in Gunicorn), but you can't get rid of it entirely without a server package like Gunicorn or Waitress.
    – MTKnife
    Commented Mar 10, 2021 at 18:05

After a few years I should give an update. Since I posted this in a comment, I'll make this the "answer" unless someone finds a better solution.

Unfortunately I wasn't able to completely solve the problem and had to move on, but I was able to reduce the incremental consumption to a point where regular maintenance and monitoring would clear what remained or notify us if we got close to our limit.

The biggest thing that reduced incremental memory consumption between calls was to start another thread to handle the memory locking task within the Flask endpoint, wait for the thread to finish, and kill the thread once it was done. Like I said, it didn't solve the problem completely and it does introduce overhead, but it reduced the memory leak to a point where we could accept it with the aforementioned steps. This was a band-aid fix, so feel free to suggest an alternative, better, or real solution if one exists.
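The original code for this workaround isn't available, so the following is only a sketch of the general idea under stated assumptions. `run_in_thread` and `build_and_upload` are hypothetical names. Python has no API for killing a thread, so the sketch simply waits for the thread to finish and lets it end on its own.

```python
import threading


def run_in_thread(task, *args, **kwargs):
    """Run task in a short-lived thread, wait for it, and return its result."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = task(*args, **kwargs)
        except Exception as exc:   # hand the error back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join()   # the thread ends on its own once task returns
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def build_and_upload(doc_id):
    """Hypothetical stand-in for the document build and S3 upload."""
    return f"uploaded {doc_id}"


if __name__ == "__main__":
    print(run_in_thread(build_and_upload, "42"))
```

Inside a Flask view, the endpoint would call `run_in_thread(...)` and return the result as usual; the request still blocks until the work is done.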

Thank you, @above_c_level for the helpful tip for debugging memory leaks in Flask.

  • Hi Bejan, can you expand on what you mean by starting another thread to handle the memory locking task? Do you mean you created a separate thread to create the document and upload to S3?
    Commented Sep 13, 2021 at 17:38

This behavior only happened to me in debug mode in the development environment. When I use waitress as the web server, my Flask application works fine without a memory leak.

This is my app.waitress script, which runs the app from a virtual environment.

import sys
import os
import site
from waitress import serve
dir_path = os.path.dirname(__file__)
sys.path.append(os.path.abspath(dir_path))
venv_packages =  os.path.abspath(os.path.join(dir_path, 'venv', 'lib', 'site-packages'))
sys.path.append(venv_packages)
site.addsitedir(venv_packages)
from dotenv import load_dotenv
dotenv_path = os.path.join(os.path.dirname(__file__), '.env')
load_dotenv(dotenv_path)
from settings import API_HOST, API_PORT
from app import app as application
serve(application, host=API_HOST, port=API_PORT)

To run it from terminal (Mac or Linux):

. venv/bin/activate
pip install waitress
python app.waitress

To run it from Windows:

py -3 -m pip install waitress
py app.waitress

Environment:

Python 3.7.9
waitress 1.4.1
Flask 1.1.2
Flask-Cors 3.0.10
Flask-JWT-Extended 3.25.0
python-dotenv 0.10.3
  • Gunicorn (gunicorn.org) will also do the trick. I don't think there's any way to completely eliminate the leak from within Flask. IIRC, it's a byproduct of the Global Interpreter Lock.
    – MTKnife
    Commented Mar 10, 2021 at 17:41
  • I disagree with you on this. It is absolutely possible to have synchronized code without any memory leak. Every leak is due to a mistake in code. What you may see as "creeping memory" that feels unavoidable is more often caused by memory fragmentation: Depending on the memory allocation strategy, it's possible that no free contiguous chunk of it is left, causing further heap alloc. Either way, this can also be solved even if it may not be at the python application level.
    – NGauthier
    Commented Nov 23, 2021 at 14:38
