20

To speed up Lambda execution, I am trying to move some parts of my Python code outside the handler function

As per Lambda's documentation:

After a Lambda function is executed, AWS Lambda maintains the Execution Context for some time in anticipation of another Lambda function invocation. In effect, the service freezes the Execution Context after a Lambda function completes, and thaws the context for reuse, if AWS Lambda chooses to reuse the context when the Lambda function is invoked again. This Execution Context reuse approach has the following implications:

Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations…

Following their example, I have moved my database connection logic outside the handler function so subsequent WARM runs of the function can re-use the connection instead of creating a new one each time the function executes.
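For reference, the pattern looks roughly like this (a minimal sketch; sqlite3 stands in for psycopg2 here so the snippet runs anywhere, but the reuse pattern is identical):

```python
import sqlite3  # stand-in for psycopg2; the connection-reuse pattern is the same

# Runs once per COLD start; the module-level connection is then reused
# for as long as Lambda keeps this execution context warm.
conn = sqlite3.connect(":memory:")

def handler(event, context):
    # WARM invocations reuse `conn` instead of reconnecting
    cur = conn.cursor()
    cur.execute("SELECT 1")
    return cur.fetchone()[0]
```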

However, AWS Lambda provides no guarantee that subsequent invocations of a function that started COLD will run warm, so if Lambda decides a COLD start is necessary, my code would re-create the database connection.

When this happens, I assume the previous (WARM) instance of my function that Lambda tore down would have had an active connection to the database that was never closed, and if the pattern kept repeating, I suspect I'd end up with a lot of orphaned DB connections.

Is there a way in Python to detect if Lambda is trying to kill my function instance (maybe they send a SIGTERM signal?) and have it close active DB connections?

The database I'm using is Postgres.

2
  • 1
    Have you checked whether you are actually seeing these orphaned connections? You might be trying to solve a problem that doesn't exist. These connections might be getting automatically closed by the DB quite soon after they're orphaned (no heartbeat). Other than that, you can try the atexit module
    – Faboor
    Commented May 9, 2019 at 19:30
  • @Faboor I do see a couple of idle connections from the Lambda function in pg_stat_activity. Whether or not they'll be closed when Lambda kills my function is hard to test as I don't know when exactly they'll get killed.
    – Vinayak
    Commented May 10, 2019 at 10:35

5 Answers

13

The accepted answer is no longer correct. It might have been in the past, but today your Lambda should receive a SIGTERM when AWS intends to terminate the execution environment.

AWS has official examples on handling graceful shutdowns in python and other languages here:

https://github.com/aws-samples/graceful-shutdown-with-aws-lambda/tree/main/python-demo

But effectively you do:

import signal

def exit_gracefully(signum, frame):
    # close active DB connections and do any other cleanup here
    print('SIGTERM RECEIVED')

signal.signal(signal.SIGTERM, exit_gracefully)

This gets called on container shutdown, and you have roughly 300ms to do cleanup. Note that Lambda only sends SIGTERM to the runtime when the function has at least one registered extension; the linked sample sets one up.
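Putting the two together, here is a minimal sketch that closes a database connection on shutdown (sqlite3 stands in for psycopg2 so the snippet runs anywhere; a real handler would close the psycopg2 connection the same way):

```python
import signal
import sqlite3

# stand-in for a module-level psycopg2 connection reused across warm invocations
conn = sqlite3.connect(":memory:")

def exit_gracefully(signum, frame):
    # Lambda allows a short window (~300 ms) after SIGTERM for cleanup,
    # so keep this handler fast: just close connections and return
    conn.close()

signal.signal(signal.SIGTERM, exit_gracefully)
```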

3
  • I think this answer is referring to my answer; I mention that so you take my response with a grain of salt. But even though this answer is technically correct, for completeness' sake you should know that it relies on a relatively new feature that enables this, called "Lambda Extensions". The most important things to know about this feature are that a lambda extension generates extra cost (as it effectively runs as a background process) and that it counts towards the package size limit.
    – Dudemullet
    Commented Feb 2 at 15:40
  • @Dudemullet This behavior has been in place since October 2020. This is the post that announced it with the lifecycle already in place: aws.amazon.com/blogs/compute/… it has been out of preview since 2021. It's very safe to rely on. You don't get billed extra for lambda extensions, they run inside of the same container. The code size towards the 250mb limit is the unzipped size of Lambda + Extensions.
    – Kit Sunde
    Commented Feb 14 at 7:27
  • I mention the relative "newness" of this feature not to sway away from using it, but more along the lines of why I originally never mentioned it. Also, you do get charged extra for Lambda extensions as they run for more time. docs.aws.amazon.com/lambda/latest/dg/lambda-extensions.html
    – Dudemullet
    Commented Feb 19 at 13:49
10
+50

There is no way to know when a lambda container will be destroyed unfortunately.

With that out of the way, cold starts and DB connections are both much-discussed topics with Lambdas. Worse, there is no definitive answer; it has to be handled on a use-case basis.

Personally, I think the best way to go about this is to create connections and kill the idle ones based on a timeout on the Postgres side. For that I direct you to How to close idle connections in PostgreSQL automatically?
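For example, a scheduled cleanup job along the lines of the linked answer could look like this (a hypothetical sketch: psycopg2 and the `DATABASE_URL` environment variable are assumptions, and the five-minute cutoff is a placeholder):

```python
import os

# Terminate backends that have been idle for more than five minutes,
# excluding our own session. This is the approach from the linked answer.
TERMINATE_IDLE_SQL = """
    SELECT pg_terminate_backend(pid)
    FROM pg_stat_activity
    WHERE state = 'idle'
      AND state_change < now() - interval '5 minutes'
      AND pid <> pg_backend_pid();
"""

def terminate_idle_connections():
    import psycopg2  # assumed to be available in the deployment package
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    try:
        with conn.cursor() as cur:
            cur.execute(TERMINATE_IDLE_SQL)
        conn.commit()
    finally:
        conn.close()
```

You could run this from a separate Lambda on a CloudWatch schedule so orphaned connections never linger for long.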

You might also want to fine-tune how many lambdas you have running at any point in time. For this I would recommend setting a concurrency limit on your lambda aws-docs. This way you cap the number of running lambdas and avoid drowning your DB server in connections.

Jeremy Daly (serverless hero) has a great blog post on this: How To: Manage RDS Connections from AWS Lambda Serverless Functions

He also has a project (in Node, unfortunately) that wraps the MySQL connection, monitoring connections and automatically managing them, e.g. killing zombies: serverless-mysql. You might find something similar for Python.

3

I don't think what you are looking for is possible at the moment. Hacks might work, but I would advise against depending on them, as undocumented behavior can stop working at any time without notice in a closed-source system.

I guess you are concerned about the number of new connection created by your lambda functions and the load it puts on the db server.

Have you seen pgbouncer (https://pgbouncer.github.io/)? It is one of the best-known connection poolers for Postgres. I would recommend putting something like pgbouncer between your lambda function and the db.

This will remove the load on your db server caused by creating new connections, as connections between pgbouncer and Postgres can remain open for a long time. The lambda functions make new connections to pgbouncer, which is more than capable of handling unclosed connections via its various timeout config settings.
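A minimal pgbouncer.ini sketch of that setup (hostnames, pool size, and timeouts below are placeholder assumptions to adapt):

```ini
[databases]
; Lambdas connect to pgbouncer on port 6432 instead of Postgres directly
mydb = host=db.internal port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction      ; release the server connection after each transaction
default_pool_size = 20       ; small pool of long-lived connections to Postgres
server_idle_timeout = 600    ; close idle server connections after 10 minutes
```

With `pool_mode = transaction`, many short-lived Lambda connections multiplex onto that small pool, so orphaned client connections cost very little.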

Update on 9th Dec 2019

AWS recently announced RDS Proxy, which is capable of connection pooling. It is currently in preview and has no support for PostgreSQL yet, but they say it's coming soon.

https://aws.amazon.com/rds/proxy/

https://aws.amazon.com/blogs/compute/using-amazon-rds-proxy-with-aws-lambda/

0

I totally agree with @dudemullet.

Currently there is no way to say for sure when a lambda function is going to die. The best approach is to first understand the purpose of your connection. If it is only a simple select/update query that would ideally not take too long to execute, I would suggest opening and closing the connection inside the handler function. That way you can at least be 100% sure there will be no orphaned connections.

But on the flip side, you might have to bear those few extra milliseconds of connection setup on every invocation!
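A sketch of that approach (again using sqlite3 as a stand-in for psycopg2 so it runs anywhere):

```python
import sqlite3  # stand-in for psycopg2; the open/close pattern is the same

def handler(event, context):
    # open per invocation and always close: no orphaned connections,
    # at the cost of connection setup on every call
    conn = sqlite3.connect(":memory:")
    try:
        cur = conn.cursor()
        cur.execute("SELECT 1")
        return cur.fetchone()[0]
    finally:
        conn.close()
```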

0

I haven't had time to test this, but how about trap? I'm AFK at the moment, but when I get in I'll edit this answer after some experimentation.

FYI, I don't know what signals are sent when a container gets killed (it's not something I've looked into), so this answer is based on containers being decommissioned the same way a normal Linux machine goes down.

In your handler you'd add a shell command that runs this script, and then set a variable which will remain in place while the container is being re-used. I'm not a Python guy, but your logic would go something like this:

Handler

const { exec } = require('child_process');

let isNewContainer; // module scope, so it persists while the container is re-used

if (typeof isNewContainer === 'undefined') {
    isNewContainer = true;

    // run a shell script; exec is non-blocking and its callback fires
    // when the script exits, so the handler code below still executes
    exec('./script.sh & sleep 1 && kill -- -$(pgrep script.sh)', (err, stdout, stderr) => {
        // close db connections
    });
}

// handle the request

Shell script based on this answer:

#!/bin/bash
exitCallback() {
    trap - SIGTERM # clear the trap
    kill -- -$$ # Sends SIGTERM to child/sub processes
}

trap exitCallback SIGTERM

sleep infinity

Make sure you have a read of the comments on the accepted answer for that question as it gives you the shell commands to run the script.

I would say it's pretty easy to keep containers warm but your question was "Is there a way in Python to detect if Lambda is trying to kill my function instance (maybe they send a SIGTERM signal?) and have it close active DB connections?"
