6

I'd like to do the following:

for every nested function f anywhere in this_py_file:
    if has_free_variables(f):
        print warning

Why? Primarily as insurance against the late-binding closure gotcha as described elsewhere. Namely:

>>> def outer():
...     rr = []
...     for i in range(3):
...         def inner():
...             print i
...         rr.append(inner)
...     return rr
... 
>>> for f in outer(): f()
... 
2
2
2
>>> 

And whenever I get warned about a free variable, I would either add an explicit exception (in the rare case that I would want this behaviour) or fix it like so:

...         def inner(i=i):

Then the behaviour becomes more like nested classes in Java (where any variable to be used in an inner class has to be final).
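For reference, here is a minimal sketch of the complete fixed function. It is written to run on Python 3 as well (the session above is Python 2), the name outer_fixed is made up for illustration, and the inner functions return i instead of printing it so the result is easy to inspect:

```python
def outer_fixed():
    rr = []
    for i in range(3):
        # The default value is evaluated once, when this def statement runs,
        # so each inner function keeps its own snapshot of i.
        def inner(i=i):
            return i
        rr.append(inner)
    return rr

results = [f() for f in outer_fixed()]
print(results)  # [0, 1, 2]

# With the default argument in place, inner has no free variables left.
print(outer_fixed()[0].__code__.co_freevars)  # ()
```

The empty co_freevars tuple is exactly what the check I'm after would look for.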

(As far as I know, besides solving the late-binding issue, this will also promote better use of memory, because if a function "closes over" some variables in an outer scope, then the outer scope cannot be garbage collected for as long as the function is around. Right?)

I can't find any way to get hold of functions nested in other functions. Currently, the best way I can think of is to instrument a parser, which seems like a lot of work.

8
  • What are you trying to do exactly?
    – kpie
    Commented May 3, 2016 at 2:53
  • 1
    This is an XY problem on a massive scale. If you're worried about late-binding closures in a piece of code, rewrite the code to not suffer from it, don't do a massive try-except style thing after the fact. Do you not have control over the code that could potentially be doing this? Commented May 3, 2016 at 3:00
  • Also, people have asked about late-binding closures before and have received answers that help mitigate the same. Following suit would serve you better than your current approach. Commented May 3, 2016 at 3:04
  • "if a function "closes over" some variables in an outer scope, then the outer scope cannot be garbage collected for as long as the function is around" - no, just the variables the function needs, and removing the closure will just mean you have to keep those references some other way. Commented May 3, 2016 at 3:10
  • 1
    @AkshatMahajan Have you come across HTML validators or lint checkers? This is exactly how I'm planning to do this. Otherwise, think of asserts that are turned off before code is released or the debug version of STL. This isn't an XY problem unless you can suggest a better alternative (while I do appreciate the "rewrite all your code" joke :) Commented May 3, 2016 at 3:13

4 Answers

2

Consider the following function:

def outer_func():
    outer_var = 1

    def inner_func():
        inner_var = outer_var
        return inner_var

    outer_var += 1
    return inner_func

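The __code__ attribute of the outer function gives access to the code object of the inner function, which is stored among the outer code object's constants:

outer_code = outer_func.__code__
inner_code = outer_code.co_consts[2]  # the index depends on the function body and the Python version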

From this code object, the free variables can be recovered:

inner_code.co_freevars # ('outer_var',)

Not every entry in co_consts is a code object, so you can check whether an entry should be inspected with:

hasattr(inner_code, 'co_freevars') # True

After you get all the functions from your file, this might look something like:

for func in function_list:
    # co_consts mixes plain constants with the code objects of nested functions
    for code in func.__code__.co_consts:
        if hasattr(code, 'co_freevars'):
            assert len(code.co_freevars) == 0, code.co_name

Someone who knows more about the inner workings can probably provide a better explanation or a more concise solution.
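One possible way to extend this to a whole file is sketched below, under a few assumptions: the source is compiled with the built-in compile() rather than imported, co_consts is walked recursively so that deeper nesting is covered, and the names iter_code_objects, find_free_variable_functions and SAMPLE are made up for illustration. This is CPython-specific and not a definitive implementation:

```python
import types

def iter_code_objects(code):
    """Yield every code object nested (at any depth) inside `code`."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield const
            for nested in iter_code_objects(const):
                yield nested

def find_free_variable_functions(source, filename="<string>"):
    """Return (name, first line, free variables) for code objects that close over something."""
    module_code = compile(source, filename, "exec")
    return [
        (code.co_name, code.co_firstlineno, code.co_freevars)
        for code in iter_code_objects(module_code)
        if code.co_freevars
    ]

# Hypothetical sample source: `inner` closes over i, `helper` closes over nothing.
SAMPLE = '''
def outer():
    rr = []
    for i in range(3):
        def inner():
            return i
        rr.append(inner)
    return rr

def plain():
    def helper(x):
        return x + 1
    return helper
'''

warnings_found = find_free_variable_functions(SAMPLE)
for name, lineno, freevars in warnings_found:
    print("warning: %s (line %d) has free variables %r" % (name, lineno, freevars))
```

Note that this may flag more than plain nested functions: on some Python versions comprehensions compile to their own code objects, and methods that use zero-argument super() close over __class__, so a real checker would probably need a whitelist.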

0

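To get hold of each nested function (even though the name inner is rebound on every iteration), you could use exec to define a distinctly named function per iteration. (eval only evaluates expressions, so it cannot run a def statement.)

def outer():
     rr = []
     for i in range(3):
         namespace = {}
         # Bake the current value of i into the generated source text.
         exec("def inner%d():\n    print(%d)" % (i, i), namespace)
         rr.append(namespace["inner%d" % i])
     return rr

for f in outer(): f()

prints

0
1
2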
3
  • That forces the closure to be evaluated at assignment time, which is good, but not always the right solution. Imagine if you wanted to code a stateful function that's only meant to make, say, database calls later on in the code - this would force a database call right away, rather than later on as planned. This is a hack, not a complete solution. Good try, though :D Commented May 3, 2016 at 3:01
  • Interesting. my code doesn't work. and here I was so confident... The usual dilemma.
    – kpie
    Commented May 3, 2016 at 3:03
  • 2
    I didn't mention it, but it should be clear that a real solution has to be a "set and forget" style feature. It will become part of the development harness and there is absolutely no way that it should influence coding style or change existing code. Even adding a decorator to every nested function would be overkill. Commented May 3, 2016 at 3:17
0

I also wanted to do this in Jython. But the way shown in the accepted answer doesn't work there, because Jython code objects have no co_consts attribute. (Also, there doesn't seem to be any other way to query a code object to get at the code objects of nested functions.)

But of course, the code objects are there somewhere, we have the source and full access, so it's only a matter of finding an easy way within a reasonable amount of time. So here's one way that works. (Hold on to your seats.)

Suppose we have code like this in module mod:

def outer():
    def inner():
        print "Inner"

First get the code object of the outer function directly:

code = mod.outer.__code__

In Jython, this is an instance of PyTableCode, and, by reading the source, we find that the actual functions are implemented in a Java class generated from your script, which is referenced by the code object's funcs field. (All classes generated from scripts are subclasses of PyFunctionTable, hence that's the declared type of funcs.) This field isn't visible from within Jython, thanks to some magic machinery, which is a designer's way of saying that you access these things at your own risk.

So we need to dive into the Java for a moment. A class like this does the trick:

import java.lang.reflect.Field;

public class Getter {
    public static Object getFuncs(Object o) 
    throws NoSuchFieldException, IllegalAccessException {
        Field f = o.getClass().getDeclaredField("funcs");
        f.setAccessible(true);
        return f.get(o);
    }
}

Back in Jython:

>>> import Getter
>>> funcs = Getter.getFuncs(mod.outer.__code__)
>>> funcs
mod$py@1bfa3a2

Now, this funcs object has all of the functions declared anywhere in the Jython script (those nested arbitrarily, within classes, etc.). Also, it has fields holding the corresponding code objects.

>>> fields = funcs.class.getDeclaredFields()

In my case, the code object corresponding to the nested function happens to be the last one:

>>> flast = fields[-1]
>>> flast
static final org.python.core.PyCode mod$py.inner$24

To get the code object of interest:

>>> flast.setAccessible(True)
>>> inner_code = flast.get(None)  #"None" because it's a static field.
>>> dir(inner_code)
co_argcount co_filename    co_flags co_freevars co_name co_nlocals co_varnames
co_cellvars co_firstlineno

And the rest is the same as in the accepted answer, i.e. check co_freevars (which is available in Jython, unlike co_consts).

A good thing about this approach is that you enumerate exactly all code objects that are declared anywhere within the source code file: functions, methods, generators, whether or not they are nested under anything or under each other. There is nowhere else for them to hide.

-1

You need to import copy and use rr.append(copy.copy(inner)).

https://pymotw.com/2/copy/

2
  • This does not answer the question. He isn't asking to avoid late-binding closures, he's asking to detect them. Commented May 3, 2016 at 3:19
  • Unfortunately this doesn't actually work. I think different inner() functions are created anyway (in some sense), but the trouble is that they all keep a free variable reference (using terminology in some loose sense) to the outer scope, and only look up the value of i in that scope when they are called. By which time it is 2. This would appear as strange semantics from the point of view of C++/Java language model; this is more like what happens in JavaScript and maybe Ruby. Commented May 8, 2016 at 12:10
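
The point made in the comment above can be checked with a short sketch. It assumes CPython, and the inner functions return i instead of printing it so the values are easy to compare:

```python
import copy

def outer():
    rr = []
    for i in range(3):
        def inner():
            return i
        # copy.copy treats functions as atomic and hands back the same object.
        rr.append(copy.copy(inner))
    return rr

funcs = outer()

# Three distinct function objects are created, one per loop iteration...
print(len(set(id(f) for f in funcs)))  # 3

# ...but all of them share the single closure cell that holds i,
shared_cell = funcs[0].__closure__[0]
print(all(f.__closure__[0] is shared_cell for f in funcs))  # True

# so each call sees the final value of i.
print([f() for f in funcs])  # [2, 2, 2]
```

Copying the function object does nothing to the cell it refers to, which is why the late binding survives.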
