Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for weak references in frames #102960

Open
ambv opened this issue Mar 23, 2023 · 7 comments
Open

Support for weak references in frames #102960

ambv opened this issue Mar 23, 2023 · 7 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@ambv
Copy link
Contributor

ambv commented Mar 23, 2023

It's occasionally useful to be able to link information to active frames. Two recent examples:

Currently, since frames don't support being weakly referenced, custom lifecycle management is required in this case. That's unfortunate because errors in cleaning out the references will lead to memory leaks of dead frames and their respective locals.

The straightforward solution is to add weakref support to frame objects. The performance hit is limited to setting NULL on the f_weakreflist field on frame object creation, and later a NULL check (for whether there are any weakrefs) at the frame object's deallocation. This is what the related PR does.

I ran the PR and the base main branch at the time against pyperformance 1.0.6. The raw results are in benchmarks.zip but the gist is that there are no significant differences between the two branches in any of the benchmarks executed.

Linked PRs

@markshannon
Copy link
Member

Do you have a link to the results in a legible form?

@ambv
Copy link
Contributor Author

ambv commented Mar 23, 2023

I'll get that for you tomorrow. In the mean time you can:

$ pip install pyperformance
$ unzip benchmarks.zip
$ pyperformance compare py312*
@ambv
Copy link
Contributor Author

ambv commented Mar 24, 2023

py312-4075fe1.json
==================

Performance version: 1.0.6
Report on macOS-12.6.3-arm64-arm-64bit
Number of logical CPUs: 10
Start date: 2023-03-23 16:41:05.587781
End date: 2023-03-23 17:54:09.388857

py312+weakref.json
==================

Performance version: 1.0.6
Report on macOS-12.6.3-arm64-arm-64bit
Number of logical CPUs: 10
Start date: 2023-03-23 15:18:16.273352
End date: 2023-03-23 16:31:20.587873

### 2to3 ###
Mean +- std dev: 700 ms +- 1 ms -> 691 ms +- 4 ms: 1.01x faster
Not significant

### async_generators ###
Mean +- std dev: 1.41 sec +- 0.00 sec -> 1.39 sec +- 0.01 sec: 1.02x faster
Not significant

### async_tree_cpu_io_mixed ###
Mean +- std dev: 1.80 sec +- 0.01 sec -> 1.79 sec +- 0.01 sec: 1.00x faster
Not significant

### async_tree_io ###
Mean +- std dev: 2.42 sec +- 0.03 sec -> 2.41 sec +- 0.03 sec: 1.00x faster
Not significant

### async_tree_memoization ###
Mean +- std dev: 1.27 sec +- 0.01 sec -> 1.27 sec +- 0.01 sec: 1.00x faster
Not significant

### async_tree_none ###
Mean +- std dev: 1.09 sec +- 0.01 sec -> 1.09 sec +- 0.01 sec: 1.00x faster
Not significant

### bench_mp_pool ###
Mean +- std dev: 92.8 ms +- 0.3 ms -> 92.6 ms +- 0.3 ms: 1.00x faster
Not significant

### bench_thread_pool ###
Mean +- std dev: 1.54 ms +- 0.00 ms -> 1.53 ms +- 0.00 ms: 1.00x faster
Not significant

### chameleon ###
Mean +- std dev: 22.6 ms +- 0.2 ms -> 22.6 ms +- 0.2 ms: 1.00x faster
Not significant

### chaos ###
Mean +- std dev: 253 ms +- 1 ms -> 252 ms +- 1 ms: 1.00x faster
Not significant

### coroutines ###
Mean +- std dev: 97.1 ms +- 0.2 ms -> 96.2 ms +- 0.3 ms: 1.01x faster
Not significant

### coverage ###
Mean +- std dev: 267 ms +- 2 ms -> 267 ms +- 3 ms: 1.00x faster
Not significant

### crypto_pyaes ###
Mean +- std dev: 266 ms +- 1 ms -> 265 ms +- 1 ms: 1.00x faster
Not significant

### deepcopy ###
Mean +- std dev: 1.17 ms +- 0.01 ms -> 1.16 ms +- 0.00 ms: 1.00x faster
Not significant

### deepcopy_memo ###
Mean +- std dev: 124 us +- 1 us -> 124 us +- 0 us: 1.00x slower
Not significant

### deepcopy_reduce ###
Mean +- std dev: 10.6 us +- 0.0 us -> 10.6 us +- 0.0 us: 1.00x faster
Not significant

### deltablue ###
Mean +- std dev: 12.0 ms +- 0.1 ms -> 12.1 ms +- 0.1 ms: 1.01x slower
Not significant

### docutils ###
Mean +- std dev: 5.77 sec +- 0.05 sec -> 5.81 sec +- 0.02 sec: 1.01x slower
Not significant

### dulwich_log ###
Mean +- std dev: 104 ms +- 0 ms -> 104 ms +- 0 ms: 1.00x slower
Not significant

### fannkuch ###
Mean +- std dev: 1.40 sec +- 0.00 sec -> 1.39 sec +- 0.01 sec: 1.01x faster
Not significant

### float ###
Mean +- std dev: 253 ms +- 1 ms -> 253 ms +- 2 ms: 1.00x slower
Not significant

### generators ###
Mean +- std dev: 101 ms +- 1 ms -> 101 ms +- 1 ms: 1.00x slower
Not significant

### genshi_text ###
Mean +- std dev: 73.6 ms +- 0.5 ms -> 73.7 ms +- 0.8 ms: 1.00x slower
Not significant

### genshi_xml ###
Mean +- std dev: 153 ms +- 1 ms -> 152 ms +- 1 ms: 1.00x faster
Not significant

### go ###
Mean +- std dev: 455 ms +- 3 ms -> 453 ms +- 2 ms: 1.00x faster
Not significant

### hexiom ###
Mean +- std dev: 21.7 ms +- 0.0 ms -> 21.7 ms +- 0.0 ms: 1.00x slower
Not significant

### html5lib ###
Mean +- std dev: 135 ms +- 0 ms -> 135 ms +- 0 ms: 1.00x faster
Not significant

### json_dumps ###
Mean +- std dev: 31.6 ms +- 0.1 ms -> 31.5 ms +- 0.1 ms: 1.00x faster
Not significant

### json_loads ###
Mean +- std dev: 83.7 us +- 0.7 us -> 83.3 us +- 0.8 us: 1.00x faster
Not significant

### logging_format ###
Mean +- std dev: 19.9 us +- 0.1 us -> 19.8 us +- 0.1 us: 1.01x faster
Not significant

### logging_silent ###
Mean +- std dev: 344 ns +- 1 ns -> 343 ns +- 1 ns: 1.00x faster
Not significant

### logging_simple ###
Mean +- std dev: 18.6 us +- 0.1 us -> 18.4 us +- 0.1 us: 1.01x faster
Not significant

### mako ###
Mean +- std dev: 33.9 ms +- 0.1 ms -> 33.8 ms +- 0.1 ms: 1.00x faster
Not significant

### mdp ###
Mean +- std dev: 7.92 sec +- 0.03 sec -> 7.94 sec +- 0.17 sec: 1.00x slower
Not significant

### meteor_contest ###
Mean +- std dev: 280 ms +- 0 ms -> 280 ms +- 0 ms: 1.00x slower
Not significant

### nbody ###
Mean +- std dev: 337 ms +- 2 ms -> 337 ms +- 2 ms: 1.00x faster
Not significant

### nqueens ###
Mean +- std dev: 302 ms +- 1 ms -> 302 ms +- 1 ms: 1.00x faster
Not significant

### pathlib ###
Mean +- std dev: 47.2 ms +- 0.4 ms -> 47.0 ms +- 0.2 ms: 1.00x faster
Not significant

### pickle ###
Mean +- std dev: 30.6 us +- 0.3 us -> 30.7 us +- 0.3 us: 1.00x slower
Not significant

### pickle_dict ###
Mean +- std dev: 69.9 us +- 0.3 us -> 69.9 us +- 0.2 us: 1.00x faster
Not significant

### pickle_list ###
Mean +- std dev: 9.89 us +- 0.11 us -> 9.89 us +- 0.12 us: 1.00x faster
Not significant

### pickle_pure_python ###
Mean +- std dev: 914 us +- 2 us -> 911 us +- 4 us: 1.00x faster
Not significant

### pidigits ###
Mean +- std dev: 633 ms +- 4 ms -> 631 ms +- 3 ms: 1.00x faster
Not significant

### pprint_pformat ###
Mean +- std dev: 5.19 sec +- 0.02 sec -> 5.20 sec +- 0.02 sec: 1.00x slower
Not significant

### pprint_safe_repr ###
Mean +- std dev: 2.56 sec +- 0.01 sec -> 2.56 sec +- 0.01 sec: 1.00x faster
Not significant

### pyflate ###
Mean +- std dev: 1.46 sec +- 0.01 sec -> 1.46 sec +- 0.00 sec: 1.00x faster
Not significant

### python_startup ###
Mean +- std dev: 18.7 ms +- 0.1 ms -> 18.7 ms +- 0.0 ms: 1.00x slower
Not significant

### python_startup_no_site ###
Mean +- std dev: 14.8 ms +- 0.0 ms -> 14.9 ms +- 0.1 ms: 1.00x slower
Not significant

### raytrace ###
Mean +- std dev: 1.09 sec +- 0.00 sec -> 1.08 sec +- 0.00 sec: 1.00x faster
Not significant

### regex_compile ###
Mean +- std dev: 399 ms +- 1 ms -> 398 ms +- 1 ms: 1.00x faster
Not significant

### regex_dna ###
Mean +- std dev: 238 ms +- 0 ms -> 238 ms +- 0 ms: 1.00x faster
Not significant

### regex_effbot ###
Mean +- std dev: 6.61 ms +- 0.05 ms -> 6.59 ms +- 0.04 ms: 1.00x faster
Not significant

### regex_v8 ###
Mean +- std dev: 41.9 ms +- 0.1 ms -> 41.7 ms +- 0.1 ms: 1.00x faster
Not significant

### richards ###
Mean +- std dev: 146 ms +- 1 ms -> 144 ms +- 1 ms: 1.02x faster
Not significant

### scimark_fft ###
Mean +- std dev: 1.17 sec +- 0.02 sec -> 1.17 sec +- 0.01 sec: 1.01x faster
Not significant

### scimark_lu ###
Mean +- std dev: 374 ms +- 2 ms -> 372 ms +- 1 ms: 1.00x faster
Not significant

### scimark_monte_carlo ###
Mean +- std dev: 234 ms +- 2 ms -> 233 ms +- 1 ms: 1.00x faster
Not significant

### scimark_sor ###
Mean +- std dev: 434 ms +- 3 ms -> 437 ms +- 3 ms: 1.01x slower
Not significant

### scimark_sparse_mat_mult ###
Mean +- std dev: 17.1 ms +- 0.1 ms -> 17.2 ms +- 0.1 ms: 1.01x slower
Not significant

### spectral_norm ###
Mean +- std dev: 428 ms +- 2 ms -> 430 ms +- 1 ms: 1.01x slower
Not significant

### sqlglot_normalize ###
Mean +- std dev: 388 ms +- 1 ms -> 388 ms +- 1 ms: 1.00x faster
Not significant

### sqlglot_optimize ###
Mean +- std dev: 181 ms +- 0 ms -> 181 ms +- 1 ms: 1.00x slower
Not significant

### sqlglot_parse ###
Mean +- std dev: 4.67 ms +- 0.02 ms -> 4.69 ms +- 0.03 ms: 1.00x slower
Not significant

### sqlglot_transpile ###
Mean +- std dev: 5.42 ms +- 0.03 ms -> 5.43 ms +- 0.03 ms: 1.00x slower
Not significant

### sqlite_synth ###
Mean +- std dev: 4.45 us +- 0.02 us -> 4.44 us +- 0.02 us: 1.00x faster
Not significant

### telco ###
Mean +- std dev: 19.5 ms +- 0.1 ms -> 19.4 ms +- 0.1 ms: 1.00x faster
Not significant

### tornado_http ###
Mean +- std dev: 196 ms +- 6 ms -> 192 ms +- 6 ms: 1.02x faster
Significant (t=3.52)

### unpack_sequence ###
Mean +- std dev: 161 ns +- 6 ns -> 161 ns +- 6 ns: 1.00x faster
Not significant

### unpickle ###
Mean +- std dev: 51.4 us +- 0.5 us -> 51.4 us +- 0.6 us: 1.00x slower
Not significant

### unpickle_list ###
Mean +- std dev: 17.1 us +- 0.1 us -> 17.0 us +- 0.0 us: 1.00x faster
Not significant

### unpickle_pure_python ###
Mean +- std dev: 700 us +- 1 us -> 708 us +- 1 us: 1.01x slower
Not significant

### xml_etree_generate ###
Mean +- std dev: 284 ms +- 2 ms -> 285 ms +- 2 ms: 1.00x slower
Not significant

### xml_etree_iterparse ###
Mean +- std dev: 299 ms +- 2 ms -> 299 ms +- 2 ms: 1.00x slower
Not significant

### xml_etree_parse ###
Mean +- std dev: 366 ms +- 2 ms -> 366 ms +- 2 ms: 1.00x slower
Not significant

### xml_etree_process ###
Mean +- std dev: 195 ms +- 1 ms -> 196 ms +- 1 ms: 1.00x slower
Not significant

The only "Significant" one is tornado_http and this one claims it's in favor of the weakreflist introduction so clearly noise.

@glyph
Copy link
Contributor

glyph commented Mar 29, 2023

There's some Deferred debugging stuff that has caused significant angst with hard references to frame objects. Weakrefs here would be hugely useful. I'm not sure exactly how we'd use them (the cycles and attendant leaks were such a problem that we have mostly given up on this sort of thing and I'd need to load the whole feature back into my head) but I definitely want this if it is a possibility!

@tacaswell
Copy link
Contributor

We have an open issue on Matplotlib due to frame related reference cycles matplotlib/matplotlib#25406

@markshannon
Copy link
Member

This is not the way to fix memory leaks.

If you really must have cycles involving frames, then you can break the cycle like this:
class SupportsWeakRefs: pass
frame -> weakref() -> SupportsWeakRefs() -> frame.

Although I have to wonder why plt.savefig needs to do strange things to the stack.

@tacaswell
Copy link
Contributor

plt.savefig is how the use found it but the cycles are from calls much deeper in the code (we were caching an Exception object 🐑 , and it looks like some versions of pillow and pyparsing also are doing something with cached exceptions). I do not think there is a compelling reason we (Matplotlib) needs this, but we were bit by frame cycles someplace.

@iritkatriel iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
6 participants