Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inspect module is prohibitively expensive to import #117865

Open
jaraco opened this issue Apr 14, 2024 · 9 comments
Open

Inspect module is prohibitively expensive to import #117865

jaraco opened this issue Apr 14, 2024 · 9 comments
Labels
performance Performance or resource usage type-feature A feature request or enhancement

Comments

@jaraco
Copy link
Member

jaraco commented Apr 14, 2024

From post by @AlexWaygood in #117372 (comment)_:

I'm a little dismayed that dataclasses adds so much overhead. Does that imply that dataclasses shouldn't be used in the stdlib?

dataclasses is an unfortunately heavy import, yes. Although it imports relatively few modules itself, one of those modules is inspect, which has a huge import time (inspect imports half the standard library). In the past, I've looked into removing the dataclasses dependency on inspect, but couldn't see a way to do so without making the code significantly more ugly; the same goes when it comes to improving the import time for inspect itself. (In hindsight, I'd argue inspect should probably have been a package rather than a huge single-file Python module... but hindsight is 20/20.)

Perhaps inspect could be refactored to limit the import-time cost.

Currently on my m3 mac pro, on the second attempt, the import times are 1.2ms and 7.9ms (cumulative):

 cpython main @ ./python.exe -X importtime -c 'import inspect' 2> /dev/null ; ./python.exe -X importtime -c 'import inspect' 2>1 | grep -E '(cumul|inspect)'
import time: self [us] | cumulative | imported package
import time:      1184 |       7892 | inspect

I've searched the issue tracker and didn't see anything tracking this concern, so right now it's a naïve feature request.

Related: #108901

Linked PRs

@jaraco
Copy link
Member Author

jaraco commented Apr 14, 2024

It looks like the bulk of the import time is spent in re, ast, and dis:

import time: self [us] | cumulative | imported package
import time:       223 |        223 |         types
import time:       991 |       1214 |       enum
import time:        47 |         47 |         _sre
import time:       187 |        187 |           re._constants
import time:       270 |        456 |         re._parser
import time:        81 |         81 |         re._casefix
import time:       257 |        839 |       re._compiler
import time:        81 |         81 |           itertools
import time:       100 |        100 |           keyword
import time:        40 |         40 |             _operator
import time:       202 |        241 |           operator
import time:       131 |        131 |           reprlib
import time:        41 |         41 |           _collections
import time:       554 |       1146 |         collections
import time:        32 |         32 |         _functools
import time:       590 |       1766 |       functools
import time:       125 |        125 |       copyreg
import time:       384 |       4326 |     re
import time:      1325 |       1325 |     _ast
import time:       458 |        458 |     contextlib
import time:       861 |       6968 |   ast
import time:       281 |        281 |       _opcode
import time:       160 |        160 |       _opcode_metadata
import time:       220 |        660 |     opcode
import time:       594 |       1253 |   dis
import time:       129 |        129 |   collections.abc
import time:       116 |        116 |     importlib
import time:        43 |        158 |   importlib.machinery
import time:       141 |        141 |     token
import time:        20 |         20 |     _tokenize
import time:       569 |        729 |   tokenize
import time:      1548 |      10783 | inspect

If those imports could be deferred, that would likely address the concern.

@AlexWaygood AlexWaygood added type-feature A feature request or enhancement performance Performance or resource usage labels Apr 14, 2024
@hugovk
Copy link
Member

hugovk commented Apr 14, 2024

Visualised with https://github.com/nschloe/tuna:

./python.exe -X importtime -c 'import inspect' 2> /dev/null ; ./python.exe -X importtime -c 'import inspect' 2> import.log && tuna import.log
image
@JelleZijlstra
Copy link
Member

Some notes:

  • The graph above isn't the whole story because many modules get imported multiple times. For example, re is imported both directly by inspect and also by ast. Removing one of the two imports wouldn't actually help import time.
  • ast is used heavily in inspect.py, including as a base class. Deferring it would be difficult.
  • re is used only once in inspect.py in a relatively obscure case (formatannotation); that import could be deferred.
  • re is similarly only used once in ast.py and for a single function (https://docs.python.org/3.10/library/ast.html#ast.get_source_segment).
  • I haven't checked if there are other paths by which inspect imports re.
  • enum is imported by inspect and re. In both cases, it's for defining enums in the global scope, so it cannot easily be deferred.
  • It may also be worth running a line-profiling tool to see if any module is doing something especially expensive at import time. For example, it looks like ast spends about 1 ms of import time on things outside importing other modules. What is that time being spent on, and can we improve it?
@AlexWaygood
Copy link
Member

yeah, I looked at this in the past and decided that it wasn't realistically possible to improve this situation without either heavily refactoring inspect.py or making the code quite a lot uglier in order to defer imports. But I'd be very happy to be proved wrong if somebody can see a good way of making progress here :-)

@JelleZijlstra
Copy link
Member

I think we can get rid of the re import, which would shave off about 1 ms.

@hugovk
Copy link
Member

hugovk commented Apr 14, 2024

  • I haven't checked if there are other paths by which inspect imports re.

re is also used in tokenize:

  • There are two re.compiles which could be moved into a function with a local import re
  • One re.compile in a dedicated _compile function which looks unused and could be removed or given a local import re
  • But one use in a global Special = group(*map(re.escape, sorted(EXACT_TOKEN_TYPES, reverse=True))) which looks harder to refactor
@AlexWaygood
Copy link
Member

A bunch of the re-module usage in tokenize is backwards-compatibility stuff that isn't actually used by tokenize's core functionality anymore. We could easily deprecate that now, I think. (Previously discussed in #104719. Looks like I promised to open a followup issue there about deprecating the obsolete constants but never did so...)

@serhiy-storchaka
Copy link
Member

The half of the re import cost is in importing enum and collections. It is difficult to find a program that does not import them. Look not only at the cumulative time, but also at the pure time. The slowest are modules that create a large number of classes, especially with metaclasses.

@DavidCEllis
Copy link
Contributor

DavidCEllis commented Apr 19, 2024

Using a tool I have for lazy imports/exports I did some quick testing on deferring the imports for ast, dis, re and tokenize in inspect to see what impact that has on both inspect and dataclasses.

This involved making inspect a package and splitting out the definitions of _ClassFinder and the compiler flags into separate files to be lazily imported if needed. It didn't break ./python -m test -v test_inspect at least (after making __main__.py).


My machine isn't stable enough for really consistent testing but this gives a rough idea of the result.

Benchmarks 4 and 5 include actually creating a dataclass. The dataclass_no_docstring difference is because without a docstring dataclasses calls inspect.signature to generate a basic docstring1 and this pulls in the code that uses dis2.

Results from today's main
Benchmark 1: ./python -c ""
  Time (mean ± σ):      12.3 ms ±   0.6 ms    [User: 10.0 ms, System: 2.5 ms]
  Range (min … max):    11.6 ms …  14.0 ms    25 runs
 
Benchmark 2: ./python -c "import inspect"
  Time (mean ± σ):      32.1 ms ±   0.7 ms    [User: 29.0 ms, System: 3.3 ms]
  Range (min … max):    30.6 ms …  33.6 ms    25 runs
 
Benchmark 3: ./python -c "import dataclasses"
  Time (mean ± σ):      34.4 ms ±   1.5 ms    [User: 30.2 ms, System: 4.3 ms]
  Range (min … max):    32.1 ms …  37.4 ms    25 runs
 
Benchmark 4: ./python scratch/dataclass_with_docstring.py
  Time (mean ± σ):      34.3 ms ±   1.0 ms    [User: 30.6 ms, System: 3.7 ms]
  Range (min … max):    32.7 ms …  36.5 ms    25 runs
 
Benchmark 5: ./python scratch/dataclass_no_docstring.py
  Time (mean ± σ):      34.3 ms ±   1.0 ms    [User: 30.5 ms, System: 4.0 ms]
  Range (min … max):    32.8 ms …  37.2 ms    25 runs
Results from lazy_inspect
Benchmark 1: ./python -c ""
  Time (mean ± σ):      12.3 ms ±   0.8 ms    [User: 10.5 ms, System: 2.2 ms]
  Range (min … max):    11.7 ms …  15.3 ms    25 runs
 
Benchmark 2: ./python -c "import inspect"
  Time (mean ± σ):      21.6 ms ±   1.1 ms    [User: 20.2 ms, System: 1.7 ms]
  Range (min … max):    20.6 ms …  24.1 ms    25 runs
 
Benchmark 3: ./python -c "import dataclasses"
  Time (mean ± σ):      26.0 ms ±   1.1 ms    [User: 22.8 ms, System: 3.5 ms]
  Range (min … max):    24.7 ms …  27.9 ms    25 runs
 
Benchmark 4: ./python scratch/dataclass_with_docstring.py
  Time (mean ± σ):      26.7 ms ±   1.2 ms    [User: 23.2 ms, System: 3.7 ms]
  Range (min … max):    25.2 ms …  29.3 ms    25 runs
 
Benchmark 5: ./python scratch/dataclass_no_docstring.py
  Time (mean ± σ):      29.7 ms ±   2.2 ms    [User: 26.8 ms, System: 3.1 ms]
  Range (min … max):    27.4 ms …  36.0 ms    25 runs

Footnotes

  1. Is this documented anywhere?

  2. This also means -OO will always pull in dis.

JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue May 25, 2024
This is used only by ast.get_source_segment(), so it seems sensible to avoid importing it.
JelleZijlstra added a commit that referenced this issue May 28, 2024
This is used only by ast.get_source_segment(), so it seems sensible to avoid importing it.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage type-feature A feature request or enhancement
6 participants