Split up _testcapimodule.c #93649

Open
encukou opened this issue Jun 9, 2022 · 31 comments
Labels
tests Tests in the Lib/test dir topic-C-API

Comments

@encukou
Member

encukou commented Jun 9, 2022

Modules/_testcapimodule.c is a nearly-8000-line behemoth with no clear structure or organization. It is getting hard to maintain.

It also doesn't work well with testing (a) feature macros that affect Python.h and (b) module initialization, so we have additional C-API testing modules: _testmultiphase, _testimportmultiple, _testinternalcapi.
_testbuffer is already split out, but there are many other aspects of the API that could use a similar dedicated test suite.

We should split and combine these, ideally without polluting the namespace of top-level modules.

@encukou encukou added tests Tests in the Lib/test dir topic-C-API labels Jun 9, 2022
@arhadthedev
Member

> ideally without polluting the namespace of top-level modules

Can hypothetical Lib/_testcapimodule/*.c export their objects as test.capi.*?

@encukou
Member Author

encukou commented Jun 10, 2022

As far as I know they can, as long as test.capi itself is not an extension module. It would need some extra support in both make and Windows builds, though.

encukou added a commit to encukou/cpython that referenced this issue Jul 4, 2022
The _testcapimodule.c file is getting too large to work with effectively.
Vectorcall tests aren't the biggest issue -- it's just an area I want to work
on next, so I started there.
It does make it clear that MethodDescriptor2 is related to testing vectorcall,
which wasn't clear before (the /* Test PEP 590 */ section had an ambiguous end).

This PR lays out a general structure of how tests can be split up,
with more splitting to come later if the structure is OK.
@encukou
Member Author

encukou commented Jul 4, 2022

PR: #94549

@vstinner
Member

vstinner commented Jul 5, 2022

I tried to group the 299 attributes of the _testcapi extension module. I'm not
sure about each attribute: I only looked quickly at the name, and sometimes at
the implementation. But it should give a coarse idea of which "groups" make
sense and which do not.

Python/getargs.c and Python/modsupport.c (parse and build arguments) (51):
  • parse_tuple_and_keywords
  • getargs_B
  • getargs_C
  • getargs_D
  • getargs_H
  • getargs_I
  • getargs_K
  • getargs_L
  • getargs_S
  • getargs_U
  • getargs_Y
  • getargs_Z
  • getargs_Z_hash
  • getargs_b
  • getargs_c
  • getargs_d
  • getargs_es
  • getargs_es_hash
  • getargs_et
  • getargs_et_hash
  • getargs_f
  • getargs_h
  • getargs_i
  • getargs_k
  • getargs_keyword_only
  • getargs_keywords
  • getargs_l
  • getargs_n
  • getargs_p
  • getargs_positional_only_and_keywords
  • getargs_s
  • getargs_s_hash
  • getargs_s_hash_int
  • getargs_s_star
  • getargs_tuple
  • getargs_u
  • getargs_u_hash
  • getargs_w_star
  • getargs_y
  • getargs_y_hash
  • getargs_y_star
  • getargs_z
  • getargs_z_hash
  • getargs_z_star
  • test_L_code
  • test_k_code
  • test_s_code
  • test_buildvalue_N
  • test_buildvalue_issue38913
  • test_empty_argparse
  • argparsing
datetime, pytime.c (33):
  • PyDateTime_DATE_GET
  • PyDateTime_DELTA_GET
  • PyDateTime_GET
  • PyDateTime_TIME_GET
  • PyTime_AsMicroseconds
  • PyTime_AsMilliseconds
  • PyTime_AsSecondsDouble
  • PyTime_AsTimespec
  • PyTime_AsTimespec_clamp
  • PyTime_AsTimeval
  • PyTime_AsTimeval_clamp
  • PyTime_FromSeconds
  • PyTime_FromSecondsObject
  • datetime_check_date
  • datetime_check_datetime
  • datetime_check_delta
  • datetime_check_time
  • datetime_check_tzinfo
  • get_date_fromdate
  • get_date_fromtimestamp
  • get_datetime_fromdateandtime
  • get_datetime_fromdateandtimeandfold
  • get_datetime_fromtimestamp
  • get_delta_fromdsu
  • get_time_fromtime
  • get_time_fromtimeandfold
  • get_timezone_utc_capi
  • get_timezones_offset_zero
  • make_timezones_capi
  • pytime_object_to_time_t
  • pytime_object_to_timespec
  • pytime_object_to_timeval
  • test_datetime_capi
Types, metaclasses, type inheritance (wide category, not sure if it makes sense) (30):
  • _test_structmembersType
  • awaitType
  • instancemethod
  • ipowType
  • create_type_from_repeated_slots
  • matmulType
  • NullTpDocType
  • Generic
  • GenericAlias
  • MyList
  • ContainerNoGC
  • HeapCTypeMetaclass
  • HeapCTypeMetaclassCustomNew
  • HeapCTypeSetattr
  • HeapCTypeSubclass
  • HeapCTypeSubclassWithFinalizer
  • HeapCTypeWithBuffer
  • HeapCTypeWithDict
  • HeapCTypeWithNegativeDict
  • HeapCTypeWithWeakref
  • HeapDocCType
  • HeapGcCType
  • test_from_spec_invalid_metatype_inheritance
  • pytype_fromspec_meta
  • test_from_spec_metatype_inheritance
  • test_get_statictype_slots
  • test_get_type_name
  • test_get_type_qualname
  • test_lazy_hash_inheritance
  • test_type_from_ephemeral_spec
Function calls and spawning threads to call functions (25):
  • _test_thread_state
  • call_in_temporary_c_thread
  • get_kwargs
  • get_args
  • create_cfunction
  • pyobject_vectorcall
  • pyvectorcall_call
  • pyobject_fastcall
  • pyobject_fastcalldict
  • meth_fastcall
  • meth_fastcall_keywords
  • meth_noargs
  • meth_o
  • meth_varargs
  • meth_varargs_keywords
  • MethClass
  • MethInstance
  • MethStatic
  • MethodDescriptor2
  • MethodDescriptorBase
  • MethodDescriptorDerived
  • MethodDescriptorNopGet
  • return_null_without_error
  • return_result_with_error
  • stack_pointer
Unicode, codecs (12):
  • codec_incrementaldecoder
  • codec_incrementalencoder
  • test_unicode_compare_with_ascii
  • test_widechar
  • test_string_from_format
  • unicode_asucs4
  • unicode_asutf8
  • unicode_asutf8andsize
  • unicode_aswidechar
  • unicode_aswidecharstring
  • unicode_copycharacters
  • unicode_findchar
Memory allocators (17):
  • WITH_PYMALLOC
  • pyobject_malloc_without_gil
  • pymem_api_misuse
  • pymem_buffer_overflow
  • pymem_getallocatorsname
  • pymem_malloc_without_gil
  • test_pymem_alloc0
  • test_pymem_setallocators
  • test_pymem_setrawallocators
  • test_pyobject_new
  • test_pyobject_setallocators
  • set_nomemory
  • check_pyobject_forbidden_bytes_is_freed
  • check_pyobject_freed_is_freed
  • check_pyobject_null_is_freed
  • check_pyobject_uninitialized_is_freed
  • remove_mem_hooks
Exceptions, "error handling", C "errno" variable and signals (13):
  • RecursingInfinitelyError
  • raise_memoryerror
  • make_exception_with_doc
  • write_unraisable_exc
  • traceback_print
  • set_exception
  • raise_exception
  • set_exc_info
  • exception_print
  • fatal_error
  • error
  • set_errno
  • raise_SIGINT_then_send_None
C type limits and sizes (20):
  • CHAR_MAX
  • CHAR_MIN
  • INT_MAX
  • INT_MIN
  • LLONG_MAX
  • LLONG_MIN
  • LONG_MAX
  • LONG_MIN
  • PY_SSIZE_T_MAX
  • PY_SSIZE_T_MIN
  • SHRT_MAX
  • SHRT_MIN
  • UCHAR_MAX
  • UINT_MAX
  • ULLONG_MAX
  • ULONG_MAX
  • USHRT_MAX
  • SIZEOF_TIME_T
  • test_config
  • test_sizeof_c_types
Float, PyLongObject and PyNumber C API (16):
  • test_long_and_overflow
  • test_long_api
  • test_long_as_size_t
  • test_long_as_unsigned_long_long_mask
  • test_long_long_and_overflow
  • test_long_numbits
  • test_longlong_api
  • pynumber_tobase
  • DBL_MAX
  • DBL_MIN
  • FLT_MAX
  • FLT_MIN
  • float_pack
  • float_unpack
  • test_string_to_double
  • test_long_as_double
Docstring (10):
  • docstring_empty
  • docstring_no_signature
  • docstring_with_invalid_signature
  • docstring_with_invalid_signature2
  • docstring_with_signature
  • docstring_with_signature_and_extra_newlines
  • docstring_with_signature_but_no_doc
  • docstring_with_signature_with_defaults
  • no_docstring
  • test_with_docstring
Garbage collector (3):
  • test_gc_control
  • without_gc
  • with_tp_del
Tracemalloc (3):
  • tracemalloc_get_traceback
  • tracemalloc_track
  • tracemalloc_untrack
os module helpers (1):
  • W_STOPCODE
Subinterpreters (1):
  • run_in_subinterp
C API (57):
  • PyObject and PyVarObject C API (9):

    • negative_refcount
    • test_decref_doesnt_leak
    • test_incref_decref_API
    • test_incref_doesnt_leak
    • test_refcount_funcs
    • test_refcount_macros
    • test_xdecref_doesnt_leak
    • test_xincref_doesnt_leak
    • test_set_type_size
  • PyObject_Bytes(), PyObject_Repr(), PyObject_Str() (3):

    • pyobject_bytes_from_null
    • pyobject_repr_from_null
    • pyobject_str_from_null
  • PyTypeObject C API (2):

    • negative_dictoffset
    • type_get_version
  • C API macros (3):

    • test_macros
    • test_py_is_funcs
    • test_py_is_macros
  • PyCodeObject C API (2):

    • test_code_api
    • code_newempty
  • PyFrameObject C API (5):

    • frame_getbuiltins
    • frame_getgenerator
    • frame_getglobals
    • frame_getlasti
    • frame_getlocals
  • PyDictObject C API (3):

    • dict_get_version
    • dict_getitem_knownhash
    • test_dict_iteration
  • PyThreadState C API (2):

    • test_tstate_capi
    • crash_no_current_thread
  • PyListObject C API (1):

    • test_list_api
  • PySequence and PyMapping C API, PyObject_GetItem() (6):

    • sequence_getitem
    • get_mapping_items
    • get_mapping_keys
    • get_mapping_values
    • getitem_with_error
    • bad_get
  • C API "feature" macros like HAVE_FORK (1):

    • get_feature_macros
  • HAMT C API (1):

    • hamt
  • PyBuffer and memoryview C API (5):

    • getbuffer_with_null_view
    • make_memoryview_from_NULL_pointer
    • test_from_contiguous
    • PyBuffer_SizeFromFormat
    • test_pep3118_obsolete_write_locks
  • PyCapsule C API (1):

    • test_capsule
  • PyThread_tss C API (1):

    • test_pythread_tss_key_state
  • PyStructSeq C API (2):

    • test_structseq_newtype_doesnt_leak
    • test_structseq_newtype_null_descr_doc
  • PyEval_SetTrace C API (1):

    • settrace_to_record
  • PyMarshal C API (6):

    • pymarshal_read_last_object_from_file
    • pymarshal_read_long_from_file
    • pymarshal_read_object_from_file
    • pymarshal_read_short_from_file
    • pymarshal_write_long_to_file
    • pymarshal_write_object_to_file
  • Py_AddPendingCall C API (1):

    • _pending_threadfunc
  • Misc C API (2):

    • Py_Version
    • Py_CompileString
Misc (7):
  • __doc__
  • __file__
  • __loader__
  • __name__
  • __package__
  • __spec__
  • the_number_three
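
A rough first pass at a grouping like the one above can be automated by bucketing names on their prefix. The following is only a sketch: the prefix table and the `group_by_prefix` helper are hypothetical, and the sample names are a small subset copied from the list above.

```python
from collections import defaultdict

# Hypothetical prefix-to-group table; the groups mirror a few headings above.
PREFIX_GROUPS = [
    ("getargs_", "argument parsing"),
    ("PyTime_", "datetime/pytime"),
    ("datetime_", "datetime/pytime"),
    ("pymem_", "memory allocators"),
    ("docstring_", "docstring"),
    ("tracemalloc_", "tracemalloc"),
]


def group_by_prefix(names, prefix_groups=PREFIX_GROUPS):
    """Bucket names by the first matching prefix; unmatched names go to 'misc'."""
    groups = defaultdict(list)
    for name in sorted(names):
        for prefix, group in prefix_groups:
            if name.startswith(prefix):
                groups[group].append(name)
                break
        else:
            # No prefix matched: this name needs a manual look.
            groups["misc"].append(name)
    return dict(groups)


# A small sample of names taken from the list above.
SAMPLE = [
    "getargs_B", "getargs_s_star", "PyTime_AsTimeval", "datetime_check_date",
    "pymem_api_misuse", "docstring_empty", "tracemalloc_track", "the_number_three",
]

for group, members in group_by_prefix(SAMPLE).items():
    print(f"{group} ({len(members)}): {', '.join(members)}")
```

On a build where the extension is available, the same helper could be fed `dir(_testcapi)` instead of the sample. Names that match no prefix still have to be sorted by hand.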
@vstinner
Member

vstinner commented Jul 5, 2022

It might be interesting to split the large _testcapi module into multiple modules. But I like the idea of starting by splitting the long C file into multiple C files.
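
The "one module, several source files" layout can be modelled in a few lines of Python. This is only an analogy for the C-level structure, not the actual build: each `init_*` function stands in for one C file that adds its own attributes to the shared module object, and every name here (`init_vectorcall`, `vectorcall_demo`, `_testcapi_sketch`, and so on) is hypothetical.

```python
import types


def init_vectorcall(module):
    """Stand-in for one split-out source file; returns 0 on success."""
    module.vectorcall_demo = lambda func, *args: func(*args)
    return 0


def init_docstring(module):
    """Stand-in for a second split-out source file."""
    module.docstring_demo = lambda: None
    return 0


# The main file only needs to know the list of per-file init functions.
PARTS = [init_vectorcall, init_docstring]


def create_module(name="_testcapi_sketch"):
    """Create one module object and let every part register its attributes."""
    module = types.ModuleType(name)
    for init in PARTS:
        if init(module) < 0:
            raise RuntimeError(f"{init.__name__} failed")
    return module


mod = create_module()
print(sorted(n for n in vars(mod) if not n.startswith("__")))
```

The point of the layout is that callers still see a single module, while each group of tests lives in its own file and adding a group means adding one entry to the list of parts.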

miss-islington pushed a commit that referenced this issue Jul 8, 2022
The `_testcapimodule.c` file is getting too large to work with effectively.
This PR lays out a general structure of how tests can be split up, with more splitting to come later if the structure is OK.

Vectorcall tests aren't the biggest issue -- it's just an area I want to work on next, so I'm starting here.
An issue specific to vectorcall tests is that it wasn't clear that e.g. `MethodDescriptor2` is related to testing vectorcall: the `/* Test PEP 590 */` section had an ambiguous end. A separate file should make things like this much clearer.
OTOH, for some pieces it might not be clear where they should be -- I left `meth_fastcall` with tests of the other calling conventions. IMO, even with the ambiguity it's still worth it to split the huge file up.

I'm not sure about the buildsystem changes, hopefully CI will tell me what's wrong.

@vstinner, @markshannon: Do you think this is a good idea?

Automerge-Triggered-By: GH:encukou
@tiran
Member

tiran commented Jul 8, 2022

You do not need to include _testcapi/vectorcall.c in MODULE__TESTCAPI_DEPS. It is already listed as a dependency for _testcapi in the Setup file.

tiran added a commit to tiran/cpython that referenced this issue Jul 8, 2022
- header files are located in $(srcdir)
- dependencies must not list C files that are also in a makesetup Setup
  file
tiran added a commit that referenced this issue Jul 9, 2022
- header files are located in $(srcdir)
- dependencies must not list C files that are also in a makesetup Setup file
- generate SRCDIRS for OOT builds
@encukou
Member Author

encukou commented Jul 11, 2022

Thank you for the fix! I'll 🔨 test-with-buildbots next time I touch the buildsystem.
(I wonder why this issue didn't get the broken buildbot notification, though...)

@tiran
Member

tiran commented Jul 11, 2022

> Thank you for the fix! I'll hammer test-with-buildbots next time I touch the buildsystem. (I wonder why this issue didn't get the broken buildbot notification, though...)

We think that there was a problem in the buildbot hook, python/buildmaster-config#333

encukou added a commit to encukou/cpython that referenced this issue Jul 28, 2022
encukou added a commit that referenced this issue Aug 1, 2022
This removes the unused negative_dictoffset function:
the type this function would create is available as
    _testcapi.HeapCTypeWithNegativeDict
@philg314
Contributor

philg314 commented Aug 1, 2022

How should raiseTestError be handled?

static PyObject *TestError; /* set to exception object in init */
/* Raise TestError with test_name + ": " + msg, and return NULL. */
static PyObject *
raiseTestError(const char* test_name, const char* msg)
{
    PyErr_Format(TestError, "%s: %s", test_name, msg);
    return NULL;
}

Edit: For the Unicode tests I copied it and renamed it to _testcapi.unicode_error.

philg314 added a commit to philg314/cpython that referenced this issue Aug 2, 2022
@vstinner
Member

Is it time to close issue #78453 to continue the work here? cc @serhiy-storchaka

@serhiy-storchaka
Member

I moved the PyUnicode C API tests from test_unicode to test_capi.test_unicode, because I am going to cover the whole C API with tests, and I think that such a large chunk of tests is better in a separate file. test_unicode is large even without the C API tests.

@vstinner
Member

$ grep -c PyAPI_FUNC Include/unicodeobject.h Include/cpython/unicodeobject.h 
Include/unicodeobject.h:83
Include/cpython/unicodeobject.h:69

There are around 152 "PyUnicode" C API functions (83 + 69). That's a lot, given that Python 3.12 exports 939 public functions (and 354 private functions): stats on the C API.
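
For readers without the source tree at hand, the count that `grep -c` produces can be sketched in Python. `grep -c` counts matching lines, not individual occurrences, so the result is only an approximation of the number of exported functions. The `count_matching_lines` helper and the header excerpt below are hypothetical and for illustration only.

```python
def count_matching_lines(text, needle):
    """Count lines containing needle, as `grep -c` does (lines, not occurrences)."""
    return sum(1 for line in text.splitlines() if needle in line)


# Hypothetical header excerpt, only for illustration.
SAMPLE_HEADER = """\
PyAPI_FUNC(PyObject*) PyUnicode_FromString(const char *u);
PyAPI_FUNC(Py_ssize_t) PyUnicode_GetLength(PyObject *unicode);
/* not an exported function */
PyAPI_DATA(PyTypeObject) PyUnicode_Type;
"""

print(count_matching_lines(SAMPLE_HEADER, "PyAPI_FUNC"))
```

To reproduce the numbers above, the same helper could be applied to the contents of each header file and the two results added.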

$ wc -l Lib/test/test_{codecs,unicode}.py 
  3560 Lib/test/test_codecs.py
  2701 Lib/test/test_unicode.py
  6261 total

Currently, test_codecs is around 3 500 lines and test_unicode around 2 700 lines.

If we want to extend the coverage of the PyUnicode C API (which is what is being discussed here :-)), I agree that the Python parts of the PyUnicode C API tests can be moved to the newly added Lib/test/test_capi/test_unicode.py.
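
The Python half of a C API test has to cope with builds that do not ship the C test extension. A minimal sketch of that shape, assuming a plain `try`/`except ImportError` guard rather than any particular CPython test helper: the class name, the `run_sketch` function, and the choice of `INT_MIN`/`INT_MAX` (attributes listed earlier in this thread) are illustrative only, since the signatures of the PyUnicode helpers are not shown here.

```python
import io
import unittest

try:
    import _testcapi  # CPython's C test extension; often absent from installed builds
except ImportError:
    _testcapi = None

# Skip unless the extension is present and still exposes the limits used below.
HAVE_LIMITS = _testcapi is not None and hasattr(_testcapi, "INT_MAX")


@unittest.skipUnless(HAVE_LIMITS, "requires the _testcapi extension")
class CLimitsSketchTest(unittest.TestCase):
    def test_int_limits(self):
        # Illustrative check only: the C int range must straddle zero.
        self.assertLess(_testcapi.INT_MIN, 0)
        self.assertGreater(_testcapi.INT_MAX, 0)


def run_sketch():
    """Run the sketch test case quietly and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CLimitsSketchTest)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Calling `run_sketch()` reports a skip rather than a failure when the extension is missing, which is the behaviour a separate test file would want.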

@vstinner
Member

In PR #99613, I proposed to @serhiy-storchaka that we separate the "codecs" tests from the "unicode" tests. For me, "codecs" includes the PyCodec C API but also the "encode" and "decode" functions of the PyUnicode C API. What do you think?

@serhiy-storchaka
Member

I think this issue and issue #78453 are different. This issue is about reorganization of the C code, and #78453 is more about reorganization of the Python code.

shihai1991 added a commit to shihai1991/cpython that referenced this issue Dec 18, 2022
* origin/main: (1306 commits)
  Correct CVE-2020-10735 documentation (python#100306)
  pythongh-100272: Fix JSON serialization of OrderedDict (pythonGH-100273)
  pythongh-93649: Split tracemalloc tests from _testcapimodule.c (python#99551)
  Docs: Use `PY_VERSION_HEX` for version comparison (python#100179)
  pythongh-97909: Fix markup for `PyMethodDef` members (python#100089)
  pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
  pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
  pythonGH-98831: Add DECREF_INPUTS(), expanding to DECREF() each stack input (python#100205)
  pythongh-78707: deprecate passing >1 argument to `PurePath.[is_]relative_to()` (pythonGH-94469)
  pythongh-99540: Constant hash for _PyNone_Type to aid reproducibility (pythonGH-99541)
  pythongh-100039: enhance __signature__ to work with str and callables (pythonGH-100168)
  pythongh-99830: asyncio: Document returns of remove_{reader,writer} (python#100302)
  "Compound statement" docs: Fix with-statement step indexing (python#100286)
  pythonGH-90043: Handle NaNs in COMPARE_OP_FLOAT_JUMP (pythonGH-100278)
  Improve stats presentation for calls. (pythonGH-100274)
  Better stats for `LOAD_ATTR` and `STORE_ATTR` (pythonGH-100295)
  pythongh-81057: Move the Cached Parser Dummy Name to _PyRuntimeState (python#100277)
  Document that zipfile's pwd parameter is a `bytes` object (python#100209)
  pythongh-99767: mark `PyTypeObject.tp_watched` as internal use only in table (python#100271)
  Fix typo in introduction.rst (python#100266)
  ...
carljm added a commit to carljm/cpython that referenced this issue Dec 19, 2022
* main:
  pythongh-89727: Fix os.walk RecursionError on deep trees (python#99803)
  Docs: Don't upload CI artifacts (python#100330)
  pythongh-94912: Added marker for non-standard coroutine function detection (python#99247)
  Correct CVE-2020-10735 documentation (python#100306)
  pythongh-100272: Fix JSON serialization of OrderedDict (pythonGH-100273)
  pythongh-93649: Split tracemalloc tests from _testcapimodule.c (python#99551)
  Docs: Use `PY_VERSION_HEX` for version comparison (python#100179)
  pythongh-97909: Fix markup for `PyMethodDef` members (python#100089)
  pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
  pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
  pythonGH-98831: Add DECREF_INPUTS(), expanding to DECREF() each stack input (python#100205)
  pythongh-78707: deprecate passing >1 argument to `PurePath.[is_]relative_to()` (pythonGH-94469)
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 23, 2023
miss-islington pushed a commit that referenced this issue Feb 23, 2023
carljm added a commit to carljm/cpython that referenced this issue May 12, 2023
* main:
  pythongh-91896: Fixup some docs issues following ByteString deprecation (python#104422)
  pythonGH-104371: check return value of calling `mv.release` (python#104417)
  pythongh-104415: Fix refleak tests for `typing.ByteString` deprecation (python#104416)
  pythonGH-86275: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests (python#22863)
  pythonGH-103082: Filter LINE events in VM, to simplify tool implementation. (pythonGH-104387)
  pythongh-93649: Split gc- and allocation tests from _testcapimodule.c (pythonGH-104403)
  pythongh-104389: Add 'unused' keyword to Argument Clinic C converters (python#104390)
  pythongh-101819: Prepare _io._IOBase for module state (python#104386)
  pythongh-104413: Fix refleak when super attribute throws AttributeError (python#104414)
  Fix refleak in `super_descr_get` (python#104408)
  pythongh-87526: Remove dead initialization from _zoneinfo parse_abbr() (python#24700)
  pythongh-91896: Improve visibility of `ByteString` deprecation warnings (python#104294)
  pythongh-104371: Fix calls to `__release_buffer__` while an exception is active (python#104378)
  pythongh-104377: fix cell in comprehension that is free in outer scope (python#104394)
  pythongh-104392: Remove _paramspec_tvars from typing (python#104393)
  pythongh-104396: uuid.py to skip platform check for emscripten and wasi (pythongh-104397)
  pythongh-99108: Refresh HACL* from upstream (python#104401)
  pythongh-104301: Allow leading whitespace in disambiguated pdb statements (python#104342)
@vstinner
Member

The _testcapi extension was split into sub-tests: multiple extensions and a test_capi package with multiple tests.

The issue can now be closed, no?

@encukou
Member Author

encukou commented Aug 23, 2023

I wouldn't close it -- the module still has a lot of tests that can be split out.

@erlend-aasland
Contributor

I agree, there are still useful refactors to be made.

@vstinner
Member

See also issue gh-78453.

@vstinner
Member

I created a _testlimitedcapi extension to test the C API via the limited C API: see issue gh-116417 (completed).

7 participants