
[Bug] The serialized model is larger than the 2GiB limit imposed by the protobuf library #2768

Open
maximefuchs opened this issue May 16, 2024 · 0 comments
Checklist

  • I have searched related issues but cannot get the expected help.
  • I have read the FAQ documentation but cannot get the expected help.
  • The bug has not been fixed in the latest version.

Describe the bug

I trained an MViT with mmaction2 and would like to deploy the trained model.
However, the following command:

python mmdeploy/tools/deploy.py mmdeploy/configs/mmaction/video-recognition/video-recognition_3d_tensorrt_static-224x224.py work_dirs/test_mvit_sequence/test_mvit_sequence.py  work_dirs/test_mvit_sequence/best_acc_top1_epoch_6.pth  mmpretrain/demo/demo.JPEG --work-dir work_dirs/test_mvit_sequence/output_trt --device cuda --dump-info

fails with the following error:

Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/pytorch2onnx.py", line 98, in torch2onnx
    export(
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/onnx/export.py", line 138, in export
    torch.onnx.export(
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 516, in export
    _export(
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1613, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/onnx/optimizer.py", line 27, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1139, in _model_to_graph
    graph = _optimize_graph(
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 677, in _optimize_graph
    graph = _C._jit_pass_onnx(graph, operator_export_type)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1957, in _run_symbolic_function
    return symbolic_fn(graph_context, *inputs, **attrs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/symbolic_opset9.py", line 7153, in onnx_placeholder
    return torch._C._jit_onnx_convert_pattern_from_subblock(block, node, env)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1957, in _run_symbolic_function
    return symbolic_fn(graph_context, *inputs, **attrs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/symbolic_opset11.py", line 236, in index_put
    broadcast_index_shape = g.op("Shape", index)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/_internal/jit_utils.py", line 87, in op
    return _add_op(self, opname, *raw_args, outputs=outputs, **kwargs)
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/_internal/jit_utils.py", line 246, in _add_op
    node = _create_node(
  File "/home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/torch/onnx/_internal/jit_utils.py", line 307, in _create_node
    _C._jit_pass_onnx_node_shape_type_inference(node, params_dict, opset_version)
RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library. Therefore the output file must be a file path, so that the ONNX external data can be written to the same directory. Please specify the output file name.
05/16 11:04:22 - mmengine - ERROR - /home/maxime/Documents/classification/.venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py - pop_mp_output - 80 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.
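For context, the "2GiB limit" in the error is protobuf's ceiling on a single serialized message (2**31 − 1 bytes), which a monolithic `.onnx` file inherits; export-time constant folding can also duplicate weights and push a graph over the limit even when the checkpoint itself is smaller. A rough stdlib-only sketch of the size check (the parameter counts below are illustrative assumptions, not measured from this model):

```python
# The "2GiB limit" in the error is protobuf's maximum serialized message
# size (2**31 - 1 bytes), which ONNX inherits for a single .onnx file.
PROTOBUF_LIMIT = 2**31 - 1  # bytes

def exceeds_protobuf_limit(num_params: int, bytes_per_param: int = 4) -> bool:
    """Rough check: do the weights alone (fp32 by default) cross the limit?

    Ignores graph/proto overhead and any weight duplication that export-time
    constant folding may introduce, so this is a lower bound on file size.
    """
    return num_params * bytes_per_param > PROTOBUF_LIMIT

# Illustrative numbers, not taken from this model:
print(exceeds_protobuf_limit(600_000_000))  # ~2.4 GB of fp32 weights -> True
print(exceeds_protobuf_limit(51_000_000))   # ~0.2 GB -> False
```

As the error message suggests, models over the limit must be serialized with ONNX external data (weights stored in sibling files next to the `.onnx` path) rather than as one protobuf message.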

Reproduction

This is the config file for the MViT, test_mvit_sequence.py:

_base_ = [
    "../mmaction2/configs/_base_/models/mvit_small.py",
    "../mmaction2/configs/_base_/default_runtime.py",
]
# dataset settings
classes = (
    "nothing",
    "Liver",
    "artefacts",
    "head",
    "true_negatif",
    "body",
    "other",
    "tail",
)
num_class = len(classes)
dataset_type = "RawframeDataset"
data_root = "/home/maxime/Documents/DATA/dataset_classifier_5fps/"
ann_file_train = "train.txt"
ann_file_val = "val.txt"
ann_file_test = "test.txt"
# hyperparameters
clip_len = 4  # 16 in former (Adrien) model
batch_size = 2
num_workers = 1
num_clips = 1

metainfo = dict(classes=classes)
model = dict(
    backbone=dict(
        arch="base",
        temporal_size=clip_len,
        drop_path_rate=0.3,
    ),
    data_preprocessor=dict(
        type="ActionDataPreprocessor",
        mean=[114.75, 114.75, 114.75],
        std=[57.375, 57.375, 57.375],
        blending=dict(
            type="RandomBatchAugment",
            augments=[
                dict(type="MixupBlending", alpha=0.8, num_classes=num_class),
                dict(type="CutmixBlending", alpha=1, num_classes=num_class),
            ],
        ),
        format_shape="NCTHW",
    ),
    cls_head=dict(num_classes=num_class),
)


train_pipeline = [
    dict(type="SampleFrames", clip_len=clip_len, frame_interval=1, num_clips=num_clips),
    dict(type="RawFrameDecode"),
    dict(type="Resize", scale=(-1, 256)),
    dict(type="RandomResizedCrop"),
    dict(type="Resize", scale=(224, 224), keep_ratio=False),
    dict(type="Flip", flip_ratio=0.5),
    dict(type="FormatShape", input_format="NCTHW"),
    dict(type="PackActionInputs"),
]
val_pipeline = [
    dict(
        type="SampleFrames",
        clip_len=clip_len,
        frame_interval=1,
        num_clips=num_clips,
        test_mode=True,
    ),
    dict(type="RawFrameDecode"),
    dict(type="Resize", scale=(-1, 256)),
    dict(type="CenterCrop", crop_size=224),
    dict(type="FormatShape", input_format="NCTHW"),
    dict(type="PackActionInputs"),
]
test_pipeline = [
    dict(
        type="SampleFrames",
        clip_len=clip_len,
        frame_interval=1,
        num_clips=25,
        test_mode=True,
    ),
    dict(type="RawFrameDecode"),
    dict(type="Resize", scale=(-1, 256)),
    dict(type="TenCrop", crop_size=224),
    dict(type="FormatShape", input_format="NCTHW"),
    dict(type="PackActionInputs"),
]

train_dataloader = dict(
    batch_size=batch_size,
    num_workers=num_workers,
    persistent_workers=True,
    sampler=dict(type="DefaultSampler", shuffle=True),
    dataset=dict(
        type=dataset_type,
        metainfo=metainfo,
        ann_file=data_root + ann_file_train,
        filename_tmpl="img_{:05}.png",  # id of images has to start at 1
        # modality="Flow",
        data_prefix=dict(img=data_root),
        pipeline=train_pipeline,
    ),
)
val_dataloader = dict(
    batch_size=batch_size,
    num_workers=num_workers,
    persistent_workers=True,
    sampler=dict(type="DefaultSampler", shuffle=False),
    dataset=dict(
        type=dataset_type,
        metainfo=metainfo,
        ann_file=data_root + ann_file_val,
        filename_tmpl="img_{:05}.png",  # id of images has to start at 1
        # modality="Flow",
        data_prefix=dict(img=data_root),
        pipeline=val_pipeline,
        test_mode=True,
    ),
)
test_dataloader = dict(
    batch_size=1,
    num_workers=num_workers,
    persistent_workers=True,
    sampler=dict(type="DefaultSampler", shuffle=False),
    dataset=dict(
        type=dataset_type,
        metainfo=metainfo,
        ann_file=data_root + ann_file_test,
        filename_tmpl="img_{:05}.png",  # id of images has to start at 1
        # modality="Flow",
        data_prefix=dict(img=data_root),
        pipeline=test_pipeline,
        test_mode=True,
    ),
)

val_evaluator = dict(type="AccMetric")
test_evaluator = val_evaluator

train_cfg = dict(
    type="EpochBasedTrainLoop", max_epochs=200, val_begin=1, val_interval=1
)
val_cfg = dict(type="ValLoop")
test_cfg = dict(type="TestLoop")

base_lr = 1.6e-3
optim_wrapper = dict(
    optimizer=dict(type="AdamW", lr=base_lr, betas=(0.9, 0.999), weight_decay=0.05),
    paramwise_cfg=dict(norm_decay_mult=0.0, bias_decay_mult=0.0),
    clip_grad=dict(max_norm=1, norm_type=2),
)

param_scheduler = [
    dict(
        type="LinearLR",
        start_factor=0.01,
        by_epoch=True,
        begin=0,
        end=30,
        convert_to_iter_based=True,
    ),
    dict(
        type="CosineAnnealingLR",
        T_max=200,
        eta_min=base_lr / 100,
        by_epoch=True,
        begin=30,
        end=200,
        convert_to_iter_based=True,
    ),
]

default_hooks = dict(
    checkpoint=dict(interval=1, max_keep_ckpts=5), logger=dict(interval=100)
)

# Default setting for scaling LR automatically
#   - `enable` means enable scaling LR automatically
#       or not by default.
#   - `base_batch_size` = (8 GPUs) x (8 samples per GPU).
auto_scale_lr = dict(enable=False, base_batch_size=256)
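For reference, the `param_scheduler` above chains a LinearLR warmup (epochs 0–30, `start_factor=0.01`) into cosine annealing. An epoch-level sketch of the resulting learning rate, under my simplifying assumption that the cosine half-period spans epochs 30–200 (ignoring `convert_to_iter_based` and the exact `T_max` semantics):

```python
import math

# Epoch-level sketch of the schedule above: LinearLR warmup then cosine decay.
# Assumption: the cosine half-period runs from warmup_end to max_epoch.
base_lr = 1.6e-3
warmup_end, max_epoch = 30, 200
eta_min = base_lr / 100

def lr_at(epoch: float) -> float:
    if epoch < warmup_end:
        # linear ramp from 0.01 * base_lr up to base_lr
        factor = 0.01 + (1.0 - 0.01) * epoch / warmup_end
        return base_lr * factor
    # cosine decay from base_lr down to eta_min
    t = (epoch - warmup_end) / (max_epoch - warmup_end)
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t)) / 2

print(lr_at(0))    # base_lr * 0.01 = 1.6e-05
print(lr_at(200))  # eta_min       = 1.6e-05
```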

And this is the deploy config, video-recognition_3d_tensorrt_static-224x224.py:

_base_ = ["./video-recognition_static.py", "../../_base_/backends/tensorrt.py"]

onnx_config = dict(input_shape=[224, 224])

backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 1, 3, 4, 224, 224],
                    opt_shape=[1, 1, 3, 4, 224, 224],
                    max_shape=[1, 1, 3, 4, 224, 224],
                )
            )
        )
    ],
)
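Reading the static shape above as (batch, num_clips, channels, clip_len, height, width) — my interpretation of mmaction2's NCTHW packing with an extra clip dimension, not something the config states — the input tensor itself is tiny, so the 2 GiB overflow comes from the serialized weights, not the input:

```python
from math import prod

# Static input shape from the TensorRT config above, interpreted as
# (batch, num_clips, channels, clip_len, height, width) -- an assumption.
shape = (1, 1, 3, 4, 224, 224)

n_elems = prod(shape)           # total elements in one input tensor
size_mib = n_elems * 4 / 2**20  # fp32 bytes -> MiB
print(n_elems, round(size_mib, 2))  # 602112 2.3
```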

Environment

05/16 11:25:12 - mmengine - INFO - 

05/16 11:25:12 - mmengine - INFO - **********Environmental information**********
05/16 11:25:13 - mmengine - INFO - sys.platform: linux
05/16 11:25:13 - mmengine - INFO - Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
05/16 11:25:13 - mmengine - INFO - CUDA available: True
05/16 11:25:13 - mmengine - INFO - MUSA available: False
05/16 11:25:13 - mmengine - INFO - numpy_random_seed: 2147483648
05/16 11:25:13 - mmengine - INFO - GPU 0: NVIDIA RTX A4000
05/16 11:25:13 - mmengine - INFO - CUDA_HOME: /usr
05/16 11:25:13 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.5, V11.5.119
05/16 11:25:13 - mmengine - INFO - GCC: x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
05/16 11:25:13 - mmengine - INFO - PyTorch: 2.2.0+cu121
05/16 11:25:13 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 

05/16 11:25:13 - mmengine - INFO - TorchVision: 0.17.0+cu121
05/16 11:25:13 - mmengine - INFO - OpenCV: 4.9.0
05/16 11:25:13 - mmengine - INFO - MMEngine: 0.10.4
05/16 11:25:13 - mmengine - INFO - MMCV: 2.2.0
05/16 11:25:13 - mmengine - INFO - MMCV Compiler: GCC 9.3
05/16 11:25:13 - mmengine - INFO - MMCV CUDA Compiler: 12.1
05/16 11:25:13 - mmengine - INFO - MMDeploy: 1.3.1+87395c5
05/16 11:25:13 - mmengine - INFO - 

05/16 11:25:13 - mmengine - INFO - **********Backend information**********
05/16 11:25:13 - mmengine - INFO - tensorrt:    8.5.2.2
05/16 11:25:13 - mmengine - INFO - tensorrt custom ops: Available
05/16 11:25:13 - mmengine - INFO - ONNXRuntime: 1.17.3
05/16 11:25:13 - mmengine - INFO - ONNXRuntime-gpu:     1.17.1
05/16 11:25:13 - mmengine - INFO - ONNXRuntime custom ops:      Available
05/16 11:25:13 - mmengine - INFO - pplnn:       None
05/16 11:25:13 - mmengine - INFO - ncnn:        None
05/16 11:25:13 - mmengine - INFO - snpe:        None
05/16 11:25:13 - mmengine - INFO - openvino:    None
05/16 11:25:13 - mmengine - INFO - torchscript: 2.2.0
05/16 11:25:13 - mmengine - INFO - torchscript custom ops:      NotAvailable
05/16 11:25:13 - mmengine - INFO - rknn-toolkit:        None
05/16 11:25:13 - mmengine - INFO - rknn-toolkit2:       None
05/16 11:25:13 - mmengine - INFO - ascend:      None
05/16 11:25:13 - mmengine - INFO - coreml:      None
05/16 11:25:13 - mmengine - INFO - tvm: None
05/16 11:25:13 - mmengine - INFO - vacc:        None
05/16 11:25:13 - mmengine - INFO - 

05/16 11:25:13 - mmengine - INFO - **********Codebase information**********
05/16 11:25:13 - mmengine - INFO - mmdet:       3.3.0
05/16 11:25:13 - mmengine - INFO - mmseg:       None
05/16 11:25:13 - mmengine - INFO - mmpretrain:  1.2.0
05/16 11:25:13 - mmengine - INFO - mmocr:       None
05/16 11:25:13 - mmengine - INFO - mmagic:      None
05/16 11:25:13 - mmengine - INFO - mmdet3d:     None
05/16 11:25:13 - mmengine - INFO - mmpose:      None
05/16 11:25:13 - mmengine - INFO - mmrotate:    None
05/16 11:25:13 - mmengine - INFO - mmaction:    1.2.0
05/16 11:25:13 - mmengine - INFO - mmrazor:     None
05/16 11:25:13 - mmengine - INFO - mmyolo:      None

Error traceback

No response
