
Currently, I am trying to apply H.264 compression to a single image.
I want to compress that image with H.264 using a constant quantization parameter of 40.

However, I cannot find the relevant functionality anywhere, including Python-based libraries (OpenCV, FFmpeg).

I also could not find a GitHub project that applies H.264 to a single image, or a well-structured H.264 implementation.

So, is there any GitHub implementation or library to do this?

Thanks in advance.

  • Why? h.264 "is a block-oriented motion-compensation-based video compression standard". Your image doesn't have motion (frames before/after), which is required for h.264 compression to function...
    – sorifiend
    Commented Apr 28, 2021 at 8:40
  • @sorifiend I see. The H.264 codec is for video compression, not for a single image.
    – JaeJu
    Commented Apr 28, 2021 at 8:52
  • For what it's worth, the so-called WebP still image format is basically a single-frame VP8 video.
    – O. Jones
    Commented May 3, 2021 at 14:54

2 Answers


There are cases (such as academic purposes) in which it does make sense to encode a single-frame video.

There is an existing image format named HEIF that is based on the HEVC (H.265) codec, but for H.264, you must encode a video file.

You may use FFmpeg command line tool for creating a single frame video file:

ffmpeg -i input.png -vcodec libx264 -qp 40 -pix_fmt yuv420p single_frame.mp4

(Assuming your input image is input.png).

The -qp 40 argument applies a constant quantization parameter of 40.

qp (qp)
Set constant quantization rate control method parameter

See: libx264 documentation

Note:
Since it's a single frame, the frame type is going to be an I-Frame.
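If you want to verify that, a small sketch using ffprobe (which ships alongside FFmpeg) can list the picture types of a file; the helper name frame_types is my own:

```python
import json
import subprocess

def frame_types(path):
    """Return the picture type (I/P/B) of each video frame in the file.

    Hypothetical helper: assumes the ffprobe binary is on PATH.
    """
    out = subprocess.run(
        ['ffprobe', '-v', 'error', '-select_streams', 'v:0',
         '-show_frames', '-show_entries', 'frame=pict_type',
         '-of', 'json', path],                  # emit frame metadata as JSON
        stdout=subprocess.PIPE, check=True).stdout
    return [f['pict_type'] for f in json.loads(out)['frames']]

# For the single-frame file created above:
# frame_types('single_frame.mp4')  →  ['I']
```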


In case you prefer elementary H.264 stream (without MP4 container), you may use:

ffmpeg -i input.png -vcodec libx264 -qp 40 -pix_fmt yuv420p single_frame.264

There are Python bindings for FFmpeg, such as ffmpeg-python.
You may use these for encoding a NumPy array that represents a raw video frame.
You may also use a pipe to an FFmpeg sub-process, as in the following sample.
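Here is a minimal sketch of that pipe approach; the helper name encode_frame_h264 is my own, and it assumes an ffmpeg binary built with libx264 is on PATH:

```python
import subprocess

import numpy as np

def encode_frame_h264(frame, qp=40):
    """Encode one raw BGR frame into an elementary H.264 stream via an FFmpeg pipe.

    Hypothetical helper: assumes ffmpeg (with libx264) is on PATH.
    """
    h, w = frame.shape[:2]
    cmd = [
        'ffmpeg', '-y',
        '-f', 'rawvideo', '-pix_fmt', 'bgr24', '-s', f'{w}x{h}',  # describe the raw input
        '-i', 'pipe:0',                                           # read frame bytes from stdin
        '-vcodec', 'libx264', '-qp', str(qp), '-pix_fmt', 'yuv420p',
        '-f', 'h264', 'pipe:1',                                   # raw H.264 stream to stdout
    ]
    result = subprocess.run(cmd, input=frame.tobytes(),
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode != 0:
        raise RuntimeError(result.stderr.decode())
    return result.stdout

# Example: encode a 64x64 black frame at QP 40
stream = encode_frame_h264(np.zeros((64, 64, 3), dtype=np.uint8), qp=40)
```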


You can find source code for H.264 encoder in GitHub (here for example).
As far as I know, the sources are implemented in C.

Without looking at the sources, I don't think you are going to consider them as "well-structured".

  • Thanks, expert. I'm sorry if my words sounded a little harsh :))
    – JaeJu
    Commented Apr 29, 2021 at 7:57
  • FWIW: lower values of qp generate more encoded data (larger compressed files). Higher values generate pictures with more artifacts.
    – O. Jones
    Commented May 3, 2021 at 14:58

There aren't many reasons why you would want to do H.264 encoding on a single image; JPEG is probably what you want. But I had a project where I was building a super-resolution model for videos, and the video compression artifacts kept getting enhanced as if they were actual edges. So I had to come up with a way to do H.264 video encoding on a single image, to train the model to fix the artifacts rather than enhance them. (JPEG and H.264 artifacts are very different.)

compress_image_h264 takes a NumPy array image (h, w, c) and an amount of compression (25-50 recommended) and outputs a compressed version.
The result is copied before returning because the decoded buffer came back as a "non writeable" array, and copying fixed that.

import subprocess

import cv2
import numpy as np


def compress_image_h264(image, amount):
    # Encode the image to PNG format
    _, png_data = cv2.imencode('.png', image)

    # Use ffmpeg to compress the image using H.264 codec and MKV container
    ffmpeg_command = [
        'ffmpeg',
        '-y',                        # Overwrite output files without asking
        '-i', 'pipe:0',              # Input from stdin
        '-vcodec', 'libx264',        # Use H.264 codec
        '-qp', str(amount),          # Quality parameter
        '-pix_fmt', 'yuv420p',       # Pixel format
        '-f', 'matroska',            # Use MKV container
        'pipe:1'                     # Output to stdout
    ]

    result = subprocess.run(
        ffmpeg_command,
        input=png_data.tobytes(),    # Pass PNG data to stdin
        stdout=subprocess.PIPE,      # Capture stdout
        stderr=subprocess.PIPE       # Capture stderr for debugging
    )

    if result.returncode != 0:
        print("FFmpeg error during compression:", result.stderr.decode())
        raise RuntimeError("FFmpeg compression failed")

    # Get the compressed data from stdout
    compressed_data = result.stdout

    return np.copy(decompress_image_h264(compressed_data, image.shape[1], image.shape[0]))


def decompress_image_h264(compressed_data, width, height):
    # Use ffmpeg to decompress the image from H.264 to raw format
    ffmpeg_command = [
        'ffmpeg',
        '-i', 'pipe:0',              # Input from stdin
        '-f', 'rawvideo',            # Output raw video format
        '-pix_fmt', 'bgr24',         # Pixel format
        'pipe:1'                     # Output to stdout
    ]

    result = subprocess.run(
        ffmpeg_command,
        input=compressed_data,       # Pass compressed data to stdin
        stdout=subprocess.PIPE,      # Capture stdout
        stderr=subprocess.PIPE       # Capture stderr for debugging
    )

    if result.returncode != 0:
        print("FFmpeg error during decompression:", result.stderr.decode())
        raise RuntimeError("FFmpeg decompression failed")

    # Get the raw image data from stdout
    raw_image_data = result.stdout

    # Ensure we have enough data to reshape into the desired format
    expected_size = width * height * 3
    if len(raw_image_data) != expected_size:
        print("Unexpected raw image data size:", len(raw_image_data))
        raise ValueError(f"Cannot reshape array of size {len(raw_image_data)} into shape ({height},{width},3)")

    # Convert the raw data to a numpy array
    frame = np.frombuffer(raw_image_data, dtype=np.uint8).reshape((height, width, 3))

    return frame
