15

I have some JPEG files in which, after a certain point, the colours are wrong and every pixel is shifted to the left. I think this is caused by missing bytes at the point where the image changes.

I tried to edit the file with vi, but it seems impossible to find out where the missing bytes are, and vi is very complicated to use. I also tried nano, but it's not binary-safe.

This is one of the images in question:

(image of the corrupted JPEG, not reproduced here)

So I want to ask you two questions:

  1. How can I repair such images in Linux?
  2. How could I safely open and edit the file in a binary text editor under Linux?

Edit: Using hexedit I discovered that positions 0x27F000 to 0x27F403 contain only 0xFF bytes (all bits set), and positions 0x27F404 to 0x27FFFF contain only 0x00 bytes.

The hex dump looks like this:

0027EFF0   F8 83 C3 E2  09 35 AF 13  44 6E C5 FD  C7 EF 23 E8  .....5..Dn....#.
0027F000   FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  ................
[...]
0027F400   FF FF FF FF  00 00 00 00  00 00 00 00  00 00 00 00  ................
[...]
0027FFF0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00280000   8F 39 6E 47  4F 43 5F 36  7C 73 66 F1  0D AE AD AF  .9nGOC_6|sf.....

By replacing these bytes with random values I was able to unshift the image, but the colour problem remains.
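
A damaged region like the one in the dump can also be located programmatically instead of by paging through a hex editor. The following is only a minimal Python sketch, not a repair tool: the function name `find_constant_runs`, the 512-byte threshold and the synthetic test data are assumptions made for illustration. It scans a byte string for long runs of a single fill byte (0xFF or 0x00) and reports where they start and end.

```python
def find_constant_runs(data, min_len=512, fill_values=(0x00, 0xFF)):
    """Return (start, end_exclusive, value) for every run of one fill byte
    that is at least min_len bytes long."""
    runs = []
    i = 0
    n = len(data)
    while i < n:
        value = data[i]
        j = i + 1
        # Extend the run while the same byte value repeats.
        while j < n and data[j] == value:
            j += 1
        if value in fill_values and j - i >= min_len:
            runs.append((i, j, value))
        i = j
    return runs


# Synthetic data laid out like the dump: 1028 bytes of 0xFF followed by
# 3068 bytes of 0x00 (4096 bytes in total), surrounded by ordinary bytes.
sample = bytes(range(1, 17)) + b"\xff" * 1028 + b"\x00" * 3068 + bytes(range(1, 17))

for start, end, value in find_constant_runs(sample):
    print(f"0x{start:08X}-0x{end - 1:08X}: {end - start} bytes of 0x{value:02X}")
```

For a real file, read it in binary mode first, for example `data = open(path, "rb").read()`, and pass `data` to the function. Short runs of 0xFF or 0x00 occur in healthy JPEG data too, which is why the threshold is fairly large.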

Could someone point me to some documentation about JPEG encoding so I can figure out how to know where an 8x8 block ends?

I'm wondering why the positions are so precise (0x27F000 to 0x27FFFF). Could this be a bug in my camera or in the memory card I used?

7
  • I had this happen once to a large set of images. I just ended up deleting them, a shame really. I'd be interested to know if you succeed in repairing these, you've got a tough job ahead of you.
    – dtmland
    Commented Jun 24, 2013 at 20:16
  • Yes, it's a shame. I'm trying to figure out how these JPEG files are encoded; it's just one 8x4000px line which I will have to delete. In this file exactly 4 KB are damaged out of 4.4 MB. That's less than 0.1%!
    – Falk
    Commented Jun 25, 2013 at 14:23
  • Did you ever find a solution for this, @Falk? I would love to know.
    Commented Apr 14, 2022 at 15:56
  • 1
    No, not for Linux. There are Windows tools though, like youtu.be/A33zn_sgm30
    Commented Nov 1, 2022 at 13:06
  • 1
    @joep-van-steen This is exactly what I was looking for (although not for Linux). If you write it as an answer I'll accept it. And mark my question as solved.
    – Falk
    Commented Nov 2, 2022 at 18:46

6 Answers

2

Unfortunately I am not aware of any Linux tools. This works in Windows: youtu.be/A33zn_sgm30

It can be a little finicky to get right, but it is doable. Disclosure: I am the author of this software, which I originally wrote for my own use; it can be found at www.disktuna.com.

There's also a free alternative at www.anderspedersen.net, but it cannot handle very severe corruption.

Since writing this answer I also found Jpeg Medic (also Windows only) for repairing JPEGs: https://www.jpegmedic.com

2

The Wikipedia article on JPEG entropy coding (http://en.wikipedia.org/wiki/JPEG#Entropy_coding) has a lot of information. The part most relevant to your current problem is this:

The previous quantized DC coefficient is used to predict the current quantized DC coefficient. The difference between the two is encoded rather than the actual value. The encoding of the 63 quantized AC coefficients does not use such prediction differencing.

The color shifting in the remainder of the image is caused by a single bad DC coefficient that cascades to all the rest. You might be able to find a small area (maybe one byte, maybe two - it's probably actually some sequence of bits) that reliably affects the colors, and try a large number of different values for that.
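
To make the cascade concrete, here is a toy Python sketch of differential DC coding. It is a simplification under stated assumptions: the function names and the sample values are made up, and it ignores Huffman coding, multiple colour components and restart intervals, all of which a real JPEG decoder has to deal with. It only shows why one wrong difference offsets every value that follows.

```python
def encode_dc(dc_values):
    """Store each DC value as the difference from the previous one.
    The predictor starts at 0, as it does at the start of a scan."""
    diffs, prev = [], 0
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs


def decode_dc(diffs):
    """Rebuild the DC values by accumulating the differences."""
    out, prev = [], 0
    for d in diffs:
        prev += d
        out.append(prev)
    return out


original = [50, 52, 51, 55, 54, 53]   # hypothetical DC values of six blocks
diffs = encode_dc(original)

damaged = list(diffs)
damaged[2] += 20                      # corrupt a single stored difference

decoded = decode_dc(damaged)
errors = [d - o for d, o in zip(decoded, original)]
print(errors)                         # [0, 0, 20, 20, 20, 20]
```

Every block from the damaged one onwards carries the same offset, which matches a uniform colour or brightness change in the rest of the image.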

It may be easier to simply fix the image in a graphics editor. The one you posted, apart from the shift (and wraparound), may simply have lower brightness, so you could select the affected area and use the Levels tool. For others with more involved color shifts, you might get a good enough result by decomposing the image into color channels (JPEGs may be in RGB or Y'CbCr) and fixing each channel separately, possibly swapping channels.
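
The per-channel idea can be sketched in a few lines of plain Python. This is a rough illustration, not what a Levels tool does internally: it assumes the damage is a constant offset in each channel, that the image has already been unshifted, and that the last good pixel row and the first bad one would normally look alike. The function names and pixel values are hypothetical.

```python
def channel_offsets(last_good_row, first_bad_row):
    """Per-channel difference between the mean of the last good row
    and the mean of the first bad row; rows are lists of (r, g, b)."""
    channels = len(last_good_row[0])
    return [
        sum(p[c] for p in last_good_row) / len(last_good_row)
        - sum(p[c] for p in first_bad_row) / len(first_bad_row)
        for c in range(channels)
    ]


def apply_offsets(rows, offsets):
    """Add the offsets to every pixel and clamp the result to 0..255."""
    return [
        [
            tuple(min(255, max(0, round(p[c] + offsets[c])))
                  for c in range(len(offsets)))
            for p in row
        ]
        for row in rows
    ]


good_row = [(100, 120, 140), (110, 130, 150)]   # last row above the glitch
bad_rows = [[(80, 125, 140), (90, 135, 150)]]   # rows below the glitch

offsets = channel_offsets(good_row, bad_rows[0])
fixed = apply_offsets(bad_rows, offsets)
print(offsets)   # [20.0, -5.0, 0.0]
print(fixed)     # [[(100, 120, 140), (110, 130, 150)]]
```

A constant offset is the simplest possible model; if the colors are scaled or mixed between channels, a per-channel offset will not be enough.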

EDIT: Oops, I didn't see how old your question was. Well, maybe this will be useful to you or someone else.

2
  • 1
    Many thanks. Don't worry about the age of the question; I still have the image, and some more.
    – Falk
    Commented Jan 2, 2015 at 22:49
  • I think it's a shame that the camera didn't anticipate such a situation; they should put some key blocks every n (say 32) rows of blocks. The other problem is that I don't even know whether a lossless compression such as Huffman is applied after the lossy one. I'd rather play around with some bytes than open the image in a graphics editor: first, most of them refuse to open these images; second, I don't think I could find the exact correction by playing around with sliders. @Random832, thanks, and please tell me if you know anything more.
    – Falk
    Commented Jan 2, 2015 at 23:09
1

2) How could I safely open and edit the file in a binary text editor under Linux?

Lots of great binary editors can be found here: https://stackoverflow.com/questions/839227/how-to-edit-binary-file-on-the-unix-systems

My personal favorites are vim with the :%!xxd hack, and hexedit.
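
If an interactive editor feels too heavy, a binary-safe edit can also be scripted. The following is a minimal Python sketch; the function name `patch_copy` and the demo file names are assumptions for illustration. It never touches the original: it copies the file and overwrites bytes at a given offset in the copy, without changing the file length.

```python
import os
import shutil
import tempfile


def patch_copy(src, dst, offset, new_bytes):
    """Copy src to dst, then overwrite len(new_bytes) bytes at offset in dst."""
    shutil.copyfile(src, dst)
    with open(dst, "r+b") as f:          # binary read/write, no truncation
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if offset < 0 or offset + len(new_bytes) > size:
            raise ValueError("patch would fall outside the file")
        f.seek(offset)
        f.write(new_bytes)


# Demo on a throwaway 8-byte file instead of a real image.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "broken.bin")
    dst = os.path.join(tmp, "patched.bin")
    with open(src, "wb") as f:
        f.write(b"\x00" * 8)

    patch_copy(src, dst, 2, b"\xab\xcd")

    with open(dst, "rb") as f:
        patched = f.read()
    with open(src, "rb") as f:
        untouched = f.read()

print(patched.hex(" "))    # 00 00 ab cd 00 00 00 00
print(untouched.hex(" "))  # 00 00 00 00 00 00 00 00
```

Opening the copy with mode "r+b" is what keeps the edit binary-safe: nothing is decoded as text and the bytes outside the patched range are left as they were.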

2
  • 2
    Ok, and something easier to use? like nano.
    – Falk
    Commented Jun 24, 2013 at 14:17
  • something easy like shed?
    – Attie
    Commented Oct 24, 2017 at 19:33
1

I wrote a short Python program that can automatically do the shift and color-correction, or at least make a best-effort attempt:

$ ./FixJpeg.py mNQdX.jpg

Opening mNQdX.jpg.
Found color corruption discontinuity at y=235.
Found image shift discontinuity at x=484.
Unshifted image.
Using 'AFFINE' mode to correct colors.
Found colorspace correction matrix:
array([[ 1.0936184 ,  0.11829611, -0.21800563,  0.10399387],
       [-0.11286427,  1.11502151,  0.03886265,  0.09807938],
       [ 0.22086667, -0.01595861,  1.09931108,  0.04676938],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])
Fixed colors.
Filled in missing pixels.
Saving to 'mNQdX.jpg.fixed.png'.
Saved!

Result:

(image of the repaired result, not reproduced here)

Compared to "JPEG-Repair Toolkit", this works on Linux and is fully automatic, so you can run it from a shell script, but it may be less accurate than what you could achieve by hand.

Compared to "JPEG Repair Shop", this has no awareness of the encoding structure of JPEG. It only works on the pixel color values.

Copy the code below and save it as "FixJpeg.py":

#!/usr/bin/env python3

"""
De-rotate and fix colors in a partially corrupted image file.

See: https://superuser.com/questions/611058/repair-broken-jpg-files
"""

from PIL import Image
import numpy as np
import argparse, textwrap, sys

def parser():
    _parser = argparse.ArgumentParser(
        formatter_class=type('ArgFormatter', (argparse.ArgumentDefaultsHelpFormatter, argparse.RawDescriptionHelpFormatter), {}),
        description=__doc__,
        epilog=textwrap.dedent("""
            Spatial distances are [0.0-1.0], relative to the image dimensions.
            Color values are normalized to [0.0-1.0].

            Copyright © 2024 Will Chen

            Usage of the works is permitted provided that this
            instrument is retained with the works, so that any entity
            that uses the works is notified of this instrument.

            DISCLAIMER: THE WORKS ARE WITHOUT WARRANTY.
        """)
    )
    _parser.add_argument('--glitch-vertical-search-radius', metavar='(0.0-1.0]', default=0.05, type=float,
                    help="Look for the glitch boundary within this distance.")
    _parser.add_argument('--glitch-horizontal-smooth-radius', metavar='[0.0-1.0]', default=0.03, type=float,
                    help="Smooth out the discovered glitch boundary.")
    _parser.add_argument('--color-low-quantile', metavar='[0.0-1.0]', default=0.1, type=float,
                    help="With --color-fix-mode RGB_*, fix color using this low quantile in last uncorrupted pixel row.")
    _parser.add_argument('--color-middle-quantile', metavar='[0.0-1.0]', default=0.5, type=float,
                    help="With --color-fix-mode RGB_*, fix color using this middle quantile in last uncorrupted pixel row.")
    _parser.add_argument('--color-high-quantile', metavar='[0.0-1.0]', default=0.9, type=float,
                    help="With --color-fix-mode RGB_*, fix color using this high quantile in last uncorrupted pixel row.")
    _parser.add_argument('--color-check-hsmear', metavar='[0.0-1.0]', default=0.02, type=float,
                    help="Smooth out reference values for fixing colors.")
    _parser.add_argument('--color-match-dist', metavar='[0.0-1.0]', default=0.015, type=float,
                    help="With --color-fix-mode RGB_*, sample all pixels within this difference when computing the inverse transform.")
    _parser.add_argument('--color-check-degamma', metavar='~', default=1.0, type=float,
                    help="With --color-fix-mode AFFINE, account for a naïve exponential gamma transform when fixing colors.")
    _parser.add_argument('--color-fix-mode', default='AFFINE', choices=['AFFINE', 'RGB_EXPO', 'RGB_LINEAR'],
                    help="Algorithm to use for fixing colors.")
    _parser.add_argument('--color-space', default='RGB', choices=['RGB', 'YCbCr', 'LAB', 'HSV'],
                    help="Image colorspace to perform all operations in.")
    _parser.add_argument('-o', '--outfile', default=None,
                    help="Manually specify output file. This also enables clobbering.")
    _parser.add_argument('-n', '--dont-save', action='store_true',
                    help="Don't save the fixed image file.")
    _parser.add_argument('--show', action='store_true',
                    help="Open the fixed image.")
    _parser.add_argument('--debug', action='store_true',
                    help="Expose internal data at the Python module level. This is intended to be used with `python3 -i`.")
    _parser.add_argument('filename',
                    help="Image filepath to fix.")
    return _parser


def report(*a, **kw):
    print(*a, file=sys.stderr, **kw)

def colordist(a1, a2):
    return np.sqrt(np.sum(np.fabs(a1 - a2) ** 2, -1))

def extrude(a):
    return np.repeat(a[:,:,np.newaxis], 3, 2)
def h_smear(a, radius, *, axis=1):
    out = np.zeros_like(a)
    for x_offset in range(-radius, radius+1):
        out += np.roll(a, x_offset, axis)
    return out / (radius*2+1)
def clamp(a):
    return np.maximum(np.minimum(a, 1), 0)

def merge(a1, a2):
    return np.where(np.logical_not(np.isnan(a1)), a1, np.where(np.logical_not(np.isnan(a2)), a2, np.nan))
def v_infill(a):
    return np.where(np.isnan(a), (np.roll(a, 1, 0) + np.roll(a, -1, 0)) / 2, a)


def fix_image(arr: np.ndarray[float], ARGS: object):
    a_vdist = np.pad(colordist(arr[:-1],  arr[1:]), ((0,1), (0,0)))

    a_row = lambda: np.repeat(np.arange(arr.shape[0])[:,np.newaxis], arr.shape[1], 1)

    sharpest_horizontal_y = np.argmax(np.sum(a_vdist, 1))
    report(f"Found color corruption discontinuity at y={sharpest_horizontal_y}.")
    a_glitchzone = clamp(1 - (np.fabs(a_row() - sharpest_horizontal_y) / (arr.shape[0] * ARGS.glitch_vertical_search_radius)))
    glitch_y = np.argmax(h_smear(a_vdist, round(arr.shape[1] * ARGS.glitch_horizontal_smooth_radius)) * a_glitchzone, 0)
    del a_glitchzone
    a_glitch_ys = np.tile(glitch_y, arr.shape[0]).reshape(a_vdist.shape)

    glitched_bottom = arr[sharpest_horizontal_y:]
    a_hdist = colordist(glitched_bottom, np.roll(glitched_bottom, 1, 1))
    rotation = np.argmax(np.sum(a_hdist, 0))
    report(f"Found image shift discontinuity at x={rotation}.")

    a_output_top = np.where(extrude(a_row() <= a_glitch_ys), arr, np.nan)
    a_output_bottom = np.where(extrude(a_row() > a_glitch_ys), arr, np.nan)
    a_output_bottom = np.roll(a_output_bottom, -rotation, 1)
    report(f"Unshifted image.")

    report(f"Using {ARGS.color_fix_mode!r} mode to correct colors.")

    if ARGS.color_fix_mode == 'AFFINE':
        try:
            from transforms3d import _gohlketransforms as ghk
        except ModuleNotFoundError as e:
            report("ERROR: Please install transforms3d in order to use affine transform mode.")
            raise e
        else:
            last_good_row = h_smear(np.sum(extrude((a_row() == a_glitch_ys)) * arr, 0) ** (1/ARGS.color_check_degamma), round(arr.shape[1] * ARGS.color_check_hsmear), axis=0)
            first_bad_row = h_smear(np.roll(np.sum(extrude((a_row() == a_glitch_ys + 1)) * arr, 0), -rotation, 0) ** (1/ARGS.color_check_degamma), round(arr.shape[1] * ARGS.color_check_hsmear), axis=0)
            correction_matrix = ghk.affine_matrix_from_points(first_bad_row.transpose(), last_good_row.transpose(), shear=False, scale=True, usesvd=True)
            report(f"Found colorspace correction matrix:\n{correction_matrix!r}")
            def fixelstream(rgbs):
                bad_pixels = np.pad(rgbs.transpose(), ((0,1),(0,0)), mode='constant', constant_values=1)
                fixed_pixels = correction_matrix @ bad_pixels
                return fixed_pixels[:3].transpose()
            a_output_bottom = fixelstream((a_output_bottom**(1/ARGS.color_check_degamma)).reshape([a_output_bottom.shape[0] * a_output_bottom.shape[1], a_output_bottom.shape[2]])).reshape(a_output_bottom.shape)**ARGS.color_check_degamma
    else:
        last_good_row = h_smear(np.sum(extrude((a_row() == a_glitch_ys)) * arr, 0), round(arr.shape[1] * ARGS.color_check_hsmear), axis=0)
        first_bad_row = h_smear(np.roll(np.sum(extrude((a_row() == a_glitch_ys + 1)) * arr, 0), -rotation, 0), round(arr.shape[1] * ARGS.color_check_hsmear), axis=0)

        color_targets = np.array([np.quantile(last_good_row[:,i], [ARGS.color_low_quantile, ARGS.color_middle_quantile, ARGS.color_high_quantile]) for i in range(arr.shape[2])]).transpose()
        color_target_a_pixels = np.array([np.isclose(last_good_row, quantile, atol=ARGS.color_match_dist) for quantile in color_targets])
        color_corrupteds = np.array([np.where(quantile, first_bad_row, np.nan) for quantile in color_target_a_pixels])
        color_corrupteds = np.nanmean(color_corrupteds, 1)
        _minus = color_corrupteds[0]
        _div = (color_corrupteds[2]-color_corrupteds[0])
        _expo = np.log(color_targets[1]) / np.log(color_corrupteds[1]) if ARGS.color_fix_mode == 'RGB_EXPO' else 1.0
        _times = (color_targets[2]-color_targets[0])
        _plus = color_targets[0]
        report(f"Found colorspace correction values:\n\t-= {_minus}\n\t/= {_div}\n\t**={_expo}\n\t*= {_times}\n\t+= {_plus}")
        if ARGS.color_fix_mode in ('RGB_EXPO', 'RGB_LINEAR'):
            a_output_bottom -= _minus
            a_output_bottom /= _div
            a_output_bottom **= _expo
            a_output_bottom *= _times
            a_output_bottom += _plus
        elif ARGS.color_fix_mode == 'RGB_CIRC':
            raise NotImplementedError
        else:
            raise ValueError(ARGS.color_fix_mode)

    report(f"Fixed colors.")

    a_output = merge(a_output_top, a_output_bottom)
    a_output = v_infill(a_output)
    report(f"Filled in missing pixels.")

    if ARGS.debug:
        global r
        r = lambda: None
        report(f"Stored internal variables in the module variable `r`. Inspect in an interactive Python interpreter.")
        r.__dict__.update(**locals())

    return a_output


def fix_file(image: str, ARGS: object):
    report(f"Opening {image}.")
    with Image.open(image) as img:
        orig_mode = img.mode
        if img.mode != ARGS.color_space:
            from PIL import ImageMode
            assert ImageMode.getmode(ARGS.color_space).typestr == '|u1'
            assert len(ImageMode.getmode(ARGS.color_space).bands) == 3
            report(f"Converting from {img.mode!r} to {ARGS.color_space!r}.")
            img = img.convert(ARGS.color_space)
        arr = np.array(img).astype(np.float32) / 255
    img = Image.fromarray(np.round(clamp(fix_image(arr, ARGS)) * 255).astype(np.uint8), ARGS.color_space)
    if img.mode != orig_mode:
        report(f"Converting from {img.mode!r} to {orig_mode!r}.")
        img = img.convert(orig_mode)
    return img


if __name__ == '__main__':
    ARGS = parser().parse_args()
    img = fix_file(ARGS.filename, ARGS)
    if ARGS.dont_save:
        report(f"Skipping saving image.")
    else:
        if ARGS.outfile is None:
            import os
            outfile = f'{ARGS.filename!s}.fixed.png'
            i = 0
            while os.path.exists(outfile):
                i += 1
                outfile = f'{ARGS.filename!s}.fixed-{i!s}.png'
        else:
            outfile = ARGS.outfile
        report(f"Saving to {outfile!r}.")
        img.save(outfile)
        report(f"Saved!")
    if ARGS.show:
        img.show()

Requires:

  • numpy.
  • PIL/Pillow.
  • transforms3d (Optional, for 'AFFINE' colour repair mode).

$ pip3 install --user numpy Pillow transforms3d
1
0

Have you tried photorec? You can install it on Ubuntu like this:

sudo apt-get install testdisk

Check the manual with:

man photorec

and just run photorec from the terminal like so:

photorec

It will ask you to select the source and a destination and try to recover jpg files automatically.

To prevent damaging the original, I recommend making a copy with the dd command. Good luck!

7
  • 4
    Hi, photorec is designed to recover files from a corrupted filesystem, in my case the filesystem is fine, but the image is corrupted so it's a totally different situation.
    – Falk
    Commented Oct 24, 2017 at 22:27
  • 5
    Hi, as I wrote earlier there is nothing wrong with the filesystem, it's the JPEG file which has the error, thanks anyway.
    – Falk
    Commented Oct 25, 2017 at 13:03
  • 2
    I know how photorec works, and it's not what I need; it would just copy the picture as it is without repairing it.
    – Falk
    Commented Nov 1, 2017 at 11:10
  • 1
    I'm just trying to help. You wrote "would", so you didn't try? I actually ran it on an SD card, and yes, it copies them, but it fixed them too. Can't hurt to try, right? If it's not worth trying, fine, good luck. All I am saying is that it did repair it for me. I will not respond any further to avoid endless discussion.
    Commented Nov 1, 2017 at 15:24
  • 4
    I won't try it because photorec is meant to be run with a device as a parameter, and the JPEG file I have is on my HDD, which is perfectly fine. It's just a different type of situation. As I said, thanks anyway.
    – Falk
    Commented Nov 1, 2017 at 17:48
-2

I just used Photorec to recover pictures from an SD card that had become corrupted. Although it did not recover all files, it did a great job recovering a good number of them. That said, MP4 videos were recovered but could not be opened. Some JPEG files were recovered but either couldn't be viewed or were badly damaged, like the sample provided at the beginning of this thread. Photorec did not fix them.

Bottom line: Photorec is designed to retrieve lost files from corrupt FILE SYSTEMS, but apparently does nothing to repair the content of corrupted FILES.

3
  • Hi, Wander, I know photorec, but it's not what I'm looking for. It won't fix corrupted files.
    – Falk
    Commented Oct 23, 2019 at 19:01
  • 1
    Your "Bottom line" is completely wrong. Please read: cgsecurity.org/wiki/PhotoRec#How_PhotoRec_works JPEG files can be corrupted without the underlying file system being corrupted. Fixing broken files is beyond the scope of recovery programs, which try to recover information from a disk. What you expect recovery software to do is a separate matter from recovery, and it's such a demanding task that the software market offers separate commercial solutions for it, such as Joep van Steen's www.disktuna.com.
    – r2d3
    Commented Nov 2, 2022 at 22:51
  • You can write repair software for every file format if you want to but writing something just for one file format is a hell of a task.
    – r2d3
    Commented Nov 2, 2022 at 22:55
