
I frequently use pdfimages to extract images from PDF files. Often the extracted image comes in two parts: the original image and a grey-scale mask that screens out the unwanted parts of the original image.

What I'm having a problem with is making the image look like it does in the pdf. I can always use the workaround of zooming in on the image and doing a screen capture, but that seems clumsy and doesn't work if there is text over the image.

However, I haven't been able to come up with a working alternative.

Here's a concrete example. I have a head shot of a person with a white/transparent background in the original pdf. The extracted image has a dark background.

The pdf's mask image is grey scale with the background part black and the person's shape white. The border between the two is feathered, a thin grey area between the black & white.

Using GIMP, I open the main image and add an alpha channel, since it doesn't seem to have one. Next I open the mask as a new layer and add an alpha channel to it too. Then I add a layer mask to it, with transparency set according to the grey-scale values of the mask image.

This leaves me with a dark background rather than a transparent one.

Any suggestions on how I can do this? Sorry if this is a newbie question, but I'm far from a graphics pro.

  • Thanks K J. I've edited my question to make it (hopefully) clearer. I note that the two .png files extracted by pdfimages don't seem to have alpha channels (GIMP has the option to add an alpha channel to them when I use "layers | transparency"). PDFshaper is a Windows-only product so it doesn't help me.
    – Gary Dale
    Commented Jan 12 at 13:05
  • Thanks again K J. It looks to me like the pdf's mask operates the normal way, with white being opaque and black being fully transparent. But how can I get GIMP to use that mask to render the background on the base image transparent? I don't think the fact that the images came from a pdf is relevant. You should be able to duplicate the issue with any colour image and any grey-scale shape that is part black and part white (e.g. add a black layer, then select an oval section and fill it with white). How can I use this to make the base image under the black transparent?
    – Gary Dale
    Commented Jan 12 at 15:44
  • Extracting the whole page would presumably leave me with the text, which could be over an image (fairly common). Moreover, I don't see any issue with the missing alpha channel, as one can easily be added in GIMP, as the current edit of the question shows. pdfimages already extracts the original image perfectly save for the alpha channel, which shouldn't be a problem. The issue seems to be that the mask is a separate PNG from the base image. Surely there is a way to use a PNG as a mask in GIMP?
    – Gary Dale
    Commented Jan 13 at 16:40

2 Answers

1

Found a way to make this work at https://graphicdesign.stackexchange.com/questions/8397/gimp-using-an-image-as-the-transparency-layer-of-another-image

To be clear, the exact steps are:

  1. Open the base image
  2. Add an alpha channel
  3. Add a layer mask - doesn't matter what option you select
  4. Open the mask image
  5. Add an alpha channel
  6. Copy the mask image
  7. In the main image, edit the layer mask
  8. Paste the mask image into the layer mask (e.g. CTRL+V)

This should work for any PDF images that use a mask.
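If you would rather script this than click through GIMP, the same idea can be sketched in Python. This is only a minimal sketch under some assumptions: it uses the Pillow library (not mentioned above), the function name apply_mask and the file names in the comments are hypothetical, and it assumes the mask follows the convention described in the question (white = opaque, black = transparent). It simply uses the grey-scale mask as the alpha channel of the base image.

```python
from PIL import Image


def apply_mask(base: Image.Image, mask: Image.Image) -> Image.Image:
    """Return an RGBA copy of `base` whose alpha channel is the grey-scale `mask`.

    White in the mask becomes fully opaque, black fully transparent, and the
    feathered grey border becomes partially transparent.
    """
    rgba = base.convert("RGBA")
    alpha = mask.convert("L")  # force single-channel grey-scale
    if alpha.size != rgba.size:
        # Assumption: if the mask was stored at a different resolution,
        # scaling it to the base image's size is acceptable.
        alpha = alpha.resize(rgba.size)
    rgba.putalpha(alpha)
    return rgba


# Example usage (hypothetical file names, e.g. as written by pdfimages -png):
#   result = apply_mask(Image.open("img-000.png"), Image.open("img-001.png"))
#   result.save("headshot.png")
```

Save the result as PNG (or another format that supports transparency); saving as JPEG would discard the alpha channel.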

0

You can try the hexapdf images command, which tries to combine the mask with the image before writing it out.

This command works in a similar manner to pdfimages. Please note, however, that not all kinds of images stored in a PDF are currently supported, but those that are usually output as PNG should work fine.
