
I have strings of the form ...format=<format_type>... where the legal values of format_type are

image/{png,jpeg,tiff} or {kmz,kml}

I want to match any string with an illegal format type. For example,

foo&bar&format=image/png and foo&bar&format=kml&baz

should not match, but

foo&bar&format=image/svg and foo&bar&format=application/pdf&baz

should.

I've tried .*format=(image\/)?.*(?!=(kml|kmz|png|jpeg|tiff)).* but this doesn't work.

  • Are you trying to match only the format (i.e. image/svg) or the entire text (i.e. foo&bar&format=image/svg)?
    – Tom
    Commented Jan 9, 2013 at 21:17
  • @Tom: I want to match the entire string.
    – cdk
    Commented Jan 9, 2013 at 21:32

2 Answers

3

No doubt there's a regex that matches any illegal format, but writing one that matches the legal formats looks easier. A quick workaround is to look for strings that don't match the legal pattern, rather than strings that match an illegal pattern.

So instead of

if ($str =~ m/ ...illegal pattern... /) { ... }

you could use either of

if (not ($str =~ m/ ...legal pattern... /)) { ... }
unless ($str =~ m/ ...legal pattern... /) { ... }

So you get:

if (not ($str =~ m/^.*format=(image\/(png|jpeg|tiff)|kmz|kml).*$/)) { ... }
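A minimal sketch of the same "negate the legal pattern" idea in Java. The class and method names are hypothetical, and the trailing (?:&|$) is an added assumption that a format value ends at the next & or at the end of the string. Note that with this approach a string containing no format= at all is also reported as illegal.

```java
import java.util.regex.Pattern;

class LegalFormatCheck {
    // Legal pattern: "format=" followed by one of the allowed types.
    // Assumption (not in the original regex): the value must end at '&' or end of input.
    private static final Pattern LEGAL =
        Pattern.compile("format=(?:image/(?:png|jpeg|tiff)|kmz|kml)(?:&|$)");

    // True when the string does NOT contain a legal format parameter.
    static boolean hasIllegalFormat(String s) {
        return !LEGAL.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "foo&bar&format=image/png", "foo&bar&format=kml&baz",
            "foo&bar&format=image/svg", "foo&bar&format=application/pdf&baz"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + (hasIllegalFormat(s) ? "illegal" : "legal"));
        }
    }
}
```

Negating a positive match keeps the regex itself simple; the trade-off is that "no format parameter" and "bad format parameter" are not distinguished.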
  • You could also use the does-not-match operator !~ instead of not.
    Commented Jan 9, 2013 at 21:46
  • Didn't know that operator existed, learned something new, thanks!
    – Sander
    Commented Jan 9, 2013 at 21:47
2

I don't have a Perl interpreter handy, but this seemed to work in Java:

^.*format=(?!(?:image/)?(?:kml|kmz|png|jpeg|tiff)).*$

Here's the snippet that tests it:

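import java.util.Arrays;
import java.util.regex.Pattern;

// The class name is arbitrary; any name works.
public class FormatRegexTest {
    // Matches only when "format=" is NOT followed by one of the legal types.
    private static final Pattern REGEX =
        Pattern.compile("^.*format=(?!(?:image/)?(?:kml|kmz|png|jpeg|tiff)).*$");

    public static void main(String[] args) {
        for (String format : Arrays.asList("foo&bar&format=image/png",
                "foo&bar&format=kml&baz", "foo&bar&format=image/svg",
                "foo&bar&format=application/pdf&baz")) {
            System.out.printf("%s %s%n", format,
                REGEX.matcher(format).matches() ? "matches" : "does not match");
        }
    }
}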

Prints:

foo&bar&format=image/png does not match
foo&bar&format=kml&baz does not match
foo&bar&format=image/svg matches
foo&bar&format=application/pdf&baz matches
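One caveat: the lookahead above is a little permissive, since it also treats values such as image/kml, a bare png, or kmlx as legal. A stricter variant might look like the sketch below. The class and method names are hypothetical, and the (?:&|$) boundary assumes a format value ends at the next & or at the end of the string.

```java
import java.util.regex.Pattern;

class StrictFormatRegex {
    // Negative lookahead over the exact legal values only:
    // image/png, image/jpeg, image/tiff, kmz, kml.
    // Assumption: a value is terminated by '&' or the end of the string.
    private static final Pattern ILLEGAL = Pattern.compile(
        "^.*format=(?!(?:image/(?:png|jpeg|tiff)|kmz|kml)(?:&|$)).*$");

    // True when the whole string matches, i.e. some format= value is not legal.
    static boolean isIllegal(String s) {
        return ILLEGAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "foo&bar&format=image/png", "foo&bar&format=kml&baz",
            "foo&bar&format=image/svg", "foo&bar&format=image/kml",
            "foo&bar&format=png"
        };
        for (String s : samples) {
            System.out.println(s + " " + (isIllegal(s) ? "matches" : "does not match"));
        }
    }
}
```

Unlike negating a legal pattern, this regex does not match strings that have no format= parameter at all.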
  • This one works fine. Just have to escape the slash after image: /^.*format=(?!(?:image\/)?(?:kml|kmz|png|jpeg|tiff)).*$/.
    – mgamba
    Commented Jan 9, 2013 at 21:42
