
TL;DR: How can I grab a series of key:value pairs that are comma separated, but only if they start with specific strings?

Hello,

I have a block of text with multiple lines, each of which could have an arbitrary number of key:value pairs on it.

The key and value are colon separated, and then each pair is comma separated. The lines begin with a specific key value pair, some of which need to be captured, and others ignored. Other lines do not have any key value pairs, but may have colons and commas which could confuse the Regex string.

Here are some of the parameters/limitations of the task:

  1. The values will be comma separated and need to be captured, not just matched
  2. Each comma separated value will have a colon pair that needs to be grouped in KEY:VALUE format, where KEY is always group 1, VALUE is always group 2
  3. The regex must not cross a linefeed
  4. No PHP, SED, AWK, Python, PERL, etc. coding allowed, can only use PCRE2 regex
  5. There could be an arbitrary number of pairs, and an arbitrary number of lines that need to be captured

Here is the sample text:


Prompt: professional digital [airbrush:gouache:0.6] art of  (mythical:1.1) demon, wearing ragged jacket and pants, dynamic lighting, summer, Dutch Masters:0.6
Negative prompt: (child), (mangled hands), (badly drawn hands),( badly drawn fingers)
Steps: 70, Sampler: Euler a, CFG scale: 40, Seed: 351468770, Size: 340x512, Model hash: 2db4e932c1, Model: Comics_vimod, Denoising strength: 0.651,  
Mirror Mode: 2, Mirror Style: 0, Mirroring Max Step Fraction: 0.1, X Pan: 0.02, Y Pan: 0.03
Template: professional digital [airbrush:gouache:0.{6|7|8}] art of  ({fantastical|mythical}:1.1) demon, wearing ragged jacket and {skirt|pants|breech cloth|tuxedo}, dynamic lighting, {winter|summer}, Dutch Masters:0.6

I need to capture and group the pairs on the lines that start with Steps, Model, and Mirror Mode, but ignore the Prompt, Negative prompt, and Template lines.


Desired matching:

Prompt: Don't match
(Steps|70) (Sampler|Euler a) (CFG scale|40) (Seed|351468770) (Size|340x512)
(Model hash|2db4e932c1) (Model|Comics_vimod) (Denoising strength|0.651)
(Mirror Mode|2) (Mirror Style|0) (Mirroring Max Step Fraction|0.1) (X Pan|0.02) (Y Pan|0.03)
Template: Don't match

Regex code tried:

(?:(?<=[,])|^)([^:,]+):([^,]+) ... matches what is needed, but also grabs matches from Template and Prompt and it crosses line feeds so that the last comma separated value in Template is grouped in with "Steps"
([^:\s]+):([^\s]+) ... only matches key value pairs in Prompt and Template because of parentheses and brackets
(?<pair>(?<key>.+?)(?::)(?<value>[^:]+)(?:,|$)) ... grabs way too much, is not limited to the lines needed

Trying to capture with a string ahead of the main regex, like this: Mirror Mode: \d, (?<pair>(?<key>.+?)(?::)(?<value>[^:]+)(?:,|$)) results in only the first CSV key:value pair being grouped

I'd appreciate any assistance. Again, this is to be used in Regex only, no programming tools, no apps, etc. etc.

  • Is this what you want?
    – Toto
    Commented Mar 2, 2023 at 10:53
  • YES! What was the key to capturing more than one group? I get the matching of the key:value but I'm assuming the Branch Reset and \G are the key to repeating the matches? Commented Mar 3, 2023 at 2:50
  • I've made my comment an answer.
    – Toto
    Commented Mar 3, 2023 at 9:02

1 Answer

Here is a solution using PCRE2 flavour:

(^(?:Steps|Model|Mirror Mode)|\G(?!^).+?):\h*(.+?)(?:,\h*|$)

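The key to matching an unknown number of pairs is \G, which anchors each new match at the position where the previous match ended. The first alternative, ^(?:Steps|Model|Mirror Mode), starts a chain only on a line that begins with one of the wanted keys; the second alternative, \G(?!^).+?, continues the chain with the next key on the same line. The (?!^) keeps \G from matching at the start of a line, and because . does not match a linefeed, the chain stops at the end of the line. The pattern needs the multiline flag (m) so that ^ and $ match at line boundaries.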
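For anyone who wants to check the pattern outside a regex tester, here is a minimal sketch in Java. Java is used only because java.util.regex also understands \G and \h; it is not PCRE2, so treat the result as an approximation of what a PCRE2 engine would do. The class name KeyValueDemo, the method extractPairs, and the shortened sample text are made up for illustration, and the pattern is assumed to run with the multiline flag on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical harness; the names here are illustrative, not from the answer.
class KeyValueDemo {
    // The answer's pattern. MULTILINE is assumed so ^ and $ work per line.
    static final Pattern PAIRS = Pattern.compile(
        "(^(?:Steps|Model|Mirror Mode)|\\G(?!^).+?):\\h*(.+?)(?:,\\h*|$)",
        Pattern.MULTILINE);

    // Returns every {key, value} pair found, in order of appearance.
    static List<String[]> extractPairs(String text) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = PAIRS.matcher(text);
        // Each find() resumes where the previous match ended, which is
        // the position that \G anchors to.
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Shortened version of the sample text from the question.
        String sample = String.join("\n",
            "Prompt: art of (mythical:1.1) demon, Dutch Masters:0.6",
            "Steps: 70, Sampler: Euler a, Model: Comics_vimod",
            "Mirror Mode: 2, X Pan: 0.02",
            "Template: {winter|summer}, Dutch Masters:0.6");
        for (String[] p : extractPairs(sample)) {
            System.out.println("(" + p[0] + "|" + p[1] + ")");
        }
    }
}
```

The loop over find() plays the role of the "global" flag in a regex tester: group 1 is always the key and group 2 is always the value, and the Prompt and Template lines produce no pairs because no chain ever starts on them.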

Demo & explanation
