
I am fairly new to Notepad++ and am trying to use a regex to find specific values in a field and delete its parent tag (with all of its contents, including the field itself).

Essentially, I am trying to remove transactions that have certain store IDs. The files are massive, and there are thousands of entries I need to get rid of. A sample is below.

Sample

<Transaction>
                              <TxnHeader>
                                             <StoreId>6705</StoreId>
                                             <TillNumber>1</TillNumber>
                                             <TxnNumber>343243</TxnNumber>
                                             <StartDate>2019-02-02T07:42:45</StartDate>
                                             <TxnType>1</TxnType>
                              </TxnHeader>
                              <TxnItemLines>
                                             <TxnItemLine>
                                                            <DetailSequence>1</DetailSequence>
                                                            <ItemNumber>6304</ItemNumber>
                                                            <DeptNumber>168</DeptNumber>
                                                            <Quantity>1.000000</Quantity>
                                                            <LineValue>4.470000</LineValue>
                                             </TxnItemLine>
                              </TxnItemLines>
               </Transaction>
               <Transaction>
                              <TxnHeader>
                                             <StoreId>8351</StoreId>
                                             <TillNumber>1</TillNumber>
                                             <TxnNumber>327527</TxnNumber>
                                             <StartDate>2019-02-02T08:02:47</StartDate>
                                             <TxnType>1</TxnType>
                              </TxnHeader>
                              <TxnItemLines>
                                             <TxnItemLine>
                                                            <DetailSequence>1</DetailSequence>
                                                            <ItemNumber>6304</ItemNumber>
                                                            <DeptNumber>168</DeptNumber>
                                                            <Quantity>1.000000</Quantity>
                                                            <LineValue>7.310000</LineValue>
                                             </TxnItemLine>
                              </TxnItemLines>
               </Transaction>
               <Transaction>
                              <TxnHeader>
                                             <StoreId>7837</StoreId>
                                             <TillNumber>1</TillNumber>
                                             <TxnNumber>164728</TxnNumber>
                                             <StartDate>2019-02-02T08:19:47</StartDate>
                                             <TxnType>1</TxnType>
                              </TxnHeader>
                              <TxnItemLines>
                                             <TxnItemLine>
                                                            <DetailSequence>1</DetailSequence>
                                                            <ItemNumber>1902</ItemNumber>
                                                            <DeptNumber>154</DeptNumber>
                                                            <Quantity>1.000000</Quantity>
                                                            <LineValue>10.000000</LineValue>
                                             </TxnItemLine>
                              </TxnItemLines>
               </Transaction>

Desired

<Transaction>
                              <TxnHeader>
                                             <StoreId>6705</StoreId>
                                             <TillNumber>1</TillNumber>
                                             <TxnNumber>343243</TxnNumber>
                                             <StartDate>2019-02-02T07:42:45</StartDate>
                                             <TxnType>1</TxnType>
                              </TxnHeader>
                              <TxnItemLines>
                                             <TxnItemLine>
                                                            <DetailSequence>1</DetailSequence>
                                                            <ItemNumber>6304</ItemNumber>
                                                            <DeptNumber>168</DeptNumber>
                                                            <Quantity>1.000000</Quantity>
                                                            <LineValue>4.470000</LineValue>
                                             </TxnItemLine>
                              </TxnItemLines>
               </Transaction>
               <Transaction>
                              <TxnHeader>
                                             <StoreId>7837</StoreId>
                                             <TillNumber>1</TillNumber>
                                             <TxnNumber>164728</TxnNumber>
                                             <StartDate>2019-02-02T08:19:47</StartDate>
                                             <TxnType>1</TxnType>
                              </TxnHeader>
                              <TxnItemLines>
                                             <TxnItemLine>
                                                            <DetailSequence>1</DetailSequence>
                                                            <ItemNumber>1902</ItemNumber>
                                                            <DeptNumber>154</DeptNumber>
                                                            <Quantity>1.000000</Quantity>
                                                            <LineValue>10.000000</LineValue>
                                             </TxnItemLine>
                              </TxnItemLines>
               </Transaction>

In the desired text above, the Transaction element that contains StoreId 8351 has been removed completely.

I tried a Regex find and replace (with nothing) using the query:

<Transaction>.*?<StoreID>8351</StoreID>.*?</Transaction>

and it ended up matching a massive chunk of the document, from the top all the way down to the end of the first transaction containing 8351.

Any help would be greatly appreciated!

1 Answer

  • Ctrl+H
  • Find what: <Transaction>(?:(?!</Transaction>).)+<StoreId>8351</StoreId>(?:(?!<Transaction>).)+</Transaction>\R?
  • Replace with: leave empty
  • check Match case
  • check Wrap around
  • check Regular expression
  • check . matches newline
  • Replace all

Explanation:

<Transaction>               # opening tag
(?:(?!</Transaction>).)+    # tempered greedy token: any characters, as long as no </Transaction> is crossed before what follows
<StoreId>8351</StoreId>     # literal text
(?:(?!<Transaction>).)+     # tempered greedy token: any characters, as long as no <Transaction> is crossed before what follows
</Transaction>              # literally, closing tag
\R?                         # optional any kind of linebreak
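
To see the difference outside Notepad++, here is a minimal Python sketch comparing the lazy pattern from the question with the tempered one. It rests on a few assumptions: the sample is a trimmed, hypothetical version of the question's data; the tag is spelled StoreId as in the sample; re.DOTALL plays the role of ". matches newline"; and since Python's re module has no \R, (?:\r?\n)? stands in for \R?.

```python
import re

# Hypothetical, trimmed-down version of the sample from the question.
xml = """<Transaction>
  <TxnHeader><StoreId>6705</StoreId></TxnHeader>
</Transaction>
<Transaction>
  <TxnHeader><StoreId>8351</StoreId></TxnHeader>
</Transaction>
<Transaction>
  <TxnHeader><StoreId>7837</StoreId></TxnHeader>
</Transaction>
"""

# Lazy attempt: .*? may still cross </Transaction>, so the match starts
# at the very first <Transaction> in the text.
lazy = re.compile(
    r"<Transaction>.*?<StoreId>8351</StoreId>.*?</Transaction>",
    re.DOTALL,
)

# Tempered version: each character is consumed only if it does not start
# a </Transaction> (first token) or a <Transaction> (second token).
tempered = re.compile(
    r"<Transaction>(?:(?!</Transaction>).)+<StoreId>8351</StoreId>"
    r"(?:(?!<Transaction>).)+</Transaction>(?:\r?\n)?",
    re.DOTALL,
)

lazy_match = lazy.search(xml)
tempered_match = tempered.search(xml)
cleaned = tempered.sub("", xml)

print("lazy match includes store 6705:", "6705" in lazy_match.group())
print("tempered match includes store 6705:", "6705" in tempered_match.group())
print(cleaned)
```

The lazy match swallows the 6705 transaction because nothing stops .*? from running past a closing tag, while the tempered match stays inside the one transaction that contains 8351.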


More about Tempered Greedy Token
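
Since the question mentions several store IDs, the same pattern can probably be extended with an alternation in the StoreId part, for example <StoreId>(?:8351|7837)</StoreId> in the Find what field. The Python sketch below illustrates the idea; the helper names build_pattern and remove_transactions are hypothetical, the sample is a trimmed version of the question's data, and (?:\r?\n)? again stands in for \R?, which Python's re module does not support.

```python
import re

def build_pattern(store_ids):
    """Compile a regex matching any <Transaction> whose StoreId is listed."""
    ids = "|".join(re.escape(store_id) for store_id in store_ids)
    return re.compile(
        r"<Transaction>(?:(?!</Transaction>).)+"
        rf"<StoreId>(?:{ids})</StoreId>"
        r"(?:(?!<Transaction>).)+</Transaction>(?:\r?\n)?",
        re.DOTALL,  # same role as ". matches newline" in Notepad++
    )

def remove_transactions(text, store_ids):
    """Return text with every matching <Transaction> block deleted."""
    return build_pattern(store_ids).sub("", text)

# Hypothetical, trimmed-down sample.
sample = """<Transaction>
  <TxnHeader><StoreId>6705</StoreId></TxnHeader>
</Transaction>
<Transaction>
  <TxnHeader><StoreId>8351</StoreId></TxnHeader>
</Transaction>
<Transaction>
  <TxnHeader><StoreId>7837</StoreId></TxnHeader>
</Transaction>
"""

result = remove_transactions(sample, ["8351", "7837"])
print(result)
```

For a very long list of IDs, building the alternation in a script like this is likely less error-prone than typing it into the Find what field by hand.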

  • Seems perfect, thanks! and also thank you for breaking down the logic, good to know :)
    – Jaevwyn
    Commented Mar 26, 2019 at 8:54
  • @Jaevwyn: You're welcome, glad it helps.
    – Toto
    Commented Mar 26, 2019 at 10:03
