2

I am trying to extract the "remainder" (the non-matching portion) of a search using Excel 2003. I tried using various combinations of functions but it seems that there's no specific function that can return the non-matched portion of a record found through a match with another part of the record.

Here an illustration of what I am trying to do. I have two lists. List A has 2000 records. Each record is an array of 9 cells, each containing a single digit in columns A through I. The digits are always in ascending order. So that data looks like this:

    [A]  [B]  [C]  [D]  [E]  [F]  [G]  [H]  [I]
     1    2    2    3    4    5    7    7    9
     2    4    5    5    6    6    8    9    9
    . . . etc.

List B has 30 records. These are the values that are searched for in the List A records. Each record is an array of 5 cells, each containing a single digit in columns L through P. These digits are always in ascending order. So this data looks like this:

    [L]  [M]  [N]  [O]  [P]
     1    1    3    4    5
     2    3    5    7    9
     1    2    3    4    5
    . . . etc.

Each record in List B is searched for in List A. It is a match if all five digits in the List B record match five of the digits in the List A record. So in this example, the first List B record does not match anything in List A. The second and third List B records both match the first List A record, and no List B records match the second List A record.

When there is a match, the "remainder" is the other four digits of the List A record that were not part of the match. For the two matches in this example, this would be illustrated like this:

    List A record: 1    2    2    3    4    5    7    7    9
    List B record:      2         3         5    7         9
    Remainder:     1         2         4              7


    List A record: 1    2    2    3    4    5    7    7    9
    List B record: 1    2         3    4    5
    Remainder:               2                   7    7    9

So the result for the first List A record is: 1247 2779 and the result for the second List A record is a blank.

The values in the List B records can change so the solution needs to be "generic" so the results can be updated without changing formulas.

I tried basing a solution on the COUNT function to count the frequency of each digit 0-9 in every item on List A and then use IF and AND functions (specifying which and how many of each digit is required to qualify as a match), to tell me which items on List B match List A. My attempts to use other functions have failed to extract the remainder in the manner that I want it to.

This question is similar to How to extract reminder of a match in MS Excel 2003, but the number patterns are different. A solution was found for that question, but it is difficult to figure out how to modify the answer to fit this problem. From time to time, I have similar problems of this nature, so I'm hoping for an answer that includes a process I can follow to develop similar solutions for other similar problems.

1
  • @fixer1234 can you solve this problem Commented Nov 1, 2014 at 15:15

2 Answers 2

1

I started to create an answer based on the method in the earlier question. Then I realized that you were on the right track. Since everything is in ascending order, you can just base the answer on the counts of each digit. The REPT function will build a string based on how many repetitions of a character you need. The quantity of each digit you need for the remainder is the quantity in the List A record minus the quantity in the List B record. So for a single match, say the A record digits are in A1:I1 and the B record digits are in L1:P1. The remainder would be:

    =REPT(1,COUNTIF(A1:I1,1)-COUNTIF(L1:P1,1))&
     REPT(2,COUNTIF(A1:I1,2)-COUNTIF(L1:P1,2))&
     REPT(3,COUNTIF(A1:I1,3)-COUNTIF(L1:P1,3))&
     REPT(4,COUNTIF(A1:I1,4)-COUNTIF(L1:P1,4))&
     REPT(5,COUNTIF(A1:I1,5)-COUNTIF(L1:P1,5))&
     REPT(6,COUNTIF(A1:I1,6)-COUNTIF(L1:P1,6))&
     REPT(7,COUNTIF(A1:I1,7)-COUNTIF(L1:P1,7))&
     REPT(8,COUNTIF(A1:I1,8)-COUNTIF(L1:P1,8))&
     REPT(9,COUNTIF(A1:I1,9)-COUNTIF(L1:P1,9))

This is all one formula. I've split it across lines so the pieces line up and you can see the pattern. If there is not a match, this will produce an error, which can be handled later.

You want to produce a report of the results for all of the matches. So we can set up a structure to do this.

Lets say List A is in A1:I2000 and List B is in L1:P30. Create a table in R1:AV2002. The table will be a mechanism to collect the results from comparing all B records to all A records. It will also provide an index to represent the row numbers in each list so that you can use a general formula and indirect addressing to perform the matches. Set up the table to look like this:

        [R]     [S]      [T]       [U]   ...  [AV]
[1]             <=======List B Row===============>
[2] List A Row   1        2         3    ...   30
[3]     1
[4]     2
[5]     3
    ...

Each cell of the table will reflect a matchup between one B record and one A record. Actually enter the List A row numbers as labels in column R, starting in row 3, and the List B row numbers as column headings in S2:AV2. These numbers will be used to point to the correct records.

In the formula provided earlier in the answer,the row numbers of the A and B records are hard coded. For this table, we need a general formula that can use the row and column labels to refer to the correct records to use for each cell. So we replace the row numbers with indirect addresses based on the labels. Each reference to A1:I1 gets replaced with:

    INDIRECT("$A"&$R3):INDIRECT("$I"&$R3)

and each reference to L1:P1 gets replaced with:

    INDIRECT("$L"&S$2):INDIRECT("$P"&S$2)

Watch the locations of the $ anchors in these substitutions. Cell S3 would look like this:

    =REPT(1,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),1)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),1))&
     REPT(2,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),2)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),2))&
     REPT(3,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),3)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),3))&
     REPT(4,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),4)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),4))&
     REPT(5,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),5)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),5))&
     REPT(6,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),6)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),6))&
     REPT(7,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),7)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),7))&
     REPT(8,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),8)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),8))&
     REPT(9,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),9)-COUNTIF(INDIRECT("$L"&S$2):INDIRECT("$P"&S$2),9))

Again, this is all one formula split across lines for readability. Once S3 is correct, copy this formula to fill all of the cells in the table. You will know if it is correct by this: references to List A records will be identical for all cells in the same row of the table and the $R3 will increment as you go down the table (second row of the table, this will be $R4). References to List B records will be identical as you go down a column. The S$2 will change column letter as you go across a row of the table (second column of the table, this will be T$2). As verification, T3 should look like this:

    =REPT(1,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),1)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),1))&
     REPT(2,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),2)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),2))&
     REPT(3,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),3)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),3))&
     REPT(4,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),4)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),4))&
     REPT(5,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),5)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),5))&
     REPT(6,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),6)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),6))&
     REPT(7,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),7)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),7))&
     REPT(8,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),8)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),8))&
     REPT(9,COUNTIF(INDIRECT("$A"&$R3):INDIRECT("$I"&$R3),9)-COUNTIF(INDIRECT("$L"&T$2):INDIRECT("$P"&T$2),9))

Each cell of the table will reflect the matchup between a B record and an A record. It will contain either the "remainder" or an error for no match. You want to get a summary of the results for each List A record. Each row of the table represents a List A record. The results can be summarized either at the end of the table (column AW), or to the right of the List A data (column J). The formula for the first summary will be:

=IF(ISERROR(S3),"",S3&" ")&IF(ISERROR(T3),"",T3&" ")&IF(ISERROR(U3),"",U3&" ")& ... &IF(ISERROR(AV3),"",AV3)

Rather than show all 30 terms here, this shows just the first three and the last. Follow the same pattern to add the rest. It builds the result string by concatenating the results of each match. If there is a value in a cell, it adds a space before the next value. If you want a different delimiter, change the space to something else, like comma space. Copy this formula down your summary column for all of the rows in List A.

12
  • taking more aspirin Commented Nov 2, 2014 at 2:20
  • Were you able to follow what the formulas do and how to replicate it for other problems? In terms of the mechanics of entering the formulas, you can save some hassle by copying the code block into a word processor, delete the line formatting and extra spaces, then copy that and paste it into the spreadsheet. Then you only need to copy and paste into 60,000 cells. Do that by dragging across to fill one row, then drag the row down 2,000 rows. Piece of cake.
    – fixer1234
    Commented Nov 2, 2014 at 3:23
  • something is wrong with the formula, when I copied the formula in T3, it is not giving me the correct remainder Commented Nov 2, 2014 at 7:00
  • manually moving the cell reference by dragging it seems to work but can you give me a formula that is off-the-shelf sort of speak like the one you gave in the previous answer because I am more interested in manipulating the data. I'll thinker with manipulating the formula ( i know already cause u taught me) later when I need to but this is the last data set that I need to sort through, hope u understand Commented Nov 2, 2014 at 7:17
  • I edited the answer to include what should be in T3. An easy way to verify long formulas like this is to copy the formula, paste it into an editor/word processor and insert line breaks so the chunks line up like what I did in the answer. Then anomalies stand out.
    – fixer1234
    Commented Nov 2, 2014 at 7:19
0

If you already have an answer for the match, then NOT(...) is likely to get you the non-matching results.

1
  • The issue isn't a match vs. a non-match. It is extracting the "unused" numbers in the larger array after matching a 5 number array to a 9 number array. This is similar to an earlier problem and it took me awhile to figure out what that question was asking for. I already submitted an edit on this question to clarify it because it is confusing as written. Give the edit time for review. It would be great if you saw another approach to solving this.
    – fixer1234
    Commented Nov 1, 2014 at 19:28

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .