
I have a dataset of 100,000 rows. It is set up in such a way that Column A contains a group name, and then repeats the group name for the number of unique members of that group. I am trying to get a count of how many times a value appears twice and only twice.

  • A value will never appear only once - there will always be "group name" immediately followed by however many members are in that group, in individual rows. So for any distinct entity, there are always at least 2 rows.
  • More often than not, a value will appear 3 or more times.
  • If the value appears 3 or more times, I do not want to include any of those rows in the count. I'm really looking for the number of times a distinct pair appears.

2 Answers


Enter the following formula in B1 and copy it down column B:

=COUNTIF($A$1:$A$100000,A1)=2

This returns TRUE for every row whose value appears exactly twice, so it identifies all pairs. It will, however, flag both entries in each pair. What I normally do in these cases (even though it breaks the data provenance) is filter on TRUE, copy the filtered list to another sheet, and remove duplicates.

If you want to do this in a repeatable way that maintains provenance, then I'd recommend using a unique-list array formula after performing the count.
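If you want to sanity-check the result outside Excel, the same logic can be sketched in Python. This is only an illustration under assumptions: the function names and the sample data are made up, and `Counter` compares text case-sensitively, whereas COUNTIF ignores case.

```python
from collections import Counter


def flag_pairs(values):
    """Mimic the column B formula: one flag per row, True when that
    row's value occurs exactly twice in the whole column."""
    counts = Counter(values)
    return [counts[v] == 2 for v in values]


def unique_pair_values(values):
    """Mimic 'filter on TRUE, then remove duplicates': each value that
    occurs exactly twice, listed once, in order of first appearance."""
    counts = Counter(values)
    seen = set()
    result = []
    for v in values:
        if counts[v] == 2 and v not in seen:
            seen.add(v)
            result.append(v)
    return result


# Hypothetical sample standing in for column A.
column_a = ["X", "X", "Y", "Y", "Y", "Z", "Z"]
flags = flag_pairs(column_a)
pairs = unique_pair_values(column_a)
```

The length of `pairs` is the number of distinct pairs the question asks for.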

Option B is to use a pivot table: place the values from column A in Rows, put count(A) in Values, and filter the rows to those where count = 2.


Assuming your data is in A1:A100000

1) Copy all the unique values to a separate column (using Data -> Advanced Filter) -- I'll use column C for my example.

2) Put the following formula in D1: =Countif(A$1:A$100000, C1), then fill it down column D for every unique value in C.

3) In another cell, use the following formula: =Countif(D1:D??, 2) (where ?? is the last row of columns C and D)
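The three steps above map onto a few lines of Python, which may help when checking the final number. This is a rough sketch, not part of the Excel solution: `count_exact_pairs` and the sample list are hypothetical, and the comparison is case-sensitive, unlike COUNTIF.

```python
from collections import Counter


def count_exact_pairs(values):
    # Step 1: unique values, in order of first appearance (column C).
    unique_values = list(dict.fromkeys(values))
    # Step 2: occurrences of each unique value in the data (column D).
    counts = Counter(values)
    per_value = [counts[v] for v in unique_values]
    # Step 3: how many of those counts equal exactly 2.
    return sum(1 for c in per_value if c == 2)


# Hypothetical sample standing in for A1:A100000.
sample = ["A", "A", "B", "B", "B", "C", "C", "D", "D", "D", "D"]
pair_total = count_exact_pairs(sample)
```

Here "A" and "C" are the only values that appear exactly twice, so they are the ones counted.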

  • I think this approach was similar to the previous answer, but I saw that one first. Thanks for answering! Commented Feb 28, 2018 at 18:22
