I have an Excel file that contains the data of a website's users. Column A holds their usernames and column B holds their email addresses. Unfortunately, there are many duplicates, both within a single column and across the two columns.

Example:

  1. Thelegend28 | [email protected]
  2. timmyhs | [email protected]
  3. l33tu53r | [email protected]
  4. Thelegend28 | [email protected]
  5. 2l33t4u | [email protected]
  6. timmyhs | [email protected]

As you can see, not only do I have users that are registered twice with the same username and email (2. and 6.), but I also sometimes have different unique usernames linked to the same email (3. and 5.) and usernames that are linked to more than one address (1. and 4.).

What I need to do, if possible, is format these three kinds of duplicates differently.

Of course, any help is greatly appreciated. I'm just a noob, but I'm trying to learn. Thank you all in advance.

  • What type of formatting are you trying to achieve? You could highlight duplicates with a conditional formatting rule. For example, all duplicates are pink. Is that what you're trying to accomplish?
    – Isolated
    Commented Nov 18, 2020 at 15:27

2 Answers

Try adding two helper columns with COUNTIF (enter the formulas in row 2 and fill them down):

Column C: =COUNTIF($A$2:$A$7,A2)

Column D: =COUNTIF($B$2:$B$7,B2)

Each row then shows how many times its username (column C) and its email address (column D) occur in the data.

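Then you can use Conditional Formatting with these three formula rules, giving each one its own format:

Unique username, email used more than once: =AND($C2=1,$D2>1)

Username used more than once, unique email: =AND($C2>1,$D2=1)

Username and email both used more than once: =AND($C2>1,$D2>1)

Note that the last rule also matches rows whose username and email each repeat on different rows, not only exact duplicate rows.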
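If you ever want to sanity-check the result outside Excel, the same counting logic can be sketched in Python. This is only an illustration of what the two COUNTIF columns and the three rules compute; the email addresses, the label names and the `classify` function are made up for the example, since the real addresses are not shown in the question.

```python
from collections import Counter

# Hypothetical sample rows (username, email); the addresses are invented.
rows = [
    ("Thelegend28", "legend@example.com"),
    ("timmyhs", "timmy@example.com"),
    ("l33tu53r", "leet@example.com"),
    ("Thelegend28", "legend2@example.com"),
    ("2l33t4u", "leet@example.com"),
    ("timmyhs", "timmy@example.com"),
]


def classify(rows):
    """Label each row the way the three conditional formatting rules would."""
    # COUNTIF ignores case, so compare lowercased values here as well.
    user_counts = Counter(user.lower() for user, _ in rows)
    mail_counts = Counter(mail.lower() for _, mail in rows)

    labels = []
    for user, mail in rows:
        c = user_counts[user.lower()]  # plays the role of helper column C
        d = mail_counts[mail.lower()]  # plays the role of helper column D
        if c == 1 and d > 1:
            labels.append("shared email")
        elif c > 1 and d == 1:
            labels.append("shared username")
        elif c > 1 and d > 1:
            labels.append("both repeated")
        else:
            labels.append("unique")
    return labels


labels = classify(rows)
for row, label in zip(rows, labels):
    print(row, label)
```

With this sample, the two Thelegend28 rows come out as "shared username", the l33tu53r and 2l33t4u rows as "shared email", and the two timmyhs rows as "both repeated".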

As always, there are many ways to skin an Excel cat.

  1. Go with pivot tables and count the number of occurrences in three separate tables (pivot tables are easier than they sound; just watch a YouTube video on them and it will change your outlook on life, uhm, I mean Excel). One table for duplicate usernames, one for duplicate email addresses, and one for both combined (join A1 and B1 in a new column C with =A1&";"&B1). That's one pivot table per column. This could work well for manual processing (such as bulk emailing users, updating the website DB, etc.), but not so much for deleting or editing duplicate rows in the source spreadsheet. P.S. Don't forget that you can "drill down" from a pivot table by double-clicking a cell.

  2. For the color formatting on the original data, conditional formatting has got you covered. Keep your new column C as above. Select one column at a time, click Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values... and set your preferred formatting for each column. Note that this method can give different colors within one row, as opposed to tagging each row as belonging to exactly one category.

  3. If you want to be a bit more specific, use the COUNTIF function to categorize each row as follows. Again, keep your new column C as above. Say your data is in A1:C10; then put the following in D1 and fill it down to D10:

    =IF(COUNTIF($A$1:$A$10, A1)>1,1,0) + IF(COUNTIF($B$1:$B$10, B1)>1,2,0) + IF(COUNTIF($C$1:$C$10, C1)>1,1,0)

    This will give you 0 for uniques, 1 for duplicate usernames, 2 for duplicate email addresses, 4 for complete duplicates, and 3 for the special class that has both a duplicate email and a duplicate username, but separately (e.g. john [email protected]; john [email protected]; johnny [email protected]). You could then conditionally format column D using a custom "icon set" with 5 different icons for (0,1,2,3,4).
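To see how the 0 to 4 scores from option 3 fall out, here is a rough Python sketch of the same arithmetic. It is not part of the Excel solution; the email addresses and the `score_rows` name are invented for the example, and the (username, email) pair stands in for the combined column C.

```python
from collections import Counter

# Hypothetical sample rows (username, email); the addresses are invented.
rows = [
    ("Thelegend28", "legend@example.com"),
    ("timmyhs", "timmy@example.com"),
    ("l33tu53r", "leet@example.com"),
    ("Thelegend28", "legend2@example.com"),
    ("2l33t4u", "leet@example.com"),
    ("timmyhs", "timmy@example.com"),
    ("john", "x@example.com"),
    ("john", "y@example.com"),
    ("johnny", "y@example.com"),
    ("solo", "solo@example.com"),
]


def score_rows(rows):
    """Return one score per row: +1 username repeats, +2 email repeats, +1 pair repeats."""
    # Lowercase everything because COUNTIF is not case sensitive.
    keys = [(user.lower(), mail.lower()) for user, mail in rows]
    user_counts = Counter(user for user, _ in keys)
    mail_counts = Counter(mail for _, mail in keys)
    pair_counts = Counter(keys)  # stands in for the combined column C

    scores = []
    for user, mail in keys:
        score = 0
        if user_counts[user] > 1:
            score += 1
        if mail_counts[mail] > 1:
            score += 2
        if pair_counts[(user, mail)] > 1:
            score += 1
        scores.append(score)
    return scores


scores = score_rows(rows)
for row, score in zip(rows, scores):
    print(score, row)
```

A repeated pair always implies a repeated username and a repeated email, which is why complete duplicates land on 1 + 2 + 1 = 4 and never on any other value. Counting the pair as a tuple also avoids the small risk that the ";" separator already appears inside a username.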
