
I have a table in which one column is meant to hold unique IDs. Is it possible to check with a formula whether the uniqueness criterion is fulfilled?

For instance, suppose the table currently contains

ID        Summary
T1204     Fix user crash.
T1201     Fix tester crash.
T1202     Implement that feature.
T1203     Implement that other feature.

Suppose I now create a new entry

T1204     Make program say "Hello World." on startup

I want a formula that becomes TRUE if such a duplicate occurs, or a formula that counts the number of such duplicates.

Constraint: Assume that the table needs to stay sorted independently of the ID column, e.g. by the category specified in an additional column.

2 Answers

1

Much simpler: sort the ID column, then in another column enter a formula that checks whether each value equals the one above it, e.g. =IF(A2=A1;TRUE;FALSE) in row 2, filled down the column.
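
The sort-then-compare idea can be sketched outside the spreadsheet as well. The following Python snippet is only an illustration; the function name `has_adjacent_duplicates` and the sample list are made up for this sketch and are not part of LibreOffice.

```python
def has_adjacent_duplicates(ids):
    """Sort the IDs, then compare each value with its predecessor,
    which is what =IF(A2=A1;TRUE;FALSE) does row by row."""
    ordered = sorted(ids)
    return any(prev == cur for prev, cur in zip(ordered, ordered[1:]))


# Sample data taken from the question, including the new T1204 entry.
ids = ["T1204", "T1201", "T1202", "T1203", "T1204"]
print(has_adjacent_duplicates(ids))  # True
```

Sorting a copy keeps the original order intact here, which a spreadsheet column sort does not.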

  • That assumes that sorting is possible, though. In the use case that motivated my write-up, the rows had to stay sorted vertically by status. Good point for clarification, though.
    – kdb
    Commented Jul 7, 2020 at 17:45
1

The relevant trick is to use array formulas.

In the given example:

    |  A      |  B                                |  C(Values) |  C(Formulae)
----|---------|-----------------------------------|------------|---------------------------
 1  |  ID     |  Summary                          |            |
 2  |  T1204  |  Fix user crash.                  |  2         |  {=COUNTIF(A2:A11;A2:A11)}
 3  |  T1201  |  Fix tester crash.                |  1         |
 4  |  T1202  |  Implement that feature.          |  1         |
 5  |  T1203  |  Implement that other feature.    |  3         |
 6  |         |                                   |  0         |
 7  |  T1204  |  Make program say "Hello World."  |  2         |
 8  |  T1203  |  Another duplicate                |  3         |
 9  |  T1203  |                                   |  3         |
10  |         |                                   |  0         |
11  |         |                                   |  0         |

Duplicates can then be recognized by entries in column C that are larger than one.

How to enter an array formula

The notation {=...} is used by LibreOffice to display an array formula, but it cannot be used to enter one. Instead, the formula has to be entered normally as =COUNTIF(A2:A11;A2:A11) and then be declared an array formula by either:

  • Typing the formula and confirming with Ctrl+Shift+Enter instead of Enter, or
  • Ticking the "Array" checkbox in the formula dialog (Ctrl+F2).

O(N²) time dependency

The solution does, however, have quadratic time dependency. Try to enter a formula for the entire column, such as {=COUNTIF(A:A;A:A)}, and you have a good chance of crashing LibreOffice.

What it does should effectively amount to:

for i from 2 to 11:
    set cell C,i to sum over:
        for j from 2 to 11:
            1 if cell A,i matches cell A,j
            0 otherwise

In practice, I've found it hard to work out what array formulas really do, and changing them usually involves some trial and error.
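
For comparison, here is a minimal Python sketch of the nested loop described above. It is not how LibreOffice implements COUNTIF; the function name `countif_per_row` is hypothetical, empty cells are modelled as `None`, and matching is assumed to be plain equality.

```python
def countif_per_row(ids):
    """For each cell, count the cells in the range holding the same value.

    Empty cells (None) get 0, mirroring column C in the table above.
    The nested iteration is what makes the cost grow as O(N^2).
    """
    counts = []
    for current in ids:
        if current is None:
            counts.append(0)
            continue
        counts.append(sum(1 for other in ids if other == current))
    return counts


# Rows 2 to 11 of column A in the example table.
ids = ["T1204", "T1201", "T1202", "T1203", None,
       "T1204", "T1203", "T1203", None, None]
counts = countif_per_row(ids)
print(counts)  # [2, 1, 1, 3, 0, 2, 3, 3, 0, 0]
```

Every entry larger than one marks a row whose ID occurs more than once.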

Alternative: Detect whether there are duplicates.

Alternatively, the array formula {=SUM(COUNTIF(A2:A11;A2:A11)) - COUNTA(A2:A11)} can be used, which will be positive if there are duplicates and zero otherwise. The advantage of this approach is that it occupies only a single cell, whereas the multi-row array formula can interfere with editing features, such as drag-and-drop of whole table rows.
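
Why the difference is zero without duplicates can be checked with a small Python sketch. The name `duplicate_excess` is made up for this illustration, empty cells are modelled as `None`, and matching is assumed to be plain equality.

```python
def duplicate_excess(ids):
    """Mimic SUM(COUNTIF(range;range)) - COUNTA(range).

    Each non-empty cell contributes the number of cells equal to it.
    If all IDs are unique, every contribution is 1 and the sum equals
    the number of non-empty cells, so the difference is 0.
    """
    non_empty = [value for value in ids if value is not None]
    total = sum(non_empty.count(value) for value in non_empty)
    return total - len(non_empty)


# Rows 2 to 11 of column A in the example table.
ids = ["T1204", "T1201", "T1202", "T1203", None,
       "T1204", "T1203", "T1203", None, None]
print(duplicate_excess(ids))  # 8, i.e. (2+1+1+3+2+3+3) - 7
print(duplicate_excess(["T1201", "T1202", "T1203"]))  # 0
```

Note that the result is not the number of duplicate rows; it only signals, by being positive, that at least one ID is repeated.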
