I have a List of timestamps in milliseconds. I want to compare them and remove duplicates while ignoring the milliseconds part, and then process each unique value.

For example, millis2 and millis3 are different values when compared with their milliseconds part intact (2:28:14.100 vs 2:28:14.200). But I need to disregard the millis: when the two values are truncated to seconds, they should be considered duplicates.

So I decided to create a List of timestamps and sort it in reverse order, then iterate over it, checking whether the truncated values differ, and add the unique values to a List<Long> deduped.

    Long millis0 = 1554052261000L; // Sunday, March 31, 2019 5:11:01 PM
    Long millis1 = 1557023292000L; // Sunday, May 5, 2019 2:28:12 AM
    Long millis2 = 1557023294100L; // Sunday, May 5, 2019 2:28:14.100 AM
    Long millis3 = 1557023294200L; // Sunday, May 5, 2019 2:28:14.200 AM

    List<Long> initialTimestamps = Arrays.asList(millis2, millis3, millis0, millis1);

    Comparator<Long> comparator = Collections.reverseOrder();
    Collections.sort(initialTimestamps, comparator);

    Long prevTs = null;
    List<Long> deduped = new ArrayList<>();

    for (Long ts : initialTimestamps) {
        if (prevTs != null && !millisToSeconds(prevTs).equals(millisToSeconds(ts))) {
            deduped.add(prevTs);
            process(prevTs);
        }
        prevTs = ts;
        deduped.add(prevTs);
        process(prevTs);
    }

However, when I print the contents of deduped, there are still duplicates:

Deduped timestamps ->
1557023294200
1557023294100
1557023294100
1557023292000
1557023292000
1554052261000

But I expect that after deduplication only the timestamps for the seconds 1557023294, 1557023292 and 1554052261 will remain. What am I missing here?

  • You are calling deduped.add(prevTs) outside of the if-condition, i.e., whether or not this is a new value.
    – floxbr
    Commented Apr 24, 2019 at 11:56
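
A minimal sketch of the loop with that fix applied, assuming the goal is to keep the largest timestamp within each second. The body of millisToSeconds is an assumption (the question does not show it), the class and method names are illustrative only, and the process call is left as a comment:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class DedupSortedLoop {

    // Assumed stand-in for the question's millisToSeconds helper.
    static Long millisToSeconds(Long millis) {
        return millis / 1000;
    }

    // Returns one timestamp per second (the largest one), newest first.
    static List<Long> dedupe(List<Long> timestamps) {
        // Work on a copy so the caller's list is left untouched.
        List<Long> sorted = new ArrayList<>(timestamps);
        Collections.sort(sorted, Collections.<Long>reverseOrder());

        List<Long> deduped = new ArrayList<>();
        Long prevTs = null;
        for (Long ts : sorted) {
            // Add only the first element, or an element whose second differs
            // from the previous one. Nothing is added outside this condition.
            if (prevTs == null || !millisToSeconds(prevTs).equals(millisToSeconds(ts))) {
                deduped.add(ts);
                // process(ts) would go here
            }
            prevTs = ts;
        }
        return deduped;
    }

    public static void main(String[] args) {
        List<Long> input = Arrays.asList(
                1557023294100L, 1557023294200L, 1554052261000L, 1557023292000L);
        System.out.println(dedupe(input));
        // prints [1557023294200, 1557023292000, 1554052261000]
    }
}
```

Because the list is sorted in descending order, the first timestamp encountered for each second is also the largest one, so 1557023294200 is kept and 1557023294100 is dropped.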

1 Answer

If you can use Java 8, you can remove exact duplicates with stream().distinct():

public static void main(String[] args) throws Exception {
    Long millis0 = 1554052261000L; // Sunday, March 31, 2019 5:11:01 PM
    Long millis1 = 1557023292000L; // Sunday, May 5, 2019 2:28:12 AM
    Long millis2 = 1557023294100L; // Sunday, May 5, 2019 2:28:14.100 AM
    Long millis3 = 1557023294200L; // Sunday, May 5, 2019 2:28:14.200 AM

    List<Long> initialTimestamps = Arrays.asList(millis2, millis3, millis0, millis1);
    List<Long> unique = initialTimestamps.stream().distinct().collect(Collectors.toList());

    System.out.println(unique);
}

For Java versions before 8, you can put them in a Set:

public static void main(String[] args) throws Exception {
    Long millis0 = 100L; // same value as millis1
    Long millis1 = 100L;
    Long millis2 = 200L; // same value as millis3
    Long millis3 = 200L;

    List<Long> initialTimestamps = Arrays.asList(millis2, millis3, millis0, millis1);
    Set<Long> unique = new HashSet<Long>(initialTimestamps);

    System.out.println(unique);
}

Update

As per your requirement to ignore the milliseconds, there are two options. If you do not care about the milliseconds at all, divide each value by 1_000 first and use one of the approaches above. If you want to preserve the original millisecond values, use a Map keyed by the timestamp truncated to seconds:

public static void main(String[] args) throws Exception {
    Long millis0 = 1554052261000L; // Sunday, March 31, 2019 5:11:01 PM
    Long millis1 = 1557023292000L; // Sunday, May 5, 2019 2:28:12 AM
    Long millis2 = 1557023294100L; // Sunday, May 5, 2019 2:28:14.100 AM
    Long millis3 = 1557023294200L; // Sunday, May 5, 2019 2:28:14.200 AM

    List<Long> initialTimestamps = Arrays.asList(millis2, millis3, millis0, millis1);
    Map<Long, Long> unique = new HashMap<>();

    for (Long timestamp : initialTimestamps) {
        unique.put(timestamp / 1000, timestamp);
    }

    System.out.println(unique.values());
}

If you want to preserve the first value of each duplicate, then use

if (!unique.containsKey(timestamp / 1000)) {
    unique.put(timestamp / 1000, timestamp);
}

instead of just put(). If you want to preserve the initial order of the timestamps, then you should use LinkedHashMap instead of HashMap.
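
Putting these pieces together for Java 7, here is a rough sketch that first sorts in reverse order (as the question does), so that the first value seen for each second is the largest one, and then applies the containsKey check on a LinkedHashMap. The class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DedupWithMap {

    // Returns one timestamp per second (the largest one), newest first.
    static List<Long> dedupeKeepingLargest(List<Long> timestamps) {
        // Copy, then sort descending so the largest value of each second comes first.
        List<Long> sorted = new ArrayList<>(timestamps);
        Collections.sort(sorted, Collections.<Long>reverseOrder());

        // LinkedHashMap remembers insertion order, so the sorted order survives.
        Map<Long, Long> unique = new LinkedHashMap<>();
        for (Long timestamp : sorted) {
            Long second = timestamp / 1000; // key: timestamp truncated to seconds
            if (!unique.containsKey(second)) {
                unique.put(second, timestamp); // keep only the first value per second
            }
        }
        return new ArrayList<Long>(unique.values());
    }

    public static void main(String[] args) {
        List<Long> input = Arrays.asList(
                1557023294100L, 1557023294200L, 1554052261000L, 1557023292000L);
        System.out.println(dedupeKeepingLargest(input));
        // prints [1557023294200, 1557023292000, 1554052261000]
    }
}
```

If the input order should be kept instead, skipping the sort step would make the map retain the first timestamp encountered for each second rather than the largest.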

  • Unfortunately I can use only Java 7
    – samba
    Commented Apr 24, 2019 at 11:52
  • @Deadpool yeah, he updated the question after I gave my answer. I've updated mine as well.
    Commented Apr 24, 2019 at 11:58
  • Thanks. The only thing is that the order matters and when filtering out (1557023294200 vs 1557023294100) I have to leave 1557023294200 instead of 1557023294100
    – samba
    Commented Apr 24, 2019 at 12:03
  • @samba You mean you always want the first value?
    Commented Apr 24, 2019 at 12:05
  • Yes, that's why my initial idea was to first sort the timestamps in reverseOrder
    – samba
    Commented Apr 24, 2019 at 12:05
