9
\$\begingroup\$

Context

I'll be passing an interview coding test in java soon. I am experienced with other programming languages but am very unfamiliar with Java. I am looking for critiques of the coding style much more than of the algorithm or the performance, but I am unfamiliar with the java library so any pointers to appropriate library functions or data structures are welcome.

The problem

(Source: https://adventofcode.com/2023/day/5 )

We are given an input text file containing seed ranges and integer range maps. We must map all the seeds by applying all the maps one by one, then return the minimum of all the resulting ranges.

An example input file:

seeds: 79 14 55 13

seed-to-soil map:
50 98 2
52 50 48

soil-to-fertilizer map:
0 15 37
37 52 2
39 0 15

fertilizer-to-water map:
49 53 8
0 11 42
42 0 7
57 7 4

water-to-light map:
88 18 7
18 25 70

light-to-temperature map:
45 77 23
81 45 19
68 64 13

temperature-to-humidity map:
0 69 1
1 0 69

humidity-to-location map:
60 56 37
56 93 4

The seed ranges are given as pairs range_start range_length, so in this case the four numbers 79 14 55 13 represent two ranges [79, 79+14) and [55, 55+13).

The maps are given as triplets destination_range_start source_range_start range_length, so for instance the first map:

seed-to-soil map:
50 98 2
52 50 48

maps range [98, 98+2) to [50, 50+2) and range [50, 50+48) to [52, 52+48).

After applying the seven maps to the initial list of seed ranges, we get a list of location ranges, and we must output the smallest location number.

The actual input file is much longer than this example and contains numbers that don't fit in int, so I used long throughout.

I stored the seed ranges in a Queue<long[]> as an attribute of class SeedMaps, and the maps in a List<List<long[]>> also as an attribute of SeedMaps. There is only one instance of this class in the program.

Class ReadFile is used to parse the input file but it only has one static function and is never instanciated.

The main algorithm consists in popping a range from the queue, searching for a range in the map that intersects it, then depending on how they intersect, adding the subrange that was mapped to the list of resulting intervals, and adding the subrange that wasn't mapped, if any, back to the queue.

import java.io.File; // read file
import java.io.FileNotFoundException;  // read file
import java.util.Scanner; // read text files
import java.util.Map; // Map.Entry for pairs
import java.util.AbstractMap; // AbstractMap.SimpleImmutableEntry for pairs
import java.util.List; // list of seed-to-soil, soil-to-fertilizer, etc., maps
import java.util.ArrayList; // list of seed-to-soil, etc., maps
import java.util.Arrays; // Arrays.stream(...).mapToInt(Integer::parseInt).toArray();
import java.util.Queue; // queue of seed intervals
import java.util.LinkedList; // queue of seed intervals

class ReadFile {
  public static Map.Entry<long[], List<List<long[]>>> readMaps(String filename) {
    try {
      File myInputFile = new File(filename);
      Scanner myReader = new Scanner(myInputFile);
      long[] seeds = null;
      List<List<long[]>> maps = List.of(new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>());
      //List<List<long[]>> maps = new ArrayList<List<long[]>>();
      //maps.add(new ArrayList<long[]>());
      int mapIndex = -1;
      while (myReader.hasNextLine()) {
        String line = myReader.nextLine();
        if (line.startsWith("seeds:")) {
          String[] seedsAsStrings = line.split(":", 2)[1].trim().split("\s+");
          seeds = Arrays.stream(seedsAsStrings).mapToLong(Long::parseLong).toArray();
        } else if (line.indexOf("map:") != -1) {
          mapIndex += 1;
        } else if (line.length() < 2) {
          ;
        } else {
          long[] numbers = Arrays.stream(line.split("\s+")).mapToLong(Long::parseLong).toArray();
          long sourceStart = numbers[1];
          long sourceEnd = numbers[1] + numbers[2];
          long increment = numbers[0] - numbers[1];
          long[] startEndIncr = {sourceStart, sourceEnd, increment};
          maps.get(mapIndex).add(startEndIncr);
        }

      }
      myReader.close();
      return new AbstractMap.SimpleImmutableEntry<>(seeds, maps);
    } catch (FileNotFoundException e) {
      System.out.println("Error: File not found.");
      e.printStackTrace();
    }
    return null;
  }
}

class SeedMaps {
  private List<List<long[]>> maps;
  private Queue<long[]> seedsQueue;

  public SeedMaps(long[] seedsArray, List<List<long[]>> maps) {
    this.maps = maps;
    this.seedsQueue = new LinkedList<long[]>();
    for (int i = 0; i + 1 < seedsArray.length; i += 2) {
      long[] seedInterval = {seedsArray[i], seedsArray[i] + seedsArray[i+1]};
      this.seedsQueue.add(seedInterval);
    }
  }

  void mapIntervalAndUpdateQueues(long[] range, long[] sourceInterval, Queue<long[]> destQueue) {
    if (range[0] <= sourceInterval[0]) {
      if (range[1] < sourceInterval[1]) { // r[0] <= src[0] <= r[1] < src[1]
        long[] destInterval = {sourceInterval[0] + range[2], range[1] + range[2]};
        long[] sourceRightInterval = {range[1], sourceInterval[1]};
        seedsQueue.add(sourceRightInterval);
        destQueue.add(destInterval);
      } else { // source entirely contained in range
        long[] destInterval = {sourceInterval[0] + range[2], sourceInterval[1] + range[2]};
        destQueue.add(destInterval);
      }
    } else {// src[0] < r[0] <= src[1]
      if (sourceInterval[1] <= range[1]) { // src[0] < r[0] < src[1] <= r[1]
        long[] destInterval = {range[0] + range[2], sourceInterval[1] + range[2]};
        long[] sourceLeftInterval = {sourceInterval[0], range[0]};
        seedsQueue.add(sourceLeftInterval);
        destQueue.add(destInterval);
      } else { // src[0] < r[0] < r[1] < src[1]
        long[] destInterval = {range[0], range[1]};
        long[] sourceRightInterval = {range[1], sourceInterval[1]};
        long[] sourceLeftInterval = {sourceInterval[0], range[0]};
        seedsQueue.add(sourceLeftInterval);
        seedsQueue.add(sourceRightInterval);
        destQueue.add(destInterval);
      }
    }
  }
  void applyMapToSeeds(int mapIndex) {
    Queue<long[]> destQueue = new LinkedList<>();
    while (!seedsQueue.isEmpty()) {
      long[] sourceInterval = seedsQueue.remove();
      boolean foundIntersection = false;
      for (long[] range : maps.get(mapIndex)) { // range = {start, end, increment}
        if (sourceInterval[0] < range[1] && range[0] < sourceInterval[1]) {
          foundIntersection = true;
          mapIntervalAndUpdateQueues(range, sourceInterval, destQueue);
          break;
        }
      }
      if (!foundIntersection) { // interval is not modified by map
        destQueue.add(sourceInterval);
      }
    }
    seedsQueue = destQueue;
  }

  public void applyMapsToSeeds() {
    for (int mapIndex = 0; mapIndex < 7; ++mapIndex) {
      applyMapToSeeds(mapIndex);
    }
  }

  public void printSeeds(String name) {
    System.out.println(name + ": ");
    for (long[] interval: seedsQueue) {
      System.out.print("[" + interval[0] + "-" + interval[1] + ") ");
    }
    System.out.println();
  }
  public void printMaps() {
    for (int mapIndex = 0; mapIndex < 7; ++mapIndex) {
      System.out.println("\nMap " + mapIndex);
      for (long[] line : maps.get(mapIndex)) {
        //System.out.print(  "[" + line[1] + "-" + (line[1] + line[2]) + ") --> ");
        //System.out.println((line[0] - line[1] >= 0 ? "+" : "") + (line[0] - line[1]));
        System.out.print(  "[" + line[0] + "-" + line[1] + ") --> ");
        System.out.println((line[2] >= 0 ? "+" : "") + line[2]);
        //System.out.println("[" + line[0] + "-" + (line[0] + line[2]) + "]");
      }
    }
  }

  public long getMinSeed() {
    List<Long> ll = seedsQueue.stream().map((interval) -> interval[0]).toList();
    return ll.stream().mapToLong((x) -> x).min().getAsLong();
  }
}

public class Part2 {
    public static void main(String[] args) {
        String filename = args[0];
    System.out.println("Reading from input file: " + filename);
    Map.Entry<long[], List<List<long[]>>> seedsAndMaps = ReadFile.readMaps(filename);
    SeedMaps maps = new SeedMaps(seedsAndMaps.getKey(), seedsAndMaps.getValue());

    maps.printSeeds("Seeds");

    maps.applyMapsToSeeds();

    maps.printSeeds("Locations");

    System.out.println("Minimum location: " + maps.getMinSeed());
    }
}

On this example input file, the output is:

Reading from input file: input0.txt
Seeds:
[79-93) [55-68)
Locations:
[60-61) [86-90) [94-97) [82-85) [46-56) [56-60) [97-99)
Minimum location: 46
\$\endgroup\$
1
  • 2
    \$\begingroup\$ You might start by using an autoformatter. \$\endgroup\$
    – ggorlen
    Commented Apr 29 at 22:36

2 Answers 2

9
\$\begingroup\$

Your code needs a complete rewrite. I'll show you some pointers towards the next iteration.

import java.io.File; // read file

Forget these comments. The reason why you need an import should be evident from the code. If you need to explain an import, your class may be too big and should be refactored. The first step would be to put each class in its own file.

class ReadFile {

Class names in imperative are odd. I'd expect the class to be named something like SeedParser or similar. I would not name it Reader as Readers have a specific meaning in Java. And I would change the parser to operate on a Reader instead of a File. That way you can write tests using StringReader instead of having to generate a physical file for each test case.

public static Map.Entry<long[], List<List<long[]>>> readMaps(String filename) {

Oh dear. A map of long array to list of list of long arrays... Have you tried documenting that? :) It's almost never a good idea to create nested generic collections or to use arrays as map keys. You should create a domain specific class for holding this data structure and provide domain specific meaningful methods for accessing it. Having human readable method names reduces the need for writing documentation.

The rest of the method just assumes that the input file format is correct and does not do any error checking. This results in a multitude of possible exceptions being thrown, none of which you have documented. Whoever ends up using this code will have to throw a wide net and catch (Exception ex) { ... which is never a good thing.

You should use try with resources when you are working with Files, InputStreams and Readers.

} else if (line.length() < 2) {
    ;

Instead of writing a lone semicolon, write a comment explaining why the branch can be ignored. I'm not going to dig too deep into this readFile method as it's just a huge wall of text with very little structuring. You should try to find logical code blocks from the method and add an empty line between them to help the reader.

List<List<long[]>> maps = List.of(new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>(), new ArrayList<long[]>());

I was expecting a data structure that contained several Map instances instead of a nested list when I saw the name maps. Use a name that describes the purpose of the variable. This is not a Java-specific thing, you should do this in all languages.

long[] numbers = Arrays.stream(line.split("\s+")).mapToLong(Long::parseLong).toArray();

Numbers is always a bad variable name, it's way too generic. We already know that they are numbers because the data type is a long array, so saying that they're "just numbers" is redundant. Again, use a name that tells the reader what the numbers mean.

void mapIntervalAndUpdateQueues(long[] range, long[] sourceInterval, Queue<long[]> destQueue) {

Instead of working with magic length arrays, you should create a Range class that has human readable names for the range start and end (or length, whichever approach you end up using in your code). you can put validation of constructor parameters to that class so you always know that the range you are working on is valid.

Oh, and remember that the most important thing you can do in an interview coding assignment is to ask questions. It shows that you are able to spot the (maybe intentional) ambiguities in the assignment. Good luck.

\$\endgroup\$
2
  • \$\begingroup\$ Thanks for all the great advice! Map.Entry<long[], List<List<long[]>>> was not meant to be a map of long array to list of list of long arrays. I was looking on stackoverflow what they recommended when a function is supposed to return or modify more than one variable. In some languages a function can return a pair, in some languages a function can have an "out" parameter. I found three stackoverflow questions about it and they all had several answers recommending to use Map.Entry to return a pair. \$\endgroup\$
    – Stef
    Commented May 3 at 8:40
  • \$\begingroup\$ Now I'm thinking it's a bad idea and I should instead try to use dedicated types, and try to never write a function that returns a pair of unrelated variables. In this case I could return a SeedMaps object directly. \$\endgroup\$
    – Stef
    Commented May 3 at 8:40
7
+50
\$\begingroup\$

Use proper types / Avoid abusing Arrays and Collections

Put on yourself in the shoes of the code reader and try to answer the question: "What do they make of this type?"

Map.Entry<long[], List<List<long[]>>>

It's not quite right at all... in many ways.

Starting with the simplest:

  • Favor collections over array, and avoid mixing them.
  • Map.Entry used outside Map.ofEntries() is a smell that the code author is skimping on defining proper domain types. And by the way, since Java 9 we have Map.entry(), no need to use nested classes of AbstractMap; but the point remains: an instance of employing map-entry beside the case of defining an immutable map using method ofEntries() is likely an abuse.
  • Don't use neither arrays, no collections to substitute your domain objects.

Utilizing built-in data containers in such way make the code convoluted and difficult to follow. It hides the intent of the code from the reader.

I understand that the task itself is very contrived: there are some numbers in the file, they relate to each other in some way, read the file and do some number-crunching. This challenge gives little room for applying proper design because all types (seed, fertilizer, etc.) are deprived of business logic and properties. But despite that, the implementation might look better.

When I said about using proper domain types, I didn't mean something like SeedMaps which in your code models these weird conversions between the types that the challenge author came up with, rather that the types (Seed, Location, etc.).

How can the reader of the code get a clue that a notion of Location is hiding somewhere inside this monstrosity Map.Entry<long[],List<List<long[]>>>?

(remained: finding the nearest location is the goal of the challenge)

Try to look at your code from the reader's perspective

Methods such as readMaps() and mapIntervalAndUpdateQueues() bear too much cognitive load, they are exceedingly long and filled to the brim with low-level logic. That the indicator of a deficient design, don't blind on it.

Closing Resources

Method readMaps() might leak the resource, for instance, if a file is malformed and parsing fails. In such scenario, the system will not be notified that the file is no longer used because method close() is invoked only inside the try.

Either use try-with-resources (as TorbenPutkonen suggested) or move the invocation of close() to the finally block (which is more noisy).

Don't keep commented out code in place

Simply remove it.

Use Git, don't pollute code elements by keeping commented out old code. You can always reset the commit if needed.

Avoid returning null

Depending on the use case, either throw an Exception of an appropriate type, or return an Optional. "Happy path" should produce an optional containing a value, and other brunches should yield an empty optional.

To get a better idea on how to leverage Optional to its full power take a look at this article from the Oracle's Java Magazine 12 recipes for using the Optional class as it’s meant to be used.

These are also a few other advanced approaches like Null-object pattern, which are less commonly used.

Refactoring

As I said, the challenge itself is rather a cartoon example and has nothing to do with how domain objects might interact with each other in a real system.

In a real world scenario, we're unlikely to use such kind of generic converters as shown below, Fertilizer would probably be a rich domain entity, Temperature and Humidity will be value objects, and we wouldn't be literally "converting" them into one another.

So please, focus on the overarching design.

Firstly, let's introduce proper types Seed, Soil, Location, etc., regardless of how much data they are intended to hold in this dummy task, they are doing a very important job of making the code reader familiar with the fundamental notions involved in the problem at hand. They are no longer array indices, but explicit concepts.

To describe conversions between these types, we can define a generic class Converter tasked with transforming one domain object into another. Converter would require all this information about ranges, so to let's also define ConversionRange record to represent them.

Converter:

/**
 * @param <S> the type of the source object
 * @param <D> the type of the destination object
 */
public class Converter<S extends Convertible, D> {
    private final List<ConversionRange> ranges;
    private final LongFunction<D> destFactory;
    
    public Converter(List<ConversionRange> ranges,
                     LongFunction<D> destFactory) {
        this.ranges      = ranges;
        this.destFactory = destFactory;
    }
    
    public D convert(S source) {
        long sourceValue = source.getConversionValue();
        
        return ranges.stream()
            .filter(range -> range.isWithinRange(sourceValue))
            .findFirst()
            .map(range -> getDestValue(range, sourceValue))
            .map(destFactory::apply)
            .orElseGet(() -> destFactory.apply(sourceValue));
    }
    
    private long getDestValue(ConversionRange range, long sourceValue) {
        return range.destStart() + sourceValue - range.sourceStart();
    }
}

ConversionRange is a perfect candidate to be a Java 16 record, a transparent data-carrier:

public record ConversionRange(long sourceStart, long destStart, long rangeLength) {
    private static final Pattern WHITE_SPACE = Pattern.compile("\\s+");
    private static final Predicate<String> IS_VALID_RANGE = Pattern
        .compile("\\d+\\s+\\d+\\s+\\d+")
        .asMatchPredicate();
    
    public boolean isWithinRange(long sourceId) {
        return sourceId >= sourceStart
            && sourceId <  sourceStart + rangeLength;
    }
    
    public static ConversionRange parse(String input) {
        var values = WHITE_SPACE.split(input);
        
        return new ConversionRange(
            Long.parseLong(values[1]),
            Long.parseLong(values[0]),
            Long.parseLong(values[2])
        );
    }
    
    public static boolean isValid(String rowRange) {
        return IS_VALID_RANGE.test(rowRange);
    }
}

Interface Convertible that all domain types a supposed to implement:

note that it is inappropriate for them to have a common parent class because it doesn't make sense for Location, Humidity and Firtilizer to have any sort of hierarchical relationships

public interface Convertible {
    long getConversionValue();
}

Domain types:

public record Seed(long id) implements Convertible {
    @Override
    public long getConversionValue() { return id; }
}

// Soil, Fertilizer, etc. not  shown for bravity

public record Location(long id) implements Convertible {
    @Override
    public long getConversionValue() { return id; }
}

These are the abstractions representing the basic notions in the task at hand.

Now it's time to address the task of finding the nearest location, the approach can be described as follows:

  • parse seeds;
  • parse converters;
  • apply all converters in the stream to produce a stream of locations from a stream of seeds, and then find the nearest one.

These hi-level methods representing this logic:

private static final Logger LOGGER = LoggerFactory.getLogger(NameOfEnclosingClassHere.class);

public static Optional<Location> nearestLocation(Path path) {
    try (BufferedReader reader = Files.newBufferedReader(path)) {
        return parseFile(reader);
    } catch (IOException e) {
        LOGGER.error("failed to read the file " + path, e);
        return Optional.empty();
    }
}

private static Optional<Location> parseFile(BufferedReader reader) throws IOException {
    return parseSeeds(reader)
        .map(parseSoilConverter(reader)::convert)
        // ... apply other converters
        .map(parseLocationConverter(reader)::convert)
        .min(comparingLong(Location::getConversionValue));
}

The code above shows a simple example of how to log an exception. You can just add logback-classic dependency and it'll work out of the box, autoconfigured console appender is used by default. For more information take a look at Logback manual.

Also note, that it's not nessesary to handle exceptions right on the spot where they occur, instead exception handling in many applications is centralized (frameworks, like Spring, are offering capabilities for this). So, an alternative option is to return a valid location (which is not null) and if something went wrong throw a custom subtype of RuntimeException, discrabing your use case, that wraps caught checked exception.

Note that in method references like parseSoilConverter(reader)::convertthat invocation of parseSoilConverter() happence only once when the object representing the method reference is being constructed (not when method reference is being evaluated for every stream element)

Below are methods containing the actual low-level logic conserned with reading and parsing data.

An invocation of advanceToMarker() that moves the reader to a line containg the target data precedes each phase of parsing.

Parsing Seeds:

private static final String SEEDS = "seeds:";
private static final String SOIL_MAPPING = "seed-to-soil map:";
// other markers expected to be present in the file
private static final String LOCATION_MAPPING = "humidity-to-location map:";

private static final Pattern WHITE_SPACE = Pattern.compile("\\s+");
private static final Predicate<String> IS_NUMBER = Pattern
    .compile("\\d+")
    .asMatchPredicate();

private static Stream<Seed> parseSeeds(BufferedReader reader) throws IOException {
    return WHITE_SPACE
        .splitAsStream(advanceToMarker(SEEDS, reader))
        .filter(IS_NUMBER)
        .map(Integer::valueOf)
        .map(Seed::new);
}

private static String advanceToMarker(String marker,
                                      BufferedReader reader) throws IOException {
    String line = "";
    while (!line.startsWith(marker)) {
        line = reader.readLine();
    }
    return line.strip();
}

Core logic for parsing Converters:

private static <S extends Convertible, D> Converter<S, D> parseConverter(List<String> rowRanges,
                                                                         LongFunction<D> destFactory) throws IOException {
    return rowRanges.stream()
        .filter(ConversionRange::isValid)
        .map(ConversionRange::parse)
        .collect(collectingAndThen(
            toList(),
            ranges -> new Converter<>(ranges, destFactory)
        ));
}

private static List<String> readRanges(BufferedReader reader) throws IOException {
    List<String> rowRanges = new ArrayList<>();
    String line;
    while (!(line = reader.readLine()).isBlank()) {
        rowRanges.add(line);
    }
    return rowRanges;
}

Helper-methods for parsing individual converters:

private static Converter<Seed, Soil> parseSoilConverter(BufferedReader reader) throws IOException {
    advanceToMarker(SOIL_MAPPING, reader);
    return parseConverter(readRanges(reader), Soil::new);
}

private static Converter<Soil, Fertilizer> parseFertilizerConverter(BufferedReader reader) throws IOException {
    advanceToMarker(FERTILIZER_MAPPING, reader);
    return parseConverter(readRanges(reader), Fertilizer::new);
}

// methods for parsing converters of Water, Light, etc. implemented analogously

private static Converter<Humidity, Location> parseLocationConverter(BufferedReader reader) throws IOException {
    advanceToMarker(LOCATION_MAPPING, reader);
    return parseConverter(readRanges(reader), Location::new);
}

Finally, client code using nearestLocation()

public static void main(String[] args) {
    Path file = Path.of(" Path to the file here "); // either relative or absolute
    
    nearestLocation(file)
        .ifPresent(System.out::println);
}

And if you need to retry in case if something went wrong and the Optional is empty, you need only add one call in this fluent chain:

nearestLocation(file)
    .or(() -> nearestLocation(alternativeFile))
    .ifPresent(System.out::println);
\$\endgroup\$
2
  • \$\begingroup\$ Thanks for the ideas! This task is indeed a bit too number-crunchy but I'll try to use more domain types in the next tasks. \$\endgroup\$
    – Stef
    Commented May 3 at 8:36
  • 1
    \$\begingroup\$ @Stef I'm glad that you found this review helpful and appreciate your decision to distinguish it. I revised it and added some clarifications regarding logging and exception handling, simplified some parts of code and added validation so that parsing is guaranteed to succeed. So you might be interested in revisiting it. \$\endgroup\$ Commented May 7 at 20:47

Not the answer you're looking for? Browse other questions tagged or ask your own question.