7
\$\begingroup\$

I have this code which I wrote to parse arithmetic expressions.

However, many people say there are problems with it but don't tell me what the problem is. Please let me know if you see any.

Note that I know there is another way to do this. But I did it specifically this way as I am influenced by NLP parsing style and I wanted to do it the NLP style as well.

public class Parser {

private static final String P_CLOSE = ")";
private static final String P_OPEN = "(";


private Map<Integer, Expression> frontCachingChart = new HashMap<>();
private Map<Integer, Expression> backCachingChart = new HashMap<>();

// maps opener to closer
private Map<Integer, Integer> closingMapper = new HashMap<>();
// maps closer to opener
private Map<Integer, Integer> openingMapper = new HashMap<>();

private LinkedList<Op> opsList = new LinkedList<>();

public Expression parse(String input) {
    init();

    String org = input;
    String[] tokens = extractTokens(input);

    LinkedList<Integer> stack = new LinkedList<>();
    int depth = 0;
    for (int i = 0; i < tokens.length; i++) {
        String token = tokens[i];
        if (P_OPEN.equals(token)) {
            depth++;
            stack.push(i);
        } else if (P_CLOSE.equals(token)) {
            depth--;
            Integer opener = stack.pop();
            closingMapper.put(opener, i);
            openingMapper.put(i, opener);
        } else {
            OpType opType = Op.getOp(token);
            if (opType != null) {
                Op op;
                if (!stack.isEmpty()) {
                    Integer opener = stack.peek();
                    op = new Op(i, depth, opener, opType);
                } else {
                    op = new Op(i, depth, 0, opType); // top level might have no opener
                    closingMapper.put(0, tokens.length - 1);
                }
                opsList.add(op);
            } else {
                // this is a literal
                Expression value = new Expression();
                value.literal = token;

                // literals must have open and close as well
                value.open = i;
                value.close = i;

                frontCachingChart.put(i, value);
                backCachingChart.put(i, value);

                // additional literal parsing for things like
                // ((a)) or (((a)))
                additionalWrappingParsing(i, i + 1, tokens, value);
            }
        }
    }
    Comparator<Op> opPriorityComparator = new Comparator<Op>() {
        @Override
        public int compare(Op o1, Op o2) {
            // we want high priority op type first
            return o2.type.priority - o1.type.priority;
        }
    };
    // make sure higher priority op types are looked at first
    Collections.sort(opsList, opPriorityComparator);

    // make sure deeper ops are looked at first, the sort is stable so higher priority of same depth come first
    // we will do bottom up parsing in this case to ease parsing left to right of expression like a + b + c + d
    Collections.sort(opsList);

    Expression ex = parse(tokens);
    ex.org = org;
    return ex;
}

private void init() {
    frontCachingChart.clear();
    backCachingChart.clear();
    closingMapper.clear();
    openingMapper.clear();
    opsList.clear();
}

private String[] extractTokens(String input) {
    input = input.replaceAll("\\(", " ( ");
    input = input.replaceAll("\\)", " ) ");
    input = input.replaceAll("\\+", " + ");
    input = input.replaceAll("-", " - ");
    input = input.replaceAll("\\*", " * ");
    input = input.replaceAll("/", " / ");

    String[] tokens = input.trim().split("\\s+");
    return tokens;
}

private Expression parse(String[] tokens) {

    // this loop will be linear only
    Expression ret = null;
    while(!opsList.isEmpty()) {
        Op oneOp = opsList.poll();
        Expression subExpression = new Expression();

        subExpression.op = oneOp;
        subExpression.open = oneOp.opener;
        subExpression.close = closingMapper.get(subExpression.open);

        // working on left hand side
        int leftOfOpIndex = oneOp.index - 1;
        subExpression.left = backCachingChart.get(leftOfOpIndex);
        if (subExpression.left != null) {
            subExpression.open = subExpression.left.open;
        }

        // working on right hand side
        int rightOfOpIndex = oneOp.index + 1;
        subExpression.right = frontCachingChart.get(rightOfOpIndex);
        subExpression.close = subExpression.right.close;

        // simplify the special case of "(-2)"
        if (subExpression.left == null && subExpression.op.type == OpType.MINUS) {
            subExpression.literal = "-" + subExpression.right.literal;
            subExpression.op = null;
            subExpression.open = oneOp.index;
        }

        frontCachingChart.put(subExpression.open, subExpression);
        backCachingChart.put(subExpression.close, subExpression);

        additionalWrappingParsing(subExpression.open,
                subExpression.close + 1, tokens, subExpression);
        ret = subExpression;
    }

    return ret;
}

/**
 * Dealing with cases of '((a))' or '(((a)))' or '(((a + b)))'
 * @param subExpression 
 */
private void additionalWrappingParsing(int start, int end, String[] tokens, Expression expression) {
    int diff = 1;
    int leftIndex = start - diff;
    int rightIndex = end + diff - 1;
    while(leftIndex >= 0 && rightIndex < tokens.length) {
        String left = tokens[leftIndex];
        String right = tokens[rightIndex];
        if (P_OPEN.equals(left) && P_CLOSE.equals(right)) {
            // literals must have open and close too
            expression.open = leftIndex;
            expression.close = rightIndex;

            frontCachingChart.put(leftIndex, expression);
            backCachingChart.put(rightIndex, expression);
        } else {
            break;
        }
        diff++;
        leftIndex= start - diff;
        rightIndex = end + diff - 1;
    }
}

static class Expression {
    int open;
    int close;
    Op op;

    Expression left;
    Expression right;

    String literal;

    String org;

    @Override
    public String toString() {
        if (literal != null) {
            return literal;
        } else {
            return P_OPEN + left + ") " + op.type + " (" + right + P_CLOSE;
        }
    }
}

static class Op implements Comparable<Op> {
    OpType type;
    int index;
    int depth;
    int opener;

    Op(int index, int depth, int opener, OpType type) {
        this.index = index;
        this.depth = depth;
        this.opener = opener;
        this.type = type;
    }

    @Override
    public int compareTo(Op o) {
        return o.depth - depth;
    }

    public static OpType getOp(String token) {
        if ("+".equals(token)) {
            return OpType.PLUS;
        } else if ("-".equals(token)) {
            return OpType.MINUS;
        } else if ("*".equals(token)) {
            return OpType.MUL;
        } else if ("/".equals(token)) {
            return OpType.DIV;
        }
        return null;
    }

}

static enum OpType {
    PLUS(0, "+"), MINUS(0, "-"), MUL(1, "*"), DIV(1, "/");

    int priority;
    String str;

    private OpType(int priority, String str) {
        this.priority = priority;
        this.str = str;
    }

}
}
\$\endgroup\$
0

1 Answer 1

5
\$\begingroup\$

Is this homework? Or just for fun? If it is for real work, I would suggest using an existing framework. I have not done parsing myself, but it is a common use-case and there are many frameworks. I vaguely know of antlr.

I am not a specialist in the domain, so I can't realy comment on your algorithm. But here are some general observations.

  1. You can put the getOp method as a static method in the enum OpType. There is probably also an awful lot more code which you could put in this enum.

  2. Initializing your member variables to -1 is a bit unusual. There is only one constructor which sets all of them, so the initialization is useless (in Op).

  3. You should use getters/setters for class Expression. I know you don't actually need to, but it is the Java style. Besides, this library might grow at some point and you would likely extract Expression in its own file, where you would definitely need to add getters/setters.

  4. Your methods are much too long. One method extraction which I noticed right away: at the start of parse(String input), you use about 10 lines to split the string, so put that in a method. Even inside for/while loops, you can extract some blocks of code as methods. The code becomes more readable that way.

  5. You should think a bit more because I am certain you can make this much more OO. What screamed the need for OO to me was this:

    private static Expression parse(String[] tokens, LinkedList<Op> opsList,
        Map<Integer, Integer> closingMapper, Map<Integer, Integer> openingMapper,
        Map<Integer, Expression> frontCachingChart,
        Map<Integer, Expression> backCachingChart )
    
\$\endgroup\$
1
  • \$\begingroup\$ For your question: This is definitely NOT for homework. Purely for fun, not for production at all. Using any framework would defeat the purpose. \$\endgroup\$
    – InformedA
    Commented Jul 14, 2014 at 16:07

Not the answer you're looking for? Browse other questions tagged or ask your own question.