18
\$\begingroup\$

We can write mathematical expressions using the standard math operators (,),+,*,/,- available pretty much universally. We allow the symbols a, b, c, d and integers (e.g. 1, 45, etc.) but will restrict to just these four symbols. (Bonus points if you can handle more.) The goal is to take an expression as input and output a shorter expression that is mathematically equivalent to the input.

Edit: implicit multiplication such as 2a is not valid. It should be 2*a.

[To make this a bit easier, we will only ever divide by integers, and never the symbols. In other words, an expression is a polynomial in 4 variables with rational coefficients.]

Examples

input  1: 1+1/2  (5 characters)
output 1: 3/2    (3 characters)

input  2: a*(1-b)*(1+b)  (13 characters)
output 2: a-a*b*b        (7 characters)

input  3: a*b*2/7-a*b*b/7  (15 characters)
output 3: a*b*(2-b)/7      (11 characters)

input  4: b*c*c+2*a*c*d-2*a*c-2*b*c-c*c-2*a*d-2*c*d+4*c+2*d  (49 characters)
output 4: c*(b*c-2*a-2*b-c+4)+2*d*(1-a)*(1-c)                (35 characters)

input  5: a*c+b*c+a*d+b*d+3*a+3*b+c+d+4  (29 characters)
output 5: (1+a+b)*(3+c+d)+1              (17 characters)

Scoring

Shortest (valid) output wins. As a result, whoever has the shortest total (valid) output from the 20 randomly generated expressions is the winner. I will also verify that the winning solution actually works by creating 20 more, similar expressions and testing those.

There is a shortest solution in each case, so if multiple people come up with a solution that finds the shortest answer for every input, and not just the ones below, whoever got it first wins.

"Random" Inputs

(2*a*a*a-2*a*a*d+2*b*d*d+d*d*d-2*a*a+2*a*b-b*c-a*d+b*d+a+c)/2
(-d*d*d+2*c+d)/2
-b*c*c-2*a*c*d+2*a*c+2*b*c+2*a*d
2*a*a*c+2*a*c*d-2*c*d*d-2*b*b+2*a*c+2*b*c-2*c*d-2*d*d-2*a-b+2*c+d+1
-b*c*c-2*a*c*d+c*c+2*c*d+2*a+b
(-2*a*a*b*b+a*a*a*c+b*c*c*d+2*c*d*d*d-d*d*d-2*a*c+2*b*c+2*c*c+d*d)/3
(-2*a*a+b*c-2*b*d+2*a-2*b-1)/2
(b*b*c+a*c*c-2*b*d*d-c*d*d-a*a-2*a*b-b*b-a*c+c*c+2*a*d-b*d-2*d*d)/4
(a*a+b*d-c*d-c-2*d)/2
-a*a+2*a*b+2*b*b+2*a*c-a*d-2
-b*c*c-2*a*c*d+2*a+b
(a*b*b+2*a*b*c-b*c*c-a*d*d-b*b+a*c+b*c-c*c+2*a*d+a+2*b+d)/4
(-8*a*b*c-4*a*a*d+4*a*a+8*a*b+4*a*c+4*b*c+4*a*d-8*a-4*b+3)/3
(-2*a*a*a*a+c*c*c*c-a*b*c*d+2*b*b*c*d-a*c*c*d-2*c*c*c*d+a*a-2*b*b+2*a*c+a*d)/3
b*c*c+2*a*c*d-2*a*c-2*b*c-c*c-2*a*d-2*c*d+4*c+2*d
(-2*a*b*b*b+c*c*c*c-2*a*a)/3
\$\endgroup\$
9
  • 11
    \$\begingroup\$ Welcome to Code Golf! For future reference, we recommend using the Sandbox to get feedback on challenge ideas before posting them to main. I can guarantee you that someone will be able to come up with a program which always produces the optimal output. Your examples include ^, but you don't mention this in the first line. I'd suggest removing ^ and using multiplication instead, otherwise the task becomes exponentially more complex and (cont.) \$\endgroup\$ Commented Apr 15, 2021 at 18:01
  • 1
    \$\begingroup\$ (cont.) will include difficult-to-handle edge cases, such as raising one variable to the power of another. Additionally, I've replaced the [math] tag with the [string] tag, as it's more relevant given the input will be a string \$\endgroup\$ Commented Apr 15, 2021 at 18:01
  • 3
    \$\begingroup\$ I didn't know that the Sandbox exists, my bad. Also, added a winning criteria in case people get the optimal output for the inputs I supplied. If you think there is a better winning criteria, let me know. \$\endgroup\$
    – Jay
    Commented Apr 15, 2021 at 18:16
  • 3
    \$\begingroup\$ Is there any restriction on running time? If I try to generate all possible strings one by one, and check equality, I'm sure I can get shortest result (in theory). \$\endgroup\$
    – tsh
    Commented Apr 16, 2021 at 5:11
  • 3
    \$\begingroup\$ I suppose there should be a restriction. I'm not sure what is reasonable, though. Perhaps 1 hour per input. That would eliminate exhaustive search. \$\endgroup\$
    – Jay
    Commented Apr 16, 2021 at 14:53

1 Answer 1

9
+200
\$\begingroup\$

Python 3.10

import copy, collections
import math, itertools

REDUCE_COUNT = itertools.count(1)

def parse_expr(s, paren=False):
    expr_vars, expr_val, expr_demoninator, expr_groups = [], None, 1, []
    expr_container, flag = [], True
    while s and (not paren or s[0] != ')'):
        if s[0].isdigit():
            expr_val = (expr_val or 1) * int(s[0])
        elif s[0].isalpha():
            expr_vars.append(s[0])
        elif s[0] == '*':
            pass
        elif s[0] == '/':
            expr_demoninator *= int(s[1])
            s, flag = s[2:], False
        elif s[0] in {'+', '-'}:
            if expr_val is not None or expr_vars or expr_groups:
                expr_container.append({'type':'expr', 'coefficient':expr_val or 1, 'denominator':expr_demoninator, 'vars':sorted(expr_vars, key=lambda x:(isinstance(x, (dict, list)), str(x))), 'groups':sorted(expr_groups, key=str)})
            expr_vars, expr_val, expr_demoninator, expr_groups = [], 1 if s[0] == '+' else -1, 1, []
        elif s[0] == '(':
            container, s = parse_expr(s[1:], True)
            expr_groups.append(container)
        elif s[0] == ' ':
            pass
        if flag:
            s = s[1:]
        flag = True
    if expr_val is not None or expr_vars or expr_groups:
        expr_container.append({'type':'expr', 'coefficient':expr_val or 1, 'denominator':expr_demoninator, 'vars':sorted(expr_vars, key=lambda x:(isinstance(x, (dict, list)), str(x))), 'groups':sorted(expr_groups, key=str)})
    return {'type':'container', 'exprs':expr_container}, s

def expand_expr(expr):
    def update_expr_vars(expr, expr_vars):
        expr['coefficient'], expr['vars'], expr['denominator'] = expr['coefficient']*expr_vars['coefficient'], sorted(expr['vars']+expr_vars['vars']), expr['denominator']*expr_vars['denominator']
        return expr

    def to_expr_var(expr):
        return {'coefficient':expr['coefficient'], 'vars':expr['vars'], 'denominator':expr['denominator']}

    def distribute(expr, expr_vars={'coefficient':1, 'vars':[], 'denominator':1}):
        if expr['type'] == 'container':
            for i in expr['exprs']:
                yield from distribute(i, expr_vars)
        else:
            new_expr = update_expr_vars(eval(str(expr)), expr_vars)
            if not new_expr['groups']:
                yield new_expr
            else:
                all_vars = [to_expr_var(new_expr)]
                for i in new_expr['groups']:
                    all_vars = [x for y in all_vars for x in distribute(i, y)]
                yield from all_vars

    return {'type':'container', 'exprs':[*distribute(expr)]}

def reduce_terms(expr):
    lcm, d = math.lcm(*[i['denominator'] for i in expr['exprs']]), collections.defaultdict(int)
    for i in expr['exprs']:
        d[str([i['vars'], i['groups'], i.get('reduce', 0)])] += i['coefficient']*(lcm/i['denominator'])

    return {'type':'container', 'exprs':[{'type':'expr', 'coefficient':int(b), 'denominator':lcm, 'vars':eval(a)[0], 'groups':eval(a)[1]} for a, b in d.items() if b]}
    
def render_expr(expr, d=0):
    if expr['type'] == 'container':
        r_s = ''
        for i in expr['exprs']:
            if (v:=render_expr(i, 1)):
                if v[0] == '-' and r_s and r_s[-1] == '+':
                    r_s = r_s[:-1]+v+'+'
                else:
                    r_s += v + '+'
        if r_s and r_s[-1] == '+':
            r_s = r_s[:-1]
        return '' if not r_s else '('*d + r_s + ')'*d

    if not any([expr['vars'], expr['groups']]):
        return str(expr['coefficient'])+('/'+str(expr['denominator']) if expr['denominator'] != 1 else '')

    result = ('' if abs(expr['coefficient']) == 1 else str(expr['coefficient'])) + ('-' if expr['coefficient'] == -1 else '')
    if expr['vars']:
        result += ('*'*(bool(result) and result[-1] != '-'))+'*'.join(expr['vars'])
    if expr['groups']:
        result += ('*'*(bool(result) and result[-1] != '-'))+'*'.join(render_expr(i, 1) for i in expr['groups'])
    if expr['denominator'] != 1:
        result += '/'+str(expr['denominator'])
    
    return result


def factor_combos(all_combos, c = []):
    if c: yield c
    if all_combos:
        for i in all_combos[0]:
            yield from factor_combos(all_combos[1:], c+[i])

        yield from factor_combos(all_combos[1:])

def expr_factors(expr):
    assert expr['type'] == 'expr'
    yield from factor_combos([
        [{'type':'coefficient', 'v':i} for i in range(2, abs(expr['coefficient'])+1) if not expr['coefficient']%i], 
        [{'type':'vars', 'v':[*j]} for x in range(len(expr['vars'])+1) for j in itertools.combinations(expr['vars'], x)],
        [{'type':'groups', 'v':[*j]} for x in range(len(expr['groups'])+1) for j in itertools.combinations(expr['groups'], x)] 
        ])

def factor_match(expr, factor):
    new_expr = eval(str(expr))
    if 'coefficient' in factor:
        if expr['coefficient']%factor['coefficient']: 
            return False
        new_expr['coefficient'] = int(new_expr['coefficient']/factor['coefficient'])
    if 'vars' in factor:
        c1, c2 = collections.Counter(expr['vars']), collections.Counter(factor['vars'])
        for a, b in c2.items():
            if a not in c1 or b > c1[a]:
                return False
            c1[a] -= b
        new_expr['vars'] = sorted([x for a, b in c1.items() for x in ([a]*b)])

    if 'groups' in factor:
        c1, c2 = collections.Counter(map(str, expr['groups'])), collections.Counter(map(str, factor['groups']))
        for a, b in c2.items():
            if a not in c1 or b > c1[a]:
                return False
            c1[a] -= b
        
        new_expr['groups'] = sorted([eval(x) for a, b in c1.items() for x in ([a]*b)], key=str)

    return new_expr

def group_common_factor(exprs, factor):
    factor = {i['type']:i['v'] for i in factor}
    matched, non_matched = [], []
    for expr in exprs:
        if (m:=factor_match(expr, factor)):
            matched.append(m)
        else:
            non_matched.append(expr)

    if len(matched) < 2:
        return [], non_matched

    
    old_denominator = matched[0]['denominator']
    for i in matched:
        i['denominator'] = 1

    new_expr_var = {'type':'expr', 'coefficient':factor.get('coefficient', 1), 'denominator':old_denominator, 'vars':sorted(factor.get('vars', []), key=str), 'groups': sorted(factor.get('groups', [])+[{'type':'container', 'exprs':matched}], key=str)}
    return new_expr_var, non_matched

def factor_ints(d, c = []):
    def gen_num(n, denom):
        return {'type':'expr', 'coefficient':n, 'denominator':denom, 'vars':[], 'groups':[], 'reduce':next(REDUCE_COUNT)}
    
    if not d:
        yield c
    
    else:
        yield from factor_ints(d[1:], c+[d[0]])
        if not any([d[0]['vars'], d[0]['groups']]):
            for i in range(1, d[0]['coefficient']//2):
                yield from factor_ints(d[1:], c+[gen_num(i, d[0]['denominator']), gen_num(d[0]['coefficient'] - i, d[0]['denominator'])])


def base_factor_combos(d, c = []):
    if len(c) > 1 and len(c) < len(d):
        yield sorted(c)
    for i in d:
        if i not in c:
            yield from base_factor_combos(d, c+[i])

def full_base_factors(d):
    return list(map(eval, set(map(str, base_factor_combos(d)))))     

def produce_factor_combinations(exprs, c = [], f = 0):
    yield {'type':'container', 'exprs':sorted(c+exprs, key=str)} 
    if exprs:
        seen_factors = []
        for i in exprs:
            for _factor_group in expr_factors(i):
                if (factor_group:=[*filter(lambda x:x['v'], _factor_group)]) and factor_group not in seen_factors:
                    seen_factors.append(factor_group)
                    matched, non_matched = group_common_factor(exprs, factor_group)
                    if matched:
                        yield from produce_factor_combinations(non_matched, c = c+[matched], f=f)
                        if not f:
                            for base_factor in full_base_factors([*range(len(non_matched))]):
                                if len(base_factor) > 2:
                                    yield from produce_factor_combinations([a for j, a in enumerate(non_matched) if j not in base_factor], c = c+[matched]+[{'type':'expr', 'coefficient':1, 'denominator':matched['denominator'], 'vars':[], 'groups': [{'type':'container', 'exprs':[non_matched[j] for j in base_factor]}], 'base_factor':1}], f = 1)

def run_transformation(expr, len_cache):
    queue, seen = collections.deque([(0, expr)]), []
    while queue:
        level, expr = queue.popleft()
        reduced = reduce_terms(expr)
        s = render_expr(reduced)
        len_cache[len(s)]= s
        if s not in seen and (not len_cache or (len(s) < min(len_cache) or level < 2)):
            seen.append(s)
            for i in produce_factor_combinations(reduced['exprs']):
                queue.append((level+1, i))

def main(expr):
    full_len_cache = {}
    parse_result, _ = parse_expr(expr)
    expanded = expand_expr(parse_result)
    for i, _expr in enumerate(factor_ints(expanded['exprs'])):
        len_cache = {}
        run_transformation({'type': 'container', 'exprs':_expr}, len_cache)
        full_len_cache.update(len_cache)

    return full_len_cache[min(full_len_cache)]
                
if __name__ == '__main__':
    print(main('1+1/2'))
    print(main('a*(1-b)*(1+b)'))
    print(main('a*b*2/7-a*b*b/7'))
    print(main('a*c+b*c+a*d+b*d+3*a+3*b+c+d+4'))
    print(main('b*c*c+2*a*c*d-2*a*c-2*b*c-c*c-2*a*d-2*c*d+4*c+2*d'))

This solution takes in an expression, parses it, and then repeats a process of grouping on possible factors and then combining like terms. While the code is somewhat verbose, the overall idea is very simple. It is somewhat slow on large inputs, and could be optimized further.

\$\endgroup\$
1
  • 5
    \$\begingroup\$ you're eligible for this bounty btw \$\endgroup\$
    – naffetS
    Commented Apr 18, 2022 at 20:35

Not the answer you're looking for? Browse other questions tagged or ask your own question.