I'm interested in how you might detect similarities between tree structures. For example, in these three JSON documents:
(a) { "name": "foo", "length": 23 }
(b) { "name": "boo", "length": 99, "redundant": true }
(c) { "quux": "baz", "a": 44, "b": 66, "c": 99 }
it's obvious to the human eye that (a) and (b) are more similar, in both content and structure, than (a) and (c). However, I don't know of any heuristics I could use to quantify that similarity.
I'm sure such algorithms exist -- they might exist, say, in code that detects similarities in parse trees or abstract syntax trees, to detect duplicate code structures -- but I've no idea how to search for more information because I don't know what to look for.
Is there a name for this kind of algorithm, where you're looking to be able to calculate 'near-ness' between different trees, graphs, or sets?
(Aside: I've just found out about the Levenshtein distance, so maybe there is a common technique along the lines of 'print out your data structures as strings and calculate the distances between the printed strings', but I'm very new to the concept.)
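To make that aside concrete, here is a minimal Python sketch of the 'print and compare' idea as I currently understand it. The helper names `levenshtein` and `printed_distance` are my own invention, and I'm assuming the documents are valid JSON and are serialised with sorted keys so that key order doesn't affect the result. I don't know whether this is a sound way to measure tree similarity, which is partly why I'm asking.

```python
import json


def levenshtein(s: str, t: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    # prev[j] holds the distance between the first i-1 chars of s
    # and the first j chars of t.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def printed_distance(x, y) -> int:
    """Hypothetical helper: serialise both documents, then compare the text."""
    # sort_keys is my assumption, so that key order doesn't change the result.
    return levenshtein(json.dumps(x, sort_keys=True),
                       json.dumps(y, sort_keys=True))


a = {"name": "foo", "length": 23}
b = {"name": "boo", "length": 99, "redundant": True}
c = {"quux": "baz", "a": 44, "b": 66, "c": 99}

print("a vs b:", printed_distance(a, b))
print("a vs c:", printed_distance(a, c))
```

For these three documents the (a)/(b) distance does come out smaller than the (a)/(c) distance, which matches my intuition. But the measure is driven by character counts rather than by structure, so I suspect it would break down on deeper or larger trees.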