I am trying to set up a good log analyzer for our Tomcat application. I was able to create a basic parser and analyzer using Python regular expressions and the statistics features of pandas.
The parser extracts the timestamp, log level, thread class, thread name, and error message from each entry.
However, the error message is not uniform and doesn't follow a specific pattern. Even when I ignore the stack trace and parse only the main error message, there is still no consistent pattern, because we use plugins from different vendors and each one follows its own rules for reporting errors.
One option is to identify and group errors manually and create a reference file of sub-parsing rules. We already did that using a reference XML file (based on one provided by the vendor to identify known errors). But it takes a lot of additional effort to add new rules for unknown errors.
What I am wondering is: since we can group these errors manually, is it possible to do the same with the parser alone, without a reference sheet?
Example:
logAncestorsTableFailure Detected ancestors table corruption for pageId: 715588532. Access to this page is blocked for all users as inherited permissions cannot be determined. To resolve this, rebuild the ancestors table. See https://confluence.atlassian.com/display/DOC/Rebuilding+the+Ancestor+Table
logAncestorsTableFailure Detected ancestors table corruption for pageId: 685814402. Access to this page is blocked for all users as inherited permissions cannot be determined. To resolve this, rebuild the ancestors table. See https://confluence.atlassian.com/display/DOC/Rebuilding+the+Ancestor+Table
To identify the kind of error shown above, I could create a reference rule for it: logAncestorsTableFailure Detected ancestors table corruption for pageId
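To make the idea concrete, here is a minimal sketch of such a reference rule as a prefix match. The rule name, the `KNOWN_PREFIXES` table and the `classify` function are hypothetical names chosen for illustration, not our actual parser or the vendor's XML format, and the sample messages are shortened versions of the two above.

```python
from collections import Counter

# Hypothetical rule table: rule name -> known message prefix.
KNOWN_PREFIXES = {
    "ancestors_table_corruption": (
        "logAncestorsTableFailure Detected ancestors table corruption for pageId"
    ),
}

def classify(message, rules=KNOWN_PREFIXES):
    """Return the name of the first rule whose prefix matches, else 'unknown'."""
    for name, prefix in rules.items():
        if message.startswith(prefix):
            return name
    return "unknown"

# Shortened sample messages (assumed input: the already-extracted error message).
messages = [
    "logAncestorsTableFailure Detected ancestors table corruption for pageId: 715588532.",
    "logAncestorsTableFailure Detected ancestors table corruption for pageId: 685814402.",
    "Some error from another plugin that has no rule yet",
]

# Count how many messages fall under each rule.
counts = Counter(classify(m) for m in messages)
```

Anything that no rule covers ends up in the "unknown" bucket, and everything after the prefix is ignored.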
There are 2 problems with this approach:
- We don't know whether the remaining part of the error message is similar or different. Hence, with this method, we may miss important errors that were not identified before.
- As mentioned above, it requires up-front effort to identify all possible error messages and patterns.
Hence, is it possible to parse these messages without such a reference, or do we need to step into AI for this?
If I look at the errors myself, I can spot the pattern. Is it human intelligence that is doing the job there?
In other words, can some sort of aggregator be used to automatically identify similar patterns?
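As a rough illustration of what I mean by an aggregator, the sketch below masks the parts of a message that look variable (here only standalone numbers, which is an assumption) and groups messages by the resulting template. The function names and the `<NUM>` placeholder are hypothetical, and the sample messages are shortened versions of the ones above.

```python
import re
from collections import Counter

# Assumption: standalone runs of digits (IDs, counts) are the variable part.
_NUMBER = re.compile(r"\b\d+\b")

def to_template(message):
    """Replace every standalone number with a placeholder token."""
    return _NUMBER.sub("<NUM>", message)

def group_by_template(messages):
    """Count how many messages share the same masked template."""
    return Counter(to_template(m) for m in messages)

# Shortened sample messages (assumed input: the already-extracted error message).
sample = [
    "logAncestorsTableFailure Detected ancestors table corruption for pageId: 715588532.",
    "logAncestorsTableFailure Detected ancestors table corruption for pageId: 685814402.",
    "Some error from another plugin that has no rule yet",
]

groups = group_by_template(sample)
for template, count in groups.most_common():
    print(count, template)
```

With this masking, the two pageId messages collapse into one template without any reference rule. It would not catch variable parts that are not numeric, such as names or paths.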