15

I have log files (named in the format YYMMDD) and I'd like to create a script that extracts only the important information from them (such as the lines that contain "O:NVS:VOICE"). I have never used Python before, so please help!

2
  • 1
    We need some idea of what you've tried already and where you're having trouble. Do you need help opening a file? Parsing the data that's there? Printing/writing out the information you're interested in?
    – thegrinner
    Commented Apr 15, 2013 at 14:12
  • Well, actually I've just started my project in Python, so I'm still thinking about the solution; I'd like to get some ideas from people who have experience. The script aims to get the lines that contain specific words, as I said, from the log files that are generated daily by a server, and then put them in a MySQL database. I don't know how to get the lines, since there are many of them and they are created daily.
    – James H
    Commented Apr 15, 2013 at 14:29

2 Answers

35

This should get you started nicely:

infile = r"D:\Documents and Settings\xxxx\Desktop\test_log.txt"

important = []
keep_phrases = ["test",
              "important",
              "keep me"]

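with open(infile) as f:
    lines = f.readlines()  # reads the whole file into memory

for line in lines:
    for phrase in keep_phrases:
        if phrase in line:
            important.append(line)
            break  # stop at the first match so each line is added only once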

print(important)

It's by no means perfect: for example, there is no exception handling or pattern matching, but you can add these quite easily. Look into regular expressions, which may be better than phrase matching. If your files are very big, read them line by line to avoid a MemoryError.

Input file:

This line is super important!
don't need this one...
keep me!
bla bla
not bothered
ALWAYS include this test line

Output:

['This line is super important!\n', 'keep me!\n', 'ALWAYS include this test line']

Note: This is Python 3.3.
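
The improvements mentioned above (regular expressions, exception handling, and reading line by line) could be sketched roughly as follows. This is only one possible approach, not a definitive implementation: the names KEEP_PATTERN, important_lines and read_important are made up for illustration, and the pattern simply mirrors the three example phrases.

```python
import re

# Hypothetical pattern: matches any of the example phrases used above.
KEEP_PATTERN = re.compile(r"test|important|keep me")


def important_lines(lines, pattern=KEEP_PATTERN):
    """Yield each line that contains a match for the pattern."""
    for line in lines:
        if pattern.search(line):
            yield line


def read_important(path, pattern=KEEP_PATTERN):
    """Return the matching lines of a file, reading one line at a time."""
    try:
        with open(path) as f:
            # Iterating over the file object avoids loading the whole file.
            return list(important_lines(f, pattern))
    except OSError as exc:
        # Assumption: an unreadable file is reported and treated as empty.
        print("Could not read {}: {}".format(path, exc))
        return []
```

Calling read_important(infile) on the input file shown above should give the same list as the original script, since the pattern covers the same three phrases.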

1
  • 4
    You can avoid the issue with large files by looping over the file object rather than calling readlines. Just move your for line in f loop inside your with block and get rid of f.readlines().
    Commented Jan 6, 2016 at 20:22
23

You'll need to know how to loop over the files in a directory, how to use regular expressions to make sure each file name matches your log file format, how to open a file, how to loop over the lines in the open file, and how to check whether one of those lines contains what you are looking for.

And here is some code to get you started.

with open("log.log", 'r') as f:
    for line in f:
        if "O:NVS:VOICE" in line:
            print(line)
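
Putting the steps listed above together, a minimal sketch might look like this. It assumes the log file names are exactly six digits (YYMMDD) with no extension, which the question does not confirm; the function names find_log_files, matching_lines and scan_logs are hypothetical.

```python
import os
import re

# Assumption: a log file name is exactly six digits (YYMMDD), no extension.
LOG_NAME = re.compile(r"\d{6}\Z")


def find_log_files(directory):
    """Return sorted paths of files in `directory` whose names look like YYMMDD."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if LOG_NAME.match(name) and os.path.isfile(os.path.join(directory, name))
    )


def matching_lines(path, needle="O:NVS:VOICE"):
    """Yield the lines of the file at `path` that contain `needle`."""
    with open(path) as f:
        for line in f:
            if needle in line:
                yield line


def scan_logs(directory, needle="O:NVS:VOICE"):
    """Collect (file name, line) pairs for every matching line in every log file."""
    results = []
    for path in find_log_files(directory):
        for line in matching_lines(path, needle):
            results.append((os.path.basename(path), line.rstrip("\n")))
    return results
```

Because the names sort chronologically within a century, the results come back in date order. The collected pairs could then be written to a database in a separate step.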
