I have log files (named in the format YYMMDD) and I'd like to create a script that extracts only the important information from them (such as the lines that contain "O:NVS:VOICE"). I have never used Python before, so please help!
-
We need some idea of what you've tried already and where you're having trouble. Do you need help opening a file? Parsing the data that's there? Printing/writing out the information you're interested in? – thegrinner Commented Apr 15, 2013 at 14:12
-
Well, actually I've just started my project in Python, so I'm still thinking about the solution and I'd like to get some ideas from people who have experience. The script aims to get the lines that contain specific words, as I said, from the log files that are generated daily by a server, and then put them in a MySQL database. I don't know how to get the lines, since there are many of them and new files are created daily. – James H Commented Apr 15, 2013 at 14:29
2 Answers
This should get you started nicely:
infile = r"D:\Documents and Settings\xxxx\Desktop\test_log.txt"

important = []
keep_phrases = ["test",
                "important",
                "keep me"]

with open(infile) as f:
    f = f.readlines()

for line in f:
    for phrase in keep_phrases:
        if phrase in line:
            important.append(line)
            break

print(important)
It's by no means perfect: for example, there is no exception handling or pattern matching, but you can add these quite easily. Look into regular expressions, which may work better than phrase matching. If your files are very big, read them line by line to avoid a MemoryError.
Input file:
This line is super important!
don't need this one...
keep me!
bla bla
not bothered
ALWAYS include this test line
Output:
['This line is super important!\n', 'keep me!\n', 'ALWAYS include this test line']
Note: This is Python 3.3.
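As a rough sketch of the improvements mentioned above (reading line by line, matching with a regular expression, and basic exception handling), something like the following might work. The function name filter_lines and the pattern KEEP_PATTERN are made up for illustration; adjust the pattern to whatever your logs actually contain.

```python
import re

# Hypothetical pattern: matches any of the phrases from the example above.
KEEP_PATTERN = re.compile(r"test|important|keep me")


def filter_lines(path, pattern=KEEP_PATTERN):
    """Return the lines of the file at `path` that match `pattern`."""
    important = []
    try:
        with open(path) as f:
            # Iterating over the file object reads one line at a time,
            # so the whole file is never held in memory at once.
            for line in f:
                if pattern.search(line):
                    important.append(line)
    except OSError as exc:
        # Covers a missing file, a permissions problem, and similar errors.
        print("Could not read {}: {}".format(path, exc))
    return important
```

Calling filter_lines on the input file shown above should give the same list as the original script, and a missing file gives an empty list instead of a traceback.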
-
You can avoid the issue with large files by looping over the file object, rather than calling readlines. Just move your for line in f loop inside your with block and get rid of .readlines(). Commented Jan 6, 2016 at 20:22
You'll need to know how to loop over the files in a directory, how to use regular expressions to check that each file name matches your log file format, how to open a file, how to loop over the lines in the open file, and how to check whether a line contains what you are looking for.
And here is some code to get you started.
with open("log.log", 'r') as f:
    for line in f:
        # keep only the lines that mention the keyword
        if "O:NVS:VOICE" in line:
            print(line)
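The other steps listed above (looping over a directory and checking that each file name looks like a log file) could be sketched roughly as follows. This assumes the files are named with exactly six digits and no extension (for example 130415), which may not match your setup; collect_matches, LOG_NAME and KEYWORD are names invented for this example.

```python
import os
import re

# Assumption: daily log files are named with exactly six digits (YYMMDD).
LOG_NAME = re.compile(r"^\d{6}$")
KEYWORD = "O:NVS:VOICE"


def collect_matches(log_dir, keyword=KEYWORD):
    """Return (file name, line) pairs for every line containing `keyword`."""
    matches = []
    for name in sorted(os.listdir(log_dir)):
        if not LOG_NAME.match(name):
            continue  # skip anything that is not named like a daily log
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue  # skip directories that happen to have a matching name
        with open(path) as f:
            for line in f:
                if keyword in line:
                    matches.append((name, line.rstrip("\n")))
    return matches
```

Returning the file name alongside each line keeps track of which day a line came from, which should help if the results are later written to a database.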