Is there a way to remove the first N
lines from a log that is being actively appended by an application?
7 Answers
No. Operating systems like Linux, and their filesystems, make no provision for removing data from the start of a file. In other words, the start point of a file's storage is fixed.
Removing lines from the start of a file is usually accomplished by writing the remaining data to a new file and deleting the old. If a program has the old file open for writing, that file's deletion is postponed until the application closes the file.
As commenters noted, for the reasons given in my previous sentence, you usually need to coordinate logfile pruning with the programs that are writing the logs. Exactly how you do this depends on the programs. Some programs will close and reopen their logfiles when you send them a signal (e.g. HUP) and this can be used to prevent log records being written to a 'deleted' logfile, without disrupting service.
There are many utilities available for managing the size of log files, for example logrotate.
Some programs have their own utilities. For example, the Apache webserver includes a rotatelogs utility.
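As a sketch, a minimal logrotate configuration for a hypothetical application (paths and option choices here are illustrative, not from the question). The copytruncate directive copies the log and then truncates the original in place, so the application can keep its file handle open, at the cost of possibly losing a few records written during the copy:

```
/var/log/myapp/*.log {
    daily
    rotate 7
    compress
    delaycompress
    copytruncate
}
```

Without copytruncate, you would instead use a postrotate script to signal the application to reopen its log.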
-
3: But you should not do this while something still has the file open and is still appending to it, because it would write to the now-deleted file, and you would lose those log messages. Commented Mar 27, 2012 at 11:37
-
1: Too bad the OSes don't let you; that would sure be convenient, so log rotators wouldn't have to reload processes after rotation :| Commented Feb 2, 2016 at 17:33
I think this task can be achieved with sed
sed -i '1,10d' myfile
would remove lines 1 through 10 from the file.
I think everybody should at least have a look at these sed one-liners.
Note that this does not work for logfiles that are being actively appended to by an application (as stated in the question).
sed -i
will create a new file and 'delete' the file that is being written to. Most applications will continue to write log records to the deleted log file and will continue to fill disk space. The new, truncated, log file will not be appended to. This will only cease when the application is restarted or is otherwise signalled to close and reopen its log files. At which point there will be a gap (missing log records) in the new log file if there has been any loggable activity between the use of sed and the application restart.
A safe way to do this would be to halt the application, use sed to truncate the log, then restart the application. This approach may be unacceptable for some services (e.g. a web server with high throughput and high service-continuity requirements).
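The inode change is easy to demonstrate. A quick sketch, assuming GNU sed and GNU coreutils stat on Linux (on BSD/macOS, sed -i needs a suffix argument and stat uses different flags):

```shell
# Show that sed -i writes a NEW file: the inode changes, so an application
# holding the old file open keeps appending to the deleted copy.
f=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 > "$f"
before=$(stat -c %i "$f")    # inode number before editing
sed -i '1,3d' "$f"           # delete the first three lines
after=$(stat -c %i "$f")     # inode number after editing
cat "$f"                     # only lines 4 and 5 remain
test "$before" != "$after" && echo "inode changed"
```

GNU sed implements -i by writing the result to a temporary file and renaming it over the original, which is exactly why the inode differs.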
-
2: Do you know what happens to the applications that are appending? Commented Mar 27, 2012 at 14:38
-
1: Let's assume a normal open file handle which appends lines and flushes every now and then. Commented Mar 27, 2012 at 19:01
-
1: I know my way around sed, and extracting lines to a new file is a no-brainer with sed. The problem is to keep it all in the same file. Commented Mar 28, 2012 at 8:25
-
13: No, this should not work. sed -i creates a new file with the edited content and the old one is removed, so you are not editing the active file:
$ ls -i
6823554 testfile
$ sed -i 's/test/final/' testfile
$ ls -i
6823560 testfile
Please check how sed -i works. Why does this wrong answer have so many upvotes? Commented Dec 16, 2013 at 9:57
-
3: The question states "from a log that is being actively appended by an application". The operative word is "actively". Perhaps that clarification was added after your answer appeared, but as it stands, readers who gravitate to "most upvotes" WILL be misled. I could only downvote once. Commented May 14, 2019 at 21:25
This is an answer, not a solution. There is NO solution to the question. The asker clearly states: "from a log that is being actively appended by an application". You can read on to understand more, and skip to the end for a suggestion I make based on my presumption why this code isn't following logging best practices.
To be clear: other "answers" here offer a false promise. No amount of renaming will trick the application into using the new file. The most useful information is buried in the comments on these incorrect answers.
ACTIVE files are not some kind of container you simply put data into. A filename points to ONE inode, and that inode holds a map of pointers to the data blocks that make up the file. A continually written-to file has new data blocks linked onto the end of that map, and the blocks of what you think of as one "file" may be scattered all over the disk.
The Linux tool "truncate" can discard data at the end of a file cheaply: it shortens the recorded file size and releases the block pointers past that point. Doing the reverse - discarding data at the start of the file - would mean shifting all remaining data (or rewriting the block map) while the application keeps writing, which is why no general-purpose tool offers it (fallocate's block-aligned --collapse-range on ext4/XFS is the closest thing, and it is not line-oriented). The inode article on Wikipedia is short but explains some of these concepts.
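A quick sketch of truncate discarding data at the end of a file (GNU coreutils assumed; the file name is just a temp file for illustration):

```shell
f=$(mktemp)
printf 'abcdefgh' > "$f"     # 8 bytes of data
truncate -s 4 "$f"           # keep only the first 4 bytes
cat "$f"                     # prints: abcd
```

There is no equivalent one-liner for chopping bytes off the front while keeping the same inode.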
Back to your problem: this is probably an internal application (otherwise someone would already have contributed a patch to fix the logging). Flag this behaviour for code review, because it isn't following logging best practices, and explore the possible impacts. Are you desperately trying to prevent an outage due to a full disk? That scenario should be documented somewhere in the review, as a risk.
No. A solution to this generic problem of log file growth is log rotation. This involves the regular (nightly or weekly, typically) moving of an existing log file to some other file name and starting fresh with an empty log file. After a period the old log files get thrown away.
See: http://www-uxsup.csx.cam.ac.uk/~jw35/courses/apache/html/x1670.htm
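A bare-bones sketch of the move-aside-and-start-fresh step (the paths here are illustrative; in real use this runs from cron against /var/log, and the application must reopen its log, e.g. on SIGHUP, for the fresh file to actually be used):

```shell
# Rotate a log: move the current file aside, start an empty one.
dir=$(mktemp -d)                      # stand-in for /var/log
log="$dir/app.log"
echo "old entries" > "$log"
mv "$log" "$log.$(date +%F)"          # e.g. app.log.2024-01-31
: > "$log"                            # create a fresh, empty log file
ls "$dir"                             # shows app.log and the dated copy
```

Pruning would then be a matter of deleting dated copies older than the retention period, e.g. with find -mtime.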
-
1: For the log rotation to work reliably the application has to cooperate. See unix.stackexchange.com/questions/440004/… or stackoverflow.com/questions/53188731/… but ignore the wrong (accepted) answer by artm, ideally downvote it. Commented Jun 24, 2021 at 7:18
I like the simple solution below.
Since with a log I cannot predict how many leading lines I will need to remove, I instead keep the last n:
echo "$(tail -1000 /var/log/messages)" > /var/log/messages
echo -e "\n### Log reduced via cronjob at $(date) ###\n" >> /var/log/messages
Put both lines above in a cron job and you have something like a log rotation. Explanation of each part of the commands:
echo "string" >
overwrites the file: the redirection truncates it in place (same inode) and writes "string" as its new contents.
echo "string" >>
appends "string" to the end of the file.
echo -e
the -e option enables the interpretation of backslash escapes (e.g. \n).
echo "$(command)"
writes the output of command.
tail -1000 /var/log/messages
returns the last 1000 lines of the /var/log/messages file.
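Why this works where sed -i does not can be sketched as follows (GNU coreutils stat assumed): the command substitution is expanded before the > redirection truncates the file, and the truncation reuses the existing inode, so the application's open file handle stays valid:

```shell
f=$(mktemp)
seq 1 2000 > "$f"                  # 2000 numbered lines
ino1=$(stat -c %i "$f")
echo "$(tail -1000 "$f")" > "$f"   # keep only the last 1000 lines, in place
ino2=$(stat -c %i "$f")
wc -l < "$f"                       # prints: 1000
test "$ino1" = "$ino2" && echo "same inode"
```

The caveat: any records the application appends between the tail and the write-back are lost, so there is a small race window.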
-
2: Good practical solution! The echo "string" > and echo "string" >> redirections write to the same file (same inode, checked with ls -i), so if the log file is under active use, it will receive future log messages without issue. People might argue that some new log messages may be added before the tailed portion, but in practice this is a minimal issue. – tanius Commented Nov 22, 2022 at 0:05
Maybe copy the log, truncate the original to size 0, then tail the copy back onto the truncated original, and delete the copy?
Better yet: tail the log into a tail-copy, truncate the original, then concatenate the tail-copy back onto the original.
You keep a fixed number of lines in the log, which is better than a byte-length limit.
Amending details from comment:
First we have a logger script, in Python 3 here, but use whatever you want:
from time import sleep

idx = 0
while True:
    idx += 1
    # Reopen in append mode each iteration, as a crude stand-in for a
    # logger that flushes every record.
    lf = open('tailTrunc.log', 'a')
    lf.write("line to file " + str(idx) + '\n')
    lf.close()
    sleep(0.01)
Then we have our truncator
#!/usr/bin/env bash
trap "kill 0" EXIT                     # kill the background logger on exit
rm -f tailTrunc.log
touch tailTrunc.log
python3 logLoop.py &                   # start the logger in the background
loggerPID=$!
sleep 1
kill -STOP $loggerPID                  # pause the logger so the file is stable
tail -10 tailTrunc.log > trimEnd.log   # save the last 10 lines
truncate -s 0 tailTrunc.log            # empty the log in place (same inode)
kill -CONT $loggerPID                  # resume the logger
sleep 1
sleep 1
trimEnd.log shows lines 80 to 89; the truncated log continues from 90 to the end.
Anyway, where there's a will there's a way.
In more complicated setups (log consolidators, multiple writers per CPU core, etc.), how the write stream is opened and closed may need adjusting; if you can, pause writing and queue records in the logging process.
-
2: "from a log that is being actively appended by an application". The problem your solution overlooks is that the logfile is "permanently" in use by the application - meaning the inode of the logfile remains in play. Your solution does "back up" the logfile data, which may have uses outside of this question. Commented May 14, 2019 at 21:23
-
1: Thanks for your comment, and the downvote? I've amended a quick, cheap example as food for thought. You'll have to think more deeply about your situation, but where there's a will there's a way. Commented May 19, 2019 at 10:35
-
1: Don't think it was my downvote, but I think the point is hammered home in the other answer's comments: IF you copy a logfile, then it's no longer the active logfile, no matter what you do. The application's filehandle will always point at the inode of the original logfile. Think of it this way: you have an application which uses non-standard logging functions and continually adds bytes to the file it has open. Commented May 23, 2019 at 14:21
-
1: Right, sorry to infer. Yes, the inode needs to stay the same; that's why the example/proof given uses truncate. And again, it depends on the situation (options for all are apparently hiding in plain sight). Commented May 25, 2019 at 8:43
Opening the file in Sublime Text, deleting the lines, and saving works somehow, even while the file is being appended to. But I came here searching for a command-line solution, so I'll just leave this working-but-useless solution here!
-
No, this does not work. It may seem to work often, but sooner or later you'll lose data. – Navin Commented Sep 9, 2021 at 14:11