53

I have a bunch (hundreds) of files that are supposed to have Unix line endings. I strongly suspect that some of them have Windows line endings, and I want to programmatically figure out which ones do.

I know I can just run

flip -u
or something similar in a script to convert everything, but I want to be able to identify those files that need changing first.

7 Answers

72

You can use the file tool, which will tell you the type of line ending. Or, you could just use dos2unix -U which will convert everything to Unix line endings, regardless of what it started with.

6
  • 5
    file doesn't show the line ending, e.g. "file .bashrc => .bashrc: ASCII English text". Does it need some extra options? Commented Feb 9, 2012 at 10:32
  • 10
    @Fedir: Yes, it does; it's just that if the file has regular LF line endings, it won't mention the line endings at all. But if the file has CRLF, bare CR, or mixed line endings, it will tell you that. Commented Feb 9, 2012 at 21:55
  • 2
    Didn't work for me on a CRLF-only Perl script on OS X. Might be a GNU extension?
    – Tim Yates
    Commented Jun 11, 2012 at 20:34
  • 3
    This works on some file types but not others. On Linux, it doesn't report the line endings for html files for example. Commented Apr 9, 2013 at 4:48
  • "file foo.txt" worked fine on OS X 10.9. It printed "foo.txt: ASCII text, with CRLF line terminators" Commented Mar 14, 2014 at 17:19
29

You could use grep

egrep -l $'\r'\$ *
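The `$'...'` quoting used above is not available in every shell. A minimal sketch of a variant that avoids it and also descends into subdirectories, assuming a grep that supports `-I` (skip binary files), as GNU and BSD grep do. The function name `list_crlf_files` is made up for illustration:

```sh
# list_crlf_files [DIR]: print the regular files under DIR (default: .)
# that contain at least one line ending in CR LF.
list_crlf_files() {
    cr=$(printf '\r')   # a literal carriage return, built without $'...' quoting
    # -I skips files grep considers binary, -l prints only the names of matching files
    find "${1:-.}" -type f -exec grep -Il "${cr}\$" {} +
}
```

Calling `list_crlf_files .` from the top of the tree should print one path per file that still needs converting.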
6
  • 3
    for some reason, when I run this command in a MacOS X shell, I get a list of all files in the directory. Even one that I newly generate with "echo "test" >torderform6.cpp". Any idea what might be going wrong? Commented Feb 25, 2009 at 18:33
  • 7
    It just lists all files in the folder for me on Ubuntu as well.
    – rjmunro
    Commented May 9, 2011 at 11:52
  • 2
    This command will still list files that have had dos2unix run on them.
    – Phyxx
    Commented Jan 17, 2012 at 2:16
  • 1
    On mac I use: grep -E -rl '\r' .
    – LanDenLabs
    Commented Aug 18, 2016 at 13:52
14

Something along the lines of:

perl -p -e 's[\r\n][WIN\n]; s[(?<!WIN)\n][UNIX\n]; s[\r][MAC\n]g;' FILENAME

though some of that regexp may need refining and tidying up.

That'll output your file with WIN, MAC, or UNIX at the end of each line. Good if your file is somehow a dreadful mess (or a diff) and has mixed endings.

4
  • Worked for me on Ubuntu, the accepted answer seems to just list all files Commented Jul 1, 2011 at 12:14
  • Doesn't work for me, gives: Unmatched ) in regex; marked by <-- HERE in m/(?&lt;!WIN) <-- HERE \n/ at -e line 1.
    – moshen
    Commented May 13, 2013 at 19:24
  • you need to replace the &lt; with <
    – Joseph
    Commented Jan 15, 2014 at 11:14
  • The < symbol was messed up in a previous edit. I've fixed it now. Commented Apr 1, 2014 at 4:29
5

Here's the most failsafe answer. stimms's answer doesn't account for subdirectories or binary files.

find . -type f -exec file {} \; | grep "CRLF" | awk -F ':' '{ print $1 }'
  • Use file to find file type. Those with CRLF have windows return characters. The output of file is delimited by a :, and the first field is the path of the file.
2
  • Indeed the most failsafe way. To convert only all found files just run find . -type f -exec file {} \; | grep "CRLF" | awk -F ':' '{ print $1 }' | xargs flip -ub afterwards. Commented Jan 24, 2017 at 9:36
  • 3
    Most failsafe it is not -- file does not always even tell "CRLF" in its output, that depends on what kind of file it is. I have discovered that for SVG files -- containing text much like plaintext files -- file does not mention the kind of line ending used. This script thus is not file type agnostic. Just saying. Otherwise looks like a sane one-liner, aforementioned limitation notwithstanding. Commented Apr 16, 2017 at 12:51
2

Unix uses one byte, 0x0A (Line Feed), while Windows uses two bytes, 0x0D 0x0A (Carriage Return, Line Feed).

If you never see a 0x0D, then it's very likely Unix. If you see 0x0D 0x0A pairs then it's very likely MSDOS.
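That rule of thumb can be sketched as a small shell function that counts lines ending in 0x0D 0x0A and compares the count with the total number of lines. This is only an illustration: the name `classify_endings` and the three labels are invented here, and a file that uses bare 0x0D as its line ending is not detected (it would be reported as "unix").

```sh
# classify_endings FILE: print "unix", "dos" or "mixed".
classify_endings() {
    cr=$(printf '\r')                   # a literal carriage return (0x0D)
    total=$(grep -c '' < "$1")          # total number of lines
    crlf=$(grep -c "${cr}\$" < "$1")    # lines whose last character is a CR
    if [ "$crlf" -eq 0 ]; then
        echo unix
    elif [ "$crlf" -eq "$total" ]; then
        echo dos
    else
        echo mixed
    fi
}
```

A loop such as `for f in *; do [ -f "$f" ] && printf '%s: %s\n' "$f" "$(classify_endings "$f")"; done` would then label every regular file in a directory.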

0

Windows uses characters 13 and 10 (CR LF) for line endings; Unix uses only character 10 (LF). So you can replace each 13, 10 pair with a single 10.

0

Once you know which files have Windows line endings (0x0D 0x0A or \r\n), what will you do with them? I suppose you will convert them to Unix line endings (0x0A or \n). You can do that with the sed utility:

$> sed -i 's/\r$//' my_file_with_win_line_endings.txt

You can put it into script like this:

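#!/bin/bash

# Recursively strip the trailing CR from every line of every regular file
# below the current directory (relies on GNU sed for -i and \r).
function travers()
{
    # Glob instead of parsing ls, so names containing spaces survive.
    for file in *; do
        if [ -f "${file}" ]; then
            sed -i 's/\r$//' "${file}"
        elif [ -d "${file}" ]; then
            # Subshell: the directory change is undone automatically,
            # and a failed cd skips the directory instead of recursing in place.
            ( cd "${file}" && travers )
        fi
    done
}

travers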

If you run it from the root directory of your files, every file below it will end up with Unix line endings. Note that it rewrites every regular file it finds, binary files included.
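Both `sed -i` without an argument and the `\r` escape are GNU sed behaviour; BSD sed (for example on macOS) expects a backup suffix after `-i` and may not understand `\r`. A sketch of a per-file conversion that avoids both, assuming `mktemp` is available. The function name `to_unix` is hypothetical:

```sh
# to_unix FILE: rewrite FILE so that CR LF line endings become LF.
to_unix() {
    cr=$(printf '\r')            # a literal carriage return
    tmp=$(mktemp) || return 1
    # Strip one CR at the end of each line; CRs elsewhere in a line are left alone.
    if sed "s/${cr}\$//" "$1" > "$tmp"; then
        cat "$tmp" > "$1"        # overwrite the contents, keeping the file's permissions
    fi
    rm -f "$tmp"
}
```

It could be combined with any of the detection commands in the other answers so that only the files that actually contain CR LF are touched.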
