304

How can I convert tabs to spaces in every file of a directory (possibly recursively)?

Also, is there a way of setting the number of spaces per tab?

4
  • You want to replace tabs in files or filenames?
    – cppcoder
    Commented Jun 19, 2012 at 4:32
  • 3
    pr is a wonderful utility for this. See this answer. Commented Jun 9, 2017 at 2:28
  • Replacing tabs with spaces is not advised as it will harm others who work with the same files. Simply adjust the tools for the desired tab width instead. Commented Jan 16, 2021 at 13:36
  • 1
    expand and unexpand do the wonderful job. Commented Mar 19, 2023 at 12:10

19 Answers

384

Simple replacement with sed is okay but not the best possible solution. If there are "extra" spaces between the tabs they will still be there after substitution, so the margins will be ragged. Tabs expanded in the middle of lines will also not work correctly. In bash, we can say instead

find . -name '*.java' ! -type d -exec bash -c 'expand -t 4 "$0" > /tmp/e && mv /tmp/e "$0"' {} \;

to apply expand to every Java file in the current directory tree. Remove or replace the -name argument if you're targeting other file types. As one of the comments mentions, be very careful when removing -name or using a weak wildcard: you can easily clobber repository and other hidden files unintentionally. This is why the original answer included this:

You should always make a backup copy of the tree before trying something like this in case something goes wrong.
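For readers who would rather script this than chain find and bash, here is a minimal Python sketch of the same pattern: expand into a temporary file, then move it over the original only if the expansion succeeded. The function name `expand_file`, the UTF-8 assumption, and the choice to create the temporary file next to the original (so the final rename never crosses a mount point) are illustrative assumptions, not part of the command above. `str.expandtabs` is column-aware, as expand is.

```python
import os
import tempfile


def expand_file(path, tabsize=4):
    """Rewrite `path` with each tab expanded to the next multiple of `tabsize` columns."""
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as src:
        text = src.read()

    # Create the temporary file in the same directory so that os.replace
    # is a plain rename on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as dst:
            dst.write(text.expandtabs(tabsize))
        # mkstemp creates the file with mode 0600; copy the original mode back.
        os.chmod(tmp_path, os.stat(path).st_mode & 0o7777)
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the rename: leave the original alone.
        os.unlink(tmp_path)
        raise
```

To cover a tree, call `expand_file` from an `os.walk` loop that filters on the file extension, which plays the role of `-name '*.java'`.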

36
  • 2
    @JeffreyMartinez Great question. gniourf_gniourf edited my original answer on 11 November and made disparaging remarks about not knowing the proper way to use {}. Looks like he didn't know about $0 when -c is used. Then dimo414 changed from my use of a temp in the conversion directory to /tmp, which will be much slower if /tmp is on a different mount point. Unfortunately I don't have a Linux box available to test your $0 proposal. But I think you are correct.
    – Gene
    Commented Nov 26, 2013 at 2:12
  • 1
    @Gene, thanks for the clarification, that sounds like stackoverflow alright :p . While I'm at it though, I'll add I had to use quotes around '*.java' for proper escaping of the *.java. Commented Nov 26, 2013 at 3:34
  • 2
    If anybody is having a 'unknown primary or operator' error from find, then here is the full command which will fix it: find . -name '*.java' ! -type d -exec bash -c 'expand -t 4 "$0" > /tmp/e && mv /tmp/e "$0"' {} \;
    – Doge
    Commented Apr 4, 2014 at 19:58
  • 4
    I thought this answer hadn't enough comments as it was, so this is mine: if you use sponge from joeyh.name/code/moreutils, you can write find . -name '*.py' ! -type d -exec bash -c 'expand -t 8 "$0" | sponge "$0"' {} \;
    – tokland
    Commented Oct 9, 2014 at 9:40
  • 8
    Don't be stupid and use find . -name '*', I just destroyed my local git repo
    – Gautam
    Commented Mar 22, 2015 at 3:18
225

Try the command line tool expand.

expand -i -t 4 input | sponge output

where

  • -i is used to expand only leading tabs on each line;
  • -t 4 sets tab stops 4 columns apart, so each tab becomes 1 to 4 spaces (the default is 8 columns).
  • sponge is from the moreutils package, and avoids clearing the input file. On macOS, the package moreutils is available via Homebrew (brew install moreutils) or MacPorts (sudo port install moreutils).

Finally, you can use gexpand on macOS, after installing coreutils with Homebrew (brew install coreutils) or MacPorts (sudo port install coreutils).
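To make the effect of -i concrete, here is a small Python sketch of "expand only the leading tabs". The helper name `expand_initial` is hypothetical; it only mimics the behaviour described above and is not how expand itself is implemented.

```python
import re

# Leading run of blanks (spaces and tabs) at the start of a line.
_LEADING_BLANKS = re.compile(r"^[ \t]*")


def expand_initial(line, tabsize=4):
    """Expand tabs in the leading whitespace of `line`; leave later tabs alone."""
    prefix = _LEADING_BLANKS.match(line).group(0)
    return prefix.expandtabs(tabsize) + line[len(prefix):]


print(repr(expand_initial("\tif x:\t# comment")))
# prints '    if x:\t# comment'
```

The tab before the comment survives, which is the point of -i: tabs that are part of the line's content are not touched.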

10
  • 6
    It's one of GNU_Core_Utilities
    – kev
    Commented Jun 19, 2012 at 4:57
  • 37
    You should pass -i to expand to only replace leading tabs on each line. This helps avoid replacing tabs that might be part of code. Commented Aug 8, 2014 at 16:00
  • 12
    how about for every single file in a directory recursively?
    – ahnbizcad
    Commented Jun 10, 2015 at 18:44
  • 4
    Every time I try to use this it blanks some (usually all) of the files. :\ Commented Jun 23, 2015 at 19:16
  • 5
    @ThorSummoner: if input is the same file as output, bash clobbers (truncates) the content before expand even starts. This is how > works. Commented Sep 16, 2015 at 10:51
74

Warning: This will break your repo.

This will corrupt binary files, including those under svn, .git! Read the comments before using!

find . -iname '*.java' -type f -exec sed -i.orig 's/\t/ /g' {} +

The original file is saved as [filename].orig.

Replace '*.java' with the file ending of the file type you are looking for. This way you can prevent accidental corruption of binary files.

Downsides:

  • Will replace tabs everywhere in a file.
  • Will take a long time if you happen to have a 5GB SQL dump in this directory.
13
  • 12
    For visual spacing that is a mix of tabs and spaces, this approach gives incorrect expansion.
    – pizza
    Commented Jun 19, 2012 at 7:32
  • 7
    I would also add a file matcher like for example for only .php files find ./ -iname "*.php" -type f -exec sed -i 's/\t/ /g' {} \; Commented Mar 26, 2013 at 10:04
  • 102
    DO NOT USE SED! If there's an embedded tab in a string, you may end up mangling your code. This is what expand command was meant to handle. Use expand.
    – David W.
    Commented Nov 12, 2013 at 17:11
  • 6
    @DavidW. I would simply update this command to only replace tabs from the beginning of the line. find ./ -type f -exec sed -i 's/^\t/####/g' {} \;. But I wasn't aware of the expand command - very useful! Commented May 7, 2014 at 16:08
  • 31
    DO NOT USE! This answer also just wrecked my local git repository. If you have files containing mixed tabs and spaces it will insert sequences of #'s. Use the answer by Gene or the comment by Doge below instead.
    – puppet
    Commented Aug 18, 2014 at 13:06
48

Collecting the best comments from Gene's answer, the best solution by far is to use sponge from moreutils.

sudo apt-get install moreutils
# The complete one-liner:
find ./ -iname '*.java' -type f -exec bash -c 'expand -t 4 "$0" | sponge "$0"' {} \;

Explanation:

  • ./ searches recursively from the current directory
  • -iname is a case-insensitive match (for both *.java and *.JAVA)
  • -type f finds only regular files (no directories or symlinks; binary files are still regular files, which is why the -iname filter matters)
  • -exec bash -c runs the following command in a new bash process for each file name, {}, which the command sees as $0
  • expand -t 4 expands each TAB to the next tab stop, with stops every 4 columns
  • sponge soaks up standard input (from expand) and writes it to a file (the same one)*.

NOTE: * A simple file redirection (> "$0") won't work here because it would overwrite the file too soon.

Advantage: All original file permissions are retained and no intermediate tmp files are used.
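The reason sponge works where a plain redirection fails is ordering: all input is consumed before the output file is opened. A minimal Python sketch of that ordering follows; the function name `expand_in_place` and the UTF-8 assumption are illustrative, and this is not how sponge itself is implemented.

```python
def expand_in_place(path, tabsize=4):
    """Expand tabs in `path`, reading everything before writing anything."""
    # Step 1: read the whole file. A plain `> file` redirection truncates
    # the file before this step could ever happen.
    with open(path, "r", encoding="utf-8", newline="") as f:
        text = f.read()
    # Step 2: only now truncate and rewrite. The same file is reused,
    # so its permissions are unchanged.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text.expandtabs(tabsize))
```

The trade-off is that a crash between the two steps can leave a truncated file, which a write-to-temporary-then-rename approach avoids.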

0
22

Use backslash-escaped sed.

On linux:

  • Replace all tabs with 1 hyphen inplace, in all *.txt files:

    sed -i $'s/\t/-/g' *.txt
    
  • Replace all tabs with 1 space inplace, in all *.txt files:

    sed -i $'s/\t/ /g' *.txt
    
  • Replace all tabs with 4 spaces inplace, in all *.txt files:

    sed -i $'s/\t/    /g' *.txt
    

On a mac:

  • Replace all tabs with 4 spaces inplace, in all *.txt files:

    sed -i '' $'s/\t/    /g' *.txt
    
2
  • 2
    @Маша sed -i '' $'s/\t/ /g' $(find . -name "*.txt")
    – xyzale
    Commented Apr 10, 2018 at 10:30
  • It breaks tabs that are part of the code. Commented Mar 19, 2023 at 12:40
10

You can use the generally available pr command (man page here). For example, to convert tabs to four spaces, do this:

pr -t -e=4 file > file.expanded
  • -t suppresses headers
  • -e=num expands tabs to num spaces

To convert all files in a directory tree recursively, while skipping binary files:

#!/bin/bash
num=4
shopt -s globstar nullglob
for f in **/*; do
  [[ -f "$f" ]]   || continue # skip if not a regular file
  ! grep -qI . "$f" && continue # skip binary files
  pr -t -e=$num "$f" > "$f.expanded.$$" && mv "$f.expanded.$$" "$f"
done

The logic for skipping binary files is from this post.

NOTE:

  1. Doing this could be dangerous in a git or svn repo
  2. This is not the right solution if you have code files that have bare tabs embedded in string literals
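The binary check in the script above is delegated to grep -I. A rough Python stand-in is sketched below, assuming that a NUL byte in the first block of the file marks it as binary; this is a common heuristic, and the exact rule grep applies may differ. The name `looks_binary` and the 8 KiB probe size are illustrative choices.

```python
def looks_binary(path, probe_size=8192):
    """Heuristic: a file is treated as binary if its first block contains a NUL byte."""
    with open(path, "rb") as f:
        return b"\0" in f.read(probe_size)
```

Reading only a probe keeps the check cheap on very large files, at the cost of missing a NUL byte that appears later in the file.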
1
6

My recommendation is to use:

find . -name '*.lua' -exec ex '+%s/\t/  /g' -cwq {} \;

Comments:

  1. Use in place editing. Keep backups in a VCS. No need to produce *.orig files. It's good practice to diff the result against your last commit to make sure this worked as expected, in any case.
  2. sed is a stream editor. Use ex for in place editing. This avoids creating extra temp files and spawning shells for each replacement as in the top answer.
  3. WARNING: This messes with all tabs, not only those used for indentation. Also it does not do context aware replacement of tabs. This was sufficient for my use case. But might not be acceptable for you.
  4. EDIT: An earlier version of this answer used find|xargs instead of find -exec. As pointed out by @gniourf-gniourf this leads to problems with spaces, quotes and control chars in file names cf. Wheeler.
8
  • ex might not be available on every Unix system. Substituting it with vi -e might work on more machines. Also, your regex replaces any number of starting tab characters with two spaces. Replace the regex with +%s/\t/ /g to not destroy multi level indentation. However this also affects tab characters that are not used for indentation. Commented Jun 14, 2016 at 11:43
  • ex is part of POSIX [1] so should be available. Good point about multi level indentation. I had actually used the /\t/ / variant on my files, but opted for /\t\+// to not break non-indenting tabs. Missed the issues with multi-indentation! Updating the Answer. [1] man7.org/linux/man-pages/man1/ex.1p.html#SEE%C2%A0ALSO Commented Jun 14, 2016 at 13:24
  • 2
    Using xargs in this way is useless, inefficient and broken (think of filenames containing spaces or quotes). Why don't you use find's -exec switch instead? Commented Jun 14, 2016 at 13:33
  • I'd argue that filenames with spaces and quotes are broken ; ) If you need to support that I'd opt for: -print0 options to find / xargs. I like xargs over -exec since: a) Separation of concerns b) it can be swapped with GNU parallel more easily. Commented Jun 14, 2016 at 13:43
  • Updated adding @gniourf_gniourf comments. Commented Jun 14, 2016 at 16:19
6

You can use find with tabs-to-spaces package for this.

First, install tabs-to-spaces

npm install -g tabs-to-spaces

then, run this command from the root directory of your project;

find . -name '*' -exec t2s --spaces 2 {} \;

This will replace every tab character with 2 spaces in every file.

5

How can I convert tabs to spaces in every file of a directory (possibly recursively)?

This is usually not what you want.

Do you want to do this for png images? PDF files? The .git directory? Your Makefile (which requires tabs)? A 5GB SQL dump?

You could, in theory, pass a whole lot of exclude options to find or whatever else you're using; but this is fragile, and will break as soon as you add other binary files.

What you want, is at least:

  1. Skip files over a certain size.
  2. Detect if a file is binary by checking for the presence of a NUL byte.
  3. Only replace tabs at the start of a line (expand -i does this, a plain sed substitution doesn't).

As far as I know, there is no "standard" Unix utility that can do this, and it's not very easy to do with a shell one-liner, so a script is needed.

A while ago I created a little script called sanitize_files which does exactly that. It also fixes some other common stuff like replacing \r\n with \n, adding a trailing \n, etc.

You can find a simplified script without the extra features and command-line arguments below, but I recommend you use the above script as it's more likely to receive bugfixes and other updates than this post.

I would also like to point out, in response to some of the other answers here, that using shell globbing is not a robust way of doing this, because sooner or later you'll end up with more files than will fit in ARG_MAX (on modern Linux systems it's 128k, which may seem a lot, but sooner or later it's not enough).
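To see what the limit actually is on a given machine, rather than rely on a remembered figure, the value can be queried directly. A minimal sketch, assuming a Unix-like system where `os.sysconf` and the `SC_ARG_MAX` name are available:

```python
import os

# SC_ARG_MAX is the sysconf name for the limit on the combined size of
# the arguments and environment passed to a new process.
arg_max = os.sysconf("SC_ARG_MAX")
print(f"ARG_MAX on this system: {arg_max} bytes")
```

Walking the tree from inside the program (with `os.walk`, or with find -exec in the shell) sidesteps the limit entirely, because the file names never have to fit on one command line.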


#!/usr/bin/env python
#
# http://code.arp242.net/sanitize_files
#

import os, re, sys


def is_binary(data):
    return data.find(b'\000') >= 0


def should_ignore(path):
    keep = [
        # VCS systems
        '.git/', '.hg/', '.svn/', 'CVS/',

        # These files have significant whitespace/tabs, and cannot be edited
        # safely
        # TODO: there are probably more of these files..
        'Makefile', 'BSDmakefile', 'GNUmakefile', 'Gemfile.lock'
    ]

    for k in keep:
        if '/%s' % k in path:
            return True
    return False


def run(files):
    indent_width = 4  # spaces per tab; assumed here, the full script takes it as an option
    indent_find = b'\t'
    indent_replace = b' ' * indent_width

    for f in files:
        if should_ignore(f):
            print('Ignoring %s' % f)
            continue

        try:
            size = os.stat(f).st_size
        # Unresolvable symlink, just ignore those
        except FileNotFoundError as exc:
            print('%s is unresolvable, skipping (%s)' % (f, exc))
            continue

        if size == 0: continue
        if size > 1024 ** 2:
            print("Skipping `%s' because it's over 1MiB" % f)
            continue

        try:
            data = open(f, 'rb').read()
        except (OSError, PermissionError) as exc:
            print("Error: Unable to read `%s': %s" % (f, exc))
            continue

        if is_binary(data):
            print("Skipping `%s' because it looks binary" % f)
            continue

        data = data.split(b'\n')

        fixed_indent = False
        for i, line in enumerate(data):
            # Fix indentation
            repl_count = 0
            while line.startswith(indent_find):
                fixed_indent = True
                repl_count += 1
                line = line.replace(indent_find, b'', 1)

            if repl_count > 0:
                # Store the result back; rebinding `line` alone leaves `data` unchanged
                data[i] = indent_replace * repl_count + line

        data = list(filter(lambda x: x is not None, data))

        try:
            open(f, 'wb').write(b'\n'.join(data))
        except (OSError, PermissionError) as exc:
            print("Error: Unable to write to `%s': %s" % (f, exc))


if __name__ == '__main__':
    allfiles = []
    for root, dirs, files in os.walk(os.getcwd()):
        for f in files:
            p = '%s/%s' % (root, f)
            allfiles.append(p)

    run(allfiles)
1
5

I like the "find" example above for the recursive application. To adapt it to be non-recursive, only changing files in the current directory that match a wildcard, the shell glob expansion can be sufficient for a small number of files:

ls *.java | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh -v

If you want it silent after you trust that it works, just drop the -v on the sh command at the end.

Of course you can pick any set of files in the first command. For example, list only a particular subdirectory (or directories) in a controlled manner like this:

ls mod/*/*.php | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh

Or in turn run find(1) with some combination of depth parameters etc:

find mod/ -name '*.php' -mindepth 1 -maxdepth 2 | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh
2
  • 1
    Shell globbing will break sooner or later, because the total amount of filenames can only be of ARG_MAX length. This is 128k on Linux systems, but I've encountered this limit enough times to not rely on shell globbing. Commented Aug 12, 2015 at 14:14
  • 1
    You don't really need to adapt them. find can be told -maxdepth 1, and it only processes the entries of the directory being modified, not the whole tree. Commented Oct 29, 2015 at 23:24
4

I used astyle to re-indent all my C/C++ code after finding mixed tabs and spaces. It also has options to force a particular brace style if you'd like.

4

One can use vim for that:

find -type f \( -name '*.css' -o -name '*.html' -o -name '*.js' -o -name '*.php' \) -execdir vim -c retab -c wq {} \;

As Carpetsmoker stated, it will retab according to your vim settings, and according to modelines in the files, if any. It will also replace tabs everywhere, not only at the beginning of lines, which is generally not what you want: for example, you might have string literals containing tabs.

3
  • :retab will change all the tabs in a file, not those at the start. it also depends on what your :tabstop and :expandtab settings are in the vimrc or modeline, so this may not work at all. Commented Aug 12, 2015 at 14:17
  • @Carpetsmoker Good point about tabs at the start of the lines. Do any of the solutions here handle this case? As for the tabstop and expandtab settings, it will work out if you're using vim. Unless you have mode lines in the files.
    – x-yuri
    Commented Aug 12, 2015 at 17:13
  • @x-yuri good question, but generally moot. Most people use \t not actual tabs in literals. Commented Dec 4, 2015 at 17:02
4

To convert all Java files recursively in a directory to use 4 spaces instead of a tab:

find . -type f -name '*.java' -exec bash -c 'expand -t 4 {} > /tmp/stuff;mv /tmp/stuff {}' \;
2
  • How is this answer different from this which was posted 4 years ago?
    – P.P
    Commented Jun 29, 2016 at 17:06
  • 2
    So does your answer. In fact, this is an inferior version of Gene's answer: 1) Gene's answer takes care of directories with the same name. 2) It doesn't move if expand failed.
    – P.P
    Commented Jun 30, 2016 at 7:52
4

Git repository friendly method

git-tab-to-space() (
  d="$(mktemp -d)"
  git grep --cached -Il '' | grep -E "${1:-.}" | \
    xargs -I'{}' bash -c '\
    f="${1}/f" \
    && expand -t 4 "$0" > "$f" && \
    chmod --reference="$0" "$f" && \
    mv "$f" "$0"' \
    '{}' "$d" \
  ;
  rmdir "$d"
)

Act on all files under the current directory:

git-tab-to-space

Act only on C or C++ files:

git-tab-to-space '\.(c|h)(|pp)$'

You likely want this notably because of those annoying Makefiles which require tabs.
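The pattern passed to the function is an ordinary extended regular expression applied to each path. As a quick check of what `\.(c|h)(|pp)$` selects, here is a small Python sketch; the helper name `select_paths` and the sample file names are made up for illustration, and Python's `re` module treats this particular pattern the same way grep -E does.

```python
import re

# Same pattern as in the git-tab-to-space call above.
C_SOURCES = re.compile(r"\.(c|h)(|pp)$")


def select_paths(paths, pattern=C_SOURCES):
    """Keep only the paths the pattern matches, like `grep -E` on a file list."""
    return [p for p in paths if pattern.search(p)]


print(select_paths(["main.c", "util.hpp", "Makefile", "script.py", "a.cpp", "b.h"]))
# prints ['main.c', 'util.hpp', 'a.cpp', 'b.h']
```

The Makefile is filtered out, so its mandatory tabs are never touched.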

The command git grep --cached -Il '':

  • lists only the tracked files, so nothing inside .git
  • excludes directories, binary files (would be corrupted), and symlinks (would be converted to regular files)

as explained at: How to list all text (non-binary) files in a git repository?

chmod --reference keeps the file permissions unchanged: https://unix.stackexchange.com/questions/20645/clone-ownership-and-permissions-from-another-file Unfortunately I can't find a succinct POSIX alternative.

If your codebase had the crazy idea to allow functional raw tabs in strings, use:

expand -i

and then have fun going over all non start of line tabs one by one, which you can list with: Is it possible to git grep for tabs?

Tested on Ubuntu 18.04.

4

Nobody mentioned rpl? Using rpl you can replace any string. To convert tabs to spaces,

rpl -R -e "\t" "    "  .

very simple.

2
  • 1
    This corrupted all binary files in my repo. Commented Nov 6, 2019 at 20:14
  • 1
    An excellent command, but potentially dangerous with the recursive and all files in folder option as specified above. I would add the --dry-run option "just in case" to make sure you are sitting in the right folder. Commented Jan 6, 2020 at 10:59
3

Download and run the following script to recursively convert hard tabs to soft tabs in plain text files.

Execute the script from inside the folder which contains the plain text files.

#!/bin/bash

find . -type f -and -not -path './.git/*' -exec grep -Iq . {} \; -and -print | while read -r file; do {
    echo "Converting... $file";
    data=$(expand --initial -t 4 "$file");
    rm "$file";
    echo "$data" > "$file";
}; done;
2

The use of expand as suggested in other answers seems the most logical approach for this task alone.

That said, it can also be done with Bash and Awk in case you may want to do some other modifications along with it.

If using Bash 4.0 or greater, the shopt builtin globstar can be used to search recursively with **.

With GNU Awk version 4.1 or greater, sed-like "inplace" file modifications can be made:

shopt -s globstar
gawk -i inplace '{gsub("\t","    ")}1' **/*.ext

In case you want to set the number of spaces per tab:

gawk -i inplace -v n=4 'BEGIN{for(i=1;i<=n;i++) c=c" "}{gsub("\t",c)}1' **/*.ext
0
-1

Converting tabs to spaces in just the ".lua" files [tabs -> 2 spaces]

find . -iname "*.lua" -exec sed -i "s#\t#  #g" '{}' \;
3
  • Obviously, the amount of space that a tab expands to depends on the context. Thus, sed is a completely inappropriate tool for the task.
    – Sven
    Commented Mar 30, 2015 at 20:15
  • ?? @Sven, my sed command does the same thing that expand command does (expand -t 4 input >output)
    – Makah
    Commented Mar 31, 2015 at 19:32
  • 3
    Of course not. expand -t 4 will expand the tab in a\tb to 3 spaces and the tab in aa\tb to 2 spaces, just as it should be. expand takes the context of a tab into account, sed does not and will replace the tab with the number of spaces you specify, regardless of the context.
    – Sven
    Commented Mar 31, 2015 at 20:43
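The difference described in this comment is easy to reproduce. The sketch below uses Python's built-in `str.expandtabs` as a stand-in for expand's column-aware behaviour and a plain string replace as a stand-in for the sed substitution; `naive_replace` is a hypothetical helper name.

```python
def naive_replace(line, width=4):
    """What a sed-style substitution does: every tab becomes `width` spaces."""
    return line.replace("\t", " " * width)


for sample in ["a\tb", "aa\tb", "\tb"]:
    aware = sample.expandtabs(4)      # pad up to the next multiple of 4 columns
    naive = naive_replace(sample, 4)  # always insert 4 spaces
    print(repr(sample), "->", repr(aware), "vs", repr(naive))
```

Only for a tab at column 0 do the two agree; anywhere else the naive version pushes the following text too far right.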
-1

Use the vim-way:

$ ex +'bufdo retab' -cxa **/*.*
  • Make a backup before executing the above command, as it can corrupt your binary files.
  • To use globstar (**) for recursion, activate by shopt -s globstar.
  • To specify specific file type, use for example: **/*.c.

To modify tabstop, add +'set ts=2'.

However the down-side is that it can replace tabs inside the strings.

So for slightly better solution (by using substitution), try:

$ ex -s +'bufdo %s/^\t\+/  /ge' -cxa **/*.*

Or by using ex editor + expand utility:

$ ex -s +'bufdo!%!expand -t2' -cxa **/*.*

For trailing spaces, see: How to remove trailing whitespaces for multiple files?


You may add the following function into your .bash_profile:

# Convert tabs to spaces.
# Usage: retab *.*
# See: https://stackoverflow.com/q/11094383/55075
retab() {
  ex +'set ts=2' +'bufdo retab' -cxa "$@"
}
3
  • I downvoted many answers in this thread, not just yours ;-) Reasons are: :retab may not work at all, shell globbing is a bad solution for this sort of thing, your :s command will replace any amount of tabs with 2 spaces (which you almost never want), starting ex just to run an :!expand process is silly... Commented Aug 12, 2015 at 14:22
  • ...and all your solutions will clobber binary files and such (like .png files, .pdf files, etc.) Commented Aug 12, 2015 at 14:22
  • This is frankly a horrible suggestion for documentation - one has to be intimately acquainted with a number of fairly opaque syntax and semantic issues of several programs to be able to comprehend this. Commented May 22, 2016 at 10:46
