How do I copy a file in Python?
-
3It would be nice if you specify a reason for copy. For many applications hard linking may be a viable alternative. And people looking for such solutions may also think about this "copy" opportunity (like me, but my answer here was downvoted).– Yaroslav NikitenkoCommented Oct 18, 2022 at 17:33
21 Answers
shutil
has many methods you can use. One of which is:
import shutil
shutil.copyfile(src, dst)
# 2nd option
shutil.copy(src, dst) # dst can be a folder; use shutil.copy2() to preserve timestamp
- Copy the contents of the file named
src
to a file nameddst
. Bothsrc
anddst
need to be the entire filename of the files, including path. - The destination location must be writable; otherwise, an
IOError
exception will be raised. - If
dst
already exists, it will be replaced. - Special files such as character or block devices and pipes cannot be copied with this function.
- With
copy
,src
anddst
are path names given asstr
s.
Another shutil
method to look at is shutil.copy2()
. It's similar but preserves more metadata (e.g. time stamps).
If you use os.path
operations, use copy
rather than copyfile
. copyfile
will only accept strings.
-
10In Python 3.8 this has received some significant speed boosts (~50% faster, depending on OS). Commented Mar 20, 2020 at 9:17
-
2
-
2@KansaiRobot: Yes, otherwise you get an exception:
FileNotFoundError: Directory does not exist: foo/
– JohnCommented Mar 5, 2023 at 10:36 -
It does not copy file metadata like the file owner's group on POSIX OSes (GNU/Linux, FreeBSD, ..). Commented Oct 4, 2023 at 8:52
Function | Copies metadata |
Copies permissions |
Uses file object | Destination may be directory |
---|---|---|---|---|
shutil.copy | No | Yes | No | Yes |
shutil.copyfile | No | No | No | No |
shutil.copy2 | Yes | Yes | No | Yes |
shutil.copyfileobj | No | No | Yes | No |
-
17Note that even the
shutil.copy2()
function cannot copy all file metadata.– wovanoCommented Mar 11, 2022 at 10:34
copy2(src,dst)
is often more useful than copyfile(src,dst)
because:
- it allows
dst
to be a directory (instead of the complete target filename), in which case the basename ofsrc
is used for creating the new file; - it preserves the original modification and access info (mtime and atime) in the file metadata (however, this comes with a slight overhead).
Here is a short example:
import shutil
shutil.copy2('/src/dir/file.ext', '/dst/dir/newname.ext') # complete target filename given
shutil.copy2('/src/file.ext', '/dst/dir') # target filename is /dst/dir/file.ext
In Python, you can copy the files using
shutil
moduleos
modulesubprocess
module
import os
import shutil
import subprocess
1) Copying files using shutil
module
shutil.copyfile
signature
shutil.copyfile(src_file, dest_file, *, follow_symlinks=True)
# example
shutil.copyfile('source.txt', 'destination.txt')
shutil.copy
signature
shutil.copy(src_file, dest_file, *, follow_symlinks=True)
# example
shutil.copy('source.txt', 'destination.txt')
shutil.copy2
signature
shutil.copy2(src_file, dest_file, *, follow_symlinks=True)
# example
shutil.copy2('source.txt', 'destination.txt')
shutil.copyfileobj
signature
shutil.copyfileobj(src_file_object, dest_file_object[, length])
# example
file_src = 'source.txt'
f_src = open(file_src, 'rb')
file_dest = 'destination.txt'
f_dest = open(file_dest, 'wb')
shutil.copyfileobj(f_src, f_dest)
2) Copying files using os
module
os.popen
signature
os.popen(cmd[, mode[, bufsize]])
# example
# In Unix/Linux
os.popen('cp source.txt destination.txt')
# In Windows
os.popen('copy source.txt destination.txt')
os.system
signature
os.system(command)
# In Linux/Unix
os.system('cp source.txt destination.txt')
# In Windows
os.system('copy source.txt destination.txt')
3) Copying files using subprocess
module
subprocess.call
signature
subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
# example (WARNING: setting `shell=True` might be a security-risk)
# In Linux/Unix
status = subprocess.call('cp source.txt destination.txt', shell=True)
# In Windows
status = subprocess.call('copy source.txt destination.txt', shell=True)
subprocess.check_output
signature
subprocess.check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)
# example (WARNING: setting `shell=True` might be a security-risk)
# In Linux/Unix
status = subprocess.check_output('cp source.txt destination.txt', shell=True)
# In Windows
status = subprocess.check_output('copy source.txt destination.txt', shell=True)
-
4Thanks for many options you listed. IMHO, "os.popen('cp source.txt destination.txt') " (with its likes) is bad design. Python is a platform-independent language, and with code like this you destroy this great feature. Commented Sep 10, 2022 at 15:28
-
6Only the first example answers the question. Running shell commands isn't portable and also isn't really performing actions. Commented Nov 17, 2022 at 16:50
-
You technically can copy using a shell command, but why? Is there any case where
shutil
's provided API is not safer, more robust and covering all the edge cases in a single API, apart from being cross-platform? I'd at least mention near those examples a comment about them not being recommended. Commented Apr 25 at 7:51 -
what is different between using shutil and os modules ? do they work in the same way? Commented Jun 5 at 13:11
You can use one of the copy functions from the shutil
package:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Function preserves supports accepts copies other permissions directory dest. file obj metadata ―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― shutil.copy ✔ ✔ ☐ ☐ shutil.copy2 ✔ ✔ ☐ ✔ shutil.copyfile ☐ ☐ ☐ ☐ shutil.copyfileobj ☐ ☐ ✔ ☐ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Example:
import shutil
shutil.copy('/etc/hostname', '/var/tmp/testhostname')
-
1
-
@wim you have to compare my answer with jesrael's answer at the time I posted mine. So i added another column, used better column headers and less distracting table layout. I also added an example for the most common case. You are not the first one who asked this and I've answered this question several times already, but unfortunately, some moderator deleted the old comments here. After checking my old notes it even appears that you, wim, already had asked me this question in the past, and your question with my answer was deleted, too. Commented Mar 22, 2023 at 22:57
-
@wim I also added links into the Python documentation. The added column contains information about directory destinations, Jezrael's answer didn't include. Commented Mar 22, 2023 at 23:18
-
1Oh dear, that's funny that I already asked. Yes I agree that the doc links and extra column are improvements, however, it may have been better to edit them into other answer. Looks like some users subsequently did so, and now we have two answers which aren't significantly different apart from the formatting - recommend deletion?– wimCommented Mar 23, 2023 at 0:30
Copying a file is a relatively straightforward operation as shown by the examples below, but you should instead use the shutil stdlib module for that.
def copyfileobj_example(source, dest, buffer_size=1024*1024):
"""
Copy a file from source to dest. source and dest
must be file-like objects, i.e. any object with a read or
write method, like for example StringIO.
"""
while True:
copy_buffer = source.read(buffer_size)
if not copy_buffer:
break
dest.write(copy_buffer)
If you want to copy by filename you could do something like this:
def copyfile_example(source, dest):
# Beware, this example does not handle any edge cases!
with open(source, 'rb') as src, open(dest, 'wb') as dst:
copyfileobj_example(src, dst)
Use the shutil module.
copyfile(src, dst)
Copy the contents of the file named src to a file named dst. The destination location must be writable; otherwise, an IOError exception will be raised. If dst already exists, it will be replaced. Special files such as character or block devices and pipes cannot be copied with this function. src and dst are path names given as strings.
Take a look at filesys for all the file and directory handling functions available in standard Python modules.
Directory and File copy example, from Tim Golden's Python Stuff:
import os
import shutil
import tempfile
filename1 = tempfile.mktemp (".txt")
open (filename1, "w").close ()
filename2 = filename1 + ".copy"
print filename1, "=>", filename2
shutil.copy (filename1, filename2)
if os.path.isfile (filename2): print "Success"
dirname1 = tempfile.mktemp (".dir")
os.mkdir (dirname1)
dirname2 = dirname1 + ".copy"
print dirname1, "=>", dirname2
shutil.copytree (dirname1, dirname2)
if os.path.isdir (dirname2): print "Success"
For small files and using only Python built-ins, you can use the following one-liner:
with open(source, 'rb') as src, open(dest, 'wb') as dst: dst.write(src.read())
This is not optimal way for applications where the file is too large or when memory is critical, thus Swati's answer should be preferred.
-
1not safe towards symlinks or hardlinks as it copies the linked file multiple times– TcllCommented Apr 24, 2023 at 12:20
There are two best ways to copy file in Python.
1. We can use the shutil
module
Code Example:
import shutil
shutil.copyfile('/path/to/file', '/path/to/new/file')
There are other methods available also other than copyfile, like copy, copy2, etc, but copyfile is best in terms of performance,
2. We can use the OS
module
Code Example:
import os
os.system('cp /path/to/file /path/to/new/file')
Another method is by the use of a subprocess, but it is not preferable as it’s one of the call methods and is not secure.
-
It ought to link to documentation (but not naked links). Commented Dec 7, 2022 at 4:40
-
1
-
@PeterMortensen, added links. The answer would add value as other answers are either too broad or either too straight, this answer contains the helpful information that can be used easiy. Commented Dec 7, 2022 at 5:03
-
1I think you may be confused? The python docs for os.system note: "The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function." You note that there are potential security issues with subprocess. However, that is only true if shell=True (shell=False is default), plus those same sec issues exist when using os.system (unsanitised user input). Rather than your os.system example, this is preferable: subprocess.run([''cp', dst, src]). Commented Sep 28, 2023 at 23:31
Firstly, I made an exhaustive cheat sheet of the shutil methods for your reference.
shutil_methods =
{'copy':['shutil.copyfileobj',
'shutil.copyfile',
'shutil.copymode',
'shutil.copystat',
'shutil.copy',
'shutil.copy2',
'shutil.copytree',],
'move':['shutil.rmtree',
'shutil.move',],
'exception': ['exception shutil.SameFileError',
'exception shutil.Error'],
'others':['shutil.disk_usage',
'shutil.chown',
'shutil.which',
'shutil.ignore_patterns',]
}
Secondly, explaining methods of copy in examples:
shutil.copyfileobj(fsrc, fdst[, length])
manipulate opened objectsIn [3]: src = '~/Documents/Head+First+SQL.pdf' In [4]: dst = '~/desktop' In [5]: shutil.copyfileobj(src, dst) AttributeError: 'str' object has no attribute 'read' # Copy the file object In [7]: with open(src, 'rb') as f1,open(os.path.join(dst,'test.pdf'), 'wb') as f2: ...: shutil.copyfileobj(f1, f2) In [8]: os.stat(os.path.join(dst,'test.pdf')) Out[8]: os.stat_result(st_mode=33188, st_ino=8598319475, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067347, st_mtime=1516067335, st_ctime=1516067345)
shutil.copyfile(src, dst, *, follow_symlinks=True)
Copy and renameIn [9]: shutil.copyfile(src, dst) IsADirectoryError: [Errno 21] Is a directory: ~/desktop' # So dst should be a filename instead of a directory name
shutil.copy()
Copy without preseving the metadataIn [10]: shutil.copy(src, dst) Out[10]: ~/desktop/Head+First+SQL.pdf' # Check their metadata In [25]: os.stat(src) Out[25]: os.stat_result(st_mode=33188, st_ino=597749, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516066425, st_mtime=1493698739, st_ctime=1514871215) In [26]: os.stat(os.path.join(dst, 'Head+First+SQL.pdf')) Out[26]: os.stat_result(st_mode=33188, st_ino=8598313736, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516066427, st_mtime=1516066425, st_ctime=1516066425) # st_atime,st_mtime,st_ctime changed
shutil.copy2()
Copy with preserving the metadataIn [30]: shutil.copy2(src, dst) Out[30]: ~/desktop/Head+First+SQL.pdf' In [31]: os.stat(src) Out[31]: os.stat_result(st_mode=33188, st_ino=597749, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067055, st_mtime=1493698739, st_ctime=1514871215) In [32]: os.stat(os.path.join(dst, 'Head+First+SQL.pdf')) Out[32]: os.stat_result(st_mode=33188, st_ino=8598313736, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=13507926, st_atime=1516067063, st_mtime=1493698739, st_ctime=1516067055) # Preserved st_mtime
shutil.copytree()
Recursively copy an entire directory tree rooted at src, returning the destination directory.
As of Python 3.5 you can do the following for small files (ie: text files, small jpegs):
from pathlib import Path
source = Path('../path/to/my/file.txt')
destination = Path('../path/where/i/want/to/store/it.txt')
destination.write_bytes(source.read_bytes())
write_bytes
will overwrite whatever was at the destination's location
-
-
@Kevin my mistake I should have written down the why when it was fresh. I'll have to review again and update this as I can't remember anymore ¯_(ツ)_/¯– MarcCommented Sep 12, 2022 at 16:29
-
2
You could use os.system('cp nameoffilegeneratedbyprogram /otherdirectory/')
.
Or as I did it,
os.system('cp '+ rawfile + ' rawdata.dat')
where rawfile
is the name that I had generated inside the program.
This is a Linux-only solution.
-
17this is not portable, and unnecessary since you can just use shutil. Commented Jun 12, 2017 at 14:05
-
6Even when
shutil
is not available -subprocess.run()
(withoutshell=True
!) is the better alternative toos.system()
. Commented Jul 9, 2017 at 11:52 -
2
-
2
subprocess.run()
as suggested by @maxschlepzig is a big step forward, when calling external programs. For flexibility and security however, use the['cp', rawfile, 'rawdata.dat']
form of passing the command line. (However, for copying,shutil
and friends are recommended over calling an external program.) Commented Apr 29, 2019 at 3:09 -
2
Use
open(destination, 'wb').write(open(source, 'rb').read())
Open the source file in read mode, and write to the destination file in write mode.
-
2All answers need explanation, even if it is one sentence. No explanation sets bad precedent and is not helpful in understanding the program. What if a complete Python noob came along and saw this, wanted to use it, but couldn't because they don't understand it? You want to be helpful to all in your answers.– moltarzeCommented Mar 30, 2019 at 22:19
-
3Isn't that missing the
.close()
on all of thoseopen(...)
s? Commented Apr 22, 2019 at 22:37 -
No need of .close(), as we are NOT STORING the file pointer object anywhere(neither for the src file nor for the destination file).– SunCommented Apr 23, 2019 at 21:19
-
AFAIK, it is undefined when the files are actually closed, @SundeepBorra. Using
with
(as in the example above) is recommended and not more complicated. Usingread()
on a raw file reads the entire file into memory, which may be too big. Use a standard function like fromshutil
so that you and whoever else is involved in the code does not need to worry about special cases. docs.python.org/3/library/io.html#io.BufferedReader Commented Apr 29, 2019 at 3:17 -
2Same suboptimal memory-wasting approach as yellow01's answer. Commented Apr 29, 2019 at 11:04
Use subprocess.call
to copy the file
from subprocess import call
call("cp -p <file> <file>", shell=True)
-
22
-
10Such a
call
is unsecure. Please refere to the subproces docu about it.– buhtzCommented Apr 2, 2017 at 7:57 -
6this is not portable, and unnecessary since you can just use shutil. Commented Jun 12, 2017 at 14:05
-
5
-
Maybe detect the operating system before starting (whether it's DOS or Unix, because those are the two most used) Commented Nov 9, 2018 at 15:51
For large files, I read the file line by line and read each line into an array. Then, once the array reached a certain size, append it to a new file.
for line in open("file.txt", "r"):
list.append(line)
if len(list) == 1000000:
output.writelines(list)
del list[:]
-
2this seems a little redundant since the writer should handle buffering.
for l in open('file.txt','r'): output.write(l)
should work find; just setup the output stream buffer to your needs. or you can go by the bytes by looping over a try withoutput.write(read(n)); output.flush()
wheren
is the number of bytes you'd like to write at a time. both of these also don't have an condition to check which is a bonus.– ownsCommented Jun 12, 2015 at 17:30 -
1Yes, but I thought that maybe this could be easier to understand because it copies entire lines rather than parts of them (in case we don't know how many bytes each line is).– rassa45Commented Jun 13, 2015 at 18:42
-
@owns To add to this question a year later,
writelines()
has shown slightly better performance overwrite()
since we don't waste time consistently opening a new filestream, and instead write new lines as one large bytefeed.– rassa45Commented Nov 30, 2016 at 3:33 -
1looking at the source - writelines calls write, hg.python.org/cpython/file/c6880edaf6f3/Modules/_io/bytesio.c. Also, the file stream is already open, so write wouldn't need to reopen it every time.– ownsCommented May 3, 2017 at 0:24
-
3This is awful. It does unnecessary work for no good reason. It doesn't work for arbitrary files. The copy isn't byte-identical if the input has unusual line endings on systems like Windows. Why do you think that this might be easier to understand than a call to a copy function in
shutil
? Even when ignoringshutil
, a simple block read/write loop (using unbuffered IO) is straight forward, would be efficient and would make much more sense than this, and thus is surely easier to teach and understand. Commented Apr 29, 2019 at 19:04
In case you've come this far down. The answer is that you need the entire path and file name
import os
shutil.copy(os.path.join(old_dir, file), os.path.join(new_dir, file))
Here is a simple way to do it, without any module. It's similar to this answer, but has the benefit to also work if it's a big file that doesn't fit in RAM:
with open('sourcefile', 'rb') as f, open('destfile', 'wb') as g:
while True:
block = f.read(16*1024*1024) # work by blocks of 16 MB
if not block: # end of file
break
g.write(block)
Since we're writing a new file, it does not preserve the modification time, etc.
We can then use os.utime
for this if needed.
Similar to the accepted answer, the following code block might come in handy if you also want to make sure to create any (non-existent) folders in the path to the destination.
from os import path, makedirs
from shutil import copyfile
makedirs(path.dirname(path.abspath(destination_path)), exist_ok=True)
copyfile(source_path, destination_path)
As the accepted answers notes, these lines will overwrite any file which exists at the destination path, so sometimes it might be useful to also add: if not path.exists(destination_path):
before this code block.
For the answer everyone recommends, if you prefer not to use the standard modules, or have completely removed them as I've done, preferring more core C methods over poorly written python methods
The way shutil works is symlink/hardlink safe, but is rather slow due to os.path.normpath()
containing a while (nt, mac) or for (posix) loop, used in testing if src
and dst
are the same in shutil.copyfile()
This part is mostly unneeded if you know for certain src
and dst
will never be the same file, otherwise a faster C approach could potentially be used.
(note that just because a module may be C doesn't inherently mean it's faster, know that what you use is actually written well before you use it)
After that initial testing, copyfile()
runs a for loop on a dynamic tuple of (src, dst)
, testing for special files (such as sockets or devices in posix).
Finally, if follow_symlinks
is False, copyfile()
tests if src
is a symlink with os.path.islink()
, which varies between nt.lstat()
or posix.lstat()
(os.lstat()
) on Windows and Linux, or Carbon.File.ResolveAliasFile(s, 0)[2]
on Mac.
If that test resolves True, the core code that copies a symlink/hardlink is:
os.symlink(os.readlink(src), dst)
Hardlinks in posix are done with posix.link()
, which shutil.copyfile()
doesn't call, despite being callable through os.link()
.
(probably because the only way to check for hardlinks is to hashmap the os.lstat()
(st_ino
and st_dev
specifically) of the first inode we know about, and assume that's the hardlink target)
Else, file copying is done via basic file buffers:
with open(src, 'rb') as fsrc:
with open(dst, 'wb') as fdst:
copyfileobj(fsrc, fdst)
(similar to other answers here)
copyfileobj()
is a bit special in that it's buffer safe, using a length
argument to read the file buffer in chunks:
def copyfileobj(fsrc, fdst, length=16*1024):
"""copy data from file-like object fsrc to file-like object fdst"""
while 1:
buf = fsrc.read(length)
if not buf:
break
fdst.write(buf)
Hope this answer helps shed some light on the mystery of file copying using core mechanisms in python. :)
Overall, shutil isn't too poorly written, especially after the second test in copyfile()
, so it's not a horrible choice to use if you're lazy, but the initial tests will be a bit slow for a mass copy due to the minor bloat.
I would like to propose a different solution.
def copy(source, destination):
with open(source, 'rb') as file:
myFile = file.read()
with open(destination, 'wb') as file:
file.write(myFile)
copy("foo.txt", "bar.txt")
The file is opened, and it's data is written to a new file of your choosing.
-
in Linux, this is not safe towards symlinks or hardlinks, the linked file is rewritten multiple times– TcllCommented Apr 24, 2023 at 12:24