15

I'd like to try copying the contents of one file to another using memory-mapped I/O in Linux via mmap(). I want to see for myself whether it is better than using fread() and fwrite(), and how it deals with big files (a couple of GiB, for example; since the whole file is mapped, I want to know whether I need that much memory for it).

This is the code I'm working with right now:

// Open original file descriptor:
int orig_fd = open(argv[1], O_RDONLY);
// Check if it was really opened:
if (orig_fd == -1) {
    fprintf(stderr, "ERROR: File %s couldn't be opened:\n", argv[1]);
    fprintf(stderr, "%d - %s\n", errno, strerror(errno));
    exit(EX_NOINPUT);
}
// Idem for the destination file:
int dest_fd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
// Check if it was really opened:
if (dest_fd == -1) {
    fprintf(stderr, "ERROR: File %s couldn't be opened:\n", argv[2]);
    fprintf(stderr, "%d - %s\n", errno, strerror(errno));
    // Close original file descriptor too:
    close(orig_fd);
    exit(EX_CANTCREAT);
}

// Acquire file size:
struct stat info = {0};
if (fstat(orig_fd, &info)) {
    fprintf(stderr, "ERROR: Couldn't get info on %s:\n", argv[1]);
    fprintf(stderr, "%d - %s\n", errno, strerror(errno));
    // Close file descriptors:
    close(orig_fd);
    close(dest_fd);
    exit(EX_IOERR);
}
// Set destination file size:
if (ftruncate(dest_fd, info.st_size)) {
    fprintf(stderr, "ERROR: Unable to set %s file size:\n", argv[2]);
    fprintf(stderr, "%d - %s\n", errno, strerror(errno));
    // Close file descriptors:
    close(orig_fd);
    close(dest_fd);
    exit(EX_IOERR);
}

// Map original file and close its descriptor:
char *orig = mmap(NULL, info.st_size, PROT_READ, MAP_PRIVATE, orig_fd, 0);
if (orig == MAP_FAILED) {
    fprintf(stderr, "ERROR: Mapping of %s failed:\n", argv[1]);
    fprintf(stderr, "%d - %s\n", errno, strerror(errno));
    // Close file descriptors:
    close(orig_fd);
    close(dest_fd);
    exit(EX_IOERR);
}
close(orig_fd);
// Map destination file and close its descriptor:
char *dest = mmap(NULL, info.st_size, PROT_WRITE, MAP_SHARED, dest_fd, 0);
if (dest == MAP_FAILED) {
    fprintf(stderr, "ERROR: Mapping of %s failed:\n", argv[2]);
    fprintf(stderr, "%d - %s\n", errno, strerror(errno));
    // Close file descriptors and unmap first file:
    munmap(orig, info.st_size);
    close(dest_fd);
    exit(EX_IOERR);
}
close(dest_fd);

// Copy file contents:
int i = info.st_size;
char *read_ptr = orig, *write_ptr = dest;
while (--i) {
    *write_ptr++ = *read_ptr++;
}

// Unmap files:
munmap(orig, info.st_size);
munmap(dest, info.st_size);

I think this may be one way of doing it, but I keep getting an error when trying to map the destination file: specifically, errno 13 (permission denied).

I don't have a clue why it is failing. I can write to that file, since the file gets created, and the file I'm trying to copy is only a couple of KiB in size.

Can anybody spot the problem? How come I had permission to map the original file but not the destination one?

NOTE: If anyone uses the byte-copying loop posted in the question rather than memcpy, the loop condition should be i-- instead of --i so that all of the contents are copied. Thanks to jxh for spotting that.

3
  • Can you check to see if the ftruncate call succeeded?
    – jxh
    Commented Jun 19, 2013 at 23:06
  • Thank you jxh, the destination file seems to be created with the appropriate size. I was checking manually until now; I'll update the code I posted to include that check. Commented Jun 19, 2013 at 23:15
  • Memory mapping is only useful if you use it to gather data from a source file. It's not that efficient as a write target. I'd say you'd be best off using unistd.h low-level I/O (open(), read(), write(), fdatasync(), and close()) with buffers (chunk sizes) based on what fstat() suggests for files; largeish powers of two like 262144 or larger otherwise. There are even faster ways to copy files, though. Commented Jun 20, 2013 at 2:14
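
The read()/write() approach suggested in the last comment could look roughly like the sketch below. The helper name copy_fd and the CHUNK_SIZE macro are made up for illustration; the 262144-byte chunk size is simply the figure mentioned in the comment, not a measured optimum.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK_SIZE 262144  /* figure taken from the comment; tune as needed */

/* Hypothetical helper: copy everything readable from `in` to `out`.
 * Returns 0 on success, -1 on error. */
int copy_fd(int in, int out)
{
    char *buf = malloc(CHUNK_SIZE);
    if (buf == NULL)
        return -1;

    int rc = 0;
    while (rc == 0) {
        ssize_t n = read(in, buf, CHUNK_SIZE);
        if (n == 0)
            break;                      /* end of input */
        if (n == -1) {
            if (errno == EINTR)
                continue;               /* interrupted, retry the read */
            rc = -1;
            break;
        }
        /* write() may accept fewer bytes than requested, so loop until done. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w == -1) {
                if (errno == EINTR)
                    continue;           /* interrupted, retry the write */
                rc = -1;
                break;
            }
            off += w;
        }
    }

    free(buf);
    return rc;
}
```

Only one chunk is held in memory at a time, so the file size does not affect the buffer size. Calling fdatasync() on the destination afterwards, as the comment suggests, is left to the caller.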

3 Answers

20

From the mmap() man page:

EACCES
A file descriptor refers to a non-regular file. Or MAP_PRIVATE was requested, but fd is not open for reading. Or MAP_SHARED was requested and PROT_WRITE is set, but fd is not open in read/write (O_RDWR) mode. Or PROT_WRITE is set, but the file is append-only.

You are opening your destination file with O_WRONLY. Use O_RDWR instead.
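
A minimal sketch of that change, wrapped in a hypothetical helper called map_dest_rw (the name and the error handling are my own, not from the question):

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with `size` bytes and map it writable.
 * Returns MAP_FAILED on any error. */
void *map_dest_rw(const char *path, size_t size)
{
    /* O_RDWR, not O_WRONLY: a MAP_SHARED mapping with PROT_WRITE
     * requires the descriptor to be open for reading as well. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return MAP_FAILED;

    /* Grow the file to the mapping length before touching the pages. */
    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return MAP_FAILED;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    return p;
}
```

The caller is expected to munmap() the returned region once the copy is done.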

Also, you should use memcpy to copy the memory rather than using your own loop:

memcpy(dest, orig, info.st_size);

Your loop has an off by 1 bug.
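
To make the off-by-one concrete, here is a small sketch that counts how many bytes each loop form copies. The function names are made up for illustration, and the n == 0 guard is an addition of mine (the loop in the question has no such guard).

```c
#include <stddef.h>

/* Loop shape from the question: the condition is tested after decrementing,
 * so the body runs n - 1 times and the last byte is never copied. */
size_t copy_predecrement(char *dst, const char *src, size_t n)
{
    size_t copied = 0;
    if (n == 0)
        return 0;       /* guard added here; --i on 0 would wrap around */
    size_t i = n;
    while (--i) {
        *dst++ = *src++;
        copied++;
    }
    return copied;
}

/* Corrected shape: the condition is tested before decrementing,
 * so the body runs exactly n times. */
size_t copy_postdecrement(char *dst, const char *src, size_t n)
{
    size_t copied = 0;
    size_t i = n;
    while (i--) {
        *dst++ = *src++;
        copied++;
    }
    return copied;
}
```

For a 4-byte input the first version copies 3 bytes and the second copies 4. memcpy avoids the question entirely.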

1
  • Thanks, that seems to be the problem; the destination file has to be opened for reading too. And you're right, that loop to copy bytes misses one of them: it should be i-- instead of --i. Commented Jun 20, 2013 at 0:00
1

This works for me. Note that I had to open the destination O_RDWR. I suspect the kernel attempts to map whole pages from the file into memory (reading it) because you're updating it a byte or word at a time, and that might not change the whole page.

A couple of other points:

  1. You don't need to close and unmap stuff on error if you're just going to exit.

  2. Use memcpy and don't write your own byte-copying loop. Memcpy will be a lot better optimised in general. (Though it's not always the absolute best.)

  3. You might want to read the source code to FreeBSD's "cp" utility. Take a look here and search for the use of mmap. http://svnweb.freebsd.org/base/stable/9/bin/cp/utils.c?revision=225736&view=markup


#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <string.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
        int s, d;
        struct stat st;
        void *sp, *dp;
        s = open(argv[1], O_RDONLY);
        if (s == -1) {
                perror("open source");
                exit(1);
        }
        d = open(argv[2], O_RDWR | O_CREAT | O_TRUNC, 0644);
        if (d == -1) {
                perror("open destination");
                exit(1);
        }
        if (fstat(s, &st)) {
                perror("stat source");
                exit(1);
        }
        if (ftruncate(d, st.st_size)) {
                perror("truncate destination");
                exit(1);
        }
        sp = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, s, 0);
        if (sp == MAP_FAILED) {
                perror("map source");
                exit(1);
        }
        dp = mmap(NULL, st.st_size, PROT_WRITE | PROT_READ, MAP_SHARED, d, 0);
        if (dp == MAP_FAILED) {
                perror("map destination");
                exit(1);
        }
        memcpy(dp, sp, st.st_size);
        return 0;
}
3
  • Thank you for the answer! I was always forced to unload stuff, free memory and all even before the program was to terminate but I always thought that the OS should de-allocate and close all resources a program was using upon completion. I'll certainly look at that link you posted to learn more about the matter. Commented Jun 20, 2013 at 0:09
  • @JamesRussell: Performing those cleanups is unnecessary from a correctness point of view, but it does help sanitize the code when using a memory debugger.
    – jxh
    Commented Jun 20, 2013 at 0:14
  • You can be sure that a Unix-like system will clean up most resources when your process exits (or is killed). The exceptions are mostly things with names in the filesystem (files, devices, named sockets, etc.)
    – rptb1
    Commented Jun 20, 2013 at 0:15
1

Original File: O_RDONLY open, MAP_PRIVATE mmap

destination file: O_WRONLY open, MAP_SHARED mmap

You need to open the destination with the O_RDWR flag to use MAP_SHARED together with PROT_WRITE.

Don't you actually need to do MAP_FILE | MAP_SHARED?

1
  • Thank you for the answer, you're right regarding O_RDWR for the destination file, but MAP_FILE doesn't seem to exist in my mmap man file. Commented Jun 20, 2013 at 0:05
