I continued my research on how best to solve the permissions problem and believe I have found a better way than the ones I suggested above. I am adding this as a second answer so that you can still read the details of my previous investigation.
## Definitions common to all the solutions below

First, let me start with a few common definitions:
- **webserver** - Apache or nginx; unless a part of this answer states otherwise, it applies equally to both LAMP and LEMP stacks
- **site** - a project running/accessible under its own user; represented in Apache as one virtual host; one webserver runs multiple sites
- **site user** - the user having access to the site; both the PHP and SFTP processes run as the site user
- **`site-user`, `site-group`** - the names of the system user and group of the site user
- **`ws-group`** - the group the webserver runs under
- **public files** - files intended to be accessible by the webserver and potentially published to the internet
- **private files** - files that must not be published to the internet and should not be accessible by the webserver
## Problem statement, refined

How to set up permissions when the webserver is used to host multiple sites, so that:
- each site runs as a separate user using PHP-FPM
- each site user can see only its own files
- site files are accessible through SFTP
- the webserver can see all files
- all of that works with file uploads
- all of that works with Wordpress; however, as mentioned in one of the comments above, this problem is not Wordpress-specific but concerns any application that manipulates filesystem permissions
For my use case, I have extended the problem statement with these nice-to-haves:
- Each site can have public and private files; the webserver can see only the public files (that is, not everything). PHP and SFTP should have full access to both public and private files.
- There is a designated directory for private files; files are made private merely by being placed in this directory. This is to minimise the requirements on webmasters and their applications for making files private. Specifically, the need to call `chmod()` with a concrete permissions constant to make files private should be avoided.
- Usage of `chmod()` and other permission-related operations should not be limited in PHP, e.g. by replacing them with stub functions.
## Directory structure of a site

The directory structure for an example site used throughout this answer is as follows:
```
drwxr-x---+ root      site-group  /            # site root
drwxr-x---  site-user site-group  /home        # private files
drwxr-x---+ site-user site-group  /htdocs      # public files
drwxr-x---  root      site-group  /logs        # logs root
drwxr-S---  root      site-group  /logs/ws     # webserver logs
drwxr-x---  site-user site-group  /logs/php    # PHP logs
drwxr-x---  root      site-group  /tmp         # tmp files root
drwxr-x---  site-user site-group  /tmp/php     # PHP tmp files
drwxr-x---+ site-user site-group  /tmp/upload  # uploaded files
```
Notes:
- root-owned directories (site root, logs root, tmp files root) make sure that nobody can create or remove subdirectories or mess with their permissions; specifically, this ensures that the site user can neither delete the `/htdocs` directory nor give all users access to it
- the `+` symbol indicates filesystem ACLs; the concrete ACLs used are discussed in the "ACL" solution below and are not relevant to the other solutions
- the `/logs/ws` directory has permissions designed specifically for Apache; Apache opens its logs on startup before dropping privileges, therefore the directory can be root-owned; the setgid bit is set there to make sure the site-group sticks, so that the site user has read (and not write!) access to the logs
- both the PHP and SFTP processes run with umask `0027`
Such a structure differs from the one in the original question, but it is equivalent with regard to the solutions described below; they apply equally to the original question's structure.
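For illustration, the structure above can be provisioned with a short script like the following sketch. The `chown` lines require root and use the placeholder names from the definitions, so they are commented out here; `SITE_ROOT` defaults to a scratch path for demonstration.

```shell
#!/bin/sh
# Sketch: create the example site layout. "site-user", "site-group" and
# "ws-group" are the placeholder names from the definitions above.
SITE_ROOT="${SITE_ROOT:-/tmp/example-site}"

mkdir -p "$SITE_ROOT/home" "$SITE_ROOT/htdocs" \
         "$SITE_ROOT/logs/ws" "$SITE_ROOT/logs/php" \
         "$SITE_ROOT/tmp/php" "$SITE_ROOT/tmp/upload"

# Ownership (requires root):
# chown root:site-group      "$SITE_ROOT" "$SITE_ROOT/logs" "$SITE_ROOT/tmp" "$SITE_ROOT/logs/ws"
# chown site-user:site-group "$SITE_ROOT/home" "$SITE_ROOT/htdocs" \
#                            "$SITE_ROOT/logs/php" "$SITE_ROOT/tmp/php" "$SITE_ROOT/tmp/upload"

chmod 0750 "$SITE_ROOT" "$SITE_ROOT/home" "$SITE_ROOT/htdocs" \
           "$SITE_ROOT/logs" "$SITE_ROOT/logs/php" \
           "$SITE_ROOT/tmp" "$SITE_ROOT/tmp/php" "$SITE_ROOT/tmp/upload"
chmod 2740 "$SITE_ROOT/logs/ws"   # drwxr-S---: setgid, group read-only
```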
## The solutions

### The setgid approach
As the author of the question found out, one way to address this is to have the whole site owned by the site user, except for the `/htdocs` directory, which is group-owned by the ws-group. This way, the site user (PHP, SFTP) has access everywhere, while the webserver has access only to the `/htdocs` directory.

However, since it is not desirable for the site user to belong to the ws-group (as discussed below in the "shared group" solution), files newly created by the site user in the `/htdocs` directory would not be group-owned by the ws-group, that is, not accessible by the webserver. The way to ensure the group ownership also for new files is to set the setgid bit on the whole `/htdocs` directory recursively:
```
chmod -R g+s -- ./htdocs
```
That comes with a few challenges:
- As asked in the original question, it does not work well with file uploads.
- It is very fragile: Wordpress (or any other application) can easily destroy the setgid flag through a misconfiguration or by accident.
- The ws-group will be set on all uploaded files, even those not intended for the public files directory. Ideally, the group should be applied only when moving a file to the `/htdocs` directory.
#### 1. File uploads
When PHP uploads a file, it puts it into the temporary directory configured by the `upload_tmp_dir` configuration directive. Since this directory is almost certainly outside of `/htdocs`, the setgid bit won't take effect: the file won't get the ws-group, and the `move_uploaded_file()` function in PHP won't change that fact. TL;DR: the file won't be accessible by the webserver.
The solution to this problem is simple: create a dedicated "upload" directory (as suggested above in the example directory structure, and pointed to by the aforementioned `upload_tmp_dir` directive), make it group-owned by the ws-group and apply the setgid bit there too. This is secure enough: the directory will be used only for PHP uploads; other temporary files will be placed elsewhere. And it will make the uploaded files group-owned by the ws-group.
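A sketch of that setup follows; `ws-group` is a placeholder and the `chgrp` requires root, so it is commented out here:

```shell
#!/bin/sh
# Dedicated upload directory: group-owned by ws-group, with setgid so every
# uploaded file inherits that group. "ws-group" is a placeholder name.
UPLOAD_DIR="${UPLOAD_DIR:-/tmp/example-site/tmp/upload}"
mkdir -p "$UPLOAD_DIR"
# chgrp ws-group "$UPLOAD_DIR"   # requires root
chmod 2750 "$UPLOAD_DIR"         # setgid set; site user writes, group reads

# Then point PHP at it, e.g. in php.ini or the FPM pool configuration:
#   upload_tmp_dir = /tmp/example-site/tmp/upload
```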
#### 2. Fragility
setgid is extremely fragile: any `mkdir()` or `chmod()` operation can easily destroy it, and Wordpress is full of such calls. Even if you change the behaviour of Wordpress using the `FS_CHMOD_DIR` and `FS_CHMOD_FILE` configuration constants, some plugins will still ignore them and change the mode to a "more secure" `0400` or similar, effectively turning off setgid.
While I believe Wordpress plugins should NOT assume how to secure files on the server, it is what it is and this approach can't be used in a stable way.
I've also witnessed similar behaviour with some other frameworks.
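The fragility is easy to reproduce. A minimal demo for Linux follows; PHP's `chmod()` maps to the `chmod(2)` syscall, which sets the mode exactly as given and therefore drops the bit, which the symbolic `chmod g-s` below stands in for:

```shell
#!/bin/sh
# Demo: subdirectories inherit setgid, but a single chmod destroys it.
d=$(mktemp -d)
chmod g+s "$d"
mkdir "$d/sub"            # on Linux, inherits the parent's group and setgid
test -g "$d/sub" && echo "setgid inherited"
chmod g-s "$d/sub"        # the effect of e.g. chmod($dir, 0750) in PHP
test -g "$d/sub" || echo "setgid gone"
rm -rf "$d"
```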
### Shared group for all PHP processes
Another solution suggested above is to put all PHP processes into the ws-group. This will work, but as the author adds, it is a security issue: all PHP processes would be able to read data across all sites, and there is not much one can do to stop that: `open_basedir` is not bullet-proof and `chroot` is tricky to set up.
As such, this solution is not acceptable due to the security concerns.
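For illustration, a per-site PHP-FPM pool in this (rejected) model would look roughly like the following sketch; the pool name, socket path, directory paths and the `open_basedir` mitigation are all assumptions, and as said above, `open_basedir` must not be treated as a security boundary:

```ini
; hypothetical pool definition, e.g. /etc/php/php-fpm.d/example-site.conf
[example-site]
user = site-user
group = ws-group                ; the shared-group variant discussed above
listen = /run/php-fpm/example-site.sock
listen.owner = site-user
listen.group = ws-group
pm = ondemand
pm.max_children = 5
; best-effort containment only -- not bullet-proof:
php_admin_value[open_basedir] = /var/www/example-site:/tmp
php_admin_value[upload_tmp_dir] = /var/www/example-site/tmp/upload
```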
### Adding the webserver to all site groups
The other way round would be to add the webserver to all site groups. This way, PHP processes would see only their own files, while the webserver could see everything.

This does not match my extended requirements: all of a site's files, including private ones, would be visible to the webserver unless their access mode were restricted to the owner only. It is not possible to rely on webmasters being disciplined enough and/or deeply understanding how Linux filesystem permissions work.
A bigger problem with this solution, however, was that I couldn't make Apache a member of more than 30 groups. At least in the Alpine distribution, once Apache was added to more than 30 groups, the following error appeared:

```
initgroups: unable to set groups for User apache...
```

As a result, Apache was not added to any groups at all.
Even if I were able to solve that, another problem remains: the webserver would have to monitor for changes in sites and dynamically add/remove itself to/from site groups. At least with Apache, that also requires a restart (not only a reload).

Too clumsy, so I decided to look further.
### Using Apache's mpm-itk module

One interesting Apache-only solution popped up: the mpm-itk module, which allows each Apache virtual host to run as a separate user.
More about it can be found on the module homepage and e.g. also in this StackOverflow answer.
The downside of this solution is that since Apache version 2.4 it is no longer bundled in the Alpine distribution, and I would have to either switch to e.g. Ubuntu (where it is still packaged) or compile it myself.
Another downside is that it gives Apache access to everything that is accessible by the PHP group. Similarly to the previous solution, webmasters would have to make sure that private files are not group-accessible.
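With mpm-itk loaded, a per-site virtual host is essentially a one-directive affair. `AssignUserID` is mpm-itk's directive; the server name and paths below are made up:

```apache
<VirtualHost *:80>
    ServerName example.test
    DocumentRoot /var/www/example-site/htdocs
    # mpm-itk: handle this vhost's requests as the site user
    AssignUserID site-user site-group
</VirtualHost>
```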
### Using filesystem ACLs
Finally, I came across a solution using filesystem ACLs. Essentially, it is very similar to the setgid approach, but it allows for more fine-grained access control: selected system groups can be given access to directories/files even when they neither own nor group-own them.
My solution is to use them as follows:

```
setfacl -m "g:ws-group:x" -- "./"
setfacl -d -R -m "g:ws-group:rX" -- "./htdocs"
setfacl -R -m "g:ws-group:rX" -- "./htdocs"
```
The first command gives the webserver "search" permission on the site root; without it, the webserver would not be able to access anything. The second command recursively sets the default ACLs (applied to all newly created directories and files) on the `/htdocs` directory, and the third one adds the same ACLs to the already existing contents of the directory.
Similarly to the setgid approach, this also needs to be set on the temporary directory for uploaded files:

```
setfacl -d -m "g:ws-group:r" -- "./tmp/upload"
```
While it is effectively the same as the setgid solution, the advantage of this approach is that the ACLs are not nearly as fragile: `chmod()` and `mkdir()` operations do not affect them.

Server administrators can easily control which directories are public by default and which are private. Webmasters can choose how to keep some files unpublished: either use the private files directory and skip the trouble of understanding the Linux filesystem permissions model, or control permissions manually by removing group access from selected files. As opposed to setgid, if you remove group access from a directory or file, the ACLs stay; by granting group access back, the ACLs become effective again.
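That persistence is easy to observe, again using your own primary group as a stand-in for `ws-group`:

```shell
#!/bin/sh
# Demo: chmod removes group access, but the ACL entry survives and becomes
# effective again once group access is restored.
d=$(mktemp -d)
setfacl -m "g:$(id -gn):rX" "$d"
chmod 0700 "$d"                   # group access gone (ACL mask set to ---) ...
getfacl --omit-header "$d" | grep "^group:$(id -gn)"   # ...entry still there
chmod 0750 "$d"                   # the ACL entry is effective again
rm -rf "$d"
```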
This approach fits all my needs and is my solution of choice.
## Closing thoughts
Filesystem permissions currently seem to be the interface through which an application controls access to its files. I am deeply convinced that this is wrong, as it leads application authors to make various assumptions about the hosting, such as:
- "If I remove access to a file from the group and others, it won't be accessible publicly."
- "In order to make a file publicly accessible, I need to give it an access mode of `0666` (or `0444`)."
Both of these assumptions might be wrong. But even if they were true, the existence of this SuperUser question shows how easily things can go wrong:
1. The application creates a directory and makes it private by assigning the access mode `0700`.
2. The application creates files in the new directory.
3. Later on, the application decides to make the contents public by granting group access (`chmod -R g+rX dir`).
4. Because the chmod in step 1 also destroyed the setgid bit that was needed, the files are not in the correct group and are thus still not accessible.
5. The application then decides to make the files readable by all users (`chmod -R go+rX dir`).
6. Now, sensitive data might be exposed to other users of the system.
Of course, a solution is for the hosting provider to declare guarantees and requirements and for the applications to follow them. While I essentially agree with that, I am suggesting that the current guarantees and requirements are too vague or too complex to be relied on securely enough.
In my opinion, hosted applications should not mess with the underlying filesystem permissions at all; that is the job of the infrastructure. That is why I suggest that hostings offer "public" and "private" directories and make it the job of the underlying infrastructure to grant or revoke public access to them.
That certainly demands that webmasters have some knowledge of the infrastructure, but it is a very simple concept that can easily be agreed upon. It is also far easier to implement than semi-complex logic around filesystem permissions, as the Wordpress example shows.
I am aware that this does not apply well to bigger applications, for which the infrastructure will be tuned to their needs. However, there are still plenty of small applications around the world that should adhere to the requirements of mass virtual hosting, as it does not pay off for hosting providers to run each site in a separate container.