If the pathname starts with the '/' character, the starting lookup directory is
the root directory of the current process. (A process inherits its root
directory from its parent. Usually this will be the root directory of the file
hierarchy. A process may get a different root directory by use of the
chroot(2) system call. A process may get an entirely private namespace
if it, or one of its ancestors, was started by an invocation of the
clone(2) system call that had the CLONE_NEWNS flag set.) This handles
the '/' part of the pathname.
If the pathname does not start with the '/' character, the starting lookup
directory of the resolution process is the current working directory of the
process. (This is also inherited from the parent. It can be changed by use of
the chdir(2) system call.)
Pathnames starting with a '/' character are called absolute pathnames. Pathnames
not starting with a '/' are called relative pathnames.
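The choice of starting lookup directory can be illustrated with a small sketch. The function name and the root and cwd parameters are hypothetical stand-ins for state that the kernel keeps per process; this is not how the kernel represents it.

```python
import os


def starting_lookup_directory(pathname, root="/", cwd=None):
    """Return the directory in which resolution of `pathname` would start.

    `root` and `cwd` are illustrative parameters standing in for the
    per-process root directory and current working directory.
    """
    if cwd is None:
        cwd = os.getcwd()
    # Absolute pathnames start at the process root directory,
    # relative pathnames at the current working directory.
    return root if pathname.startswith("/") else cwd
```

For example, with cwd="/home/user" the pathname "etc/passwd" starts at "/home/user", while "/etc/passwd" starts at the root directory.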
Set the current lookup directory to the starting lookup directory. Now, for each
non-final component of the pathname, where a component is a substring
delimited by '/' characters, this component is looked up in the current lookup
directory.
If the process does not have search permission on the current lookup directory,
an EACCES error is returned ("Permission denied").
If the component is not found, an ENOENT error is returned ("No such file
or directory").
If the component is found, but is neither a directory nor a symbolic link, an
ENOTDIR error is returned ("Not a directory").
If the component is found and is a directory, we set the current lookup
directory to that directory, and go to the next component.
If the component is found and is a symbolic link (symlink), we first resolve
this symbolic link (with the current lookup directory as starting lookup
directory). Upon error, that error is returned. If the result is not a
directory, an ENOTDIR error is returned. If the resolution of the symlink is
successful and returns a directory, we set the current lookup directory to
that directory, and go to the next component. Note that the resolution process
here involves recursion. In order to protect the kernel against stack
overflow, and also to protect against denial of service, there are limits on
the maximum recursion depth, and on the maximum number of symlinks followed.
An ELOOP error is returned when the maximum is exceeded ("Too many levels
of symbolic links").
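The walk described above can be imitated in user space. The following is a rough sketch, not the kernel's implementation: the function name walk_directories and the value of MAX_SYMLINKS are assumptions made for illustration, and the sketch relies on lstat to report EACCES and ENOENT rather than checking search permission itself.

```python
import errno
import os
import stat

# Assumed limit for this sketch; the kernel's actual limits depend on its version.
MAX_SYMLINKS = 40


def walk_directories(pathname, start, followed=0):
    """Walk every component of `pathname`, requiring each to be a directory
    or a symbolic link that resolves to a directory.

    Returns (resulting_directory, symlinks_followed).
    """
    current = "/" if pathname.startswith("/") else start
    for component in pathname.split("/"):
        if component == "":
            continue  # produced by leading, trailing or doubled '/'
        candidate = os.path.join(current, component)
        # lstat does not follow a final symlink; it raises OSError with
        # EACCES or ENOENT when the lookup in `current` fails.
        info = os.lstat(candidate)
        if stat.S_ISDIR(info.st_mode):
            current = candidate
        elif stat.S_ISLNK(info.st_mode):
            followed += 1
            if followed > MAX_SYMLINKS:
                raise OSError(errno.ELOOP, os.strerror(errno.ELOOP), candidate)
            # Resolve the link contents recursively, starting in the
            # current lookup directory; the result must be a directory.
            current, followed = walk_directories(
                os.readlink(candidate), current, followed
            )
        else:
            raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR), candidate)
    return current, followed
```

To handle only the non-final components of a pathname, one could call this on the part of the pathname before the last '/', for example walk_directories(os.path.dirname(pathname), os.getcwd()).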
The lookup of the final component of the pathname goes just like that of all
other components, as described in the previous step, with two differences: (i)
the final component need not be a directory (at least as far as the path
resolution process is concerned; it may have to be a directory, or a
non-directory, because of the requirements of the specific system call), and
(ii) it is not necessarily an error if the component is not found, since the
system call may be about to create it. The details on the treatment of the
final entry are
described in the manual pages of the specific system calls.
If a pathname ends in a '/', that forces resolution of the preceding component
as in Step 2: it has to exist and resolve to a directory. Otherwise a
trailing '/' is ignored. (Or, equivalently, a pathname with a trailing '/' is
equivalent to the pathname obtained by appending '.' to it.)
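The effect of a trailing '/' can be observed with stat(2). This is a small experiment rather than a specification: the helper names stat_outcome and trailing_slash_demo are made up for illustration, and the file name "plain" is arbitrary.

```python
import errno
import os
import tempfile


def stat_outcome(pathname):
    """Return "ok" if stat(2) succeeds on `pathname`, else the errno name."""
    try:
        os.stat(pathname)
        return "ok"
    except OSError as exc:
        return errno.errorcode[exc.errno]


def trailing_slash_demo():
    """Compare stat(2) on a regular file and a directory, with and
    without a trailing '/'."""
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "plain")
        open(plain, "w").close()  # create an empty regular file
        return {
            "file": stat_outcome(plain),
            "file/": stat_outcome(plain + "/"),
            "dir/": stat_outcome(tmp + "/"),
            "dir/.": stat_outcome(tmp + "/."),
        }
```

On a regular file the trailing '/' makes the call fail with ENOTDIR, while on a directory both the trailing '/' and the appended '.' forms succeed.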
If the last component of a pathname is a symbolic link, then it depends on the
system call whether the file referred to will be the symbolic link or the
result of path resolution on its contents. For example, the system call
lstat(2) will operate on the symlink, while stat(2) operates on
the file pointed to by the symlink.
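The difference between lstat(2) and stat(2) on a final symbolic link can be checked directly. The sketch below uses Python's os.stat and os.lstat wrappers; the function name and the file names "target" and "link" are arbitrary choices for the example.

```python
import os
import stat
import tempfile


def final_symlink_demo():
    """Return (stat_sees_regular_file, lstat_sees_symlink) for a symlink
    whose contents name a regular file."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        link = os.path.join(tmp, "link")
        open(target, "w").close()   # empty regular file
        os.symlink("target", link)  # link contents: "target"
        followed = os.stat(link)       # follows the final symlink
        not_followed = os.lstat(link)  # operates on the symlink itself
        return (
            stat.S_ISREG(followed.st_mode),
            stat.S_ISLNK(not_followed.st_mode),
        )
```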
The permission bits of a file consist of three groups of three bits, cf.
chmod(1) and stat(2). The first group of three is used when the
effective user ID of the current process equals the owner ID of the file. The
second group of three is used when the group ID of the file either equals the
effective group ID of the current process, or is one of the supplementary
group IDs of the current process (as set by setgroups(2)). When neither
holds, the third group is used.
Of the three bits used, the first bit determines read permission, the second
write permission, and the last execute permission in case of ordinary files,
or search permission in case of directories.
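The selection of one group of three bits can be written out as a short function. This is a sketch of the rule as stated above, using the effective IDs; the function and parameter names are illustrative only.

```python
def permission_triplet(mode, file_uid, file_gid, euid, egid, supplementary_gids=()):
    """Return (read, write, execute_or_search) from the single group of
    three permission bits that applies to the process."""
    if euid == file_uid:
        shift = 6  # first group: owner
    elif file_gid == egid or file_gid in supplementary_gids:
        shift = 3  # second group: group
    else:
        shift = 0  # third group: everyone else
    bits = (mode >> shift) & 0o7
    return bool(bits & 0o4), bool(bits & 0o2), bool(bits & 0o1)
```

Only one group is ever consulted. With mode 0o077, for instance, the owner is denied everything even though the other two groups grant full access.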
Linux uses the fsuid instead of the effective user ID in permission checks.
Ordinarily the fsuid will equal the effective user ID, but the fsuid can be
changed by the system call setfsuid(2).
(Here "fsuid" stands for something like "file system user
ID". The concept was required for the implementation of a user space NFS
server at a time when processes could send a signal to a process with the same
effective user ID. It is obsolete now. Nobody should use setfsuid(2).)
Similarly, Linux uses the fsgid instead of the effective group ID. See
If the permission bits of the file deny whatever is asked, permission can still
be granted by the appropriate capabilities.
Traditional systems do not use capabilities and root (user ID 0) is
all-powerful. Such systems are presently (as of Linux 2.6.7) handled by giving
root all capabilities except for CAP_SETPCAP. More precisely, at exec time a
process whose effective user ID (euid) is zero gets all capabilities except
CAP_SETPCAP and the five capabilities CAP_CHOWN, CAP_DAC_OVERRIDE,
CAP_DAC_READ_SEARCH, CAP_FOWNER, and CAP_FSETID; a process whose fsuid is zero
gets these last five capabilities; all other processes get no capabilities.
The CAP_DAC_OVERRIDE capability overrides all permission checking, but will only
grant execute permission when at least one of the three execute permission
bits is set.
The CAP_DAC_READ_SEARCH capability will grant read and search permission on
directories, and read permission on ordinary files.
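The two capability rules above can be summarized as a decision function applied after the permission bits have denied a request. This is a simplified sketch, not the kernel's code: the function name and argument shapes are invented, and it assumes that the execute-bit condition for CAP_DAC_OVERRIDE applies only to ordinary files, so that search on directories is always granted.

```python
ANY_EXEC = 0o111  # the three execute permission bits


def capabilities_grant(want, mode, is_dir, caps):
    """Decide whether capabilities alone grant a request.

    want:   set drawn from {"r", "w", "x"} ("x" means search for directories)
    mode:   permission bits of the file
    is_dir: True if the file is a directory
    caps:   set of capability names held by the process
    """
    if "CAP_DAC_OVERRIDE" in caps:
        # Assumption: the execute-bit requirement concerns ordinary files only.
        if is_dir or "x" not in want or mode & ANY_EXEC:
            return True
    if "CAP_DAC_READ_SEARCH" in caps:
        # Read and search on directories, read only on ordinary files.
        allowed = {"r", "x"} if is_dir else {"r"}
        if want <= allowed:
            return True
    return False
```

For example, a process holding only CAP_DAC_OVERRIDE cannot execute a file with mode 0o644, but can execute one with mode 0o744 whatever its own IDs are.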
The CAP_SYS_ADMIN capability will, for example, allow a process to exceed the limit
(visible in /proc/sys/fs/file-max) on the maximum number of open files
in the system, where a process lacking that capability would see an ENFILE