mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-01-22 16:06:04 -05:00
dad2f20715
Improve user space and kernel documentation. Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com> Link: https://lore.kernel.org/r/20241015172647.2007644-1-dburgener@linux.microsoft.com [mic: Extend commit message, reword ptrace restriction as discussed in the thread] Signed-off-by: Mickaël Salaün <mic@digikod.net>
694 lines
28 KiB
ReStructuredText
694 lines
28 KiB
ReStructuredText
.. SPDX-License-Identifier: GPL-2.0
|
|
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
|
|
.. Copyright © 2019-2020 ANSSI
|
|
.. Copyright © 2021-2022 Microsoft Corporation
|
|
|
|
=====================================
|
|
Landlock: unprivileged access control
|
|
=====================================
|
|
|
|
:Author: Mickaël Salaün
|
|
:Date: October 2024
|
|
|
|
The goal of Landlock is to enable restriction of ambient rights (e.g. global
|
|
filesystem or network access) for a set of processes. Because Landlock
|
|
is a stackable LSM, it makes it possible to create safe security sandboxes as
|
|
new security layers in addition to the existing system-wide access-controls.
|
|
This kind of sandbox is expected to help mitigate the security impact of bugs or
|
|
unexpected/malicious behaviors in user space applications. Landlock empowers
|
|
any process, including unprivileged ones, to securely restrict themselves.
|
|
|
|
We can quickly make sure that Landlock is enabled in the running system by
|
|
looking for "landlock: Up and running" in kernel logs (as root):
|
|
``dmesg | grep landlock || journalctl -kb -g landlock`` .
|
|
Developers can also easily check for Landlock support with a
|
|
:ref:`related system call <landlock_abi_versions>`.
|
|
If Landlock is not currently supported, we need to
|
|
:ref:`configure the kernel appropriately <kernel_support>`.
|
|
|
|
Landlock rules
|
|
==============
|
|
|
|
A Landlock rule describes an action on an object which the process intends to
|
|
perform. A set of rules is aggregated in a ruleset, which can then restrict
|
|
the thread enforcing it, and its future children.
|
|
|
|
The two existing types of rules are:
|
|
|
|
Filesystem rules
|
|
For these rules, the object is a file hierarchy,
|
|
and the related filesystem actions are defined with
|
|
`filesystem access rights`.
|
|
|
|
Network rules (since ABI v4)
|
|
For these rules, the object is a TCP port,
|
|
and the related actions are defined with `network access rights`.
|
|
|
|
Defining and enforcing a security policy
|
|
----------------------------------------
|
|
|
|
We first need to define the ruleset that will contain our rules.
|
|
|
|
For this example, the ruleset will contain rules that only allow filesystem
|
|
read actions and establish a specific TCP connection. Filesystem write
|
|
actions and other TCP actions will be denied.
|
|
|
|
The ruleset then needs to handle both these kinds of actions. This is
|
|
required for backward and forward compatibility (i.e. the kernel and user
|
|
space may not know each other's supported restrictions), hence the need
|
|
to be explicit about the denied-by-default access rights.
|
|
|
|
.. code-block:: c
|
|
|
|
struct landlock_ruleset_attr ruleset_attr = {
|
|
.handled_access_fs =
|
|
LANDLOCK_ACCESS_FS_EXECUTE |
|
|
LANDLOCK_ACCESS_FS_WRITE_FILE |
|
|
LANDLOCK_ACCESS_FS_READ_FILE |
|
|
LANDLOCK_ACCESS_FS_READ_DIR |
|
|
LANDLOCK_ACCESS_FS_REMOVE_DIR |
|
|
LANDLOCK_ACCESS_FS_REMOVE_FILE |
|
|
LANDLOCK_ACCESS_FS_MAKE_CHAR |
|
|
LANDLOCK_ACCESS_FS_MAKE_DIR |
|
|
LANDLOCK_ACCESS_FS_MAKE_REG |
|
|
LANDLOCK_ACCESS_FS_MAKE_SOCK |
|
|
LANDLOCK_ACCESS_FS_MAKE_FIFO |
|
|
LANDLOCK_ACCESS_FS_MAKE_BLOCK |
|
|
LANDLOCK_ACCESS_FS_MAKE_SYM |
|
|
LANDLOCK_ACCESS_FS_REFER |
|
|
LANDLOCK_ACCESS_FS_TRUNCATE |
|
|
LANDLOCK_ACCESS_FS_IOCTL_DEV,
|
|
.handled_access_net =
|
|
LANDLOCK_ACCESS_NET_BIND_TCP |
|
|
LANDLOCK_ACCESS_NET_CONNECT_TCP,
|
|
.scoped =
|
|
LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
|
|
LANDLOCK_SCOPE_SIGNAL,
|
|
};
|
|
|
|
Because we may not know which kernel version an application will be executed
|
|
on, it is safer to follow a best-effort security approach. Indeed, we
|
|
should try to protect users as much as possible whatever the kernel they are
|
|
using.
|
|
|
|
To be compatible with older Linux versions, we detect the available Landlock ABI
|
|
version, and only use the available subset of access rights:
|
|
|
|
.. code-block:: c
|
|
|
|
int abi;
|
|
|
|
abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
|
|
if (abi < 0) {
|
|
/* Degrades gracefully if Landlock is not handled. */
|
|
perror("The running kernel does not enable to use Landlock");
|
|
return 0;
|
|
}
|
|
switch (abi) {
|
|
case 1:
|
|
/* Removes LANDLOCK_ACCESS_FS_REFER for ABI < 2 */
|
|
ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
|
|
__attribute__((fallthrough));
|
|
case 2:
|
|
/* Removes LANDLOCK_ACCESS_FS_TRUNCATE for ABI < 3 */
|
|
ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
|
|
__attribute__((fallthrough));
|
|
case 3:
|
|
/* Removes network support for ABI < 4 */
|
|
ruleset_attr.handled_access_net &=
|
|
~(LANDLOCK_ACCESS_NET_BIND_TCP |
|
|
LANDLOCK_ACCESS_NET_CONNECT_TCP);
|
|
__attribute__((fallthrough));
|
|
case 4:
|
|
/* Removes LANDLOCK_ACCESS_FS_IOCTL_DEV for ABI < 5 */
|
|
ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_IOCTL_DEV;
|
|
__attribute__((fallthrough));
|
|
case 5:
|
|
/* Removes LANDLOCK_SCOPE_* for ABI < 6 */
|
|
ruleset_attr.scoped &= ~(LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
|
|
LANDLOCK_SCOPE_SIGNAL);
|
|
}
|
|
|
|
This enables the creation of an inclusive ruleset that will contain our rules.
|
|
|
|
.. code-block:: c
|
|
|
|
int ruleset_fd;
|
|
|
|
ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
|
|
if (ruleset_fd < 0) {
|
|
perror("Failed to create a ruleset");
|
|
return 1;
|
|
}
|
|
|
|
We can now add a new rule to this ruleset thanks to the returned file
|
|
descriptor referring to this ruleset. The rule will only allow reading the
|
|
file hierarchy ``/usr``. Without another rule, write actions would then be
|
|
denied by the ruleset. To add ``/usr`` to the ruleset, we open it with the
|
|
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
|
|
descriptor.
|
|
|
|
.. code-block:: c
|
|
|
|
int err;
|
|
struct landlock_path_beneath_attr path_beneath = {
|
|
.allowed_access =
|
|
LANDLOCK_ACCESS_FS_EXECUTE |
|
|
LANDLOCK_ACCESS_FS_READ_FILE |
|
|
LANDLOCK_ACCESS_FS_READ_DIR,
|
|
};
|
|
|
|
path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
|
|
if (path_beneath.parent_fd < 0) {
|
|
perror("Failed to open file");
|
|
close(ruleset_fd);
|
|
return 1;
|
|
}
|
|
err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
|
|
&path_beneath, 0);
|
|
close(path_beneath.parent_fd);
|
|
if (err) {
|
|
perror("Failed to update ruleset");
|
|
close(ruleset_fd);
|
|
return 1;
|
|
}
|
|
|
|
It may also be required to create rules following the same logic as explained
|
|
for the ruleset creation, by filtering access rights according to the Landlock
|
|
ABI version. In this example, this is not required because all of the requested
|
|
``allowed_access`` rights are already available in ABI 1.
|
|
|
|
For network access-control, we can add a set of rules that allow to use a port
|
|
number for a specific action: HTTPS connections.
|
|
|
|
.. code-block:: c
|
|
|
|
struct landlock_net_port_attr net_port = {
|
|
.allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
|
|
.port = 443,
|
|
};
|
|
|
|
err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
|
|
&net_port, 0);
|
|
|
|
The next step is to restrict the current thread from gaining more privileges
|
|
(e.g. through a SUID binary). We now have a ruleset with the first rule
|
|
allowing read access to ``/usr`` while denying all other handled accesses for
|
|
the filesystem, and a second rule allowing HTTPS connections.
|
|
|
|
.. code-block:: c
|
|
|
|
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
|
|
perror("Failed to restrict privileges");
|
|
close(ruleset_fd);
|
|
return 1;
|
|
}
|
|
|
|
The current thread is now ready to sandbox itself with the ruleset.
|
|
|
|
.. code-block:: c
|
|
|
|
if (landlock_restrict_self(ruleset_fd, 0)) {
|
|
perror("Failed to enforce ruleset");
|
|
close(ruleset_fd);
|
|
return 1;
|
|
}
|
|
close(ruleset_fd);
|
|
|
|
If the ``landlock_restrict_self`` system call succeeds, the current thread is
|
|
now restricted and this policy will be enforced on all its subsequently created
|
|
children as well. Once a thread is landlocked, there is no way to remove its
|
|
security policy; only adding more restrictions is allowed. These threads are
|
|
now in a new Landlock domain, which is a merger of their parent one (if any)
|
|
with the new ruleset.
|
|
|
|
Full working code can be found in `samples/landlock/sandboxer.c`_.
|
|
|
|
Good practices
|
|
--------------
|
|
|
|
It is recommended to set access rights to file hierarchy leaves as much as
|
|
possible. For instance, it is better to be able to have ``~/doc/`` as a
|
|
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
|
|
``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
|
|
Following this good practice leads to self-sufficient hierarchies that do not
|
|
depend on their location (i.e. parent directories). This is particularly
|
|
relevant when we want to allow linking or renaming. Indeed, having consistent
|
|
access rights per directory enables changing the location of such directories
|
|
without relying on the destination directory access rights (except those that
|
|
are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
|
|
documentation).
|
|
|
|
Having self-sufficient hierarchies also helps to tighten the required access
|
|
rights to the minimal set of data. This also helps avoid sinkhole directories,
|
|
i.e. directories where data can be linked to but not linked from. However,
|
|
this depends on data organization, which might not be controlled by developers.
|
|
In this case, granting read-write access to ``~/tmp/``, instead of write-only
|
|
access, would potentially allow moving ``~/tmp/`` to a non-readable directory
|
|
and still keep the ability to list the content of ``~/tmp/``.
|
|
|
|
Layers of file path access rights
|
|
---------------------------------
|
|
|
|
Each time a thread enforces a ruleset on itself, it updates its Landlock domain
|
|
with a new layer of policy. This complementary policy is stacked with any
|
|
other rulesets potentially already restricting this thread. A sandboxed thread
|
|
can then safely add more constraints to itself with a new enforced ruleset.
|
|
|
|
One policy layer grants access to a file path if at least one of its rules
|
|
encountered on the path grants the access. A sandboxed thread can only access
|
|
a file path if all its enforced policy layers grant the access as well as all
|
|
the other system access controls (e.g. filesystem DAC, other LSM policies,
|
|
etc.).
|
|
|
|
Bind mounts and OverlayFS
|
|
-------------------------
|
|
|
|
Landlock enables restricting access to file hierarchies, which means that these
|
|
access rights can be propagated with bind mounts (cf.
|
|
Documentation/filesystems/sharedsubtree.rst) but not with
|
|
Documentation/filesystems/overlayfs.rst.
|
|
|
|
A bind mount mirrors a source file hierarchy to a destination. The destination
|
|
hierarchy is then composed of the exact same files, on which Landlock rules can
|
|
be tied, either via the source or the destination path. These rules restrict
|
|
access when they are encountered on a path, which means that they can restrict
|
|
access to multiple file hierarchies at the same time, whether these hierarchies
|
|
are the result of bind mounts or not.
|
|
|
|
An OverlayFS mount point consists of upper and lower layers. These layers are
|
|
combined in a merge directory, and that merged directory becomes available at
|
|
the mount point. This merge hierarchy may include files from the upper and
|
|
lower layers, but modifications performed on the merge hierarchy only reflect
|
|
on the upper layer. From a Landlock policy point of view, all OverlayFS layers
|
|
and merge hierarchies are standalone and each contains their own set of files
|
|
and directories, which is different from bind mounts. A policy restricting an
|
|
OverlayFS layer will not restrict the resulted merged hierarchy, and vice versa.
|
|
Landlock users should then only think about file hierarchies they want to allow
|
|
access to, regardless of the underlying filesystem.
|
|
|
|
Inheritance
|
|
-----------
|
|
|
|
Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
|
|
restrictions from its parent. This is similar to seccomp inheritance (cf.
|
|
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
|
|
task's :manpage:`credentials(7)`. For instance, one process's thread may apply
|
|
Landlock rules to itself, but they will not be automatically applied to other
|
|
sibling threads (unlike POSIX thread credential changes, cf.
|
|
:manpage:`nptl(7)`).
|
|
|
|
When a thread sandboxes itself, we have the guarantee that the related security
|
|
policy will stay enforced on all this thread's descendants. This allows
|
|
creating standalone and modular security policies per application, which will
|
|
automatically be composed between themselves according to their runtime parent
|
|
policies.
|
|
|
|
Ptrace restrictions
|
|
-------------------
|
|
|
|
A sandboxed process has less privileges than a non-sandboxed process and must
|
|
then be subject to additional restrictions when manipulating another process.
|
|
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
|
|
process, a sandboxed process should have a superset of the target process's
|
|
access rights, which means the tracee must be in a sub-domain of the tracer.
|
|
|
|
IPC scoping
|
|
-----------
|
|
|
|
Similar to the implicit `Ptrace restrictions`_, we may want to further restrict
|
|
interactions between sandboxes. Each Landlock domain can be explicitly scoped
|
|
for a set of actions by specifying it on a ruleset. For example, if a
|
|
sandboxed process should not be able to :manpage:`connect(2)` to a
|
|
non-sandboxed process through abstract :manpage:`unix(7)` sockets, we can
|
|
specify such a restriction with ``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET``.
|
|
Moreover, if a sandboxed process should not be able to send a signal to a
|
|
non-sandboxed process, we can specify this restriction with
|
|
``LANDLOCK_SCOPE_SIGNAL``.
|
|
|
|
A sandboxed process can connect to a non-sandboxed process when its domain is
|
|
not scoped. If a process's domain is scoped, it can only connect to sockets
|
|
created by processes in the same scope.
|
|
Moreover, If a process is scoped to send signal to a non-scoped process, it can
|
|
only send signals to processes in the same scope.
|
|
|
|
A connected datagram socket behaves like a stream socket when its domain is
|
|
scoped, meaning if the domain is scoped after the socket is connected , it can
|
|
still :manpage:`send(2)` data just like a stream socket. However, in the same
|
|
scenario, a non-connected datagram socket cannot send data (with
|
|
:manpage:`sendto(2)`) outside its scope.
|
|
|
|
A process with a scoped domain can inherit a socket created by a non-scoped
|
|
process. The process cannot connect to this socket since it has a scoped
|
|
domain.
|
|
|
|
IPC scoping does not support exceptions, so if a domain is scoped, no rules can
|
|
be added to allow access to resources or processes outside of the scope.
|
|
|
|
Truncating files
|
|
----------------
|
|
|
|
The operations covered by ``LANDLOCK_ACCESS_FS_WRITE_FILE`` and
|
|
``LANDLOCK_ACCESS_FS_TRUNCATE`` both change the contents of a file and sometimes
|
|
overlap in non-intuitive ways. It is recommended to always specify both of
|
|
these together.
|
|
|
|
A particularly surprising example is :manpage:`creat(2)`. The name suggests
|
|
that this system call requires the rights to create and write files. However,
|
|
it also requires the truncate right if an existing file under the same name is
|
|
already present.
|
|
|
|
It should also be noted that truncating files does not require the
|
|
``LANDLOCK_ACCESS_FS_WRITE_FILE`` right. Apart from the :manpage:`truncate(2)`
|
|
system call, this can also be done through :manpage:`open(2)` with the flags
|
|
``O_RDONLY | O_TRUNC``.
|
|
|
|
The truncate right is associated with the opened file (see below).
|
|
|
|
Rights associated with file descriptors
|
|
---------------------------------------
|
|
|
|
When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE`` and
|
|
``LANDLOCK_ACCESS_FS_IOCTL_DEV`` rights is associated with the newly created
|
|
file descriptor and will be used for subsequent truncation and ioctl attempts
|
|
using :manpage:`ftruncate(2)` and :manpage:`ioctl(2)`. The behavior is similar
|
|
to opening a file for reading or writing, where permissions are checked during
|
|
:manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
|
|
:manpage:`write(2)` calls.
|
|
|
|
As a consequence, it is possible that a process has multiple open file
|
|
descriptors referring to the same file, but Landlock enforces different things
|
|
when operating with these file descriptors. This can happen when a Landlock
|
|
ruleset gets enforced and the process keeps file descriptors which were opened
|
|
both before and after the enforcement. It is also possible to pass such file
|
|
descriptors between processes, keeping their Landlock properties, even when some
|
|
of the involved processes do not have an enforced Landlock ruleset.
|
|
|
|
Compatibility
|
|
=============
|
|
|
|
Backward and forward compatibility
|
|
----------------------------------
|
|
|
|
Landlock is designed to be compatible with past and future versions of the
|
|
kernel. This is achieved thanks to the system call attributes and the
|
|
associated bitflags, particularly the ruleset's ``handled_access_fs``. Making
|
|
handled access rights explicit enables the kernel and user space to have a clear
|
|
contract with each other. This is required to make sure sandboxing will not
|
|
get stricter with a system update, which could break applications.
|
|
|
|
Developers can subscribe to the `Landlock mailing list
|
|
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
|
|
test their applications with the latest available features. In the interest of
|
|
users, and because they may use different kernel versions, it is strongly
|
|
encouraged to follow a best-effort security approach by checking the Landlock
|
|
ABI version at runtime and only enforcing the supported features.
|
|
|
|
.. _landlock_abi_versions:
|
|
|
|
Landlock ABI versions
|
|
---------------------
|
|
|
|
The Landlock ABI version can be read with the sys_landlock_create_ruleset()
|
|
system call:
|
|
|
|
.. code-block:: c
|
|
|
|
int abi;
|
|
|
|
abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
|
|
if (abi < 0) {
|
|
switch (errno) {
|
|
case ENOSYS:
|
|
printf("Landlock is not supported by the current kernel.\n");
|
|
break;
|
|
case EOPNOTSUPP:
|
|
printf("Landlock is currently disabled.\n");
|
|
break;
|
|
}
|
|
return 0;
|
|
}
|
|
if (abi >= 2) {
|
|
printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
|
|
}
|
|
|
|
The following kernel interfaces are implicitly supported by the first ABI
|
|
version. Features only supported from a specific version are explicitly marked
|
|
as such.
|
|
|
|
Kernel interface
|
|
================
|
|
|
|
Access rights
|
|
-------------
|
|
|
|
.. kernel-doc:: include/uapi/linux/landlock.h
|
|
:identifiers: fs_access net_access scope
|
|
|
|
Creating a new ruleset
|
|
----------------------
|
|
|
|
.. kernel-doc:: security/landlock/syscalls.c
|
|
:identifiers: sys_landlock_create_ruleset
|
|
|
|
.. kernel-doc:: include/uapi/linux/landlock.h
|
|
:identifiers: landlock_ruleset_attr
|
|
|
|
Extending a ruleset
|
|
-------------------
|
|
|
|
.. kernel-doc:: security/landlock/syscalls.c
|
|
:identifiers: sys_landlock_add_rule
|
|
|
|
.. kernel-doc:: include/uapi/linux/landlock.h
|
|
:identifiers: landlock_rule_type landlock_path_beneath_attr
|
|
landlock_net_port_attr
|
|
|
|
Enforcing a ruleset
|
|
-------------------
|
|
|
|
.. kernel-doc:: security/landlock/syscalls.c
|
|
:identifiers: sys_landlock_restrict_self
|
|
|
|
Current limitations
|
|
===================
|
|
|
|
Filesystem topology modification
|
|
--------------------------------
|
|
|
|
Threads sandboxed with filesystem restrictions cannot modify filesystem
|
|
topology, whether via :manpage:`mount(2)` or :manpage:`pivot_root(2)`.
|
|
However, :manpage:`chroot(2)` calls are not denied.
|
|
|
|
Special filesystems
|
|
-------------------
|
|
|
|
Access to regular files and directories can be restricted by Landlock,
|
|
according to the handled accesses of a ruleset. However, files that do not
|
|
come from a user-visible filesystem (e.g. pipe, socket), but can still be
|
|
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
|
|
restricted. Likewise, some special kernel filesystems such as nsfs, which can
|
|
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
|
|
restricted. However, thanks to the `ptrace restrictions`_, access to such
|
|
sensitive ``/proc`` files are automatically restricted according to domain
|
|
hierarchies. Future Landlock evolutions could still enable to explicitly
|
|
restrict such paths with dedicated ruleset flags.
|
|
|
|
Ruleset layers
|
|
--------------
|
|
|
|
There is a limit of 16 layers of stacked rulesets. This can be an issue for a
|
|
task willing to enforce a new ruleset in complement to its 16 inherited
|
|
rulesets. Once this limit is reached, sys_landlock_restrict_self() returns
|
|
E2BIG. It is then strongly suggested to carefully build rulesets once in the
|
|
life of a thread, especially for applications able to launch other applications
|
|
that may also want to sandbox themselves (e.g. shells, container managers,
|
|
etc.).
|
|
|
|
Memory usage
|
|
------------
|
|
|
|
Kernel memory allocated to create rulesets is accounted and can be restricted
|
|
by the Documentation/admin-guide/cgroup-v1/memory.rst.
|
|
|
|
IOCTL support
|
|
-------------
|
|
|
|
The ``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right restricts the use of
|
|
:manpage:`ioctl(2)`, but it only applies to *newly opened* device files. This
|
|
means specifically that pre-existing file descriptors like stdin, stdout and
|
|
stderr are unaffected.
|
|
|
|
Users should be aware that TTY devices have traditionally permitted to control
|
|
other processes on the same TTY through the ``TIOCSTI`` and ``TIOCLINUX`` IOCTL
|
|
commands. Both of these require ``CAP_SYS_ADMIN`` on modern Linux systems, but
|
|
the behavior is configurable for ``TIOCSTI``.
|
|
|
|
On older systems, it is therefore recommended to close inherited TTY file
|
|
descriptors, or to reopen them from ``/proc/self/fd/*`` without the
|
|
``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right, if possible.
|
|
|
|
Landlock's IOCTL support is coarse-grained at the moment, but may become more
|
|
fine-grained in the future. Until then, users are advised to establish the
|
|
guarantees that they need through the file hierarchy, by only allowing the
|
|
``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right on files where it is really required.
|
|
|
|
Previous limitations
|
|
====================
|
|
|
|
File renaming and linking (ABI < 2)
|
|
-----------------------------------
|
|
|
|
Because Landlock targets unprivileged access controls, it needs to properly
|
|
handle composition of rules. Such property also implies rules nesting.
|
|
Properly handling multiple layers of rulesets, each one of them able to
|
|
restrict access to files, also implies inheritance of the ruleset restrictions
|
|
from a parent to its hierarchy. Because files are identified and restricted by
|
|
their hierarchy, moving or linking a file from one directory to another implies
|
|
propagation of the hierarchy constraints, or restriction of these actions
|
|
according to the potentially lost constraints. To protect against privilege
|
|
escalations through renaming or linking, and for the sake of simplicity,
|
|
Landlock previously limited linking and renaming to the same directory.
|
|
Starting with the Landlock ABI version 2, it is now possible to securely
|
|
control renaming and linking thanks to the new ``LANDLOCK_ACCESS_FS_REFER``
|
|
access right.
|
|
|
|
File truncation (ABI < 3)
|
|
-------------------------
|
|
|
|
File truncation could not be denied before the third Landlock ABI, so it is
|
|
always allowed when using a kernel that only supports the first or second ABI.
|
|
|
|
Starting with the Landlock ABI version 3, it is now possible to securely control
|
|
truncation thanks to the new ``LANDLOCK_ACCESS_FS_TRUNCATE`` access right.
|
|
|
|
TCP bind and connect (ABI < 4)
|
|
------------------------------
|
|
|
|
Starting with the Landlock ABI version 4, it is now possible to restrict TCP
|
|
bind and connect actions to only a set of allowed ports thanks to the new
|
|
``LANDLOCK_ACCESS_NET_BIND_TCP`` and ``LANDLOCK_ACCESS_NET_CONNECT_TCP``
|
|
access rights.
|
|
|
|
Device IOCTL (ABI < 5)
|
|
----------------------
|
|
|
|
IOCTL operations could not be denied before the fifth Landlock ABI, so
|
|
:manpage:`ioctl(2)` is always allowed when using a kernel that only supports an
|
|
earlier ABI.
|
|
|
|
Starting with the Landlock ABI version 5, it is possible to restrict the use of
|
|
:manpage:`ioctl(2)` on character and block devices using the new
|
|
``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right.
|
|
|
|
Abstract UNIX socket (ABI < 6)
|
|
------------------------------
|
|
|
|
Starting with the Landlock ABI version 6, it is possible to restrict
|
|
connections to an abstract :manpage:`unix(7)` socket by setting
|
|
``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET`` to the ``scoped`` ruleset attribute.
|
|
|
|
Signal (ABI < 6)
|
|
----------------
|
|
|
|
Starting with the Landlock ABI version 6, it is possible to restrict
|
|
:manpage:`signal(7)` sending by setting ``LANDLOCK_SCOPE_SIGNAL`` to the
|
|
``scoped`` ruleset attribute.
|
|
|
|
.. _kernel_support:
|
|
|
|
Kernel support
|
|
==============
|
|
|
|
Build time configuration
|
|
------------------------
|
|
|
|
Landlock was first introduced in Linux 5.13 but it must be configured at build
|
|
time with ``CONFIG_SECURITY_LANDLOCK=y``. Landlock must also be enabled at boot
|
|
time like other security modules. The list of security modules enabled by
|
|
default is set with ``CONFIG_LSM``. The kernel configuration should then
|
|
contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
|
|
potentially useful security modules for the running system (see the
|
|
``CONFIG_LSM`` help).
|
|
|
|
Boot time configuration
|
|
-----------------------
|
|
|
|
If the running kernel does not have ``landlock`` in ``CONFIG_LSM``, then we can
|
|
enable Landlock by adding ``lsm=landlock,[...]`` to
|
|
Documentation/admin-guide/kernel-parameters.rst in the boot loader
|
|
configuration.
|
|
|
|
For example, if the current built-in configuration is:
|
|
|
|
.. code-block:: console
|
|
|
|
$ zgrep -h "^CONFIG_LSM=" "/boot/config-$(uname -r)" /proc/config.gz 2>/dev/null
|
|
CONFIG_LSM="lockdown,yama,integrity,apparmor"
|
|
|
|
...and if the cmdline doesn't contain ``landlock`` either:
|
|
|
|
.. code-block:: console
|
|
|
|
$ sed -n 's/.*\(\<lsm=\S\+\).*/\1/p' /proc/cmdline
|
|
lsm=lockdown,yama,integrity,apparmor
|
|
|
|
...we should configure the boot loader to set a cmdline extending the ``lsm``
|
|
list with the ``landlock,`` prefix::
|
|
|
|
lsm=landlock,lockdown,yama,integrity,apparmor
|
|
|
|
After a reboot, we can check that Landlock is up and running by looking at
|
|
kernel logs:
|
|
|
|
.. code-block:: console
|
|
|
|
# dmesg | grep landlock || journalctl -kb -g landlock
|
|
[ 0.000000] Command line: [...] lsm=landlock,lockdown,yama,integrity,apparmor
|
|
[ 0.000000] Kernel command line: [...] lsm=landlock,lockdown,yama,integrity,apparmor
|
|
[ 0.000000] LSM: initializing lsm=lockdown,capability,landlock,yama,integrity,apparmor
|
|
[ 0.000000] landlock: Up and running.
|
|
|
|
The kernel may be configured at build time to always load the ``lockdown`` and
|
|
``capability`` LSMs. In that case, these LSMs will appear at the beginning of
|
|
the ``LSM: initializing`` log line as well, even if they are not configured in
|
|
the boot loader.
|
|
|
|
Network support
|
|
---------------
|
|
|
|
To be able to explicitly allow TCP operations (e.g., adding a network rule with
|
|
``LANDLOCK_ACCESS_NET_BIND_TCP``), the kernel must support TCP
|
|
(``CONFIG_INET=y``). Otherwise, sys_landlock_add_rule() returns an
|
|
``EAFNOSUPPORT`` error, which can safely be ignored because this kind of TCP
|
|
operation is already not possible.
|
|
|
|
Questions and answers
|
|
=====================
|
|
|
|
What about user space sandbox managers?
|
|
---------------------------------------
|
|
|
|
Using user space processes to enforce restrictions on kernel resources can lead
|
|
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
|
|
the OS code and state
|
|
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).
|
|
|
|
What about namespaces and containers?
|
|
-------------------------------------
|
|
|
|
Namespaces can help create sandboxes but they are not designed for
|
|
access-control and then miss useful features for such use case (e.g. no
|
|
fine-grained restrictions). Moreover, their complexity can lead to security
|
|
issues, especially when untrusted processes can manipulate them (cf.
|
|
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).
|
|
|
|
Additional documentation
|
|
========================
|
|
|
|
* Documentation/security/landlock.rst
|
|
* https://landlock.io
|
|
|
|
.. Links
|
|
.. _samples/landlock/sandboxer.c:
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c
|