Wayland Compositors - Why and How to Handle Privileged Clients! (Updated on the 2014/02/21)

It’s been more than 3 years since my last security-related blog post. One might think I lost interest but the reality is that I just suck at blogging. This blog post is meant as a summary of a debate a few of us had privately and publicly on the Wayland ML.

Disclaimer: Although I try to be up to date with everything that surrounds security of X11 and Wayland, what I write in this article may be outdated, incomplete or simply blatantly wrong. This article being the basis for a document I’m planning on writing to help Wayland compositor developers implement secure compositors, I would love to hear your feedback!

1. The needs behind every security property

Before delving into how to securely export privileged features to graphics servers, let’s first have a look at the different security properties that users can expect. I’ll try to illustrate each of them with a simple example that should matter to everyone. Of course, we can imagine many other situations but that’s an exercise left to the reader.

On a graphics server, the user is only concerned with two things, input and output. Input is what the user uses to interact with the computer, while output is what is displayed back to the user.


Input confidentiality means that only the application that is supposed to get the keyboard and mouse events receives them. This is important to avoid situations where a malicious application would record keyboard strokes while you enter a password (key-loggers) or record your mouse movements when clicking on a graphical digit keyboard (used by some bank websites to authenticate you). The result of both cases is that you’ve lost your credentials’ confidentiality and someone can now impersonate you.

Output confidentiality means an application should not be allowed to read back what is displayed by other applications or the entire screen. At the moment, any application can read the whole screen’s content. This is problematic when e-shopping because after keying-in your credit card number, it is displayed and one screen-shot can leak those bank credentials (when the heck will they get their shit straight?). No output confidentiality basically means that whatever you can read can be read by another application (and likely be sent over the internet too).


Input integrity means that no application can pretend to be the user and send forged input to another application. If this were allowed, applications could perform confused deputy-like privilege escalations. Integrity doesn’t only mean that the input hasn’t been tampered with, it also means that the source of the data really is the one advertised. Without integrity, if an application was not allowed to directly access a file but the user could, the application would only have to fake sending the “Alt + F2” key sequence and key-in the commands it wants to execute. This isn’t problematic in poorly-secured Operating Systems but becomes a problem when applications start running with fewer privileges than the user who started them.

Output integrity means no application can modify the content that is displayed by another application. It also means non-repudiation of the output content: if it is on the screen, it really comes from the application it claims to be from. An attack scenario could be for an application to launch Firefox on a URL serving a copy of the HSBC website, which would steal your credentials before redirecting you to the real one. With no output integrity, the malicious application could alter the URL bar of Firefox to make it look like you are connected to the genuine website and not to the copy. This would make the attack impossible to visually detect, even by careful users.


A service is available if legitimate users can access it whenever they want. One way for an application to make other applications unavailable is to run full-screen and not allow the user to switch back to other applications. This feature is useful for screen lockers but applications such as games often make use of it, which can be very annoying. It has also been found that at least one anti-cheat system took advantage of games always running in the foreground in order to use the computational power of the gamer’s PC to mine bitcoins, making it harder for users to notice the problem.

Input availability means no application can redirect all/most of the user’s input to itself, preventing other applications from receiving input when the user intended them to. This can be achieved by forbidding unconditional routing of events to a single application, since such routing would block the compositor from receiving events such as the Alt + Tab shortcut.

Output availability means no application can prevent other applications from displaying their content on the screen, if the user desires to see that content. An example would be an application being full-screen and “top-most”, thus blocking users from viewing/accessing other applications.

2. Improving the security of X

At XDC2012, Tim and I gave a presentation about the security of the Linux graphics stack which has been relayed by LWN and LinuxFR. The result wasn’t pretty: a user can expect neither confidentiality, nor integrity, nor availability on inputs and outputs when using the X11-based standard graphics server.

Input confidentiality is limited as it is possible for any X11 client to snoop on the keyboard inputs and mouse position, but not mouse click events or wheel scrolling (src).

The only security property that can truly be fixed on X11 is the integrity of application’s output graphic buffers (the image of the application that is displayed on the screen). This work requires applications to share buffers with the x-server using DMA-Buf instead of GEM’s buffer sharing mechanism which has very limited access control. GEM is the kernel interface for open source drivers to allocate graphic buffers and allow applications to share them.

Fixing the other security properties is impossible using the X11 protocol as it would break too many legitimate applications that rely on those features. Disabling access to these features would effectively make the X-Server non-compliant with the X11 protocol. Only authorising the legitimate applications to access those restricted interfaces wouldn’t increase the security of the system either, because of the number of applications that require them. As there is a new graphics server emerging, we have decided to fix this one and not repeat X’s mistakes. In summary, this is where the graphics stack security currently stands:

|    Property     |  Input  |  Output  |
| Confidentiality |   NO    |    NO    |
|    Integrity    |   NO    |    WIP   |
|  Availability   |   NO    |    NO    |

3. Wayland

Wayland is intended as a simpler replacement for X, easier to develop and maintain. GNOME and KDE are expected to be ported to it.

Wayland is a protocol for a compositor to talk to its clients as well as a C library implementation of that protocol. The compositor can be a standalone display server running on Linux kernel modesetting and evdev input devices, an X application, or a Wayland client itself. The clients can be traditional applications, X servers (rootless or fullscreen) or other display servers.


Current state of security within Wayland compositors

The first good point of the Wayland protocol is input management. At the moment, the protocol allows neither snooping on the input (confidentiality), nor generating input events (integrity), nor grabbing all events (availability). However, Wayland clients allowing LD_PRELOAD are still vulnerable to input attacks, as demonstrated by Maarten Baert. This is not Wayland compositors’ problem so it won’t be taken into account in this discussion.

Just like with X, there are multiple ways for applications to send their output buffers to the graphics server. With Wayland/Weston, applications can use shared memory (SHM) or GEM’s buffer sharing mechanism. SHM buffer sharing is meant for CPU-rendered applications while GEM-based buffer sharing is meant for GPU-rendered applications.

SHM buffer sharing seems to be using anonymous files and file descriptor (fd) passing in order to transmit buffers from the client to the compositor. This makes sure that only the creator and the receiver of the fd can access the (now-shared) resource, making it impossible for a third-party other than the kernel to spy on or modify the output of other applications (confused-deputy). Confidentiality and integrity seem to be guaranteed but I haven’t delved into the implementation to make sure of it.

GEM buffer sharing is known to be insecure because shared buffers are referenced by an easily guessable 32-bit handle. Once the handle is guessed, the buffer can be opened by other applications run by the same user without access control. Once opened, the buffers may be read from or written into. This means neither confidentiality nor integrity can be guaranteed on the output of applications using this buffer-sharing method.

On-going work is being performed to make use of DMA-Buf instead of GEM. DMA-Buf, just like SHM, is based on anonymous files and fd-passing and even allows different GPU drivers to exchange GPU buffers. Once the Wayland protocol and GPU applications start using it, confidentiality and integrity of the output buffers won’t be a problem anymore.

|    Property     |  Input  |  Output  |
| Confidentiality |   YES   |   YES*   |
|    Integrity    |   YES   |   YES*   |
|  Availability   |   YES   |   YES    |

2014/02/21 UPDATE: Kristian Høgsberg pointed out in the comments that Wayland’s EGL code in mesa has supported DMA-BUF for quite a while, although it was only made secure in mesa 10.0 (confidentiality & integrity). I updated the table above to reflect that.

The need for standardised privileged interfaces

Although Tim and I advised Wayland compositors not to rely on external programs to perform privileged tasks, some people do think it is needed as they want to make it possible to develop cross-compositor applications performing privileged tasks. Examples of such applications would be:

  • Screenshot applications (stills and video)
  • Virtual keyboards and pointing devices
  • Screen sharing (VNC, Skype)
  • Hotkeys handlers (?)
  • Session lockers

All of these applications violate one or more security properties. Wayland compositors should thus control the access to the interfaces allowing those applications to work. Since we want these applications to be cross-compositor, a standardised way of granting/passing/revoking privileges should be described in the protocol or its implementation reference guide.

Allowing the user to securely break security properties

By default, the system should enforce all the security properties we defined earlier. However, users sometimes need or want to automate some process, record their screens or lock the computer with a custom app. This is why we need ways to bypass the security when it is really needed. Without such means, people may refuse to use Wayland because it “takes freedom away from them”. An ideal design therefore offers the “right” way to do something before anyone has to invent a workaround. When distributors and vendors ship Wayland, you want them to rely on your preferred mechanism rather than entirely unlocking Wayland’s safeguards to support the features of poorly-written apps.

The usual way of dealing with applications needing more privileges is to statically give them at launch time. Once an application has no use of the permission anymore, it can revoke its right to access it, until its next execution. This is very similar to what exists with capabilities.

The problem with such a system is that a malicious application could potentially take advantage of a poorly-coded application that holds an interesting capability (assigned statically at launch time), and use that application’s capability to gain indirect access to the restricted interface it is interested in. This is because permissions aren’t granted according to the immediate intent of the user. Ideally, a user would always have a way to be aware of a reduced security state. This means the user has to take action in order to temporarily reduce the security. The user should then be able to check whether the system’s security is still reduced and should be able to revoke permissions. Capturing the user’s intent can be done by:

  • Waiting for the user to press a key with clear semantics before launching the associated application (for instance, PrintScreen launching the screen-shot application)
  • Prompting the user whenever an application tries to access a restricted interface
  • Creating secure widgets that are drawn and managed by the compositor but can be imported in applications (UDAC)
  • Any other way?

The first solution requires absolute trust in the input integrity and requires the compositor to know which application it should run (full path to the binary). The second solution requires trust in both input integrity and output integrity (to prevent a malicious application from changing the content of the prompt window to alter its meaning and turn it into a fake advertisement, for instance). The third solution requires secure widgets; unfortunately it is -ENOTQUITETHEREYET. We have ideas on how to implement them using sub-surfaces, and they will be discussed again later on this very same blog ;)

While I think the user-intent method offers higher security than static privilege assignment, I think both should be implemented, with the latter used as a way for users to state that they are OK with potentially reducing the security of the desktop environment so that the application they want can run properly. This will lower users’ dissatisfaction and should result in better security than bypassing some security properties for all applications. I am however worried that some stupid applications may be OK with exposing snapshot capabilities from the command line, without requiring the user’s input. A packager would then grant the privileges to this application by default and thus, the mere fact of having this application installed would make your desktop no longer confidential.

This is why once privileges have been granted, the user needs to have a way to keep track of who has access to restricted interfaces. This can be done by having a mandatory notification when an application accesses a privileged interface and a compositor-owned application in the systray whose colour would indicate the current security state: no threat, at least one application has the right to use a restricted interface, or at least one application is using a restricted interface. A click on this icon could provide more information about which restricted interfaces are used by which application. A button could then be added to each entry to allow users to revoke some privileges of applications. While I think the interface for the application providing this feedback should be specified, the user shouldn’t have a choice on it and it should be hardcoded in the Desktop Environment.
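
The three indicator states described above can be sketched in a few lines. This is only an illustration: the names SecurityState, indicator_state and revoke, the colours, and the use of plain dictionaries to track grants are all assumptions made for the example, not part of any Wayland compositor.

```python
from enum import Enum


class SecurityState(Enum):
    # The colours are hypothetical; only the three states come from the text.
    NO_THREAT = "green"
    GRANTED = "orange"
    IN_USE = "red"


def indicator_state(grants, in_use):
    """Compute the systray state.

    grants: {client name: set of restricted interfaces it may use}
    in_use: {client name: set of restricted interfaces it is using now}
    """
    if any(in_use.values()):
        return SecurityState.IN_USE
    if any(grants.values()):
        return SecurityState.GRANTED
    return SecurityState.NO_THREAT


def revoke(grants, in_use, client, interface):
    """Handle the per-entry 'revoke' button: drop one privilege of one client."""
    grants.get(client, set()).discard(interface)
    in_use.get(client, set()).discard(interface)
```

A compositor would call indicator_state whenever a grant or a use changes and recolour the icon accordingly.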

Recommendations to restricted Wayland interface designers

I have never designed an interface for Wayland and don’t know what the best practice is. However, I know that restricted interfaces should never be considered as always usable.

The first important point is that before being able to use an interface, a client should first bind to it. This binding process could either succeed or fail, depending on the compositor’s security policy. Clients are mandated to test that binding worked well before using the interface. In case it didn’t, clients should fail gracefully and tell the user what restricted interface couldn’t be bound. Also, binding a restricted interface could take some time and the application shouldn’t block on it.

To support privilege revocation, a revoke signal should be added to the interface in order to inform clients that their rights to access the restricted interface have been revoked. Clients should fall back gracefully and tell the user they received such a signal.

Launching privileged Wayland clients from the compositor

The most-secure way of launching clients requiring restricted interfaces is to let the compositor run them by itself. This way, it can control the environment in which the process has been launched which lowers the risks of environment attacks such as the LD_PRELOAD one exposed earlier.

Implementing such a system is difficult as the compositor needs to remember that the PID of the client it launched should be granted the privileges to access one or more restricted interfaces when this (soon-to-become) client connects to the Wayland compositor. Not only does it mean that the compositor needs to have a separate table of which PIDs are supposed to get which privileges, it also means the compositor needs to keep track of the death of the client’s PID to prevent another process from reusing the PID of this client and gaining access to privileged interfaces it wasn’t supposed to access.

A simpler and more secure solution would be for the compositor to open a UNIX socket to itself before exec’ing the client. Once the socket is opened, the compositor can simply record the client’s capabilities as flags in the structure tracking that client and then execute the client’s binary. When running the exec() syscall, all the FDs that have not been opened with the O_CLOEXEC flag will be passed on to the new process. A run-time parameter of the Wayland client could then be used to tell which FD represents the UNIX socket to the Wayland compositor. An example of such a parameter could be --wayland-fd=xxx. The compositor should however be careful not to leak any unneeded FD to the new client.

2014/02/21 UPDATE: Pekka Paalanen said on the Wayland Mailing List that the latter approach is already implemented in Wayland and suggested reading the documentation about the environment variable WAYLAND_SOCKET in wl_display_connect. I actually prefer the implemented solution because it is transparent to applications. Well done!
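
To make the mechanism concrete, here is a minimal sketch of the compositor side, written in Python rather than C for brevity. It assumes a hypothetical helper named launch_privileged_client and a plain dictionary standing in for the compositor’s client-tracking structure; only the WAYLAND_SOCKET variable comes from the text above. Dropping LD_PRELOAD from the environment is one possible hardening step, not something the protocol mandates.

```python
import os
import socket
import subprocess


def launch_privileged_client(argv, privileges, clients):
    """Spawn argv with a pre-connected socket and remember its privileges.

    clients maps the compositor's end of each connection to the set of
    restricted interfaces granted to the process on the other end.
    """
    compositor_end, client_end = socket.socketpair(socket.AF_UNIX,
                                                   socket.SOCK_STREAM)

    # Build a controlled environment for the child process.
    env = dict(os.environ)
    env.pop("LD_PRELOAD", None)
    env["WAYLAND_SOCKET"] = str(client_end.fileno())

    # pass_fds keeps only the listed descriptor open in the child (besides
    # stdin/stdout/stderr), so no unneeded FD leaks to the new client.
    proc = subprocess.Popen(argv, env=env,
                            pass_fds=(client_end.fileno(),),
                            close_fds=True)

    # The compositor no longer needs the client's end of the pair.
    client_end.close()

    # Privileges are attached to the connection itself, not to a PID.
    clients[compositor_end] = frozenset(privileges)
    return proc, compositor_end
```

Because the privileges are keyed on the connection, there is no PID table to maintain and no PID-reuse window to worry about: when the connection closes, the entry simply goes away with it.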

Letting applications require more privileges at run time

Sometimes, an application may require access to a restricted interface after it has been launched. In this case, it can use the binding call I described earlier and the compositor will grant access to it or not, depending on its configuration or policy.

The problem with allowing applications to require more privileges is that we do not control their environment and we cannot make sure they weren’t loaded with LD_PRELOAD or tampered with in any other way. As this decision really depends on which other security tools are being used on the computer, this isn’t something Wayland compositors should hard-code. This leads us to our final proposal.

Wayland Security Modules

As seen earlier, granting access to a restricted interface or not depends on the context of the client (how it was launched, previous actions). The expected behaviour should be defined by a security policy.

As no consensus on the policy can apparently be reached (as usual in security), we have all agreed that we needed to separate the policy from the code. This is very much like Linux Security Modules (LSM) or the X Access Control Extension (XACE).

From a software engineering point of view, we would work on a security library called Wayland Security Modules (name subject to change) that Wayland compositors would call whenever a security decision needs to be made. The library would then load the wanted security policy, defined by a shared object that I will refer to as the security backend. In the case of allowing a client to bind a restricted interface or not, the corresponding WSM hook should return ACCEPT, PROMPT or DENY, PROMPT meaning the compositor would have to ask the user whether they want to accept the risk or not. Let me stress that prompting should be a last-resort measure, as numerous studies have shown that unless asked very rarely, users will always allow the operation.
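
Since the library has not been designed yet, the following is only a sketch of what the split between compositor and backend could look like. Everything except the three return values ACCEPT, PROMPT and DENY is hypothetical: the class names, the on_bind hook, the rule keywords and the ask_user callback are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    PROMPT = "prompt"
    DENY = "deny"


@dataclass
class Client:
    exe_path: str
    launched_by_compositor: bool
    # Backend-private context, opaque to the compositor.
    security: dict = field(default_factory=dict)


class StaticBackend:
    """A toy security backend driven by a static rule table."""

    def __init__(self, rules):
        # rules: {(exe_path, interface): "always" | "compositor-only" | "ask"}
        self.rules = rules

    def on_bind(self, client, interface):
        rule = self.rules.get((client.exe_path, interface))
        if rule == "always":
            decision = Decision.ACCEPT
        elif rule == "compositor-only" and client.launched_by_compositor:
            decision = Decision.ACCEPT
        elif rule == "ask":
            decision = Decision.PROMPT
        else:
            decision = Decision.DENY  # default deny for anything unknown
        client.security[interface] = decision
        return decision


def bind_restricted(backend, client, interface, ask_user):
    """Compositor side: apply the backend's decision, prompting if required."""
    decision = backend.on_bind(client, interface)
    if decision is Decision.PROMPT:
        return bool(ask_user(client, interface))
    return decision is Decision.ACCEPT
```

The compositor never looks at the rules themselves; it only acts on the returned decision, which is what allows a backend to be swapped without patching the compositor.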

Some additional hooks would also be needed in order to track the state of Wayland clients (open, close, etc…) but nothing too major should be needed. The compositors would just have to store this context in a void *security; attribute in the Wayland client structure. Finally, WSM could be extended to control the access to the clipboard and maybe other interfaces I haven’t thought about yet.

The design of this library has not started yet. If you are interested in helping out, I would love to have some feedback on what your use cases for WSM are.

POSIX security backend

Most users run their computers without Mandatory Access Control (MAC), it is thus important to provide the best security possible by default. The POSIX security backend shouldn’t depend on any decision engine or MAC system (such as SELinux, Tomoyo, AppArmor, …) and should be easy to configure.

2014/02/21 UPDATE: A reader on reddit said the following about the above paragraph: “Pretty weird statement considering both Ubuntu and Fedora several other distros come with MACs enabled by default”. As far as I know, no user-oriented operating system has a MAC policy for graphical applications. Both Ubuntu and Fedora run applications unconfined. The only system I know about that has a real MAC policy for all its applications (and many more security layers) is PIGA-OS, a research operating system I helped develop at the ENSI de Bourges.

A default policy could be specified in /etc/wayland/security.conf. Per-client configuration could be stored in /etc/wayland/authorized_clients.d/. This would allow package managers to install application security policies along with the application. Each application-specific policy would define the full path of the allowed binary, which restricted interfaces the application needs to get access to and in which cases it is acceptable (only when run by the compositor? etc…). This enables Trusted Path Execution (TPE) as only the binary specified by the full path will match this set of privileges.
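
How the compositor learns the full path of a connecting client is not specified above. One possibility on Linux, sketched here purely as an assumption, is to ask the kernel for the peer’s PID with the SO_PEERCRED socket option and resolve /proc/<pid>/exe. The helper names are hypothetical, and the lookup is subject to the same PID-reuse race discussed earlier, so it is best done right when the connection is accepted.

```python
import os
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) of the process at the other end (Linux only)."""
    creds = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                            struct.calcsize("3i"))
    return struct.unpack("3i", creds)


def client_binary(pid):
    """Resolve the full path of the binary a process is running (Linux only)."""
    return os.readlink(f"/proc/{pid}/exe")


def matches_policy(sock, allowed_path):
    """Trusted Path Execution check: does the peer run the allowed binary?"""
    pid, _uid, _gid = peer_credentials(sock)
    try:
        return client_binary(pid) == allowed_path
    except OSError:
        # The process vanished or /proc is not readable: refuse the match.
        return False
```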

Different UNIX users should be allowed to have different security parameters. The easiest way would be to store per-user configuration in different files in /etc/wayland/users.d/ in order to simplify the logic. Another possibility would be to have ~/.wayland/ overriding the /etc/wayland/ configuration folder. The latter solution would be harder to implement securely because only the compositor should be allowed to change the configuration.

In any case, to be considered valid, configuration files should all be root-owned and 644 to prevent malicious applications from changing the security policy. This means changing the security policy will be considered as an administrative task which sounds about right.
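
The validity rule above is easy to check before parsing anything. The sketch below assumes hypothetical helper names and takes the trusted owner as a parameter (root by default) so it can be exercised without privileges. It follows the text literally by requiring mode 644; a real backend might also accept stricter modes.

```python
import os
import stat


def policy_file_is_trusted(path, trusted_uid=0):
    """Accept only regular files owned by trusted_uid with mode 644."""
    st = os.lstat(path)  # lstat: a symlink is never treated as a policy file
    return (stat.S_ISREG(st.st_mode)
            and st.st_uid == trusted_uid
            and stat.S_IMODE(st.st_mode) == 0o644)


def trusted_policy_files(directory, trusted_uid=0):
    """List the policy files in directory that pass the ownership/mode check."""
    paths = (os.path.join(directory, name) for name in os.listdir(directory))
    return sorted(p for p in paths if policy_file_is_trusted(p, trusted_uid))
```

Files that fail the check are silently skipped here; a compositor would more likely log them so that administrators notice a tampered or misconfigured policy.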

Other security backends?

Other security backends could be implemented/integrated with PAM, Polkit or SELinux. You could even write your own security backend without needing to patch any Wayland compositor, unless you need new WSM hooks.

Please let me know about what security backend you would be interested in!

4. Acknowledgment

This article is the result of countless hours of discussions with my friends Timothée Ravier and Steve Dodier-Lazaro. It is also the result of multiple discussions with Sebastian Wick and Maarten Baert on Wayland’s mailing list (latest thread).

5. Conclusion

This article is just a summary of the current situation we are in security-wise and a summary of all the discussions I have been involved in. My hope is to get some feedback from the Wayland and security communities in order to achieve a secure system with cross-compositor privileged applications!

Please send me some feedback, I would love to hear from you!