33

Sometimes computers stutter when they're working hard, to the point where the mouse pointer freezes for a fraction of a second, or stutters intermittently for a few seconds. This sometimes happens with keyboard input too, even in very basic applications (i.e. presumably the application's own event-loop handling isn't complicated or costly). This is very frustrating as a user.

Why can't (or why don't) operating systems absolutely prioritise user input (and repainting thereof) in threading and process scheduling?

A few ideas, maybe it's one/some of these:

  • Operating systems don't force applications to explicitly separate immediate user-input event handling from any knock-on or background processing, so responsiveness relies on all applications being well engineered. And not all applications are.
  • The event loop and repainting requires every application that could potentially be visible or responsive to user input to weigh in on what happens in the next tick, and sometimes lots of complicated things happen at once. Perhaps even the user input thread in some application(s) gets blocked waiting for another thread?
  • The event loop and repainting only requires the currently active application for the most part, but operating systems sometimes let background processes hog the CPU. If this is the case - why do they let this happen?
  • It's deemed more important to let threads operate in short bursts to prevent slow down via context switching than to operate strictly in priority order (presumably there is always some cutoff/trade-off here).
    • I don't know modern CPU architectures, but presumably executing 10 instructions on virtual thread 1, then 2, then 3 is faster than executing 1 instruction on thread 1, then 2, then 3, 10 times. (Scale 10 and 1 up or down as appropriate.)
  • Something about interrupts that I don't understand.

Admittedly my experience is only on Windows - keen to hear if other OS's have this solved.

  • 7
    Usually, swapping Commented Apr 29, 2022 at 9:52
  • 24
    You might enjoy seeing how startlingly responsive an old Mac Plus is, running at only 8 MHz. It pleases me, but also makes me a tad resentful that modern OS's, including macOS, haven't accounted for responsiveness. My personal pet peeve is how long it takes my Mac to simply populate and draw the Recent Items menu under the Apple menu -- the list should already be known!
    – donjuedo
    Commented Apr 29, 2022 at 18:40
  • 8
    I might turn this into a separate answer, but for now, I'll just post this as a comment: you are assuming that the current mainstream Operating Systems are "modern" Operating Systems. They are anything but that. For example, macOS traces its lineage back to the original Unix from 1970. Windows NT dates back to 1988. Linux dates back to 1991, but is of course also heavily inspired by Unix. Also, our current computer architecture, the PC is from 1980 and it wasn't even "modern" back then, on the contrary, it was designed to be as cheap and simple as possible. The CPU dates back to 1976. Commented Apr 29, 2022 at 22:45
  • 11
    This is very frustrating as a user - I don't remember the last time I was frustrated by this
    – Caius Jard
    Commented Apr 30, 2022 at 14:35
  • 4
    @NonnyMoose If you've not experienced it on Linux or Mac, then you haven't been using them for long enough.
    – Neil
    Commented May 1, 2022 at 15:14

7 Answers

26

As you may have noticed, there's a category of application that tries really hard to avoid input lag and only occasionally fails at doing so: games. Even then it's not uncommon for players to notice occasional slowdowns and stuttering.

There is an excellent blog which gives examples of hunting for these issues. A follow-up post tries to find the exact instructions responsible: What is Windows doing while hogging that lock.

As you've guessed, it's the user input thread in all applications getting blocked because the operating system holds a global lock while it tries to clean up a list of objects. Repeatedly.

Windows, because of its long history and origins in the 16-bit single-CPU cooperative multitasking era, isn't especially suitable for isolating processes from each other's performance issues, especially when there are UI elements involved.

  • 3
    It isn't specific to Windows. Android has similar problems. Not with the mouse, but with the equivalent: touch gestures. Even though they have specific framework constructs that make it easy for the programmer to move stuff off the UI thread, and even though they have built-in diagnostics to visualize when the UI thread is taking too much time ... it still happens that poorly written apps behave badly w.r.t. user input because too much stuff is happening on the UI thread. I believe iOS can have those problems too, and you don't see it as much perhaps because Apple's enforcement is draconian.
    – davidbak
    Commented May 1, 2022 at 1:23
  • 10
    I also disagree with your last paragraph. The 16 bit days are long over. Any remaining problems in Windows of this nature don't trace from those days, but are due to more recent decisions and implementations.
    – davidbak
    Commented May 1, 2022 at 1:26
  • 4
    And, in fact, 32-bit Windows 95 was very good at enabling 16-bit multitasking. Windows's 16-bit origins were not holding it back in this respect, even when they were still supported.
    – wizzwizz4
    Commented May 1, 2022 at 11:20
  • 4
    What I was referring to was the message loop system, implemented within applications - Windows effectively requires that both input and paint requests go through the same thread within the application (see the sketch below). It's very easy for an application developer to accidentally make the UI thread block on I/O or on some other thread, at which point the application will not accept input until that unblocks. On the other hand, I've not seen designs that really avoid that problem or make it hard for the application developer to make that mistake.
    – pjc50
    Commented May 2, 2022 at 14:35
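
To make the comment above concrete, here is a minimal Win32 sketch (the window class name and the two-second delay are invented for illustration): input and paint messages for a window arrive on one queue serviced by one thread, so any slow work done directly in the window procedure freezes both input handling and repainting.

    #include <windows.h>

    LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp) {
        switch (msg) {
        case WM_LBUTTONDOWN:
            // Pretend "Save" does slow I/O right here on the UI thread: for
            // these two seconds no WM_MOUSEMOVE or WM_PAINT gets processed,
            // so the window appears frozen even though the OS delivered the
            // events long ago.
            Sleep(2000);
            return 0;
        case WM_DESTROY:
            PostQuitMessage(0);
            return 0;
        }
        return DefWindowProc(hwnd, msg, wp, lp);
    }

    int WINAPI WinMain(HINSTANCE hInst, HINSTANCE, LPSTR, int nCmdShow) {
        WNDCLASS wc = {};
        wc.lpfnWndProc = WndProc;
        wc.hInstance = hInst;
        wc.lpszClassName = TEXT("BlockingDemo");   // invented class name
        RegisterClass(&wc);
        HWND hwnd = CreateWindow(TEXT("BlockingDemo"), TEXT("Click to freeze me"),
                                 WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, CW_USEDEFAULT,
                                 400, 300, nullptr, nullptr, hInst, nullptr);
        ShowWindow(hwnd, nCmdShow);

        // The classic message pump: one thread, one queue, strictly sequential.
        MSG m;
        while (GetMessage(&m, nullptr, 0, 0) > 0) {
            TranslateMessage(&m);
            DispatchMessage(&m);
        }
        return 0;
    }

The usual cure is to hand the slow work to a worker thread and post a message back to the UI thread when it finishes, which is the pattern some of the later answers and comments describe.
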
25

I would like to answer this question from more of a high-level, marketing perspective than a low-level, technical one.

All of the current mainstream Operating Systems are so-called general purpose Operating Systems. That's not really a head scratcher: a special-purpose OS is by definition only useful for a small group of people, so it can't really become mainstream.

In programming, there is an almost general rule that there is a trade-off between latency and throughput. Improving throughput worsens latency and improving latency worsens throughput. You can't have both low latency and high throughput.

However, a general-purpose OS must be useful for many different purposes, so it must make a trade-off where it offers both "good enough" latency and "good enough" throughput. But that means, you can't really have very low latency or very high throughput in a general-purpose OS. (And this applies to hardware, to networking, to many other things as well. E.g. heavily superscalar CPUs with deep pipelines have high steady-state throughput but terrible latency in case of a pipeline stall, cache miss, etc.)

Windows, for example, is used for gaming and file servers and word processing and reading emails and 3D modeling. For tiny home PCs, phones, tower workstations, and 500-core servers with terabytes of RAM. It is used in places where power is of no concern and in places where power is everything.

You simply can't have an OS that is perfect at all of those things.

Now, the question is: why don't we simply give up on the idea of a general-purpose OS and use different OSs for different purposes instead? Well, commonality is economically efficient. You only have to write drivers for one OS, not 100. You only have to train administrators in one OS, not 100. Some software may be needed in multiple different niches, you only need to write this once instead of multiple times.

In fact, with projects like Java+JavaFX+Swing, .NET+MAUI, Electron / React-Native, etc. we see efforts to make it possible to write an application only once and have it run on macOS, Windows, Linux, Android, and iOS. So, there clearly is a desire for uniformity across OSs, even when there are only three or so of them.

So, in summary: it makes economic sense to have only one general-purpose OS, only one general-purpose hardware architecture, only one general-purpose communication network, etc. But that means that we end up with a "jack-of-all-trades, master-of-none" situation, where e.g. an OS cannot afford to optimize too much for interactive latency, because that would hurt batch-processing throughput, which is important for other users.

Note that, for example, in Linux, there are certain compile-time options in the kernel as well as third-party patches which improve interactivity. However, in the end, Linux is just a small part of the system. Input processing for graphical applications is actually handled by XInput or whatever the Wayland equivalent of that is, so there is not much Linux can do there.

Which brings us to another thing: abstractions. Software Engineering is all about abstractions and reuse. But, generally reusable abstractions need to be, well, general-purpose. And so we end up in a situation where a game uses maybe some in-house framework, which in turn uses Unity, which in turn uses Wayland, which in turn uses DRM, which in turn uses the Linux kernel, which in turn uses the xHCI API to talk USB HID to the mouse.

Every single crossing of one of those abstraction boundaries costs performance, and every single one of those abstractions is more general than this particular game needs it to be. But without all of these abstractions, every game would need to implement its own USB driver, its own graphics driver, its own rendering pipeline, its own filesystem, its own SSD driver, and so on, and so on, which would be prohibitively expensive. Especially if you consider that one of the big promises of the PC is modularity, where you can create billions of combinations of different hardware and have it work.

This is very different for gaming consoles, for example, where you have exactly one well-defined hardware and software configuration. But gaming consoles are not general-purpose, you can't use them to run a database cluster.

So, effectively, it is all about trade-offs. Like everything in Engineering is.

  • 5
    I'd add one additional detail to this answer regarding computers being "general purpose", on OP's point about event handling and repainting only focusing on the currently active application: while background processes can also take processor time, there's no guarantee that there's only one active application that needs repainting and event handling. Gaming consoles usually try to avoid letting you run two games at the same time, for example, which keeps them locked down as to what runs at any given time. Commented Apr 30, 2022 at 1:13
  • 1
    "You can't have both low latency and high throughput.". The same goes for us humans. If you try to answer every phone call, slack message and e-mail as soon as possible in order to offer low-latency, you probably won't get anything done at all, and will have low throughput. Commented Apr 30, 2022 at 16:59
  • 2
    Note that while the OS-level tradeoffs explain input latency being roughly 3ms instead of roughly 0.01ms, any delays long enough for the user to notice (e.g. hundreds or thousands of milliseconds) are the fault of the application, as evidenced by the existence of applications (mostly games as you mentioned) that do it much better.
    – Ben Voigt
    Commented May 2, 2022 at 16:58
  • @BenVoigt as I say in my answer, in my experience it is usually the fault of the OS swapping the application out, not the application itself. Sometimes it is the fault of the application, but not usually. Commented May 5, 2022 at 17:32
  • @user253751: It's not the fault of the OS that (some application on the computer) had a working set larger than system RAM. On a multi-user OS, it would be the responsibility of the OS to fairly share resources so that one hungry application swaps out the other processes of the same user and doesn't affect others. In a single-user desktop environment, that isn't a concern. Either way, it's the user's responsibility not to launch the other application that consumes all available resources.
    – Ben Voigt
    Commented May 6, 2022 at 15:14
20

Why can't (or why don't) operating systems absolutely prioritise user input (and repainting thereof) in threading and process scheduling?

Even if the operating system tells the application instantaneously about the user input, or about the need to redraw (part of) its window, it's still up to the application to actually act on it.

what happens in the next tick

Most programs don't use ticks. They handle the stream of events that the OS supplies to them, as fast as they can. If they aren't keeping up, the OS might drop or merge some events (if you haven't handled the mouse moving to one place before it moves elsewhere, you really only need the latest position).

it relies on all applications being well engineered. And not all applications are.

Mostly this. You could more charitably characterise it as the developers prioritising something other than UI responsiveness.
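
To make the "drop or merge" point above concrete, here is a toy sketch (not any real OS's code; EventQueue, MouseMove and KeyPress are invented types) of an event queue that coalesces mouse moves: if the application hasn't consumed the previous position yet, the newest one simply replaces it, while key presses are never dropped.

    #include <deque>
    #include <mutex>
    #include <optional>
    #include <variant>

    struct MouseMove { int x, y; };
    struct KeyPress  { int keycode; };
    using Event = std::variant<MouseMove, KeyPress>;

    class EventQueue {
    public:
        void push(const Event& e) {
            std::lock_guard<std::mutex> lock(mu_);
            // Coalesce: a new mouse move overwrites an unconsumed mouse move
            // at the back of the queue; the application only ever needs the
            // latest pointer position. Key presses are always kept.
            if (std::holds_alternative<MouseMove>(e) && !q_.empty() &&
                std::holds_alternative<MouseMove>(q_.back())) {
                q_.back() = e;
            } else {
                q_.push_back(e);
            }
        }

        // Called by the application's event loop, as fast as it can manage.
        std::optional<Event> pop() {
            std::lock_guard<std::mutex> lock(mu_);
            if (q_.empty()) return std::nullopt;
            Event e = q_.front();
            q_.pop_front();
            return e;
        }

    private:
        std::mutex mu_;
        std::deque<Event> q_;
    };

Real systems do something similar: Windows, for instance, doesn't queue a WM_MOUSEMOVE for every movement but synthesizes one with the current position when the application next asks for messages.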

  • 1
    Thanks! Do you think it would be impossible for operating systems to force or strongly encourage the prioritisation of UI responsiveness? If so, why? e.g. why can't the mouse pointer always be drawn in the updated location, separate to asynchronously responding to events with e.g. highlighting/opening menus/pointer type etc.? Some apps want to control/update mouse location itself on move (e.g. 3D), but that is rare in desktop apps and could be made exceptional. Maybe other people don't care about mouse latency as much as me :) Commented Apr 29, 2022 at 10:26
  • I also think this doesn't explain why background processes like periodic (not immediate/responsive) malware scanners can cause input lag. Why do process schedulers ever allow them that much monopolisation? Commented Apr 29, 2022 at 10:33
  • 7
    @PaulCalcraft if an application wants to be in control of the pixels it draws, then you can't really stop it from being slow at deciding what those pixels should be. There is a paradigm where an application delegates the rendering to the OS, with event hooks on ui elements. A browser is basically the OS for a web app.
    – Caleth
    Commented Apr 29, 2022 at 10:57
  • I was going to answer something similar in that the problem is being misdiagnosed. The problem isn't necessarily that the UI is laggy. All the input processing may be running 100% optimally. If the thing that is receiving that input is laggy either internally or because it is fighting other processes for memory, processing, networking, or whatever, then "showing the user that it did something" may be a waste of time. Further, to "show something changed" when in fact the application has not done anything with the input would be a placebo, and a lie not showing the application's true state.
    – killermist
    Commented May 1, 2022 at 12:33
  • @killermist: I heard about an optimization Microsoft did probably one or two decades ago, where they switched to rendering the desktop, the taskbar, and the start menu as fast as possible during startup. None of the UI elements actually worked, and in fact, this caused startup to take slightly longer but in all tests, users consistently judged the startup to be faster and more responsive. Commented May 1, 2022 at 15:19
11

In my experience, on most computers I have ever used, this is usually caused by inappropriate swapping to disk. Every other cause (such as operating system locks) is significantly less common.

When there is a high demand for memory, the operating system chooses memory pages to write to disk (very slow) to make room for new memory allocations.

If the memory demand is too high or the operating system chooses poorly, it can swap out pages that are in the event processing path of some process. When (and only when) an event is received that follows this code path, the CPU will fault when it wants to access a page that is not currently present in memory. The operating system will handle the fault by choosing another page to write to disk, then reading the requested page from the disk, and perhaps other pages that the OS predicts will also be needed, and then resuming the process that faulted. In that order.

The OS is unlikely to correctly predict all the pages that will be read to process the event, so several cycles of this can occur - potentially, hundreds of cycles - potentially, waiting 15 milliseconds twice per cycle. This can therefore add up to several seconds of simply waiting for the hard drive mechanics to move.

This is exacerbated by more complex programming environments which access more pages of memory when processing an event. I have seen web browsers swap for several minutes before responding. This may contribute to the perception that programs written in more complex environments are "slower".

Current operating systems do not really provide good tools to manage swapping behaviour. If they did, they would probably be abused by some programs to make them "faster" at the expense of other programs. If a program written by a cut-throat commercial vendor could mark 100% of its memory as "important event-processing code" then it would.
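
For what it's worth, the main knob that does exist is deliberately blunt and rationed, which fits the abuse concern above. On Linux, for example, a latency-sensitive process can ask for its pages to stay resident with mlockall(), but how much it may lock is capped (RLIMIT_MEMLOCK) unless it has extra privileges. A minimal sketch, assuming a POSIX system:

    #include <sys/mman.h>   // mlockall, munlockall
    #include <cerrno>
    #include <cstdio>
    #include <cstring>

    int main() {
        // Ask the kernel to keep all current and future pages of this process
        // in RAM, so the event-handling path can never take a major page fault.
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
            // Typically fails with ENOMEM or EPERM once the small per-process
            // lock limit is exceeded: the rationing mentioned above.
            std::fprintf(stderr, "mlockall failed: %s\n", std::strerror(errno));
            return 1;
        }

        // ... run the latency-sensitive event loop here ...

        munlockall();
        return 0;
    }

Windows has a roughly analogous VirtualLock, with similarly tight limits on how much a process can pin.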

  • On the topic of swapping, there is also the fact that Windows doesn't actually have a swappiness setting, and doesn't seem to understand that using swap is a bad idea on a hard drive. In fact I added an SSD to my old laptop for this reason alone.
    – Max Xiong
    Commented May 1, 2022 at 1:15
    @MaxXiong I'm fairly certain Windows has allowed you to adjust the page file settings since at least Windows 2000, including disabling the page file on the hard drive entirely. In fact, I think the interface for it is practically unchanged since then too.
    – PC Luddite
    Commented May 1, 2022 at 3:13
  • 1
    If UI responsiveness is laggy due to swapping, then UI responsiveness is not the problem but a symptom of the larger problem that is causing the swappiness. It could be that the machine just doesn't have enough memory (which may not be solvable if the hardware is maxed out on memory), too many programs running simultaneously, carelessly written programs that do not try to optimize memory usage (web browsers are notorious for this), or any combination of these could be the root problem of swappiness->UI lag. Again, UI lag IS NOT the problem, and trying to treat that symptom misses the point.
    – killermist
    Commented May 1, 2022 at 21:13
  • @PCLuddite I was under the impression that you can change settings such as which partition the page file goes on, and the size of the page file, but not how aggressively swap occurs, whereas there is a setting for that on Linux.
    – Max Xiong
    Commented May 1, 2022 at 22:46
  • 1
    @user253751 That all sounds well and good until the "background tasks" which are very probably running for a reason and should be getting the CPU, memory, and network resources they need then have their resources given to some "foreground task" that just burns them anyway, possibly cascading into the same or other "foreground tasks" then having to wait for things that the "background tasks" should have been doing. Some programs just are not written efficiently, like a fork bomb or a web browser that thinks it's just fine to cache EVERYTHING in memory, etc. Not much to be done for it.
    – killermist
    Commented May 2, 2022 at 14:25
4

Why? Because, generally speaking, nobody cares enough to make it better. And by "caring enough" I mean cares enough to throw some serious money at it.

Things begin with synchronous programming models: mainstream programming languages, like C and pre-coroutine C++, make it very hard to write reactive code that responds to events as they arrive rather than blocking while waiting for them.

For example, suppose you're writing a typical desktop application. There's a GUI thread. The user wants to save the file. So you stick a gui->current_document.Save() call in the place where the Save command is accepted. But now the system is not reacting to user events, but saving the file - the GUI is frozen. If the file is on a network, or the local mechanical hard drive is slow, the user experience is necessarily poor.

So then you think "aha, let's save in a thread!" and execute the save call on a thread pool. Things are seemingly OK until your testers find that if they save the file and then immediately start modifying the document, the application crashes.

Yep, accessing the document data from multiple threads cannot be done "just like that". You need to produce a "safe" static snapshot of the document that's guaranteed to remain valid for as long as the Save() function needs it to. In general, it's not possible to "synchronize" the saving and modifications using just synchronization primitives like mutexes etc: the modifications may make some already-saved data irrelevant, there are problems ensuring self-consistency, etc.

Highly reactive software must be designed that way from day one, and the programming tools available for that are still not easy to use. Concurrent programming is still hard - it's just that there are now tools that make it very easy to shoot yourself in the foot instantly, in a single line of code. At least in the "good old days" starting a thread took a bit more work, so people didn't bother and things at least worked, if slowly and with lag due to file access and network access. Today it's very easy to "parallelize" code, but it's no easier to do it safely and correctly - at least not in the languages the big, highly visible desktop code bases are written in - typically C++.
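
A minimal sketch of that "safe snapshot" approach (Document, Snapshot, saveAsync and saveToDisk are made-up names, not from any particular framework): the GUI thread only pays for copying the data the save needs, and the slow I/O runs on a worker.

    #include <fstream>
    #include <future>
    #include <string>
    #include <vector>

    struct Snapshot {                       // immutable copy of what Save() needs
        std::vector<std::string> lines;
    };

    class Document {
    public:
        // Called on the GUI thread. The only blocking work here is taking the
        // snapshot; the disk (or network) I/O runs on a worker.
        void saveAsync(const std::string& path) {
            Snapshot snap{lines_};          // cheap, self-consistent copy
            pendingSave_ = std::async(std::launch::async, [snap, path] {
                saveToDisk(snap, path);     // slow part: never on the GUI thread
            });
            // Caveat: overwriting an unfinished future waits for it, so two
            // saves in quick succession serialise. A real app would queue or
            // coalesce pending saves instead.
        }

        // The GUI thread keeps mutating the live document freely; the worker
        // only ever sees its private snapshot, so no locking is needed here.
        void appendLine(std::string line) { lines_.push_back(std::move(line)); }

    private:
        static void saveToDisk(const Snapshot& snap, const std::string& path) {
            std::ofstream out(path);
            for (const auto& line : snap.lines) out << line << '\n';
        }

        std::vector<std::string> lines_;
        std::future<void> pendingSave_;
    };

Note the caveat in the comment: even this sketch can still stall the GUI thread if saves pile up, which is exactly why "designed that way from day one" matters.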

  • 1
    This is, I think, a big part of the reason. One thing which makes this problem really difficult to avoid is that most operating systems have one, and only one, specific thread per program which can receive input from the GUI and update (draw) the GUI. So to avoid all lag, each time the GUI sends an event such as a button click, the application needs to do the real work in a separate thread and then, when done, send the result back to the GUI thread, which can then update the interface.
    – MTilsted
    Commented May 2, 2022 at 14:10
3

Dan Luu posted a wonderful insight into this: https://danluu.com/input-lag/

With his measurements, we can see that an Apple IIe from 1983 is indeed more responsive than a state-of-the-art modern computer.

He also proposes some culprits: rendering pipeline and system complexity (some might say bloating, but your mileage may vary).

Last but not least, I can tell that this matter is important to you. But as I see it, gazillions of dollars have been thrown at the OS topic, and every decade or so someone comes up with the next big thing. But people, the paying clients, stick to what they know and what is compatible. How many people know of or use KolibriOS?

You could say that latency can... wait (pun only half-intended).

2

The reason is that everything is a compromise. If you say mouse and keyboard input is a priority for you, this means something else gets less priority. Most users don't seem to care too much whether their mouse and keyboard input is handled instantly if the computer can't do anything useful with the entered information anyway, because it is so busy with other stuff.

Of course, this is a trade-off that can be changed, but a general-purpose system cannot be too optimized for specific use cases, because other use cases suffer.

It's mostly a throughput-versus-latency choice. How much are you willing to compromise? If a calculation takes 5 minutes while your input lags, is it better for it to take 8 minutes with your mouse moving smoothly? Or 15 minutes? Where would you draw the line?

For Linux, there are options to tune this, and you can experiment.

I have been using the lowlatency kernel for Ubuntu for a few years now. It dramatically reduces input latency under load. The downside is that it also reduces throughput, and all the processes using the CPU in the background take longer to complete. For me, on my work laptop, this is an acceptable compromise. On a server doing batch jobs, it would be unacceptable. There are also realtime kernels, which reduce latency further but dramatically reduce throughput, and of course bring other problems.

You can even tune the related kernel settings yourself and compile your own kernel, if smooth mouse movement is extremely important to you.
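
You can also tip the trade-off per thread rather than per kernel. A minimal Linux sketch (needs CAP_SYS_NICE or root; the priority value is chosen arbitrarily and makeCurrentThreadRealtime is a made-up helper): give the thread that services input a real-time scheduling class so it preempts ordinary batch work, accepting the risk that a misbehaving real-time thread can starve everything else.

    #include <pthread.h>
    #include <sched.h>
    #include <cstdio>
    #include <cstring>

    // Put the calling thread (e.g. the one reading input events) into
    // SCHED_FIFO, which runs ahead of all normal SCHED_OTHER threads until
    // it blocks.
    bool makeCurrentThreadRealtime() {
        sched_param sp{};
        sp.sched_priority = 10;                    // modest real-time priority
        int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
        if (rc != 0) {                             // EPERM without privileges
            std::fprintf(stderr, "pthread_setschedparam: %s\n", std::strerror(rc));
            return false;
        }
        return true;
    }

This is essentially the same balance the lowlatency and realtime kernels shift wholesale: more latency guarantees for the favoured work, less throughput for everything else.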

Some related questions to dig deeper: this one on Unix & Linux Stack Exchange or this one on Ask Ubuntu.
