5

This is an open-ended question, with many possible answers, but perhaps there's one big thing I'm overlooking. If not, perhaps this question should be community wiki.


I don't use other OSs often enough to judge, but certainly on all versions of Windows the operating system can be brought to a crawl when apps are pegging the CPU. That was understandable in the early versions, which used cooperative multitasking.

However, with pre-emptive multitasking, shouldn't the operating system put itself and its GUI at a higher priority, so as to remain responsive even when user apps are asking for full CPU utilization? After all, the OS doesn't have to give away any time slices. In most cases, I don't care that an app which will require minutes of CPU time is delayed by a few microseconds so that the OS GUI can respond to input.

It sometimes helps to set high-CPU processes to a lower priority, perhaps because that then lets other low-CPU apps I interact with be more responsive, giving the appearance of a more responsive overall experience. Or does the priority of an app really affect how it interacts with OS processes?
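To make the "lower priority" idea concrete, here is a minimal sketch of starting a CPU-heavy job at reduced priority. It assumes a POSIX system, where `os.nice` raises the niceness of the calling process; the helper name `run_low_priority` is made up for illustration. On Windows the rough equivalent is the priority setting in Task Manager (or the `SetPriorityClass` API), which this sketch does not cover.

```python
import os
import subprocess
import sys

def run_low_priority(cmd, niceness=10):
    """Run cmd as a child process at reduced scheduling priority (POSIX only).

    run_low_priority is a hypothetical helper name, not a library function.
    """
    def lower_priority():
        # Runs in the child just before exec. A positive increment makes the
        # child less favoured by the scheduler, so it yields to other work.
        os.nice(niceness)

    return subprocess.run(cmd, preexec_fn=lower_priority,
                          capture_output=True, text=True)

if __name__ == "__main__":
    # The child simply reports its own niceness so it can be compared
    # with the parent's.
    child = [sys.executable, "-c", "import os; print(os.nice(0))"]
    result = run_low_priority(child, niceness=10)
    print("parent niceness:", os.nice(0))
    print("child niceness: ", result.stdout.strip())
```

Lowering priority only changes who wins when several runnable threads compete for the CPU; it does not cap the job's CPU usage when nothing else wants to run.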

I've seen this happen many times when I had plenty of physical memory available, and without heavy hard drive use. It seems like CPU usage is the main consistent element when the OS flags.

A counter-example: often when a system is almost completely hung, the mouse remains responsive. So the OS does protect this one part of itself from some problems. Exactly how is a separate question, I just raise it as an example.

10
  • 1
    The load of the UI should be negligible compared to processes which run for a perceptible length of time, so raising the OS's priority should not have a material impact on the long-running process. Consider, would you rather have a progress bar or have your program install 50 milliseconds sooner? I suspect that most people would prefer responsiveness, though SU users may be exceptional.
    – user49214
    Commented Jul 24, 2012 at 23:16
  • 3
    The load of the UI is not negligible because it requires changing to the code that runs the UI which requires context switches, blows out the CPU caches, and so on. Commented Jul 24, 2012 at 23:25
  • 1
    @DavidSchwartz, and that is why people get upset when the system hangs.
    – Synetech
    Commented Jul 25, 2012 at 14:46
  • 1
    @Synetech: Exactly, a long-running CPU-intensive process shouldn't make the OS or other apps almost unusably slow. If it does, the OS hardly deserves to be called "multi-tasking".
    – user49214
    Commented Jul 25, 2012 at 16:37
  • 1
    @DavidSchwartz: I don't think there's a conscious decision in OS design to let the GUI hang during long-running CPU-intensive tasks. I suspect we've just all gotten used to it, and I wonder if there are good reasons that it has to be that way.
    – user49214
    Commented Jul 25, 2012 at 16:40

5 Answers

5

The only things that can make a modern multitasking OS truly hang are:

  1. hardware failure
  2. CPU is stuck in a device driver (because of 1 or bad programming)
  3. fatal exception in device driver or other kernel code (because of 1 or bad programming)

A multitasking OS is always going to cut off a task when its timeslice ends. However, if a program is designed to respond to user input but doesn't do so during its timeslice, then the fault is with the program.

It's more likely that the shell is unresponsive. In Windows this is explorer.exe. You could try the following:

  • an alternate Windows shell (Litestep, etc.)
  • kill all explorer.exe instances via taskmgr.exe, then launch cmd.exe and do your work from the command line. Or launch a smaller program designed to launch other programs.

explorer.exe is one of those heavily componentized Windows programs that a lot of stuff can hook into. So see how things are without it.

2
  • I'm interested not only in outright hangs but also in cases where the GUI is greatly slowed, enough to make using the computer frustrating. Using an alternative shell is an interesting idea, though perhaps not practical for most users (anyone remember the alternate UIs for Windows 3?).
    – user49214
    Commented Jul 25, 2012 at 16:46
  • If only Microsoft were equally interested ... Of course, given the rise of iOS and mobile platforms, perhaps that is one reason for Metro ...
    – LawrenceC
    Commented Jul 28, 2012 at 23:10
3

Try Linux. If you compile the kernel yourself, you can choose the timer tick frequency (CONFIG_HZ) and the preemption model, and you get to see the effect of preemption directly. A 100 Hz tick (10 ms between scheduler ticks) is better if you are building a server, which will (probably) have no UI to worry about (but it will still be a preemptive multitasking OS). On the other hand, a 1000 Hz tick (1 ms) results in an extremely responsive system. Desktop-oriented kernels are typically built with one of the higher tick rates, which means that even if my CPU is touching 100% usage on all cores, my UI is still responsive (you can try it yourself).

0
2

With respect to Windows, Windows 7 does a much better job than XP when it comes to this sort of thing, so I would disagree with your statement "all versions of Windows". But even with XP, when you are using the client version of the OS, Windows will give some extra priority to the foreground app (by default). No matter which OS version, though, if multiple processes are all stuck waiting on the same single shared resource (which could be I/O), then they will all become unresponsive.

Another way to view this problem: if explorer.exe is busy waiting for a shared resource (including CPU time), then the desktop/window manager itself will become unresponsive. The same goes for any apps that directly or indirectly wait for explorer.exe to free up.

2
  • Better than earlier versions, sure. Yet I can still reliably reduce my machine (Phenom dual 2.8, 4 GB) to a crawl by recalculating a large Excel workbook, thereby pegging both cores but using little memory or hard drive access, and not interacting with any drivers.
    – user49214
    Commented Jul 25, 2012 at 16:48
  • @JonofAllTrades - As others have mentioned, Excel could be performing excessive context switches (thus slowing the entire system) or invalidating the cache lines too often (also slowing the entire system). It's possible that Excel could be optimized for this specific scenario, but not likely. If you don't care about the speed of the calculation, then you could set the processor affinity to just one core for Excel (Task Manager -> right click Excel.exe -> Set Affinity) and this will give the entire other core to the rest of the system and should make things more responsive.
    – Chris O
    Commented Jul 25, 2012 at 19:15
2

User-mode applications generally can't slow down your OS's GUI. However, the situation isn't as clear-cut as it seems. There isn't a single user-mode application that is 100% user-mode, either because of system calls (notably the file system) or because of virtual memory mapping.

In most cases, the OS isn't stuck doing CPU-work, it's stuck waiting on some I/O resource. This is especially apparent when you run out of physical memory and start going into the pagefile (although do note that the OS can put memory in the page file at its leisure, even if there's plenty of physical RAM available). Windows is usually pretty smart about this, but it's quite possible to "break" this.

If it really is CPU, it's most likely due to a greedy driver (the vast majority of which are not written by Microsoft). Kernel-mode drivers are exempted from pre-emptive multi-tasking (both for latency and reliability reasons), so if a driver runs a second-long loop on the CPU, you're out of luck: it will not be pre-empted.

A great and simple tool to see some of this is Process Explorer (from SysInternals) which shows you the kernel-time of a CPU (ie. how much of the CPU work on the core is done in the kernel, as opposed to the user applications themselves). Windows 7 and later also include this in their task manager (it's the red line in the CPU usage graphs).
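The user-time versus kernel-time split that these tools display can also be seen from inside a single process. The following is a rough sketch using Python's `os.times()`, which reports the CPU time the current process has spent in user mode and in the kernel on its behalf; the function names and loop sizes are arbitrary choices for illustration, and the exact numbers will vary by machine.

```python
import os

def cpu_split(work):
    """Run work() and return (user_seconds, kernel_seconds) it consumed."""
    before = os.times()
    work()
    after = os.times()
    return after.user - before.user, after.system - before.system

def compute_bound():
    # Pure arithmetic: the time should show up almost entirely as user time.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

def syscall_bound():
    # Many tiny reads: each one is a system call, so a noticeable share of
    # the time is spent in the kernel.
    fd = os.open(os.devnull, os.O_RDONLY)
    try:
        for _ in range(200_000):
            os.read(fd, 1)
    finally:
        os.close(fd)

if __name__ == "__main__":
    for name, work in (("compute", compute_bound), ("syscalls", syscall_bound)):
        user, kernel = cpu_split(work)
        print(f"{name:>8}: user {user:.3f} s, kernel {kernel:.3f} s")
```

This only accounts for time charged to the calling process, so work done by drivers in interrupt context will not appear here; for that, a system-wide view such as Process Explorer is still needed.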

All in all, pre-emptive multi-tasking doesn't save the OS designers from having to make compromises. There's always a cost to everything, and task scheduling is indeed very tricky (right now, my OS juggles over 2000 threads - that's quite a lot, and I'm not really doing "anything"). Would it be better to give the threads smaller time chunks, and spend more time doing context switching? Would it be better to give them longer time chunks, sacrificing latency?
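The trade-off in those two questions can be put into numbers with a toy model. This is only a sketch under strong assumptions: a single core, plain round-robin with no priorities, every task always runnable, and a fixed context-switch cost (the 0.01 ms figure below is invented for illustration and ignores cache effects).

```python
def round_robin_cost(num_tasks, quantum_ms, switch_cost_ms):
    """Toy model: num_tasks CPU-bound tasks share one core in round-robin order.

    Returns (overhead_fraction, worst_case_wait_ms), where overhead_fraction
    is the share of CPU time lost to context switches, and worst_case_wait_ms
    is how long a task that has just become ready (say, a GUI thread) waits
    if every other task uses its full quantum first.
    """
    overhead_fraction = switch_cost_ms / (quantum_ms + switch_cost_ms)
    worst_case_wait_ms = (num_tasks - 1) * (quantum_ms + switch_cost_ms)
    return overhead_fraction, worst_case_wait_ms

if __name__ == "__main__":
    # switch_cost_ms is an assumed value, not a measurement.
    for quantum in (1, 10, 100):
        overhead, wait = round_robin_cost(num_tasks=5, quantum_ms=quantum,
                                          switch_cost_ms=0.01)
        print(f"quantum {quantum:>3} ms: overhead {overhead:.2%}, "
              f"worst-case wait {wait:.1f} ms")
```

Shorter quanta cut the worst-case wait but raise the fraction of time spent switching; real schedulers soften this with priorities and by boosting threads that have just received input.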

So, check what the hard drives are doing. Check how your memory is used. Check the kernel-time. This will quite likely show you why Windows loses responsiveness while you're doing heavy work. Some of these can be remedied (freeing memory, limiting the memory-offender, picking CPU affinity manually...), some are solved simply by keeping your drivers up to date (graphics drivers used to be quite notable for the issues they caused, which probably played a part in Windows dumping them back from kernel-mode to user-mode in recent versions).
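As an illustration of picking CPU affinity by hand, here is a minimal sketch that restricts a process to a single core so the remaining cores stay free for everything else. It assumes Linux, where Python exposes `os.sched_getaffinity` and `os.sched_setaffinity`; the helper name `pin_to_one_cpu` is hypothetical. On Windows the same effect is available through Task Manager's "Set Affinity" dialog.

```python
import os

def pin_to_one_cpu(pid=0):
    """Restrict a process (0 means the caller) to one allowed CPU. Linux only.

    Returns (chosen_cpu, previous_allowed_set) so the caller can undo it.
    pin_to_one_cpu is a made-up helper name for this sketch.
    """
    allowed = os.sched_getaffinity(pid)
    chosen = min(allowed)  # arbitrary choice: the lowest-numbered allowed CPU
    os.sched_setaffinity(pid, {chosen})
    return chosen, allowed

if __name__ == "__main__":
    cpu, previous = pin_to_one_cpu()
    print(f"now limited to CPU {cpu}; previously allowed: {sorted(previous)}")
    # Restore the original mask when the heavy work is done.
    os.sched_setaffinity(0, previous)
```

The pinned job will take longer, since it can no longer spread across cores, which is the price paid for keeping the rest of the system responsive.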

Also, while on the subject of the GPU, modern versions of Windows use GPU acceleration for rendering the GUI (actually, to a point, so did Windows XP). If your application taxes the GPU significantly, it might also lead to slow responsiveness of Windows, especially in Aero. Since GPUs are increasingly being used as GP "hyper" CPUs, this can be significant even outside of games and such.

Another major offender is a poorly written multi-threaded application. It's quite easy to kill memory caching if you're doing silly things, and RAM is extremely slow compared to the CPU itself, so without efficient caching, the CPU gets very, very slow (even though it's basically waiting the whole time). This is even more complicated on multi-core CPUs (and multi-CPU systems), because to ensure consistency, many multi-threaded operations require an invalidation of cached data (so that one CPU doesn't access the "old" value of the variable in memory out of its cache / registers). All those things are incredibly fast, but... CPUs are faster. Much faster. This of course also introduces the issue of different CPU vendors handling the same things differently, completely separate from the much higher-level OS. Modern CPUs (486+, so yeah, it's a very old feature that a disturbing number of programmers have no idea about) are actually heavily parallelized; they're no longer executing one instruction after another.

So even if the OS does everything perfectly (obviously an impossible ideal), it can still grind down to a halt due to hardware issues and communication with the hardware. What good is it that you have a 4-core CPU when your application manages to completely saturate the RAM R/W? What good is it having a fast hard drive, when every single byte it reads has to go through the CPU (remember PIO?). Every hardware operation that doesn't use direct memory access can potentially stall your CPU.

And now, most of Windows actually runs as user-mode applications. And they interact with the running applications as well - if I ask explorer.exe to do something for me, it can't do anything else in the meantime. If I send a billion windows messages to a window, it's going to leave a mark.

There's just so much happening, all the time... guarantees are very difficult. Note how much faster things become as soon as the "ctrl-alt-delete" screen actually comes on - suddenly it's as if nothing was wrong :)

0

Specifically on Windows coming to a crawl: one of the most common causes I find is multiple processes using enough virtual memory to force a lot of swapping. It may not "appear" that there is a lot of disk activity, but it usually is happening.

This is easily demonstrated, and it also works as a way out of the jam: select some small processes (like Notepad) and cleanly and patiently end them. This starts to reduce the jammed-freeway effect. Then close one large memory hog and everything comes back to a usable state. Observing disk activity during this time often shows very little going on.

Why do I suggest closing smaller apps first? By removing smaller apps first you free up virtual memory with minimal disk activity and quickly unclog the machine. If you try to close a large app, e.g. Firefox or Chrome, it has an extended close time. This is because most large apps have background processes that will not go away for a long time, because those background processes are also stuck getting paged in and out of the swap file. To make matters worse, they also tend to run at lower priority, so they take even longer to finish their business, and the full quantity of virtual memory requested by the app is not released until all its background processes and threads are done.

Another thing I've noticed from experience is that browsers are among the worst when it comes to closing quickly and cleanly. Even the office apps close more cleanly and quickly.

4
  • 1
    There are times which call for patience, but when it comes to web browsers don't waste your time. It is better to just terminate (kill) them. They handle that fine. I always kill -9 chrome. But restarting with a few hundred tabs (1200) needs some work.
    – Dan D.
    Commented Mar 13, 2014 at 4:39
  • The scenario I have in mind is where there is adequate memory and minimal disk I/O - high CPU usage alone is sometimes enough to make the OS GUI almost unusable. My understanding is that no matter how much CPU time an app wants to use, the OS can/should/is reserving a chunk for itself, so as long as there's not contention for other resources, the GUI should remain responsive.
    – user49214
    Commented Mar 13, 2014 at 23:02
  • @JonofAllTrades OS can and does reserve a chunk. However if you are experiencing extreme slowness with just high CPU usage, then you, my friend, most likely have a graphics card driver issue. It is possible the driver is written such that it is needing to call back into the OS instead of just handling the screen painting. One simple test may be to try to disable Aero support (and possibly even themes) temporarily to determine where the real problem is. BTW responsive mouse doesn't mean OS is not busy otherwise. A large part of mouse/kbd handling is done in drivers that always get time slices.
    – LMSingh
    Commented Mar 15, 2014 at 0:21
  • @DanD. you've blasted through all boundaries I've known with 1200 tabs!! Wow! In XP I'd see windows start to corrupt after IE had about 23 tabs. In Win 7, I've pushed up to about 60 in Chrome and another 45 in FF at the same time (while many other apps are running) and then things get unstable. Guess I'll have to figure out how to break past a couple of hundred first!
    – LMSingh
    Commented Mar 15, 2014 at 0:27
