56

I know that when the source code, in say C++, is compiled, the output from the compiler is the machine code (executable) which I thought were instructions to the CPU directly. Recently I was reading up on kernels and I found out that programs cannot access the hardware directly but have to go through the kernel.

So when we compile some simple source code, say with just a printf() function, and the compilation produces the executable machine code, will each instruction in this machine code be directly executed from memory (once the code is loaded into memory by the OS) or will each command in the machine code still need to go through the OS (kernel) to be executed?

I have read a similar question. It did not explain if the machine code that is generated after compilation is an instruction to the CPU directly or if it will need to again go through the kernel to create the correct instruction for the CPU. I.e., what happens after the machine code is loaded into memory? Will it go through the kernel or directly talk to the processor?

8
  • 29
    If you're writing code for an Arduino you don't need an OS.
    – stib
    Commented Jul 27, 2018 at 20:16
  • 12
    printf is not a great example. It's explicitly defined by the C spec as a function that's only available in "hosted" implementations (meaning running on a kernel, as opposed to "freestanding", which may not require one). And on most platforms, printf is just a function provided by your libc that does a bunch of stuff on your behalf (which eventually includes a syscall to print to stdout). It's really no different from calling libvlc_media_list_add_media or PyObject_GetAttr, except that some printf implementation is guaranteed linkable without adding extra non-standard -ls.
    – abarnert
    Commented Jul 28, 2018 at 9:49
  • 1
    This exists! (not affiliated, just thought it was cool) erikyyy.de/invaders Commented Jul 28, 2018 at 15:28
  • 9
    This really depends on your precise definition of the terms "executable", "kernel", "run", "need", "talk to", and "go through". Without a precise definition of those terms, the question is un-answerable. Commented Jul 29, 2018 at 9:07
  • 3
    @JörgWMittag -- If you're going to be pedantic, then why are you only scrutinizing just those terms and only this question? The truly salient term that needs defining is "operating system", which is questionably applied to MS-DOS (and similar single-task runtime environments). If there's a few (misinformed) people that think that the PC BIOS is an OS, then is everything up for grabs? I think not. The OP uses those words in a context that seems either reasonable (esp. if non-native English speaker) or non-technical.
    – sawdust
    Commented Jul 30, 2018 at 4:16

11 Answers

90

As someone who has written programs that execute without an OS, I offer a definitive answer.

Would an executable need an OS kernel to run?

That depends on how that program was written and built.
You could write a program (assuming you have the knowledge) that does not require an OS at all.
Such a program is described as standalone.
Boot loaders and diagnostic programs are typical uses for standalone programs.

However the typical program written and built in some host OS environment would default to executing in that same host OS environment.
Very explicit decisions and actions are required to write and build a standalone program.
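
As a sketch of how explicit those decisions are (assuming an x86 target, GCC's -ffreestanding/-nostdlib options, and a boot loader that drops the code into 32-bit protected mode with the legacy VGA text buffer at 0xB8000), a standalone program supplies its own entry point and talks to the hardware itself:

    /* A standalone ("freestanding") sketch: no OS, no libc, our own entry point.
     * Hypothetical build: gcc -m32 -ffreestanding -nostdlib -c standalone.c
     * plus a linker script and a boot sector to get it loaded and running. */
    void _start(void)
    {
        volatile unsigned short *vga = (volatile unsigned short *)0xB8000;
        const char *msg = "OK";

        for (int i = 0; msg[i] != '\0'; i++)
            vga[i] = (unsigned short)(0x0F00 | msg[i]);   /* white-on-black text cell */

        for (;;)
            ;   /* nothing to return to, so spin forever */
    }

Nothing in it asks an OS for anything; that is the essence of "standalone".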


... the output from the compiler is the machine code (executable) which I thought were instructions to the CPU directly.

Correct.

Recently I was reading up on kernels and I found out that programs cannot access the hardware directly but have to go through the kernel.

That's a restriction imposed by a CPU mode that the OS uses to execute programs, and facilitated by certain build tools such as compilers and libraries.
It is not an intrinsic limitation on every program ever written.


So when we compile some simple source code, say with just a printf() function, and the compilation produces the executable machine code, will each instruction in this machine code be directly executed from memory (once the code is loaded into memory by the OS) or will each command in the machine code still need to go through the OS (kernel) to be executed?

Every instruction is executed by the CPU.
An instruction that is unsupported or illegal (e.g. process has insufficient privilege) will cause an immediate exception, and the CPU will instead execute a routine to handle this unusual condition.
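
For a concrete illustration (a sketch assuming x86-64 Linux and GCC): hlt is a privileged instruction, so executing it in user mode makes the CPU raise a fault, and the kernel's exception handler terminates the process (typically with SIGSEGV) instead of letting the instruction complete.

    /* Sketch: a privileged instruction executed in user mode (x86-64 Linux, GCC). */
    #include <stdio.h>

    int main(void)
    {
        puts("about to execute a privileged instruction");
        __asm__ volatile ("hlt");   /* CPU raises a fault; the kernel's handler runs */
        puts("never reached");      /* the kernel kills the process before this line */
        return 0;
    }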

A printf() function should not be used as an example of "simple source code".
The translation from an object-oriented high-level programming language to machine code may not be as trivial as you imply.
And then you choose one of the most complex functions from a runtime library that performs data conversions and I/O.

Note that your question stipulates an environment with an OS (and a runtime library).
Once the system is booted, and the OS is given control of the computer, restrictions are imposed on what a program can do (e.g. I/O must be performed by the OS).
If you expect to execute a standalone program (i.e. without an OS), then you must not boot the computer to run the OS.


... what happens after the machine code is loaded into memory?

That depends on the environment.

For a standalone program, it can be executed, i.e. control is handed over by jumping to the program's start address.

For a program loaded by the OS, the program has to be dynamically linked with shared libraries it is dependent on. The OS has to create an execution space for the process that will execute the program.

Will it go through the kernel or directly talk to the processor?

Machine code is executed by the CPU.
The instructions do not "go through the kernel", but neither do they "talk to the processor".
The machine code (consisting of op code and operands) is an instruction to the CPU that is decoded and the operation is performed.

Perhaps the next topic you should investigate is CPU modes.

10
  • 2
    "If you expect to execute a standalone program (i.e. without an OS), then you must not boot the computer to run the OS." is not entirely correct. Many DOS programs were loaded after DOS and then completely ignored DOS services (by directly bit-banging or perhaps calling BIOS directly). Win3.x is an excellent example that (except in some interesting corner cases) ignored that DOS was present. Win95/98/Me did this as well. There are many examples of OSs that support standalone programs, many from the 8-/16-bit era. Commented Jul 28, 2018 at 21:27
  • 10
    @EricTowers -- By "DOS" presumably you mean MS-DOS (since I've used DOSes not related to MS or Intel)? You're citing an "OS" that doesn't even match the criteria of my 1970's college textbooks on OS concepts and design. The origins of MS-DOS trace back (through Seattle Computer Products) to CP/M, which is explicitly not called an OS by its author Gary Kildall. FWIW an OS that allows a program to take over the system has failed in its basic function of managing the system resources. "There are many examples of OSs that support standalone programs" -- "Support" or unable to prevent?
    – sawdust
    Commented Jul 29, 2018 at 1:08
  • 5
    ... or ProDOS or PC-DOS or DR-DOS or CBM DOS or TRS DOS or FLEX ... Commented Jul 29, 2018 at 1:22
  • 4
    I like GCC's "freestanding" terminology. The English word has all the right connotations for code that runs without an OS, maybe even better than "standalone". e.g. you can compile gcc -O2 -ffreestanding my_kernel.c special_sauce.S to make an executable that doesn't assume any of the normal libraries or OS stuff will be there. (Of course you would normally need a linker script to get it to usefully link into a file format that a bootloader will want to load!) Commented Jul 29, 2018 at 17:07
  • 4
    @PeterCordes "standalone" is the term used in the C standard which IMO can be considered somewhat authoritative. Alternatively a good term is also "non-hosted" (as in hosted by the OS)
    – jaskij
    Commented Jul 30, 2018 at 18:47
40

The kernel is "just" more code. It's just that that code is a layer that lives between the lowest parts of your system and the actual hardware.

All of it runs directly on the CPU; you just transition up through layers of it to do anything.

Your program "needs" the kernel in just the same way it needs the standard C libraries in order to use the printf function in the first place.

The actual code of your program runs on the CPU, but the branches that code makes to print something on screen go through the code for the C printf function, through various other systems and interpreters, each of which do their own processing to work out just how hello world! actually gets printed on your screen.
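
A rough sketch of where the program's own part ends (Linux/POSIX flavoured, using the write() call): after printf has done its formatting, the bytes leave the program through a single write() system call on file descriptor 1, and the kernel plus the layers described below carry them the rest of the way.

    /* Sketch: printf's job ends at a write() system call on stdout;
     * the kernel and the layers above it take over from there. */
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "hello world!\n";
        write(1, msg, sizeof msg - 1);   /* fd 1 = stdout; crosses into the kernel */
        return 0;
    }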

Say you have a terminal program running on a desktop window manager, running on your kernel which in turn is running on your hardware.

There's a lot more that goes on but let's keep it simple...

  1. In your terminal program you run your program to print hello world!
  2. The terminal sees that the program has written (via the C output routines) hello world! to the console
  3. The terminal program goes up to the desktop window manager saying "I got hello world! written at me, can you put it at position x, y please?"
  4. The desktop window manager goes up to the kernel with "one of my programs wants your graphics device to put some text at this position, get to it dude!"
  5. The kernel passes the request out to the graphics device driver, which formats it in a way that the graphics card can understand
  6. Depending on how the graphics card is connected, other kernel device drivers need to be called to push the data out on physical device buses such as PCIe, handling things like making sure the correct device is selected, and that the data can pass through relevant bridges or converters
  7. The hardware displays stuff.

This is a massive oversimplification for description only. Here be dragons.

Effectively everything you do that needs hardware access, be it display, blocks of memory, bits of files or anything like that has to go through some device driver in the kernel to work out exactly how to talk to the relevant device. Be it a filesystem driver on top of a SATA hard disk controller driver which itself is sitting on top of a PCIe bridge device.

The kernel knows how to tie all these devices together and presents a relatively simple interface for programs to do things without having to know about how to do all of these things themselves.

Desktop window managers provide a layer that means that programs don't have to know how to draw windows and play well with other programs trying to display things at the same time.

Finally the terminal program means that your program doesn't need to know how to draw a window, nor how to talk to the kernel graphics card driver, nor all of the complexity to do with dealing with screen buffers and display timing and actually wiggling the data lines to the display.

It's all handled by layers upon layers of code.

3
  • Not just hardware access: most communication between programs also goes through the kernel; that which doesn't typically at least involves the kernel in setting up a more direct channel. However, for purposes of the question, it is also possible, and practiced in far simpler cases, to condense all the code into a single program. Commented Jul 27, 2018 at 23:58
  • Indeed, your terminal program doesn't even have to be running on the same machine as the program that's writing stuff to it.
    – jamesqf
    Commented Jul 28, 2018 at 5:26
  • Since it may need to be explicitly stated in this question - do note that when we talk about programs "talking to" each other, it's metaphorical. Commented Aug 1, 2018 at 4:20
21

It depends on the environment. In many older (and simpler!) computers, such as the IBM 1401, the answer would be "no". Your compiler and linker emitted a standalone "binary" that ran without any operating system at all. When your program stopped running, you loaded a different one, which also ran with no OS.

An operating system is needed in modern environments because you aren't running just one program at a time. Sharing the CPU core(s), the RAM, the mass storage device, the keyboard, mouse, and display among multiple programs at once requires coordination. The OS provides that. So in a modern environment your program can't just read and write the disk or SSD; it has to ask the OS to do that on its behalf. The OS gets such requests from all the programs that want to access the storage device, enforces things like access controls (can't allow ordinary users to write to the OS's files), queues them to the device, and sorts out the returned information to the correct programs (processes).

In addition, modern computers (unlike, say, the 1401) support the connection of a very wide variety of I/O devices, not just the ones IBM would sell you in the old days. Your compiler and linker can't possibly know about all of the possibilities. For example, your keyboard might be interfaced via PS/2, or USB. The OS allows you to install device-specific "device drivers" that know how to talk to those devices, but present a common interface for the device class to the OS. So your program, and even the OS, doesn't have to do anything different for getting keystrokes from a USB vs a PS/2 keyboard, or for accessing, say, a local SATA disk vs a USB storage device vs storage that's somewhere off on a NAS or SAN. Those details are handled by device drivers for the various device controllers.

For mass storage devices, the OS provides atop all of those a file system driver that presents the same interface to directories and files regardless of where and how the storage is implemented. And again, the OS worries about access controls and serialization. In general, for example, the same file shouldn't be opened for writing by more than one program at a time without jumping through some hoops (but simultaneous reads are generally ok).
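
As a tiny illustration of that common interface (standard C; "data.bin" is just a hypothetical file name), the very same calls work whether the file sits on a local SATA disk, a USB stick, or a network share - the file system driver and the device drivers below it absorb the differences:

    /* Sketch: the same standard file API regardless of the underlying device. */
    #include <stdio.h>

    int main(void)
    {
        char buf[64];
        FILE *f = fopen("data.bin", "rb");
        if (f == NULL)
            return 1;                          /* not found / not readable */

        size_t n = fread(buf, 1, sizeof buf, f);
        printf("read %zu bytes\n", n);
        fclose(f);
        return 0;
    }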

So in a modern general-purpose environment, yes - you really need an OS. But even today there are computers such as real-time controllers that aren't complicated enough to need one.

In the Arduino environment, for example, there isn't really an OS. Sure, there's a bunch of library code that the build environment incorporates into every "binary" it builds. But since there is no persistence of that code from one program to the next, it's not an OS.

10

I think many answers misunderstand the question, which boils down to this:

A compiler outputs machine code. Is this machine code executed directly by a CPU, or is it "interpreted" by the kernel?

Basically, the CPU directly executes the machine code. It would be significantly slower to have the kernel execute all applications. However, there are a few caveats.

  1. When an OS is present, application programs typically are restricted from executing certain instructions or accessing certain resources. For example, if an application executes an instruction which modifies the system interrupt table, the CPU will instead jump to an OS exception handler so that the offending application is terminated. Also, applications are usually not allowed to read/write to device memory. (I.e. "talking to the hardware".) Accessing these special memory regions is how the OS communicates with devices like the graphics card, network interface, system clock, etc.

  2. The restrictions an OS places on applications are achieved by special features of the CPU, such as privilege modes, memory protection, and interrupts. Although any CPU you would find in a smartphone or PC has these features, certain CPUs do not. These CPUs do indeed need special kernels which "interpret" application code in order to achieve the features that are desired. A very interesting example is the Gigatron, which is an 8-instruction computer you can build out of chips which emulates a 34-instruction computer. (A toy interpreter loop in that spirit is sketched after this list.)

  3. Some languages like Java "compile" to something called Bytecode, which is not really machine code. Although in the past they were interpreted to run the programs, these days something called Just-in-Time compilation is usually used so they do end up running directly on the CPU as machine code.

  4. Running software in a Virtual Machine used to require its machine code to be "interpreted" by a program called a hypervisor. Due to enormous industry demand for VMs, CPU manufacturers have added features like VT-x to their CPUs to allow most instructions of a guest system to be executed directly by the CPU. However, when running software designed for an incompatible CPU in a Virtual Machine (for example, emulating an NES), the machine code will need to be interpreted.
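
Here is a toy sketch of what "interpreting" code means (a made-up four-opcode machine, not the Gigatron's real instruction set): instead of the CPU fetching and decoding instructions itself, a software loop does it one pseudo-instruction at a time.

    /* Toy interpreter sketch: a hypothetical 4-opcode machine, decoded in software. */
    #include <stdio.h>

    enum { OP_LOADI, OP_ADD, OP_PRINT, OP_HALT };

    int main(void)
    {
        /* "machine code" for the imaginary CPU: r0 = 2; r1 = 3; r0 += r1; print r0 */
        const int program[] = { OP_LOADI, 0, 2,   OP_LOADI, 1, 3,
                                OP_ADD, 0, 1,     OP_PRINT, 0,   OP_HALT };
        int reg[4] = { 0 };

        for (int pc = 0; ; ) {
            switch (program[pc]) {
            case OP_LOADI: reg[program[pc + 1]] = program[pc + 2];       pc += 3; break;
            case OP_ADD:   reg[program[pc + 1]] += reg[program[pc + 2]]; pc += 3; break;
            case OP_PRINT: printf("%d\n", reg[program[pc + 1]]);         pc += 2; break;
            case OP_HALT:  return 0;
            }
        }
    }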

2
  • 1
    Although Java's bytecode is normally not the machine code, there still do exist Java processors.
    – Ruslan
    Commented Jul 29, 2018 at 20:17
  • Hypervisors have never simply been interpreters. Interpretation is of course necessary if the virtual machine has an instruction set that's incompatible with its host, but for same-architecture execution, even early hypervisors executed code directly on the CPU (you may be thinking of the need for paravirtualized kernels, for CPUs without the necessary hypervisor support). Commented Aug 1, 2018 at 13:06
5

When you compile your code, you create so-called "object" code that (in most cases) depends on system libraries (printf for example); then your code is wrapped by the linker, which adds a kind of program-loader wrapper that your particular operating system can recognize (that is why you can't run a program compiled for Windows on Linux, for example) and knows how to unwrap and execute. So your program is like the meat inside a sandwich: it can only be eaten as a bundle, as a whole.

Recently I was reading up on Kernels and I found out that programs cannot access the hardware directly but have to go through the kernel.

Well, it is halfway true; if your program is a kernel-mode driver, then you actually can access the hardware directly if you know how to "talk" to it, but usually (especially for undocumented or complicated hardware) people use drivers that act as kernel libraries. This way you can find API functions that know how to talk to the hardware in an almost human-readable way, without the need to know addresses, registers, timing and a bunch of other things.

will each instruction in this machine code be directly executed from memory (once the code is loaded into memory by the OS) or will each command in the machine code still need to go through the OS (kernel) to be executed

Well, the kernel is like a waitress whose responsibility is to walk you to a table and serve you. The only thing it can't do is eat for you - you have to do that yourself. The same goes for your code: the kernel will unpack your program into memory and start your code, which is machine code executed directly by the CPU. The kernel just needs to supervise you - what you are and are not allowed to do.

It did not explain if the machine code that is generated after compilation is an instruction to the CPU directly or if it will need to again go through the kernel to create the correct instruction for the CPU

Machine code that is generated after compilation consists of instructions to the CPU directly. No doubt about that. The only thing you need to keep in mind is that not everything in the compiled file is actual machine/CPU code. The linker wrapped your program with some metadata that only the kernel can interpret, as a clue about what to do with your program.
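
A small Linux-specific sketch of that metadata (using /proc/self/exe, a Linux convenience path): the compiled file does not start with raw machine code at byte 0 but with a header the kernel's loader reads - for ELF executables the first four bytes are 0x7f 'E' 'L' 'F'.

    /* Sketch (Linux-specific): peek at the loader metadata at the start of
     * this very executable via /proc/self/exe. */
    #include <stdio.h>

    int main(void)
    {
        unsigned char magic[4];
        FILE *self = fopen("/proc/self/exe", "rb");

        if (self != NULL && fread(magic, 1, sizeof magic, self) == sizeof magic)
            printf("%02x %c %c %c\n", magic[0], magic[1], magic[2], magic[3]);
        if (self != NULL)
            fclose(self);
        return 0;                 /* prints "7f E L F" for an ELF executable */
    }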

What happens after the machine code is loaded into memory? Will it go through the kernel or directly talk to the processor?

If your code is just simple opcodes, like the addition of two registers, then it will be executed directly by the CPU without kernel assistance; but if your code uses functions from libraries, then such calls will be assisted by the kernel - as in the waitress example: if you want to eat in a restaurant, they will give you the tools (fork, spoon - and they are still the restaurant's assets), but what you do with them is up to your "code".

Well, just to prevent flames in the comments - this is a really oversimplified model that I hope will help the OP understand the basics, but good suggestions to improve this answer are welcome.

4

So when we compile a simple source code, say with just a printf() function, and the compilation produces the executable machine code, will each instruction in this machine code be directly executed from memory (once the code is loaded into memory by the OS) or will each command in the machine code still need to go through the OS (kernel) to be executed?

Essentially, only system calls go to the kernel. Anything to do with I/O or memory allocation/deallocation typically eventually results in a system call. Some instructions can only be executed in kernel mode; attempting them in user mode will cause the CPU to trigger an exception. Exceptions cause a switch to kernel mode and a jump to kernel code.

The kernel does not process every instruction in a program. It just does the system calls and switches between running programs to share the CPU.

Doing memory allocation in user mode (without the kernel) is not possible: if you access memory you don't have permission to access, the MMU, previously programmed by the kernel, notices and raises a CPU-level "segmentation fault" exception, which transfers control to the kernel, and the kernel kills the program.

Doing I/O in user mode (without the kernel) is not possible either: if you access I/O ports, device registers, or addresses connected to devices (one or both of which are needed to perform any I/O), these accesses trigger an exception in the same way.
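
To make the memory case concrete, a minimal sketch (plain C, Unix-like assumptions): touching an address the kernel never mapped for the process makes the MMU raise a page fault, the CPU switches to kernel mode, and the kernel kills the program with SIGSEGV.

    /* Sketch: touch an address the kernel never mapped for this process. */
    #include <stdio.h>

    int main(void)
    {
        volatile int *unmapped = (int *)0x1;   /* almost certainly not mapped */
        printf("about to touch an unmapped address\n");
        *unmapped = 42;                        /* MMU fault -> kernel -> SIGSEGV */
        printf("never reached\n");
        return 0;
    }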


Would an executable need an OS kernel to run?

Depends on the type of executable.

Kernels, in addition to mediating shared access to RAM and hardware, also perform a loader function.

Many "executable formats", like ELF or PE, have metadata in the executable file in addition to the code, and it's the loader's job to process that. Read the gory details about Microsoft's PE format for more information.

These executables also reference libraries (Windows .dll or Linux shared object .so files) - their code has to be included.

If your compiler produces a file that's meant to be processed by an operating system loader, and that loader is not there, it won't work.

  • Can you include code that does the job of the loader?

Sure. You need to convince the OS to somehow run your raw code without processing any metadata. If your code calls kernel APIs, it still won't work.

  • What if it doesn't call kernel APIs?

If you somehow load this executable from an operating system (i.e. if it allows raw code to be loaded and executed), it will still be in user mode. If your code accesses things that are prohibited in user mode, as opposed to kernel mode, such as unallocated memory or I/O device addresses/registers, it will crash with privilege or segment violations (again, exceptions go to kernel mode and are handled there) and still won't work.

  • What if you run it from kernel mode?

Then it will work.


3
  • This is not entirely correct. The requirement that hardware access go through the kernel, or that there even be a kernel, is a design decision made in the affirmative on many systems today, but also made in the negative (even to this day) on many simple systems. Commented Jul 28, 2018 at 0:02
  • I'm explaining how things are if A) there is a kernel and B) if you are running code on a CPU with user/supervisor mode and an MMU to help enforce that. Yes, there are CPUs and microcontrollers without MMUs or user/supervisor mode, and yes some systems run without using the whole user/supervisor infrastructure. Microsoft's first Xbox was like this - even though it had a standard x86 CPU with user/supervisor mode, from what I understand it never left kernel mode - the loaded game could do whatever it wanted.
    – LawrenceC
    Commented Jul 28, 2018 at 0:11
  • 1
    The Macintosh System, before Mac OS X, was the OS of a general-purpose computer, running for decades on general-purpose CPUs (the 68000 family, then PowerPC) that supported memory protection (except the first 68000-based machines, I think), yet it never used memory protection: any program could access anything in memory.
    – curiousguy
    Commented Jul 29, 2018 at 2:25
3

TL;DR No.

Arduino development comes to mind as a current environment where there's no OS. Trust me, on one of these babies you don't have the space for an operating system.

Likewise, games for the Sega Genesis didn't have an OS provided by Sega to call on. You just crafted your game in 68K assembler, writing directly to the bare metal.

Or where I cut my teeth, doing embedded work on the Intel 8051. Again, when all you have is a 2716 EPROM with a 2K × 8 footprint, you don't have room for an operating system.
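
In that spirit, a minimal bare-metal sketch (assuming the SDCC compiler and its <8051.h> header, which declares the P1 port register - names differ with other toolchains): there is no OS underneath, so main() simply never returns.

    /* Bare-metal 8051 sketch (SDCC assumed): no OS, just a loop toggling a pin. */
    #include <8051.h>

    void main(void)
    {
        while (1) {
            volatile unsigned int i;
            P1 ^= 0x01;                          /* toggle one output pin */
            for (i = 0; i < 30000; i++)
                ;                                /* crude busy-wait delay */
        }
    }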

Of course, this assumes a very broad usage of the word application. As a rhetorical question, it's worth asking yourself if an Arduino sketch actually is an application.

3

While I don't want to imply that the other answers are not right on their own, they provide far too many details that, I'm afraid, are still very obscure to you.

The basic answer is that the code will be executed directly on the processor. And no, the machine code will not "talk" to anybody, it's the other way around. The processor is the active component and everything you do in your computer will be done by that processor (I'm simplifying things a bit here but that's OK for now). The processor will read the code and execute it and spit out the results, the machine code is just food for the processor.

Your confusion stems from the use of the word hardware. Although the division isn't as clear cut as it used to be, it's better if you think in terms of peripherals rather than simply calling everything hardware. So, if there is an operating system or similar on your machine, your program has to use its services for accessing the peripherals but the processor itself is not a peripheral, it's the main processing unit that your program runs on directly.

Kernels, operating systems and similar intervening layers are typically used only in larger systems where there is an expectation that several programs will run and there is a need for the system to manage how these programs can use the peripherals of the computer (quite often at the same time). In these cases, running programs can only access these peripherals using the system that will decide how to share them and will make sure there are no conflicts. Small systems where there is no need for any management among competing programs because there are none, often have no underlying system at all and the single program normally running on these systems is more or less free to do whatever it wants with the peripherals.

2

The BIOS that runs in your computer on power up is executable code stored in ROM. It consists of machine instructions plus data. There is a compiler (or assembler) that assembles this BIOS from source code. This is a special case.

Other special cases include the bootstrap program that loads the kernel and the kernel itself. These special cases are generally coded in a language other than C++.

In the general case, it is much more practical to have the compiler produce some instructions that invoke system services provided by a kernel or by library routines. It makes the compiler much more lightweight. It also makes the compiled code more lightweight.

At the other end of the spectrum is Java. In Java, the compiler does not translate the source code into machine instructions, as this term is usually understood. Instead, the source code is translated into "machine instructions" for an imaginary machine, called the Java Virtual Machine. Before a Java program can run, it must be combined with the Java runtime, which includes an interpreter for the Java Virtual Machine.

2

In the good old days your program was responsible for doing everything that needed to be done during its execution, either by you doing it yourself or by adding library code others wrote to your program. The only thing running beside that in the computer was the code to read in your compiled program - if you were lucky. Some computers had to have code entered through switches before being able to do more (the original "bootstrap" process), or even have the whole program entered this way.

It was quickly found that it was nice to have code running that was capable of loading and executing programs. Later it was found that computers were powerful enough to support running several programs at the same time by having the CPU switch between them, especially if the hardware could help, but with the added complexity of the programs not stepping on each other's toes (for instance, how to handle multiple programs trying to send data to the printer at once?).

All this resulted in a large amount of helper code being moved out of the individual programs and into the "operating system", with a standardized way of invoking the helper code from user programs.

And that is where we are today. Your programs run at full speed, but whenever they need something managed by the operating system they call helper routines provided by the operating system, and that code is not needed and not present in the user programs themselves. This includes writing to the display, saving files, accessing the network, etc.
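
A Linux-flavoured sketch of that standardized way of calling into the operating system (glibc's syscall() wrapper around the write system call; the trap instruction it issues is what actually switches the CPU into the kernel):

    /* Sketch (Linux, glibc): invoking the kernel's write helper explicitly
     * through the generic syscall() wrapper. */
    #define _DEFAULT_SOURCE
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "handled by the kernel\n";
        syscall(SYS_write, 1, msg, sizeof msg - 1);   /* fd 1 = stdout */
        return 0;
    }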

Microkernels have been written that provide just what is needed for a given program to run without a full operating system. This has some advantages for experienced users while giving up most of the others. You may want to read the Wikipedia page about them - https://en.wikipedia.org/wiki/Microkernel - if you want to know more.

I experimented with a Microkernel capable of running a Java Virtual Machine, but found later that the sweet spot for that is Docker.

1

In typical desktop OSes, the kernel itself is an executable. (Windows has ntoskrnl.exe; Linux has vmlinux, etc.) If you needed a kernel in order for an executable to run, then those OSes could not exist.

What you need a kernel for is to do the things a kernel does: allow multiple executables to run at once, referee between them, abstract the hardware, etc. Most programs aren't capable of doing that stuff themselves competently, and you wouldn't want them to even if they could. In the days of DOS -- which could barely be called an operating system itself -- games often used the OS as little more than a loader, and directly accessed the hardware much like a kernel would. But you often had to know what brands and models of hardware were in your machine before you bought a game. Many games only supported certain families of video and sound cards, and ran very poorly on competing brands if they worked at all. That's the kind of thing you get when the program controls the hardware directly rather than through the abstraction typically provided via the kernel.
