
Basically any computer from the mid-90s and earlier performs a slow memory check on every single boot. The more memory present, the slower that process becomes; for example: https://www.youtube.com/watch?v=A3Po8zneaLE

Why do they do that? Modern computers, as far as I am aware, only check their memory when explicitly told to. What exactly do retro computers do during that check that more modern computers seem not to do, and why?

  • 1
    Modern computers now have to overwrite all memory on boot to avoid cold boot security attacks.
    – user71659
    Commented Oct 6, 2018 at 16:10
  • 1
    And modern computers have to perform a memory check to enumerate available memory.
    – tofro
    Commented Oct 6, 2018 at 20:21
  • 1
    Not all old computers "boot" per se. Booting involves loading an operating system off floppy disk, hard disk, or some other external storage. Many computers from the '80s and earlier simply power up and run code from ROM.
    – Jim MacKenzie
    Commented Oct 8, 2018 at 14:49
  • 2
    @JimMacKenzie I would speculate that by "old computers" the OP meant the IBM PC XT / AT. Indeed, you are right that quite a number of home computers did boot from ROM and did not perform any memory tests upon cold start. However, the XT / AT were notorious for those lengthy RAM tests; I saw such tests a number of times myself when I was a teenager. Later on, BIOSes introduced an option to bypass the RAM test, but earlier models kind of "forced" it, if my memory serves me well.
    – DmytroL
    Commented Aug 30, 2021 at 7:58
  • 1
    @JimMacKenzie: It took a while for such an ability to become even close to universal. The original PC and XT, as well as many early clones, had a startup test that took about a minute on a 640K system.
    – supercat
    Commented May 6 at 16:44

3 Answers


Why are they doing that?

The most important reason is that IBM introduced that check as part of the BIOS startup code, so everyone copied it to be compatible.

The PC did differ from many other machines of the same era in that, at power-up, it ran a thorough test of all installed components to make sure the configuration was operable, a practice carried over from mainframes and similar professional systems. Other machines just initialized their components and left the user to guess what the problem was when an error occurred.

Modern computers, as far as I am aware of, only check their memory when explicitly told to.

RAM got more reliable over the years. Equally important, RAM size increased manyfold, making a thorough memory test anything but quick. Last but not least, PC memory design split in the (late) 90s between consumer PCs with error detection (like the first PC) and professional machines with error correction (ECC). Where consumer-grade machines just let the process/OS die on the user, professional systems will not only correct an incipient RAM failure but also report it, which (hopefully) leads to a preemptive RAM replacement.

What exactly are retro computers doing during that check that more modern computers seem to not do

Various bit patterns are written to RAM and read again to detect cell failure or certain kinds of crossover. The test is split into two parts: base RAM (first 16/64 KiB, *1,2) and memory above 64 KiB. On AT (286+) class machines, a third (faster) test may be used for memory above 1 MiB (*3), together with an additional test in protected mode and even more diverging POST codes.

Conventional memory (up to 1 MiB, *4) is checked in 4 KiB blocks (*5) and reported as such. The BIOS halts if there is an error in the first 16 KiB (original PC) or first 64 KiB (XT and above).

The bit patterns used (*6) for the first 64 KiB are AA, 55, 00, FF, 01, 02, 04, 08, 10, 20, 40 and 80. They are written (and read) in a way that detects not only single-bit failures, but also address- and data-line mismatches/failures.

For the remaining memory the list is shortened to AA, 55, FF, 00 and 01.

Here is a nice explanation of basic bit walking and increment tests similar to what the PC does/did and what it will show.
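As a rough sketch of the idea, assuming nothing about the real BIOS beyond the pattern list above (the original routine was 8088 assembly; the C function names and the address-dependent pass here are my own additions for illustration):

    #include <stdint.h>
    #include <stddef.h>

    /* Patterns listed above for the first 64 KiB: alternating bits,
       then a walking one through each bit position. A uniform
       fill-and-verify with these catches stuck cells and stuck or
       shorted data lines. */
    static const uint8_t kPatterns[] = {
        0xAA, 0x55, 0x00, 0xFF,
        0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80
    };

    /* Fill a block with one pattern, then read it all back.
       Returns the first failing address, or NULL on success. */
    static volatile uint8_t *sweep(volatile uint8_t *base, size_t len,
                                   uint8_t pat)
    {
        for (size_t i = 0; i < len; i++)
            base[i] = pat;
        for (size_t i = 0; i < len; i++)
            if (base[i] != pat)
                return &base[i];
        return NULL;
    }

    /* Address-dependent pass: storing a value derived from each
       cell's own address makes aliased addresses disagree, which a
       uniform pattern cannot detect. */
    static volatile uint8_t *addr_sweep(volatile uint8_t *base, size_t len)
    {
        for (size_t i = 0; i < len; i++)
            base[i] = (uint8_t)(i ^ (i >> 8));
        for (size_t i = 0; i < len; i++)
            if (base[i] != (uint8_t)(i ^ (i >> 8)))
                return &base[i];
        return NULL;
    }

    /* Run all patterns over one block, e.g. one 4 KiB chunk at a
       time, to mirror the reporting granularity described above. */
    volatile uint8_t *test_block(volatile uint8_t *base, size_t len)
    {
        for (size_t p = 0; p < sizeof kPatterns; p++) {
            volatile uint8_t *bad = sweep(base, len, kPatterns[p]);
            if (bad)
                return bad;
        }
        return addr_sweep(base, len);
    }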

and why?

To warn the user of an imminent RAM problem before it strikes, so that they don't lose hours of work to a flipped bit.


*1 - 16 KiB on the first series of 5150 PCs (64 KiB motherboard), 64 KiB on later ones (256 KiB motherboard and XT).

*2 - On the XT there is a separate BIOS POST code for the first 32 KiB.

*3 - The beep codes do not distinguish between above 64 KiB and above 1 MiB.

*4 - Well, in reality early PCs only went up to 544 KiB. Later PCs would go up to 640 KiB.

*5 - Looks like a hint that they expected 4 KiB chips to be used, at least during an early development stage, or that the test was copied from some other device using them.

*6 - Caveat: Bit patterns are taken from an old man's memory. To verify, browsing the BIOS would be helpful.

  • 3
    Wasn't it also, in some cases, to check the total amount of memory? A lot of retro computers didn't have any place to save that information until the next start.
    – UncleBod
    Commented Oct 6, 2018 at 13:36
  • 2
    @UncleBod Other computers did, but not the (original) PC, as its memory size was set by switches. One switch (group) noted which banks were filled and another group the amount of RAM inserted. The BIOS was meant to obey these settings, not search for itself.
    – Raffzahn
    Commented Oct 6, 2018 at 13:39
  • 1
    @AndreasHartmann Maybe check this additional page: esacademy.com/en/library/technical-articles-and-documents/…
    – Raffzahn
    Commented Oct 6, 2018 at 15:17
  • 1
    @MichaelKjörling The PC's RAM had a parity bit for error detection, no correction, and no (default) way to recover from a memory error (see the parity sketch after these comments). When one occurs, an NMI is issued. MS-DOS has no NMI handler, and the BIOS just displays "PARITY ERROR 1" when it's in mainboard RAM or "2" when it's on an expansion card - and of course only as long as NMI generation is enabled (port A0). In fact, to make it worse, there were tools to disable the parity check. People used them to work with unreliable setups instead of buying new RAM.
    – Raffzahn
    Commented Oct 6, 2018 at 19:59
  • 2
    Introducing IBM as the "inventor of the memory check" sounds a bit odd to me - basically all decent computers I know of from before the IBM PC check their memory as well. After all, this is not a PC-specific question.
    – tofro
    Commented Nov 28, 2018 at 14:56
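
A small illustration of the parity scheme Raffzahn describes in the comment above, rendered in software (the real 5150 computed this in hardware, one extra bit per byte; even parity is an assumption here made for simplicity):

    #include <stdint.h>

    /* Software rendering of the 9th (parity) bit the 5150 stored
       per byte in hardware. Even parity is assumed here. */
    static uint8_t parity_bit(uint8_t byte)
    {
        uint8_t p = 0;
        while (byte) {
            p ^= byte & 1;   /* count 1 bits modulo 2 */
            byte >>= 1;
        }
        return p;
    }

    /* On the real machine a mismatch raised an NMI (when enabled
       via port A0) and the BIOS printed the PARITY message quoted
       above; this is detection only - there is no way to tell
       which bit flipped, let alone correct it. */
    int parity_ok(uint8_t stored_bit, uint8_t byte)
    {
        return stored_bit == parity_bit(byte);
    }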

In the era of the original IBM PC (early 1980s), home computers would often use many chips of RAM (eight or more) to provide the system's memory. These would either be soldered directly to the motherboard, or fitted in individual sockets. (The inline memory modules or SIMMs/DIMMs we see today, with several RAM chips soldered to a removable board, came years later.)

Memory chips can fail in a variety of different ways. For example, they might always output some fixed value, fail to retain or refresh stored data, write or read data to/from the wrong location, or something else entirely. Some errors will stop the operating system from booting correctly; others may only show up later when running your software (and potentially corrupt your important data!).

To avoid this happening, the IBM PC's BIOS runs a series of read and write tests on its memory during the Power-On Self-Test (POST), before handing over to the operating system. If an error is detected, a message is displayed on-screen which a technician can use to pinpoint the faulty chip. (In a fully expanded IBM PC there would be 36 AM9016 memory chips; finding a faulty chip by trial and error would be time-consuming.)
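
As a hypothetical sketch of that diagnosis step (the real POST message format is not reproduced here; the assumed layout is a fully populated 64 KiB board with 4 banks of 9 chips, each 16K x 1 bit, giving 8 data bits plus parity per bank):

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical: map a failing address plus the set of bits
       that read back wrong onto a chip position. Each 16 KiB bank
       is a row of 9 one-bit-wide chips, so bank row + bit column
       names exactly one chip. */
    void locate_chip(uint32_t fail_addr, uint8_t bad_bits, int parity_failed)
    {
        unsigned bank = (unsigned)(fail_addr / (16u * 1024u));  /* chip row */

        for (unsigned bit = 0; bit < 8; bit++)
            if (bad_bits & (1u << bit))
                printf("suspect: bank %u, data bit %u\n", bank, bit);

        if (parity_failed)
            printf("suspect: bank %u, parity chip\n", bank);
    }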

As mentioned in the question, the more RAM fitted to a machine, the longer it takes to test all memory locations in that RAM. Because nobody enjoys waiting for their computer to boot, the option to skip the extended memory test was included. Improved manufacturing techniques meant fewer RAM chip errors, and it was often the case that a detected RAM fault was caused by a RAM chip that had become loose in its socket, a phenomenon known as "chip creep". Frustration with this situation led to the introduction of SIMMs, which were held in position more reliably, and also saved space on the motherboard.

Because memory tests were becoming slower, and faults were becoming rarer, manufacturers changed the default to not running a memory test during POST. (A faster boot time was a marketing advantage.) The full test is still available on modern machines, usually by disabling a BIOS option named "fast boot" or similar.

  • I wonder why the PC's memory test did all of the testing for each 16K section before advancing to the next? I would think it would have been both faster and more effective to write an odd-length pattern to all of memory, disable RAM refresh for a little while while avoiding accesses to half the rows, allow a refresh of all RAM, disable RAM refresh for a little while while avoiding accesses to the other half of the rows, then verify that everything was stored correctly, and repeat with another pattern.
    – supercat
    Commented Jan 11, 2022 at 19:39
    Code which processes 16KB sections separately wouldn't detect situations where an erroneous configuration causes a chunk of memory to be mapped to two different addresses, but if one fills all of memory with e.g. a 53-byte pattern, then every 16KB section of RAM ends up with a different pattern in it, so any chunk of RAM that gets double-mapped will report an error when the first image is read back (sketched in C after these comments).
    – supercat
    Commented Jan 11, 2022 at 19:43
    @supercat Remember that the IBM 5150 PC derives from the IBM 5322/5324 System/23 Datamaster, inheriting not only much of its hardware but also much of its programming logic, including tests. The Datamaster had 16KB memory pages of RAM, so it is logical to think that when they modified a Datamaster into a PC prototype they kept the same memories and tests, although the ones in the PC are simplified in comparison with the S/23, where literally everything is tested.
    – Borg Drone
    Commented Apr 23 at 7:47
  • @BorgDrone: I would think a good memory test should configure DRAM refresh to run slightly slower than normal, write everything with test pattern #1, configure an interrupt to run after a few refresh cycles, execute a WAIT instruction so CPU accesses wouldn't refresh any memory, and then verify that pattern #1 reads back correctly, and then repeat the process with two other test patterns. I'm unaware of any early PC BIOS routines testing DRAM reliability with slow refresh, however.
    – supercat
    Commented May 6 at 16:58
    @supercat I don't know either, but I wrote the previous post to explain the origin of the 16KB-sized chunks in the tests. Actually the Datamaster had an 8203 DRAM controller in 16KB mode, but it was dropped during the development of the PC.
    – Borg Drone
    Commented May 6 at 17:14
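
Here is a minimal C sketch of the double-mapped-RAM check supercat proposes in the comments above (the function names are mine; since 53 is prime and does not divide 16384, every 16 KiB region holds the pattern at a different phase):

    #include <stdint.h>
    #include <stddef.h>

    #define PERIOD 53   /* prime, so 16 KiB (16384) is never a multiple */

    /* Fill all of memory with a repeating 53-byte pattern. If some
       chunk of RAM is double-mapped, the later writes land in the
       same physical cells at a different phase and clobber the
       earlier image. */
    static void fill(volatile uint8_t *base, size_t len)
    {
        for (size_t i = 0; i < len; i++)
            base[i] = (uint8_t)(i % PERIOD);
    }

    /* Verify the whole range; a mismatch means a cell fault or a
       region that aliases another address range. */
    static volatile uint8_t *verify(volatile uint8_t *base, size_t len)
    {
        for (size_t i = 0; i < len; i++)
            if (base[i] != (uint8_t)(i % PERIOD))
                return &base[i];
        return NULL;
    }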

The PC's POST routines are an inheritance from the Datamaster's PID 1200 routine (see it here in action), an exhaustive test of the whole system that allows detection and testing of ROMs, RAM and peripherals.

When IBM designed the System/23, they used the same methodology they would have used for a minicomputer (midrange system). On every single boot, the Datamaster would test itself, making it easier to diagnose and repair.

Later on, when it was modified to become the first PC prototype, they wrote their equivalent of the PID 1200, which is the original POST.

