8

I'm sure it would have a protocol for overheating, simply by shutting down, but would a server have a protocol for failing parts? For example, failing fans, or damaged power supply?

5
  • 6
    In software? Not really. There's not really any reason to NOT have such software resiliencies built in to consumer level software. But in hardware? Yes. Server hardware is generally built using higher quality (more expensive) components, and also with more of them (redundancy), and servers use hardware technologies that add resiliency and redundancy. The idea is that server hardware can generally handle individual components failing without that impacting the system as severely as a system that lacks that redundancy. Commented Dec 26, 2023 at 22:33
  • 4
    Also, more parts can be swapped while the machine is running. Commented Dec 27, 2023 at 6:49
  • 2
    You should bear in mind that ProLiants aren't what they were. I've got a high-quality HP-branded ML310e Gen 8 which is still great after 10 years. I've also got a 4-year-old HPE-branded ML30 Gen 10. It's much lower quality. Its only redeeming feature is the iLO. It's currently in bits on my desk; might be usable with a new drive, or might not.
    – EML
    Commented Dec 28, 2023 at 16:45
  • 2
    Of course they do. More usefully, what research have you done? Commented Dec 28, 2023 at 18:17
  • 1
    Well, I have looked at some YouTube videos that showed the BIOS and the many options available. I have also checked out how easy it was to replace the hardware on these servers. I also researched a bit on the iLO, which I love. Afterwards, I came here. I did not know if there were some further safety precautions integrated.
    – Javontae
    Commented Dec 28, 2023 at 18:38

2 Answers

23

There are some hardware features that are common in servers but relatively rare in home PCs:

  • Redundant power supplies
  • ECC RAM with ECC-supporting motherboard and CPU
  • SAS for hard drive error checking
  • "server-grade" motherboard

When something does go wrong, the hardware itself handles the major failures: virtually all modern CPUs and GPUs have thermal shutdown, and redundant power management is also handled in hardware. Other faults are handled by software, which means that handling isn't server-specific.
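To give a feel for what ECC buys you, here is a toy sketch of single-bit error correction using a Hamming(7,4) code. This is only an illustration of the idea: real ECC DIMMs use wider codes over 64-bit words, and the function names below are made up for this example.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7).

    Parity bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    d0, d1, d2, d3 = data
    p1 = d0 ^ d1 ^ d3  # covers positions 1, 3, 5, 7
    p2 = d0 ^ d2 ^ d3  # covers positions 2, 3, 6, 7
    p4 = d1 ^ d2 ^ d3  # covers positions 4, 5, 6, 7
    return [p1, p2, d0, p4, d1, d2, d3]


def hamming74_decode(code):
    """Return (data_bits, error_position).

    error_position is 0 when no error was detected, otherwise the 1-based
    position of the single bit that was flipped and has been corrected.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4  # the syndrome names the faulty position
    if pos:
        c[pos - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], pos


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a single flipped bit in memory
    print(hamming74_decode(word))
```

A flipped bit is corrected transparently instead of silently corrupting data, and the reported position is the kind of event a server can log so that a failing DIMM gets noticed.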

7
  • 1
    SAS for hard drive error checking - what does SAS do that SMART over SATA can't do, in terms of HDD or SSD error reporting or status querying? Your other points make sense to me, including even hot-swap capability for the redundant power supplies, so you can replace the failed one while the server keeps running on the one working PSU that's left. And maybe even hot-swap RAM. Commented Dec 27, 2023 at 7:27
  • 6
    @PeterCordes He probably meant RAID. I had a bunch of trainees conflating SAS and RAID. It turned out that the confusion was due to their badly written study material, which just said "RAID is SCSI (now obsolete) or SAS (current) disks used in a group". It didn't even mention that you need a controller or that RAID comes in several flavors. And that was only the tip of the iceberg with that educational program. There were numerous complaints (not only from me) about their educational material. The school is still using the same material 5 years later. We don't take their trainees for internships anymore.
    – Tonny
    Commented Dec 27, 2023 at 10:43
  • 8
    @PeterCordes It’s not strictly tied to SAS, but there’s a lot of stuff that is relatively common or easy to find with SAS, FC, or SCSI hardware that’s almost nonexistent for SATA. Multipathing comes to mind as one of the big ones here, but stuff like T10 DIF for E2E integrity checking at the sector level is also far more common with this type of hardware. I think it’s pretty likely the OP was conflating a couple of such things, and possibly also RAID as Tonny suggests, and calling it all ‘SAS’. Commented Dec 27, 2023 at 12:12
  • 4
    Another thing I'd like to mention is physical security: server hardware usually has intrusion sensors and other hardware presence sensors (HDD/PSU), so if somebody opens the chassis you can see an alert (locally or remotely via network/SNMP). Both the chassis and the front bezel often provide locks with keys, so it is harder to access the internals. Commented Dec 27, 2023 at 17:14
  • 4
    Servers often also provide hot swapping for many components like RAM, fans, and drives, as well as specialized ports or slots for specific hardware devices that would take up a standard PCI slot on a home computer, such as network adapters, RAID controllers, etc.
    – barbecue
    Commented Dec 27, 2023 at 17:56
13

In addition to what is mentioned in the answer, servers may have additional remote management facilities built in compared to a consumer computer.

This can be manufacturer-specific. E.g.:

  1. HPE (Hewlett Packard Enterprise) has Integrated Lights-Out (iLO). "How is HPE iLO unique?" contains:

    To help you manage your servers more easily, this embedded management process runs on a separate microprocessor chip (which is why it is called “out-of-band management”). This way, HPE iLO remains available, even when the server suffers a failure. You can use iLO to determine precisely what went wrong and then fix it quickly and efficiently, even if you are unable to power up your server.

  2. Dell Remote Access Controller (iDRAC) contains:

    The iDRAC is a piece of hardware that sits on the server motherboard that allows Systems Administrators to update and manage Dell systems, even when the server is turned off.
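As a rough sketch of how these controllers surface failing parts such as fans: recent iLO and iDRAC generations expose a Redfish REST API, to the best of my knowledge, and a chassis "Thermal" resource there lists fans with a `Status` object. The sample payload, its field values, and the helper name `unhealthy_fans` below are illustrative assumptions, not output captured from a real server; the exact URL and fields vary by vendor and firmware.

```python
def unhealthy_fans(thermal):
    """Return (name, health) pairs for fans not reporting Health == "OK".

    `thermal` is a dict shaped like a Redfish Thermal resource. Fans whose
    State is "Absent" (empty slots) are skipped.
    """
    problems = []
    for fan in thermal.get("Fans", []):
        status = fan.get("Status", {})
        if status.get("State") == "Absent":
            continue
        health = status.get("Health")
        if health != "OK":
            name = fan.get("Name") or fan.get("FanName") or "unknown fan"
            problems.append((name, health))
    return problems


# Illustrative payload only (assumed shape, hand-written for this example).
SAMPLE_THERMAL = {
    "Fans": [
        {"Name": "Fan 1", "Reading": 32,
         "Status": {"State": "Enabled", "Health": "OK"}},
        {"Name": "Fan 2", "Reading": 0,
         "Status": {"State": "Enabled", "Health": "Critical"}},
        {"Name": "Fan 3", "Status": {"State": "Absent"}},
    ]
}

if __name__ == "__main__":
    for name, health in unhealthy_fans(SAMPLE_THERMAL):
        print(f"{name}: {health}")
```

Because the management controller runs independently of the host, a check like this can keep working even when the server's own OS is down.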

7
  • 1
    @Javontae It is called iDRAC in Dell's servers. Commented Dec 27, 2023 at 17:16
  • 5
    @Javontae Practical considerations worth considerin': the iDRAC unit itself consumes power, as I found out recently: 26W (!!)
    – bertieb
    Commented Dec 27, 2023 at 22:51
  • 1
    Back when I was a sysadmin for a small research group at a local university, I recall we had some Sun and/or Dell servers for a few small Beowulf clusters. One of them (I think Sun) had what it called "iLOM" (integrated Lights-Out Management); same idea: a management processor with its own ethernet port you could connect to via a Java Applet (this was 2005 or 6, and Sun now that I think about it). So it was like a VNC connection to the physical machine's VGA, keyboard and mouse, including being able to press the reset and power buttons, and get into the BIOS. Transparent to the main CPU / OS. Commented Dec 28, 2023 at 10:20
  • 4
    @PeterCordes HP also have Intel AMT on some of their desktop PCs, where AMT can use the same Ethernet interface as the OS for out-of-band management. E.g. in "Enabling AMT networking" I did enable AMT network access, such that I could get to an AMT logon page when the HP Z640 was connected to the mains but the main CPU was powered down. I think Intel AMT can sniff Ethernet frames received on the on-board Ethernet interface and pass them to the management processor, rather than the main CPU. Commented Dec 28, 2023 at 10:40
  • 6
    @ChesterGillon it does do the latter. It also does an annoying thing where by default the management controller will take over responding to pings on the host's IP if the host is off, which is a bit of a pain when you have a naive monitoring system which is pinging things to check whether they're online, and you don't know that AMT does this unless you disable the setting.
    – Carcer
    Commented Dec 28, 2023 at 12:17
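
One way around the ping problem described above is to have the monitoring system probe a TCP port that only the host OS serves, rather than relying on ICMP. A minimal sketch, assuming the host runs some TCP service (SSH on port 22, say) that the management controller does not answer for; the function name and the host name in the comment are hypothetical.

```python
import socket


def service_is_up(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds.

    Refused connections, timeouts and name-resolution failures all raise
    OSError subclasses and are reported as "down".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example use (hypothetical host name):
#     service_is_up("server01.example.com", 22)
```

This only shows that something accepted the connection on that port, so pick a port the OS owns and that the management controller is not configured to listen on.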
