If one or more IP addresses can map to the same MAC address, why are MAC addresses necessary? Even though MAC is a data link layer address and IP is a network layer protocol, why isn't an IP sufficient?
At first there was indeed no such separation between layers, and Ethernet addresses were directly used as host addresses. But as soon as you have more than one network-layer protocol, the separation becomes inevitable as the original protocol's "network" address – whose format is set in stone at that point – becomes the "data link layer" address for the newly introduced protocols.
Originally, Ethernet wasn't built to carry IPv4 (which didn't quite exist yet) – it was built to carry Xerox's Pup protocol suite, and the early "Experimental Ethernet" did directly use 8-bit addresses that were the same as 8-bit Pup addresses.
So when other people wrote specifications on how to carry IPv4 over Experimental Ethernet, those 8-bit Pup addresses became the 8-bit "MAC" addresses from IPv4's point of view.
Later, when Xerox switched to 48-bit addressing (because 8 bits wasn't long enough) they again directly used the 48-bit Ethernet address as part of the XNS host address. And at this point they were already in the situation where 8-bit Pup addresses had to be translated to 48-bit Ethernet addresses.
Now imagine that Ethernet hardware directly worked with 32-bit IPv4 addresses. If that were the case, in order to transport other protocols such as IPv6 over existing networks (without having to rip and replace every single switch and NIC) you would still need to include a residual IPv4 header below the new IPv6 header, and you'd need a way to dynamically map a node's IPv6 address(es) to its IPv4 address (sounds a lot like ARP, doesn't it). You'd end up with exactly the same situation as today, except the "MAC" address would be 32-bit.
(In a sense, you could say that this is exactly what happened. Because Ethernet addresses were (in part) XNS addresses, you could actually say that the 48-bit MAC address was originally an "IP" address from another world – it just became a "MAC" address when Ethernet was adapted to carry different kinds of protocols that did not use the same address format.)
The cause in all cases is that the "link layer" address isn't only known to the OS, it's also used by the hardware. For example, in old shared-bus Ethernet (where every node saw every packet), the hardware would look at the "destination MAC" field and discard unwanted packets so that the CPU would remain undisturbed.
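As a rough illustration of that filtering step, here is a minimal Python sketch of the decision a NIC makes for each frame it sees on a shared bus. The names `parse_mac` and `nic_accepts` are made up for this example, and it is deliberately simplified: a real NIC does this in hardware and also handles multicast group filters, which are left out here. It only assumes that the frame starts with the 6-byte destination address, as Ethernet frames do.

```python
BROADCAST = bytes([0xFF] * 6)  # ff:ff:ff:ff:ff:ff, accepted by every node


def parse_mac(text: str) -> bytes:
    """Turn 'aa:bb:cc:dd:ee:ff' into 6 raw bytes (helper for this sketch)."""
    return bytes(int(part, 16) for part in text.split(":"))


def nic_accepts(frame: bytes, own_mac: bytes, promiscuous: bool = False) -> bool:
    """Return True if the frame should be handed to the CPU.

    Simplified: multicast group membership is not modelled.
    """
    if len(frame) < 6:
        return False  # too short to even contain a destination address
    dest = frame[:6]  # the destination MAC is the first field of the frame
    if promiscuous:
        return True  # e.g. packet capture: accept everything
    return dest == own_mac or dest == BROADCAST


me = parse_mac("02:00:00:00:00:01")
other = parse_mac("02:00:00:00:00:02")
# destination + source + EtherType (0x0800 = IPv4) + dummy payload
frame_for_me = me + other + b"\x08\x00" + b"payload"
frame_for_other = other + me + b"\x08\x00" + b"payload"

print(nic_accepts(frame_for_me, me))     # True
print(nic_accepts(frame_for_other, me))  # False
```

Note that the check never looks past the first six bytes. The hardware needs to know the link-layer address format and nothing about IPv4, IPv6, or whatever else is in the payload.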
With later improvements to Ethernet, the MAC addresses aren't limited to hosts only – they're also tracked by Ethernet switches. Every single switch builds an in-memory "Ethernet routing table" that says which MAC (L2) addresses are behind which physical (L1) port, so that it can deliver each L2 frame only to the specific port that wants it (instead of flooding every packet across the entire Ethernet).
This means that whatever header and address format was chosen literally gets baked into the hardware of all Ethernet NICs and all Ethernet switches, and becomes mandatory to use by all other network-layer protocols.
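The table-building behaviour of a switch can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name `LearningSwitch` and its method are invented for the example, MAC addresses are plain strings, and real switches add entry aging, VLANs, and a hardware lookup table, none of which is modelled here.

```python
class LearningSwitch:
    """Toy model of a MAC-learning Ethernet switch (no aging, no VLANs)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def handle_frame(self, in_port, src_mac, dst_mac):
        """Learn the sender's location and return the ports to forward to."""
        # Learning: the source address tells us which port that host is behind.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination (or broadcast, which is never learned
            # because it never appears as a source): flood everywhere
            # except the port the frame came in on.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the same segment; nothing to do
        return [out_port]


sw = LearningSwitch(ports=[1, 2, 3])
print(sw.handle_frame(1, "aa:aa", "bb:bb"))  # [2, 3]  (bb:bb unknown: flood)
print(sw.handle_frame(2, "bb:bb", "aa:aa"))  # [1]     (aa:aa was learned)
print(sw.handle_frame(1, "aa:aa", "bb:bb"))  # [2]     (bb:bb now known too)
```

The switch keys everything on the source and destination MAC fields and never inspects the payload, which is why the address format ends up fixed for every protocol carried on top.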