According to wiki IBM System 360 had byte addressable RAM.
Yes.
[Considering this and the title "Why did IBM System 360 have byte addressable RAM", it feels as if there's a mix-up about what addressing and RAM mean. See some thoughts about that at the end.]
Previously IBM had machine with word addressable memory.
No. Only very few.
IBM made all sorts of machines, including bit-addressable ones, as explained here and here.
In detail the most used machines are:
- 1401 used byte addressing - at the time called character addressing - with 6 bit bytes
- 1620 used decimal addressing with decimal bytes (one digit per byte)
- 1710 - see 1620
- 7030 used bit addressing
Speaking of the 7030: a 700/7000 family is often assumed, but in reality it's more of a marketing thing, where IBM tried to press all CPUs into a 70xx numbering scheme. In hardware as well as software they were vastly different lines:
- 701 - half word addressing (19 units built)
- 702 - character addressing (14 units built)
- 704 - word addressing
- 709/704x/709x - like 704
- 705/7080 - character addressing
- 7010 - character addressing (top end 1400)
- 7030 - bit addressing
- 707x - decimal words of 10 (like 650 calculator)
So of all of these, only the 704 and its 709/704x/709x successors used word addressing. And while that line includes some of the most powerful (well, already outclassed by CDC before the /360 arrived) and most expensive machines, their numbers were quite low (*1).
Bottom line: most pre-/360 machines were byte addressable (with bytes of various sizes), not word addressable.
Did they make a switch for comparability between different machines?
Why should they? I know of no reason. Comparability is an external request, nothing a producer needs or wants. Marketing loves to sell things that are not easily comparable :)
As explained here, the /360 was the follow-up to all these different machines, only a few of which were word addressable. See above.
Or it was just performance or money or single symbol size reasoning behind it?
Pick whatever you want. The /360 was intended to be a single ISA capable of being tailored to all needs, from low-end business to high-end scientific.
Now the promised thoughts:
Could it be that your thoughts are stuck between addressability as defined in the ISA (Instruction Set Architecture), as seen from a programmer's view, and the memory interface as seen from the hardware?
An ISA is the abstract view of the hardware a programmer interacts with. It's the way the machine looks to the programmer. Addressing on the ISA side describes the granularity an instruction can use to address data. While this may vary between instructions and access types (for example due to alignment restrictions), the smallest size that can be addressed directly with a complete address is considered the defining one. In the case of the /360 that's the byte. Any regular address within an instruction can point to any byte in memory.
Words and the like are formed from multiples of bytes and may or may not be restricted to a limited set of addresses - like the /360 requiring words to be aligned to multiples of 4, thus leaving the two lowest bits of any word address zero.
This definition is only valid within its ISA and is not necessarily related to the hardware at all.
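To make the ISA-side view concrete, here is a minimal sketch of byte addresses versus word alignment. It assumes only what is stated above (24 bit addresses, 4-byte words aligned to multiples of 4); the function names are made up for illustration and do not come from any IBM documentation.

```python
ADDRESS_BITS = 24   # /360 address width as described above
WORD_BYTES = 4      # a 32 bit word is four 8 bit bytes

def is_valid_address(addr):
    """Every value in the 24 bit range names exactly one byte."""
    return 0 <= addr < (1 << ADDRESS_BITS)

def is_word_aligned(addr, word_bytes=WORD_BYTES):
    """A word address must have its lowest bits zero (two bits for 4 bytes)."""
    return addr & (word_bytes - 1) == 0

def containing_word(addr, word_bytes=WORD_BYTES):
    """Address of the aligned word that holds the given byte."""
    return addr & ~(word_bytes - 1)

# Any byte can be addressed, but only every fourth address can start a word.
print(is_word_aligned(0x1000))        # True
print(is_word_aligned(0x1002))        # False
print(hex(containing_word(0x1003)))   # 0x1000
```

The point is that the address always counts bytes; the word is merely a byte address with extra restrictions.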
On the hardware side memory is always word-accessed, with the word being of arbitrary size, independent of the word (or byte) size defined by the ISA. The /360 is a great example here, as its ISA presents a plain 32 bit world with 24 bit addressing and 8 bit bytes. But at the memory interface many sizes were used, depending on machine type and time, starting from 16 and 32 bit for the earliest implementations and going up to 64, 128, 256 and more later on.
It's the task of the memory interface to map bytes, words or whatever else the ISA side requests onto its own memory word and back.
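As a rough sketch of that mapping, the following shows how one and the same ISA byte address lands in different places depending on the width of the memory word. The function names and the simple "lowest address in lane 0" layout are assumptions for illustration only, not a description of any particular /360 model.

```python
def locate(byte_addr, mem_word_bits):
    """Map an ISA byte address to (memory word index, byte lane in that word)."""
    bytes_per_word = mem_word_bits // 8
    return divmod(byte_addr, bytes_per_word)

def words_touched(byte_addr, length, mem_word_bits):
    """Number of memory words an access of `length` bytes has to read."""
    bytes_per_word = mem_word_bits // 8
    first = byte_addr // bytes_per_word
    last = (byte_addr + length - 1) // bytes_per_word
    return last - first + 1

# The same byte address on a 16 bit and on a 64 bit wide memory:
print(locate(0x1006, 16))   # (2051, 0)
print(locate(0x1006, 64))   # (512, 6)

# A 4-byte ISA word at 0x1004 needs two accesses on 16 bit memory,
# but only one on 64 bit memory.
print(words_touched(0x1004, 4, 16))   # 2
print(words_touched(0x1004, 4, 64))   # 1
```

The program sees the same byte address in both cases; only the interface underneath does different work.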
This abstraction level was already used before the /360, as for example (AFAIR) a character addressing 7010, a word addressing 7090 and a bit addressing 7030 could all use the same memory subsystem made of 36/72 bit words.
*1 A few hundred for all of them combined, while the 1401 alone accounts for more than 10,000 units.