Three reasons: much higher breakdown voltage, better thermal conductivity, and far fewer contaminants (including moisture from condensation) that lead to Partial Discharge, which is cheaper to monitor and repair in oil than in dry epoxy types.
I added the 3rd reason, which is more complex: it is easier to remove foreign particles from oil, and the oil's viscosity reduces the kinetic energy of particles accelerated by an E-field, so they are less likely to hit a conductor with enough energy to release hydrogen, a combustible gas, from the water molecule.
Dry transformers do exist below 5 MVA. They occupy a smaller footprint and are quieter, safer and preferred in some urban areas, but they are less efficient, cost more and rely on more expensive insulation, with mica tape and epoxy polymers to make them moisture resistant. Dry transformers must combat the tendency to absorb moisture, which rapidly degrades the breakdown voltage.
Transformer-grade oil is at least 8x and up to 25x better than air for dielectric breakdown, and has at least 6x better thermal conductivity in W/(m·K).
Oil is predominantly used above 5 MVA because of its better electrical and cooling efficiency. It is necessary for cooling, for thermal spreading of hotspots and for electrical insulation.
Partial Discharge (PD) is all about the flow of ions in plasma, like an aurora or corona. It needs some contaminants to collide and cause discharge.
In my experiments on Nydas transformer oil in a transformer factory, the oil was found to exceed 25 kV/mm, with typical results varying from 25 to 40 kV.
With more expensive processing to remove ppm-level contaminants, it can reach 70 kV/mm. Those who can afford the $50k+ machine use it, but it takes some skill in handling invisible contamination, plus process quality controls in a clean-room environment.
The test is done with a ramp of about 1 kV/s, using ultraclean, large (~2 cm) flat brass electrodes with smooth tapered edges in a clean glass beaker.
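For a rough feel of how long one such ramp takes, the sketch below divides an expected breakdown voltage by the ramp rate. The 1 mm gap is a hypothetical value chosen only for illustration (the gap is not stated above), the helper name is made up, and scaling the breakdown voltage linearly with the gap is a simplifying assumption.

```python
def ramp_time_to_breakdown_s(strength_kv_per_mm, gap_mm, ramp_kv_per_s=1.0):
    """Seconds a linear voltage ramp needs to reach the breakdown voltage.

    Assumes breakdown voltage = field strength x gap (uniform field),
    which is only an approximation for real electrodes.
    """
    if gap_mm <= 0 or ramp_kv_per_s <= 0:
        raise ValueError("gap and ramp rate must be positive")
    return strength_kv_per_mm * gap_mm / ramp_kv_per_s


if __name__ == "__main__":
    GAP_MM = 1.0  # hypothetical gap, not given in the text
    for strength in (25.0, 70.0):  # kV/mm figures quoted above
        t = ramp_time_to_breakdown_s(strength, GAP_MM)
        print(f"{strength:.0f} kV/mm across {GAP_MM} mm: about {t:.0f} s at 1 kV/s")
```

Under these assumptions a single ramp lasts on the order of tens of seconds, so the ramp itself is a small part of the effort compared with preparing a clean sample.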
As with air, it is mobile contaminants and pressure changes that can lead to partial discharge, which causes the variability in the breakdown voltage (BDV) of an insulator.
For transformer oil, Partial Discharge also breaks down the large hydrocarbon chains, releasing H2, which has a lower explosive limit of 4% concentration.
Clean air breaks down at about 3 kV/mm, while dirty, moist air breaks down at < 500 V/mm, both for flat-to-flat electrodes; point-to-point is about 1/3 of these voltage thresholds.
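To put the rule-of-thumb figures side by side, here is a minimal sketch that turns the field strengths quoted in this answer into a breakdown voltage for a given gap. It assumes the voltage scales linearly with the gap, which is only a rough approximation, and it applies the 1/3 point-to-point factor to air only, since that is the only case quoted. The function and dictionary names are hypothetical.

```python
# Rule-of-thumb field strengths quoted in this answer, in kV/mm.
STRENGTH_KV_PER_MM = {
    "clean_air": 3.0,
    "dirty_moist_air": 0.5,   # quoted as an upper bound ("< 500 V/mm")
    "oil_typical": 25.0,      # lower end of the measured oil results
    "oil_processed": 70.0,    # after ppm-level contaminant removal
}
POINT_TO_POINT_FACTOR = 1.0 / 3.0  # quoted for air only


def breakdown_kv(medium, gap_mm, point_to_point=False):
    """Estimate breakdown voltage in kV, assuming linear scaling with gap."""
    if gap_mm <= 0:
        raise ValueError("gap must be positive")
    kv = STRENGTH_KV_PER_MM[medium] * gap_mm
    if point_to_point:
        if not medium.endswith("air"):
            raise ValueError("point-to-point factor is only quoted for air")
        kv *= POINT_TO_POINT_FACTOR
    return kv


if __name__ == "__main__":
    gap = 10.0  # mm, example gap
    for medium in STRENGTH_KV_PER_MM:
        print(f"{medium:16s} flat-to-flat, {gap} mm: {breakdown_kv(medium, gap):6.1f} kV")
    print(f"clean_air point-to-point, {gap} mm: "
          f"{breakdown_kv('clean_air', gap, point_to_point=True):.1f} kV")
    air = STRENGTH_KV_PER_MM["clean_air"]
    for oil in ("oil_typical", "oil_processed"):
        print(f"{oil} vs clean air: {STRENGTH_KV_PER_MM[oil] / air:.1f}x")
```

The two ratios at the end come out at roughly 8x and 23x, in line with the "8x and up to 25x" range given earlier.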
An ultra-high vacuum (very low pressure) gives a high BDV, but a partial vacuum gives a very low one, because fewer molecules mean less drag and higher kinetic energy when an ion in the air hits the conductor. (See Paschen's law.)
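The shape of that curve can be sketched with the usual textbook form of Paschen's law, V = B·p·d / (ln(A·p·d) − ln(ln(1 + 1/γ))). The constants below are commonly quoted values for air (A = 15 per cm·Torr, B = 365 V per cm·Torr). The secondary-emission coefficient γ = 0.01 is an assumed value, since it depends on the electrode material. Treat the numbers as illustrative rather than as measurements.

```python
import math

A_AIR = 15.0    # 1/(cm*Torr), commonly quoted for air
B_AIR = 365.0   # V/(cm*Torr), commonly quoted for air
GAMMA = 0.01    # assumed secondary-emission coefficient (electrode dependent)


def paschen_voltage(p_torr, d_cm, a=A_AIR, b=B_AIR, gamma=GAMMA):
    """Breakdown voltage in volts predicted by Paschen's law.

    Returns math.inf where the formula has no finite solution, i.e. at
    very low p*d where too few collisions occur to sustain an avalanche.
    """
    pd = p_torr * d_cm
    if pd <= 0:
        raise ValueError("pressure and gap must be positive")
    denom = math.log(a * pd) - math.log(math.log(1.0 + 1.0 / gamma))
    if denom <= 0:
        return math.inf
    return b * pd / denom


def paschen_minimum(a=A_AIR, b=B_AIR, gamma=GAMMA):
    """Return (p*d in Torr*cm, voltage in V) at the minimum of the curve."""
    pd_min = math.e * math.log(1.0 + 1.0 / gamma) / a
    return pd_min, b * pd_min


if __name__ == "__main__":
    gap_cm = 1.0
    for p in (760.0, 100.0, 10.0, 1.0, 0.1, 0.01):
        v = paschen_voltage(p, gap_cm)
        label = "no breakdown predicted" if math.isinf(v) else f"{v:8.0f} V"
        print(f"p = {p:7.2f} Torr, gap = {gap_cm} cm: {label}")
    pd_min, v_min = paschen_minimum()
    print(f"minimum near p*d = {pd_min:.2f} Torr*cm, about {v_min:.0f} V")
```

With these constants the predicted voltage falls from tens of kV at atmospheric pressure to a few hundred volts around 1 Torr for a 1 cm gap, then rises without bound as the pressure keeps dropping, which is the partial-vacuum versus high-vacuum contrast described above.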