I'm working on a machine learning project and I'm interested in the different pre-processing techniques that can be applied to audio data. In particular, I want to compare a human auditory model (i.e. pre-processing my data into features that mimic how humans perceive sound) against a non-human-auditory representation. I know, for example, that the Mel spectrogram is a human auditory model, as is the multi-resolution cochleagram. But which representations are *not* modelled on human hearing? I suppose the short-time Fourier transform is one example, or the raw PCM samples themselves?
What would be other examples?
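For concreteness, here is a minimal sketch of the contrast I have in mind, assuming NumPy/SciPy: the same signal as a plain STFT power spectrogram (linear frequency axis, no perceptual assumptions) versus the same spectrogram warped through a hand-rolled mel filterbank (the perceptually motivated version). The signal, parameter values, and the `mel_filterbank` helper are all my own illustrative choices, not from any particular library.

```python
import numpy as np
from scipy import signal

sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440 * t)  # 1 s test tone at 440 Hz

# Non-human-auditory representation: plain STFT power spectrogram.
# Frequency bins are linearly spaced; nothing here models perception.
f, frames, Z = signal.stft(y, fs=sr, nperseg=512)
power = np.abs(Z) ** 2  # shape: (257, n_frames)

# Human-auditory representation: warp the same power spectrogram
# onto a mel-spaced triangular filterbank (illustrative helper).
def mel_filterbank(n_mels, n_fft, sr):
    hz_to_mel = lambda hz: 2595.0 * np.log10(1.0 + hz / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    # n_mels + 2 edge points, equally spaced on the mel scale
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):   # rising slope of triangle
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):  # falling slope of triangle
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

fb = mel_filterbank(n_mels=40, n_fft=512, sr=sr)
mel_spec = fb @ power  # shape: (40, n_frames), mel-warped energies
```

The point of the comparison: `power` keeps every linear-frequency bin, while `mel_spec` compresses the same information into bands that get wider at higher frequencies, roughly matching human frequency resolution.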