Assume a mechanical keyboard is producing sensitive audio emissions. Such emissions are often sufficient to reconstruct the key presses that generated them. If the audio is compressed with a narrowband codec such as G.711, how much of that information is destroyed?

Put another way, can acoustic side-channel attacks ever be done using modern telephony?

  • From "Don't Skype & Type! Acoustic Eavesdropping in Voice-Over-IP": "... In fact, we show that very popular VoIP software (Skype) conveys enough audio information to reconstruct the victim's input -- keystrokes typed on the remote keyboard. ...". Commented Apr 27, 2021 at 4:10
  • @SteffenUllrich That attack is against Skype, which uses the SILK codec. SILK is much more capable than G.711, G.729, or the other codecs common in cellular telephony. Most VoIP software tends to use higher-quality codecs than your local cell tower, SIP trunk, or whatever.
    – forest
    Commented Apr 27, 2021 at 6:49
  • @forest shouldn't you be comparing to AMR-WB and EVS in 2021?
    – hobbs
    Commented Apr 27, 2021 at 21:46
  • @hobbs I have no idea. My knowledge of cellular telephony is very primitive.
    – forest
    Commented Apr 27, 2021 at 23:07
  • @Hobbamok I suppose I should have said typical cellular telephony that can go over PSTN. I'm sure there are VoIP clients which support significantly higher sample rates, etc.
    – forest
    Commented Apr 30, 2021 at 0:07

1 Answer

According to Wikipedia:

"... G.711 passes audio signals in the range of 300–3400 Hz and samples them at the rate of 8,000"

The Nyquist criterion limits the highest representable frequency to less than half the sample rate, or less than 4 kHz in this case. G.711's own band-limiting apparently cuts this down further, to a top of about 3.4 kHz.
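To make the Nyquist point concrete, here is a minimal sketch (my own illustration, not part of the experiment below) showing that at 8,000 samples per second a 5 kHz tone produces exactly the same samples as a 3 kHz tone. The helper names `sample_tone` and `alias_frequency` are made up for this example. A real G.711 front end filters out content above the passband before sampling, so in practice that content is removed rather than folded down; either way it does not survive as itself.

```python
import math

SAMPLE_RATE_HZ = 8000            # G.711 sample rate, per the quote above
NYQUIST_HZ = SAMPLE_RATE_HZ / 2  # highest frequency the samples can represent


def sample_tone(freq_hz, n_samples, rate_hz=SAMPLE_RATE_HZ):
    """Sample an ideal cosine tone with no anti-aliasing filter (hypothetical helper)."""
    return [math.cos(2 * math.pi * freq_hz * n / rate_hz) for n in range(n_samples)]


def alias_frequency(freq_hz, rate_hz=SAMPLE_RATE_HZ):
    """Frequency in [0, rate/2] that freq_hz folds to when sampled unfiltered."""
    folded = freq_hz % rate_hz
    return min(folded, rate_hz - folded)


tone_5k = sample_tone(5000, 64)
tone_3k = sample_tone(3000, 64)

# Largest sample-by-sample difference between the two tones; it is zero up to
# floating-point rounding, i.e. the sampled data cannot tell them apart.
max_diff = max(abs(a - b) for a, b in zip(tone_5k, tone_3k))

print("Nyquist limit (Hz):", NYQUIST_HZ)
print("5 kHz folds to (Hz):", alias_frequency(5000))
print("max sample difference:", max_diff)
```

The same folding applies to anything in the 5-20 kHz range: once the signal is reduced to 8,000 samples per second, none of it can be recovered at its original frequency.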

A quick impromptu experiment

Laying the microphone of a USB headset next to my very clicky old Dell keyboard and recording the sound gives me this time-domain (amplitude vs. time) recording:

[image: Keyboard recording]

Running a Fourier transform to move to the frequency domain yields:

[image: frequency spectrum of the same recording]

It looks to me like all of the subtle key impulse differentiation lies in the 5-20 kHz region, which by definition cannot be passed by G.711.
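One way to put a number on that visual impression would be to measure what fraction of the spectral energy falls inside the 300-3400 Hz passband. The sketch below is only an illustration under stated assumptions: `band_energy_fraction` is a hypothetical helper built on a naive DFT, and `click` is a synthetic stand-in (two decaying tones at 2 kHz and 9 kHz), not my actual recording. Note that energy share is not the same thing as how much key-distinguishing information survives, so treat the result as a rough indicator at best.

```python
import cmath
import math


def band_energy_fraction(samples, rate_hz, low_hz, high_hz):
    """Fraction of spectral energy (DC excluded) lying in [low_hz, high_hz]."""
    n = len(samples)
    in_band = 0.0
    total = 0.0
    # Positive-frequency bins only; bin k corresponds to k * rate_hz / n.
    for k in range(1, n // 2 + 1):
        freq_hz = k * rate_hz / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        energy = abs(coeff) ** 2
        total += energy
        if low_hz <= freq_hz <= high_hz:
            in_band += energy
    return in_band / total if total else 0.0


# ASSUMPTION: synthetic "key click", not real keyboard audio.
RATE_HZ = 32000
N = 512
click = [
    math.exp(-i / 100)
    * (
        math.sin(2 * math.pi * 2000 * i / RATE_HZ)
        + math.sin(2 * math.pi * 9000 * i / RATE_HZ)
    )
    for i in range(N)
]

passband_share = band_energy_fraction(click, RATE_HZ, 300, 3400)
print(f"share of energy inside 300-3400 Hz: {passband_share:.2f}")
```

To apply this to a real recording, replace `click` and `RATE_HZ` with the recorded samples and their sample rate; for recordings longer than a few thousand samples an FFT library would be a better choice than this naive DFT.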

This was a quick and dirty experiment; take it for what it's worth.

  • "It looks to me like all of the subtle key impulse differentiation lies in the 5-20 kHz region" I fail to see why this is the case. The 5-20 kHz region is less smooth, but this is a spectrum, which is less intuitive than the time-domain signal. E.g. think of the spectrum of a transmission of a secret phrase encoded in Morse code using an ideal tuning fork. Also, G.711 surely isn't an infinite-order bandpass filter (but to be honest I don't know its details). Commented Apr 27, 2021 at 18:23
