Ok shazam, "la la-lalaa"!
- 3. - What is a signal?
- Where can the Fourier transform (DFFT, FFT, sFFT) be used?
- How to get spectrogram?
- What are the energy peaks of a sound?
- How to get acoustic fingerprint?
- So, how does Shazam work?
- A demo of a primitive Shazam-like app
Agenda
- 4. Roman Rodomansky
Software Engineer at Perfectial
FE Trainer at CURSOR Education (http://cursor.education/teacher/roman-rodomansky)
Co-Founder of GDG (Google Developers Group) Lviv (http://lviv.gdg.org.ua/)
Founder of UASC (Ukrainian Security Community) group (2009)
Founder of 2enota startup (2013)
https://github.com/itspoma
https://facebook.com/rodomanskyy
https://linkedin.com/in/rodomansky
I’m
- 5. - Shazam was founded in 1999
- uses audio input (e.g. the built-in microphone) to capture a brief sample (10 s)
- has more than 100 million monthly active users (with more than 500 million mobile devices)
- Similar apps
- SoundHound (Midomi), Xiaomi Music, Musipedia
- Fire Phone from Amazon
- Google Sound Search, Bing Audio, Yahoo Music
- Sony TrackID, Lyrics Mania, Omusic, Peach, etc.
- About 4000 patents: https://patents.google.com/?q=music+identification…
- covering how to collect, parse, identify, query, etc.
Sound identification apps
- 7. https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
We have developed and commercially deployed a flexible audio search engine. The
algorithm is noise and distortion resistant, computationally efficient, and
massively scalable, capable of quickly identifying a short segment of music captured
through a cellphone microphone in the presence of foreground voices and other
dominant noise, and through voice codec compression, out of a database of over a
million tracks. The algorithm uses a combinatorially hashed time-frequency
constellation analysis of the audio, yielding unusual properties such as
transparency, in which multiple tracks mixed together may each be identified.
Furthermore, for applications such as radio monitoring, search times on the order of
a few milliseconds per query are attained, even on a massive music database.
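The "time-frequency constellation" in the abstract is built from peaks of a spectrogram. The following is a minimal sketch of that idea, not the paper's implementation: it uses a naive DFT from the standard library, and it keeps only the single strongest bin per frame, whereas a real system would pick local maxima over a time-frequency neighborhood. The names `dft_magnitudes`, `spectrogram`, `strongest_peaks` and the frame parameters are assumptions made for illustration.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the first N/2 bins of a naive O(N^2) DFT of a real frame."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def spectrogram(samples, frame_size=64, hop=32):
    """One list of bin magnitudes per Hann-windowed, overlapping frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_size - 1))
              for i in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        frames.append(dft_magnitudes(frame))
    return frames

def strongest_peaks(spec):
    """Simplified constellation: (frame_index, bin_index) of the loudest bin per frame."""
    return [(t, max(range(len(mags)), key=mags.__getitem__))
            for t, mags in enumerate(spec)]

# Test tone (assumed values): 1000 Hz sine sampled at 8000 Hz.
# Bin width is 8000 / 64 = 125 Hz, so the tone sits exactly on bin 8.
rate, freq = 8000, 1000
tone = [math.sin(2 * math.pi * freq * i / rate) for i in range(256)]
peaks = strongest_peaks(spectrogram(tone))
```

For real audio the naive DFT would be replaced by an FFT routine; the structure (window, transform, pick peaks) stays the same.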
- 28. db = [
{freq1, freq2, Δtime},
hash, hash, hash, hash, hash, …
]
Time-Invariant Hashes
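A hash built from {freq1, freq2, Δtime} depends only on the time difference between two peaks, not on where they occur in the track, which is what makes it time-invariant. The sketch below shows one possible way to do this, assuming peaks are given as (time, frequency_bin) pairs. The function names, the 10-bit field width, and the `fan_out` and `max_dt` parameters are illustrative assumptions, and the resulting numbers are not meant to reproduce the hash values shown on the slides.

```python
def pack_hash(freq1, freq2, dt, bits=10):
    """Pack two frequency bins and a time delta into one integer (assumed layout)."""
    mask = (1 << bits) - 1
    return ((freq1 & mask) << (2 * bits)) | ((freq2 & mask) << bits) | (dt & mask)

def fingerprint(peaks, fan_out=3, max_dt=63):
    """Pair each anchor peak with up to `fan_out` later peaks.

    Returns a list of (hash, anchor_time) tuples. The anchor time is kept
    next to the hash so a match can later be aligned in time.
    """
    peaks = sorted(peaks)
    result = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt <= 0:
                continue  # same frame as the anchor, no usable time delta
            if dt > max_dt:
                break     # peaks are sorted by time, later ones are even further away
            result.append((pack_hash(f1, f2, dt), t1))
            paired += 1
            if paired == fan_out:
                break
    return result

# The same peaks shifted by 100 frames give the same hashes at shifted offsets.
peaks = [(0, 10), (1, 20), (3, 15), (4, 30)]
shifted = [(t + 100, f) for t, f in peaks]
original_prints = fingerprint(peaks)
shifted_prints = fingerprint(shifted)
```

Comparing `original_prints` with `shifted_prints` shows identical hash values, with every anchor time moved by the same 100 frames.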
- 29. Time-Invariant Hashes
=> [261209922, 928719572, 927571829, 756562712, 875626731, 726187626, 817592192, 8217646272,
9960192815, 987125921, 972857192, 81266852, 98172975, 91729852, 7579812752, 987219872,
965876125, 918729875, 1982798712, 981729871, 716287652, …]
- 32. Store & find matches
db.songs
  int id
  varchar title
  varchar filehash
db.fingerprints
  int id
  int song_fk
  varchar hash
  int offset
fingerprint
find matches
audio samples
labeled music
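The two tables above can be tried out with an in-memory SQLite database. This is a rough sketch under stated assumptions, not the app's actual code: the helper names (`create_db`, `add_song`, `find_match`) and the sample hash values are hypothetical, and matching is done by counting, per song, how many sample hashes agree on the same time shift (stored offset minus sample offset).

```python
import sqlite3
from collections import Counter

def create_db():
    """Create the songs and fingerprints tables in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript('''
        CREATE TABLE songs (id INTEGER PRIMARY KEY, title VARCHAR, filehash VARCHAR);
        CREATE TABLE fingerprints (
            id INTEGER PRIMARY KEY,
            song_fk INTEGER REFERENCES songs(id),
            hash VARCHAR,
            "offset" INTEGER  -- quoted because OFFSET is an SQL keyword
        );
        CREATE INDEX idx_fingerprints_hash ON fingerprints(hash);
    ''')
    return db

def add_song(db, title, filehash, prints):
    """Store a labeled track and its (hash, offset) fingerprints."""
    cur = db.execute("INSERT INTO songs (title, filehash) VALUES (?, ?)", (title, filehash))
    song_id = cur.lastrowid
    db.executemany(
        'INSERT INTO fingerprints (song_fk, hash, "offset") VALUES (?, ?, ?)',
        [(song_id, str(h), off) for h, off in prints],
    )
    return song_id

def find_match(db, sample_prints):
    """Return (title, votes) for the song whose hashes line up best, or None."""
    votes = Counter()
    for h, sample_off in sample_prints:
        rows = db.execute(
            'SELECT song_fk, "offset" FROM fingerprints WHERE hash = ?', (str(h),)
        ).fetchall()
        for song_id, db_off in rows:
            # A true match yields the same shift for many hashes.
            votes[(song_id, db_off - sample_off)] += 1
    if not votes:
        return None
    (song_id, _shift), count = votes.most_common(1)[0]
    title = db.execute("SELECT title FROM songs WHERE id = ?", (song_id,)).fetchone()[0]
    return title, count

# Hypothetical data: the sample is a slice of "Song A" starting at offset 5.
db = create_db()
add_song(db, "Song A", "aaa", [(111, 0), (222, 5), (333, 9), (444, 12)])
add_song(db, "Song B", "bbb", [(111, 3), (555, 4), (666, 8)])
result = find_match(db, [(222, 0), (333, 4), (444, 7)])
```

Voting on the time shift rather than on raw hash hits is what keeps a few accidental hash collisions in another song from winning the match.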