Query By Humming - Music Retrieval Technology
- 2. Index
What is QBH?
Basic Architecture
Application
Challenges
File Formats
System Architecture
Parsons code algorithm
Benchmarking MIR System
- 3. • “I don’t know the name. I don’t know who sings it.
• But I can’t get this song out of my head.”
• Well, why not just hum it?
QBH System
Query By Humming
- 9. Challenges
• Users may not make perfect queries.
• Accurately capturing pitches and notes from user hums is
difficult, even if the user manages to submit a perfect
query.
• Similarly, accurately capturing melodic information from a
pre-recorded music file is difficult.
- 10. File Formats
WAV File Format
short for Waveform Audio File Format
most commonly used to store uncompressed audio
files are quite large in size
first-generation files of high quality
- 11. File Formats
MIDI File Format
Musical Instrument Digital Interface
MIDI files are not exactly the same as the typical digital audio formats we
use (like WAV, MP3, MP4 etc.)
a MIDI file is made up of information that describes which musical notes are to
be played
MIDI files therefore do not contain any 'real world' recordings
- 13. Parsons Code
Algorithm
Each note in the input is classified in one of three ways, relative to the
previous note:
1. U = "up," if the note is higher than the previous note
2. D = "down," if the note is lower than the previous note
3. R = "repeat," if the note is the same pitch as the previous note
The first note serves as the reference and is written as *
- 14. Textual Pattern
Notes:   C  C  G  G  A  A  G  F  F  E  E  D  D  C
MIDI:    72 72 79 79 81 81 79 77 77 76 76 74 74 72
Parsons: *  R  U  R  U  R  D  D  R  D  R  D  R  D
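The conversion shown above can be sketched in a few lines of Python; the function name is our own invention, and the rules are exactly those of the Parsons code slide:

```python
def parsons_code(midi_notes):
    """Convert a sequence of MIDI note numbers into a Parsons code string."""
    code = "*"  # the first tone is the reference note
    for prev, cur in zip(midi_notes, midi_notes[1:]):
        if cur > prev:
            code += "U"   # up: higher than the previous note
        elif cur < prev:
            code += "D"   # down: lower than the previous note
        else:
            code += "R"   # repeat: same pitch as the previous note
    return code

# "Twinkle Twinkle Little Star": C C G G A A G F F E E D D C
twinkle = [72, 72, 79, 79, 81, 81, 79, 77, 77, 76, 76, 74, 74, 72]
print(parsons_code(twinkle))  # → *RURURDDRDRDRD
```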
- 16. Introduction
Music Information Retrieval (MIR)
efficient content-based searching
retrieval of musical information
should be easily operated by users
should be controlled by a simple-to-use graphical 'musical' interface
- 17. MIR System
Problem Definition
There are many MIR systems
All have the same task: to enable users to search for music
Very few systems are actually publicly accessible and comparable
Some systems work only with MIDI representations, some with transcriptions
Each system has a different set of files available in its database
- 18. Music Information
Retrieval Methods
MIR Systems may be divided into two categories
1. those that search symbolic representations of music
MIDI files or Common Music Notation (CMN)
2. those that search raw audio files
WAV or mp3 file format
- 19. Symbolic representations
consist of a list of instructions as to how the piece should be played
include the notes, when and for how long each is played
Typical Query –
Involve a search for files with a given sequence of notes
List of MIDI files from database
Music Information
Retrieval Methods
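A symbolic-representation query of this kind can be sketched as a substring search over note lists. The database contents and function name below are invented for illustration:

```python
def find_matches(query_notes, database):
    """Return names of files whose note sequence contains the queried sequence.

    database: dict mapping filename -> list of note names.
    """
    def contains(seq, sub):
        # check every alignment of the query within the file's note list
        return any(seq[i:i + len(sub)] == sub
                   for i in range(len(seq) - len(sub) + 1))
    return [name for name, notes in database.items()
            if contains(notes, query_notes)]

# Hypothetical database of symbolic (note-level) representations
db = {
    "twinkle.mid": ["C", "C", "G", "G", "A", "A", "G"],
    "scale.mid":   ["C", "D", "E", "F", "G", "A", "B"],
}
print(find_matches(["G", "G", "A"], db))  # → ['twinkle.mid']
```

Real systems index the note sequences rather than scanning linearly, but the matching task is the same.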
- 20. raw audio files
digital representations of an actual recording
contain a level of complexity that is not found in the symbolic representations
composition is contaminated by noise
Music Information
Retrieval Methods
- 23. CatFind
searches MIDI files using either a musical transcription or a melodic profile
based on the Parsons Code
It has minimal features
intended primarily for demonstration
Online MIR Systems
- 24. MelDex
MELody inDEX
allows searching of the New Zealand Digital Library
designed to retrieve melodies from a database on the basis of a few
notes sung into a microphone
It accepts acoustic input from the user, transcribes it into common music
notation, then searches a database for tunes that contain the sung
pattern, or patterns similar to it.
Retrieval is ranked according to the closeness of the match
Online MIR Systems
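MelDex's actual matching algorithm is not specified here; as an illustrative stand-in, ranking retrieval by closeness of match can be sketched with a classic edit (Levenshtein) distance over melodic strings such as Parsons codes:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# Hypothetical database of tunes keyed by their Parsons strings
tunes = {
    "tune_a": "*RURURD",
    "tune_b": "*UUDDUU",
}
query = "*RURURD"
ranked = sorted(tunes, key=lambda name: edit_distance(query, tunes[name]))
print(ranked)  # closest match first
```

Edit distance tolerates imperfect queries: a hummed melody with one wrong note still lands near its true match in the ranking.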
- 25. MelodyHound
developed by Rainer Typke in 1997
originally known as "Tuneserver"
It searches directly on the Parsons Code
was designed initially for Query By Whistling
returns the song in the database that most closely matches the query
Online MIR Systems
- 26. Themefinder
created by David Huron
allows one to identify common themes in Western classical
music, folksongs, and Latin motets of the sixteenth century
Online MIR Systems
- 27. Music Retrieval Demo
performs similarity searches on raw audio data (WAV files)
No transcription of any kind is applied
It works by calculating the distance between the selected file and all
other files in the database
Online MIR Systems
- 29. Evaluation Issues
The coverage of the collection, that is, the extent to which the system
includes relevant matter.
The time lag, that is, the average interval between the time the search
request is made and the time an answer is given.
The recall of the system, that is, the proportion of relevant material actually
retrieved in answer to a search request.
The precision of the system, that is, the proportion of retrieved material
that is actually relevant.
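The recall and precision measures can be illustrated with a small worked example (the song names and counts below are made up):

```python
# Relevant items in the collection vs. items the system retrieved
relevant  = {"song1", "song2", "song3", "song4"}
retrieved = {"song1", "song2", "song5"}

hits = relevant & retrieved                 # relevant items actually retrieved
recall    = len(hits) / len(relevant)       # 2 / 4 = 0.5
precision = len(hits) / len(retrieved)      # 2 / 3 ≈ 0.667

print(recall, precision)
```

A system that returns its entire database scores perfect recall but poor precision, which is why both measures are needed.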
- 30. Conclusion
In this work, we have laid down a framework for benchmarking of future
MIR systems. There are only a handful of MIR systems available
online, each of which is quite limited in scope. Still, these benchmarking
techniques were applied to five online systems. Proposals were made
concerning future benchmarking of full online audio retrieval systems. It is
hoped that these recommendations will be considered and expanded
upon as such systems become available.
Editor's Notes
- When we don’t know a song’s name or who performs it, but that particular song will not go out of our head, QBH is a powerful tool for searching for that song.
- The difference between MP3 and WAV is that MP3 is a compressed file while WAV is an uncompressed file.
- Now that you have some knowledge of these notes and know the Parsons code rules, we will convert this song into a textual pattern. The song is “Twinkle Twinkle Little Star”; its notation begins C C G G A A G. The first note is C (MIDI note 72); we make it the reference note and write *. The second note is also C; since it repeats, we write R. Next is G; G is higher than C, so we write U, for “up.” For the second G, we write R.
- Goal of this paper is to create an accurate and effective benchmarking system for music information retrieval (MIR) systems. This will serve the multiple purposes of inspiring the MIR community to add additional features and increased speed into existing projects, and to measure the performance of their work and incorporate the ideas of other works. To date, there has been no systematic rigorous review of the field, and thus there is little knowledge of when an MIR implementation might fail in a real world setting. Benchmarking MIR systems is currently hindered by the diversity of the systems, by their relatively new and unrefined nature, and by the limited number of accessible systems. Thus most of what will be described here will be introductory and will lay down the framework for future benchmarking and analysis. Particular attention will be paid to the evaluation issues surrounding retrieval of audio in test collections.
- The Music Information Retrieval (MIR) field is primarily concerned with efficient content-based searching and retrieval of musical information from online databases. The musical data may be stored in a variety of formats ranging from encoded scores to digital audio. MIR systems should be easily operated by users with a wide range of musical ability and understanding and should be controlled by a simple-to-use graphical 'musical' interface, both for search queries and for the presentation of results.
- There are a lot of MIR systems in various stages of development. These systems all have the same task: to enable users to search for music in a database. But there are very few systems that are actually publicly accessible and comparable. To date, there has been no formal analysis or quantitative comparison methodology (benchmark) of the available preliminary MIR systems. Some systems work only with MIDI representations, some with monophonic transcriptions, and some with scores. In addition, each system has a different set of files available in its database; for example, MIDOMI.com does not have Bollywood songs. To date, there is no online, publicly available system that attempts to search for music based on polyphonic transcriptions. Thus, one goal of this work is to find ways by which these different systems can be compared. A benchmarking of MIR search engines will also provide an effective measure of the progress in the field.
- The symbolic representations typically consist of MIDI files or Common Music Notation (CMN). The raw audio files are typically in WAV or MP3 file format.
- The symbolic representations usually consist of a list of instructions as to how the piece should be played. These include the notes, when and for how long each is played, the dynamics and the instruments that should be used. Other symbolic representations that may be searched include piano rolls and Parsons notation[3]. A typical query may involve a search for files with a given sequence of notes, and might produce a list of MIDI files from a database. Such queries are pertinent to musicians and musicologists who have a knowledge of musical representations.
- Essentially, they are digital representations of an actual recording. Thus they contain a level of complexity that is not found in the symbolic representations. The composition is contaminated by noise and incorporates slight variations in the timing and dynamics of the notes. By comparison, symbolic representations are ambiguous, since they often leave certain characteristics of the piece unspecified. Thus, two performances may have the same MIDI or CMN representations, but differ notably in their audio files.
- MIR systems that operate on audio files have followed two approaches, feature extraction[5] and transcription[6]. Feature extraction involves finding certain features, such as the mean and variance, that typify the audio or a portion thereof. The query and all files in the database are classified in terms of these parameters. Retrieval systems then operate as multidimensional searches on these parameters. Fast search methods have been described for such systems.[7] No attempt is made to relate these features to the musical qualities they might represent, e.g., energy to loudness, frequency to pitch. Transcription-based raw audio MIR systems convert the query into a symbolic representation, and seek to match it against symbolic representations of the audio files in the database. Such a technique typically uses feature extraction as well, but then has an intermediate step attempting to relate these features to a description of the notes and instruments. This is an exceedingly difficult task, and to date, no system achieves this effectively and accurately over a wide range of music.
- For the purposes of this work, we considered five online MIR systems. The systems considered all have certain properties in common. They may all be used online via the World Wide Web. They all are used by entering a query concerning a piece of music, and all may return information about music that matches that query. However, these systems differ greatly in their features, goals and implementation. These differences are discussed in detail below.
- CatFind[13] allows one to search MIDI files using either a musical transcription or a melodic profile based on the Parson’s Code. It has minimal features, and was intended primarily for demonstration. Although it seems unlikely that this system will be extended, it is still useful here as a system for comparison.
- This allows searching of the New Zealand Digital Library; i.e., it recognizes only songs in the New Zealand collection. The MELody inDEX system[14, 15] is designed to retrieve melodies from a database on the basis of a few notes sung into a microphone. It accepts acoustic input from the user, transcribes it into common music notation, then searches a database for tunes that contain the sung pattern, or patterns similar to it. Thus the query is audio, although the retrieved files are in symbolic representation. Retrieval is ranked according to the closeness of the match. A variety of different mechanisms are provided to control the search, depending on the precision of the input.
- MelodyHound: This melody recognition system[16] was developed by Rainer Typke in 1997. It was originally known as "Tuneserver" and hosted by the University of Karlsruhe. It searches directly on the Parsons Code and was designed initially for Query By Whistling. That is, it will return the song in the database that most closely matches a whistled query.
- Themefinder[17], created by David Huron et al.,[18] allows one to identify common themes in Western classical music, folksongs, and Latin motets of the sixteenth century. Themefinder provides a web-based interface to the Humdrum thema command[19], which in turn allows searching of databases containing musical themes or incipits (opening note sequences). Themes and incipits available through Themefinder are first encoded in the kern music data format. Groups of incipits are assembled into databases. Currently there are three databases: Classical Instrumental Music, European Folksongs, and Latin Motets from the sixteenth century. Matched themes are displayed on-screen in graphical notation.
- Music Retrieval Demo: The Music Retrieval Demo[20] is notably different from the other MIR systems considered herein. It performs similarity searches on raw audio data (WAV files). No transcription of any kind is applied. It works by calculating the distance between the selected file and all other files in the database. The other files can then be displayed in a list ranked by their similarity, such that the more similar files are nearer the top. Distances are computed between templates, which are representations of the audio files, not the audio itself. The waveform is Hamming-windowed into overlapping segments; each segment is processed into a spectral representation of Mel-frequency cepstral coefficients. This is a data-reducing transformation that replaces each 20ms window with 12 cepstral coefficients plus an energy term, yielding a 13-valued vector. The next step is to quantize each vector using a specially-designed quantization tree. This recursively divides the vector space into bins, each of which corresponds to a leaf of the tree. Any MFCC vector will fall into one and only one bin. Given a segment of audio, the distribution of the vectors in the various bins characterizes that audio. Counting how many vectors fall into each bin yields a histogram template that is used in the distance measure. For this demonstration, the distance between audio files is the simple Euclidean distance between their corresponding templates (or rather 1 minus the distance, so closer files have larger scores). Once scores have been computed for each audio clip, they are sorted by magnitude to produce a ranked list like other search engines.
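- The final template-distance step of such a pipeline can be sketched as follows. MFCC extraction and the quantization tree are omitted; the bin counts below are invented, and normalizing the histograms is an added assumption (so clips of different lengths stay comparable), not something stated in the description above:

```python
import math

def template_distance(hist_a, hist_b):
    """Euclidean distance between two normalized bin-count histograms."""
    def norm(h):
        # assumed step: scale counts so each histogram sums to 1
        total = sum(h)
        return [c / total for c in h]
    a, b = norm(hist_a), norm(hist_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Invented per-bin MFCC-vector counts for a query clip and two database clips
query_hist = [40, 10, 30, 20]
db = {"clip1": [38, 12, 30, 20], "clip2": [5, 60, 5, 30]}

# score = 1 - distance, so closer files get larger scores
scores = {name: 1 - template_distance(query_hist, h) for name, h in db.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # most similar clip first
```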
- In Table 1, we present a comparison of the features of the various MIR systems under investigation. Note first that each of these systems was designed for a different purpose, and none of them can be considered a finished product. This table allows one to get an overview of the state of the MIR systems available, the features that one may wish to include in an MIR system, and the areas where improvement is most necessary. It also highlights the need for a standardized testbed. Each of the MIR systems uses a different database of files for audio retrieval. Both CatFind and the Music Retrieval Demo have databases with fewer than 500 files. Thus, any benchmarking estimates, such as retrieval times and efficiency, are rendered useless. MelDex, MelodyHound and ThemeFinder have databases containing over 10,000 files. This should be sufficient for estimating search efficiency and scalability.