In my last article I described the Fourier transform as a way of representing signals in the frequency domain. I also promised to explain how it is applied in a wonderful service called Shazam, which identifies a song from a short musical excerpt. The app is available for iPhone, Android and other platforms.
Imagine you are at a concert and hear a lovely song that you don't know but want to remember. Turn Shazam on, and it will send you the song title and artist along with additional information: lyrics, videos, a biography of the artist, concert tickets and recommended tracks. In this article I won't give any complex mathematical formulas; I'll try to explain music recognition algorithms in simple language.
How is the Fourier transform connected with Shazam's algorithms?
The discrete Fourier transform, which I described in the previous article, transforms a finite set of signal samples taken at regular intervals of time into a list of coefficients of a finite combination of complex sinusoids, ordered by frequency. It lets you study the spectrum of a signal and determine which frequencies are present in it and which are not. After that, you can filter, amplify or attenuate certain frequencies, recognize a sound of a certain pitch among the frequencies present, or extract a signature of the signal: take its "fingerprints", to put it simply.
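To make the idea concrete, here is a minimal sketch in Python of a naive discrete Fourier transform that reports which frequencies are present in a signal. The function names, the sample rate and the two test tones (5 Hz and 12 Hz) are assumptions chosen for the demo; a real application would use a fast Fourier transform instead of this slow version.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform: O(n^2), fine for a small demo."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def dominant_frequencies(samples, sample_rate, count=2):
    """Return the `count` strongest frequencies (in Hz) below the Nyquist limit."""
    n = len(samples)
    spectrum = dft(samples)
    # For a real-valued signal only the first half of the bins is unique.
    magnitudes = [(abs(spectrum[k]), k) for k in range(1, n // 2)]
    strongest = sorted(magnitudes, reverse=True)[:count]
    return sorted(k * sample_rate / n for _, k in strongest)

SAMPLE_RATE = 64  # samples per second (hypothetical value for the demo)

# One second of a test signal: a 5 Hz tone plus a quieter 12 Hz tone.
signal = [
    math.sin(2 * math.pi * 5 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 12 * t / SAMPLE_RATE)
    for t in range(SAMPLE_RATE)
]

found = dominant_frequencies(signal, SAMPLE_RATE)
print(found)  # the two tones that were mixed into the signal
```

The transform knows nothing about how the signal was built, yet the two strongest coefficients sit exactly at the frequencies that were mixed in.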
Now let's move on to the technical side of how Shazam works.
The common steps are:
- A card index of music imprints is created and saved in Shazam's database.
- The user "tags" a song they hear, and an imprint is generated from a ten-second audio sample.
- The application sends the imprint to the Shazam service, which looks for matches in the database.
- If a match is found, the user is notified and all the information about the track is displayed.
Here is how the imprinting works:
Shazam treats music as a simple graph, a spectrogram. Time runs along one axis (the x-axis), frequency along another (the y-axis), and the third, vertical axis shows intensity.
Here is an example of how the song might look:
The Shazam algorithm makes an imprint of a song by building this three-dimensional graph and picking out the frequencies of "peak intensity".
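A toy version of this step can be sketched in Python: cut the signal into frames, transform each frame to get one column of the spectrogram, and keep the frequency bin with the highest intensity. The frame size, the helper names and the two-tone "song" are assumptions made for illustration. Real systems typically use overlapping windows and keep several peaks per frame, which this sketch does not attempt.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of the DFT bins below Nyquist for one frame of samples."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]

def spectrogram(samples, frame_size):
    """Split the signal into non-overlapping frames and transform each one."""
    return [
        magnitude_spectrum(samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def peak_per_frame(spec):
    """For each time frame, return (frame_index, bin_with_highest_intensity)."""
    return [
        (i, max(range(len(row)), key=row.__getitem__))
        for i, row in enumerate(spec)
    ]

def tone(bin_index, n):
    """A sine wave that completes `bin_index` cycles over `n` samples."""
    return [math.sin(2 * math.pi * bin_index * t / n) for t in range(n)]

FRAME = 32  # samples per frame (hypothetical value)

# Toy "song": bin 3 is loudest in the first frame, bin 7 in the second.
song = tone(3, FRAME) + tone(7, FRAME)

peaks = peak_per_frame(spectrogram(song, FRAME))
print(peaks)  # one (time frame, peak frequency bin) pair per frame
```

The list of (time, frequency) peaks is far smaller than the raw audio, which is what makes it practical to store and compare.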
Shazam builds its catalog of imprints as a hash table in which the frequency value serves as the key. On receiving an imprint, Shazam uses its keys to look up similar songs. Their hash table might look like this:
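As a rough sketch of such a catalog (an illustration only, not Shazam's actual schema), the Python snippet below maps each peak frequency to the tracks and times where it occurs. The track titles, peak values and the `build_catalog` helper are all hypothetical.

```python
from collections import defaultdict

def build_catalog(tracks):
    """Map each peak frequency to the (track, time) pairs where it occurs."""
    catalog = defaultdict(list)
    for title, peaks in tracks.items():
        for time_s, freq_hz in peaks:
            catalog[freq_hz].append((title, time_s))
    return catalog

# Hypothetical peaks: (time in seconds, peak frequency in Hz).
tracks = {
    "Song A": [(0, 440), (1, 880), (2, 660)],
    "Song B": [(0, 523), (1, 880), (2, 784)],
}

catalog = build_catalog(tracks)

# Looking up a frequency from the user's sample returns candidate tracks.
print(catalog[880])
print(catalog[440])
```

A lookup by key is a single dictionary access, so the cost of finding candidates does not grow with the number of songs in the catalog in the average case.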
Some additional details:
They look for a pair of points: a "peak intensity" plus a second "reference point". The key therefore contains not a single frequency but the frequencies of both points. That leads to fewer collisions (when two different inputs produce the same hash key) and speeds up the search through the catalog, keeping lookups close to a hash table's average-case running time.
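The effect of pairing can be seen in a small Python sketch. Two hypothetical songs share the frequency 880 Hz, so single-frequency keys collide, while keys built from two frequencies do not. The `pair_keys` helper and its `fan_out` parameter (how many later peaks each peak is paired with) are assumptions for the demo, not details confirmed by Shazam.

```python
def pair_keys(peaks, fan_out=2):
    """Pair each peak with the next `fan_out` peaks.

    Returns (key, time) entries where key = (first frequency, second frequency)
    and time is when the first peak occurs.
    """
    entries = []
    for i, (t1, f1) in enumerate(peaks):
        for _t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            entries.append(((f1, f2), t1))
    return entries

# Hypothetical peaks: (time in seconds, peak frequency in Hz).
song_a = [(0, 440), (1, 880), (2, 660)]
song_b = [(0, 523), (1, 880), (2, 784)]

# Keys made of a single frequency: the two songs collide on 880 Hz.
shared_single = {f for _, f in song_a} & {f for _, f in song_b}

# Keys made of a frequency pair: no collision between the two songs.
shared_pairs = {k for k, _ in pair_keys(song_a)} & {k for k, _ in pair_keys(song_b)}

print(shared_single)  # frequencies present in both songs
print(shared_pairs)   # frequency pairs present in both songs
```

A pair carries more information than a single value, so it is much less likely that two unrelated songs produce the same key by chance.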
The top graph: the scatterplot of matching hash locations shows no diagonal, so the songs are not the same.
The bottom graph: the matching frequencies line up in time, so the songs are identical.
If there is more than one match between the songs, the time-frequency correspondence is checked. A two-dimensional plot of the points where matches occurred is built: one axis shows the time at which the frequency appears in the track, the other the corresponding time in the sample. If the points are correlated, they form a diagonal. If such a line is found, this is the song you were looking for, and its name is displayed to you.
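Points lie on a diagonal when the difference between the time in the track and the time in the sample is the same for all of them, so the check can be sketched in Python by counting how often each time offset occurs. The match lists below are invented for illustration, and `best_alignment` is a hypothetical helper name.

```python
from collections import Counter

def best_alignment(matches):
    """Find the most common time offset among matching points.

    matches: (time_in_track, time_in_sample) pairs for keys found in both.
    Returns (offset, count) for the offset shared by the most points.
    """
    offsets = Counter(track_t - sample_t for track_t, sample_t in matches)
    return offsets.most_common(1)[0]

# Same song: most points share one offset, so they form a diagonal.
same_song = [(40, 0), (42, 2), (45, 5), (47, 7), (13, 4)]

# Different song: the matches are accidental and the offsets are scattered.
different_song = [(10, 0), (55, 2), (3, 5), (80, 7)]

aligned = best_alignment(same_song)
scattered = best_alignment(different_song)

print(aligned)    # (offset, number of points on the diagonal)
print(scattered)  # no offset is shared by more than one point
```

In the first case four of the five points agree that the sample starts 40 seconds into the track; in the second case no two points agree, which is the "no diagonal" situation.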
So you can see that the basic idea of how Shazam works is not hard to understand, even though the full scheme is rather complicated. Keep in mind that this is only the basic algorithm: Shazam actually uses an improved version, and we can't know its details for sure, since developers keep them secret.
Follow me if you are a geek like me or want to learn more about technology and scientific/educational topics.
Alex aka @phenom
wow, thanks for this informative post! It's awesome
Thanks for the feedback. Stay tuned
Literally have always wondered how it works, shot for this.
Before I realised how this app works it was like a magic for me
Wow man, I have always wondered how this worked! Cheers!
Cheers, man. When I initially used Shazam it was like a magic for me too.
Useful information - especially for man, who have never used Shazam, like me)
It seems that you're one of a kind)
May be)
It's always amazed me how well shazam works. Thanks for the post.
Thanks for the feedback. Follow me to learn more about technologies and IT
Ah yes, i remember doing pattern recognition w/ MatLab in college , trying do reverse engineer the Shazam algorithm. Good times!
wow. I work a lot in Matlab. Absolutely amazing tool.
Have you finally reproduced shazam algorithm?
This is something I didn't know about at all! Thanks a lot!
It's interesting that it only takes a 10-second sample to compare two songs! Thanks for that explanation.
Thanks to the author for a very interesting material. Of all the above-described applications that help identify music, in my opinion, Shazam is the best. That is why many have the question of how to make shazam.