OpenFP

What is OpenFP?

OpenFP creates audio fingerprint files from music tracks. The OpenFP client then matches them against a set of reference fingerprints provided by the OpenFP server. The following picture provides an overview of the tooling:



Before using the matching software you need to extract fingerprints using the extraction tool openfp_extract. All fingerprints are then loaded into the openfp_server during startup.
  1. Match requests against the server are issued with the openfp_match client.
  2. The client queries the server and presents the results.

How does it work?

Audio Extraction

Input audio is processed with libfftw3, for which the audio must be available as raw PCM16 data. This format can be created using ffmpeg with the parameters "-f u16le -acodec pcm_s16le". openfp_extract does this automatically and therefore accepts any input supported by ffmpeg (video and audio files).
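The conversion described above can be sketched as follows. This is a minimal illustration and not the code used by openfp_extract: the function names, the mono downmix ("-ac 1") and the 8000 Hz sample rate ("-ar") are assumptions made for the example. Only the "-f u16le -acodec pcm_s16le" parameters come from the text.

```python
import array
import subprocess
import sys


def build_ffmpeg_command(input_path, sample_rate=8000):
    """Return an ffmpeg argument list that writes raw PCM16 to stdout."""
    return [
        "ffmpeg", "-i", input_path,
        "-vn",                                   # ignore any video stream
        "-ac", "1",                              # mono downmix (assumption)
        "-ar", str(sample_rate),                 # sample rate (assumption)
        "-f", "u16le", "-acodec", "pcm_s16le",   # parameters quoted in the text
        "-",                                     # write to stdout
    ]


def pcm16_to_samples(raw):
    """Convert little-endian PCM16 bytes to a list of signed integers."""
    raw = raw[: len(raw) - len(raw) % 2]   # drop a trailing partial sample
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()
    return samples.tolist()


def decode(input_path, sample_rate=8000):
    """Run ffmpeg and return the decoded samples (requires ffmpeg on PATH)."""
    result = subprocess.run(
        build_ffmpeg_command(input_path, sample_rate),
        stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, check=True,
    )
    return pcm16_to_samples(result.stdout)
```

Calling decode("track.mp3") would return the samples of a hypothetical input file as a plain list of integers, ready for windowing.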

Fingerprint Extraction

The following graph is a block diagram describing the fingerprint extraction.



Steps performed:
  1. Extract raw audio data using an audio tool supporting the relevant codecs (e.g. ffmpeg)
  2. Apply a window function to the audio stream
  3. Fourier-transform the data to create a power spectrum
  4. Reduce the power spectrum to the relevant power bands (Bark bands)
  5. Reduce noise using any type of low-pass filter. In our tests, an IIR low-pass without output reduction gave the best results. The current implementation uses a 12 Hz IIR low-pass (5th-order Butterworth)
  6. Quantize the values to binary flags (energy band active/inactive)
  7. Reduce the output data (e.g. drop 3 out of 4 results)
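Steps 2 to 7 can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the OpenFP implementation: the frame size, hop size and smoothing factor are made up, a naive DFT stands in for libfftw3, equal-width bands stand in for real Bark band edges, a one-pole IIR low-pass stands in for the 5th-order Butterworth, and the quantization rule (bit set when the smoothed band energy rose since the previous frame) is one possible reading of "active/inactive".

```python
import cmath
import math

FRAME = 128      # samples per analysis frame (assumed)
HOP = 64         # frame advance in samples (assumed)
BANDS = 32       # one bit per band gives a 32-bit subfingerprint
ALPHA = 0.3      # smoothing factor of the stand-in one-pole low-pass (assumed)
KEEP_EVERY = 4   # step 7: keep 1 of 4 results


def hann_window(n):
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..n/2 (a real tool would use an FFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(acc.real ** 2 + acc.imag ** 2)
    return spectrum


def band_energies(spectrum, bands=BANDS):
    """Sum bins into equal-width bands, skipping DC (placeholder for Bark bands)."""
    width = (len(spectrum) - 1) // bands
    return [sum(spectrum[1 + b * width: 1 + (b + 1) * width]) for b in range(bands)]


def extract_subfingerprints(samples):
    window = hann_window(FRAME)
    smoothed = None
    previous = None
    results = []
    for start in range(0, len(samples) - FRAME + 1, HOP):
        # Step 2: apply the window function
        frame = [s * w for s, w in zip(samples[start:start + FRAME], window)]
        # Steps 3 and 4: power spectrum reduced to band energies
        energies = band_energies(power_spectrum(frame))
        # Step 5: low-pass each band over time (one-pole IIR stand-in)
        if smoothed is None:
            smoothed = energies
        else:
            smoothed = [ALPHA * e + (1.0 - ALPHA) * s for e, s in zip(energies, smoothed)]
        # Step 6: one bit per band, set when the smoothed energy increased
        if previous is not None:
            bits = 0
            for band, (now, before) in enumerate(zip(smoothed, previous)):
                if now > before:
                    bits |= 1 << band
            results.append(bits)
        previous = smoothed
    # Step 7: reduce the output data
    return results[::KEEP_EVERY]
```

Each returned integer is one 32-bit subfingerprint. Silence yields all-zero values, because no band energy ever rises.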

Fingerprints

The result of the fingerprint extraction is a set of 32-bit subfingerprints stored in a single output file representing the original audio file. Each bit in each of the 32-bit values describes the spectral power in a specific Bark band. Because a single bit cannot encode an actual energy value, it encodes only a difference: an edge in the band's energy (as produced by the low-pass filtering).
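As an illustration of the bit layout, the following sketch packs 32 band flags into one subfingerprint and unpacks them again. The helper names and the bit order (bit 0 for the lowest band) are assumptions. The text does not specify how OpenFP orders the bands or how the output file is laid out.

```python
def pack_flags(flags):
    """Pack 32 band flags into one integer (bit 0 = lowest band, assumed order)."""
    if len(flags) != 32:
        raise ValueError("expected exactly 32 band flags")
    value = 0
    for band, flag in enumerate(flags):
        if flag:
            value |= 1 << band
    return value


def unpack_flags(value):
    """Return the 32 band flags stored in a subfingerprint."""
    return [bool((value >> band) & 1) for band in range(32)]
```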

To speed up matching we calculate MFCC feature vectors, which are used to cluster the subfingerprints of all fingerprints. These feature vectors are stored in the fingerprint file every n subfingerprints. Clustering happens during startup of the server process.

Fingerprint Matching

Matching is performed by the openfp_server, which answers requests sent by openfp_match. Matching is a two-step process:

  1. Find a suitable match position
  2. Evaluate the fingerprints following the match position

Suitable match positions are found by comparing one or more randomly chosen fingerprints from the audio sample under test with all reference fingerprints. A position is considered suitable if the Hamming distance between the two compared fingerprints falls below a certain threshold. For every suitable position, the average Hamming distance over the following fingerprints (corresponding to a few seconds of audio) is then calculated. Only if this average is also below the matching threshold is the fingerprint considered a real match.
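The two steps can be sketched as follows. This is a minimal illustration and not the openfp_server code: the function names, the threshold of 8 bits, the run length of 16 values and the number of probes are assumptions, and no clustering is used, so every probe is compared against every reference value.

```python
import random


def hamming32(a, b):
    """Number of differing bits between two 32-bit values."""
    return bin((a ^ b) & 0xFFFFFFFF).count("1")


def match(query, reference, threshold=8, run_length=16, probes=3, rng=None):
    """Return (offset of the query in the reference, average distance) or None."""
    rng = rng or random.Random()
    last_probe = max(1, len(query) - run_length + 1)
    best = None
    for _ in range(probes):
        # Step 1: compare one randomly chosen value with all reference values
        i = rng.randrange(last_probe)
        for pos, ref in enumerate(reference):
            if hamming32(query[i], ref) >= threshold:
                continue
            # Step 2: average distance over the values following the position
            q = query[i:i + run_length]
            r = reference[pos:pos + run_length]
            if len(r) < len(q):
                continue   # reference ends before the run is complete
            average = sum(hamming32(a, b) for a, b in zip(q, r)) / len(q)
            if average < threshold and (best is None or average < best[1]):
                best = (pos - i, average)
    return best
```

The cheap single-value comparison in step 1 discards most positions, so the more expensive average in step 2 is only computed for a handful of candidates.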

The two-step matching is motivated by performance considerations. The openfp_server implementation also uses fingerprint clustering to further eliminate unnecessary comparisons.
