Generalized Audio compression in Matlab

by Maya Barley and Mark Yun

barley@rice.edu
cynical@rice.edu

Our Goal: given a sound sample, we will show the effect of removing its low-energy frequency components, using filtering together with FFTs and IFFTs in Matlab.

Plan

Step 1: Produce a wav sound clip of 2^19 samples. A sample length that is a power of 2 should be used so that an FFT, rather than a direct DFT, can be used; this reduces calculation time tremendously. Because the sampling rate is 44160 samples/sec, this is equivalent to an 11.87 second sound sample. The segment will be produced using Soundeditor.
Step 2: Import this sound file into Matlab using the command wavread.
Step 3: Transform sound clip using FFT.
Step 4: Filter waveform in frequency domain using a filtering technique.
Step 5 (optional): Plot the spectrograms of both the unmodified wav file and the compressed version, and observe the differences in energy. Comparing the spectrograms of the modified and unmodified waves should show how effective each compression is.
Step 6: Reverse transform.
Step 7: Export sound file and play using matlab command soundsc.
Step 8: Compare effects of filtering techniques.
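
The steps above could be sketched end to end roughly as follows. This is a minimal illustration, not the exact code used in the project: it substitutes a synthetic two-tone signal for the imported wav file so that it runs without any sound file, the tones are placed exactly on FFT bins to keep the result clean, and the names fs, N, x, X, L, z and y are chosen for illustration only. Note that in current Matlab releases wavread has been replaced by audioread.

```matlab
% Sketch of Steps 1-7 on a synthetic signal (assumption: stands in for the wav clip).
fs = 44160;                  % sampling rate stated in Step 1 (samples/sec)
N  = 2^19;                   % power-of-2 length so the FFT applies directly
duration = N / fs;           % clip length in seconds (about 11.87)

n  = (0:N-1)';               % sample index as a column vector
k1 = 5224;  k2 = 59000;      % hypothetical FFT bin numbers for the two tones
x  = sin(2*pi*k1*n/N) + 0.001*sin(2*pi*k2*n/N);   % strong tone plus a weak tone

X = fft(x);                  % Step 3: transform
scale = norm(X);             % remember the scale so it can be restored later
L = X / scale;               % normalize (2-norm; same as Frobenius norm for a vector)

z = 0.01;                    % threshold chosen for this synthetic signal only
L(abs(L) < z) = 0;           % Step 4: drop low-energy coefficients

y = real(ifft(L * scale));   % Step 6: reverse transform; real() removes round-off
% soundsc(y, fs);            % Step 7: uncomment to listen
```

In this synthetic case only the two coefficients of the strong tone survive, and y differs from x by the weak tone that was removed.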

Progress Report (last updated: 12/13/99)

As of 12/4/99

We have succeeded in importing our sound clip into Matlab and have Fourier-transformed, normalized (using the Frobenius norm), quantized and reverse-transformed it. The for loop we wrote is shown below, where L = transformed sound wave and z = quantization level:

for i = 1:length(L)
    if abs(L(i)) < z    % coefficient magnitude is below the quantization level
        L(i) = 0;       % discard it
    end
end
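
The same thresholding can also be written without a loop by using a logical mask. The sketch below applies both forms to a small made-up coefficient vector and counts how many coefficients survive, which gives a rough idea of how much of the signal is retained. The values and the names Lloop, Lvec, kept and fractionKept are illustrative assumptions.

```matlab
% Small made-up coefficient vector (illustrative values only)
L = [0.5; 0.0004; -0.02; 0.0009i; 0.3-0.4i; 0.005];
z = 0.001;                       % quantization level

% Loop form, as in the report
Lloop = L;
for i = 1:length(Lloop)
    if abs(Lloop(i)) < z
        Lloop(i) = 0;
    end
end

% Equivalent vectorized form using a logical mask
Lvec = L;
Lvec(abs(Lvec) < z) = 0;

kept = nnz(Lvec);                % number of coefficients that survive
fractionKept = kept / numel(L);  % rough measure of how much is retained
```

Both forms give the same result; the vectorized form is usually much faster on a vector of 2^19 coefficients.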

After trial and error, we discovered that a value of z = 0.01 yields a sound file highly inferior to the original clip; the compressed version seems to be composed mainly of tones (a result of eradicating most of the high frequencies). We were expecting this 'limit' to be higher than 0.01, but on reflection it is evident that normalizing the transformed signal reduces the frequency coefficient values considerably, so a value of 0.01 is not surprising. If z is set to 0.001, however, the sound file is audibly indistinguishable from the original.
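
To see why thresholds this small are plausible: after dividing by the norm, the squared magnitudes of all N coefficients sum to 1, so if the energy were spread evenly, each coefficient would have magnitude 1/sqrt(N), which is about 0.0014 for N = 2^19. The sketch below checks this on white noise. The noise is an assumption standing in for the music clip; real music concentrates its energy in fewer coefficients, so the fractions it gives will differ.

```matlab
N = 2^19;                        % clip length from Step 1
x = randn(N, 1);                 % white-noise stand-in (assumption, not our clip)

L = fft(x);
L = L / norm(L);                 % unit 2-norm: sum(abs(L).^2) equals 1

evenLevel = 1 / sqrt(N);         % magnitude if energy were spread evenly (~0.0014)

% Fraction of coefficients that each threshold would set to zero
fracZeroed_01  = mean(abs(L) < 0.01);
fracZeroed_001 = mean(abs(L) < 0.001);
```

A threshold of 0.01 is roughly seven times the even-energy level, so it can only be exceeded by coefficients that carry far more than their share of the energy, such as strong sustained tones.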

We also looked at our modified sound wave's spectrogram using the siganalysis Matlab software found at http://www.owlnet.rice.edu/~elec241/. By comparing the spectrograms of the modified and unmodified versions of our sound file, we hope to improve our final product and make it more efficient by finding the optimum quantization level, filtering technique and type of transform.
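
As a rough stand-in for the siganalysis tool, a spectrogram can also be computed directly by applying fft to short windowed frames. The sketch below is only an illustration under stated assumptions: the frame length, hop size and hand-built Hann window are arbitrary choices, not settings taken from siganalysis, and the two-tone test signal is synthetic.

```matlab
fs = 44160;                           % sampling rate from Step 1
N  = 2^14;                            % short synthetic clip to keep the example fast
t  = (0:N-1)' / fs;
x  = sin(2*pi*440*t) + 0.5*sin(2*pi*3000*t);   % synthetic two-tone signal

frameLen = 512;                       % samples per frame (illustrative)
hop      = 256;                       % step between frames (illustrative)
w = 0.5 - 0.5*cos(2*pi*(0:frameLen-1)'/frameLen);   % Hann window built by hand

numFrames = floor((N - frameLen) / hop) + 1;
S = zeros(frameLen/2 + 1, numFrames); % magnitude per frequency bin and frame
for k = 1:numFrames
    idx = (k-1)*hop + (1:frameLen)';  % sample indices of frame k
    F   = fft(x(idx) .* w);
    S(:, k) = abs(F(1:frameLen/2 + 1));   % keep non-negative frequencies only
end

f = (0:frameLen/2)' * fs / frameLen;  % frequency axis in Hz
% imagesc((0:numFrames-1)*hop/fs, f, 20*log10(S + eps)); axis xy;
% xlabel('Time (s)'); ylabel('Frequency (Hz)');
```

Running the same code on the original and on the reverse-transformed signal, and plotting the two S matrices side by side, shows which time-frequency regions lost energy.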

These are the sound clips that we used in our compression project:

Fatboy Slim
Opera
Plastic Jesus (a cappella)

