Discussion


Changing the bit value

The fact that we could reduce the number of bits per sample to a small fraction of its previous value was very interesting. In the case of the vocal clip, a bit depth as low as 1 can be used: the music is no longer quite so clear, but the words are still discernible. For the more electronic/synthetic music, however, reducing the bit depth had a much greater effect. This is due to the different tone types: in a cappella music the sounds are very nearly pure-tonal, composed of smooth sinusoids that require few bits to 'model' them; in electronic music the waveforms are more intricate, discontinuous and random, and so require a larger number of bits to achieve the same degree of accuracy.
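
To make the operation concrete, here is a minimal sketch of bit-depth reduction (illustrative only, not our project code; it assumes the samples are floats in the range [-1, 1]):

    import numpy as np

    def requantize(x, n_bits):
        """Round each sample to one of 2**n_bits evenly spaced levels in [-1, 1]."""
        levels = 2 ** n_bits
        q = np.round((x + 1.0) / 2.0 * (levels - 1))   # map onto integer levels and round
        return q / (levels - 1) * 2.0 - 1.0            # map back to [-1, 1]

    # Example: a 440 Hz tone at 8 kHz reduced to 1 bit per sample.
    fs = 8000
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 440 * t)
    one_bit = requantize(tone, 1)   # effectively a square wave: rough, but the pitch survives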


Changing the sampling rate

As you decrease the sampling rate you eventually lose the ability to recover the original signal from its sampled version (you must sample at or above the Nyquist rate, otherwise aliasing occurs in the frequency domain). Thus, by fs = 5 kHz the sound clip had become distorted and was definitely of inferior quality to the original.
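
A minimal sketch of naive downsampling (an illustration, not our project code) shows the aliasing described above: with no low-pass filter beforehand, any content above the new Nyquist frequency folds back into the audible band.

    import numpy as np

    def decimate_naive(x, factor):
        """Keep every `factor`-th sample; the new rate is fs / factor."""
        return x[::factor]

    fs = 40000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 2000 * t) + np.sin(2 * np.pi * 4000 * t)
    y = decimate_naive(x, 8)   # new rate 5 kHz, so the new Nyquist frequency is 2.5 kHz
    # The 2 kHz component survives; the 4 kHz component aliases down to 1 kHz.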


Quantization

Our modified signal still sounded very good because we removed only a few frequencies, and these were low-energy and therefore of little audible importance. However, if you look at the modified signal's representation in the frequency domain, you can see 'spikes' in the high-frequency range that were not removed because they did not fall below the quantization threshold. These spikes produce incoherent noise, which caused the static.
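
The step we call 'quantization' here can be sketched as a magnitude threshold on the FFT coefficients; the threshold value and function below are illustrative assumptions rather than the code we actually ran.

    import numpy as np

    def threshold_spectrum(x, rel_threshold=0.01):
        """Zero every frequency component whose magnitude falls below the threshold."""
        X = np.fft.rfft(x)
        X[np.abs(X) < rel_threshold * np.abs(X).max()] = 0
        return np.fft.irfft(X, n=len(x))

    # Strong high-frequency 'spikes' sit above the threshold, so they survive
    # into the reconstructed signal; this is what produced the static described above.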

Removing a range of frequencies

This method works because only the high frequencies with relatively low energies are removed; since the music usually lies within a limited, and low, range of frequencies, little of any import will be lost in eliminating the upper frequencies. Notably, the static heard when the signal was filtered using the quantization method above was not heard here: the frequency spikes observed before lie above the cutoff and so are removed along with everything else.
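
A minimal sketch of this band-removal step (the cutoff frequency is an arbitrary choice for illustration):

    import numpy as np

    def remove_above(x, fs, cutoff_hz):
        """Zero every frequency component above cutoff_hz and reconstruct the signal."""
        X = np.fft.rfft(x)
        freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
        X[freqs > cutoff_hz] = 0
        return np.fft.irfft(X, n=len(x))

    # e.g. filtered = remove_above(clip, 44100, 4000) keeps only content below 4 kHz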

Removing every second frequency

This is not really a valid method of compression, as 'important' frequencies are removed along with the less significant ones, and removing frequencies in this way distorts the phase badly. However, removing only one out of every two frequencies does not distort our signal's phase badly enough to make an audible difference. Thus, this was perhaps our 'best' compression.
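
A sketch of the operation, assuming it is done by zeroing alternate FFT coefficients (a real scheme would store only the kept coefficients, roughly halving the data):

    import numpy as np

    def drop_every_second_bin(x):
        """Discard every second frequency component and reconstruct."""
        X = np.fft.rfft(x)
        X[1::2] = 0            # zero bins 1, 3, 5, ...
        return np.fft.irfft(X, n=len(x))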


Retaining only every fifth frequency

The echo-repeat we observed in this compression was just a more distorted version of the one heard when removing one out of every two frequencies. We do not know exactly why this happened, but we hypothesize that the 'aliasing' effect was due to drastic phase distortion. Unfortunately, due to time constraints, we could not describe the processes involved mathematically.
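
The corresponding sketch for this case, again with the details assumed rather than taken from our code:

    import numpy as np

    def keep_every_fifth_bin(x):
        """Retain only every fifth frequency component, zeroing the rest."""
        X = np.fft.rfft(x)
        kept = np.zeros_like(X)
        kept[::5] = X[::5]     # keep bins 0, 5, 10, ...
        return np.fft.irfft(kept, n=len(x))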

Conclusion

We succeeded in our goal of filtering out low-energy frequencies. However, many of the methods we used to do this SHOULD NOT BE ATTEMPTED in real music compression, because systematic frequency removal is not a valid method of filtering. We hypothesize that the bad effects we observed were due to serious phase distortion and to the use of too large a sound clip (2^19 samples).

Potential future forays into the world of audio compression...

If we had had more time for this project, the following ideas might have been implemented:

  • Converting to the frequency domain, then taking the real and imaginary parts of the spectrum independently and fitting each data set to an nth-order polynomial (a rough sketch of this idea appears after this list). Encoding would take a lot of work, but decoding would be a simple process. However, unless the imaginary and real parts correlated exactly, this process would give us serious phase distortions.
  • Fitting Gaussian curves to frequency spikes.
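
A rough sketch of the first idea, with an arbitrary polynomial order chosen purely for illustration and the spectrum index rescaled to [0, 1] to keep the fit well behaved:

    import numpy as np

    def poly_encode(x, order=20):
        """Fit the real and imaginary parts of the spectrum with separate polynomials."""
        X = np.fft.rfft(x)
        k = np.linspace(0.0, 1.0, len(X))
        return np.polyfit(k, X.real, order), np.polyfit(k, X.imag, order), len(x)

    def poly_decode(real_coeffs, imag_coeffs, n):
        """Rebuild an approximate spectrum from the coefficients and invert it."""
        k = np.linspace(0.0, 1.0, n // 2 + 1)
        X = np.polyval(real_coeffs, k) + 1j * np.polyval(imag_coeffs, k)
        return np.fft.irfft(X, n=n)

    # The stored data would be just the two coefficient vectors plus the clip length,
    # and decoding is a single polynomial evaluation followed by an inverse FFT.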
