Compression Algorithms – A Brief Compendium

Compression algorithms comes under the discussion when the world is dealing with modern day challenge of digital storage space management which is how to contain the high quality and large size digital files in a smart way. The larger size of a file is really hectic for a person who have less storage space in his computer, cell phone, or other media storage devices. Another problem is that how to move or share files with others over the internet, Bluetooth? He might need to buy a USB drive of a huge storage capacity just to copy files from one computer to another. All these factors related to the large file size are resulting in an increase in expense of money and wastage of time. This is the more practical theory which encourages the scientists to compress the huge size of files without losing the quality of file as maximum as possible. Here in this article let’s try to get knowledge about most widely used compression algorithms and their types:

Lossless Compression Algorithms

As the name implies the lossless compression algorithms belong to a category of data compression algorithms that compress the files without losing any of their content. It means that lossless compression algorithms can accurately reconstruct the original data from the compressed data. Many different algorithms are designed either with a typical type of input data in mind or by assuming about what kinds of redundancy the uncompressed data are likely to contain.

Following is the brief explanation about some of the most widely used lossless compression algorithms:

bzip2

This algorithm uses the Burrows–Wheeler algorithm with RLE and Huffman coding to compress the data. It is used to compress the files only without archiving them. The compressed files are usually saved with .bz2 extension.

Huffman encoding

This algorithm is based on a specific method for selecting the identity for each symbol, resulting in a prefix code. Huffman coding is such a widespread method for creating prefix codes. The compression files with extensions such as .mpq, .acr, .jpeg, .png, .zip are supported by Huffman encoding.

Lempel-Ziv compression

This compression algorithm is also known as LZ77 and LZ78 are the two lossless data compression algorithms. The combination of these algorithms is based on many variations including LZWLZSSLZMA and others. Both of them are theoretically dictionary coders. During compression, the LZ77 maintains a sliding window. Later or it was later shown to be equivalent to the explicit dictionary constructed by LZ78. Therefore, they become equivalent on decompression of the entire data. The files with .lzma, .lzo, .lz, .lzh extensions are supported by Lempel-Ziv compression.

Prediction by partial matching (PPM)

Prediction by partial matching  which is also known as PPM is a compression algorithm based on prediction and context modeling. To predict the next symbol in a stream, the PPM models use a set of earlier symbols in the uncompressed symbol stream. PPM algorithm supports the ZIP and 7Z files.

Run-length encoding (RLE)

This algorithm is also known as RLE lossless compression algorithm based on sequences containing the same data value occurs in many adjacent data elements. These sequences are called runs. The RLE stored each run as a single data value and count. This is beneficial on data that contains many runs, such as simple graphic images, e.g. drawings, icons, line and animations. The files with .psd, .psb, .tga extensions are supported by RLE

Lossy Compression Algorithms

The lossy compression algorithms are step ahead in order to reduce the storage size of files. Whereas, loss of some information is accepted as dropping non-essential detail. Lossy data compression algorithms are formed by research on how people understand the data. Most lossy compression algorithms are based on transform coding.

Some of the famous lossy compression algorithms are briefly explained below:

Discrete cosine transform (DCT)

Discrete cosine transform (DCT) is a limited sequence of data points in terms of a sum of cosine functions fluctuating at different frequencies. It is used in most digital media, including digital images such as JPEG, HEIF, J2K, EXIF and DNG.

Wavelet compression

Wavelet compression is a lossy compression algorithm which is most commonly used in image compression. This algorithm uses a principal called transform coding in which a wavelet transform is applied initially. This creates as many coefficients as there are pixels in the image. Since the information is statistically concentrated in just a few coefficients, These coefficients can be compressed more easily. Notable implementations are JPEG 2000DjVu and ECW for still images.

Cartesian perceptual compression (CPC)

This lossy compression is also known as CPC was created for high compression of black-and-white raster imaging from archival scans. The algorithm is commonly used in the web distribution of legal documents, geographical plot maps, and design plans.

Fractal compression

Fractal compression is a lossy compression algorithm for fractal based digital images. The algorithm is suitable for natural images and textures, relying on parts of an image similar to the other parts of the same image. Fractal algorithms convert these parts into fractal codes which are used to recreate the encoded image.

Conclusion

In this article, you have learned about compression algorithms, their major types and commonly used compression algorithms. It is not necessary to keep the knowledge in your mind about all of the compression algorithms. But if you need to create a smart presentation on the topic of various lossy or lossless compressions, you can get help from here. Hence, bookmark this blog page as a reference.