Data compression is the coding of data to save storage space or transmission time.
For example, run-length encoding replaces strings of repeated characters (or other units of data) with a single character and a count.
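As a rough illustration of run-length encoding, the following Python sketch collapses each run of identical characters into a (character, count) pair and then expands the pairs again. The function names rle_encode and rle_decode and the sample string are made up for this example; practical run-length coders use more compact output formats.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (character, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


sample = "AAAABBBCCD"  # hypothetical input with obvious runs
encoded = rle_encode(sample)
print(encoded)  # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
assert rle_decode(encoded) == sample
```

Note that data without long runs gains nothing from this scheme: every single character still costs one pair.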
There are many compression algorithms and utilities.
The standard Unix compression utility is called compress, though GNU's superior gzip has largely replaced it.
Other compression utilities include pack, zip and PKZIP.
The following procedure shows how to compress and decompress a file named Sequential.txt by using the Unix compression tool gzip and the decompression tool gunzip.
shell> cat Sequential.txt
File Structures: An Object-Oriented Approach with C++ |0201874016|94.80|360|
Learning WML & WMLScript |1565929470|17.48|12|
XML in a Nutshell, 2nd Edition|0596002920|39.95|39|
Java and XSLT |0596001436|26.37|890|
WAP Servlets: Developing Dynamic Web Content With Java and WML|047139307|32.99|4|
WAP Development with WML and WMLScript|0672319462|18.99|56|
Advances in Security and Payment Methods for Mobile Commerce|1591403456|89.95|182|
M Commerce: Technologies, Services, and Business Models |0471135852|23.09|5|
Mobile Commerce |0521797561|29.51|93|
Dynamic WAP Application Development |1930110081|34.59|18|
shell> ls -l Sequential.txt
-rw-r--r-- 1 wenchen faculty 665 Feb 12 14:19 Sequential.txt
shell> gzip Sequential.txt
shell> ls -l Sequential.txt.gz
-rw-r--r-- 1 wenchen faculty 457 Feb 12 14:19 Sequential.txt.gz
shell> cat Sequential.txt.gz
ØBSequential.txt]ËnÛ0E÷þYuTàKvç8í¢° lej3)C¢]ïq´H ¹wæ\êÚ87köúÂZ7ùceUZeM-HÖ
6c iH*Öäzå~ _§8ïqîógzøÞûèÇ@¢´Ê*f [ò;vç:î{m×/píFW¤ªB×ÔX^l»|§ó1c>âñ6}¼
îàlq«yC mâ¼3 SKmµ¨I«ÂZ2yñd{µ:$]Î'¡_eà$ªZiiM¥H6I]V´Xöç.8vî4ùxɺ
çîÍ6÷c?Ãïq͸Kw9rV¡ ÿG®dm벤lQJ²·X9"ïºtÿÄÕSRF6EiÙ.¯Èשõi
shell> gunzip Sequential.txt.gz
Compressed data must be decompressed before it can be used.
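The same compress-then-decompress round trip can be sketched in Python with the standard gzip module, which reads and writes the gzip format. The input here is an assumption made for illustration: one record from Sequential.txt repeated twenty times so that there is redundancy to remove.

```python
import gzip

# Assumed sample data: one record from Sequential.txt, repeated 20 times.
original = b"Mobile Commerce |0521797561|29.51|93|\n" * 20

compressed = gzip.compress(original)    # in-memory equivalent of gzip
restored = gzip.decompress(compressed)  # in-memory equivalent of gunzip

print(len(original), "bytes before,", len(compressed), "bytes after")

# The compressed bytes are not readable text; they start with the gzip magic number.
assert compressed[:2] == b"\x1f\x8b"
# Decompression recovers the original data exactly.
assert restored == original
```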
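When compressing several similar files, it is usually better to join the files together into an archive of some kind (using tar, for example) and then compress the archive, rather than to join together individually compressed files. Compressing the archive as a whole lets the compressor exploit the redundancy between files, not just within each one.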
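The effect can be checked with a small Python experiment using the standard tarfile and gzip modules. This is only a sketch under stated assumptions: the twenty file names and their contents are hypothetical (a few shared records from Sequential.txt plus one differing line per file), and everything is done in memory rather than on disk.

```python
import gzip
import io
import tarfile

# Hypothetical set of similar files: shared records plus one line that differs.
shared = (
    "XML in a Nutshell, 2nd Edition|0596002920|39.95|39|\n"
    "Java and XSLT |0596001436|26.37|890|\n"
    "Mobile Commerce |0521797561|29.51|93|\n"
)
files = {f"record{i}.txt": (shared + f"copy {i}\n").encode() for i in range(20)}

# Option 1: compress every file on its own and add up the sizes.
individually = sum(len(gzip.compress(data)) for data in files.values())

# Option 2: join the files into one uncompressed tar archive, then compress it.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name, data in files.items():
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
archived = len(gzip.compress(buf.getvalue()))

print("compressed individually:", individually, "bytes")
print("archived, then compressed:", archived, "bytes")
assert archived < individually
```

For these inputs the single compressed archive comes out smaller, because the shared records are stored once and later copies are encoded as references to earlier ones. Files with little in common would show a much smaller difference.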