Data compression is the process of reducing the size of a data file. A compressed file is easier and faster to transmit over a network or the internet. A compressed archive can also hold multiple files together, which makes file handling easier. Linux comes with effective tools that can reduce file size quickly.
In this article, we discuss some important file archive and compression tools that are used on Linux and Unix operating systems.
What is a File Archiver?
A file archiver is a program that combines multiple files or directories into one archive file for easier transportation or storage. A file archiver may also apply lossless data compression in its format to reduce the size of the archive. Some important file archivers are discussed below.
tar(Tape Archive) –
tar is a file archiver that collects many files into one archive file, commonly called a tarball. The name is derived from (t)ape (ar)chive: it was originally developed to write data to sequential I/O devices that have no filesystem of their own. It is widely used on Unix-based systems.
Syntax:
tar [options] <name of the tar archive> [files and directories which to add into archive]
Options that can be used with the command are –
OPTIONS | DESCRIPTION |
---|---|
-c or --create | Create a new archive |
-a or --auto-compress | Additionally compress the archive with a compressor chosen from the file name extension of the archive. For example, if the archive name ends with .tar.gz, gzip is used; if it ends with .tar.xz, xz is used, and so on |
-r or --append | Append files to the end of an archive |
-x or --extract or --get | Extract files from an archive |
-f or --file | Specify the archive name |
-t or --list | Show the list of files and folders in the archive |
-v or --verbose | Shows a list of processed files |
Basic usage-
Create an archive file –
tar -cvf archive.tar abc.txt xyz
This command creates an archive file named archive.tar that contains the file abc.txt and the directory xyz.
Extract contents of archive.tar into the current working directory –
tar -xvf archive.tar
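The create, list, and extract options from the table can be combined into a small round trip. The following is a minimal sketch, assuming a POSIX shell; the temporary directory, the sample contents, and the restored directory are hypothetical names chosen for illustration. It also uses -C, which is not listed in the table above, to extract into a different directory.

```shell
set -e

# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Sample inputs mirroring the example above: one file and one directory.
echo "hello" > abc.txt
mkdir xyz
echo "nested" > xyz/note.txt

# Create the archive, then list its contents without extracting.
tar -cf archive.tar abc.txt xyz
tar -tf archive.tar

# Extract into a separate directory (-C) instead of the current one.
mkdir restored
tar -xf archive.tar -C restored

# cmp exits non-zero if the restored copy differs from the original.
cmp abc.txt restored/abc.txt && echo "round trip OK"
```

Listing with -t before extracting is a cheap way to check what an archive will write into the target directory.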
shar(Shell Archive) –
shar, an abbreviation of shell archive, is a simple and quick command-line archiving utility. It uses the .shar file extension. To install this utility on your system, use the following command-
sudo apt-get install sharutils
(On Debian and similar distributions)
To archive a file, use the given command –
shar file_name > file_name.shar
To extract it use the given command-
unshar file_name.shar
ar(Archiver) –
ar is also a command-line utility used on Unix and Unix-like systems that combines a group of files into a single archive file. An archive created by ar can be used for any purpose, but it is generally used to create and update static library files. The .deb packages used by Debian are also ar archives. It uses the .a, .lib, or .ar file extension.
To create an archive use the given command-
ar cvsr file_name.a file_name
To extract that archive-
ar -xv file_name.a
cpio(Copy Input Output) –
It is another file archiver that is used in Unix-like operating systems. Originally it was designed to store backup file archives on a tape device in a sequential and contiguous manner. The archive file uses .cpio as the file extension.
To create an archive, pass the list of file names to cpio on standard input, for example from ls-
ls | cpio -ov > /home/username/backup.cpio
And to extract it use the given command-
cpio -idv <backup.cpio
ISO Image –
An ISO image is also an archive file. It contains all the files that would be written to an optical disc, including the optical disc filesystem. ISO image files use the .iso file extension. genisoimage is a command-line tool for creating ISO 9660 filesystem images, which can be burnt to a CD or DVD with a disc burning tool.
sudo apt-get install genisoimage
(Use in Debian or similar Linux distributions)
Use the given command to create an iso image from a file or directory
genisoimage -o output_demo.iso /home/lalit/Desktop/demo
To create an ISO image from a CD or DVD, we will use the dd tool.
First, unmount the device if it is already mounted
umount /dev/cdrom
(Your device may have a different name; use it accordingly)
dd if=/dev/cdrom of=~/cd_image.iso
Here, if and of mean input file and output file respectively.
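Because dd simply copies bytes from its input to its output, the same invocation can be tried safely on a regular file before pointing it at a real device. The sketch below assumes a POSIX shell and uses a hypothetical file, fake_disc.img, as a stand-in for /dev/cdrom; the bs= block size is an illustrative choice, not a requirement.

```shell
set -e

# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the optical drive: 1 MiB of random data (hypothetical source).
head -c 1048576 /dev/urandom > fake_disc.img

# if= names the input, of= names the output, bs= sets the block size.
# dd prints its transfer statistics on standard error; they are hidden here.
dd if=fake_disc.img of=cd_image.iso bs=64k 2>/dev/null

# A byte-for-byte comparison confirms that the copy is complete.
cmp fake_disc.img cd_image.iso && echo "image matches source"
```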
What is a File Compression Utility?
A compression utility is a program that is used to reduce the size of files. These utilities use different algorithms to compress the files into different file formats. These are also known as data compressors. Some of these utilities used in Linux or Unix are discussed below.
gzip File Compression –
gzip is both a file format and a software application. It was created by Jean-loup Gailly and Mark Adler as a free software replacement for the compression programs used on Unix systems at that time. The gzip file format is based on the DEFLATE algorithm, which is a combination of two other algorithms, LZ77 and Huffman coding. It uses the .gz file extension.
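LZ77 works by replacing repeated byte sequences with references to earlier occurrences, so the amount of repetition in the input largely determines how much gzip can save. The following sketch, assuming a POSIX shell with gzip installed, compresses two hypothetical inputs of the same size, one highly repetitive and one random, and prints the sizes for comparison.

```shell
set -e

# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# 100000 bytes of a repeating pattern: many repeated sequences for LZ77.
yes "abcdefghi" | head -c 100000 > repetitive.txt

# 100000 bytes of random data: almost no repetition to exploit.
head -c 100000 /dev/urandom > random.bin

# -c writes to standard output, so the original files are left in place.
gzip -c repetitive.txt > repetitive.txt.gz
gzip -c random.bin > random.bin.gz

# Print the size in bytes of each file.
wc -c repetitive.txt repetitive.txt.gz random.bin random.bin.gz
```

The repetitive file should shrink to a small fraction of its size, while the random file stays roughly the same size or grows slightly because of the gzip header.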
You can install the gzip utility with the following commands –
sudo apt-get install gzip
(On Debian and similar systems)
sudo yum install gzip
(On systems that use the yum package manager, such as CentOS)
Use the given command to compress a file –
gzip file_name
Use the given command to display the details of a compressed file –
gzip -l file_name.gz
Use the given command to decompress the file –
gzip -d file_name.gz
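The three gzip commands above can be chained into a round trip that also shows how gzip replaces the file it works on. This is a minimal sketch, assuming a POSIX shell; notes.txt and its contents are hypothetical. It additionally uses -t, which tests the integrity of a compressed file.

```shell
set -e

# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Sample file with repeated lines, plus a copy kept for comparison.
yes "sample line of text" | head -n 1000 > notes.txt
cp notes.txt notes.orig

# gzip replaces notes.txt with notes.txt.gz.
gzip notes.txt
ls

# -l reports the compressed size, uncompressed size, and ratio.
gzip -l notes.txt.gz

# -t tests the integrity of the compressed file without extracting it.
gzip -t notes.txt.gz && echo "compressed file is intact"

# -d restores notes.txt and removes notes.txt.gz.
gzip -d notes.txt.gz
cmp notes.txt notes.orig && echo "round trip OK"
```

To keep the original file while compressing, `gzip -c notes.txt > notes.txt.gz` writes the compressed data to standard output instead of replacing the input.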
To create a tar with gzip (.tar.gz) compression use the following command-
tar czf file_name.tar.gz file(s)
Extract a .tar.gz file by executing the command that is given below-
tar xzf file.tar.gz
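The z option is a shortcut for passing the archive through gzip, which makes the division of labour visible: tar bundles the files and gzip compresses the resulting stream. The sketch below, assuming a POSIX shell, builds the same archive in one step and in two steps; the project directory and the output file names are hypothetical.

```shell
set -e

# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# A small sample directory to archive.
mkdir project
echo "readme" > project/README
echo "data" > project/data.txt

# One step: z filters the archive through gzip.
tar -czf one_step.tar.gz project

# Two steps: write the archive to standard output (-f -) and compress it.
tar -cf - project | gzip > two_step.tar.gz

# Either file can be read with either method.
tar -tzf two_step.tar.gz
gzip -dc one_step.tar.gz | tar -tf -

# Extract into another directory and compare it with the original tree.
mkdir out
tar -xzf two_step.tar.gz -C out
diff -r project out/project && echo "round trip OK"
```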
lzma File Compression –
lzma is an abbreviation of the Lempel-Ziv-Markov chain algorithm, which performs lossless data compression using a dictionary compression scheme. The compressed file uses the .lzma extension.
To compress a file with lzma, you can use the following command (-c is the short form of --stdout, so only one of the two is needed)-
lzma -c file_name > file_name.lzma
To decompress a file that was compressed with lzma, use the following command-
lzma -d -c file_name.lzma > file_name
xz File Compression –
xz is a lossless data compression file format based on LZMA compression; it uses the .xz file extension. The xz utility compresses and decompresses files in both the xz and lzma file formats, and it is the successor of the lzma utility. It can compress several files in a single command, producing one compressed file per input file.
To compress a file use the given command
xz file_name
or
xz file_name.tar
To decompress a file use the given command
xz -d file_name.xz
or
unxz file_name.xz
or
unxz file_name.tar.xz
bzip2 File Compression –
bzip2 is a file compression utility that uses the Burrows-Wheeler algorithm. It compresses most files more effectively than gzip, though it is generally slower. It uses the .bz2 file extension.
To compress a file, use the following command-
bzip2 file_name
To decompress a file
bzip2 -d file_name.bz2
or
bunzip2 file_name.bz2
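How the compressors compare depends on the data, so it is worth measuring on a sample of your own. The following is a rough sketch, assuming a POSIX shell; it generates a hypothetical text file, then runs whichever of gzip, bzip2, and xz are installed and prints the resulting sizes. The output files are named after the tool rather than given the usual extensions, purely to keep the loop short.

```shell
set -e

# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Sample input: 2000 numbered lines of text (hypothetical data).
i=0
while [ "$i" -lt 2000 ]; do
  echo "line $i: the quick brown fox jumps over the lazy dog"
  i=$((i + 1))
done > sample.txt

echo "original: $(wc -c < sample.txt) bytes"

# Try each compressor that is installed and skip the ones that are not.
for tool in gzip bzip2 xz; do
  if command -v "$tool" >/dev/null 2>&1; then
    # -c writes to standard output and leaves sample.txt untouched.
    "$tool" -c sample.txt > "sample.txt.$tool"
    echo "$tool: $(wc -c < "sample.txt.$tool") bytes"
  else
    echo "$tool: not installed, skipped"
  fi
done
```

Replacing sample.txt with a representative file of your own gives a more meaningful comparison than synthetic text.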
7zip File Compressor –
This tool was originally developed for Windows and was later released for other operating systems as well. It supports multiple file compression formats and can compress multiple files in a single command. It comes pre-installed on some Linux distributions; if your system does not have it, you can use the given command to install it-
sudo apt-get install p7zip-full p7zip-rar
7z is a compressed archive file format that was first implemented in the 7zip compression tool, and .7z is the filename extension associated with this type of file.
To compress a file in 7z file format use the given command
7z a file_name.7z file_name
To extract the compressed file, use the given command-
7z e file_name.7z
Conclusion
The performance of these compression utilities may differ from system to system. In this article, we discussed the archivers and data compressors that are most commonly used by Linux/Unix users. Tools that provide a graphical user interface, such as File Roller, are also available. I hope the article was useful to you. If you still have a question, write to us in the comments below.