Linux Basics: Zip It Up and Script It

In this entry of our Linux Basics series, we will go over some popular commands for compressing files and directories. We will also talk about scheduling tasks and some basics of writing scripts. By the end of this post, you should be able to start putting some of the skills learned in the first two parts of this series to work. The goal is to have you ready to automate a backup of a directory or file using a bash script.

What is Compression?

For this post, all we need to know is that compression makes data smaller. That makes the data easier to move from place to place and more convenient to store long term, since it takes up less of your storage. There are two main kinds of compression: lossy and lossless. Lossy formats (.jpg, .mp3, .mp4) discard some information, so their output is not identical to the input. That is acceptable for images or audio, where most people won't notice the difference. For documents and directories, we cannot use lossy compression; you would likely notice if letters were missing here and there. So for this tutorial, we will focus on lossless compression.

tar

Running tar is usually the first step before compressing a set of files or a directory. The tar command (short for tape archive) bundles files or a directory into a single archive. On its own it does not compress anything, and for small files the archive can even be larger than the originals because of the header and metadata added to the .tar file. Using the tar command is pretty easy, and we will use a few options with it. To create a .tar file we will generally use the following command:

tar -cvf output.tar file1.txt file2.txt

In the example above, the options -cvf tell the command to create (c) a .tar file, to print verbose (v) output (optional), and to write to the file (f) named next. To view which files a .tar file contains, replace the c with a t (tar -tvf output.tar).

Tar and Feather it... I mean compress it

Alternatively, if you are using Linux, you can tell the tar command to compress the .tar file using gzip. To do this, change the options to -zcvf, where z specifies that we want to compress the archive as well, and rename your output file to include .gz. The command would look like the following:

tar -zcvf output.tar.gz file1.txt file2.txt
Note:

You may also see this as a .tgz file.

Extract contents of a .tar file

Extracting the contents of a .tar file is simple; you only need to change out the c for an x and use the name of the .tar file.

tar -xvf output.tar

If the file was compressed with the z option, you will need to use the following command to ensure the file is decompressed first.

tar -zxvf output.tar.gz

The tar command is a fast and easy way to create compressed backups for long-term storage. As you will see, it is not the only option for managing compression, and gzip does not offer the highest compression ratio, but it strikes a good balance between time to compress and size. It is still one of the most common formats you will see on Linux.
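
To tie the tar commands together, here is a minimal sketch of a full round trip: create a compressed archive, list it, and extract it somewhere else. The file and directory names (file1.txt, file2.txt, restored) are made up for illustration, and everything happens in a throwaway directory created by mktemp so nothing on your system is touched.

```shell
#!/bin/bash
# Work in a throwaway directory.
work="$(mktemp -d)"
cd "$work" || exit 1

# Two small sample files (hypothetical names).
echo "first file" > file1.txt
echo "second file" > file2.txt

# c = create, z = compress with gzip, f = name of the archive file.
tar -czf output.tar.gz file1.txt file2.txt

# t = list the contents without extracting anything.
listing="$(tar -tzf output.tar.gz)"
echo "$listing"

# x = extract; -C picks the directory to extract into.
mkdir restored
tar -xzf output.tar.gz -C restored
restored_text="$(cat restored/file1.txt)"
echo "$restored_text"
```

Extracting into a separate directory with -C is a handy habit, since extracting in place would overwrite files that have the same names.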

zip

All of the compression commands we'll go over are uncomplicated to use, but zip produces a file that can be decompressed on Windows without additional software. It offers compression similar to some of the other commands we'll cover and compresses quickly, so its main advantage is that Windows compatibility. Also, zip creates a new archive file and leaves the original untouched.

To use zip to compress a single file use the following command:

zip [OUTPUT] [INPUT]

zip sometext.txt.zip sometext.txt

The above will compress the input file to the output file leaving the input file untouched. To use this to compress directories just add the recursive, -r, option.

zip -r [OUTPUT] [INPUT]

zip -r foo.zip foo

And there you have it, the directory is compressed without changing the input directory.

Decompress a .zip file

To decompress a .zip file we are going to use the unzip command. That makes sense, right?

unzip foo.zip

This will unzip the file into the current directory. To unzip into a different directory, use the -d flag with a directory name.

unzip foo.zip -d /foo/bar/baz

That command will decompress foo.zip into /foo/bar/baz.

Extra Credit

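You can also password protect your .zip file using -P somepasswordhere. Keep in mind that a password typed on the command line can end up in your shell history; zip -e prompts for the password instead.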

gzip

Another easy-to-use command, gzip replaces the given file with its compressed version. It offers compression similar to zip but generally completes a bit faster. To use gzip to compress a single file, enter the following command:

gzip [FILE]

gzip somefile.txt

The above will compress the file and add the .gz file extension to the filename. One thing to remember is that gzip cannot directly compress an entire directory. Instead, use tar to bundle the directory into a single file first (commonly referred to as a tarball), then compress that .tar file with gzip.

Decompress a .gz file

I bet you can guess how we decompress a .gz file, can't you?

gunzip somefile.txt.gz
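
Here is a short sketch of the replace-in-place behaviour described above, run in a throwaway directory. The file name somefile.txt matches the example; the sample contents are made up and deliberately repetitive so the size difference is easy to see.

```shell
#!/bin/bash
work="$(mktemp -d)"
cd "$work" || exit 1

# Write the same line 1000 times; repetitive text compresses well.
i=0
while [ "$i" -lt 1000 ]; do
    echo "the same line over and over"
    i=$((i + 1))
done > somefile.txt
original_size="$(wc -c < somefile.txt)"

# gzip replaces somefile.txt with somefile.txt.gz.
gzip somefile.txt
compressed_size="$(wc -c < somefile.txt.gz)"
echo "before: ${original_size} bytes, after: ${compressed_size} bytes"

# The original name is gone until we decompress again.
if [ -e somefile.txt ]; then
    echo "original still here"
else
    echo "original replaced by somefile.txt.gz"
fi

# gunzip reverses the process and removes the .gz file.
gunzip somefile.txt.gz
restored_size="$(wc -c < somefile.txt)"
echo "restored: ${restored_size} bytes"
```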

bzip2

Of the commands covered here, bzip2 offers the best compression ratio, but it also takes the longest to complete. Like our other methods, it is easy to use. It also replaces the file with the compressed version, and, as with gzip, directories need to be archived first using tar.

bzip2 somefile.txt

Decompress a .bz2 file

Similar to gunzip, use bunzip2 to decompress a .bz2 file.

bunzip2 somefile.txt.bz2

Using Redirects

One way to keep your original file in place is to use a redirect. You can create your .gz or .bz2 files with a redirect leaving the original untouched. This method is a great way to back up a file while leaving the original alone.

gzip -c /home/dan/script.txt > /home/dan/backup/script.txt.gz

That is one way you could handle this scenario. I don't use this often since I typically create a .tar file first, but it is something to remember.
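
A minimal sketch of the redirect approach, using made-up paths inside a temporary directory instead of /home/dan. It also checks the backup with gzip -t, which tests the integrity of a compressed file, and reads it back with gunzip -c, which prints the contents without removing the .gz file.

```shell
#!/bin/bash
work="$(mktemp -d)"
mkdir "$work/backup"
echo "echo hello" > "$work/script.txt"

# -c writes the compressed data to standard output; the redirect saves it.
gzip -c "$work/script.txt" > "$work/backup/script.txt.gz"

# The original is still in place next to the backup directory.
ls "$work"

# Test the backup, then read it back without decompressing it on disk.
gzip -t "$work/backup/script.txt.gz" && echo "backup is valid"
backup_text="$(gunzip -c "$work/backup/script.txt.gz")"
echo "$backup_text"
```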

compress and uncompress

The compress method is not used often, but if you see a file with a .Z extension, it was created with compress. Its compression is not great, but it is worth going over because you may run into it from time to time. Note that the matching command for decompressing is uncompress.

compress somefile.txt
uncompress somefile.txt.Z

There is not much more to say about that.

Can't I Do This Tomorrow?

We'll cover the two main ways people schedule tasks to run from within the Linux command line. The first option is used to schedule something to run once.

at

The at command may already be installed on your system; if it is not, you can install it with sudo apt install at on Debian or Ubuntu based systems. The at command is used to schedule something to run once at a future time. This can be anything from a simple shell command to a complex bash script. Let's start with something basic to get familiar with this command.

1 dan@dan-pc:~/blog$ at now + 1 min
2 warning: commands will be executed using /bin/sh
3 at> echo "hello" > hello.log
4 at> <EOT>
5 job 1 at Tue Dec 22 22:13:00 2020

Let's take a look at what is going on here. In step 1, we run the at command and tell it when to run: now + 1 min, or one minute from now. Next, we enter interactive mode and at warns us (step 2) that all commands will be executed using /bin/sh. In step 3, we enter the command to run; in this example, we also redirect the output to a log file (at usually sends the result using mail). Step 4 is the output from pressing CTRL + D to exit interactive mode, and step 5 confirms the job.

Try this same command again using at noon. This will schedule a task for the next time it is noon. Use atq to view the scheduled task queue. To remove an item from the queue, use atrm [#] with the job number in place of [#].

Example Times To Use With at

Expression            Outcome
noon                  Task scheduled for the next time it is noon
noon tomorrow         Scheduled for noon tomorrow
midnight              The next time it will be midnight
teatime               The next time it is 4 PM
next week             One week from now
next friday           Next Friday
now + 3 hours         3 hours from now
now + 10 min          10 minutes from now
now + 1 year          1 year from now
8:00 AM 12/25/2021    At 8 AM on December 25, 2021

You can see how easy it is to set up a task to run once at any time in the future. The at command simplifies scheduling in an easy to understand way.

crontab

Cron jobs are tasks scheduled to run at regular intervals, such as hourly or every day. The crontab command is used to view, create, and edit cron jobs. Scheduling a cron job works quite differently from using the at command. A cron entry requires a schedule (minute, hour, day of month, month, day of week) and a task to run. Begin scheduling your cron job using crontab -e. If this is the first time you have used crontab, it will ask which editor you want to use for the cron file; I suggest Nano for its ease of use.

Interval Schedules

Field                 Values
minute (m)            0-59
hour (h)              0-23
day of month (dom)    1-31
month (mon)           1-12
day of week (dow)     0-7 (0 and 7 are both Sunday)

You can also use the * symbol in any position to make the cron job run on all available values. For example, a * in the dow field means the job applies to every day of the week. You can also use ranges, for example 0-2 in dow to schedule the task for Sunday through Tuesday. A comma-separated list works as well; continuing with our dow example, you could schedule the cron job for Monday, Wednesday, and Friday using 1,3,5. Another important option is steps; for example, to schedule a task for every 2 hours, put */2 in the h field (with 0 in the m field so it runs once on the hour).
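
To get a feel for the five fields, it can help to print the current time in the same order cron uses. This is only an illustration using the date command; cron does this comparison for you once a minute. Note that date pads values with zeros (05), while in a crontab you would normally write 5.

```shell
#!/bin/bash
# Print the current time as: minute hour day-of-month month day-of-week.
# %w numbers the weekday 0-6 with 0 as Sunday, which lines up with cron's
# dow field.
now_fields="$(date '+%M %H %d %m %w')"
echo "m  h  dom mon dow"
echo "$now_fields"

# A schedule of "30 14 * * *" fires when the first two values read 30 and 14;
# the * fields match any value.
```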

Examples

Expression            Outcome
0 * * * *             Scheduled for every hour on the hour
0 */2 * * *           Every other hour on the hour
0 12 1 * 1-5          At noon on the first of every month and also at noon every Monday - Friday (when both day fields are restricted, cron runs the job if either one matches)
*/15 * * * 1,3,5      Every 15 minutes on Monday, Wednesday, and Friday
0 */1 * * *           Hourly
0 0 1 1 *             Yearly at 00:00 January 1

Command To Run

After you set your interval, you need to include the command or script you want to run. Sometimes this can be a bit tricky, because cron runs jobs with a minimal environment. You may need to use a command like which to find out where the binary for a command lives. If it is in a non-standard location, use the full path in your cron job. For example, if your PHP install is in /usr/local/bin/php instead of /usr/bin/php, you will need to tell the cron job its location. In some instances, you may also need to put the location of the bash binary in front of the command you are trying to run.

echo "hello"

Since cron jobs do not give you any feedback on whether they have run, it is a good idea to redirect your output to a log file.

echo "hello" > /foo/bar/myjob.log 2>&1

Try scheduling the above command, with the output going to a real directory on your machine, to run every minute. If you set the job up correctly, your log file will update every minute with the output of the echo command. Congratulations, you just automated your first task in Linux. What if you want to run multiple commands every minute? Don't worry, you don't have to add 10 cron jobs. Instead, we will utilize bash scripting.

Hint:
*/1 * * * * echo "hello" > /foo/bar/myjob.log 2>&1

Note:

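You may need to add the user to your cron job. This applies to system crontab files such as /etc/crontab, which have a user field between the dow and command fields; if you see that field, enter the user the job should run as. Crontabs edited with crontab -e do not have it.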

*/1 * * * * dan echo "hello" > /foo/bar/myjob.log 2>&1
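
One detail about the log redirect used in this section: > replaces the file on every run, so the log only ever holds the latest output. If you would rather keep a history, >> appends instead. The sketch below imitates three runs of the job by hand; the log file names are placeholders in a temporary directory, and the timestamp is an addition that is not part of the original command.

```shell
#!/bin/bash
work="$(mktemp -d)"

# Pretend the cron job ran three times.
for run in 1 2 3; do
    # > overwrites: only the latest run survives.
    echo "hello" > "$work/overwrite.log" 2>&1
    # >> appends: every run adds a line, here with a timestamp in front.
    echo "$(date '+%F %T') hello" >> "$work/append.log" 2>&1
done

overwrite_lines="$(wc -l < "$work/overwrite.log")"
append_lines="$(wc -l < "$work/append.log")"
echo "overwrite.log lines: ${overwrite_lines}"
echo "append.log lines: ${append_lines}"
```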

You're a wizard!

Sometimes scripting does feel like magic. You can accomplish some awesome things, automate little bits of your workflow, and back up your system with simple scripts. With that being said, we're going to introduce something new here.

From here on, scripts from these posts will be included on my GitHub profile under the blog-code-guides repo. Additionally, some examples will be added to my repl.it profile. Repl.it lets you create scripts and run them in the browser. You can see an example here of a basic Temporary Trash can script you can use in the command line (remember from the last post that there is no trash folder in the command line).

What is a Shell Script, or Is It a Bash Script?

A shell script is a program written to run in a Unix command-line interpreter (a shell). Things get a bit weird here: shell script and Bash script are often used interchangeably, but they are different things. A shell script is intended to run on any Unix shell, whereas a Bash script is intended to run in Bash specifically. Bash stays close to the standard Unix shell, but there are a few differences, so a Bash script is not guaranteed to run in other shells. Bash is the most common Linux shell and is most likely the one installed on your system. In most cases, we will be writing Bash scripts in these tutorials.

Ok, So a Bash Script.

Let's get into writing our first Bash script. Create a new file named hello.sh. Edit that file to contain the following:

#! /bin/bash

echo "Hello World!";

Awesome, that is your first Bash script! The first line tells the system to execute the file using Bash. If this were a shell script, you would use #! /bin/sh instead (you may also see /bin/python3 or /bin/perl; these just declare which interpreter to use). So how do you run it? In the command line, execute your script using bash [FILE].

Having to add bash gets a bit annoying after a while, so let's make the file executable using the chmod command. The chmod command changes the permissions of files and directories, so be careful. Make sure you either use the absolute path of your file or are in the correct directory; for example, if your file is in /foo/bar, change into that directory using cd /foo/bar first. Use the following command, with the correct file path, to give the file executable permissions.

chmod +x hello.sh

Now if you run the command ls -la you'll notice that the file has an x in its permissions. Your output may look something like the below:

-rwxr-xr-x  1 dan dan   34 Dec 22 23:59 hello.sh

Now you can execute this script using ./hello.sh.
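
Putting those steps together, here is a sketch that writes hello.sh into a temporary directory, makes it executable, and runs it. The here-document (the part between <<'EOF' and EOF) is just a convenient way to create the file from the command line; it is not something covered earlier in this series, and you can create the file in nano instead.

```shell
#!/bin/bash
work="$(mktemp -d)"
cd "$work" || exit 1

# Create hello.sh. Quoting 'EOF' keeps the shell from expanding anything
# inside the here-document.
cat > hello.sh <<'EOF'
#!/bin/bash

echo "Hello World!"
EOF

# Before chmod, the file has to be passed to bash explicitly.
bash hello.sh

# After chmod +x, it can be run directly.
chmod +x hello.sh
output="$(./hello.sh)"
echo "$output"
```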

A Note About Users

Outside of this tutorial, you will want to make files executable only for yourself. It is a good security practice: the command below adds execute permission for the file's owner only.

chmod u+x hello.sh

The u says to apply the change only to the user who owns the file. You can also use g for the file's group, o for others, and a for all. Generally speaking, you want to use the lowest permission level needed to accomplish your task. We'll get further into permissions in later posts, but just be mindful of this going forward.
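
To see exactly which permission bits change, the sketch below compares the first column of ls -l before and after chmod u+x. The umask 022 line is an assumption added so the starting permissions are predictable for the demo; your own system default may differ.

```shell
#!/bin/bash
work="$(mktemp -d)"
cd "$work" || exit 1
umask 022   # new files start as rw-r--r-- for this demo

touch hello.sh
# Keep only the first 10 characters of ls -l: the file type and permissions.
before="$(ls -l hello.sh | cut -c1-10)"

# u+x adds execute permission for the file's owner only.
chmod u+x hello.sh
after="$(ls -l hello.sh | cut -c1-10)"

echo "before: $before"   # before: -rw-r--r--
echo "after:  $after"    # after:  -rwxr--r--
```

Only the owner's slot gained an x; the group and others columns are unchanged, which is the difference from plain chmod +x.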

So What Makes This Special?

In a Bash script, we can use conditional statements (if this then do that), make comparisons (a is equal to a), define functions, and declare variables. We can do things like executing a command only if a specific file or directory exists or does not exist. We can ask for input or take in arguments or file names. From here, we can create reusable scripts that run based on a well-defined set of instructions and do not require us to be sitting at our computer typing. For example, what if we wanted to check every day if a file existed, and if it did, remove it? We could schedule a cron job to execute a script every day that looks for that file and removes it if it exists or does nothing if it does not.

#! /bin/bash

if [ -f somefile.txt ]; then
    rm somefile.txt;
fi
exit;
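
To watch that kind of check work without risking a real file, the sketch below runs it twice against a scratch file in a temporary directory: once when the file exists and once after it is gone. The function name remove_if_present is made up for this example, and it uses -f, which tests that the path exists and is a regular file.

```shell
#!/bin/bash
work="$(mktemp -d)"
target="$work/somefile.txt"

# Hypothetical helper: remove the file if it exists, otherwise do nothing.
remove_if_present() {
    if [ -f "$1" ]; then
        rm "$1"
        echo "removed"
    else
        echo "nothing to do"
    fi
}

touch "$target"
first="$(remove_if_present "$target")"
second="$(remove_if_present "$target")"
echo "first run: $first"     # first run: removed
echo "second run: $second"   # second run: nothing to do
```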

Some Basics

Let's cover a few basics about Bash Scripting to get you started.

Comments

Comments are written by starting the line, or the rest of a line, with #.

# This is a comment!

Variables

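A variable is just a named area in memory that stores a value. Bash does not have strict types: values are stored as strings by default, and Bash treats them as numbers when you use them in arithmetic. Declaring a variable is just saying reserve this space for something. Note that there are no spaces around the = sign.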

Declare a Variable
name="Dan";
age=35;
userName="$(whoami)";

In the first example, we declare a string, the second an integer, and in the last, we set the output of whoami to a variable.

Use a Variable
echo "$name";
echo "My name is ${name} and I am ${age} years old.";

Either form is correct. Generally, you only need the ${...} form when the variable name is directly followed by other characters that could be read as part of the name. Spaces in the value are handled by the double quotes, not by the braces.

file="$1" # Assign the first passed argument to the variable file, e.g. ./somescript.sh "somefile"

# Quote the expansions so file names with spaces still work.
gzip -c "${file}.txt" > "${file}backup.txt.gz"
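
Here is a small sketch of why double quotes around variables matter once file names contain spaces. The function backup_file and the file name are invented for the example; inside a function, $1 is the first argument passed to it, just as it is for a script.

```shell
#!/bin/bash
work="$(mktemp -d)"
cd "$work" || exit 1

# Hypothetical helper: gzip NAME.txt into NAMEbackup.txt.gz and keep the
# original in place.
backup_file() {
    local file="$1"
    gzip -c "${file}.txt" > "${file}backup.txt.gz"
}

# A name with a space in it. Without the double quotes inside backup_file,
# the shell would split the name into two words and gzip would look for
# the wrong files.
echo "some notes" > "my notes.txt"
backup_file "my notes"

ls
restored="$(gunzip -c "my notesbackup.txt.gz")"
echo "$restored"
```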

Conditional Statements

Conditional statements are just a fancy way of saying if this is true, then do this.

if [ "a" == "a" ]; then
    echo "The strings match"
fi

You can also add an else to that logic flow. This turns our statement into: if this is true, do this; if it is not, do that instead.

if [ "a" == "a" ]; then
    echo "The strings match"
else
    echo "The strings do not match"
fi

There is a lot more to conditionals, but the last bit we will go over here is how to check if a file or directory exists. To check that a path exists and is a regular file, use the -f operator (-r checks whether you can read it):

if [ -f /foo/bar/somefile.txt ]; then
...

For a directory, switch the -f with -d and add the path to the directory (instead of the file path). There are a lot of operators you can use in conditionals; some of these we will cover in later posts but feel free to look them up yourself to get a head start.
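
A short sketch that puts -f and -d side by side. The function describe_path is a made-up name, the paths live in a temporary directory, and elif (short for else if) is used to chain the second check.

```shell
#!/bin/bash
work="$(mktemp -d)"
mkdir "$work/photos"
touch "$work/somefile.txt"

# Hypothetical helper: report what kind of thing a path is.
describe_path() {
    if [ -f "$1" ]; then
        echo "file"
    elif [ -d "$1" ]; then
        echo "directory"
    else
        echo "missing"
    fi
}

describe_path "$work/somefile.txt"   # prints: file
describe_path "$work/photos"         # prints: directory
describe_path "$work/nope"           # prints: missing
```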

The Challenge

Create an automated backup of a directory.

Create a directory on your system somewhere and put a few images in it. You can download some from Pexels or Unsplash and just put them in a directory. This new folder will be our test directory.

Write a script that does the following:

  • Makes sure the directory exists
  • Creates a tar file of the directory
  • Compresses the tar file
  • Moves the backup to a new location

Then schedule it to run using a cron job (for this post you can schedule it every few minutes).

Need help? Check Here For an Example

Or run it here

So Long!

That's it for this post, but if you made it this far, you should be well on your way to creating compressed backups, scheduling tasks to run, and jumping into writing Bash scripts. Take some time to try and figure out the challenge, but if you get stuck you can check out my example. Everything you have learned in these first three posts will be more than enough to complete the challenge. Thanks for reading!

Cheatsheet

A quick list of the commands we have covered in this series.

Welcome To Linux:

whoami
pwd
ls
cd 
locate
whereis
cat
head
tail
more
less
cat [FILE] | grep "[TERM]"

Touching Files and Making Dirs:

touch
cat > somefile.txt
mkdir
vi
nano
mv
cp
rm
shred
nl
grep
sed

Zip it Up and Script It:

tar
gzip
zip
bzip2
gunzip
unzip
bunzip2
compress
uncompress
at
crontab