Feb 28

Table of contents for AutomateWebBackup

  1. TIP: Automating Website Backups
  2. TIP: Automating Website Backups - Part II (Reducing Backup Size)
  3. TIP: Automating Website Backups - Part IIb (Reducing Backup Size Contd.)
  4. TIP: Automating Website Backups - Part III (Cron Made Easy)

Last time, I told you how to reduce the backup size. This is a short note that builds on the approaches discussed there: basically, telling tar to back up only those files which have changed. For this, Linux has a command, “find”, which I am very fond of. You can give it the option “-newer” followed by a filename, and it will return the names of the files that were modified more recently than that file.

CODE:
  find ~ -type f -newer ~/backups/backup_x.tgz > files.txt

So, the command given above will find all the files that have changed since you took the backup “backup_x.tgz” and store their names and paths in “files.txt”. The “-type f” option makes sure that only files are listed and not directories. This matters because tar archives a directory recursively when it is given the directory's name, so the files inside it would end up in the archive more than once.
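If you want to see what “-newer” does before pointing it at your real site, here is a minimal sketch that plays it out in a throwaway directory. Everything in it (the temporary folder, the file names, the empty stand-in for the archive) is made up for illustration, and it assumes GNU touch for the “-d” option.

```shell
#!/bin/bash
# Sandbox demo of "find -newer"; every path below is made up for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/site" "$demo/backups"

# An unchanged file and a reference "archive", both with fixed timestamps in the past
echo "old" > "$demo/site/old.html"
touch -d '2020-01-01 00:00:00' "$demo/site/old.html"
: > "$demo/backups/backup_x.tgz"     # empty stand-in for the real archive
touch -d '2020-01-02 00:00:00' "$demo/backups/backup_x.tgz"

# This file is written now, so it is newer than the reference archive
echo "new" > "$demo/site/new.html"

# List only regular files modified after the reference archive
changed=$(find "$demo/site" -type f -newer "$demo/backups/backup_x.tgz")
echo "$changed"

rm -rf "$demo"
```

Only the path ending in new.html should be printed, since old.html was last modified before the stand-in archive.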

Now, all we have to do is give this “files.txt” to tar as an input to tell it which files to archive.

CODE:
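  tar cvzpf ~/backups/backup_$date.tgz --ignore-failed-read -T files.txt

The “-T files.txt” option makes this happen. Moreover, I have introduced 2 new options here that were not present in our last part. They are:

  • p – Tells tar to preserve file permissions (in GNU tar this mainly matters when extracting, since permissions are always recorded in the archive)
  • --ignore-failed-read – Tells tar not to treat unreadable files as a failure: it warns and carries on. Note that “G” is not the option for this; in GNU tar, “-G” selects the old incremental archive format.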
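One thing to watch with a newline-separated “files.txt” is file names that contain unusual characters. As a hedged alternative, assuming GNU find and GNU tar, the list can be passed NUL-separated through a pipe with “-print0” and “--null -T -”. The sketch below uses a made-up sandbox directory rather than your real site.

```shell
#!/bin/bash
# Sandbox demo; all paths are made up for illustration. Assumes GNU find and GNU tar.
demo=$(mktemp -d)
mkdir -p "$demo/site/a" "$demo/site/b"
echo one > "$demo/site/a/index.html"
echo two > "$demo/site/b/index.html"
echo three > "$demo/site/my notes.txt"

# Pass NUL-separated names straight from find to tar; "-T -" reads the list from stdin
( cd "$demo" && find site -type f -print0 | tar czf backup.tgz --null -T - )

# List what ended up in the archive
listing=$(tar tzf "$demo/backup.tgz" | sort)
echo "$listing"

rm -rf "$demo"
```

The listing shows full relative paths, so the two index.html files in different directories are both kept, and the file with a space in its name goes in intact.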

Apart from this, you can also take a look at the “-mtime” option of find, which selects files by how many days ago they were changed: “-mtime -x” matches files changed within the past x days, while a bare “-mtime x” matches only files changed exactly x days ago. There are other similar options available for find. Look at “man find” and take your pick. Similar options exist for tar, but I have had a lot of weird issues using them, so I’d recommend sticking with this two-step process of “find” followed by “tar”.
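Here is a small sketch of “-mtime” in action, again in a throwaway directory with made-up file names. It assumes GNU touch, which understands relative dates such as '10 days ago'.

```shell
#!/bin/bash
# Sandbox demo of "find -mtime"; all paths are made up. Assumes GNU touch for "-d".
demo=$(mktemp -d)
echo recent > "$demo/recent.html"
echo stale > "$demo/stale.html"
touch -d '10 days ago' "$demo/stale.html"

# "-mtime -2" matches files modified less than 2 days (48 hours) ago
recent=$(find "$demo" -type f -mtime -2)
# "-mtime +7" matches files whose age, counted in whole days, is more than 7
stale=$(find "$demo" -type f -mtime +7)

echo "changed recently: $recent"
echo "untouched for over a week: $stale"

rm -rf "$demo"
```

The minus sign is the part that is easy to forget: without it, find looks for an exact age in days rather than "within the last x days".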

Now, the commands and options mentioned above can be combined in innumerable ways to strike your own balance between space and ease of use for backups. Here is a sample script that makes a full backup on the first day of every week (Sunday, which “date +%w” reports as 0), and then makes incremental backups on each of the remaining 6 days. So, you’ll save a lot of space (more than 5 times), but you will have to use all 7 backup files to make a full restore. (I’m listing just the backup part; you can add the “mutt” command yourself for e-mailing, as mentioned in Part I.)

CODE:
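  #!/bin/bash
  # Day of the week: 0 (Sunday) to 6 (Saturday)
  date=$(date +%w)
  # Number of yesterday's archive, which today's incremental backup is compared against
  date2=$((date - 1))
  if [ "$date" -eq 0 ] || [ ! -e ~/backups/backup_$date2.tgz ]; then
    # First day of the week, or yesterday's archive is missing: take a full backup
    tar cvzpf ~/backups/backup_$date.tgz --ignore-failed-read ~/public_html
  else
    # Otherwise archive only the files changed since yesterday's backup
    find ~/public_html -type f -newer ~/backups/backup_$date2.tgz > files.txt
    tar cvzpf ~/backups/backup_$date.tgz --ignore-failed-read -T files.txt
  fi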
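Since a full restore needs all the archives of the week, here is a rough sketch of what that restore could look like. It is not part of the script above: it builds a tiny made-up site in a temporary directory, takes one full and one incremental backup, and then unpacks the archives oldest first. The directory names and the restore location are assumptions for illustration only.

```shell
#!/bin/bash
# Sandbox demo of restoring a full backup plus incrementals; all paths are made up.
demo=$(mktemp -d)
mkdir -p "$demo/site" "$demo/backups" "$demo/restore"

# Day 0: full backup of the site
echo v1 > "$demo/site/index.html"
echo about > "$demo/site/about.html"
tar czf "$demo/backups/backup_0.tgz" -C "$demo" site

# Day 1: one file changes; archive only that file
echo v2 > "$demo/site/index.html"
echo site/index.html > "$demo/files.txt"
tar czf "$demo/backups/backup_1.tgz" -C "$demo" -T "$demo/files.txt"

# Restore: unpack the archives oldest first so newer copies overwrite older ones
for day in 0 1 2 3 4 5 6; do
  archive="$demo/backups/backup_$day.tgz"
  if [ -e "$archive" ]; then
    tar xzpf "$archive" -C "$demo/restore"
  fi
done

restored=$(cat "$demo/restore/site/index.html")
echo "$restored"

rm -rf "$demo"
```

The order matters: unpacking day 0 first and day 6 last leaves the most recent copy of every file in place. One limitation of this approach is that a file deleted during the week will come back on restore, because the incremental archives only record changed files, not deletions.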

That’s it for today. Let me know if you have any questions, or if there is anything else you would like to see answered in this series. The question that will be answered next time is:

Q2: Cron? Using tar, making up the script file is enough command line for me. Isn’t there an easy way?


written by Shantanu Goel

2 Responses to “TIP: Automating Website Backups - Part IIb (Reducing Backup Size Contd.)”

  1. vince Says:

    Can you explain regarding this point:
    "only filenames are listed ... because tar creates a lot of issues when presented with directory names"

    What about duplicate names in different directories? Do they not get backed up?

  2. Shantanu Goel Says:

    Vince,
    1. When tar is given a folder name, it creates a recursive tarball of that folder (and all files/folders inside it), and then it goes on adding the individual files that find gives it as well. So, the final tarball will bloat up with a lot of duplicate copies. It won't cause many issues while unpacking, though, because the files simply get overwritten again and again and you finally get a sane output.
    2. About duplicate names in different directories: they will get backed up, because find doesn't just list the names, it lists the complete path.
