Make a one way dropbox to easily pull down files

A simple script to synchronise files between servers, without needing a static IP at home.

Background

I have a cloud server with a whole lot of files on it. I also have a home server, which acts more as an archive. I pull files down from the cloud very frequently.

My ISP decided to switch to a shared IP address for our whole building, meaning I could no longer push to the home server directly, since the traffic gets blocked at their router. I asked them to open a port for me if possible, which they understandably didn’t want to do, and a static IP would cost £5 a month extra. I think this is quite a stiff price for a simple service; my old ISP charged a one-off fee of £5, which seems fairer.

To circumvent this I made a script that pulls stuff in on a regular basis. Obviously I can still log into my cloud server from home, so why not leverage that to sync files up?

Overview

If you are not familiar with bash, this is what the script does:

  1. Sync every file in $REMOTE_PATH to $LOCAL_PATH
  2. If successful, remove every file in $REMOTE_PATH

Errors get written to the $LOGFILE, notably if a sync fails.

The $PIDFILE is used to check whether the script is already running. After a successful run it gets removed so that the script can run again. If the previous run has not finished, the script logs that and exits. If a sync fails, the $PIDFILE is left behind; should the script then refuse to start, you will have to go and remove the $PIDFILE yourself, but it’s better than losing your stuff.
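
The PID file check can be sketched as a pair of small functions. This is only an illustration and not the script's own code: acquire_lock and release_lock are hypothetical names, and the sketch uses kill -0 rather than ps to test whether the recorded PID is still alive.

```shell
#!/bin/bash
# Sketch of a PID-file lock. acquire_lock and release_lock are hypothetical names.

acquire_lock() {
    local pidfile=$1 pid
    if [ -f "$pidfile" ]; then
        pid=$(cat "$pidfile")
        # kill -0 sends no signal; it only reports whether a process
        # you are allowed to signal exists with that PID
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            return 1    # a live process still holds the lock
        fi
    fi
    # No PID file, or a stale one: claim the lock for this process
    echo $$ > "$pidfile"
}

release_lock() {
    rm -f "$1"
}
```

A caller would run something like acquire_lock ~/.ropbox.pid || exit 1 before syncing and release_lock ~/.ropbox.pid afterwards. The check and the write are not atomic, so two runs started at the very same moment could both pass; for a cron job that fires every few minutes that is probably acceptable.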

Requirements

  • bash
  • ssh
    • With key-based authentication set up
  • rsync
  • (Optional) cron
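
A quick way to check the requirements on the home server might look like the following. missing_commands is a hypothetical helper, and the host and user in the comment are the placeholder values from the script.

```shell
#!/bin/bash
# Print the name of every given command that cannot be found on PATH.
# missing_commands is a hypothetical helper name.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
    done
}

# Prints nothing if all three are installed
missing_commands bash ssh rsync

# To confirm key-based login works without a password prompt, try
# (BatchMode makes ssh fail instead of asking for a password):
# ssh -o BatchMode=yes yourcloudusername@yourcloudserver.example.com true
```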

Set up

  1. Change the variables at the top of the script. I have called it ‘ropbox’ because I am thick; you can probably come up with a better name.
    • $SERVER is your server’s hostname or IP address
    • Change the others based on where you want the logs storing etc
  2. Put this script somewhere on your home server, the one that will receive the files.
  3. Make the $LOCAL_PATH directory e.g. mkdir ~/ropbox
  4. Log into your remote server, make the $REMOTE_PATH, for me again this was mkdir ~/ropbox

Usage

  1. Copy some files to $REMOTE_PATH, on your server.
  2. On your home server, run the script, i.e. ./ropbox.sh
  3. The files are now on your home server, and $REMOTE_PATH is empty
  4. Put it in your crontab to run it regularly. If you sync large files, a new run won’t start while the previous one is still going, because of the $PIDFILE
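
For the cron step, a crontab line could be built like this. The 15 minute interval and the script location are assumptions, and cron_entry is a hypothetical helper; adjust both to taste.

```shell
#!/bin/bash
# Build a crontab line that runs a script every N minutes.
# cron_entry is a hypothetical helper; interval and path are assumptions.
cron_entry() {
    local minutes=$1 script=$2
    printf '*/%s * * * * %s\n' "$minutes" "$script"
}

# Print a line that runs the script every 15 minutes
cron_entry 15 "$HOME/ropbox.sh"
```

Paste the printed line into the editor opened by crontab -e on the home server.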

The script

#!/bin/bash
set -euo pipefail
IFS=$'\n\t'
PIDFILE=~/.ropbox.pid
LOGFILE=~/.ropbox.log
SERVER=yourcloudserver.example.com
REMOTE_USER=yourcloudusername
REMOTE_PATH=ropbox/
LOCAL_PATH=~/ropbox

## Check for running process
if [ -f $PIDFILE ]
then
    ## Check for pid using ps
    PID=$(cat $PIDFILE)
    ## Test ps directly: $? read after "set -e" would be set's own status (always 0)
    ## If ps returned 0 exit code (ie PID exists)
    if ps -p $PID > /dev/null 2>&1
    then
        echo "$(date -u) Process already running" >> $LOGFILE
        exit 1
    else
    ## Process not found assume not running, make PIDFILE
        ## Test the write in the if, so set -e cannot exit before the error is logged
        if ! echo $$ > $PIDFILE
        then
            echo "$(date -u) Could not create PID file" >> $LOGFILE
            exit 1
        fi
    fi
else
    ## Test the write in the if, so set -e cannot exit before the error is logged
    if ! echo $$ > $PIDFILE
    then
        echo "$(date -u) Could not create PID file" >> $LOGFILE
        exit 1
    fi
fi

## Sync the ropbox dir
set +e
rsync -a $REMOTE_USER@$SERVER:$REMOTE_PATH $LOCAL_PATH

## Check for rsync errors
if [ $? -ne 0 ]
then
    echo "$(date -u) Sync not complete, preserving files" >> $LOGFILE
    exit 1
else
    ssh $REMOTE_USER@$SERVER rm -r "$REMOTE_PATH/*" > /dev/null 2>&1 || true
fi
set -e

## Remove PIDFILE
rm $PIDFILE

FAQs

Why clear out the remote directory?

This is a part that is specific to me - I have limited space on the cloud server, and cannot move the files from their locations. This system allows me to copy any assortment of files, have them download automatically, then delete them from the ‘sync’ folder.

This may not apply to you, or you might want to manually remove the files from the $REMOTE_PATH. If this is the case, remove the lines:

else
    ssh $REMOTE_USER@$SERVER rm -r "$REMOTE_PATH/*" > /dev/null 2>&1 || true

That way, if rsync completes, the script leaves the remote files alone. If it fails, it still writes to the log.
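
A possible alternative, which is not what the script above does, is to let rsync clear up after itself with its --remove-source-files option. The sketch below assumes that behaviour suits you; pull_and_prune is a hypothetical helper name.

```shell
#!/bin/bash
# pull_and_prune is a hypothetical helper name.
# --remove-source-files makes rsync delete each source file once it has been
# transferred successfully; source directories are left in place.
pull_and_prune() {
    local src=$1 dst=$2
    rsync -a --remove-source-files "$src" "$dst"
}

# Example call with the placeholder values from the script:
# pull_and_prune yourcloudusername@yourcloudserver.example.com:ropbox/ ~/ropbox
```

Because only files are removed, empty subdirectories would be left behind in $REMOTE_PATH, which the ssh ... rm -r approach avoids.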

Why evaluate to true?

It’s a habit of mine to set -e, so that if something fails the script stops. There are one or two commands I don’t mind failing, like ssh ... rm. rsync is allowed to fail too: if set -e stopped the script at that point, the failure would never get written to the log.
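
A small demonstration of the idea, separate from the script itself: under set -e, a failing command followed by || true does not stop the script, and neither does one tested by if. The demo function name is made up for this example.

```shell
#!/bin/bash
# demo runs in a subshell, so its set -e does not leak into the caller
demo() (
    set -e
    # The failure is swallowed by || true, so the script carries on
    false || true
    echo "survived: false || true"
    # A command tested by if is exempt from set -e, so its failure can be handled
    if ! false; then
        echo "caught: failure tested by if"
    fi
)

demo
```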

Thanks

Ben Cane’s article gave me the idea on how to implement a PIDFILE.