Syncing and backing up two desktops

Unison
A Bi-directional File Synchronizer.

I tend to live in two virtual locations: my MacBook Pro running OS X and my main server KUMR.LNS.COM running Linux.  I should say that for decades before I switched to Linux, I ran various versions of *BSD on KUMR.

I have a home directory on both boxes with a subfolder called "projects" that holds various things I have been working on for the last 30 or so years.  I want these directories and files in both locations for access, and this gives me some semblance of a backup as well.  (Of course I have other backup methods, including Time Machine and other off-site backups.)

Additionally, if I am doing development I will tend to use my Mac, but OS X has some peculiarities in how packages like Python get installed, and package managers like Homebrew may not load what I need for an environment that will be deployed on some server, so I will do that work on KUMR.  (Ya… I know about containers and VMs.)

The challenge is how to keep things in sync with each other.  For quite a while I have been using Unison, a file synchronizer that is bi-directional and uses rsync’s rather efficient method of file transfer.  I will skip describing the rsync protocol, but you can check out the paper at https://rsync.samba.org/how-rsync-works.html for the details.

Unison is extremely efficient at working through large file collections.  I currently have about 305 GB with 215,450 files and 27,888 directories just in my "projects" folder.  If I were just using rsync, it would spend a large amount of time walking through each file, computing the hash, seeing if the hash was the same on the other server and starting a transfer if it isn't.  Unison will make a similar crawl of all the files once, track each file by its hash, and store that in an archive in the ~/.unison directory.  This means that the first time I run Unison it may take hours to crawl through all the files, but subsequent runs may take less than a minute to scan and transfer, depending on what was changed.

If you are worried that Unison is missing anything with this system, just go into the .unison directory on both the local and remote servers and delete the archives, which normally start with "ar" or "fp", and then run unison again.
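
A minimal sketch of that reset, assuming the default ~/.unison location and the same remote host named in the profile below:

# Force a full rescan by deleting Unison's archive files on both ends.
# Host and paths here just mirror the profile below; adjust to your setup.
rm -f ~/.unison/ar* ~/.unison/fp*
ssh lns.com 'rm -f ~/.unison/ar* ~/.unison/fp*'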

Unison also knows when you have just moved a file or folder with the same material.  If it sees a new file with the same hash and name as an existing file, it just moves that file locally as a "shortcut" rather than transferring it again.  That is a big win when you have moved a folder with a large number of files, or with large files.

Since I have been using Unison for a while, I have made some tweaks to the Unison configuration file (~/.unison/default.prf).  I thought I would share mine here, with comments that explain the config file itself.  This is by no means a complete set of options for Unison; you can see them all detailed in the manual, which I would highly suggest reviewing.

# Unison preferences file
# Local root directory...
root = /Users/pozar/
# Remote server, ssh port number and root directory...
root = ssh://lns.com:22//home/pozar
# The program to use to copy files with. In this case rsync...
copyprog = rsync -aX --rsh='ssh -p 22' --inplace --compress
# The program to use to copy files with that supports partial transfers...
copyprogrest = rsync -aX --rsh='ssh -p 22' --partial --inplace --compress
# maximum number of simultaneous file transfers.  Good to have more than one to really use the pipe.
maxthreads = 5
# synchronize resource forks and HFS meta-data. (true/false/default)
# I'm not interested in seeing AppleDouble files on my Linux box...
rsrc = false 
# Filename and directories to ignore...
ignore = Name projects/Mail/.imap
ignore = Name .FBCIndex
ignore = Name .FBCLockFolder 
ignore = Name {Cache*,.Trash*,.VolumeIcon.icns,.HSicon,Temporary*,.Temporary*,TheFindByContentFolder}
ignore = Name {TheVolumeSettingsFolder,.Metadata,.filler.idsff,.Spotlight,.DS_Store,.CFUserTextEncoding} 
# ignore all files with .unison somewhere in their full path
ignore = Path .unison
# Normally ignore the giant VM Harddrive files...
ignore = Name Documents/Parallels
# Don't try to copy this socket...
ignore = Name projects/gnupg/S.gpg-agent
# ignore = Name Music/iTunes
ignore = Name .lnk
ignore = Name .DS_Store
ignore = Name Thumbs.db
# Directory paths off of root to sync...
path = projects
path = Desktop
path = Documents
path = Downloads
# Keep ssh open to help with long syncs such as the initial one
sshargs = -C -o TCPKeepAlive=yes
# synchronize modification times
times = true
# Create a local log of what happened.
log = true
# Uses the modification time and length of a file as a quick check if a file has changed.
fastcheck=true

There is a handy command line argument called "-batch" which avoids asking what to do with each file it finds to sync.  Normally it will figure things out by looking at the date stamp, but in some cases it won't know what to do.  Below is an example where the permissions or times are in conflict on various Django files I have.  In this case I want to propagate this metadata to my remote server "KUMR", so I would normally use the ">" key to tell it to go from left (local) to right (KUMR)…

# unison
Contacting server...
Connected [//Tims-MBP.local//Users/pozar -> //kumr//home/pozar]
Looking for changes
  Waiting for changes from server                                       
Reconciling changes
local          kumr               
props    ====> props      Desktop/django/lib/python3.6/site-packages/django/contrib/contenttypes/locale/lb/LC_MESSAGES/django.mo  [] >
props    ====> props      Desktop/django/lib/python3.6/site-packages/django/contrib/contenttypes/locale/lb/LC_MESSAGES/django.po  [] >
props    <-?-> props      Desktop/django/lib/python3.6/site-packages/django/contrib/contenttypes/locale/lt/LC_MESSAGES/django.mo  [] y
Unrecognized command 'y': try again  [type '?' for help]
props    <-?-> props      Desktop/django/lib/python3.6/site-packages/django/contrib/contenttypes/locale/lt/LC_MESSAGES/django.mo  []

With the "-batch" argument it avoids this prompting, so you can just script it to run from a cron job if you like.  I normally run unison in batch mode using a bash alias like:

alias unison='unison -batch'
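
If you do want it running from cron as mentioned above, an entry along these lines would do it; the schedule, binary path and log location here are just placeholders:

# Hypothetical cron entry: run the default profile unattended every night
# at 2am and append the output to a log file...
0 2 * * * /usr/local/bin/unison -batch >> /Users/pozar/unison-cron.log 2>&1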

Of course, if you have a situation like above, it won’t get “fixed”.  That may be fine 99.9% of the time.  Occasionally I run unison without the batch argument just to get things fully in sync.

But what happens if I have a thousand files like this?  Say, for some reason, the modify times on a bunch of files got changed on both sides.  Typically you would just use the UNIX "yes" command to send '>' to the program, with something like "yes \>".  Unison will take this input happily until it comes to the last question, where it asks if you want to propagate the changes.  Then it is looking for a 'y' or 'n'.  Fortunately a 'y' is ignored when Unison is asking which direction to propagate the files (see above).  So you can use the bash command:

while true; do echo ">"; echo "y" ;done | unison

This sends a ‘>’ and then a ‘y’ continuously into unison.  Eventually it will ask if it should propagate the changes and it will get a ‘y’.

I should say this command should be considered a bit "dangerous" unless you are sure the metadata and files you are propagating are what you want on the other side.

Hope this gives you some insight on this rather handy tool.  Drop me a line if you have comments or questions.

 

Blocking traffic by MAC address on Ubiquiti EdgeRouters by Time of Day

See the update at the bottom of this post – Tim 20180211

I have an Ubiquiti EdgeRouter PoE at the house as my main router.  In order to manage "resources" at the house, I wanted a way to block a couple of MAC addresses at a certain time each day.  I created a firewall ruleset that blocks by MAC address and looks something like this:

set firewall name SWITCH0_IN default-action accept
set firewall name SWITCH0_IN description 'Used for blocking local users'
set firewall name SWITCH0_IN rule 1 action drop
set firewall name SWITCH0_IN rule 1 description APhone
set firewall name SWITCH0_IN rule 1 disable
set firewall name SWITCH0_IN rule 1 log disable
set firewall name SWITCH0_IN rule 1 protocol all
set firewall name SWITCH0_IN rule 1 source mac-address '66:55:44:33:22:11'
set firewall name SWITCH0_IN rule 2 action drop
set firewall name SWITCH0_IN rule 2 description iPhone
set firewall name SWITCH0_IN rule 2 disable
set firewall name SWITCH0_IN rule 2 log disable
set firewall name SWITCH0_IN rule 2 protocol all
set firewall name SWITCH0_IN rule 2 source mac-address '11:22:33:44:55:66'
set firewall name SWITCH0_IN rule 3 action drop
set firewall name SWITCH0_IN rule 3 description Desktop
set firewall name SWITCH0_IN rule 3 disable
set firewall name SWITCH0_IN rule 3 log disable
set firewall name SWITCH0_IN rule 3 protocol all
set firewall name SWITCH0_IN rule 3 source mac-address '12:34:56:78:90:ab'

I applied this rule to the “switch0” interface that talks to my LAN interfaces at eth2, eth3 and eth4.

For the ruleset above, I want to enable rules #2 and #3 for the devices "iPhone" and "Desktop" to block traffic from them.  Two hours later, I want to disable those rules to pass traffic again.  This script does just that…

#!/bin/vbash
# A script to disable access for two MAC addresses from
# 0330 to 0530 UTC (1930 to 2130 Pacific) every day.
# With any argument it unblocks; with no argument it blocks.
unblock=0
if [ $# == 1 ]; then
  unblock=1
fi
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
WR="/opt/vyatta/sbin/vyatta-cfg-cmd-wrapper"
$WR begin
if [ $unblock == 1 ]; then
  # Mark the drop rules as disabled so traffic passes again...
  $WR set firewall name SWITCH0_IN rule 2 disable
  $WR set firewall name SWITCH0_IN rule 3 disable
else
  # Remove the "disable" flag so the drop rules take effect and block traffic...
  $WR delete firewall name SWITCH0_IN rule 2 disable
  $WR delete firewall name SWITCH0_IN rule 3 disable
fi
$WR commit
$WR end

There are a couple of ways to configure the router with scripts.  Ubiquiti suggests using the /opt/vyatta/etc/functions/script-template script like:


#!/bin/vbash
     
if [ $# == 0 ]; then
  echo "usage: $0 <new-ip>"
  exit 1
fi
new_ip=$1;
     
source /opt/vyatta/etc/functions/script-template
configure
set interfaces tunnel tun0 local-ip $new_ip
commit
save
exit

This actually breaks due to a bug, so I have had to use /opt/vyatta/sbin/vyatta-cfg-cmd-wrapper in my script instead.  It works just fine.

The cron entries that run the script at 0330 and 0530 UTC…

30 3 * * * /home/joeuser/block_mac.sh
30 5 * * * /home/joeuser/block_mac.sh unblock

Updated on Feb 11th 2018...

It seems that either I missed this feature or Ubiquiti just added it: you can add times to enable and disable a rule.  For instance, in the case of rule #2 above, you would add starttime and stoptime statements.  You can also specify days of the week or a date such as day/month/year.  This has been in Vyatta for a while now.

set firewall name SWITCH0_IN rule 2 action drop
set firewall name SWITCH0_IN rule 2 description iPhone
set firewall name SWITCH0_IN rule 2 disable
set firewall name SWITCH0_IN rule 2 log disable
set firewall name SWITCH0_IN rule 2 protocol all
set firewall name SWITCH0_IN rule 2 source mac-address '11:22:33:44:55:66'
set firewall name SWITCH0_IN rule 2 time starttime '21:00:00'
set firewall name SWITCH0_IN rule 2 time stoptime '22:00:00'
set firewall name SWITCH0_IN rule 2 time utc

Radio Air Checks…

Radio air checks are normally used by radio talent and program directors to go back and listen to the talent's performance. Many times these air checks are recordings of a DJ's shift that were "telescoped" so they only captured the time that the talent's microphone was turned on (aka 'open'). Air checks were also collected by radio fans who wanted to either record a DJ they liked or perhaps grab that hit song they could listen to later.

Mike Schweizer was a bit of a radio fan boy in his early years. Later on he became a radio engineer specializing in remote broadcasts and working as an engineer for stations like KUSF, KYA and KSFO. As a kid and through his adult life, he made recordings of stations and collected hundreds of reels of tape. Unfortunately his life was cut short and he passed away in 2011. Before he passed, he transferred a good number of these air checks to digital. I have many of these up at my site for you to listen to.  Since I published them some years ago, the Internet Archive has copied these recordings and put them up there as well.

Many of these are classics, such as recordings of early free-form radio like KMPX or air checks of Wolfman Jack on XEPRS.

I should say that not all of these were recorded by Mike; it seems a handful recorded by others have snuck in here.  If you find that one of those is yours, please drop me a note and I can remove it if you deem it necessary.

Librenms’ API

Librenms is a very flexible network and server monitoring and alerting system.  I have it deployed at a number of companies because of its ease of installation, the fact that it auto-discovers devices, and that it is updated frequently (typically multiple times a week) and supports pretty much every network device you can think of.

On top of that, the alerting can be tuned to match very specific cases.  Since the back end is MySQL, your alerting conditions can match almost anything you can write a SQL query for.  A good example would be to only alert on interfaces that have a specific description such as "TRANSIT", where the device has a host name containing "edge" and the connection is 10Gb/s (the interface name starts with 'xe').  Because you can group things by description or part of a host name, you can say that anything with the string "edge" in the hostname should be considered an "edge router", so a group "ER" can be created for these devices.  With autodiscovery, as soon as you add a device it will automatically be put into whatever group the rule/regular expression matches.

One of the more interesting features is Libre's API.  You can get pretty much any detail you want out of what Libre has collected and stored in the DB.  It will also create graphs for you on the fly.  One case I have had in the past was creating daily and weekly total bandwidth graphs for a set of specific ports on a group of switches.  The switch ports have a particular unique string I can match on, so I was able to create a "group" called "peering" that included these ports across all of the switches.

I wrote a simple script called create_public_graphs.sh that asks for a graph for the daily and weekly time frames.  I also added various options to the request, such as not showing the legend of interfaces and making the in and out directions all one color.  (The other option is a different color for each interface; we wanted a clean look, so we went for the solid color.)  The API doesn't do everything you may want, such as titling the graph, so this is where I use the "convert" program from ImageMagick to overlay some text at the top of the graph.  You can see the final result at the SFMIX site.
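
The script itself isn't posted here, but a rough sketch of the idea looks something like the following.  The host name, token, device and port are placeholders, and the exact graph endpoint can vary by Librenms version, so treat this as an outline rather than the actual create_public_graphs.sh:

#!/bin/sh
# Rough sketch: pull a port traffic graph from the Librenms API and
# overlay a title with ImageMagick. All names here are placeholders.
API="https://librenms.example.com/api/v0"
TOKEN="YOUR_API_TOKEN"
# Ask the API for a PNG of the last day of traffic on one port
# (slashes in the interface name are URL-encoded)...
curl -s -H "X-Auth-Token: $TOKEN" -o daily.png \
  "$API/devices/switch1.example.com/ports/xe-0%2F0%2F0/port_bits?from=-1d"
# The API won't title the graph, so overlay some text at the top...
convert daily.png -gravity north -pointsize 18 \
  -annotate +0+5 "Peering traffic - daily" daily_titled.png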

Getting Mediainfo Data into a CSV

Mediainfo is a pretty handy tool for examining media files like MP4 containers and the streams in them, such as the video and audio streams. It will also spit out a couple of formats, such as XML, to make the data easier to parse. But I just want a certain set of data moved into a CSV file so I can bring it into something like Google Sheets or Excel. It seems the "Inform" argument can get me some of the way there. You can tell it to give you multiple data points about a particular stream or the "General" aspects of the file, but you can't mix, say, "Audio" and "Video". That's OK; I just want a handful of things about the video stream and the filename. Whoops. The file name is in the "General" bucket. So I am going to cheat a little with the "echo" command and tell it to print the file name and a comma without a new line, to create the first column of the CSV row.

So this little script will find my MP4 movie files in the “/foo/*” directories and subdirectories, assign the name to the “movie” variable, print out the name and comma without a new line and then spit out a bunch of stuff about the video stream to get me a nice CSV output…

#!/bin/sh
# CSV header row...
echo "Filename,PixelWidth,PixelHeight,DisplayRatio,Bitrate,FrameRate"
# Find the MP4 files (case-insensitive) and emit one CSV row per file...
find /foo/ -type f -iname '*.mp4' | while read -r movie; do
  # File name plus a comma, with no newline, for the first column...
  echo -n "$movie,"
  # The rest of the row comes from the video stream...
  mediainfo --Inform="Video;%Width%,%Height%,%DisplayAspectRatio/String%,%BitRate%,%FrameRate%" "$movie"
done

And I get something like:

Filename,PixelWidth,PixelHeight,DisplayRatio,Bitrate,FrameRate
/foo/fill1.mp4,1440,1080,4:3,15000000,23.976
/foo/fill2.mp4,1440,1080,4:3,15000000,23.976
/foo/fill3.mp4,1920,1080,16:9,15000000,29.970

You can get a list of Mediainfo parameters to use with the "--Info-Parameters" argument.

[Update: I created a mediainfo template and script that will create a nice CSV with lots of info from an MP4 container. Assuming one video and audio track for the container.  You can see it on github at: https://github.com/pozar/mediainfo2csv]

Let’s Encrypt!

Encryption is in the news again. Various three-letter government organizations want to have backdoors in devices like cell phones for surveillance. Of course that means that, with a back door or exploit into an operating system or application, anyone can track traffic from these devices. Trying to limit it to "lawful interception" would be impossible.

Encryption has two significant roles: security, so a third party can't view the traffic, and authentication, so you have some confidence that you are talking to the right party. Having traffic in the clear, without encryption, means that your communications can be easily captured and your session could be spoofed. You certainly don't want your web sessions with your bank in the clear, where a nefarious party can watch your traffic and even spoof your session to transfer your funds to them. Internet commerce would not work without encryption.

The encryption method that web sites use is called HTTPS. It uses a protocol called TLS to set up an encrypted session between you and the web site. The nice thing about HTTPS and TLS is that they can use a number of different strong ciphers to make it pretty difficult for third parties to sniff your traffic. They also use a "chain of trust" system to provide some authentication that the web site you are using is really the site you think it is.

Up until recently, setting up HTTPS and acquiring and installing the certificate for a web site has not been for the faint of heart. It has also been pretty expensive: purchasing a certificate can run between $250 and $500 a year. Your personal web site, or even a small company, may not have the coin to purchase a certificate. As such, many sites have opted not to run HTTPS and instead run the more common and insecure HTTP protocol. This is where Let's Encrypt comes into this story.

To quote from Let’s Encrypt’s web site:

Let’s Encrypt is a free, automated, and open certificate authority (CA), run for the public’s benefit. Let’s Encrypt is a service provided by the Internet Security Research Group (ISRG). The ISRG is a non-profit with the mission to “reduce financial, technological, and education barriers to secure communication over the Internet.”

Let's Encrypt is doing just that. It addresses the speed bumps to creating secure communications: it is free and it is simple. For most operating systems and web servers, it just means downloading the Let's Encrypt software, running it and restarting the web server. Your site would be up and running with a valid HTTPS session. Although this is true for most Linux distributions, it isn't quite there for UNIX-like systems such as FreeBSD, which this site uses. It did take me a bit more hacking around to get it to work. Googling around, you can find out how to get this software working on FreeBSD, as well as how to configure your web server (e.g. Apache on this box) to use Let's Encrypt's certificate and to renew it when the cert expires.

One of the nice things about Let's Encrypt is the process of proving who you are. Normally with any other certificate authority, it would mean email, phone calls, etc. back and forth a number of times, which can take hours or days. The Let's Encrypt process just requires you to take down your web site for the short period you run the Let's Encrypt client. Running the client puts up a little web site that the Let's Encrypt servers will validate against. If you have control over your domain, this process will work and the Let's Encrypt servers will hand back a certificate for your web site that is good for 90 days. From then on, you just run the client software, say once a month, to update the certificate. Any operating system has a way of running scheduled applications, such as "cron" for Linux/FreeBSD. Oh… and all of this is free.
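
As a rough sketch of what that looks like (using certbot, which is what the original Let's Encrypt client became; the domain and schedule are placeholders):

# Get a certificate using the standalone validator. This needs port 80
# briefly, so stop your own web server while it runs...
certbot certonly --standalone -d www.example.com
# A cron entry to check for renewal monthly and reload Apache afterwards...
0 3 1 * * certbot renew --quiet --post-hook "apachectl graceful"

By default certbot only renews a certificate when it is within 30 days of expiring, so it is safe to run the renew check more often than monthly.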

So there shouldn't be an excuse for running a non-encrypted web site now. Protect yourself and your users by using HTTPS with Let's Encrypt.

[A very nice page detailing how to install and renew Letsencrypt certs for Ubuntu can be found at: https://www.digitalocean.com/community/tutorials/how-to-secure-apache-with-let-s-encrypt-on-ubuntu-14-04 – Tim]

Ubiquiti Rockets and a 50Km Path Over Water…

This last fall, we put in a 50 km, 5.8 GHz link from the center of San Francisco (Twin Peaks) to the South East Farallon Island lighthouse using Ubiquiti Rockets. At first the link was unusable. This was mainly because the long distance and shooting over water cause the received signal to vary wildly. This caused the radios to frequently and rapidly try to change the MCS (modulation scheme) and made the link very lossy. Here are some settings I had to settle on to get the link to work.

  • Do not enable auto-negotiate for the signal rate on long links. The radios will auto negotiate data rates when the receive signal level changes. This will momentary drop the link while the ends sync up. If the signal is bouncing frequently this will make the link pretty lossy or not usable at all.
  • Long links or links that are being interfered with will likely have problems with modulation schemes that have an amplitude component, such as QAM. If so, use a modulation scheme that doesn't have an amplitude component, like BPSK, where you can leverage the "capture effect". This would be MCS0 (1 chain) and MCS8 (2 chains).
  • Fix the distance of the link to about 30% over the calculated distance. The auto-magic calculation that AirOS does typically is wrong with long links.
  • Turn off AirMax on Point to Point links. AirMax is used to manage multiple clients on one AP more fairly. Not needed for P2P.
  • Use as narrow a channel as you can that supports the bandwidth you need. As per the AirOS manual…
Reducing spectral width provides 2 benefits and 1 drawback.
  • Benefit 1: It will increase the amount of non-overlapping channels. This can allow networks to scale better
  • Benefit 2: It will increase the PSD (power spectral Density) of the channel and enable the link distance to be increased
  • Drawback: It will reduce throughput proportional to the channel size reduction. So just as turbo mode (40MHz) increases possible speeds by 2x, half spectrum channel (10MHz), will decrease possible speeds by 2x.

Airfields of Yesteryear

An older post of mine talked about looking back at history with the little geodetic survey benchmarks you see in the sidewalk and at the base of older buildings. Modern archaeology has always interested me, and if you are interested too, there is a wonderful site documenting abandoned airports around the US named "Abandoned & Little-known Airfields". It covers the history and evidence left behind from when general aviation was more popular, along with strange little military operations that were out in the middle of nowhere.

Having spent the last 25 or so years living in San Francisco, it was a surprise to find strips I didn't know about, such as the Bay Meadows Airport in San Mateo and the Marina Airfield next to Crissy Field in San Francisco. Marina Airfield was the first terminus of the United States Post Office Department's Trans-Continental Air Mail Service.

Growing up in Fresno, I remember the remnants of Furlong Field just out Shaw Avenue. Good to see it documented here so it isn’t forgotten as development has pretty much obliterated any trace of it.

Sad to see so many fields disappear with the wane of general aviation in this country. It is just too expensive for most to own or lease a plane and keep it up. Land is being sold to developers as cities can see better tax revenue with a shopping center than an air strip.

Small form-factor broadcast console…

Mackie set the standard for inexpensive, small form-factor recording and sound consoles. I own a 1402 VLZ console that fits in a small briefcase and sounds great. The problem is that it is the wrong console for most of the work I do that needs a console. Coming from the broadcast side of the world and not the recording side, I want things like a cue buss that sits at the end of the fader travel, or the control room monitors muting when I turn on the mike so I don't get feedback. I want logic that can switch a CD player into play when I bring up the fader or hit a start button. None of these "features" are typically required on recording and sound consoles, and that is where the biggest market for companies like Mackie is.

Allen & Heath, a respected name in recording consoles, has just come out with their first stab at a broadcast console in the same sort of form-factor as the Mackie 1402. It is called the XB-14 and has most of the bells and whistles that I have been looking for. I have been told by Mark Haynes at Leo's Pro Audio that they should have one in next week to test drive, and I am looking forward to seeing if they got it right.

One "down" side of the console is the price. It is selling at just under $1,400, though I have seen it advertised at $1,200. The Mackie 1402 runs around $500. I can see that the XB-14 has some extra features to make it more of a broadcast desk, but $700 more? I hope some of these boxes sell to encourage folks like Mackie to compete for this market.

Great tool for checking Line of Sight…

Google Maps has opened up access to resources that previously would have taken considerable work and expense. Ten years ago, just purchasing software that could do ray tracing over a geographic area would have cost tens of thousands of dollars. Now "HeyWhatsThat" has leveraged Google Maps to do just this, and it is free.

Now, why would I be so interested in this site? Being a bit of a wireless geek, I find it a great starter tool for understanding how much coverage area a mountain top has. In the example shown on the right, you can see the coverage area from the Twin Peaks communications site in San Francisco. The orange/red overlay indicates the area that this site can see. You can see the shadowing from some of the hills of San Francisco affecting the coverage area.

The top of the frame shows a panorama of the skyline seen from that site. The list on the right shows which mountain tops can be seen and the distance to them.

HeyWhatsThat is a great starting point for checking out a coverage area. I wouldn't throw away your $50,000 coverage software just yet, as it will be a bit more accurate, using better algorithms to calculate coverage such as Longley-Rice and TIREM as well as its own tweaks.