A GPS Stratum 1 NTP Server

“A man with a watch knows what time it is. A man with two watches is never sure.” – Segal’s Law


Keeping accurate time is critical for a number of things at an ISP. Accurate time is needed when debugging, to see which devices were affected and when. If you had one reference clock that everything synced to, it really wouldn’t matter if it was 5 ms off or 5 minutes off, so long as everything was based on that clock. Where it matters is when you are trying to match your logs and data with devices that are not synced to your clock but to some other clock. So long as the other set of devices has synced to the same coordinated set of clocks that you have, you have a good chance of lining up your logs with theirs.

In tracking an event you are looking for timestamps that are “newer” or “older”. To see fast-moving events, time may need to be accurate to milliseconds or microseconds. To determine what time it is, a device will normally use a protocol like the Network Time Protocol (NTP). NTP sends a packet out to another, trusted NTP server to ask what time it is. As there is latency on the Internet and in the protocol, NTP also tries to calculate that latency and adjust for it.
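The latency adjustment can be sketched from the four timestamps that an NTP request and reply carry. This is a minimal illustration of the standard offset and delay arithmetic, not how any particular NTP daemon is implemented; the function name and the sample timestamps are made up for the example.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Return (offset, delay) in seconds from one NTP exchange.

    t0: client clock when the request left the client
    t1: server clock when the request arrived
    t2: server clock when the reply left the server
    t3: client clock when the reply arrived
    """
    # Round-trip time on the wire, excluding the server's processing time.
    delay = (t3 - t0) - (t2 - t1)
    # Assumes the path is symmetric: half of the delay in each direction.
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    return offset, delay


# Made-up exchange: client clock 0.5 s behind the server, 20 ms each way,
# 1 ms of processing on the server.
offset, delay = ntp_offset_and_delay(100.000, 100.520, 100.521, 100.041)
print(f"offset {offset:+.3f} s, round-trip delay {delay:.3f} s")
```

If the path is not symmetric, the offset is wrong by up to half of the round-trip delay, which is one reason to keep the reference clock close to the devices that use it.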

Trusted NTP servers are classified by where they get their time. A highly accurate reference clock is stratum 0. A stratum 1 server gets its time directly from a stratum 0 clock, a stratum 2 server gets its time from a stratum 1 server, and so on.

You do everything you can to minimize the workload on these servers so that when a server is queried, it can respond as quickly and reliably as possible. So you would not have all of your devices connect to your only reference clock. Instead, you would deploy a second tier of NTP servers that sync to your reference clock, and your devices would get their time from that tier. As we are building a stratum 1 clock, that second tier would be a set of stratum 2 servers for the rest of your devices. The stratum 2 servers would also sync with other NTP servers on the net to confirm that the reference clock is accurate.

The NTP Server

This server uses the Adafruit Global Positioning System (GPS) receiver HAT sitting on a Raspberry Pi 3 to get its time from GPS. GPS has to keep highly accurate time in order to report accurate positions.


As mentioned above, our server will get its time from a stratum 0 source. In this case, a GPS clock is considered stratum 0 as it is a highly stable “coordinated clock”. Coordinated means that the time it calculates is coordinated with other clocks, such as the rest of the GPS satellite clocks as well as established clocks on the ground. Drift between these clocks is noted and adjusted as needed in order to keep all coordinated clocks in sync.

GPS receivers indicate time in a couple of ways. The actual timestamp is sent to the Raspberry Pi every second as a text string over a 9600 baud serial port. Because the string varies in length, and because of other delays such as the computer servicing serial port interrupts, this is not an accurate way of finding out what time it is. It can vary by several hundred ms. A second signal is needed to tell the Raspberry Pi the actual moment when the second occurs. This is typically a General Purpose I/O (GPIO) pin going high for 100 ms and then going low. The GPIO pin is exposed to the computer as a Pulse Per Second (PPS) interface such as /dev/pps0. The moment the pin goes high, called the rising edge, is the exact moment the second occurs. So timing software on the computer gets the time of day from the serial port and the exact start of the second from the GPIO pin going high.
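To make the serial side concrete, here is a minimal sketch that pulls the UTC timestamp out of one NMEA RMC sentence and estimates how long the sentence takes to cross a 9600 baud link. It is an illustration only and not what GPSd does internally; the function names and the sample sentence (including its position fields) are made up, and the checksum is computed by the code rather than taken from a real receiver.

```python
from datetime import datetime, timezone
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of every character between '$' and '*', as two hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def parse_rmc_time(sentence: str) -> datetime:
    """Return the UTC timestamp carried by an RMC sentence."""
    body, _, checksum = sentence.strip().lstrip("$").partition("*")
    if checksum.upper() != nmea_checksum(body):
        raise ValueError("NMEA checksum mismatch")
    fields = body.split(",")
    if not fields[0].endswith("RMC"):
        raise ValueError("not an RMC sentence")
    hhmmss, ddmmyy = fields[1], fields[9]
    # Fractional seconds, if present, are ignored in this sketch.
    stamp = datetime.strptime(ddmmyy + hhmmss[:6], "%d%m%y%H%M%S")
    return stamp.replace(tzinfo=timezone.utc)


# Made-up sentence; the checksum is appended by the helper above.
body = "GPRMC,220159,A,3746.494,N,12225.164,W,000.0,000.0,210820,,"
sentence = f"${body}*{nmea_checksum(body)}"
print(parse_rmc_time(sentence).isoformat())

# With 8 data bits, 1 start bit and 1 stop bit, each character costs 10 bits.
wire_time = len(sentence) * 10 / 9600
print(f"{len(sentence)} characters take about {wire_time * 1000:.0f} ms at 9600 baud")
```

The transmission time alone is tens of milliseconds and changes with the sentence length, which is why the serial timestamp is only used to name the second while the PPS edge marks where it starts.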

On a computer, you will have one process looking at the data from the GPS and one process handling the NTP protocol. The second process gets the timestamp from the first, and can either rely on the first process to watch the /dev/pps0 interface or watch /dev/pps0 itself. In the current configuration, the NTP server (chronyd) gets the GPS timestamp from the GPSd process via shared memory and watches the /dev/pps0 interface itself for the exact moment that the second occurs.

The GPSd daemon simply listens to the serial port and reports what it hears to the NTP server via shared memory. We are using a GPSd compiled from source, as the packaged version for the Raspberry Pi distro was several revisions old. The NTP server, Chrony, is also a recent release and locally compiled. In this case, Chrony listens to the PPS interface to learn when the actual second starts.


You can check the status of the NTP server by running the Chrony command-line interface, “chronyc”. Two commands worth noting are “sources” and “tracking”. The “sources” command lists the sources Chrony is using for time. In the output below, PPS0 is the Pulse Per Second GPIO pin. Its offset is around 200 ns, well within the reported error bound of about +/- 8 µs. GPS0 is the GPS NMEA strings passed via shared memory from GPSd to chronyd. Normally this would be off by around 500 ms, so we compensate by putting an offset of 0.500 seconds in the Chrony config. PSM0 and PST0 are PPS samples that GPSd would pass along via shared memory and a socket. Since GPSd is not handling PPS, these entries are zero. The rest are various NTP servers that Chrony can compare against to gain some confidence that the GPS data is sane.

chronyc> sources
210 Number of sources = 8
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
#* PPS0                          0   4   377    10   +227ns[ +516ns] +/- 8186ns
#? GPS0                          0   3   377     5    -12ms[  -12ms] +/-  100ms
#? PSM0                          0   3     0     -     +0ns[   +0ns] +/-    0ns
#? PST0                          0   3     0     -     +0ns[   +0ns] +/-    0ns
^- jane.qotw.net                 2   4   377    10    +20us[  +20us] +/-   88ms
^- ntp18.kashra-server.com       2   4   177    10  +1485us[+1485us] +/-   95ms
^- time.iqnet.com                2   4   377     4  -1098us[-1097us] +/-  141ms
^- www.almaprovence.fr           3   4   377     5  -1196us[-1196us] +/-  115ms
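For readers who want to check these numbers in a script, the sketch below parses one line of the “sources” output into seconds. It is a rough parser written against the column layout shown above, not an official chronyc interface; the function and field names are made up. It also decodes the Reach column, which is an octal bitmask of the last eight polls.

```python
import re

UNIT = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}

LINE = re.compile(
    r"^(?P<mode>.)(?P<state>.)\s+(?P<name>\S+)\s+(?P<stratum>\d+)\s+(?P<poll>\d+)"
    r"\s+(?P<reach>[0-7]+)\s+(?P<lastrx>\S+)\s+(?P<offset>[+-]\d+[a-z]+)\[.*?\]"
    r"\s+\+/-\s*(?P<error>\d+[a-z]+)\s*$"
)


def to_seconds(value: str) -> float:
    """Convert a chronyc value such as '+227ns' or '88ms' to seconds."""
    number, unit = re.fullmatch(r"([+-]?\d+)([a-z]+)", value).groups()
    return int(number) * UNIT[unit]


def parse_source(line: str) -> dict:
    m = LINE.match(line)
    if m is None:
        raise ValueError("unrecognised sources line")
    reach = int(m["reach"], 8)  # octal: 377 means all of the last 8 polls answered
    return {
        "name": m["name"],
        "selected": m["state"] == "*",  # '*' marks the source chronyd is synced to
        "polls_ok": bin(reach).count("1"),
        "offset_s": to_seconds(m["offset"]),
        "error_s": to_seconds(m["error"]),
    }


pps = parse_source("#* PPS0                          0   4   377    10   +227ns[ +516ns] +/- 8186ns")
print(pps)
```

Run against the PPS0 line, this reports an error bound of roughly 8 microseconds and eight successful polls out of eight.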

The “tracking” command shows which stratum Chrony thinks it is. In this case it reports Stratum 1, which means it has enough confidence in the GPS receiver to slave to it.

chronyc> tracking
Reference ID    : 50505330 (PPS0)
Stratum         : 1
Ref time (UTC)  : Fri Aug 21 22:01:59 2020
System time     : 0.000000248 seconds fast of NTP time
Last offset     : +0.000000319 seconds
RMS offset      : 0.000000776 seconds
Frequency       : 0.923 ppm slow
Residual freq   : +0.000 ppm
Skew            : 0.034 ppm
Root delay      : 0.000000001 seconds
Root dispersion : 0.000025404 seconds
Update interval : 16.0 seconds
Leap status     : Normal
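Two of the fields above can be decoded with a little arithmetic. The sketch below is only an illustration with made-up helper names: for a reference clock, the Reference ID is the refid from the config as four ASCII bytes, and the Frequency value in ppm can be turned into how far the Pi’s clock would drift in a day if Chrony were not correcting it.

```python
def decode_refid(hex_refid: str) -> str:
    """Turn a refclock Reference ID such as '50505330' into its ASCII name."""
    return bytes.fromhex(hex_refid).decode("ascii")


def drift_per_day(ppm: float) -> float:
    """Seconds gained or lost per day by an uncorrected clock that is off by ppm."""
    return ppm * 1e-6 * 86400


print(decode_refid("50505330"))
print(f"{drift_per_day(0.923) * 1000:.1f} ms per day")
```

The first line prints PPS0, matching the name in parentheses, and 0.923 ppm works out to about 80 ms per day.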

Configuration files


# /etc/default/gpsd
## mod_install_stratum_one
# Default settings for the gpsd init script and the hotplug wrapper.
# Start the gpsd daemon automatically at boot time
# Use USB hotplugging to add new USB devices automatically to the daemon
# Devices gpsd should collect to at boot time.
# They need to be read/writeable, either by user gpsd or the group dialout.
# DEVICES="/dev/ttyAMA0 /dev/pps0"
# in case you have two pps devices connected
# DEVICES="/dev/ttyAMA0 /dev/pps0 /dev/pps1"
# We tell GPSD to use the console serial port.  Alas, it will try to open /dev/pps0.
# Run Chrony first to grab the port so GPSd doesn't get it.
# Other options you want to pass to gpsd
GPSD_OPTIONS="-n -r -b"
# Serial setup
# For serial interfaces, options such as low_latency are recommended
# Also, http://catb.org/gpsd/upstream-bugs.html#tiocmwait recommends
#   setting the baudrate with stty
# Uncomment the following lines if using a serial device:
/bin/stty -F ${DEVICE} raw ${BAUDRATE} cs8 clocal -cstopb
# /bin/setserial ${DEVICE} low_latency


# PPS0
# PPS: /dev/pps0: Kernel-mode PPS ref-clock for the precise seconds
# refclock  PPS /dev/pps0                   refid PPS0  precision 1e-7  poll 3  trust  noselect  lock PSM0
refclock PPS /dev/pps0 refid PPS0 trust
# SHM(0), gpsd: NMEA data from shared memory provided by gpsd
# refclock  SHM 0                           refid GPS0  precision 1e-1  poll 3  trust  noselect  offset 0.0
refclock  SHM 0                           refid GPS0  precision 1e-1  poll 3  trust  noselect  offset 0.500
# SHM(1), gpsd: PPS0 data from shared memory provided by gpsd
refclock  SHM 1                           refid PSM0  precision 1e-7  poll 3  trust  prefer
# SOCK, gpsd: PPS0 data from socket provided by gpsd
refclock  SOCK /var/run/chrony.pps0.sock  refid PST0  precision 1e-7  poll 3  trust  noselect
# NTP Servers
pool  2.debian.pool.ntp.org  iburst  minpoll 4  maxpoll 4
# for the GPS...
# Start Chrony first to grab /dev/pps0 before GPSd does...
sleep 2
stty -F /dev/serial0 raw 9600 cs8 clocal -cstopb
/usr/local/sbin/gpsd -n -b /dev/ttyAMA0 -F /var/run/gpsd.sock



LibreNMS’ API

LibreNMS is a very flexible network and server monitoring and alerting system.  I have it deployed at a number of companies because it is easy to install, it auto-discovers devices, it is updated frequently (typically multiple times a week) and it supports pretty much every network device you can think of.

On top of that, the alerting can be tuned to match very specific cases. The back end is MySQL, so your alerting conditions can match almost anything you can write a SQL query for.  A good example would be to alert only on interfaces that have a specific description such as “TRANSIT”, where the device has a hostname containing “edge” and the interface is a 10 Gb/s connection (the interface name is ‘xe’).  Because you can group things by description or part of a hostname, you can say that anything with the string “edge” in the hostname should be considered an “edge router”, so a group “ER” can be created for these devices.  With autodiscovery, as soon as you add a device, it will automatically be put into the group whose rule or regular expression matches it.

One of the more interesting features is LibreNMS’ API.  You can get pretty much any detail you want out of what LibreNMS has collected and stored in the DB.  It will also create graphs for you on the fly.  One case I have had in the past is creating daily and weekly total bandwidth graphs for a set of specific ports on a group of switches.  The switch ports have a particular unique string I can match on, so I was able to create a “group” called “peering” that included these ports across all of the switches.

I wrote a simple script called create_public_graphs.sh that asks for a graph for the daily and weekly time frames.  I also added various options to the request, such as hiding the legend of interfaces and making the in and out directions each one solid color.  The other option is to use a different color for each interface.  We wanted a clean look, so we went for a solid color.  The API doesn’t do everything you may want, such as titling the graph.  This is where I use the “convert” program from ImageMagick to overlay some text at the top of the graph.  You can see the final result at the SFMIX site.
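The original script is a shell script and is not reproduced here. As a rough sketch of the kind of request it makes, the Python below builds a port group graph request for a daily and a weekly window. The route (/api/v0/portgroups/ followed by the group name), the from, to, width and height parameters, and the X-Auth-Token header are written as I recall them from the LibreNMS API documentation and should be checked against your version. The hostname, token and function name are placeholders, and the legend and color options mentioned above are deliberately left out.

```python
import time
from urllib.parse import quote, urlencode
from urllib.request import Request


def portgroup_graph_request(base_url, token, group, start, end, width=800, height=300):
    """Build (but do not send) a request for a port group graph."""
    query = urlencode({"from": start, "to": end, "width": width, "height": height})
    url = f"{base_url.rstrip('/')}/api/v0/portgroups/{quote(group)}?{query}"
    return Request(url, headers={"X-Auth-Token": token})


now = int(time.time())
windows = {"daily": now - 86400, "weekly": now - 7 * 86400}
for label, start in windows.items():
    # Placeholder host and token; substitute your own.
    req = portgroup_graph_request("https://librenms.example.net", "YOUR_API_TOKEN",
                                  "peering", start, now)
    print(label, req.full_url)
    # To fetch the image: urllib.request.urlopen(req).read(), then write it to a file.
```

The returned image can then be handed to ImageMagick for the title overlay.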

Why you should never use zip (wire) ties…

Let’s get geeky…

A good argument against zip ties was very well documented by a friend of mine, Steve Lampen at Belden[1]. It comes down to the fact that any wire is a transmission medium. Much like coax and open line, it needs a constant impedance throughout the length of the line. Otherwise your signal will not arrive at the other end the way you want, because mismatches in the impedance of the wire cause standing waves.

What determines the impedance of a transmission line is the dielectric (the material between the conductors), the diameter of the conductors and their distance from each other. Zip ties will typically change these, depending on how much pressure is used when installing the ties.
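The effect of squeezing a pair can be sketched with the textbook formula for two parallel round wires, Z0 = (120 / sqrt(er)) * acosh(D / d), where D is the center-to-center spacing, d is the conductor diameter and er is the effective dielectric constant. This is an idealized model, not a substitute for the cable manufacturer’s data; the dimensions and the dielectric constant below are illustrative guesses for a 24 AWG pair, not measurements.

```python
import math


def pair_impedance(spacing_mm: float, diameter_mm: float, er: float) -> float:
    """Characteristic impedance (ohms) of an ideal two-wire line."""
    return 120.0 / math.sqrt(er) * math.acosh(spacing_mm / diameter_mm)


# Illustrative values only: 24 AWG conductor, assumed effective er of 2.1.
nominal = pair_impedance(spacing_mm=0.95, diameter_mm=0.51, er=2.1)
squeezed = pair_impedance(spacing_mm=0.85, diameter_mm=0.51, er=2.1)

# Fraction of the signal voltage reflected where the line changes impedance.
reflection = (squeezed - nominal) / (squeezed + nominal)

print(f"nominal  {nominal:.1f} ohms")
print(f"squeezed {squeezed:.1f} ohms")
print(f"reflection coefficient {reflection:+.3f}")
```

In this model, pushing the conductors a tenth of a millimeter closer together lowers the impedance by several ohms at that spot, and every such spot reflects a small part of the signal.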

Additionally, you have to be careful in how you tie your cables together. Everyone wants a neat-looking installation and will likely put a nice even spacing between ties. Each tie changes the characteristic impedance at that point, and with the ties spaced out evenly, the small reflections add up and you get a really big notch in the frequency response of the line at a frequency set by the distance between the ties. It is actually better to space the ties randomly.
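Where that notch lands can be estimated from the tie spacing. Reflections from evenly spaced disturbances add in phase when the spacing is a multiple of half a wavelength in the cable, so the first affected frequency is roughly f = v / (2 * spacing). The sketch below is a back-of-the-envelope estimate; the velocity factor of 0.7 is an assumption, so check the datasheet for your cable.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def notch_frequency_hz(spacing_m: float, velocity_factor: float = 0.7, harmonic: int = 1) -> float:
    """Frequency at which evenly spaced reflections add in phase."""
    return harmonic * SPEED_OF_LIGHT * velocity_factor / (2.0 * spacing_m)


for spacing in (0.15, 0.30, 0.60):
    mhz = notch_frequency_hz(spacing) / 1e6
    print(f"ties every {spacing * 100:.0f} cm: first notch near {mhz:.0f} MHz")
```

With random spacing the reflections no longer pile up at one frequency, which is the reasoning behind spacing the ties unevenly.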

Cable lacing with flat waxed string is one of the better ways to tie down cable. It provides for less pressure on the cable and is also less of an obstruction when you have to work with adding more or moving cable around the area that you have previously tied down. Many colos and Central Offices (CO) I have worked with have banned wire ties for these reasons.

A great alternative to lacing is VELCRO[2] or hook and loop fabric straps. Velcro has the advantage that it can be quickly moved around and doesn’t bind the cable or change its specification. Typically it is used to bind a bunch of cables together. You may still need to use lacing to tie that bundle down to a wire tray. RIPTIE was one of the first companies to sell VELCRO for cable bundling. They have nice individual ties but they are expensive. If you don’t mind not having a fancy tag on your ties, then just get a roll of VELCRO from pretty much any dealer. Wrap the cable and cut as needed. You can even get different colors to identify different bundles, such as using the resistor color code to indicate bundle numbers. At my last company, we would use one color for core cables and another color for customer-facing cabling.

And again, check out Steve’s report.

[1] http://www.belden.com/pdfs/PDF/hdcarltp.pdf
[2] VELCRO is the registered trademark for Velcro Industries’ product