System2 online updates

There’s so much more to a System2 board than just waggling a point motor, reading a switch or indicating on a panel via an LED. This week we’ve been focussing on the online update system.

For those who don’t know, we provide free online updates for System2 products. This uses your Wi-Fi connection and Internet access to check for and update products. Updates may contain bug fixes or new features.

It is a core value that we must not break the online update system, and that’s why we host the update infrastructure in Amazon S3 storage buckets. This is high-bandwidth mass storage that your board accesses during the update process, first to check whether an update is available and then to download it should you so wish.

Deep dive …

Once an update is initiated, the board downloads it in two stages. The first stage downloads the executable code. The second stage downloads the static files, such as the web pages the board serves to you when you configure the device.
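For those who like to see it written down, here is a rough sketch of the two-stage order in Python. It is not the firmware's actual code: the function and outcome names are made up for illustration, and each stage is passed in as a plain callable that raises OSError if the connection drops.

```python
from enum import Enum
from typing import Callable


class UpdateOutcome(Enum):
    # Hypothetical outcome names, used only for this illustration.
    COMPLETE = "both stages downloaded"
    STOPPED_AT_CODE = "stage 1 (executable code) did not finish"
    STOPPED_AT_STATIC = "stage 2 (static files) did not finish"


def run_update(fetch_code: Callable[[], None],
               fetch_static: Callable[[], None]) -> UpdateOutcome:
    """Run the two download stages in order and report how far we got."""
    try:
        fetch_code()      # stage 1: executable code
    except OSError:
        return UpdateOutcome.STOPPED_AT_CODE
    try:
        fetch_static()    # stage 2: web pages and other static files
    except OSError:
        return UpdateOutcome.STOPPED_AT_STATIC
    return UpdateOutcome.COMPLETE
```

The point of the sketch is simply that the static files are only fetched once the code download has finished, so an interruption can leave the two stages out of step.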

Going deeper …

We’ve also included vendor-supplied technology that tests for a working program after a download and automatically rolls back to the previous version if the new one is borked (broken). This test runs only once, on the first reboot, after which the old program image is automatically discarded.
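As a rough model of that behaviour (the real mechanism is the vendor's, and we are not reproducing its API here), the logic might be sketched like this in Python. BootState, first_boot_check and the self_test callable are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BootState:
    active: str               # image the board is currently running
    previous: Optional[str]   # old image, kept only until the new one is verified
    pending_verify: bool      # True only on the first boot after an update


def first_boot_check(state: BootState,
                     self_test: Callable[[], bool]) -> BootState:
    """Test the new image once; keep it if it works, otherwise roll back."""
    if not state.pending_verify:
        return state  # already verified on an earlier boot, nothing to test
    if self_test() or state.previous is None:
        # New image works (or there is nothing to fall back to):
        # keep it and discard the old image.
        return BootState(state.active, None, False)
    # New image is broken: return to the previous image.
    # Assumption: the failed image is not kept around afterwards.
    return BootState(state.previous, None, False)
```

Because pending_verify is cleared either way, the test can only ever run on that first reboot, which matches the "tested once" behaviour described above.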

Going down!!!

The second download is not a program file, but the web pages and data the user needs in order to interact with the board over the web interface. It is not covered by the vendor-supplied rollback technology discussed above. This file is the larger of the two downloads, and if the download is interrupted you may be unable to manage your device, though it will continue to function as per your previously saved settings. Settings are not changed during the update process.

This seems to happen in less than 1% of cases and would be caused by an interruption to the patching process.

If this happens, you now have a board that works, with no way to administer it.

Coming up?

Over the past week we’ve been testing a change in which the board checks for the static files within the file system and, if they are not present, silently reconnects to the update server and downloads them again. It has worked so well that if we take a new uninitialized board and install just the code without the static files, it self-repairs automatically.
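In spirit, the check looks something like the Python sketch below. This is an illustration under assumptions rather than the board's real code: the file names, the missing_files and self_repair helpers and the download callable are all invented for the example.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def missing_files(root: Path, required: Iterable[str]) -> List[str]:
    """Return the required static files that are not present under root."""
    return [name for name in required if not (root / name).is_file()]


def self_repair(root: Path,
                required: Iterable[str],
                download: Callable[[str, Path], None]) -> List[str]:
    """Fetch any missing static files again; return the names that were repaired."""
    missing = missing_files(root, required)
    for name in missing:
        # 'download' stands in for reconnecting to the update server.
        download(name, root / name)
    return missing
```

If nothing is missing, the function does nothing and returns an empty list, so it is safe to run on every start-up. A freshly flashed board with no static files at all is just the case where every file is missing.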

The clue that a repair is under way is that the on-board RUN LED flashes erratically, just as it does when downloading an update under manual control. We’ve versioned the files so that every board that eventually supports this technology will automatically download the version that matches the release of its installed software.
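One way to picture the versioning is that the board derives the name of the static file bundle from its own firmware release. The sketch below assumes a three-part version number and a made-up storage layout; neither the layout nor the product name is our real one.

```python
def static_bundle_key(product: str, firmware_version: str) -> str:
    """Build the (hypothetical) storage key for the static files that
    belong to a given firmware release, e.g. '1.4.2'."""
    parts = firmware_version.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.patch, got {firmware_version!r}")
    # int() rejects non-numeric parts and normalises values such as '04' to 4.
    major, minor, patch = (int(p) for p in parts)
    return f"{product}/{major}.{minor}.{patch}/static.bin"


# Usage with an invented product name:
key = static_bundle_key("example-board", "1.4.2")
```

Tying the key to the installed release means a board running older software never picks up web pages meant for a newer one.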

After the download completes, the board automatically reboots and the admin web interface will be present again.

So if sometime in the future you lose access to your admin interface and you’ve downloaded a version of the firmware that supports dynamic repair (possibly patched to all products within the next 2 weeks), just leave the board alone for some 20 minutes; it might just come back.

Apologies for the long wordy post. I thought it might be interesting for those who want to understand more.