16 Sep 2017 Running RancherOS on Scaleway Cloud, the right(-ish) way.
There are some posts out there showing how to install RancherOS on SCW using the installer + syslinux/grub, thus effectively side-stepping the SCW infrastructure and forcing the user to manage the whole stack, from iPXE to a booted system. I think we can do better. After all, there is a rootfs available and, based on some SCW articles, this should be enough, right? Well, almost.
Create a new server instance and change the bootscript (it's under "Advanced" options) to one that ends with rescue. This will run the instance with the system loaded to RAM and leave the disk free for us to tinker with. The actual system chosen doesn't matter, we're gonna destroy it anyway in just a minute.
After the machine boots up, connect to it through ssh.
The first step is to zero out the partition table and create a new filesystem. I'm doing it the RancherOS installer way, although it shouldn't be strictly necessary. Just remember not to create any partitions; the whole drive should be formatted as is, otherwise the Scaleway scripts won't pick it up properly.
# dd if=/dev/zero of=/dev/vda bs=512 count=2048
# mkfs.ext4 -F -i 4096 -L RANCHER_STATE /dev/vda
Now mount it.
# mount /dev/vda /mnt
Download and unpack the RancherOS rootfs image to the disk.
# curl -Lo - https://github.com/rancher/os/releases/download/v1.0.4/rootfs.tar.gz | tar xz -C /mnt
Move to the partition's mount directory for easier access.
# cd /mnt
The RancherOS image stores the init script at the root of the filesystem, but the Scaleway boot script expects it under /sbin, so symlink it.
# ln -s ../../init sbin/init
Since the rootfs is just a filesystem image, it does not contain a kernel, nor, more importantly, kernel modules. Scaleway images use a custom script that downloads the appropriate modules from their servers on first run. Unfortunately, this script requires a Debian-like environment, with a full-blown shell, GNU Wget, depmod, etc. We don't have that in our barebones RancherOS image.
Therefore, I've written a different script, which, bundled with the proper (more on this in just a second) busybox, is able to do this without any other dependencies.
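In case you're curious, the general idea behind that script goes roughly as follows. This is only a simplified sketch, not the actual gist contents; the mirror URL and archive layout below are made-up placeholders, the real script is linked a bit further down.

# Simplified sketch of the module-sync idea, NOT the actual script.
# MIRROR is a placeholder; the real download location and archive layout differ.
MIRROR="http://<scaleway-mirror>/kernel"
KERNEL="$(busybox uname -r)"
busybox mkdir -p "/lib/modules/${KERNEL}"
busybox wget -O - "${MIRROR}/${KERNEL}/modules.tar.gz" \
    | busybox tar -xzf - -C "/lib/modules/${KERNEL}"
busybox depmod -a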
The proper busybox. We cannot use the one that's on the image, as it doesn't contain many of the applets we need (wget, depmod, ...). We also cannot use the "stock" one, downloaded from the busybox website, because it uses a "simplified" depmod, which produces output incompatible with the regular module utilities.
So, I have downloaded the sources and built a custom busybox binary which contains everything that's needed, with the proper configuration. The .config file used is available as a gist.
There is still one more gotcha here, though. For the name resolution in wget to work, we need to link against uclibc(-ng). Statically linking against glibc will not work and we, quite obviously, don't want a dynamic link. There are two ways to tackle this. Either use a distro that's built around uclibc, which you probably don't have (unless you're running an x86 version of OpenWrt somewhere [TIL]). Or build a cross toolchain. I went down the second path and used crosstool-NG as my guide.
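If you want to reproduce the binary yourself, the flow looks roughly like this. It's a sketch under a couple of assumptions: the crosstool-NG sample name varies between versions (check ct-ng list-samples for the exact spelling), and the busybox source directory is whatever version you happen to fetch.

# Build an x86_64 uclibc(-ng) cross toolchain with crosstool-NG.
# The sample name is an assumption; pick the closest match from `ct-ng list-samples`.
$ ct-ng x86_64-unknown-linux-uclibc
$ ct-ng build

# Then build a static busybox with it (toolchain bin/ on PATH), using the .config from the gist.
$ cd busybox-<version>
$ cp /path/to/gisted/.config .
$ make CROSS_COMPILE=x86_64-unknown-linux-uclibc- -j"$(nproc)"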
With all that out of the way, the following commands will download the script and the proper busybox binary and put them in /usr/local/sbin, as expected by the Scaleway boot script.
# mkdir -p usr/local/sbin
# curl -Lo usr/local/sbin/scw-sync-kernel-modules \
    https://gist.github.com/KenjiTakahashi/d3660cf8120c38d514d43436af9c2f90/raw/a99f9d568dcb609f4d16b10b5c53e1e103d55d7e/scw-sync-kernel-modules
# curl -Lo usr/local/sbin/busybox https://img.kenji.sx/busybox
# chmod +x usr/local/sbin/busybox usr/local/sbin/scw-sync-kernel-modules
Last, but not least, you should add a cloud-config file with your ssh key. Otherwise, you won't be able to log into the system in any way. Setting a hostname might be a good idea, too.
# mkdir -p var/lib/rancher/config/cloud-config.d
# cat <<EOF > var/lib/rancher/config/cloud-config.d/user-config.yml
#cloud-config
hostname: rancheros
ssh_authorized_keys:
- <YOUR_SSH_PUBLIC_KEY_HERE>
EOF
Remember to change the above command to contain your real ssh (public) key!
Unmount the disk and disconnect from the server.
# cd
# umount /mnt
# ^D
Power off ("archive") the machine.
There are two ways to go from here.
If you want to have an image that can be used to bootstrap many servers automatically, etc., you should now create a disk snapshot and then create an image off of that snapshot. The downside of this is that you need to keep the snapshot around (as the image will be tied to it) and snapshot storage costs money.
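If you prefer the command line over the web console, the docker-style scw CLI can do this, too. The sketch below assumes the v1 CLI that was current at the time of writing; the names are placeholders, so double-check against scw help before running it.

# Assumption: docker-style scw CLI (v1); server and image names are placeholders.
$ scw commit rancheros-template          # snapshot the server's volume
$ scw tag <snapshot-id> rancheros-1.0.4  # turn the snapshot into an image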
If not, you could just change the bootscript of the existing server to x86_64 <kernel_version> rancher #1, reboot, and it should work.
Note that the bootscript type is important; the rancher one is the one that contains all the modules necessary for RancherOS's system-docker to work as expected. If you went down the first path, you can also set it as the default bootscript for the image, to ease future deployments.
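A quick sanity check after the reboot (assuming the default rancher user that RancherOS creates and the key from your cloud-config):

$ ssh rancher@<server-ip>
$ sudo system-docker ps   # the system containers should all be up
$ lsmod | head            # the kernel modules got picked up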