linux.conf.au 2016 talk

Last week I did this ridiculous thing where I flew around the world in the easterly direction, giving talks at FOSDEM and linux.conf.au. The linux.conf.au staff always do a great job of making talk videos, and this year was no exception.

My talk was on LXD and live migration, a brief history of both as well as a status update and some discussion of future work on both. There were also lots of questions in this talk, so there's a lot of discussion of basic migration questions and inner workings.

Unforatunately, I can't embed it here, so I'll give you a link instead. Also, keep in mind at the time I was giving this talk I had been up for ~40 hours, so I forgot some English words here and there :)

https://www.youtube.com/watch?v=ol85OJxDaHc

Live Migration in LXD

There has been a lot of interest on the various mailing lists as well as internally at Canonical about the state of migration in LXD, so I thought I'd write a bit about the current state of affairs.

Migration in LXD today passes the "Doom demo" test, i.e. it works well enough to reproduce the LXD announcement demo under certain conditions, which I'll cover below. There is still a lot of ongoing work to make CRIU (the underlying migration technology) work with all these configurations, so support will eventually arrive for everything. For now, though, you'll need to use the configuration I describe below.

First, I should note that things currently won't work on a systemd host. Since systemd re-mounts the rootfs as MS_SHARED, lots of things automatically become shared mounts, which confuses CRIU. There are several mailing list threads about ongoing work with respect to shared mounts in CRIU and I expect something to be merged that will resolve the situation shortly, but for now your host machine needs to be a non-systemd host (i.e. trusty or utopic will work just fine, but not vivid).

You'll need to install the daily versions of liblxc and lxd from their respective PPAs on each host:

sudo apt-add-repository -y ppa:ubuntu-lxc/daily
sudo apt-add-repository -y ppa:ubuntu-lxc/lxd-git-master
sudo apt-get update
sudo apt-get install lxd

Also, you'll need to uninstall lxcfs on both hosts:

sudo apt-get remove lxcfs

liblxc currently doesn't support migrating the mount configuration that lxcfs uses, although there is some work on that as well. The overmounting issue has been fixed in lxcfs, so I expect to land some patches in liblxc soon that will make lxcfs work.

Next, you'll want to set a password for your new lxd instance:

lxc config set password foo

You need some images in lxd, which can be acquired easily enough by lxd-images (of course, this only needs to be done on the source host of the migration):

lxd-images import lxc ubuntu trusty amd64 --alias ubuntu

You'll also need to set a few configuration items in lxd. First, the container needs to be privileged, although there is yet more ongoing work to remove this restriction. There are also a few things that CRIU does not support, so we need to set our container config to respect those as well. You can do all of this using lxd's profiles mechanism, that is:

lxc config profile create migratable
lxc config profile edit migratable

And paste the following content in instead of what's there:

name: migratable
config:
  raw.lxc: |
    lxc.console = none
    lxc.cgroup.devices.deny = c 5:1 rwm
    lxc.start.auto =
    lxc.start.auto = proc:mixed sys:mixed
  security.privileged: "true"
devices:
  eth0:
    nictype: bridged
    parent: lxcbr0
    type: nic

Finally, launch your contianer:

lxc launch ubuntu migratee -p migratable

Finally, add both of your LXDs as non unix-socket remotes (required for now, but not forever):

lxc remote add lxd thishost:8443   # don't use localhost here
lxc remote add lxd2 otherhost:8443 # use a publicly addressable name

Profiles used by a particular container need to be present on both the source of the migration and the sink, so we should copy the profile to the sink as well:

lxc config profile copy migratable lxd2:

And now, you're ready for the magic!

lxc start migratee
lxc move lxd:migratee lxd2:migratee

With luck, you'll have migrated the container to lxd2. Of course, things don't always go right the first time. The full log file for the migration attempts should be available in /var/log/lxd/migratee/migration_{dump|restore}_<timestamp>.log, on the respective host where the dump or restore took place. If you aren't successful in migrating things (or parsing the dump/restore log), feel free to mail lxc-users, and I can help you debug what went wrong.

Happy hacking!

lxd and Doom migration demo

Last week at the Openstack Developer Summit I gave a live demo of migrating a linux container running doom, which generated quite a lot of excitement! Several people asked me for steps on reproducing the demo, which I have just posted.

I am one of Canonical's developers working on lxd, and I will be focused on bringing migration and other features into it. I'm very excited about the opportunity to work on this project! Stay tuned!

Live Migration of Linux Containers

Recently, I've been playing around with checkpoint and restore of Linux containers. One of the obvious applications is checkpointing on one host and restoring on another (i.e. live migration). Live migration has all sorts of interesting applications, so it is nice to know that at least a proof of concept of it works today.

Anyway, onto the interesting bits! The first thing I did was create two vms, and install criu's and lxc's development versions on both hosts:

sudo add-apt-repository ppa:ubuntu-lxc/daily
sudo apt-get update
sudo apt-get install lxc

sudo apt-get install build-essential protobuf-c-compiler
git clone https://github.com/xemul/criu && cd criu && sudo make install

Then, I created a container:

sudo lxc-create -t ubuntu -n u1 -- -r trusty -a amd64

Since the work on container checkpoint/restore is so young, not all container configurations are supported. In particular, I had to add the following to my config:

cat << EOF | sudo tee -a /var/lib/lxc/u1/config
# hax for criu
lxc.console = none
lxc.tty = 0
lxc.cgroup.devices.deny = c 5:1 rwm
EOF

Finally, although the lxc-checkpoint tool allows us to checkpoint and restore containers, there is no support for migration directly today. There are several tools in the works for this, but for now we can just use a cheesy shell script:

cat > migrate <<EOF
#!/bin/sh
set -e

usage() {
  echo $0 container user@host.to.migrate.to
  exit 1
}

if [ "$(id -u)" != "0" ]; then
  echo "ERROR: Must run as root."
  usage
fi

if [ "$#" != "2" ]; then
  echo "Bad number of args."
  usage
fi

name=$1
host=$2

checkpoint_dir=/tmp/checkpoint

do_rsync() {
  rsync -aAXHltzh --progress --numeric-ids --devices --rsync-path="sudo rsync" $1 $host:$1
}

# we assume the same lxcpath on both hosts, that is bad.
LXCPATH=$(lxc-config lxc.lxcpath)

lxc-checkpoint -n $name -D $checkpoint_dir -s -v

do_rsync $LXCPATH/$name/
do_rsync $checkpoint_dir/

ssh $host "sudo lxc-checkpoint -r -n $name -D $checkpoint_dir -v"
ssh $host "sudo lxc-wait -n u1 -s RUNNING"
EOF
chmod +x migrate

Now, for the magic show! I've set up the container I created above to be a web server running micro-httpd that serves an incredibly important message:

$ ssh ubuntu@$(sudo lxc-info -n u1 -H -i)
ubuntu@u1:~$ sudo apt-get install micro-httpd
ubuntu@u1:~$ echo "Meshuggah is the best metal band." | sudo tee /var/www/index.html
ubuntu@u1:~$ exit
$ curl -s $(sudo lxc-info -n u1 -H -i)
Meshuggah is the best metal band.

Let's migrate!

$ sudo ./migrate u1 ubuntu@criu2.local
  # lots of rsync output...
$ ssh ubuntu@criu2.local 'curl -s $(sudo lxc-info -n u1 -H -i)'
Meshuggah is the best metal band.

Of course, there are several caveats to this. You've got to add the lines above to your config, which means you can't dump containers with ttys. Since containers have the hosts's fusectl bind mounted and fuse mounts aren't supported by criu, containers or hosts using fuse can't be dumped. You can't migrate unprivileged containers yet. There are probably others that I'm forgetting, though list of troubleshoting steps is available at criu.org/LXC#Troubleshooting.

There is ongoing work in both CRIU and LXC to get rid of all the caveats above, so stay tuned!