linux.conf.au 2016 talk

Last week I did this ridiculous thing where I flew around the world in the easterly direction, giving talks at FOSDEM and linux.conf.au. The linux.conf.au staff always do a great job of making talk videos, and this year was no exception.

My talk was on LXD and live migration, a brief history of both as well as a status update and some discussion of future work on both. There were also lots of questions in this talk, so there's a lot of discussion of basic migration questions and inner workings.

Unfortunately, I can't embed it here, so I'll give you a link instead. Also, keep in mind that at the time I gave this talk I had been up for ~40 hours, so I forgot some English words here and there :)

https://www.youtube.com/watch?v=ol85OJxDaHc

Amazing Grace

Having sung competitively in several a cappella choirs, I've always loved gospel music. As I was traveling back from South Africa recently, I watched the movie American Gangster, which has a particularly fantastic arrangement of Amazing Grace. Unfortunately, I can't find a non-movie version of it (it's uncredited on the soundtrack), but I'd very much appreciate a link to where to buy it if someone finds it.

Nicki Minaj - Only

I love the hook from this song, which I suppose means the production is great. Dr. Luke strikes again.

Meshuggah - Lethargica

The (Ophidian Trek) live version of Meshuggah's Lethargica is awesome; it is almost a completely different song than the album version, because of the tempo slowing down at various points in the song. This matches the lyrics (which are great), and gives you the feeling of a lurching machine.

Halsey - Gasoline

Halsey has kind of an interesting story. I know I've heard some of her music before, but this song caught my ear on the radio of the internets today.

Song 2

Where would my sense of humor be if the second tune I posted here wasn't Blur's Song 2? This is a track that I listened to a lot in high school, and most recently I've heard in an ad on TV somewhere, which I think means I'm getting old. Anyway, woohoo!

Blog o' tunes

For a while now I've been kicking around the idea of trying to post one song a week that I've enjoyed. Mostly as an archival tool for myself, but also potentially to share with others. I was hoping to write a little bit about why I like each song as well, but to start out with I'm just going to try to post one track a week.

Anyway, here's a pretty kickass piano tune. I like piano stuff in minor keys.

Using the LXD API from Python

After our recent splash at ODS in Vancouver, it seems that there is a lot of interest in writing some python code to drive LXD to do various things. The first option is to use pylxd, a project maintained by a friend of mine at Canonical named Chuck Short. However, the primary client of this is OpenStack, and thus it is python2. We also don't want to add a lot of dependencies in this module, so we're using raw python urllib and friends, which as you know can sometimes be...painful :)

Another option would be to use python's awesome requests module, which is considerably more user friendly. However, since LXD uses client certificates, it can be a bit challenging to get the basic bits going. Here's a small program that just does some GETs to the API, to see how it might work:

import os.path

import requests

conf_dir = os.path.expanduser('~/.config/lxc')
crt = os.path.join(conf_dir, 'client.crt')
key = os.path.join(conf_dir, 'client.key')

print(requests.get('https://127.0.0.1:8443/1.0', verify=False, cert=(crt, key)).text)

which gives me (piped through jq for sanity):

$ python3 lxd.py | jq .
{
  "type": "sync",
  "status": "Success",
  "status_code": 200,
  "metadata": {
    "api_compat": 1,
    "auth": "trusted",
    "config": {
      "trust-password": true
    },
    "environment": {
      "backing_fs": "ext4",
      "driver": "lxc",
      "kernel_version": "3.19.0-15-generic",
      "lxc_version": "1.1.2",
      "lxd_version": "0.9"
    }
  }
}

It just piggybacks on the certificates generated by the lxc client for now, but it would be great to have some python code that could generate those as well!
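If you end up doing more than one GET, you'll probably want something that unwraps the response envelope shown above. Here's a minimal sketch, going only off the fields visible in that output ("type", "status_code", "metadata"); the unwrap() helper and the LXDError class are names I made up for illustration, and I'm assuming anything that isn't a sync 200 should be treated as a failure:

```python
import json


class LXDError(Exception):
    """Raised when a response envelope does not report success (hypothetical)."""


def unwrap(body):
    """Return the "metadata" member of a sync response, or raise LXDError."""
    doc = json.loads(body)
    # Assumption: only a "sync" response with status_code 200 counts as success.
    if doc.get("type") != "sync" or doc.get("status_code") != 200:
        raise LXDError("unexpected response: %r" % (doc,))
    return doc["metadata"]


if __name__ == "__main__":
    sample = ('{"type": "sync", "status": "Success", "status_code": 200, '
              '"metadata": {"auth": "trusted", '
              '"environment": {"lxd_version": "0.9"}}}')
    print(unwrap(sample)["environment"]["lxd_version"])
```

You would feed it the .text of the requests.get() call from the program above instead of the canned sample.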

Another bit I should point out for people is lxd's --debug flag, which prints out every request it receives and response that it sends. I found this useful while developing the default lxc client, and it will probably be useful to those of you out there who are developing your own clients.

Happy hacking!

Live Migration in LXD

There has been a lot of interest on the various mailing lists as well as internally at Canonical about the state of migration in LXD, so I thought I'd write a bit about the current state of affairs.

Migration in LXD today passes the "Doom demo" test, i.e. it works well enough to reproduce the LXD announcement demo under certain conditions, which I'll cover below. There is still a lot of ongoing work to make CRIU (the underlying migration technology) work with all these configurations, so support will eventually arrive for everything. For now, though, you'll need to use the configuration I describe below.

First, I should note that things currently won't work on a systemd host. Since systemd re-mounts the rootfs as MS_SHARED, lots of things automatically become shared mounts, which confuses CRIU. There are several mailing list threads about ongoing work with respect to shared mounts in CRIU and I expect something to be merged that will resolve the situation shortly, but for now your host machine needs to be a non-systemd host (i.e. trusty or utopic will work just fine, but not vivid).
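If you aren't sure whether your host is affected, you can look for shared mounts yourself. This is a rough sketch, assuming the usual /proc/self/mountinfo layout where shared propagation shows up as a "shared:N" tag in the optional fields before the "-" separator; shared_mounts() is just an illustrative helper name, not anything from LXD or CRIU:

```python
def shared_mounts(mountinfo_text):
    """Return the mount points whose optional fields carry a shared:N tag."""
    shared = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Optional fields start at index 6 and run up to the "-" separator.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a well-formed mountinfo line
        mount_point = fields[4]
        if any(tag.startswith("shared:") for tag in fields[6:sep]):
            shared.append(mount_point)
    return shared


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        mounts = shared_mounts(f.read())
    print("rootfs is shared" if "/" in mounts else "rootfs is not shared")
```

If "/" turns up in the list, you're most likely on a host where the rootfs has been re-mounted shared.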

You'll need to install the daily versions of liblxc and lxd from their respective PPAs on each host:

sudo apt-add-repository -y ppa:ubuntu-lxc/daily
sudo apt-add-repository -y ppa:ubuntu-lxc/lxd-git-master
sudo apt-get update
sudo apt-get install lxd

Also, you'll need to uninstall lxcfs on both hosts:

sudo apt-get remove lxcfs

liblxc currently doesn't support migrating the mount configuration that lxcfs uses, although there is some work on that as well. The overmounting issue has been fixed in lxcfs, so I expect to land some patches in liblxc soon that will make lxcfs work.

Next, you'll want to set a password for your new lxd instance:

lxc config set password foo

You need some images in lxd, which can be acquired easily enough by lxd-images (of course, this only needs to be done on the source host of the migration):

lxd-images import lxc ubuntu trusty amd64 --alias ubuntu

You'll also need to set a few configuration items in lxd. First, the container needs to be privileged, although there is yet more ongoing work to remove this restriction. There are also a few things that CRIU does not support, so we need to set our container config to respect those as well. You can do all of this using lxd's profiles mechanism, that is:

lxc config profile create migratable
lxc config profile edit migratable

And paste the following content in instead of what's there:

name: migratable
config:
  raw.lxc: |
    lxc.console = none
    lxc.cgroup.devices.deny = c 5:1 rwm
    lxc.start.auto =
    lxc.start.auto = proc:mixed sys:mixed
  security.privileged: "true"
devices:
  eth0:
    nictype: bridged
    parent: lxcbr0
    type: nic

Next, launch your container:

lxc launch ubuntu migratee -p migratable

Finally, add both of your LXDs as non unix-socket remotes (required for now, but not forever):

lxc remote add lxd thishost:8443   # don't use localhost here
lxc remote add lxd2 otherhost:8443 # use a publicly addressable name

Profiles used by a particular container need to be present on both the source of the migration and the sink, so we should copy the profile to the sink as well:

lxc config profile copy migratable lxd2:

And now, you're ready for the magic!

lxc start migratee
lxc move lxd:migratee lxd2:migratee

With luck, you'll have migrated the container to lxd2. Of course, things don't always go right the first time. The full log file for the migration attempts should be available in /var/log/lxd/migratee/migration_{dump|restore}_<timestamp>.log, on the respective host where the dump or restore took place. If you aren't successful in migrating things (or parsing the dump/restore log), feel free to mail lxc-users, and I can help you debug what went wrong.
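When a migration fails, the first thing I want is the most recent log. Here's a small sketch that picks it out; it assumes the directory layout described above and sorts by modification time, since I'm not assuming anything about the timestamp format in the file name. The latest_migration_log() helper and its log_root parameter are made up for illustration:

```python
import glob
import os


def latest_migration_log(container, kind, log_root="/var/log/lxd"):
    """Return the newest migration_<kind>_*.log for a container, or None."""
    if kind not in ("dump", "restore"):
        raise ValueError("kind must be 'dump' or 'restore'")
    pattern = os.path.join(log_root, container, "migration_%s_*.log" % kind)
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    # Newest by modification time; the timestamp in the name is not parsed.
    return max(candidates, key=os.path.getmtime)


if __name__ == "__main__":
    print(latest_migration_log("migratee", "dump"))
```

Run it with "dump" on the source host and "restore" on the sink.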

Happy hacking!

setproctitle() in Linux

While working on LXD, one of the things I occasionally do is submit patches to LXC (e.g. the migration work or other things). One itch in particular: the LXC monitor process (the parent of the container's init) is fork()ed inside the C API call, so it inherits the name of whatever binary made the API call (in our case, LXD). This can be slightly confusing (especially in the case where LXD dies but a process that looks like it is named LXD lives on). Should be easy enough to fix, right? Lots of *nixes seem to have a setproctitle() function to correct this, so we'll just call that!

And lo, there is prctl() which has a PR_SET_NAME mode that we can use. Done! Except for one small caveat from the man page:

The name can be up to 16 bytes long, and should be null-terminated if it contains fewer bytes.

Yes, you read that right: 16 bytes. That's not useful for a lot of process names, especially something which would be ideal for LXC:

[lxc monitor] /var/lib/lxc container-name
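To see how little of that title would survive, here's a tiny sketch. It doesn't call prctl() at all; it just models the limit, on the assumption that the 16 bytes include the trailing NUL, so only 15 bytes of the name are kept. comm_name() is a made-up helper name:

```python
TASK_COMM_LEN = 16  # assumed limit, counting the trailing NUL byte


def comm_name(title):
    """Return the bytes of title that fit in a 16-byte, NUL-terminated name."""
    return title.encode("utf-8")[:TASK_COMM_LEN - 1]


if __name__ == "__main__":
    print(comm_name("[lxc monitor] /var/lib/lxc container-name"))
```

For the title above, that leaves just "[lxc monitor] /", which tells you nothing about which container is being monitored.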

Ok, so how hard can it be to write our own? If you look around on the internet, a lot of people suggest something like strcpy(argv[0], "my-proc-name"). That works, but what happens if your process name is longer than the original? You smash the stack! Try cat /proc/<pid>/environ on the program below:

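#include <stdio.h>
#include <string.h>
#include <unistd.h>  /* sleep() */

int main(int argc, char* argv[]) {
    char buf[1024];
    /* 1023 '0' characters: far longer than the original argv[0] */
    memset(buf, '0', sizeof(buf));
    buf[1023] = 0;
    /* deliberately write past the end of argv[0], into the environment */
    strncpy(argv[0], buf, sizeof(buf));
    /* stay alive so /proc/<pid>/environ can be inspected */
    sleep(10000);
    return 0;
}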

If your process name is longer than the original environment, you overwrite something else potentially more useful, which could cause all sorts of nastiness, especially as something that runs as root.

The thing is, the environment isn't necessarily all that useful; it doesn't indicate the current environment, just the initial environment. So we could use that space for the process name, as long as the kernel knew the environment wasn't valid any more. prctl() to the rescue again, we can pass it PR_SET_MM and PR_SET_MM_ENV_{START|END} to update these locations.

Problem solved! Except that we want to do this from liblxc.so, which has no concept of argv. prctl() has no PR_GET_MM calls, so we can't just go the other way with it. We could invent some ugly API where you have to pass it in, but that would require users to either set their argv pointers up front, or carry it around until they needed it, or something similarly ugly. Instead, we steal an idea from the CRIU codebase: we look in /proc/<pid>/stat. This file has (in columns 48-51, if your kernel is new enough) exactly the arguments you want from PR_GET_MM_*! Thus, we can use this file to find out inside of liblxc where it is safe to put the new proctitle.
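To make the /proc/<pid>/stat trick concrete, here's a sketch of the parsing in Python (the real thing in liblxc is C, and this is not a port of it). I'm assuming columns 48-51 are arg_start, arg_end, env_start and env_end in that order, and that argv and the initial environment sit next to each other in memory, which is the usual layout but still an assumption. parse_mm_fields() and title_capacity() are illustrative names:

```python
def parse_mm_fields(stat_line):
    """Extract arg_start, arg_end, env_start, env_end (fields 48-51)."""
    # Field 2 (comm) is parenthesised and may itself contain spaces or ')',
    # so split after the last ')' and count from field 3 onwards.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    if len(rest) < 49:
        raise ValueError("kernel too old: fields 48-51 are missing")
    names = ("arg_start", "arg_end", "env_start", "env_end")
    # rest[0] is field 3, so field N lives at rest[N - 3].
    return {name: int(rest[48 - 3 + i]) for i, name in enumerate(names)}


def title_capacity(mm):
    """Bytes spanned by argv plus the initial environment (assumed adjacent)."""
    return mm["env_end"] - mm["arg_start"]


if __name__ == "__main__":
    with open("/proc/self/stat") as f:
        mm = parse_mm_fields(f.read())
    print(mm, title_capacity(mm))
```

Splitting on the last ')' matters because the process name is free-form text; a naive split on whitespace would shift every column for a process whose name contains a space.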

Putting it all together, liblxc now has an implementation of setproctitle() that will overwrite your initial environment (but is careful not to overwrite anything else), which can be used to set process titles longer than 16 bytes. Enjoy!