SoftwareContainer using LXC

SoftwareContainer uses the Linux Containers (LXC) project as its container backend. LXC uses namespaces and cgroups, and has been supported in the Linux kernel since 2.6.24.
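
As a rough illustration of the kernel features mentioned above, the following sketch checks whether the running kernel exposes namespaces and cgroups through /proc. The PROC_DIR variable and the has_container_support function are names invented for this example; they are not part of SoftwareContainer or LXC.

```shell
#!/bin/sh
# Sketch: report whether the running kernel exposes namespaces and cgroups.
# PROC_DIR is a hypothetical override so the check can point at a test tree.
PROC_DIR=${PROC_DIR:-/proc}

has_container_support()
{
    # Namespaces show up as entries under /proc/self/ns
    [ -d "$PROC_DIR/self/ns" ] || return 1
    # cgroup controllers are listed in /proc/cgroups
    [ -r "$PROC_DIR/cgroups" ] || return 1
    return 0
}

if has_container_support; then
    echo "namespaces and cgroups are available"
else
    echo "namespaces or cgroups are missing"
fi
```

LXC also ships the lxc-checkconfig tool, which performs a far more thorough check of the kernel configuration.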

LXC Template

LXC expects a template that sets up a basic file system structure so that the container has something to boot into. The LXC template SoftwareContainer uses currently does three things:

  • Creates a basic rootfs with all directories one would expect
  • Makes busybox available in the rootfs, and populates /bin with symlinks for all its applets
  • Adds some conditional options to the LXC configuration file

Create basic rootfs

The rootfs is a stripped-down structure modelled on the Filesystem Hierarchy Standard (FHS), with an added /gateways directory. The LXC template also creates the path given by the CMake variable ${CMAKE_INSTALL_PREFIX}. Furthermore, a root user and group are created, and configuration is written to the following three files:

  • /etc/pulse/client.conf - tells PulseAudio not to use shared memory (shm)
  • /etc/machine-id - populated with a UUID generated by dbus-uuidgen
  • /etc/resolv.conf - copied from the host, if it exists there

/lib64 and /usr/lib64 are also added to the rootfs. They stay empty unless they exist on the host, in which case they are bind mounted like the other file systems; see LXC Configuration file below.
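
The rootfs step can be tried out in isolation with a minimal sketch like the one below. It builds the tree in a scratch directory instead of a real container path; ROOTFS is a location chosen for this example, and the machine-id step is left out so that the sketch does not depend on dbus-uuidgen being installed.

```shell
#!/bin/sh
# Sketch: build a stripped-down rootfs tree in a scratch directory.
# ROOTFS is a hypothetical location chosen for this example.
ROOTFS=$(mktemp -d)

for dir in bin dev etc etc/pulse home lib lib64 proc root sbin \
           usr/lib usr/lib64 usr/sbin sys tmp gateways; do
    mkdir -p "$ROOTFS/$dir"
    chmod 755 "$ROOTFS/$dir"
done

# Minimal root user and group entries
echo "root:x:0:0:root:/root:/bin/sh" > "$ROOTFS/etc/passwd"
echo "root:x:0:root" > "$ROOTFS/etc/group"

# Tell PulseAudio clients not to rely on shared memory
echo "disable-shm=yes" > "$ROOTFS/etc/pulse/client.conf"

# Copy DNS settings from the host when they exist
if [ -e /etc/resolv.conf ]; then
    cp /etc/resolv.conf "$ROOTFS/etc/resolv.conf"
fi

ls "$ROOTFS"
```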

Copy and set up busybox

This step checks for busybox on the host, creates a placeholder for it in the container's rootfs (the host binary is bind mounted over it later), and symlinks all busybox applets to busybox in the container's /bin directory, so that /bin/ls -> /bin/busybox.

There is an ongoing discussion on whether busybox is needed at all, since it affects startup time as well as the code itself.
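
The symlink step can be sketched on its own as follows. BIN_DIR and the short applet list are stand-ins chosen for this example; in the template, the list comes from running busybox --list on the host.

```shell
#!/bin/sh
# Sketch: populate a bin directory with symlinks that all point to busybox.
# BIN_DIR and APPLETS are hypothetical stand-ins for the container's /bin
# and for the output of "busybox --list".
BIN_DIR=$(mktemp -d)
APPLETS="ls cat sh"

# Placeholder file; the real binary is bind mounted over it later
touch "$BIN_DIR/busybox"
chmod 755 "$BIN_DIR/busybox"

for applet in $APPLETS; do
    # Relative link, so it resolves to /bin/busybox inside the container
    ln -s busybox "$BIN_DIR/$applet"
done

ls -l "$BIN_DIR"
```

Because the links are relative, they stay valid regardless of where the rootfs lives on the host.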

Set up dynamic configuration options

This step adds some entries to the config file for LXC. It sets the location of the rootfs, and adds some mount entries, namely the following:

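  • ${CMAKE_INSTALL_PREFIX} is bind-mounted read-only to the same path in the container
  • init.lxc and busybox are bind-mounted read-only into /bin
  • If ENABLE_NETWORKGATEWAY is ON, iptables is also bind-mounted read-only into /bin
  • If $GATEWAY_DIR is set, that directory is bind-mounted to /gateways and its mode is set to 777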
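
A minimal sketch of how such entries end up in the config file is shown below. CONFIG, ROOTFS and the value of GATEWAY_DIR are made up for this example and write to scratch locations only.

```shell
#!/bin/sh
# Sketch: append dynamic entries to an LXC config file.
# CONFIG, ROOTFS and GATEWAY_DIR are hypothetical values for this example.
CONFIG=$(mktemp)
ROOTFS=/var/lib/lxc/example/rootfs
GATEWAY_DIR=$(mktemp -d)

echo "lxc.rootfs = $ROOTFS" >> "$CONFIG"

# Only add the gateway mount when a gateway directory has been configured
if [ -n "$GATEWAY_DIR" ]; then
    echo "lxc.mount.entry = $GATEWAY_DIR gateways none rw,bind 0 0" >> "$CONFIG"
    chmod 777 "$GATEWAY_DIR"
fi

cat "$CONFIG"
```

The mount target (gateways) is written without a leading slash, so LXC interprets it relative to the container's rootfs.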

Full example

This is the full template used.

#!/bin/sh -e

#
# Copyright (C) 2016-2017 Pelagicore AB
#
# Permission to use, copy, modify, and/or distribute this software for
# any purpose with or without fee is hereby granted, provided that the
# above copyright notice and this permission notice appear in all copies.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
# WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR
# BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES
# OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
# WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
# ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
# SOFTWARE.
#
# For further information see LICENSE
#

#
# This LXC template is based on the busybox one shipped with LXC.
# Original author: Daniel Lezcano <daniel.lezcano@free.fr>
#
# LXC is free software, released under GNU LGPL 2.1
#
# Reader's guide: to get the full understanding of how the containers are
# set up in this system, make sure to read the lxc configuration file
# for this project (softwarecontainer.conf) where pty allocation, and some
# further mount points, as well as networking set up, is listed.
#

create_rootfs()
{
    rootfs=$1

    # This is the rootfs tree in the container. Apart from the files we
    # create below, this will be empty, and most directories will be bind
    # mounted from the host into the container (/lib etc, see below and
    # the configuration file).

    tree="\
        $rootfs/bin \
        $rootfs/dev \
        $rootfs/etc \
        $rootfs/etc/pulse \
        $rootfs/home \
        $rootfs/lib \
        $rootfs/lib64 \
        $rootfs/proc \
        $rootfs/root \
        $rootfs/sbin \
        $rootfs/usr/lib \
        $rootfs/usr/lib64 \
        $rootfs/usr/sbin \
        $rootfs/sys \
        $rootfs/tmp \
        $rootfs/gateways \
        $rootfs/${CMAKE_INSTALL_PREFIX}"

    # The tree needs to be writeable for the owner and read/execute for all
    mkdir -p $tree || return 1
    chmod 755 $tree || return 1

    # Create entry for root user in /etc/passwd and /etc/group
    echo "root:x:0:0:root:/root:/bin/sh" >> $rootfs/etc/passwd
    echo "root:x:0:root" >> $rootfs/etc/group

    # We may not have a shm fs mounted, so tell pulse not to use it.
    echo "disable-shm=yes" >> $rootfs/etc/pulse/client.conf

    # We generate a unique machine id for D-Bus.
    dbus-uuidgen --ensure=$rootfs/etc/machine-id

    # Copy DNS info into the guest, if available
    if [ -e /etc/resolv.conf ]; then
        cp /etc/resolv.conf $rootfs/etc/resolv.conf
    fi

    BUSYBOX=$(which busybox)
    if [ $? -ne 0 ]; then
        echo "busybox executable is not accessible"
        return 1
    fi

    # We will bind-mount busybox to this location in the container later
    touch $rootfs/bin/busybox
    chmod 755 $rootfs/bin/busybox

    cd $rootfs/bin || return 1
    COMMANDS=$($BUSYBOX --list)
    for COMMAND in $COMMANDS; do
        ln -s busybox $COMMAND
    done

    INITLXCFILE=$(which init.lxc)
    if [ $? -ne 0 ]; then
        echo "init.lxc not found"
        return 1
    fi
    touch $rootfs/bin/init.lxc
    chmod 755 $rootfs/bin/init.lxc

    # If we have network enabled, we need iptables
    if [ "${ENABLE_NETWORKGATEWAY}" = "ON" ]; then
        IPTABLES=$(which iptables)
        if [ $? -ne 0 ]; then
            echo "iptables executable is not accessible"
            return 1
        fi

        # We will bind-mount iptables to this location in the container later
        touch $rootfs/bin/iptables
        chmod 755 $rootfs/bin/iptables
    fi

    return 0
}

copy_configuration()
{
    path=$1
    rootfs=$2

    # Set some non-static mount points, the rest are already in the config file
    echo "lxc.rootfs = $rootfs" >> $path/config
    echo "lxc.mount.entry = ${CMAKE_INSTALL_PREFIX} $rootfs/${CMAKE_INSTALL_PREFIX} none ro,bind 0 0" >> $path/config

    # Bind-mount the necessary binaries into the container
    echo "lxc.mount.entry = $(which init.lxc) bin/init.lxc none ro,bind 0 0" >> $path/config
    echo "lxc.mount.entry = $(which busybox) bin/busybox none ro,bind 0 0" >> $path/config

    if [ "${ENABLE_NETWORKGATEWAY}" = "ON" ]; then
        echo "lxc.mount.entry = $(which iptables) bin/iptables none ro,bind 0 0" >> $path/config
    fi

    # If the gateway dir variable is set, add a mount entry for that one also.
    if [ -n "$GATEWAY_DIR" ]; then
        echo "lxc.mount.entry = $GATEWAY_DIR gateways none rw,bind 0 0" >> $path/config
        chmod 777 $GATEWAY_DIR
    fi
}

options=$(getopt -o p:n: -l path:,name:,rootfs: -- "$@")
if [ $? -ne 0 ]; then
    echo "Usage: $(basename $0) -p|--path=<path> --rootfs=<path>"
    exit 1
fi
eval set -- "$options"

while true
do
    case "$1" in
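        -p|--path)      path=$2; shift 2;;
        # Accepted by getopt above; consume it so that options after it are parsed
        -n|--name)      name=$2; shift 2;;
        --rootfs)       rootfs=$2; shift 2;;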
        --)             shift 1; break ;;
        *)              break ;;
    esac
done

if [ "$(id -u)" != "0" ]; then
    echo "This script should be run as 'root'"
    exit 1
fi

if [ -z "$path" ]; then
    echo "'path' parameter is required"
    exit 1
fi

# detect rootfs (either from contents in config, or by using $path)
if [ -z "$rootfs" ]; then
    config="$path/config"
    if grep -q '^lxc.rootfs' $config 2>/dev/null ; then
        rootfs=`grep 'lxc.rootfs =' $config | awk -F= '{ print $2 }'`
    else
        rootfs=$path/rootfs
    fi
fi

create_rootfs $rootfs
if [ $? -ne 0 ]; then
    echo "failed to set up softwarecontainer rootfs"
    exit 1
fi

copy_configuration $path $rootfs
if [ $? -ne 0 ]; then
    echo "failed to write configuration file"
    exit 1
fi
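
The rootfs detection near the end of the template can be sketched in isolation as follows. detect_rootfs is a helper name invented for this example, and unlike the template it trims whitespace around the value explicitly instead of relying on word splitting.

```shell
#!/bin/sh
# Sketch: resolve the rootfs path, falling back to <path>/rootfs when the
# config has no lxc.rootfs entry.
# detect_rootfs is a hypothetical helper name, not part of the template.
detect_rootfs()
{
    path=$1
    config="$path/config"
    if grep -q '^lxc.rootfs' "$config" 2>/dev/null; then
        # Split on "=" and trim surrounding whitespace from the value
        awk -F= '/^lxc.rootfs/ { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2; exit }' "$config"
    else
        echo "$path/rootfs"
    fi
}

CONTAINER_DIR=$(mktemp -d)

# No config yet: falls back to <path>/rootfs
detect_rootfs "$CONTAINER_DIR"

# With an lxc.rootfs entry: the configured value wins
echo "lxc.rootfs = /srv/rootfs" > "$CONTAINER_DIR/config"
detect_rootfs "$CONTAINER_DIR"
```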

LXC Configuration file

The configuration file contains three things: network setup, device and pty/tty allocation, and mount entries.

Network setup

The network settings tell LXC to create a veth interface connected to the lxcbr0 bridge, and to bring the network up. The bridge itself is not set up here; it must already exist on the host.
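
Since the bridge has to exist before the container starts, a quick check on the host can be useful. The sketch below looks for the bridge in sysfs; SYS_NET and bridge_exists are names invented for this example.

```shell
#!/bin/sh
# Sketch: check that the bridge the container's veth device attaches to exists.
# SYS_NET is a hypothetical override so the check can run against a test tree.
SYS_NET=${SYS_NET:-/sys/class/net}

bridge_exists()
{
    # Every network interface has a directory under /sys/class/net;
    # bridge devices additionally contain a "bridge" subdirectory.
    [ -d "$SYS_NET/$1/bridge" ]
}

if bridge_exists lxcbr0; then
    echo "lxcbr0 is present"
else
    echo "lxcbr0 is missing; it has to be created outside this configuration"
fi
```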

Device and pty/tty allocation

LXC has a directive called “autodev” which, when set, creates all needed devices automatically. It is used together with the directives that tell LXC to allocate tty and pty devices.

Mount entries

The static mount entries tell LXC to bind mount /usr (which includes /usr/lib) and /lib into the container, plus /lib64 if it exists on the host, and to auto-mount /proc and /sys. The template adds further entries to these when it runs.

Full example:

lxc.utsname=contained

lxc.autodev = 1
lxc.tty = 1
lxc.pts = 1

#
# TODO: Remove this when we get the shutdown timeout issue fixed.
#
lxc.haltsignal = SIGKILL

lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.name = eth0


# Auto-mount /proc and /sys with reasonable settings
lxc.mount.auto = proc sys

# Note: mounting all of /usr includes /usr/bin, /usr/lib64 and /usr/local.
lxc.mount.entry = /usr usr none ro,bind 0 0
lxc.mount.entry = /lib lib none ro,bind 0 0

# These are optional, as they may not exist in the host. If they exist
# they will be bind mounted from host to container.
lxc.mount.entry = /lib64 lib64 none ro,bind,optional 0 0
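
To see how mount entries like the ones above resolve on a given host, a small sketch such as the following can list each bind-mount source and flag the ones that do not exist. check_mounts is a helper invented for this example, and the sample config uses made-up paths; it only reads lxc.mount.entry lines and does not mount anything.

```shell
#!/bin/sh
# Sketch: list the bind-mount sources in an LXC config and flag missing ones.
# check_mounts is a hypothetical helper; it only reads "lxc.mount.entry" lines.
check_mounts()
{
    # Fields after "=": source, target, fstype, options, dump, pass
    sed -n 's/^lxc\.mount\.entry *= *//p' "$1" |
    while read -r src target fstype opts rest; do
        if [ -e "$src" ]; then
            echo "$src -> $target: ok"
        else
            case ",$opts," in
                *,optional,*) echo "$src -> $target: missing (optional)" ;;
                *)            echo "$src -> $target: missing" ;;
            esac
        fi
    done
}

# Sample config with one source that exists and one that does not
CONFIG=$(mktemp)
cat > "$CONFIG" <<EOF
lxc.mount.entry = / root none ro,bind 0 0
lxc.mount.entry = /no/such/dir lib64 none ro,bind,optional 0 0
EOF

check_mounts "$CONFIG"
```

Entries marked optional are skipped by LXC when the source is missing, which is why the /lib64 entry is safe on hosts without that directory.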

LXC API

TBD