Integration guidelines

This chapter describes the workflow, major tasks and concepts involved in integrating SoftwareContainer into a platform. A brief summary of the general steps involved:

  1. Setting up, on the host system, the various IPC mechanisms that should be reachable from inside containers.
  2. Writing service manifests for capabilities of the platform.
  3. Integrating a launcher that interacts with SoftwareContainer.

Note that SoftwareContainer as a project is split into multiple sub-components, and while it is possible to use e.g. the container library directly, these guidelines assume the integration point is the complete component.

The sections below describe what SoftwareContainer assumes about the host system when handling different IPC mechanisms, along with examples of configuration and usage.

Service manifests

For details about format and content of service manifests, see Service manifests and Gateways.

Service manifests should be installed in /usr/local/etc/softwarecontainer/service-manifest.default.d/ if the capabilities they define should be applied by default, or otherwise in /usr/local/etc/softwarecontainer/service-manifest.d/. Each manifest is expected to be a JSON file with the “json” file extension.

Service manifests are read at startup of SoftwareContainer. If two service manifests contain capabilities with the same name, the gateway configurations will be combined (without merging or removing duplicates), so when the capability’s gateway configurations are set, all configurations from the manifests are included. For more information on how gateway configurations are handled, please refer to Gateways.
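
As an illustration, assume two manifests both define a capability named com.example.mixed (the capability and device names here are hypothetical). The first manifest could contain:

{
    "version": "1",
    "capabilities": [{
        "name": "com.example.mixed",
        "gateways": [{
            "id": "devicenode",
            "config": [{
                "name": "/dev/input/event0"
            }]
        }]
    }]
}

and the second manifest the same capability with another device node:

{
    "version": "1",
    "capabilities": [{
        "name": "com.example.mixed",
        "gateways": [{
            "id": "devicenode",
            "config": [{
                "name": "/dev/input/event1"
            }]
        }]
    }]
}

When com.example.mixed is set, the Device Node gateway receives both configuration entries, so both device nodes become available in the container.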

A service manifest’s content is used when one or more capabilities are set by a call to SetCapabilities. The gateway configurations from any service manifest that relates to the specified capabilities are then applied to the respective gateways. If any capability is missing, or if any of the gateway configurations are erroneous, it is treated as a fatal error by SoftwareContainer, as it means the system’s capabilities are not correctly defined and the environment of any application would be in a bad state.

Example

Here is an example service manifest, with the capabilities for getting and setting temperature:

{
  "version": "1",
  "capabilities": [{
      "name": "com.pelagicore.temperatureservice.gettemperature",
      "gateways": [{
          "id": "dbus",
          "config": [{
              "dbus-gateway-config-session": []
          }, {
              "dbus-gateway-config-system": [{
                  "direction": "outgoing",
                  "interface": "org.freedesktop.DBus.Introspectable",
                  "object-path": "/com/pelagicore/TemperatureService",
                  "method": "Introspect"
              }, {
                  "direction": "outgoing",
                  "interface": "com.pelagicore.TemperatureService",
                  "object-path": "/com/pelagicore/TemperatureService",
                  "method": "Echo"
              }, {
                  "direction": "outgoing",
                  "interface": "com.pelagicore.TemperatureService",
                  "object-path": "/com/pelagicore/TemperatureService",
                  "method": "GetTemperature"
              }, {
                  "direction": "incoming",
                  "interface": "com.pelagicore.TemperatureService",
                  "object-path": "/com/pelagicore/TemperatureService",
                  "method": "TemperatureChanged"
              }]
          }]
      }]
  }, {
      "name": "com.pelagicore.temperatureservice.settemperature",
      "gateways": [{
          "id": "dbus",
          "config": [{
              "dbus-gateway-config-session": []
          }, {
              "dbus-gateway-config-system": [{
                  "direction": "outgoing",
                  "interface": "org.freedesktop.DBus.Introspectable",
                  "object-path": "/com/pelagicore/TemperatureService",
                  "method": "Introspect"
              }, {
                  "direction": "outgoing",
                  "interface": "com.pelagicore.TemperatureService",
                  "object-path": "/com/pelagicore/TemperatureService",
                  "method": "Echo"
              }, {
                  "direction": "outgoing",
                  "interface": "com.pelagicore.TemperatureService",
                  "object-path": "/com/pelagicore/TemperatureService",
                  "method": "SetTemperature"
              }]
          }]
      }]
  }]
}

Network setup

If compiled with support for the NetworkGateway, SoftwareContainer depends on a network bridge being available on the host system. By default, SoftwareContainer will create such a bridge if it is not already there. This behavior can be changed so that SoftwareContainer instead fails with an error message if the bridge is not available.

The selection of whether or not to create the bridge is a compile-time option given to CMake. Please see the README for more information about how to set the various CMake options.

For each container, a virtual Ethernet device is set up and bridged to the above-mentioned network bridge on the host system. The virtual Ethernet device is then mapped to an Ethernet device inside the container, configured as eth0. The LXC template also copies /etc/resolv.conf from the host into the container, if it is available on the host. That means the same name servers will be used in the container as on the host.

The NetworkGateway is used to configure what traffic is allowed. It converts the configuration it receives into iptables rules that are set for the network device inside the container. See Gateways for more information.

Network setup sequence

On a typical IVI system, this is how the network setup will most likely look from a SoftwareContainer perspective.

  1. System boots

    1. System sets up whatever network settings are needed in general
  2. SoftwareContainer starts

    1. SoftwareContainer reads all available service manifests
  3. SoftwareContainer sets up a bridge interface (if configured to do so)

    1. Bridge interface name and ip settings are read from the main configuration file
    2. SoftwareContainer sets up NAT between the host and the network defined by the bridge

[Sequence diagram: system boot and network setup, SoftwareContainer start, manifest reading, bridge creation, NAT configuration]

All of the above is done globally from a SoftwareContainer perspective, meaning it is not done per container. The steps below are performed per container:

  1. User starts an app somehow (probably through a launcher)

  2. Launcher creates a container

    1. LXC creates a veth interface on the host and connects it to the bridge
    2. LXC creates a network interface inside the container, and pairs it with the host interface.
  3. Launcher sets app capabilities

    1. All gateway configs that are default are read
    2. All gateway configs that are given are read
    3. The gateway configs are applied
    4. The network interface for the container is brought up and given an IP address. Note: the IP address is assigned automatically, based on the container ID.
    5. The network gateway applies, inside the container, all the iptables rules given in the configurations provided by the capabilities.
  4. Launcher starts app in container

  5. App runs

  6. App stops (probably called by the launcher)

  7. Container is destroyed

    1. LXC brings down and deletes the container network interface
    2. LXC deletes the veth interface on the host

[Sequence diagram: the launcher creates the container (veth created on host and in container), calls SetCapabilities (default and given capabilities read and applied, container veth brought up, iptables rules applied), runs and stops the app, and destroys the container (veth deleted in container and on host)]

Wayland setup

In order to have applications access Wayland, one needs to enable the Wayland gateway and possibly give access to graphics hardware. Not all applications require direct access to the graphics hardware; see the Wayland example. A reasonable capability for a Wayland application would therefore include both the Wayland gateway and a configuration of the Device Node gateway for any graphics hardware access needed.

Example

Here is an example manifest defining Wayland access:

{
    "version": "1",
    "capabilities": [{
        "name": "com.example.wayland-access",
        "gateways": [{
            "id": "wayland",
            "config": [{
                "enabled": true
            }]
        }, {
            "id": "devicenode",
            "config": [{
                "name": "/dev/dri/card0"
            }]
        }]
    }]
}

The role of a launcher

This section describes what typical integration actions are needed to integrate SoftwareContainer with a launcher. For an overview of the general architecture involving a launcher and SoftwareContainer, see Design.

The assumed scenario in this section is that a launcher wants to start an application inside a container.

The launcher should do the following:

  • Make the app home directory available inside the container.
  • Set the HOME environment variable in the container to point to the above directory.

The above actions are performed by interacting with the SoftwareContainerAgent D-Bus API.

Setting up a home directory and HOME

By calling BindMount and passing a path on the host together with the location inside the container where it should be mounted (the pathInContainer argument), a directory is made available to any application later started in the container.

The path inside the container is intended to be set as the HOME environment variable inside the container. The variable is set when calling Execute with an appropriate dictionary passed as the env argument.
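
The following is a sketch of what these calls could look like with gdbus, in the same style as the examples later in this chapter. The container ID, paths, and exact argument lists are illustrative assumptions; the authoritative method signatures are available through the SoftwareContainerAgent D-Bus introspection data:

# Make the app's host directory available inside the container
# (assumed arguments: container ID, host path, container path, read-only flag)
gdbus call --system \
--dest com.pelagicore.SoftwareContainerAgent \
--object-path /com/pelagicore/SoftwareContainerAgent \
--method com.pelagicore.SoftwareContainerAgent.BindMount \
0 /home/user/apps/myapp /gateways/app false

# Start the app with HOME pointing at the path used above
# (assumed arguments: container ID, command line, working directory,
# output file, environment dictionary)
gdbus call --system \
--dest com.pelagicore.SoftwareContainerAgent \
--object-path /com/pelagicore/SoftwareContainerAgent \
--method com.pelagicore.SoftwareContainerAgent.Execute \
0 /gateways/app/bin/myapp /gateways/app /tmp/myapp.log \
"{'HOME': '/gateways/app'}"

The env dictionary sets HOME to the same path that was used as pathInContainer in the BindMount call, which is how the two actions listed above fit together.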

Traffic Shaping

Linux offers a very rich set of tools for managing and manipulating the transmission of packets. These tools can be used to manage and shape the traffic of SoftwareContainer containers.

The Linux network routing, firewalling, and traffic control subsystem is very powerful and flexible and has grown into a very mature stack of tools. The Linux Documentation Project is a good place to start; please check TLDP for documentation on the concepts of traffic control in Linux. Linux Advanced Routing & Traffic Control is another good source that presents in-depth information about the topic.

Even though there are many ways of shaping traffic, this section gives brief information, with examples, about one particular way of shaping the network traffic of a container to prevent possible network starvation scenarios.

This can be done via the network classifier cgroup, which provides an interface to tag network packets with a class identifier. By using different capabilities to set the network classifier cgroup, each SoftwareContainer instance can be made to produce network packets with a different class id. This way, tc can be used to assign different priorities to packets from different cgroups.

Marking network packets with the network classifier cgroup is supported through the CGroups gateway.

Example

Here is an example of marking the network packets belonging to a container. The example creates a SoftwareContainer, configures the related gateways to mark packets, and finally presents a small how-to on shaping traffic on the host.

Create a service manifest that lets the container send and receive data on the network and that marks packets via the network classifier cgroup. Then add this manifest to the default manifest folder:

{
    "capabilities": [
        {
            "name": "test.cap.netcls",
            "gateways": [
                {
                    "config": [
                        {
                            "setting": "net_cls.classid",
                            "value": "0x100001"
                        }
                    ],
                    "id": "cgroups"
                }
            ]
        },
        {
            "name": "network.accept-ping",
            "gateways": [
                {
                    "config": [
                        {
                            "direction": "OUTGOING",
                            "allow": [
                                {
                                    "host": "*",
                                    "protocols": "icmp"
                                },
                                {
                                    "host": "*",
                                    "protocols": [
                                        "udp",
                                        "tcp"
                                    ]
                                }
                            ]
                        },
                        {
                            "direction": "INCOMING",
                            "allow": [
                                {
                                    "host": "*",
                                    "protocols": "icmp"
                                },
                                {
                                    "host": "*",
                                    "protocols": [
                                        "udp",
                                        "tcp"
                                    ]
                                }
                            ]
                        }
                    ],
                    "id": "network"
                }
            ]
        }
    ]
}

Then, from the build directory, run the agent:

sudo -E softwarecontainer-agent

Next, we will start a new container:

gdbus call --system \
--dest com.pelagicore.SoftwareContainerAgent \
--object-path /com/pelagicore/SoftwareContainerAgent \
--method com.pelagicore.SoftwareContainerAgent.Create \
'[{"writeBufferEnabled": false}]'

Now we can set our prepared capabilities to mark network packets:

gdbus call --system \
--dest com.pelagicore.SoftwareContainerAgent \
--object-path /com/pelagicore/SoftwareContainerAgent \
--method com.pelagicore.SoftwareContainerAgent.SetCapabilities \
0 \
"['network.accept-ping',
'test.cap.netcls']"

At this stage, every network packet within the container SC-0 will be marked with class id 0x100001, i.e. major class ID 10 and minor class ID 1. It is therefore wise to configure traffic shaping with tc (please check the tc components for more information about tc) according to the indicated class IDs on the host:

sudo tc qdisc add dev eth0 root handle 10: htb
sudo tc class add dev eth0 parent 10: classid 10:1 htb rate 1mbit
sudo tc filter add dev eth0 parent 10: protocol ip prio 10 handle 1: cgroup

With this setup, SC-0 can only consume up to 1 mbit of bandwidth. For more information and examples, see the resources referenced above.

Working with shared memory in containers

There are two main types of shared memory in Linux systems: POSIX shm and SysV IPC. POSIX shm generally assumes that there is a tmpfs mounted on /dev/shm where the shared memory objects are handled, and it is accessed through its own family of API calls. POSIX shm is not namespace-aware.

SysV IPC uses mechanisms built into the kernel (internally it uses a tmpfs as well, but it is hidden). Among other things, SysV IPC supports shared memory, and it is namespace-aware.

How to use this inside containers?

POSIX SHM

Using POSIX SHM requires you to have /dev/shm inside the container. To accomplish this, one simply has to mount it, which should be done through the File gateway.
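
A minimal sketch of such a capability is shown below, assuming the File gateway accepts entries with path-host, path-container, and read-only keys; see the Gateways chapter for the authoritative configuration format:

{
    "version": "1",
    "capabilities": [{
        "name": "com.example.posix-shm",
        "gateways": [{
            "id": "file",
            "config": [{
                "path-host": "/dev/shm",
                "path-container": "/dev/shm",
                "read-only": false
            }]
        }]
    }]
}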

SysV IPC

There is currently no easy way, with the LXC API, to stay in the same IPC namespace when creating a new container. It is equally hard to attach to an existing IPC namespace (for example, one used to share data between host and container) once the container is already set up.

It is however possible, when using the attach() API call on the LXC container, to specify not to enter a new IPC namespace, so it is indeed possible to share memory between host and container when running a specific command. This would only be local to that command or application, and not to the whole container. The init process and any other processes running inside the container that are not instructed to share their IPC namespace would still live in a separate IPC namespace. This method is however not implemented in SoftwareContainer, so currently any calls to attach() from SoftwareContainer will always enter new namespaces (including the IPC namespace), as this is the default behavior in LXC.

Write buffered filesystems

To enable write buffers for the filesystems of an app, you need to specify a short JSON object that is sent to the com.pelagicore.SoftwareContainerAgent.Create method. What will happen is described in detail in the Filesystem chapter.

Example

The following snippet, when passed to Create, results in a container where a write buffer is used for all filesystems in that container:

[{
    "writeBufferEnabled": true
}]

The following example will disable write buffers for all filesystems within the container:

[{
    "writeBufferEnabled": false
}]
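
For reference, this configuration can be passed directly on the command line when creating a container, mirroring the Create call shown in the traffic shaping example above:

gdbus call --system \
--dest com.pelagicore.SoftwareContainerAgent \
--object-path /com/pelagicore/SoftwareContainerAgent \
--method com.pelagicore.SoftwareContainerAgent.Create \
'[{"writeBufferEnabled": true}]'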

systemd.service

As mentioned in the Getting Started section, the user API is implemented by the SoftwareContainerAgent as a D-Bus service. The API is used to start, configure, and control containers. Thus, in order to use SoftwareContainer containers, the SoftwareContainerAgent must be running first. To be able to start the SoftwareContainerAgent and watch its process and system logs, it is wise to run it as a systemd service.

systemd is an init system used in Linux distributions to bootstrap the user space. As of 2015, a large number of Linux distributions have adopted systemd as their default init system. More information can be found in the freedesktop systemd docs.

To add the agent to the init system, the integrator should prepare a configuration file whose name ends in .service. This file encodes information about a process controlled and supervised by systemd. The .service files are generally kept in the /lib/systemd/system directory.

There is a special syntax for service files. More information about the syntax, with examples, can be found in the freedesktop service unit configuration docs.
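
As a starting point, a minimal unit file could look like the sketch below. The installation path of the agent binary and the dependency ordering are assumptions and should be adapted to the target platform:

[Unit]
Description=SoftwareContainer Agent
After=dbus.service

[Service]
# The agent registers this name on the system bus (see the D-Bus examples above);
# the ExecStart path is an assumption and depends on the install prefix.
Type=dbus
BusName=com.pelagicore.SoftwareContainerAgent
ExecStart=/usr/local/bin/softwarecontainer-agent
Restart=on-failure

[Install]
WantedBy=multi-user.target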

Resource Management in containers

Resource management can be configured for containers by using the cpu.shares CGroup setting. This means that the available CPU time will be distributed among the containers according to the values specified for each container.
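
A hedged example of such a capability, using the same CGroups gateway configuration format as in the traffic shaping example above (the capability name and share value are illustrative):

{
    "version": "1",
    "capabilities": [{
        "name": "com.example.cpu-priority",
        "gateways": [{
            "id": "cgroups",
            "config": [{
                "setting": "cpu.shares",
                "value": "512"
            }]
        }]
    }]
}

Relative to other containers, CPU time is then distributed in proportion to the cpu.shares values each container has been given.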