LinuxMeerkat

I swear! Meerkats can do Linux


Leave a comment

Python detection of USB storage device with DBUS

I spent quite some time lately trying to do all of sort of things with USB and Python. I will put out what I’ve learned through this course. Ofcourse I could use Pyusb or some other library but where’s the learning in that? As the Brits say: give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime.

This tutorial starts with a brief overview and then goes into general usage of Dbus and lastly using Dbus with UDisks2 to get notified if a USB storage device has been inserted.

Update:
I initially wrote this article with the title “Python detection of USB storage device”. After some time of working with UDisks2 and Dbus however I found out that the code tends to be very ugly and unexpected behaviours occur (not sure if it’s because of Dbus or UDisks2). So if you want to control an input/output device like a USB device or get events, I recommmend you use pyudev straight away! I will probably write an article on how to use that too. If you want to use DBus for educational purposes or want to use it maybe with some other software, please continue reading.

The daemons checkin’ out your USB

There are two main daemons running on Ubuntu (and probably all major distros) that deal with connecting/disconnecting devices.

The one is udisks which deals with storage devices like USB sticks and the like. The second is udev, a daemon that deals with all kind of devices from PCI boards to the keyboard and mouse (including everything that udisks deals with).

Let’s prove that they exist and that they are running on our system:

manos@box:~$ ps ax | egrep 'udev|udisks'
  352 ?        S      0:00 upstart-udev-bridge --daemon
  357 ?        Ss     0:00 /lib/systemd/systemd-udevd --daemon
 2844 ?        Sl     0:00 /usr/lib/gvfs/gvfs-udisks2-volume-monitor
 2850 ?        Sl     0:00 /usr/lib/udisks2/udisksd --no-debug
..

Now, depending on how old your Linux distribution is you might have udisks or udisks2 (fourth process) and then you have the udev daemon (second process). But what do these daemons do? Well let’s check for ourselves, shall we?

Let’s monitor our USB

Open two terminals. In the first type:

udisksctl monitor

And on the second one type:

udevadm monitor

These are front-ends for the daemons udisks and udev respectively.

Now, while looking at the two terminals we do the below:

  • First remove or put in a non-storage device like a mouse or a keyboard (USB mouse/keyboard works fine).
  • Now put in or remove a storage device (USB stick, external hard disk, etc).

You will notice that for the first action, udev prints a bunch of stuff while udisks does nothing. For the second case where we mingle with a storage device, both services print out stuff.

For example when I insert a USB stick I get this on the udisk terminal:

manos@box:~$ udisksctl monitor
Monitoring the udisks daemon. Press Ctrl+C to exit.
11:31:00.793: The udisks-daemon is running (name-owner :1.76).
11:31:05.850: Added /org/freedesktop/UDisks2/drives/Samsung_1040d8f1f9b92b56e6d1af1d7a0dd4c82781
  org.freedesktop.UDisks2.Drive:
    CanPowerOff:                true
    Configuration:              {}
    ConnectionBus:              usb
    Ejectable:                  true
..

and this on the udev terminal:

manos@box:~$ udevadm monitor
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent

KERNEL[2569.327243] add      /devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3.4/2-3.4.2 (usb)
KERNEL[2569.328960] add      /devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3.4/2-3.4.2/2-3.4.2:1.0 (usb)
KERNEL[2569.329026] add      /devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3.4/2-3.4.2/2-3.4.2:1.0/host7 (scsi)
..

This makes it clear that udisks records only changes to storage devices like USB sticks and hard disks. udev on the other hand monitors any device that can connect/disconnect to your PC.

On a sidenote (in case you’re wondering) the KERNEL[blah blah] messages that appeared on the udev terminal, are Netlink messages sent from the kernel to udev.

The bigger picture

Now, we know that udisks is just like udev but with the difference that udisks plays with storage devices.

It turns out that udisks actually uses udev itself:

DEVICE INFORMATION
udisks relies on recent versions of udev(7) and the Linux kernel.

So what does udev do then? Well udev’s main work is to populate the /dev folder on your root directory. So it’s kind of a discovering daemon that tells the kernel what devices are connected and where. That’s all!

Below you can see the bigger picture of how everything works together.

Kernel udev udisks dbus communication

udev, udisks and dbus, they are all daemons running in the background from the time you start your computer.

Netlink and Dbus
Netlink and Dbus are two different protocols used for processes to talk with each other. In the first case the kernel communicates with udev and in the second case udisks with the dbus daemon. Many applications use the dbus protocol like Unity, NetworkManager, Skype, and pretty much all Gnome apps.

Notice that the dbus-daemon that is running all the time is part of the dbus protocol. I talk more about how dbus works, later so you get an understanding as to why a daemon is needed.

What’s udev?

The kernel uses the Netlink protocol (essentially UNIX sockets) to send messages to udev daemon. These messages are the ones we saw earlier when we typed udevadm monitor in the terminal. The way this works is by udev setting up some specified rules that the kernel reads at bootup. Then whenener a device is connected to your computer, a message to udev is sent from the kernel.

udisks uses the udev library and thus has access to all this. If you can use udev then you can pretty much access the kernel messages sent to udev.

And what about that dbus thingy?
The dbus daemon is not specific to udisks only. dbus daemon runs always and is a generic solution for processes to exchange information. NetworkManager for example uses this, so someone could create a program that automatically gets notified if an ethernet card is inserted by simply using dbus. All programs that use dbus, have access to each other. Luckily for us, udisks uses dbus 🙂

Choosing between udev, dbus and direct kernel messaging

From all the above it should become apparent that there are three main ways we can get notified about a USB insertion:

  1. Talk directly with the kernel with Netlink messages. In this case we essentially place ourselves in udev’s position.
  2. Use the udev library like udisks does. In python this is done with using pyudev.
  3. Use the Dbus protocol to talk with udisks directly.

There are pros and cons with all these three approaches. I will explain a bit on all three starting from the lowest level to the highest level.

Kernel
The kernel way is probably the best but is probably a hell to get working since it’s so low-level. You need to deal with raw sockets since that’s how Netlink communication is done. The good thing with this approach is that you are not dependent on anything more than the kernel itself. So it’s pretty hard that something will break and you can always be sure that your solution will work for any distribution.

udev
udev is probably the best practical solution. It handles the messages sent from the kernel so we don’t need to handle them. We only use a udev library and that’s all. From what I am aware all major Linux distributions use udev so you’re not going wrong with this solution. The only bad thing (and this is my own opinion) is that the udev library bindings for Python don’t come with Python so you need to install them on your system on your own.

Dbus
Dbus uses itself udev as we mentioned earlier so it has already a dependency. However as we sayied, pretty much all distros use udev so that shouldn’t be a problem. The main good things with Dbus is that the dbus library comes with Python (both 2 and 3) so you can start coding directly without having to install a bunch of stuff. An extra good thing is that maaaaaaany programs use the Dbus protocol so once you learn it, you can potentially access a lot of things. The bad thing with dbus is its ugly design (my opinion again).

In this tutorial I will talk about dbus since it’s one of the things that you learn once and can use in many different situations in the future.

How dbus works

Choosing Dbus means we will have to learn a bit on how Dbus protocol works and then try and find out how udisks uses dbus. I will try to make this as easy as possible even if it’s a quite hard task considering how sparse information is and how ugly some things in dbus are.

So what exactly is dbus? How does it connect things together? I think the easiest way to explain, is by showing you a diagram.

dbus diagram

The dbus-daemon keeps a sort of a freeway which all programs can access. In our diagram, udisks, NetworkManager and our python program can share information between them. The dotted yellow rectangles in the diagram are representing the messages exchanged between the programs.

There are actually two freeways (buses)

Dbus uses two different freeways like the one I mentioned above. That also means there are two dbus-daemon running. One is a system-wide freeway and the other is a session-wide freeway. In this context, session is the duration of time that a user is logged in.

These dbus daemons are running:

ps ax | grep dbus-daemon
  844 ?        Ss     0:00 dbus-daemon --system --fork
 2451 ?        Ss     0:00 dbus-daemon --fork --session --address=unix:abstract=/tmp/dbus-IP8uQnHdYR
 2545 ?        S      0:00 /bin/dbus-daemon --config-file=/etc/at-spi2/accessibility.conf --nofork --print-address 3

As you see there are two dbus-daemon running, or three in this case. Omit the third one (no idea what that is). The two main daemons are the first two. Notice the --system in the first one and the --session in the second. These are exactly what you might guess they are.. The first one keeps the system-wide freeway and the second keeps the session-wide freeway. I think it’s time we drop the ‘freeway’ word and use bus instead.

Now each freeway has its own rules. Furthermore, each program using dbus, can setup its own rules. All these rules are of the type “root can send this type of message” and “any user can access this type of message but not send this type”, etc.

An important thing to notice is that surprisingly there is a dbus-daemon running on your system. This daemon is used to put all programs using dbus communication together. So when we will try to connect to program A via dbus, we will actually connect to dbus-daemon which acts as a mediator.

So how do we use this module? I give below two examples. First I try to access NetworkManager and then udisks. I do this to show you how the workflow is and as a proof that you can use dbus for more than merely USB insertion scenarios.

Listing programs connected to dbus

The simplest way to find which programs are connected to dbus is from Python.::

>>> import dbus
>>> bus = dbus.SystemBus()
>>> bus.list_names()
dbus.Array([dbus.UTF8String('org.freedesktop.DBus'), dbus.UTF8String(':1.7'), dbus.UTF8String(':1.8'), dbus.UTF8String(':1.9'), dbus.UTF8String('org.freedesktop.ModemManager1'), dbus.UTF8String('org.freedesktop.NetworkManager'), dbus.UTF8String('com.ubuntu.Upstart'), dbus.UTF8String('org.freedesktop.Accounts'), dbus.UTF8String('org.freedesktop.RealtimeKit1'), dbus.UTF8String(':1.60'), dbus.UTF8String(':1.61'), dbus.UTF8String(':1.62'), dbus.UTF8String(':1.40'), dbus.UTF8String('com.canonical.NMOfono'), dbus.UTF8String(':1.85'), dbus.UTF8String(':1.63'), dbus.UTF8String(':1.86'), dbus.UTF8String('org.freedesktop.PolicyKit1')
..

That will list all programs connected to the system bus. In practice, this means that we can connect to any of these programs (as long as we connect to the right bus ofcourse)!

In the same fashion we can list programs connected to the session bus (these are probably much more):

>>> import dbus
>>> bus = dbus.SessionBus()
>>> bus.list_names()
dbus.Array([dbus.UTF8String('org.freedesktop.DBus'), dbus.UTF8String(':1.128'), dbus.UTF8String('com.canonical.Unity.Launcher'), dbus.UTF8String('org.freedesktop.Notifications'), dbus.UTF8String(':1.7'), dbus.UTF8String('com.canonical.indicator.datetime'), dbus.UTF8String(':1.8'), dbus.UTF8String(':1.9'), dbus.UTF8String('org.gtk.Private.AfcVolumeMonitor'), dbus.UTF8String('com.canonical.indicator.sound'), dbus.UTF8String('org.gtk.vfs.Daemon'), dbus.UTF8String('org.pulseaudio.Server'), dbus.UTF8String('com.canonical.indicator.application'), dbus.UTF8String('com.canonical.Unity.Webapps.Service'), dbus.UTF8String('com.ubuntu.Upstart'), dbus.UTF8String(':1.80'), dbus.UTF8String('org.gnome.SessionManager'), dbus.UTF8String(':1.81'), dbus.UTF8String('org.gnome.evolution.dataserver.Sources2'), dbus.UTF8String('com.canonical.indicator.session'), dbus.UTF8String('com.canonical.hud'), dbus.UTF8String(':1.82'), dbus.UTF8String(':1.83'),
..

Notice that in some cases we get domain looking strings and in some cases we get weird numbers like ‘:1.83’ and ‘:1.128’. These are pretty much the same thing as domains and IP on the internet. In this case however instead of the domain/IP pointing to a host, they point to a program on our computer. Ofcourse the number “:1.128” is quite cryptic and there’s actually no way to know which program it resolves to. But that is fine since the programs that are meant to be accessed, are given domain-like names just so that we can find them easily.


The right tools

Now that we know which program we want to connect to, we have to somehow find out how and what we can access from it. For this I will use a tool called d-tree. This tool gives you an overview of all programs connected to dbus and also give you a list of things that you can access.

On Ubuntu you can install this program directly with:

sudo apt-get install d-feet

On a side-note, you might feel tempted to access all this via Python or by using the terminal. Do your self a favour and don’t. I already took that path and there are just too many issues you might run into that you’ll never be able and truly understand how things run without getting in the low-level source code.

Running a program’s method

Whenever you want to access a program’s method or property, you need exactly four things:

  1. The program’s domain name
  2. Object path
  3. Interface
  4. The method or property (signals are a special case so will talk about them separately)

All these things are pretty random and make little sense. However that’s the API that we have for DBus so if you want it you have to bare with me. You just have to learn how to find each one of them and then how to use them in your Python program to access the property or method you want. You will be using d-feet to locate all these four things!

As an example I will take NetworkManager. Just look below..

d-feet NetworkManager

The method I want to run in this case is GetDevices(). However in order to run it I have to locate all the other things: the program, the interface and the object path. Just looking at the snapshot it’s easy to see where all these are located. As said, you just have to get used to finding them for any program.

Translating all this to code looks like this:

import dbus
bus = dbus.SystemBus()
obj = bus.get_object('org.freedesktop.NetworkManager', '/org/freedesktop/NetworkManager')
obj.GetDevices()

Where ‘org.freedesktop.NetworkManager’ is the program’s domain and ‘/org/freedesktop/NetworkManager’ is the so called object path. There is no real logic as to what an object path is. It’s just some bad implementation in the core, where someone tried to bring object oriented coding into Dbus which is written in C. What came out of this effort is this monster of illogical terms that just don’t fit together and just complicate things. However we have to learn all this if we are to use Dbus.

Output from the above code:

dbus.Array(
  [
    dbus.ObjectPath('/org/freedesktop/NetworkManager/Devices/0'),
    dbus.ObjectPath('/org/freedesktop/NetworkManager/Devices/1')
  ],
    signature=dbus.Signature('o')
)

Now if you are a person that spots things, you probably noticed that we didn’t enter the Interface (which happens to look exactly like the program name) anywhere in our code. So how come the code works when I just stated that we always need 4 things? Well.. it’s one of those bad designs. The reason it works is because the dbus library tries to “guess” which interface you want to use. In this case it was right. However many times it doesn’t work and you will run into problems that are really hard to debug. So a friendly advice is to ALWAYS give the 4 things mentioned. FOUR IS THE NEW FIVE.

In our case the fail-safe code would be:

import dbus
bus = dbus.SystemBus()
obj = bus.get_object('org.freedesktop.NetworkManager', '/org/freedesktop/NetworkManager')
iface = dbus.Interface(obj, 'org.freedesktop.NetworkManager')
iface.GetDevices()

Same output as before. But use this convention and you will never run into debugging nightmares. After all, explicit is better than implicit and that is especially true in this case.

Listing devices with udisks

Building on top of what we did with NetworkManager we continue but this time with UDisks2.

udisks2 in d-feet

As you might notice, udisks doesn’t have a straighforward method that we can call to list all devices. So someone has to dig through the things, read UDisks2 dbus API or simply google. Luckily for you, I did the digging.

udisks2 in d-feet

From the d-feet snapshot we can see directly the four things we need: the program domain (org.freedesktop.UDisks2), the interface (org.freedesktop.DBus.ObjectManager), the object path (/org/freedesktop/UDisks2) and lastly the method (GetManagedObjects()).

And the appropriate code is:

import dbus
bus = dbus.SystemBus()
obj = bus.get_object('org.freedesktop.UDisks2', '/org/freedesktop/UDisks2')
iface = dbus.Interface(obj, 'org.freedesktop.DBus.ObjectManager')
iface.GetManagedObjects()

You should get a blob of stuff including any mounted drives. Indeed you still have to dig through things but at least you shouldn’t be totally lost by now.

Getting properties/attributes

Building on the above, we will try to access some properties of a device. But first have a look at d-feet and pay attention to the fact that if you scroll down you will see the storage devices connected to your computer.

bunch of devices

Then we can also see that the device I choose has a bunch of properties.

bunch of properties

Now the logical thing to access these properties would be to do as we did earlier with the methods. Instead of methods we would just call the properties. And that my friend.. just won’t work! (Ugly Dbus, I hate you.)

So in order to access any property of any object we need to use a very specific interface: ‘org.freedesktp.DBus.Properties’.

So to not make this too tedious I will give the whole code again:

import dbus
bus = dbus.SystemBus()
obj = bus.get_object('org.freedesktop.UDisks2', '/org/freedesktop/UDisks2/drives/MBED_microcontrolleur_10105a42e87da33c103dccfb6bc235360a97')
iface = dbus.Interface(obj, 'org.freedesktop.DBus.Properties') # Here we use this 'magic' interface
iface.GetAll('org.freedesktop.UDisks2.Drive')

Notice that the object path ‘/org/freedesktop/UDisks2/drives/MBED_microcontrolleur_10105a42e87da33c103dccfb6bc235360a97’ has to be replaced with one that actually exists on your computer.

Then we open the correct interface and call GetAll(). The argument to GetAll() is the interface name that you want to get the properties of. As I told many times.. Dbus is so damn confusing! Anyway, the good thing is that the ‘org.freedesktp.DBus.Properties’ interface has only three methods we need to learn: Get(), Set(), GetAll(). So things can’t get more complicated than this.

If we want to access a very specific property we just use Get() instead of GetAll() and pass the interface name followed by the name of the property we want:

import dbus
bus = dbus.SystemBus()
obj = bus.get_object('org.freedesktop.UDisks2', '/org/freedesktop/UDisks2/drives/MBED_microcontrolleur_10105a42e87da33c103dccfb6bc235360a97')
iface = dbus.Interface(obj, 'org.freedesktp.DBus.Properties')
iface.Get('org.freedesktop.UDisks2.Drive', 'Id')     # Only difference

How about that USB insertion notification? (signals)

We saw how we can call methods but what about the interesting things like getting notified about an event like USB insertion? For this we can use signals.

All a signal is, is an incoming message from UDisks2 telling us that a mountable device has been inserted. From our part this signal will run a function in our code, a so called callback function. If you check in d-feet you will see many types of signals. The one that interests us is InterfacesAdded() from org.freedesktop.DBus.ObjectManager.

So how do we do this? I will get you straight the code for this one and do the explaining afterwards.

import dbus
from dbus.mainloop.glib import DBusGMainLoop
DBusGMainLoop(set_as_default=True)

bus = dbus.SystemBus()

# Function which will run when signal is received
def callback_function(*args):
    print('Received something .. ', args)

# Which signal to have an eye for
iface  = 'org.freedesktop.DBus.ObjectManager'
signal = 'InterfacesAdded'
bus.add_signal_receiver(callback_function, signal, iface)

# Let's start the loop
import gobject
loop = gobject.MainLoop()
loop.run()

If you run the above code, you will get printed things for every disconnection or connection of a storage device. Try plugging in your USB stick and see for yourself.

There are two things worth noticing in the code with signals:

  1. We import and use a bunch of weird things like gobject and DBusGMainLoop
  2. We don’t need the four things like earlier. We only need to know the interface and the signal (oh and both are just strings)

For our use the only thing we need from these libraries are the so called loops. The loops are needed since someone has to be looping somewhere waiting for the signal to arrive. We could probably create our own loop but I am not so sure how easy it would be. You could take that route if you want, may Zoidberg Jesus guide you the way. Below I mention a bit on what excactly Glib and Gobject is, just in case you’re curious and bored to google.

Glib
Glib is a library that originally was developed to be used with Gnome’s GTK. However later it split from it so that it could be used on any Linux machine. Many programs use the Glib library since it’s it provides so many things that is hard to find (in C) in a general-purpose library. Some things offered are data-structures like hash tables and linked lists (just keep in mind next time you will need any of this).

Gobject
Gobject is built on top of Glib. What it does is merely allow object oriented programming with the Glib. Since Glib is a C library, it doesn’t have object-oriented things lik objects, inheritance, classes etc. For that reason Gobject was developed. In our case, the reason we use it is to easily be able and change things on the loop without having to dig inside the C code.

Unblocking the loop

Now the loop above seems to work fine. There is a problem though. Your whole program blocks in this loop. Even if you add 1000 threads, they will all block for some weird reason and only the loop will run.

In order to stop the loop you have to:

  1. Add the code with the signal on a Python thread
  2. Initialize internally a thread for gobject (no clue what that does but won’t work if you don’t do it)

So in practice we just put the whole code above in a separate thread and then we make sure that gobject is calling threads.init() before we run() the loop.

And here is the code with the loop running on its own thread:

import threading, time

def start_listening():
    import dbus
    from dbus.mainloop.glib import DBusGMainLoop
    DBusGMainLoop(set_as_default=True)

    bus = dbus.SystemBus()

    # Function which will run when signal is received
    def callback_function(*args):
        print('Received something .. ', args)

    # Which signal to have an eye for
    iface  = 'org.freedesktop.DBus.ObjectManager'
    signal = 'InterfacesAdded'
    bus.add_signal_receiver(callback_function, signal, iface)

    # Let's start the loop
    import gobject
    gobject.threads_init()      # Without this, we will be stuck in the glib loop
    loop = gobject.MainLoop()
    loop.run()

# Our thread will run start_listening
thread=threading.Thread(target=start_listening)
thread.daemon=True              # This makes sure that CTRL+C works
thread.start()

# And our program will continue in this pointless loop
while True:
    time.sleep(1)
    print("tralala")

This is a good starting point for your program. Feel free to copy/paste it 🙂

Digging further

All this should give you an idea on how to get started. Now the main thing is probably that you have to filter out some signal events since InterfacesAdded() might run several times for a single unplug or plug-in of a device. Or depending on your case maybe you want to wait for a different signal. Using d-feet this should be rather easy to figure out.

The easiest way to do things from my experience is to use d-free and the Python interpreter directly to test things. However if you feel comfortable with ugly complicated interfaces then the UDisks2 dbus API might be of benefit for you.

References

http://dbus.freedesktop.org/doc/dbus-specification.html
https://www.kernel.org/doc/ols/2003/ols2003-pages-249-257.pdf
http://www.freedesktop.org/wiki/Software/dbus/
http://udisks.freedesktop.org/docs/latest/
http://en.wikipedia.org/wiki/Udev
http://en.wikipedia.org/wiki/D-Bus
http://dbus.freedesktop.org/doc/dbus-python/api/
http://doc.opensuse.org/products/draft/SLES/SLES-admin_sd_draft/cha.udev.html
http://free-electrons.com/doc/udev.pdf
http://dbus.freedesktop.org/doc/dbus-python/doc/tutorial.html


18 Comments

Running a GUI application in a Docker container

This guide will show you how to run a GUI application headless in a Docker container and even more specific scenarios involving running Firefox and Chrome. If you are not interested about those then you can just stop in the middle of this tutorial.

What the hell is X?

X is a program that sits on a Linux machine with a monitor (so servers usually don’t use X). X’s job is to talk to the Linux kernel in behalf of GUI programs. So if you are playing a game for example, the game (that is, the application) is constantly sending drawing commands to the X server like “draw me a rectangle here”. X forwards all this to the Kernel which will further forward the information to the GPU to render it on the monitor.

X can even receive commands from the keyboard or mouse. When you click to shoot on your game for example, the command “click at 466,333” is sent from your mouse to the kernel, from the kernel to the X and from X to the game. That way the game can have a clue on what is happening!

You will often hear X being called a server and the reason for that is simply because the way the applications send commands to it is through sockets. For that reason the applications are also referred to as clients many times.

If you are reading this then the X is running on your PC. Let’s prove it:

> ps aux | grep X
root      1436  3.2  0.7 687868 94444 tty7     Ssl+ 09:47   2:50 /usr/bin/X -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch

We can see that X is running as root and has PID 1436. An other important thing is to notice the :0 which is called display in X jargon. A display is essentially:

  • A monitor
  • A mouse
  • A keyboard

And this is the bigger picture of how it all looks together:

Now there is a variable in Linux that is used whenever we run a GUI program. That variable is surprisingly called DISPLAY. The syntax of the DISPLAY variable is

<hostname>:<display>.<monitor>

. Let’s check the DISPLAY on our computer:

> echo $DISPLAY
:0

I get :0, which means we use display 0. Notice however that this says nothing about which monitor we use. This makes sense since if you are running 2 or more monitors on your Linux you still have the same environment variables in both of them. It wouldn’t make sense that an environment variable changes just because you echo it from a different screen, would it? For that reason we get the display and not the monitor so that we get the same output on both. As about the hostname, since there is no info about it, the local host is assumed.

On a notice, if you have multiple monitors you can still specify which monitor to run an application by simply typing the full display variable you want. So if you have a monitor 0 and a monitor 1 on the current display, I can run firefox on monitor 1 with:

DISPLAY=:0.1 firefox

Creating a virtual monitor

Instead of running X, we can run a different version of it that can create virtual displays. Xvfb (virtual framebuffer – whatever the hell that means) will create a virtual monitor for us.

So let’s make a new monitor (I assume you have installed xvfb):

Xvfb :1 -screen 0 1024x768x16

This will start the Xvfb server with a display 1 and a virtual screen(monitor) 0. We can access this by simply typing DISPLAY=:1.0 before running our graphical program. In this case the program will start in the virtual screen instead of our monitor.

Let’s make sure that the screen is still running:

> ps aux | grep X
root      1436  3.1  0.7 684580 91144 tty7     Ssl+ 09:47   3:31 /usr/bin/X -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
manos    22018  0.0  0.1 164960 20756 pts/27   Sl+  11:37   0:00 Xvfb :1 -screen 0 1024x768x16

We see we have the normal display 0. (A way to tell it is the default screen is to see that it runs as root.) We can also see the second display :1 and screen 0 with resolution 1024×768. So what if we want to use it?

Open a new terminal and type:

> DISPLAY=:1.0 firefox
..

This will start firefox at the given display. The reason I use the DISPLAY at the same line is to make sure that the subprocess inherits the variable DISPLAY. An other way to do this is to type:

> DISPLAY=:1.0
> export DISPLAY
> firefox
..

Run a GUI program in a Docker container

We will now create a virtual screen inside a docker container.

> docker run -it ubuntu bash
root@660ddd5cc806:/# apt-get update
root@660ddd5cc806:/# apt-get install xvfb
root@660ddd5cc806:/# Xvfb :1 -screen 0 1024x768x16 &> xvfb.log  &
root@660ddd5cc806:/# ps aux | grep X
root        11  0.0  0.1 169356 20676 ?        Sl   10:49   0:00 Xvfb :1 -screen 0 1024x768x16

So now we are sure that we are running the virtual screen. Let’s access it and run something graphical on it. In this case I will run Firefox and Python+Selenium just as a proof of concept of what is happening.

First I put my display variable and use export to assure that any sub-shells or sub-processes use the same display (with export, they inherit the variable DISPLAY!):

root@660ddd5cc806:/# DISPLAY=:1.0
root@660ddd5cc806:/# export DISPLAY

Now we can simply run a browser

root@660ddd5cc806:/# firefox
(process:14967): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed
Xlib:  extension "RANDR" missing on display ":99.0".
(firefox:14967): GConf-WARNING **: Client failed to connect to the D-BUS daemon:
//bin/dbus-launch terminated abnormally without any error message
..

The errors don’t mean anything. But we can’t be sure, can we? I mean, since we can’t see what’s happening it’s really hard to tell. There are two things we can do, either use ImageMagick to take a snapshot and send it to our host via a socket or we can simply use Selenium. I will do that since most people probably want to achieve all this for testing purposes anyway.

root@0e395f0ef30a:/# apt-get install python-pip
root@0e395f0ef30a:/# pip install selenium
root@0e395f0ef30a:/# python
>>> from selenium import webdriver
>>> browser=webdriver.Firefox()
>>> browser.get("http://www.google.com")
>>> browser.page_source

If you get a bunch of HTML, then we have succeeded!

The Chrome issue

If you try and run Chrome in a Docker container, it won’t work even if you have setup everything correctly. The reason is that Chrome uses something called sandboxing. Reading this I could not let but notice the word jail. Apparently it seems that Chrome uses Linux containers (the same that Docker uses). For this reason you have to put a bit of extra effort to solve this issue since because of technical difficulties it’s not possible to run containers in containers.

There are two workarounds:

  1. Use my docker-enter
  2. Use –privileged when running the container

The second solution is probably the best one. However while testing things, there’s nothing wrong with the first one.

So to make things work (notice I run everything from the start):

> docker run -it --privileged ubuntu bash
root@7dd2c07cb8cb:/# apt-get update
root@7dd2c07cb8cb:/# apt-get install wget python-pip xvfb
root@7dd2c07cb8cb:/# wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
OK
root@7dd2c07cb8cb:/# echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list
root@7dd2c07cb8cb:/# apt-get update
root@7dd2c07cb8cb:/# apt-get install google-chrome-stable
root@7dd2c07cb8cb:/# pip install selenium

I have now installed Selenium, Chrome and Xvfb. Now I am going to make make a virtual monitor and run Chrome:

root@7dd2c07cb8cb:/# Xvfb :99 -screen 0 1024x768x16 &> xvfb.log &
[1] 6729
root@7dd2c07cb8cb:/# DISPLAY=:99.0
root@7dd2c07cb8cb:/# export DISPLAY
root@7dd2c07cb8cb:/# google-chrome
Xlib:  extension "RANDR" missing on display ":99.0".
Xlib:  extension "RANDR" missing on display ":99.0".
[6736:6736:1017/143449:ERROR:desktop_window_tree_host_x11.cc(802)] Not implemented reached in virtual void views::DesktopWindowTreeHostX11::InitModalType(ui::ModalType)
ATTENTION: default value of option force_s3tc_enable overridden by environment.
failed to create drawable
[6775:6775:1017/143449:ERROR:gl_surface_glx.cc(633)] glXCreatePbuffer failed.
[6775:6775:1017/143449:ERROR:gpu_info_collector.cc(27)] gfx::GLContext::CreateOffscreenGLSurface failed
[6775:6775:1017/143449:ERROR:gpu_info_collector.cc(89)] Could not create surface for info collection.
[6775:6775:1017/143449:ERROR:gpu_main.cc(402)] gpu::CollectGraphicsInfo failed (fatal).
[6775:6775:1017/143449:ERROR:sandbox_linux.cc(305)] InitializeSandbox() called with multiple threads in process gpu-process
[6775:6775:1017/143449:ERROR:gpu_child_thread.cc(143)] Exiting GPU process due to errors during initialization
[6736:6736:1017/143449:ERROR:gpu_process_transport_factory.cc(418)] Failed to establish GPU channel.

It seems that it works. It’s normal that we get the gpu errors since we don’t have a gpu! However I don’t like gambling so we will take it a step further to check that the browser actually works. However for this I will need to download the webdriver for Google Chrome..

root@7dd2c07cb8cb:/# apt-get install curl unzip
root@7dd2c07cb8cb:/# cpu_arch=$(lscpu | grep Architecture | sed "s/^.*_//")
root@7dd2c07cb8cb:/# version=$(curl 'http://chromedriver.storage.googleapis.com/LATEST_RELEASE' 2> /dev/null)
root@7dd2c07cb8cb:/# url_file="chromedriver_linux${cpu_arch}.zip"
root@7dd2c07cb8cb:/# url_base="http://chromedriver.storage.googleapis.com"
root@7dd2c07cb8cb:/# url="${url_base}/${version}/${url_file}"
root@7dd2c07cb8cb:/# wget "$url"
root@7dd2c07cb8cb:/# unzip chromedriver_*.zip -d tmp
root@7dd2c07cb8cb:/# mv tmp/chromedriver usr/bin/

Now (FINALLY!) we can test with Selenium:

root@7dd2c07cb8cb:/# python
>>> from selenium import webdriver
>>> browser=webdriver.Chrome()
>>> browser.get("http://en.wikipedia.org/wiki/Open_source")
>>> browser.page_source

You should get a bunch of HTML code. So there we go!

Common errors

.. Gtk: cannot open display:

DISPLAY has wrong value or you forgot to export it!

References

http://www.x.org/wiki/Development/Documentation/HowVideoCardsWork/
http://www.x.org/archive/X11R7.7/doc/man/man1/Xvfb.1.xhtml
http://blog.mecheye.net/2012/06/the-linux-graphics-stack/
http://people.freedesktop.org/~marcheu/linuxgraphicsdrivers.pdf
http://www.tldp.org/HOWTO/Framebuffer-HOWTO/
http://en.wikipedia.org/wiki/X_Window_System
http://en.wikipedia.org/wiki/Framebuffer
http://en.wikipedia.org/wiki/X.Org_Server
http://en.wikipedia.org/wiki/Display_server
http://www.google.com/googlebooks/chrome/med_26.html
http://linux.die.net/man/1/xvfb
https://www.freebsd.org/doc/handbook/x-understanding.html


1 Comment

Docker in a development enviroment

Intro

A few days ago I wrote a tutorial on how to setup docker and use it between different machines. Now that was a nice first insight on how to jump-start using docker. It was also a nice way to showcase the possibilities and limitations of Docker.

In this post I will give some practical information on how to use docker as a developer.

Setup

To use Docker for development of software we want mainly three things:

  • Have our source code on the host machine. That way we can use GUI editors and whatever tools we want from outside the container.
  • Be able to have multiple terminals to the same container. This is good for debugging
  • Setup a docker image which we will use for running our program. I will use Django and Python for that.

And for the visual brains out there:
container_host_communication
As you see in the pic, I am using Ubuntu as my host machine. At the same machine I have a folder with the source code and two terminals. Then I run a container with OpenSUSE. The folder and terminals reside ont he host machine but they communicate directly with the container. I will describe below how to achieve all this.

Multiple terminals

The easiest way to have multiple terminas is to use a small tool called nsenter. The guide can be found at https://github.com/jpetazzo/nsenter but it sums up to running this one-liner from any folder:

> docker run --rm -v /usr/local/bin:/target jpetazzo/nsenter

That installs nsenter on the host machine. After that, we can use it directly. So let’s try it. Open bash in a container with ubuntu as our basic image:

> docker run -t -i ubuntu /bin/bash
root@04fe75de21d4:/# touch TESTFILE
root@04fe75de21d4:/# ls
TESTFILE  boot	etc   lib    media  opt   root	sbin  sys  usr
bin	  dev	home  lib64  mnt    proc  run	srv   tmp  var

In the terminal above, I created a file called TESTFILE. We will try to open a second terminal and check to see the file from it.

To use xsenter we need the process ID of the container. Unfortunately we can’t use ps aux but rather have to use docker’s inspect command. I open a new terminal and type the below

> PID=$(docker inspect --format {{.State.Pid}} 04fe75de21d4)
> sudo nsenter --target $PID --mount --uts --ipc --net --pid

The string 04fe75de21d4 I gave is the ID of my container. If everything went ok, your terminal prompt will change to the same ID:

root@04fe75de21d4:/# ls
TESTFILE  bin  boot  dev  etc  home  lib  lib64  media	mnt  opt  proc	root  run  sbin  srv  sys  tmp	usr  var

See the TESTFILE there? Congrats! Now we have a second terminal to the exact same container!

Share a folder between host and container

Now I want to have a folder on my host computer and be able and access it through a container.

Luckily for us there is a built-in way to do that. We just have to specify a flag -v to docker. First let’s make a folder though that will be mounted:

> mkdir /home/manos/myproject

Let’s now mount it into the container:

> sudo docker run -i -t -v /home/manos/myproject:/home/myproject ubuntu /bin/bash
> root@7fe33a71ac2f:/#

If I now create a file inside /home/manos/myproject the change will be reflected from inside the container and vice versa. Play a bit with it by creating and deleting files from either the host or from inside the container to see for yourself.

Create a user in the container

It is wise to have a normal user in your image. If you don’t then you should create one and save the image. That way the source files can be opened from a normal user on your host – you won’t need to launch your IDE with root privileges.

> adduser manos
..

Follow the instructions and then commit your image. That way whenever you load the image again, you will be having user manos. To change to user manos just type

> su manos

All files you create now, will be accesible by a normal user at the host machine. Something else you could do is to somehow

Real life scenario: Python, Django and virtualenv

I wanted to learn Django. Installing Django is commonly made with the package manager pip, but pip has a bad history of breaking up things since it doesn’t communicate with Debian’s apt. So at some point if you installed/uninstalled python stuff with apt, pip wouldn’t know about it and vice versa. In the end you would end up with a broken python environment. That’s why a tool called virtualenv is being used – a tool that provides isolation. Since we have docker though which also provides isolation we can simply use that.

So what I really want:

  1. Have the source code on my host.
  2. Run django and python inside a container.
  3. Debug from at least two terminals.

Visually my setup looks as something like this:

docker_dev_setup_labels_900x500

I assume you have an image with django and python installed. Let’s call the image py3django.

Firstly create the folder where you want your project source code to be. This is the folder that we will mount. My project resides in /home/manos/django_projects/myblog for example.

Once it’s created I just run bash on the image py3django. This will be my primary terminal (terminal 1):

> sudo docker run -i -t -p 8000:8000 -v /home/manos/django_projects/myblog:/home/myblog py3django /bin/bash
root@2fe3611c1ec2:/home#

The flag -p makes sure that docker doesn’t choose a random port for us. Since we run Django we will want to run a web server with a fixed port (on 8000). The flag -v mounts our host folder /home/manos/django_projects/myblog to the container’s folder /home/myblog. py3django is the image I have.

Now we have a folder where we can put our source code and a working terminal to play with. I want though a second terminal (terminal 2) to run my python webserver. So I open a second terminal and type:

> sudo nsenter --target $(docker inspect --format {{.State.Pid}} 2fe3611c1ec2) --mount --uts --ipc --net --pid
> root@2fe3611c1ec2:/#

Mind that I had to put the appropriate container ID in the command above.

Now all this is very nice but admittedly it’s very complex and it will be impossible to remember all these commands and boring to type them each single day. Therefore I suggest you create a BASH script that initiates the whole thing.

For me it took a whole day to come up with the script below:

#! /bin/bash

django_project_path="/home/manos/django_projects/netmag" # Path to project on host
image="pithikos/py3django_netmag_rmved"                  # Image to run containers on

echo "-------------------------------------------------"
echo "Project:  $django_project_path"
echo "Image  :  $image"


# 1. Start the container in a second terminal
proj_name=`basename $django_project_path`
old_container=`docker ps -n=1 -q`
export docker_line="docker run -i -t -p 8000:8000 -v $django_project_path:/home/$proj_name $image /bin/bash"
export return_code_file="$proj_name"_temp
rm -f "$return_code_file"
gnome-terminal -x bash -c '$docker_line; echo $? > $return_code_file'
sleep 1
if [ -f "$return_code_file" ] && [ 0 != "$(cat $return_code_file)" ]
then
	echo
	echo "--> ERROR: Could not load new container."
	echo "    Stop any other instances of this container"
	echo "    if they are running and try again."
	echo 
	echo "    To reproduce the error, run the below:"
	echo "    $docker_line"
	echo
	rm -f "$return_code_file"
	exit 1
fi
rm -f "$return_code_file"


# 2. Connect to the new container
while [ "$old_container" == "`docker ps -n=1 -q`" ]; do
	sleep 0.2
done
container_ID=`docker ps -n=1 -q`
sudo nsenter --target $(docker inspect --format {{.State.Pid}} $container_ID) --mount --uts --ipc --net --pid

This script starts a container on a second terminal and then connects to the container from the current terminal. If starting the container fails, an appropriate message is given. django_project_path is the full path to the folder on the host with the source code. The variable image holds the name of the image to be used.

You can combine this with devilspie, an other nice tool that automates the position and size of windows when they’re launched.

In case you wonder about the top window with all the containers, that’s simply a watch command, a tool that updates regularly a command. In my case I use watch with docker ps. Simple stuff:

> watch docker ps

I use this because I personally like having an overview on the running containers. That way I don’t end up with trillions of forgotten containers that eat up my system.

Now that you have everything setup you can also run django server from one of the two terminals or whatever else you might want.


1 Comment

Docker tutorial

So as an intern in a big company I was given the task to get comfortable with docker. The problem is that docker is quite fresh so there isn’t really that much of good tutorials out there. After reading a bunch of articles and sparse tutorials (even taken the official tutorial at https://www.docker.com/tryit), I still straggled to get a firm grip on what docker even is supposed to be used for. Therefore I decided to make this tutorial for a total beginner like me.

Docker vs VirtualBox

There are many different explanations on the internet about what docker is and when to use it. Most of them however tend to complicate things more than giving some practical information for a total beginner.

Docker simply put is a replacement for virtual machines. I will use virtual box as a comparison example since it’s very easy for anyone to download and try to see the differences themselves.

The application VirtualBox is essentially a virtual machine manager.
text3806

Each and every of the OSes you see in the picture above, is an installed virtual machines. Each such machine has it’s own installed OS, kernel, virtual devices like hard disks, network cards etc. All this takes a considerate amount of memory and needs extra processing power. All virtual machines (VM) like VMware, Parallels, behave the same.

text4037

Now imagine that we want to use nmap from an OpenSUSE machine but we are on an Ubuntu. Using VirtualBox we would have to install the whole OS and then run it as a virtual machine. The memory consumption is humongous for such a trivial task.

In contrast to VirtualBox or any other virtual machine, Docker doesn’t install the whole OS. Instead it uses the kernel and hardware of our primary computer (the host). This makes Docker to virtualize super fast and consume only a fraction of the memory we would else need. See the benefits? Imagine if we wanted to run 4 different programs on 4 different OSes. That would take at a minimum 2GB of RAM.

But why would you want to run nmap on openSUSE instead of the host computer? Well this was just a silly example. There are other examples that prove the importance of a tool like Docker. Imagine that you’re a developer and you want to test your program on 10 different distributions for example. Or maybe you are the server administrator on a company and just updated your web server but the update broke something. No problem, you can run your web server virtualized on the older system version. Or maybe you want to run a web service in a quarantine for security reasons. As you see there are loads of different uses.

One question might rise though: how do we separate each “virtual machine” from the rest of the stuff on our computer? Docker solves this with different kernel (and non-kernel) mechanisms. We don’t have to bother about them though, since Docker takes hands of everything for us. That’s the beauty of it afterall: simplicity.

Install docker

Docker is in the ubuntu repositories (Ubuntu 14.04 here) so it’s as straightforward as:
sudo apt-get install docker

Once installed, a daemon (service) of the docker will be running. You can check that with

sudo service docker.io status

The daemon is called docker.io as you might have noticed. The client that we willuse is simply called docker. Pay attention to this tiny but significant detail.

Configuration

Do these two things before using docker to avoid any annoying warnings and problems.

Firstly we need to add ourselves to the docker group. This will let us to use docker without having to use sudo every time:

sudo adduser <your username here> docker

Log out and then in.

Secondly we will edit the daemon configuration to ensure that it doesn’t use any local DNS servers (like 127.0.0.1). Use your favourite editor to edit the /etc/default/docker.io file. Uncomment the line with DOCKER_OPTS. The result file looks like this for me:

# Docker Upstart and SysVinit configuration file

# Customize location of Docker binary (especially for development testing).
#DOCKER="/usr/local/bin/docker"

# Use DOCKER_OPTS to modify the daemon startup options.
DOCKER_OPTS="-dns 8.8.8.8 -dns 8.8.4.4"

# If you need Docker to use an HTTP proxy, it can also be specified here.
#export http_proxy="http://127.0.0.1:3128/"

# This is also a handy place to tweak where Docker's temporary files go.
#export TMPDIR="/mnt/bigdrive/docker-tmp"

We need to restart the daemon for the change to take effect:

sudo service docker.io restart

Get an image to start with

In our scenario we want to virtualize an Arch machine. On VirtualBox, we would download the Arch .iso file and go through the installation process. In Docker we download a “fixed” image from a central server. There are thousands of different such image files. You can even upload your own image as you will see later.

> docker pull base/arch
Pulling repository base/arch
a64697d71089: Download complete 
511136ea3c5a: Download complete 
4bbfef585917: Download complete

This will download a default image for Arch Linux. “base/arch” is the identifier for the Arch Linux image.

To see a list of all the images locally stored on your computer type

> docker images
REPOSITORY            TAG                   IMAGE ID            CREATED             VIRTUAL SIZE
base/arch             2014.04.01            a64697d71089        12 weeks ago        277.1 MB
base/arch             latest                a64697d71089        12 weeks ago        277.1 MB

Starting processes with docker

Once we have an image, we can start doing things in it as if it was a virtual machine. The most common thing is to run bash in it:

> docker run -i -t base/arch bash
[root@8109626c57f5 /]#

See how the command prompt changed? Now we are inside the image (virtual machine) running a bash instance. In docker jargon we are actually inside a container. The string 8109626c57f5 is the ID of the container. You don’t need to know much about that now. Just pay attention to how we acquired that ID, you will need it.

Let’s do some changes. Firstly I want to install nmap. Since pacman is the default package manager in Arch, I will use that:

[root@8109626c57f5 /]# pacman -S nmap
resolving dependencies...
looking for inter-conflicts...
..

Let’s run nmap to see if it works:

> nmap www.google.com
Starting Nmap 6.46 ( http://nmap.org ) at 2014-07-18 13:33 UTC
Nmap scan report for www.google.com (173.194.34.116)
Host is up (0.00097s latency).
..

It seems we installed it successfully! Let’s also create a file:

[root@8109626c57f5 /]# touch TESTFILE

So now we have installed nmap and created a file in this image. Let’s exit the bash

[root@8109626c57f5 /]# exit
exit
> 

In VirtualBox you can save the state of the virtual machine at any time and load it later. The same is possible with docker. For this I will need the ID of the container that I was using. In our case that is 8109626c57f5 (it was written in the terminal prompt all the time). In case you don’t remember the ID or you have many different containers, you can list all the containers:

> docker ps -a
CONTAINER ID     IMAGE                    COMMAND      CREATED            STATUS
8109626c57f5     base/arch:2014.04.01     bash         25 minutes ago     Exit 0

Let’s save the current state to a new image called mynewimage:

> docker commit -m "Installed nmap and created a file" 8109626c57f5 mynewimage
6bf56047833bd41c43c9fc3073424f37bfbc96993b65b868cb8d6a336ac28b0b

Now we have the saved image locally on our computer. We can load it anytime we want to come back to this state. And the demonstration..

> docker run -i -t mynewimage bash
[root@55c343f1643a /]# ls
TESTFILE  bin  boot  dev  etc  home  lib  lib64  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
[root@55c343f1643a /]# whereis nmap
nmap: /usr/bin/nmap /usr/share/nmap /usr/share/man/man1/nmap.1.gz

Loading the image on an other computer

We now have two images, the initial Arch image we started with and the new image that we saved:

> docker images
REPOSITORY            TAG                   IMAGE ID            CREATED             VIRTUAL SIZE
mynewimage            latest                6bf56047833b        2 hours ago         305.8 MB
base/arch             2014.04.01            a64697d71089        12 weeks ago        277.1 MB
base/arch             latest                a64697d71089        12 weeks ago        277.1 MB

It’s time to load the new image on a totally different computer. First I need to save the image on a server though. Luckily for a docker user, this is very simple. First you need to make an account at https://hub.docker.com/

Once that is done we need to upload the image to the hub. However we have to save the image in a specific format, namely username/whatever.

Let's save the image following that rule:
> docker commit -m "Installed nmap and created a file" 8109626c57f5 pithikos/mynewimage
12079e0719ce517ec7687b4bf225381b99b880510cda3bc1e587ba1da067bd3b

First we need to login to the server:

> docker login
Username (pithikos): 
Login Succeeded

Then I upload the image to the server:

> docker push pithikos/mynewimage
The push refers to a repository [pithikos/mynewimage] (len: 1)
Sending image list
..

Once everything is uploaded, we can pull it from anywhere just as we did when we first pulled the Arch image.

I will do that from inside an OpenSUSE install on a totally different machine. First I try to run nmap

Screenshot from 2014-07-18 17:17:05

As you see it’s not installed on the computer. Let’s load our Arch image that we installed nmap on
Screenshot from 2014-07-18 17:22:34

Once the download of the image is complete we will run a bash on it just to look around:

Screenshot from 2014-07-18 17:27:19_

As you see, the TESTFILE is in there and we can run nmap. We are running the same Arch I ran earlier on Ubuntu, on a totally new machine with a totally different OS, but still running it as an Arch.

A bit on containers

Now you probably got a good idea on what images are. Images are simply states of a “virtual machine”.

When we use docker run whatever we are running is put inside a container. A container is pretty much a Linux concept that arose recently with the recent addition of Linux containers to the kernel. In practise container is running a process (or group of processes) in isolation from the rest of the system. This makes the process in the container to not being able to have access to other processes or devices.

Every time we run a process with Docker, we are creating a new container.

> docker run ubuntu ping www.google.com
PING www.google.com (64.15.115.20) 56(84) bytes of data.
64 bytes from cache.google.com (64.15.115.20): icmp_seq=1 ttl=49 time=11.4 ms
64 bytes from cache.google.com (64.15.115.20): icmp_seq=2 ttl=49 time=11.3 ms
^C
> docker run ubuntu ping www.yahoo.com
PING ds-any-fp3-real.wa1.b.yahoo.com (46.228.47.114) 56(84) bytes of data.
64 bytes from ir2.fp.vip.ir2.yahoo.com (46.228.47.114): icmp_seq=1 ttl=45 time=46.5 ms
64 bytes from ir2.fp.vip.ir2.yahoo.com (46.228.47.114): icmp_seq=2 ttl=45 time=46.1 ms
^C

Here I ran two instances of the ping command. First I pinged http://www.google.com and then http://www.yahoo.com. I had to stop them both with CTRL-Z to get back to the terminal.

> docker ps -a | head
CONTAINER ID      IMAGE             COMMAND                CREATED              STATUS
7c44887b2b1c      ubuntu:14.04      ping www.yahoo.com     About a minute ago   Exit 0
72d1ca1b42c9      ubuntu:14.04      ping www.google.com    7 minutes ago        Exit 0

As you see, each command got its own container ID. We can further analyse the two containers with the inspect command. Below I compare the two different ping commands I ran to make it more apparent how the differentiate in the two containers:

> docker inspect 7c4 > yahoo
> docker inspect 72d > google
> diff yahoo google 
2,3c2,3
<     "ID": "7c44887b2b1c9f0f7c10ef5e23e2643026c99029fce8f1575816a23e56e0c2d0",
<     "Created": "2014-07-21T10:05:45.106057381Z",
---
>     "ID": "72d1ca1b42c995973086a5fb3c5e256d5cb2c5055e8f9040037bb6bb915c8187",
>     "Created": "2014-07-21T10:00:08.653219667Z",
6c6
<         "www.yahoo.com"
---
>         "www.google.com"
9c9
<         "Hostname": "7c44887b2b1c",
---
>         "Hostname": "72d1ca1b42c9",
29c29
<             "www.yahoo.com"
---
>             "www.google.com"
44,45c44,45
..

Notice that I don’t have to write the whole string. For example instead of 7c44887b2b1c, I just type the first three letters 7c4. In most cases this will suffice.


Leave a comment

Who does the cleanup, the process or the kernel?

For a long time I believed that cleanup is work that the kernel does. However that is not completely true. Part of cleanup is also taken by the process itself. But let’s dive into the details..

Ways to terminate a program

There are some ways to terminate a process from whithin the process itself(also referred sometimes as “normal termination”):

  • exit()
  • _exit()
  • return

(break can not be used, as it must be within a switch or loop.)

There are other ways to terminate the process from outside the process by sending a signal. The signal can be sent from a terminal by typing

kill [processID]

or by hitting CTRL+C after you run the program. This will send a SIGINT signal to the process. There are also ways to send signals from whithin the process, like for example using the function abort(). abort() will send a SIGABRT and thus terminate the process(if default handler for SIGABRT is used). However as the same signal can be sent outside the process by the user explicitly(from the terminal), we can easily put this way of terminating a process in the same basket with the other “abnormal” ways of termination.

Do something on exit( atexit )

This is not something that has an absolute connection with cleanup but I mention it for a deep understanding of what happens on a process’ termination.

On any normal termination of a process we can do some specific work. Maybe I want something printed on screen every time the process terminates normally:

void onexit(void){
  puts(&quot;Process terminated normally.&quot;);
}

int main() {
  atexit(onexit);
  /* do stuff here */
  return 0;
}

Now every time main returns, our onexit() function is invoked. Pay attention to that even “return 1;” would work as it is still considered a normal termination of the process. Remember, “normal termination” has not to do with the value returned, but rather with what caused the process termination(was it an outside signal or was just the exit function invoked?).

A more compilcated example using atexit() would be to free some memory after exiting the program:

#include
#include

void *a; //we need global scope

void* allocSomeMemory(void* a) {
  return malloc(10);
}

void freeSomeMemory(void) {
  free(a);
}

int main() {
  a=allocSomeMemory(a);
  atexit(freeSomeMemory);
  /* do stuff with variable a */
  return 0;
}

This will allocate 10 bytes to the global variable a. Then we add a handler(a function that takes care of something) to free the memory allocated whenever we exit the process.

Now this will work, however if it’s a practical thing to do is a subjective thing. The kernel is going to free all resources for the process anyway so we just add more delay to the termination by freeing memory explicitly on exit. However this can be good practice for debugging in some cases.

What is process cleanup?

Cleanup is just the state of bringing back the kernel’s resources to as they were before running a program. That means freeing memory, flushing buffers, closing files, removing the process ID from the process table in the kernel, decrementing counters for open files, removing kernel timers, sending signals to the parent of the process, and much more.

There are two main players when it comes to cleanup:

  • The kernel
  • The process

We will start with the process cleanup which is somehow bulkier to describe.

Cleanup from the process

Process cleanup can occur in both normal and abnormal termination scenarios. In normal termination the cleanup occurs when exit() or return from main occurs. In fact returning from main is going to invoke exit() automatically. The process itself has an overhead(which it got when we compiled it) that tells it what to clean and how. Cleanup of the process are considered these things:

  1. Do some last work if there is registered with atexit()
  2. Flush all unwritten buffered data
  3. Close open streams
  4. Remove all files created by function tmpfile()
  5. Return the exit status and control to the kernel

Personally I don’t think point 1 is so much of a cleanup in the broader sense, but it gives the opportunity to the programmer to add some default behavior on every normal termination that in many cases would be to tidy up things.

Now all this occurs with the call of exit(). There is an exit function that will bypass all first 4 steps and go directly to the 5th step. That function is _exit(). So calling _exit() instead of exit() is actually going to give control directly to the kernel.

“Ok, so what’s the big deal with using _exit() instead of exit() and vice versa?”

In most cases exit() is the way to go. However some times you don’t want the same things “cleaned twice”. One example is using fork(). With fork(), a child process is created. The child inherits a lot of things by the parent process and amongst others the parent’s buffers. If exit() is called from the child, the inherited buffers will be flushed. Later on when the parent also exits, it will flush its buffers as well. In this scenario we will get double output.

Using _exit() in the child, we bypass the flushing from the child process and thus we don’t get unnecessary side effects(like double output).

Cleanup from the kernel

No matter if exit() or _exit() is used, in the end kernel is the big reaper. We will not go too deep into what excactly the kernel does but a few points in the cleanup routine are:

  • destroying kernel structures that were created for the process
  • memory allocated for the process is freed
  • decrementing open files
  • sending signals to the parent process

At this point the process is dead, that is, it’s not loaded in the memory. A very few structures are still present in the kernel solely in case the parent process might be interested. This is what we call a zombie process.

In order for these last structures to be destroyed, the parent must wait() for the child process. Once that has happened, the zombie process disappears and all resources in conjunction with the dead process are free.


18 Comments

File descriptors explained

File descriptors are often used in conjunction with file input and output. However it is not that clear for many people what file descriptors essentially are and that makes it harder to code. That’s what I will try to elaborate in this article so that you really know what you’re dealing with when you close file descriptors, duplicate them, pipe them, etc. Notice that this article uses code and conventions from C.

Files vs File descriptors

First of all we’ll start with the difference between a file structure and a file descriptor. Imagine you have an array of files like this:

files[] -> file1 | file2 | file3

file1, file2 and file3 are file data structures. A file structure is an opaque data structure. Opaque means that we don’t really know how that data structure looks like and we don’t either bother about it. All we need to know is that a file structure represents a file on a hard disk, a USB stick or whatever other storage device.

Going back to our example, the file descriptors are the indices(plural for index) of the array. So the indices in the above example would look like this:

index 0 1 2
file structure file1 file2 file3

Why do we need file descriptors when we have file stuctures?

The truth is that a process keeps track only of file descriptors. The file structures are on the side of the kernel. So the reason is the same reason that we use pointers and not actual data structures when we program in general: to save space and time, or otherwise efficiency. (There is also the element of security but we will not go into that.)

In C code you will often see something like this:

FILE* myfile;

This is something totally different from the a file structure in the kernel that we saw earlier. FILE in C is just a wrapper, or simply said; a structure holding an other structure. In fact FILE in C is just a file descriptor with some extra bells and whistles, nothing more. So why use FILE in C when we can use the file descriptors? Well file descriptors are just numbers. Imagine if we had to open and close numbers all the time. It would be hard to keep track of what we are doing. Everything would be a mess! Except the much friendlier name of a FILE, the FILE structure in C lets us also use more advanced functions which can take a FILE as argument but not a naked file descriptors.

When you start a new process, three file descriptors are created by default. These three file descriptors are called the standard file descriptors and are given the numbers 0, 1, 2. If you remember the Unix mantra, it says that everything in a Unix system is considered a file. That is even true for hardware devices like your monitor and keyboard. In fact there are file structures in the kernel that are corresponding to just those. The file descriptors 0, 1, 2 are indices to these special files. To be more exact, 0 is the index corresponding to the “keyboard file” and 1 and 2 are indices corresponding to the “monitor file”.

Streams

Earlier we said that there are three file descriptors created by default: 0, 1 and 2. We said that 0 corresponds to the “keyboard file” in the kernel and 1 and 2 correspond to the “monitor file”. If we were to sketch all this on paper, it would look a bit like below.

Process1 has the three default file descriptors we talked about. Notice how we hide the “keyboard file” and “monitor file” in the kernel box. That is mostly for simplicity as there is a lot of things going on in the kernel. Other than that the user(in this case the programmer) is not meant to know the inner workings of the kernel. The only thing the programmer can see is the C-like FILE structures and the file descriptors so we play that way.

Now there is something more than file descriptors in this figure. It’s the arrows that show the flow of the data. These arrows, call them channels, call them buses, call them rivers, or whatever you like, have a special name: streams. Stream is just an abstract name to make it easier for the programmer to visualize what is happening with the data. It’s merely easier to talk about stream of data than to talk about indices and the file structures in the kernel that those indices correspond to.

Now the three default file descriptors we talked about earlier, have in fact been baptized with their own special names: stdin, stdout and stderr.

These names are just abstract words that we use to talk about three specific channels of data(in most cases characters). Stdin is the data that we get from the user. Stdout is the flow of normal output to the user and stderr is the flow of output when errors happen in the program. Hopefully now it makes sense why file descriptor 0 is corresponding to the keyboard and file descriptors 1 and 2 are corresponding to the monitor. That is the program gets input from the keyboard while output goes to the monitor.

Now in C and many other languages there are three specific macros that are called stdin, stdout and stderr. These are not streams although they have the same name. That wouldn’t be possible anyway as streams is just an abstract idea(as said earlier) so that the programmer has it easier to visualize what is happening(even if it makes it a living hell for some). The stdin, stdout and stderr macros in C are just pointers to FILE structures(the C-like ones). You can in fact use these functions as you would use any FILE when programming in C. For example see the code below.

fprintf(stdout, "linux");
fprintf(stderr, "meerkat");

Will print “linuxmeerkat” on the screen. A question arises: Why do we have two macros that direct data to the same place? This is the same question as: Why do we have two streams to the monitor? The answer is rather simple. Many times when we develop a program we need to output errors on a different channel than the normal output. It’s just more neat to keep different things separated than mix them. Think for example how easier it is now if we want to hide all error messages in our program or just redirect them to a different place than the monitor.

Finally here is a table with the standard file descriptors(fd) and the corresponding standard streams:

fd stream
0 stdin
1 stdout
2 stderr

File descriptor tables

It’s an important detail to understand that file descriptors are the only file-relevant thing that a process can keep track of. As said earlier FILE structures in C are just a wrapper for a file descriptor so they can also be thought as file descriptors to justify the fact that a process has knowledge of only file descriptors. Now, each process keeps its unique file descriptor table. Say that we have two processes: process 1 and process 2.

process1
0
1
2
process2
0
1
2

When a new process gets created, file descriptors 0, 1 and 2 are created automatically and mapped to stdin, stdout and stderr.

To the eye the above two file descriptor tables look the same. That is, they have the same numbers. However the file descriptor 0 in process1 can be pointing to a totally different thing than the file descriptor 0 in process2.

(You may ask yourself: “How can file descriptor 0 in the two processes point to different things, if as we assumed, file descriptors are indices?”. Well that is a very logical question and the explanation is rather simple. A file descriptor is bundled with a pointer(in the abstract meaning), or if you like, an index to the global Open File Table. That is however nothing we should be worried about, so it’s not shown in the diagrams of this article.)

Just because we say that file descriptors 0, 1, 2 are standard it doesn’t mean that they are always going to correspond to stdin, stdout and stderr. Standard streams are as we think until the programmer decides it’s time to change things around. That is something that I will demonstrate. Bellow you see a visual representation of file descriptors and their connection with the kernel.

This is how the processes look like if we assume that file descriptors are not altered in any way. The arrows show the flow of data. The keyboard and screen icons in the kernel are file structures of the special files. The truth is that things are more complicated in the kernel. However now we focus on keeping track of the file descriptors. That’s also the only thing we can alter directly from inside the process. However don’t assume that because there are two icons that there are only two file structures.

Now let’s change the file descriptors in the first process a bit and see what happens:

close(1);                             //we close the stdout stream
FILE* f=fopen("myarticle.txt", "w");  //open a file for writing
fprintf(stderr, "file has fd: %d\n", fileno(f));

Pay attention that we fprintf() to the stderr as stdout is closed. From the code above this is what we get printed when we run the program:

file has fd: 1

From this output it’s crystal clear that file descriptor 1 is not pointing to stdout anymore. In fact it’s pointing to the file myarticle.txt. How do I know? The reason I know is that the usual behavior of the kernel is whenever we try to create a new file descriptor to give it the lowest number possible. That keeps things clean. Following this paradigm, after we close the stdout stream in the example, we immediately create a file for writing. This file needs a file descriptor. The kernel sees that the lowest number that can be used is 1 so the file gets that index number.

What about the file in the kernel that already has index 1? you might ask. As we said, the kernel is a bit more complicated than shown in the above figure. The kernel doesn’t mess files from different processes, even if they are the same files. So file descriptor 1 from the first process corresponds to a totally different file structure in the kernel than the file descriptor 1 in the second process. That’s something that a programmer shouldn’t worry about. As we said.. care only about the file descriptors. The kernel is a magician that you shouldn’t be aware of how his tricks work.

Here follows a figure with the file descriptors and the kernel after we ran the above code. The file description tables of both processes look like before. However look how the streams on the first process are now differentiated!

We see a new arrow there pointing to a new icon in the kernel. That icon is a new file data structure created in the kernel. The new arrow is a stream like stdin, stdout, stderr. The only difference is that we don’t have a standard name for it. However we can clearly see from the figure that the data is flowing to a file and particularly the file myarticle.txt. Notice that stderr is still flowing to the monitor. That’s also the reason we can still print text on fprintf() to the screen. If we closed this channel then we wouldn’t be able to output anything on the screen anymore.

TIP: When you want to sketch the file descriptors in paper I find it convenient to write the file descriptor in the first column and in the second column the stream-name or filename in case the stream is nameless. In this example I would write it like bellow.

process1
0 stdin
1 myarticle.txt
2 stderr

Why clones are helpful (duplicating file descriptors)

When we say that a file descriptor is removed or closed, it means that the file descriptor is destryied! Deallocated. No returning back. Nada! It’s gone forever and ever! We have lost it and with it also lost the stream it was connected to.

Now I want to remind you of the popular so called memory-leak in C. Say we allocate some memory in a function for a pointer. If we have only one pointer to the allocated memory space and somehow we lose it, we automatically get a memory leak as there is no way we can reach the memory where the pointer was pointing at. In C a solution would be to have a backup pointer and that’s also a solution used with file descriptors.

When we duplicate a file descriptor with dup(), what we actually do is making a second index for a file structure in the kernel. Let’s take an example:

int newfd;      //we declare a new file descriptor
newfd=dup(1);   //we make it a clone of file descriptor 1
printf("newfd: %d\n", newfd);

Output:
newfd: 3

The new file descriptor newfd is a clone of file descriptor 1 (stdout). Otherwise said, file descriptor 1 and 3 point to the same file structure in the kernel. That means that destroying file descriptor 1 is not going to have any effect on newfd. See the bellow code where we continue on the same example.

close(1);                 //destroying file descriptor 1 (stdout)
close(2);                 //destroying file descriptor 2 (stderr)
dprintf(newfd, "test");   //sending some data to the cloned file descriptor

Output:
test

As you see we destroyied all file descriptors to the monitor. However as we had made a backup file descriptor of 1, we can still print on the screen. By the way dprintf() is the same function as fprintf() with only difference that it takes a file descriptor as parameter instead of a FILE.

Here is also a visualisation of what we did:

I hope you can clearly see now how dup() works. The function merely duplicates a file descriptor. You might wonder about something however. If file descriptors are numbers, then why not just copy the file descriptor number to a new integer variable? Something like this:

FILE *f=fopen("test.txt", "r");   //open a file
int filefd=fileno(f);             //get file's fd
int filefd2;                      //a secondary fd to the file
filefd2=filefd;                   //this is where we "duplicate"

This is wrong and will not work. You see we don’t make a new file description in this case. filefd2 and filefd are going to have the exact same value and thus being the same exact index. The idea of duplication is to make a new file descriptor, a new index. In this example if we delete filefd, there is no way to access the file structure in the kernel through filefd2 as filefd2 and filefd are the exact same thing and thus deleting one is like deleting the other.

The below code is the correct way to do it.

FILE *f=fopen("test.txt", "r");   //open a file
int filefd=fileno(f);             //get file's fd
int filefd2;                      //a secondary fd to the file
filefd2=dup(filefd);              //this is where we "duplicate"

If filefd gets deleted now, we can still access it through filefd2.

A few words on pipes

I was concidering leaving pipes outside of this article. However I want people to have a full grasp of the tight relation between pipes and file descriptors so I will mention the basics on pipes here.

Pipe is essentially a pair of file descriptors. One file descriptor is used for input while the other file descriptor is used for output. Now what we feed to the one file descriptor will magically pop out the other file descriptor. Binary, characters, integers, everything is welcome and works. You can think of a pipe as a physical pipe that whatever you drop on the top end, will get out from the end at the bottom.

We declare a pipe just as an array of two integers like this:

int mypipe[2];

Now this alone doesn’t do anything. We have to tell the kernel to setup the pipe for us so that we can use it:

pipe(mypipe); //initializing the pipe

Now we have our fully working pipe.

PipeAs we said data goes through one file descriptor and comes out through the other one. The file descriptor that takes input is mypipe[1] and the one used for output is mypipe[0]. Notice that the numbers 1 and 0 are not file descriptors. They are the indices of the pipe. You should be very super extra careful on which end of the pipe is supposed to get data and which end is supposed to give data.

Programmers are probably some very egoistic bastards. I refer to the people that implemented the pipes in the kernel and you will see why. When it comes to pipes we refer to the write end of the pipe or the read end of the pipe from the view of the programmer and not the pipe itself. So for example the write end of the pipe is mypipe[1] while the read end is mypipe[0]. However if you are a plummer or an electronics person then you are familiar with reading the input/output from the view of the pipe itself. So be extra careful on that detail! You might spend endless hours, days or weeks debugging code just because you mixed the read end with the write end.

Now let’s test our pipe:

char buffer[5]="";
write(mypipe[1], "test", 5);      //writing to pipe
read(mypipe[0], buffer, 5);       //reading from pipe
puts(buffer);

This should output the text “test” on the terminal.

I will not go any deeper into pipes as that needs its own article in my humble opinion. I hope I made it a bit clearer what file descriptors really are and their relation to pipes and files.