LinuxMeerkat

I swear! Meerkats can do Linux



Mastering sed

sed is a very famous tool in the UNIX-like community, but one that is very often misused. Most people try to use it in cases where it's not the right tool, or they use it in the wrong way. In this post I try to show the things that are worth knowing when it comes to sed. I thus skip things like hold buffers, labels etc., which just make scripts totally unreadable for no benefit. I also talk a bit about the inner workings of sed, so the user has a grasp of why things sometimes don't work as expected.

Why even learn sed?

For two reasons:

  1. You can automate any boring mechanical work you would do on a normal text editor
  2. If you know the syntax of sed, then you are a better vi/vim user

The second point becomes obvious when you realise that the two programs have similar, if not identical, syntax. For example, substituting the word “cow” with the word “horse” on the third line of a document is :3s/cow/horse/ in vim, while in sed it's 3s/cow/horse/. See the magic?!

A bit of history

Sed, awk and grep are the offspring of a line editor called ed. ed pretty much let the user edit one line at a time, and that is also the reason that sed, awk and grep work on lines. All three programs have inherited the syntax of ed to some extent. In ed, to search and replace the word “cow” with the word “milk” in a text document, someone would type s/cow/milk/. That is exactly the same command used in sed – an indication of ancestry.
The tool grep actually takes its name from the command g/re/p, a command used in ed to show only the lines that contain a specific regular expression. The ‘p’ at the end means to print on the screen, while the ‘g’ in front means to go through all the lines.

When to use sed

While all three of these tools (grep, sed and awk) work on lines, sed and awk are very similar to each other while grep is more of a loner. Grep is used merely to filter lines out (or in) based on a regular expression. Sed and awk offer much more. What distinguishes sed from awk is the data that they were built to edit.
Awk should be used when every line in the file has a specific structure. In other words, that includes files where each line has a specific number of fields, with every field separated by a delimiter (in most cases a tab). Such files can be CSV files, tables, the output of ls in Linux, and more.
For everything else, use sed. Common examples are raw text, this post, a C file, a script, an HTML file, etc. What all these files have in common is that lines don't have a specific structure: the first line can have one word, while the second can have 100 words, etc. Sed can still edit the files that awk edits, but the opposite is usually impossible. If you are trying to do just that, then you are most probably using the wrong tool. An example of the distinction follows below.
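To make it concrete, here is a small illustration (essay.txt is a made-up file name; on typical Linux systems, field 9 of ls -l output is the file name). Structured columns are awk's home turf, while a typo that can be anywhere in free-form text is a job for sed:

ls -l | awk '{print $9}'       # print the 9th field of every line: the filename
sed 's/teh/the/g' essay.txt    # fix a typo anywhere in unstructured text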

Syntax

Sed in the terminal

The common syntax for sed in the terminal is:

sed SCRIPT INPUTFILE

The meat of sed is the SCRIPT, and that is pretty much what I cover in this post. It's a good convention to put quotation marks around it; if it contains a space or another special character, the shell would otherwise split it into multiple arguments.
sed can have multiple SCRIPTS, or it can use a file with commands.

Multiple script lines:

sed 'SCRIPT1; SCRIPT2; SCRIPT3;' INPUTFILE

Using a script file:

sed -f SCRIPTFILE INPUTFILE

The SCRIPTFILE should have one command on each line: SCRIPT1 on one line, SCRIPT2 on the next, and so on.

Sed’s script syntax

Sed uses three things to accomplish tasks:

  • line specifiers (address)
  • commands
  • flags

The syntax of a single SCRIPT line is:

<line><command><flag>

A line or line specifier is a way to specify which lines you want the command to affect (parse). If the line specifier is missing, then the command affects all lines, which is the default behaviour.

A command is denoted by a single character. For example, to replace (substitute) a word with another word we use the command ‘s’:

s/word1/word2/

A flag is used to modify the behaviour of the <command> or the <line> a bit, and is placed after the whole command or line. Using the flag g on the example above, we get:

s/word1/word2/g

g stands for global and is used to replace all occurrences of word1 in the line. Without it, only the first occurrence of word1 in the line is replaced.
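A quick demonstration, feeding sed a made-up line through echo:

echo 'cow sees cow' | sed 's/cow/horse/'     # gives: horse sees cow
echo 'cow sees cow' | sed 's/cow/horse/g'    # gives: horse sees horse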

A flag goes hand in hand with regular expressions, so it can only be used if the <command> or <line> has a regular expression in it. If the <line> is specified with a number, for example, it's illegal to use a flag:

#Illegal
sed -n '6g'

The reason is that flags are made for text patterns, so it doesn't make sense to sed when you tell it to use the flag ‘g’ on a line number, which is something else than a pattern.

The minimum needed on a SCRIPT line is a command or a line specifier. You can have both or just one of them, with a flag or without. These combinations (or permutations, to be exact) are allowed:

<line>
<line><flag>
<command>
<command><flag>
<line><command>
<line><command><flag>
<line><flag><command><flag>
<line><flag><command>

A flag applies to the command or line before it, and assumes that the latter has a regular expression in it.

How sed works internally

Imagine we have the file list.txt with the lines:

Today I will drink my milk
and afterwards I will eat a cow.
The cow will taste like cow.

Sed works with lines. As stated earlier, we can have multiple script lines separated with semicolons:

sed 's/Today/Tomorrow/g; s/Today/Next Friday/g' list.txt

sed has a working buffer for each line (called the pattern space). sed will initially load the first line of list.txt into the buffer. Then it will go through all script lines one by one, altering everything in place (in the buffer). In our example the buffer for the first line initially has:

Today I will drink my milk

After the first script command executes, it becomes:

Tomorrow I will drink my milk

The second script line doesn't alter anything, as sed can't find an occurrence of the word ‘Today’ any more: it just got altered.
Once the first line is done, sed will load the second line into the buffer and go through the same procedure, until all lines in list.txt have been parsed.

Something important to notice is that each SCRIPT line will run regardless of whether the previous SCRIPT line succeeded or not. By success we mean that the command did what it is meant to do. If substitution is used, then we define success as the alteration of a line. If we just specify a line, then success is whether the line exists, etc.

Addressing specific lines

<line><command><flag>
By default, sed goes through all the lines. However, one can address a specific line or a range of lines. That can be done by specifying lines by their number in the file (1st line, 2nd line, 50th line etc.) or by a line’s content.

To run a command on the 10th line we do:

10<command>

To run a command on each line that contains the word “cow” we do:

/cow/<command>

The latter makes use of regular expressions. When we use regular expressions we need to add slashes to the start and end of the regular expression.

For a range of lines we use a comma (awkward, I know). To specify all lines between the 10th and 25th line (including those) we would write:

10,25<command>

If we want to specify a range of lines by using regular expressions we still have to encapsulate the regular expressions in slashes. To run a command on all the lines between the first line found with the word “cow” and the first line found with the word “grass”, we would issue:

/cow/,/grass/<command>

Below you can see all the ways of specifying lines.

<line1>,<line2>   Lines from <line1> to <line2> (inclusive)
<line1>~N         Every Nth line, starting from <line1>
<line1>!          All lines except <line1>
/<regex>/         Lines matching the regular expression
$                 Last line
<line1>,+N        <line1> and the N lines following it

Some more practical examples can be seen below. The command p is used to print the specified lines. (Notice that sed needs the parameter -n for p to be useful: -n suppresses sed's default behaviour of printing every line, so without it each printed line would show up twice.)

sed -n '2p'             -> print line 2
sed -n '2,4p'           -> print line 2 to 4
sed -n '$p'             -> print last line
sed -n '2!p'            -> print all lines but line 2
sed -n '/red/p'         -> print every line that contains the word red
sed -n '/red/,/green/p' -> print all lines between the first occurrences of the words 'red' and 'green'

Of course, for all these examples to work you need to feed sed with a stream, either from a file or from a pipe.

Using commands

<line><command><flag>
Now we come to the meat of all meats. I have explained the substitution command a bit, but below you can see all the commands with their syntax (if they have one).

s/<regex>/<subst>/     substitute    Replace a match of the regular expression with a string.
p                      print         Print a specific line or range of lines. The sed flag -n should be used for this command to be useful.
=                      line number   Show the number of the line.
d                      delete        Delete (omit) the matched line.
y/<chars1>/<chars2>/   transform     Replace each character in <chars1> with the character at the same position in <chars2>. For example y/abc/ABC/ will replace a with A, b with B and c with C.
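For instance, you can try the transform command straight from the terminal on a made-up string:

echo 'a cab' | sed 'y/abc/ABC/'    # gives: A CAB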

Substitution

Substitution is the most commonly used command. Its syntax is as follows:

s<d><regex><d><subst><d>

where <d> is a delimiter, which should be a single character. Most commonly the delimiter is a slash /, but it can essentially be any character. All lines below are equivalent.

s/cow/horse/
s_cow_horse_
sDcowDhorseD

In the example above the first occurrence of the word cow on each line will be replaced with the word horse, which is the default behaviour. If you want all occurrences of the word cow in a line to be substituted, the flag g (global) has to be appended to the line:

s/cow/horse/g

The substitution field <subst> can take some special variables, like the ampersand symbol “&” which holds the matched string from the regular expression. There are also a few macros to automate conversion of letters to capitals and vice versa.
All of these are shown in the table below.

&            Holds the matched string.
\1           Holds a part of the match, specified in the regular expression with parentheses.
\U<string>   Converts all letters in <string> to capitals.
\u<string>   Converts the first letter in <string> to a capital.
\L<string>   Converts all letters in <string> to lower-case.
\l<string>   Converts the first letter in <string> to lower-case.
\E           Ends the conversion at a specific point. Should be used in conjunction with \L and \U.
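A quick demonstration of the case macros on a made-up line:

echo 'i drink milk' | sed 's/milk/\U&/'    # gives: i drink MILK
echo 'i drink milk' | sed 's/.*/\u&/'      # gives: I drink milk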

The match holders \1 \2 \3 etc. have to be specified in the regular expression with parentheses. The parentheses themselves have to be escaped, or else sed will be looking for literal parenthesis characters in the line. So to replace each word “cow” with “supercow” we can do:

s/\(cow\)/super\1/

or

s/cow/super&/

The latter is more elegant, of course. However, there are two cases where the specific holders have to be used:

  1. When we want only a portion of the matched string and not the whole string (&).
  2. When there are many different strings you want to grab from a match.

For example this can’t be solved by merely using the & symbol:

s/\(\w*\) cow \(\w*\)/\2 cow \1/

This script will look for each occurrence where “cow” has a word before it and a word after it, and it will swap their order. The \w matches any word character (letter, digit or underscore), while the asterisk * means the preceding element can be repeated any number of times, so \w* matches a word of arbitrary length. We use the parentheses around the first word and the second word to denote which matched parts of the regular expression should be given to \1 and \2. In <subst> we just reverse their order by putting the second word first and the first word second.
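Fed a made-up line, the swap looks like this:

echo 'the mad cow disease' | sed 's/\(\w*\) cow \(\w*\)/\2 cow \1/'    # gives: the disease cow mad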

Flags

<line><command><flag>
A flag can be used on a <command> or a <line> or both.

g   global        Apply to all occurrences in the line (the default is to stop at the first occurrence).
I   ignore case   Match without case-sensitivity.
p   print         Output only this line (instead of everything, which is the default). The sed flag -n should be used for this flag to work.

Find line in lines of lines in lines..

An important concept to comprehend in sed is nesting. I try to leave out all the advanced things sed offers, like the hold buffer (a horror to read), labels etc., but nesting is worth learning as it gives a lot of extra power for a small learning curve. Nesting is similar to the IF..THEN conditional.

We have this text file:
Today I will drink my milk
and afterwards I will eat a cow.
The cow will taste like cow.
Today is not afterwards if I am a cow. Right?

Imagine that we want to check if the last line of the text contains the word cow. Someone might think that putting $p together with /cow/p would work:

sed -n '$p; /cow/p'

Admittedly the output seems strange:
and afterwards I will eat a cow.
The cow will taste like cow.
Today is not afterwards if I am a cow. Right?
Today is not afterwards if I am a cow. Right?

The reason this doesn't work as expected is that the second script runs regardless of whether the first one succeeded or not. Thus on the first line neither of the scripts prints anything. On the second line, the first script fails but the second script finds the word cow, so it prints the line. The same happens on the third line. On the fourth line, the first script succeeds as the line is the last line of the file, so the line gets printed. Then the second script runs and also succeeds. Thus we get the line printed again (a second time).

A way to solve this is to nest the second script line in the first somehow, so that it runs only if the first script line has succeeded. This is similar to the pseudocode:

if <line1>
  then SCRIPT

Where <line1> is a line specified by a number, range or pattern (regular expression).
The syntax for nesting script lines in sed is

<line>{<line><command><flag>}

For our example:

sed -n '${/cow/p}'

This translates to: if this is the last line ($) then do whatever is in the brackets. So everything in the brackets will be checked only if the current line being parsed is the last one.

We can even nest inside a nested script:

sed -n '2,4{/cow/{/Today/p}}'

This script will go through lines 2 to 4. It will first check if a line contains the word cow. If the line contains the word cow, then it will check if it contains the word Today. If it does, it will print it. Of course there is no practical reason for the second pair of braces in the above example; I just wrote it to show that it's feasible. In some cases you do need nested braces to accomplish tasks, especially when we avoid using the more advanced features.

Examples

Printing a specific line

sed -n '2p'

The p in this case is the p command (and not flag). Remember that flags apply only to regular expressions. When we use sed’s -n parameter, only lines specified with the p command or flag will be printed on screen.

Printing a range of lines

sed -n '1,3p'

The comma is used to specify a range. In this example we specify line 1 to line 3 and then we print each such line.

Hiding a specific line

sed -n '2!p'

This is similar to:

sed '2d'

Show the last line

sed -n '$p'

In regular expressions the dollar sign $ denotes the end of a string. In sed when it’s being used with substitution it denotes the end of a line.
However, when used as an address (as with the command p here), it denotes the last line of the input. (There is no ^ counterpart for addressing the first line; the first line is simply addressed as 1.)

Converting a specific word to uppercase

sed 's/Today/\U&/' list.txt

This will match the word ‘Today’ (the first occurrence on each line) and replace it with ‘TODAY’. In the replacement, the escaped U (\U) tells sed to convert everything that follows in the replacement string to uppercase. In this example we use an ampersand, which in sed represents the matched string. If we wanted to stop the conversion somewhere, we would just add \E where we want it to end.

Converting all text to uppercase

sed 's/.*/\U&/'

For this we use the substitution command s.
.* is a regular expression that matches any sequence of characters. As sed works on lines, .* matches a whole line each time.

Converting all text to lowercase

sed 's/.*/\L&/'

Similar to the previous example with only difference that we use \L instead of \U.

Grabbing all content between the body tags in HTML

Say we have the ugly HTML code below and we want to grab all the content between the body tags.

<html><head></head>
<h1>h1 outside of body</h1><body><h1>h1 stuck to body</h1>
<img src="images/soon.png"/>
<p>a paragraph</p></body></html>

Most sed experts would use some very advanced commands to accomplish this, and the readability becomes horrific. As my opinion is that you can do the same things without knowing all the advanced commands, that is exactly what I am going to do.
Notice that I use pipes instead of advanced commands.

Solution:

sed -n '/<body>/,/<\/body>/p' | sed 's_.*<body>\(.*\)_\1_; s_\(.*\)</body>.*_\1_'

It might seem like a mess but I can assure you that it’s much more elegant than a pure single sed SCRIPT solution. I will break it down so you can see how it works.

/<body>/,/<\/body>/p

matches all lines between the body tags, including the body tags. In this way I have minimized my problem to:

<h1>h1 outside of body</h1><body><h1>h1 stuck to body</h1>
<img src="images/soon.png"/>
<p>a paragraph</p></body></html>

After that, we are sure that the first line has the start of the body tag and the last line has the closing of the body tag. So we start by filtering out things we don’t need from the first line: the tag itself and everything preceding it.

s_.*<body>\(.*\)_\1_

I use _ as a delimiter instead of slashes to make the substitution code a bit more readable. The regular expression .* matches zero or more arbitrary characters, so I use it around the body tag in case there is something before or after it. I put the second .* in parentheses to grab the text that might be there, as I want to keep that. Using \1 in the substitution field, I replace the whole line with just the text after the body tag.

s_\(.*\)</body>.*_\1_

We do something similar to the line with </body>. The only difference is that now the portion of the match that we are interested in is the one before the body tag so we move the parentheses there.

Keep in mind that this solution will not work if the body tag includes some attributes, like style. Someone might think that just using the below regular expression in the substitution would work:

.*<body.*>\(.*\)

Notice that the only difference is that we added the regular expression .* between body and its closing angle bracket > to allow for either nothing in between or some arbitrary content (in our case attributes).

That regular expression doesn't work, however, as it matches the last > in the line. The reason is that in sed .* is greedy, and there is no way to make it non-greedy. By greedy we mean that the pattern will try to match as much as it can in the line. So if you want to match the first >, it's not possible. Or.. actually it is possible, but as you will see in the next example, the code starts looking like a monster.
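You can see the greediness for yourself on a made-up line:

echo 'a>b>c' | sed 's/.*>//'    # gives: c

The .*> consumed everything up to the last >, not the first one.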

Grabbing all content between tags in HTML

You are probably better off learning Perl or Python if you need to do these kinds of “advanced things”. I will however show that you can achieve things like this without using the more advanced commands or other programs/languages. This solution is a continuation of the previous example of catching the content between the body tags. The only difference is that this solution also allows attributes in a tag and is a bit more permissive towards whitespace. This makes it a more general solution that can be used for tags other than the body tag.

Solution:

sed -n '/<body>/,/<\/body>/p' | sed '1s_.*<body[a-zA-Z0-9="\x27_ ]*> *<\(.*\)_<\1_; 1{/<body>/d}; s_\(.*\)</body>.*_\1_'

Essentially the only thing that got changed from the previous example is the alteration of the commands in the first line:

's_.*<body>\(.*\)_\1_;'

to two commands:

'1s_.*<body[a-zA-Z0-9="\x27_ ]*> *<\(.*\)_<\1_; 1{/<body>/d};'

The first script command (everything before the first ;) looks for a second tag after body. We don't really need to specify line 1, but it is a good convention as it makes the code easier to understand and the processing faster. I am looking for the body tag, followed by an attribute or not. To define an attribute in HTML, only the characters in the brackets are allowed (says the HTML specification, not me). \x27 is the character code for a single quote mark. The reason I use its code instead of the mark itself (I do use the double quote, after all) is that I use single quote marks around the whole command, so inserting a literal one in the expression would break it. After that I use " *<" (notice the space) to denote that there might be an arbitrary number of spaces, or none, before the opening of the new tag. I then replace the whole line with the opening bracket of the tag that follows body, plus the rest of the line.

If the first command didn't succeed, it means that there is no second tag after the body tag. In that case it's safe to delete the whole line: if line 1 still contains a body tag (/<body>/), we delete the line with the command d.



Converting a URL string into Json

I was making a website the other day and I wanted to somehow pass variables that could be read with JavaScript. So if the user browsed to
http://www.example.com?height=100px&width=50px, the variables height and width should be readable from JavaScript. Note that this method of passing variables in the URL is most notably used for CGI, a.k.a. server-side scripting (PHP, anyone?).

JSON

So I was a bit puzzled while looking around on Stack Overflow, as many people seem to think that there is something magical about JSON. JSON is nothing more than a standard on how things are stored. The standard pretty much sums up to this:

  1. Variables are stored as "varName": value
  2. Arrays are stored as "arrayName": [value1, value2, value3 .. ]
  3. A value can be: a number, a string, true, false, null (or a nested object or array)
  4. The whole thing is encapsulated in curly braces

For my example, the JSON structure (after parsing the URL string) should look like this:

{
   "height": "100px",
   "width" : "50px"
}

In this case the values are strings. Keep in mind, however, that according to the standard they could be anything among a number, a string, true, false and null.

My parser

So to get this structure from the URL, I needed some kind of parsing. All the solutions I found either used regular expressions, or they wanted a whole library to be imported, or just didn’t support arrays. So I made up my own ugly solution.

The function url2json() uses pure JavaScript, doesn't use regular expressions, and accepts both arrays and plain variables:

function url2json(url) {
   var obj={};

   // Turn a string like "[v1,v2,v3]" into an array of strings.
   // A single value like "[v1]" comes back as a plain string.
   function arr_vals(arr){
      if (arr.indexOf(',') > -1)
         return arr.slice(1, -1).split(','); // strip the brackets, split on commas
      else
         return arr.slice(1, -1);            // just strip the brackets
   }

   // Store a ["name", "value"] pair on obj, parsing arrays where needed.
   function eval_var(avar){
      if (avar[1].indexOf('[') == 0)         // value starts with '[' -> array
         obj[avar[0]] = arr_vals(avar[1]);
      else
         obj[avar[0]] = avar[1];
   }

   if (url.indexOf('?') > -1){
      var params = url.split('?')[1];        // everything after the '?'
      if (params.indexOf('&') > -1){         // more than one variable
         var vars = params.split('&');
         for (var i = 0; i < vars.length; i++)
            eval_var(vars[i].split('='));
      }
      else
         eval_var(params.split('='));
   }

   return obj;
}

To keep things clean, all values are parsed into strings. As the input of the function is a string, it just makes sense to give back strings, so no extra processing takes place if it's not needed (checking whether every value is of a certain type). It's up to the user to convert the strings into numbers or whatever they want, if they really have to.

Parsing variables

To use the function with the example above, I would just run

obj = url2json("http://www.example.com?height=100px&width=50px");
console.log(obj.height);
console.log(obj.width);

Launching the console in the browser (CTRL+SHIFT+K in Firefox), we get the results:

"100px"
"50px"

Parsing arrays

Arrays are parsed like this

obj = url2json("www.x.com?numbers=[100,45,88,90]&mixed=[red,56,blue,20]");
console.log(obj);

The object logged in the console looks like this:

{
   "numbers" : ["100", "45", "88", "90"],
   "mixed" : ["red", "56", "blue", "20"]
}



Emailing from a Gmail account via Telnet

Telnet is a small, amazing tool that can be used for pretty much everything concerning networks. Telnet sends ASCII strings directly to a host. So as long as a protocol supports ASCII messages, we can communicate with a host over any protocol we want. Some examples are SMTP and HTTP.

In practice this method is used for testing things. If you are developing a web server, for example, it is handy to be able to test it directly without having to launch a browser. It's a blessing for automation.

Here we will try to send an email by using the SMTP protocol. We will use telnet for that, by sending and receiving raw messages with the SMTP server.

Preparation

First we need to install telnet with SSL support. All SMTP servers use SSL encryption nowadays, so plain telnet will not work. The telnet shipping with Ubuntu does not support SSL, so we need to upgrade it:
sudo apt-get install telnet-ssl

To run the new telnet-ssl we just issue telnet in the terminal as before. The difference is that telnet now supports some new flags, namely -z, which is used for SSL.

We are also going to need http://www.base64encode.org to encode our username and password in base 64 format. Keep in mind that base 64 does not encrypt a message; it just changes the way the information is stored. In practice we use base 64 to be able to send any kind of data as simple characters. The real encryption happens under the hood, however. Check Extras for more info.
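If you would rather stay in the terminal, the base64 tool from GNU coreutils does the same job as the website. The -n flag stops echo from appending a newline, which would otherwise get encoded too:

echo -n 'username@gmail.com' | base64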

Steps

  1. Start telnet
    telnet -z ssl smtp.gmail.com 465
    The flag -z ssl tells telnet to use SSL over the connection. smtp.gmail.com is Gmail's server domain and 465 is the port used by the server.

  2. Handshake with the server
    HELO yo
    The text after the HELO is supposedly your domain or literal address; essentially you can enter whatever you want. On pressing enter, the exact message arrives at the mail server, which in response should reply with
    250 mx.google.com at your service

  3. Log in on Gmail
    AUTH LOGIN
    This message is self-explanatory. After the server receives it, it is going to ask for our username. The response looks like this:
    334 VXNlcm5hbWU6
    and indicates that the username should be entered.

  4. Username
    Go to http://www.base64encode.org and encode your username (username@gmail.com) in base 64. Copy and paste the encoded string into your telnet session and hit enter. If the username is accepted, you get back the message below.
    334 UGFzc3dvcmQ6
    This message indicates that the password should be entered.

  5. Password
    Do the same as above, but this time encode your password. Then paste it and hit enter. If the username/password combination is correct, you are greeted with the message
    235 2.7.0 Accepted

  6. Email sender
    First we are going to add the sender of the email:
    MAIL FROM: <username@gmail.com>
    For Gmail it doesn't really matter who you set as sender. We logged in to our Gmail account when we sent the AUTH LOGIN message to the server, and Gmail has a mechanism that sets the sender to your email address automatically.
    The response back is
    250 2.1.0 OK js17sm40481494lab.5 - gsmtp

  7. Email recipient
    Now we will add the recipient of the email. Here we need to provide a valid email address, or else the email will not go any further. If the address doesn't exist, we will probably get the email back in our inbox, which is the default behaviour for undelivered emails.
    RCPT TO: <username@gmail.com>
    On success we get
    250 2.1.5 OK js17sm40481494lab.5 - gsmtp

  8. Email body
    Now we want to start typing the email. We send the following command to the server so that it knows that what follows is the actual email.
    DATA
    The response from the server is
    354 Go ahead js17sm40481494lab.5 - gsmtp

    Now we can set the subject (if we want):
    Subject: test
    and everything else following is considered the text of the email:

    This is a line in the email.
    This is a second line in the email.
    .

    The dot on its own line at the end signals the end of the email. After that, a response from the server should come:
    250 2.0.0 OK 1381416452 js17sm40481494lab.5 - gsmtp
    Now the email has been sent.

  9. Close telnet
    QUIT
    This is an actual command to the telnet client and not a message sent to the mail server.

Extras

EHLO vs HELO

EHLO can be used instead of HELO. The difference is that EHLO also lists all the commands that the server supports. EHLO is newer than HELO and is suggested to be used instead of HELO. However, HELO is always going to be supported, so in reality there is no big difference as long as the server supports the message.

SSL and encryption

Something important, as mentioned at the beginning of this page, is that encoding data in base 64 is not encryption but rather a different representation of the same data. It's SSL that adds the encryption to our communication, and you can be assured that everything between the client and the server is encrypted.

I captured the whole procedure of sending an email with Wireshark to prove my point.
[Wireshark screenshot: the captured packets of the SMTP-over-SSL session]

The packets are in the order they were sent and received. We can see that telnet first contacts the DNS server to resolve the mail server's domain name to an IP address. 192.168.1.1 is my router and 192.168.1.2 is my computer, from which I use telnet.

After we get the address, we do a first handshake with the mail server. That's the three TCP messages with SYN/ACK in them. After that we initiate a TLS session. TLS and SSL are pretty much the same thing; they are just different versions of encrypted communication, with TLS being the newer one.

Everything we type and read from the time we start telnet is encrypted. The reason we see the messages clearly in the terminal is that they get decrypted by telnet on arrival. When messages leave telnet, they get encrypted to travel on the wire. That's why with Wireshark we only see the encrypted data: we are looking at what is passing through the wire. I circled the TLS messages to make it more apparent that the whole session is encrypted using TLS. The few TCP packets seen here and there are packets with encrypted data. If you could check what these packets hold inside (with Wireshark), you would only see gibberish.

If instead of Gmail we were using an SMTP server that doesn’t use SSL, we would be able to intercept all the packets with their raw data, and thus the actual messages, passwords, usernames, etc.



Traversing binary trees

It can be hard to remember how to traverse trees by name. It can also be hard to understand the differences between the ways of traversing (post-order, pre-order, in-order, level order, etc.). I try here to make it easier for the reader to understand the different ways of traversing a binary tree. Intuition is the key to remembering (and visualisation, of course).

I use the words parents and children for the elements in the tree, instead of nodes, root, branches and leaves. I think those words make it easier to describe what is going on without getting anyone too confused.

Recursive traversal

I want to point out first that this method is commonly called “depth/height traversal”. However, I use the word recursion as it seems more appropriate to me.
There are three ways to traverse a tree recursively. The only real difference between them is the order in which we visit the parent node.

  • Pre order ——- here we visit the parent in the beginning
  • In order ——— here we visit the parent second
  • Post order —— here we visit the parent lastly

Because recursion is hard to grasp, and even harder to visualise, we will use a simple example to begin with. We start by examining, with all three methods, this binary tree:
[diagram: a three-node binary tree with root 4, left child 2 and right child 7]

Pre order

Here the path we follow to traverse is:

  1. Parent
  2. Left child
  3. Right child

[diagram: pre-order visiting order (parent, left child, right child)]

Applied to our example tree:
[diagram: pre-order applied to the example tree]

So if we use pre-order to traverse the example binary tree, we get the values (in the order we visited them):

4, 2, 7

In order

Here the path we follow to traverse is:

  1. Left child
  2. Parent
  3. Right child

[diagram: in-order visiting order (left child, parent, right child)]

Applied to our example tree:
[diagram: in-order applied to the example tree]

So if we use in-order to traverse the example binary tree, we get the values (in the order we visited them):

2, 4, 7

Post order

Here the path we follow to traverse is:

  1. Left child
  2. Right child
  3. Parent

[diagram: post-order visiting order (left child, right child, parent)]

Applied to our example tree:
[diagram: post-order applied to the example tree]

So if we use post-order to traverse the example binary tree, we get the values (in the order we visited them):

2, 7, 4
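Translated into code, the three orders differ only in where the visit (here a printf) sits relative to the two recursive calls. A minimal sketch in C, assuming a simple node struct:

#include <stdio.h>

typedef struct node {
    int value;
    struct node *left, *right;
} node;

void preorder(node *n) {
    if (n == NULL) return;
    printf("%d ", n->value);   /* parent first */
    preorder(n->left);
    preorder(n->right);
}

void inorder(node *n) {
    if (n == NULL) return;
    inorder(n->left);
    printf("%d ", n->value);   /* parent second */
    inorder(n->right);
}

void postorder(node *n) {
    if (n == NULL) return;
    postorder(n->left);
    postorder(n->right);
    printf("%d ", n->value);   /* parent last */
}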

The bigger picture

All this seems simple for a tree with only three nodes. But what happens when we have a huge tree with multiple nodes? Which node would we visit first?

The answers are rather simple, but there are actually few good resources that give a good visualisation of what recursive traversal can look like.

So let’s start by taking this big tree:
[diagram: a seven-node binary tree (root 4, children 2 and 7, grandchildren 5, 9, 6 and 1)]

The trick is to start by seeing the whole structure as a tree with three nodes. In all three traversal methods we start with the node at the top of the tree: the root node. We thus see it as the parent at the beginning of the traversal. Everything to the left of it is considered the left child and everything to the right of it is considered the right child.
[diagram: the same tree with the left subtree circled as b, the root as a, and the right subtree as c]
The bigger complex tree has been compressed to a three-node tree. We can now start traversing it with whatever -order traversal method we want.

I will demonstrate how I would go with in-order traversal. Read this line by line.

  1. First I visit the left child: node b
    1. I now get into this subtree and visit the left child: 5
    2. Then I go to the parent node: 2
    3. Then I go to the right child: 9
  2. Then I visit the parent: node a
    1. Here there is no parent or children. The only thing I can do is to return the value of the single node: 4
  3. Then I visit the right child: node c
    1. I now get into this subtree and visit the left child: 6
    2. Then I go to the parent node: 7
    3. Then I go to the right child: 1

Note that 1, 2, 3 are traversal of the a, b, c nodes. All nested traversals are traversals for the nodes inside a, b and c.

An even more complicated tree could look like the one below. However, exactly the same principles apply.
[diagram: a twelve-node binary tree with its subtrees circled in nested levels]
This looks a lot like a fractal, doesn't it? That's because fractals are actually based on recursion. The only difference in this example is that the recursion is not endless, but instead is made of three different levels: the outer level where we see three cyan circles, the level where we see three green circles, and the level where we see three nodes (inside each green circle).

Traversal by Breadth

All methods discussed earlier are actually what is called traversal by height or depth. I am not so sure why we call it that instead of traversal by recursion, as the real difference between depth traversal and breadth traversal is:

  • Height traversal uses recursion.
  • Breadth traversal is linear, going from one node to the next as you see the whole tree.

Traverse by level

Traversing by breadth, or by level (which is more intuitive), is the simplest method to traverse a tree. You can see the nodes of an arbitrary tree grouped into different levels depending on how close to the root they are. In the tree below I have numbered the different levels of nodes:
[diagram: the seven-node tree with its three levels numbered]
The node with value 4 (the root) belongs to level 1. Nodes 2 and 7 belong to level 2. Lastly, nodes 5, 9, 6 and 1 belong to level 3.

To traverse the tree by level, we just go from left to right on each level, jumping from node to node like this:
[diagrams: traversing level 1, then level 2, then level 3, from left to right]

Thus we traverse the nodes in this order:

4, 2, 7, 5, 9, 6, 1
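Because we visit nodes in the order we discover them, level-order traversal maps naturally onto a queue: print a node, then enqueue its children from left to right. A minimal sketch in C, assuming the same node struct as in the earlier sketch and a tree of modest size:

#include <stdio.h>

typedef struct node {
    int value;
    struct node *left, *right;
} node;

void levelorder(node *root) {
    node *queue[64];              /* fixed-size queue: fine for a sketch */
    int head = 0, tail = 0;
    if (root) queue[tail++] = root;
    while (head < tail) {
        node *n = queue[head++];  /* dequeue the next node */
        printf("%d ", n->value);
        if (n->left)  queue[tail++] = n->left;   /* enqueue the children */
        if (n->right) queue[tail++] = n->right;  /* from left to right */
    }
}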

If the tree is more complicated or not complete (missing nodes at some level), then we just skip over the missing nodes and move to the next one. An example follows below.

[diagrams: a twelve-node tree, the same tree with its levels numbered, and the level-order path through it]

The nodes visited in order:

4, 2, 7, 5, 9, 6, 1, 26, 13, 10, 8, 3



Installing Ubuntu on Samsung 5 (SSD+HDD)

I got some messages on Ubuntu Forums asking how to install Ubuntu on the Samsung 5, and especially the model NP530U3C-A07.
I am writing this guide to help whoever else runs into this problem, as the installation is not as straightforward as it may seem.

Boot from USB

I assume you have a bootable USB with Ubuntu. Then follow these steps:

  1. Make sure you insert the USB stick into the USB 2 slot and not the USB 3 one (the USB 3 slot has a blue connector)
  2. Disable “Fast BIOS Mode” from BIOS>Advanced
  3. Disable “UEFI” from BIOS>Boot
  4. Reboot, and in BIOS>Boot set your USB stick as the first device to boot from
  5. Save and reboot

Install

The Samsung NP530U3C-A07 and some other models don't use hybrid hard discs. What they have is a normal HDD plus a very small SSD of about 4 to 30GB. The two are seen as separate devices, so it's easy to distinguish which is which by their size.

On the bootup of Linux choose Install Ubuntu (or Try Ubuntu and then install from there). When you get to the point where you have to choose where to install (“Installation Type”), choose “Something Else”. A new screen will pop up letting you choose where to install Ubuntu, format, make partitions, etc. Follow the steps below carefully:

  1. Choose and format the smaller disk (SSD) to install Ubuntu
  2. Choose and format the bigger disk (HDD) into as many partitions as you want. I format mine as one big ext4 partition just to store media. The thing to pay attention to is not to reformat this drive in the future, as that would probably screw up the bootloader. That happened to me.
  3. On “Device for bootloader installation:” choose the HDD. It doesn’t matter if you choose the device itself or the partition on it.

After Install

Not everything works out of the box, but luckily for us most of the problems can be fixed. Below are several fixes.

Flickering screen when changing brightness

  1. sudo add-apt-repository ppa:voria/ppa
  2. sudo apt-get update && sudo apt-get install samsung-backlight

Fn keys not working

  1. sudo add-apt-repository ppa:voria/ppa
  2. sudo apt-get update && sudo apt-get install samsung-tools

Monitor doesn’t “remember” the brightness after restart

I haven't fixed this, but a workaround is to set a default brightness value on every restart. You can do that by automatically writing to /sys/class/backlight/acpi_video0/brightness on every boot. For that we add a line to /etc/rc.local, a file that runs on every boot of the OS.

Add echo 31 > /sys/class/backlight/acpi_video0/brightness to the file /etc/rc.local, just before the line “exit 0”.

Your rc.local file should look something like this:

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

echo 31 > /sys/class/backlight/acpi_video0/brightness

exit 0

You can change the value 31 to what suits you. You can see what current brightness is with

cat /sys/class/backlight/acpi_video0/brightness

Extras

Mounting hard disc on bootup

Probably your hard disc isn't mounted automatically on bootup, as it doesn't hold any system files. To fix this we need to edit the fstab file and set the HDD to be mounted automatically on system startup.

  1. sudo gedit /etc/fstab
  2. Add an entry for your hard disc. To get the UUID of your hard disc, just type blkid. Add a new record with default settings and appropriate type. The fstab should look something like this in the end:
    #
    proc            /proc           proc    nodev,noexec,nosuid 0       0
    UUID=fdb6fc55-00a1-4ef6-b65b-27e14140f6c2 /               ext4    discard,errors=remount-ro 0       1
    UUID=900c5900-225a-4c43-953d-f226b8a4cef4 none            swap    discard,sw              0       0
    
    #Hard disc
    UUID=8af90249-881c-4e2a-b55c-85b6a66c3767 /media/Hitachi  ext4    defaults                0       0
    

Symbolic links (shortcuts) to the HDD

If you use the directories Downloads, Movies, etc. in your home directory, it would be wise to make shortcuts of them that point to the hard disc. There are two major reasons for this. First of all, media files are not going to be “faster” on the SSD, so why sacrifice space on the ultra-small SSD instead of the huge HDD? Something else to consider is that SSD drives have a specific number of write cycles before they die. Therefore it's good to keep all the files that are going to be written and rewritten all the time on the HDD.

Symbolic links are the same thing as shortcuts in Windows. They are just files that point to another place, in our case the hard disc.

  1. First make the appropriate directories on your hard disc. For me the structure of those directories looks like this:
    /media/Hitachi/Downloads/
    /media/Hitachi/Videos/
    /media/Hitachi/Music/
    /media/Hitachi/Documents/
    /media/Hitachi/Pictures/
    
  2. Delete the already-there folders in home directory:
    rmdir ~/Downloads
    rmdir ~/Videos
    rmdir ~/Music
    rmdir ~/Documents
    rmdir ~/Pictures
    
  3. Make the symbolic links. BE CAREFUL now: you have to be in the folder where you want your symbolic links to appear:
    cd ~
    ln -s /media/Hitachi/Downloads/ Downloads
    ln -s /media/Hitachi/Videos/ Videos
    ln -s /media/Hitachi/Music/ Music
    ln -s /media/Hitachi/Documents/ Documents
    ln -s /media/Hitachi/Pictures/ Pictures
    

To check that everything is done correctly type

ls -l | egrep "^l"

That shows all symbolic links in current directory.



Headless mp3 player

I am working on a Greasemonkey script where I want to be able to play mp3 files in the background. The <audio> tag would be a good and easy solution, but unfortunately Firefox doesn't support mp3 playback because of licensing issues.

By headless we mean GUI-less. So I was looking for a dummy mp3 player that would just do the playback, while all the control would be JavaScript-driven.

In the end I found this simple .swf file here, which supports the following:

  • Play
  • Pause
  • Stop
  • Loop on/off

The problem I faced, as usual, was zero documentation. Although the source code of the mp3 player is available, it was hard for me to figure out how to use the player, especially since I am totally unfamiliar with ActionScript.

In the end I managed, and in fact I made two minimal templates to help out future users.

Steps

  1. Download this package. Alternative links: on 2shared, on speedyshare
  2. Unpack to your server.
  3. Browse to index.html or index2.html

Scenario 1

index.html uses inline JavaScript inside the HTML code to control the mp3 player. In particular, it makes normal <a> tags that, once clicked, send a query to mp3player.swf:

<a href="javascript:mp3player.playSound('song.mp3')">Play</a>
<a href="javascript:mp3player.pauseSound()">Pause</a>
<a href="javascript:mp3player.stopSound()">Stop</a>
<a href="javascript:mp3player.loopOn()">Loop On</a>
<a href="javascript:mp3player.loopOff()">Loop Off</a>

This is the fast, dirty way to get things working without using an external JavaScript file.

Scenario 2

index2.html loads an external JavaScript file, which in this case is controlplayer.js. This is much more flexible and lets you add advanced behaviour from inside the JavaScript file.

As you can see, the HTML code is as simple as this:

<button id="playButton">Play</button>
<button id="pauseButton">Pause</button>
<button id="stopButton">Stop</button>
<button id="loopOnButton">Loop On</button>
<button id="loopOffButton">Loop Off</button>

The key elements in this segment are the ids. As long as the ids are not changed, you are free to change the tag types as you wish. For example:

<h1 id="playButton">Play</h1>
<p id="pauseButton">Pause</p>
<div id="stopButton">Stop</div>
<p id="loopOnButton">Loop On</p>
<a href="javascript:return" id="loopOffButton">Loop Off</a>

works as well as the previous code, without breaking any functionality.
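I will not paste the whole controlplayer.js here, but a minimal sketch of the idea looks like the code below. The button ids and the player methods are the ones from scenario 1; the id of the embedded player object ("mp3player") and the song file name are assumptions:

// controlplayer.js (sketch): wire the buttons to the flash player
window.onload = function() {
   // grab the embedded flash object by its id (assumed to be "mp3player")
   var player = document.getElementById("mp3player");
   document.getElementById("playButton").onclick    = function() { player.playSound('song.mp3'); };
   document.getElementById("pauseButton").onclick   = function() { player.pauseSound(); };
   document.getElementById("stopButton").onclick    = function() { player.stopSound(); };
   document.getElementById("loopOnButton").onclick  = function() { player.loopOn(); };
   document.getElementById("loopOffButton").onclick = function() { player.loopOff(); };
};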

NOTICE: If controlling the playback doesn't work once you have unzipped the package, make sure that the files are in your website folder. Launching the .html files directly from a local folder (for example the desktop) will not work. That is due to Flash's security restrictions.



Who does the cleanup, the process or the kernel?

For a long time I believed that cleanup is work that the kernel does. However, that is not completely true. Part of the cleanup is also taken care of by the process itself. But let's dive into the details..

Ways to terminate a program

There are some ways to terminate a process from within the process itself (also sometimes referred to as “normal termination”):

  • exit()
  • _exit()
  • return

(break can not be used, as it must be within a switch or loop.)

There are other ways to terminate the process from outside the process by sending a signal. The signal can be sent from a terminal by typing

kill [processID]

or by hitting CTRL+C after you run the program. This will send a SIGINT signal to the process. There are also ways to send signals from within the process, for example using the function abort(). abort() will send a SIGABRT and thus terminate the process (if the default handler for SIGABRT is used). However, as the same signal can be sent from outside the process by the user explicitly (from the terminal), we can easily put this way of terminating a process in the same basket as the other “abnormal” ways of termination.

Do something on exit (atexit)

This is not something that has an absolute connection with cleanup but I mention it for a deep understanding of what happens on a process’ termination.

On any normal termination of a process we can do some specific work. Maybe I want something printed on screen every time the process terminates normally:

#include <stdio.h>
#include <stdlib.h>

void onexit(void){
  puts("Process terminated normally.");
}

int main() {
  atexit(onexit);
  /* do stuff here */
  return 0;
}

Now every time main returns, our onexit() function is invoked. Note that even “return 1;” would work, as it is still considered a normal termination of the process. Remember, “normal termination” has nothing to do with the value returned, but rather with what caused the process to terminate (was it an outside signal, or was the exit function invoked?).

A more complicated example using atexit() would be to free some memory after exiting the program:

#include <stdio.h>
#include <stdlib.h>

void *a; //we need global scope

void* allocSomeMemory(void* a) {
  return malloc(10);
}

void freeSomeMemory(void) {
  free(a);
}

int main() {
  a=allocSomeMemory(a);
  atexit(freeSomeMemory);
  /* do stuff with variable a */
  return 0;
}

This will allocate 10 bytes and store the pointer in the global variable a. Then we add a handler (a function that takes care of something) to free the allocated memory whenever we exit the process.

Now, this will work; however, whether it's a practical thing to do is subjective. The kernel is going to free all the resources of the process anyway, so by freeing memory explicitly on exit we just add more delay to the termination. It can, however, be good practice for debugging in some cases.

What is process cleanup?

Cleanup is just the act of bringing the kernel's resources back to how they were before running a program. That means freeing memory, flushing buffers, closing files, removing the process ID from the process table in the kernel, decrementing counters for open files, removing kernel timers, sending signals to the parent of the process, and much more.

There are two main players when it comes to cleanup:

  • The kernel
  • The process

We will start with the process cleanup which is somehow bulkier to describe.

Cleanup from the process

Process cleanup can occur in both normal and abnormal termination scenarios. In normal termination, the cleanup occurs when exit() is called or main returns. In fact, returning from main invokes exit() automatically. The process itself has an overhead (which it got when we compiled it) that tells it what to clean and how. The following things are considered cleanup done by the process:

  1. Do some last work if there is registered with atexit()
  2. Flush all unwritten buffered data
  3. Close open streams
  4. Remove all files created by function tmpfile()
  5. Return the exit status and control to the kernel

Personally I don't think point 1 is so much of a cleanup in the broader sense, but it gives the programmer the opportunity to add some default behaviour on every normal termination, which in many cases would be to tidy things up.

Now, all this occurs with the call of exit(). There is an exit function that will bypass the first four steps and go directly to the fifth: that function is _exit(). So calling _exit() instead of exit() is actually going to give control directly to the kernel.

“Ok, so what’s the big deal with using _exit() instead of exit() and vice versa?”

In most cases exit() is the way to go. However, sometimes you don't want the same things “cleaned twice”. One example is using fork(). With fork(), a child process is created. The child inherits a lot of things from the parent process, amongst others the parent's stdio buffers. If exit() is called from the child, the inherited buffers will be flushed. Later on, when the parent also exits, it will flush its buffers as well. In this scenario we get double output.

Using _exit() in the child, we bypass the flushing from the child process, and thus we don't get unnecessary side effects (like double output).
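A minimal sketch of this scenario, assuming stdout is buffered (as it normally is). Swap _exit(0) for exit(0) in the child and "hello" gets printed twice:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
  printf("hello");  /* no newline: the text stays in the stdio buffer */
  if (fork() == 0) {
    /* child: exit(0) here would flush the inherited buffer,
       printing "hello" a second time */
    _exit(0);
  }
  return 0;         /* the parent's exit() flushes "hello" once */
}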

Cleanup from the kernel

No matter whether exit() or _exit() is used, in the end the kernel is the big reaper. We will not go too deep into what exactly the kernel does, but a few points in the cleanup routine are:

  • destroying the kernel structures that were created for the process
  • freeing the memory allocated for the process
  • decrementing the reference counts of open files
  • sending signals to the parent process

At this point the process is dead; that is, it's not loaded in memory. A few structures are still present in the kernel, solely in case the parent process might be interested. This is what we call a zombie process.

In order for these last structures to be destroyed, the parent must wait() for the child process. Once that has happened, the zombie process disappears and all resources associated with the dead process are free.



File descriptors explained

File descriptors are often used in conjunction with file input and output. However, it is not that clear to many people what file descriptors essentially are, and that makes it harder to code. That's what I will try to elaborate in this article, so that you really know what you're dealing with when you close file descriptors, duplicate them, pipe them, etc. Notice that this article uses code and conventions from C.

Files vs File descriptors

First of all we’ll start with the difference between a file structure and a file descriptor. Imagine you have an array of files like this:

files[] -> file1 | file2 | file3

file1, file2 and file3 are file data structures. A file structure is an opaque data structure. Opaque means that we don't really know what that data structure looks like, and we don't bother about it either. All we need to know is that a file structure represents a file on a hard disk, a USB stick or whatever other storage device.

Going back to our example, the file descriptors are the indices (plural of index) of the array. So the indices in the above example would look like this:

index           0      1      2
file structure  file1  file2  file3

Why do we need file descriptors when we have file structures?

The truth is that a process keeps track only of file descriptors. The file structures live on the side of the kernel. So the reason is the same reason that we use pointers, and not actual data structures, when we program in general: to save space and time; in other words, efficiency. (There is also the element of security, but we will not go into that.)

In C code you will often see something like this:

FILE* myfile;

This is something totally different from the file structure in the kernel that we saw earlier. FILE in C is just a wrapper or, simply put, a structure holding another structure. In fact, FILE in C is just a file descriptor with some extra bells and whistles, nothing more. So why use FILE in C when we can use the file descriptors? Well, file descriptors are just numbers. Imagine if we had to open and close numbers all the time; it would be hard to keep track of what we are doing. Everything would be a mess! Apart from the much friendlier name, the FILE structure in C also lets us use more advanced functions, which can take a FILE as an argument but not a naked file descriptor.
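A minimal sketch of how the two relate, using the standard fileno() function (notes.txt is a made-up file name):

#include <stdio.h>

int main() {
  FILE *f = fopen("notes.txt", "w");  /* the friendly C wrapper */
  if (f == NULL)
    return 1;
  /* fileno() exposes the naked file descriptor hiding inside the FILE */
  printf("file descriptor behind the FILE: %d\n", fileno(f));
  fclose(f);
  return 0;
}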

When you start a new process, three file descriptors are created by default. These three file descriptors are called the standard file descriptors and are given the numbers 0, 1, 2. If you remember the Unix mantra, everything in a Unix system is considered a file. That is even true for hardware devices like your monitor and keyboard. In fact, there are file structures in the kernel corresponding to just those. The file descriptors 0, 1, 2 are indices to these special files: 0 is the index corresponding to the “keyboard file”, while 1 and 2 are indices corresponding to the “monitor file”.

Streams

Earlier we said that there are three file descriptors created by default: 0, 1 and 2. We said that 0 corresponds to the “keyboard file” in the kernel and 1 and 2 correspond to the “monitor file”. If we were to sketch all this on paper, it would look a bit like below.

Process1 has the three default file descriptors we talked about. Notice how we hide the “keyboard file” and the “monitor file” in the kernel box. That is mostly for simplicity, as there is a lot going on in the kernel. Other than that, the user (in this case the programmer) is not meant to know the inner workings of the kernel. The only things the programmer can see are the C-like FILE structures and the file descriptors, so that is what we work with.

Now, there is something more than file descriptors in this figure: the arrows that show the flow of the data. These arrows (call them channels, buses, rivers, or whatever you like) have a special name: streams. A stream is just an abstract name to make it easier for the programmer to visualise what is happening with the data. It's merely easier to talk about a stream of data than to talk about indices and the file structures in the kernel that those indices correspond to.

Now the three default file descriptors we talked about earlier, have in fact been baptized with their own special names: stdin, stdout and stderr.

These names are just abstract words that we use to talk about three specific channels of data (in most cases characters). Stdin is the data that we get from the user. Stdout is the flow of normal output to the user, and stderr is the flow of output when errors happen in the program. Hopefully it now makes sense why file descriptor 0 corresponds to the keyboard while file descriptors 1 and 2 correspond to the monitor: the program gets input from the keyboard, while output goes to the monitor.

Now, in C and many other languages there are three specific macros called stdin, stdout and stderr. These are not streams, although they have the same name. That wouldn't be possible anyway, as a stream is just an abstract idea (as said earlier) that makes it easier for the programmer to visualise what is happening (even if it makes it a living hell for some). The stdin, stdout and stderr macros in C are just pointers to FILE structures (the C-like ones). You can in fact use these macros as you would use any FILE when programming in C. For example, see the code below.

fprintf(stdout, "linux");
fprintf(stderr, "meerkat");

This will print “linuxmeerkat” on the screen. A question arises: why do we have two macros that direct data to the same place? This is the same question as: why do we have two streams to the monitor? The answer is rather simple. Many times when we develop a program, we need to output errors on a different channel than the normal output. It's just neater to keep different things separated than to mix them. Think, for example, how much easier it is now if we want to hide all error messages in our program, or just redirect them to a different place than the monitor.
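This separation is exactly what shell redirection builds on. Assuming the snippet above is compiled into a program called myprog, the two streams can be split in the shell:

./myprog 2> /dev/null    # discard stderr: only "linux" shows up
./myprog > /dev/null     # discard stdout: only "meerkat" shows up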

Finally, here is a table with the standard file descriptors (fd) and the corresponding standard streams:

fd  stream
0   stdin
1   stdout
2   stderr

File descriptor tables

It’s an important detail to understand that file descriptors are the only file-relevant thing that a process keeps track of. As said earlier, FILE structures in C are just wrappers around a file descriptor, so they can also be thought of as file descriptors, which justifies the claim that a process has knowledge of file descriptors only. Now, each process keeps its own unique file descriptor table. Say that we have two processes: process1 and process2.

process1
0
1
2

process2
0
1
2

When a new process gets created, file descriptors 0, 1 and 2 are created automatically and mapped to stdin, stdout and stderr.

To the eye, the above two file descriptor tables look the same; they contain the same numbers. However, file descriptor 0 in process1 can be pointing to a totally different thing than file descriptor 0 in process2.

(You may ask yourself: “How can file descriptor 0 in the two processes point to different things if, as we assumed, file descriptors are indices?” Well, that is a very logical question, and the explanation is rather simple. A file descriptor is bundled with a pointer (in the abstract meaning), or if you like an index, into the global Open File Table. That is, however, nothing we should be worried about, so it’s not shown in the diagrams of this article.)

Just because we say that file descriptors 0, 1 and 2 are standard, it doesn’t mean that they will always correspond to stdin, stdout and stderr. The standard streams stay standard only until the programmer decides it’s time to change things around, which is something I will demonstrate. Below you see a visual representation of the file descriptors and their connection to the kernel.

This is how the processes look if we assume that the file descriptors have not been altered in any way. The arrows show the flow of data. The keyboard and screen icons in the kernel are the file structures of the special files. The truth is that things are more complicated in the kernel, but for now we focus on keeping track of the file descriptors; that’s also the only thing we can alter directly from inside the process. Just don’t assume that because there are two icons there are only two file structures.

Now let’s change the file descriptors in the first process a bit and see what happens:

close(1);                             //we close the stdout stream
FILE* f=fopen("myarticle.txt", "w");  //open a file for writing
fprintf(stderr, "file has fd: %d\n", fileno(f));

Pay attention that we fprintf() to stderr, as stdout is closed. This is what gets printed when we run the code above:

file has fd: 1

From this output it’s crystal clear that file descriptor 1 does not point to stdout anymore; in fact it points to the file myarticle.txt. How do I know? Because the kernel always gives a newly created file descriptor the lowest number available. That keeps things clean. Following this rule: after we close the stdout stream in the example, we immediately open a file for writing. This file needs a file descriptor. The kernel sees that the lowest number available is 1, so the file gets that index.
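
By the way, the same rule can be demonstrated with any descriptor, not just 1. A minimal sketch (somefile.txt is an assumed, pre-existing file):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    close(0);                                 //free the lowest descriptor, 0 (stdin)
    int fd=open("somefile.txt", O_RDONLY);    //the kernel hands out the lowest free number
    fprintf(stderr, "new fd: %d\n", fd);      //prints "new fd: 0" if the open succeeded
    return 0;
}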

What about the file in the kernel that already has index 1, you might ask? As we said, the kernel is a bit more complicated than shown in the figure above. The kernel doesn’t mix up files from different processes, even if they are the same files. So file descriptor 1 in the first process corresponds to a totally different file structure in the kernel than file descriptor 1 in the second process. That’s nothing a programmer should worry about. As we said: care only about the file descriptors. The kernel is a magician whose tricks you are not supposed to see through.

Here follows a figure with the file descriptors and the kernel after running the above code. The file descriptor tables of both processes look like before. However, look at how the streams of the first process are now differentiated!

We see a new arrow there, pointing to a new icon in the kernel. That icon is a new file data structure created in the kernel. The new arrow is a stream just like stdin, stdout and stderr; the only difference is that we don’t have a standard name for it. We can clearly see from the figure, though, that the data flows to a file, and particularly the file myarticle.txt. Notice that stderr still flows to the monitor. That’s also the reason we can still print text on the screen with fprintf(). If we closed this channel as well, we wouldn’t be able to output anything on the screen anymore.

TIP: When you sketch the file descriptors on paper, I find it convenient to write the file descriptor in the first column and the stream name (or filename, in case the stream is nameless) in the second column. In this example I would write it like below.

process1
0 stdin
1 myarticle.txt
2 stderr

Why clones are helpful (duplicating file descriptors)

When we say that a file descriptor is removed or closed, it means that the file descriptor is destroyed! Deallocated. No returning back. Nada! It’s gone forever and ever! We have lost it, and with it we have also lost the stream it was connected to.

Now I want to remind you of the infamous memory leak in C. Say we allocate some memory in a function and keep a single pointer to it. If we somehow lose that pointer, we automatically get a memory leak, as there is no way to reach the memory it was pointing to. In C a solution is to keep a backup pointer, and that’s also the solution used with file descriptors.

When we duplicate a file descriptor with dup(), what we actually do is make a second index to a file structure in the kernel. Let’s take an example:

int newfd;      //we declare a new file descriptor
newfd=dup(1);   //we make it a clone of file descriptor 1
printf("newfd: %d\n", newfd);

Output:
newfd: 3

The new file descriptor newfd is a clone of file descriptor 1 (stdout). Put otherwise, file descriptors 1 and 3 point to the same file structure in the kernel. That means that destroying file descriptor 1 is not going to have any effect on newfd. See the code below, where we continue the same example.

close(1);                 //destroying file descriptor 1 (stdout)
close(2);                 //destroying file descriptor 2 (stderr)
dprintf(newfd, "test");   //sending some data to the cloned file descriptor

Output:
test

As you can see, we destroyed all file descriptors to the monitor. However, since we had made a backup of file descriptor 1, we can still print on the screen. By the way, dprintf() is the same function as fprintf(), with the only difference that it takes a file descriptor as a parameter instead of a FILE pointer.

Here is also a visualisation of what we did:

I hope you can clearly see now how dup() works. The function merely duplicates a file descriptor. You might wonder about something, however: if file descriptors are numbers, why not just copy the file descriptor number into a new integer variable? Something like this:

FILE *f=fopen("test.txt", "r");   //open a file
int filefd=fileno(f);             //get file's fd
int filefd2;                      //a secondary fd to the file
filefd2=filefd;                   //this is where we "duplicate"

This is wrong and will not work. You see, we don’t make a new file descriptor in this case: filefd2 and filefd hold the exact same value and are thus the exact same index. The idea of duplication is to make a new file descriptor, a new index. In this example, if we close filefd, there is no way to reach the file structure in the kernel through filefd2, as filefd2 and filefd are the exact same thing; closing one is like closing the other.

The below code is the correct way to do it.

FILE *f=fopen("test.txt", "r");   //open a file
int filefd=fileno(f);             //get file's fd
int filefd2;                      //a secondary fd to the file
filefd2=dup(filefd);              //this is where we "duplicate"

If filefd gets closed now, we can still access the file through filefd2.
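
This backup trick, combined with the lowest-free-number rule from earlier, is enough to steal stdout and then give it back. Below is a minimal sketch (out.txt is just an illustrative filename, and cleanup is deliberately skipped to keep it short):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int backup=dup(1);                //clone stdout before touching it

    close(1);                         //fd 1 is gone...
    fopen("out.txt", "w");            //...and slot 1 is now taken by out.txt
    printf("this lands in out.txt\n");
    fflush(stdout);                   //force the buffered text through fd 1

    close(1);                         //free slot 1 again...
    dup(backup);                      //...and dup() lands right back on it,
                                      //pointing at the monitor once more
    printf("back on the screen\n");
    return 0;
}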

A few words on pipes

I was considering leaving pipes out of this article. However, I want people to have a full grasp of the tight relation between pipes and file descriptors, so I will mention the basics of pipes here.

A pipe is essentially a pair of file descriptors. One file descriptor is used for input while the other is used for output. Whatever we feed into the one file descriptor magically pops out of the other. Binary data, characters, integers: everything is welcome and works. You can think of a pipe as a physical pipe: whatever you drop in at the top end comes out at the bottom end.

We declare a pipe just as an array of two integers like this:

int mypipe[2];

Now this alone doesn’t do anything. We have to tell the kernel to setup the pipe for us so that we can use it:

pipe(mypipe); //initializing the pipe

Now we have our fully working pipe.

As we said, data goes in through one file descriptor and comes out through the other one. The file descriptor that takes input is mypipe[1] and the one used for output is mypipe[0]. Notice that the numbers 1 and 0 here are not file descriptors; they are the indices of the pipe array. You should be very, super, extra careful about which end of the pipe is supposed to get data and which end is supposed to give data.

Programmers are probably some very egoistic bastards. I refer to the people that implemented pipes in the kernel, and you will see why. When it comes to pipes, we talk about the write end or the read end of the pipe from the view of the programmer, not of the pipe itself. So the write end of the pipe is mypipe[1], while the read end is mypipe[0]. If you are a plumber or an electronics person, however, you are used to reading input/output from the view of the pipe itself. So be extra careful with that detail! You might spend endless hours, days or weeks debugging code just because you mixed up the read end with the write end.

Now let’s test our pipe:

char buffer[5]="";
write(mypipe[1], "test", 5);      //writing to pipe
read(mypipe[0], buffer, 5);       //reading from pipe
puts(buffer);

This should output the text “test” on the terminal.
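
For reference, here are the pipe fragments above stitched into one complete, runnable program; nothing new, just the same calls in one place:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int mypipe[2];                    //mypipe[0]: read end, mypipe[1]: write end

    if (pipe(mypipe)==-1)             //ask the kernel to set the pipe up
        return 1;

    char buffer[5]="";
    write(mypipe[1], "test", 5);      //feed data into the write end
    read(mypipe[0], buffer, 5);       //it pops out of the read end
    puts(buffer);                     //prints "test"
    return 0;
}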

I will not go any deeper into pipes, as that needs its own article in my humble opinion. I hope I made it a bit clearer what file descriptors really are and what their relation to pipes and files is.