02 | May | 2012

Extracting several columns from different data files and combining them into a single data file with gawk

Imagine the following scenario: you have two data files containing interesting numerical results that you want to process further, for example with gnuplot. A possible gnuplot command to view the results would be:

gnuplot> plot "file1.dat" using 1:2 with lines

Here we silently assumed that the first column of each data file contains the x values and the remaining columns contain the function values depending on x, so each line has the format “x f(x) g(x)”. Here are the contents of the files:

$ cat file1.dat
1 2 3
1 2 3
1 2 3
1 2 3

$ cat file2.dat
4 5 6
4 5 6
4 5 6
4 5 6

All is fine so far, but once you decide to plot a new function by adding function values from different files, you will realize that it is not that easy. One possible solution is to create a new data file containing all columns from the previously mentioned data files. The “pr” command merges the files line by line, and “gawk” then selects the columns and lets you rearrange them. Here is an example of how to use them:

$ pr -m -t -s' ' file1.dat file2.dat | gawk '{print $1,$4,$5,$6,$2,$3}' > file3.dat
$ cat file3.dat
1 4 5 6 2 3
1 4 5 6 2 3
1 4 5 6 2 3
1 4 5 6 2 3
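
The same merge can also be sketched with paste in place of pr. This is a swapped-in alternative, not the command used above. The directory merge_demo and the file name merged.dat are made up for illustration, and plain awk stands in for gawk because the program uses nothing gawk-specific.

```shell
# Hypothetical scratch directory, so existing data files are not overwritten.
dir=merge_demo
mkdir -p "$dir"

# Recreate the two sample files from above.
for i in 1 2 3 4; do echo "1 2 3"; done > "$dir/file1.dat"
for i in 1 2 3 4; do echo "4 5 6"; done > "$dir/file2.dat"

# paste joins line N of each file; -d' ' separates them with a single space.
paste -d' ' "$dir/file1.dat" "$dir/file2.dat" > "$dir/merged.dat"

# Reorder the six merged columns in the same way as the pr/gawk pipeline.
awk '{ print $1, $4, $5, $6, $2, $3 }' "$dir/merged.dat" > "$dir/file3.dat"
cat "$dir/file3.dat"
```

Keeping the intermediate merged.dat is optional; it only makes it easier to see which column number each value has before the columns are reordered.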

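From now on it is an easy task to plot the sum of two function columns over x with the input data coming from a single file. In file3.dat, column 3 holds f(x) from file2.dat and column 5 holds f(x) from file1.dat, so the following command plots their sum:

gnuplot> plot "file3.dat" using 1:($3+$5) with lines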
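
Before plotting, the summed column can be checked in the shell. The sketch below is only an illustration: it rebuilds the sample file3.dat in a hypothetical sum_demo directory and uses plain awk (gawk accepts the same program) to print x next to the sum of columns 3 and 5, the same expression gnuplot evaluates in the using clause.

```shell
# Hypothetical scratch directory holding a copy of the sample file3.dat.
dir=sum_demo
mkdir -p "$dir"
for i in 1 2 3 4; do echo "1 4 5 6 2 3"; done > "$dir/file3.dat"

# Print x (column 1) followed by the sum of columns 3 and 5.
awk '{ print $1, $3 + $5 }' "$dir/file3.dat" > "$dir/sum.dat"
cat "$dir/sum.dat"
```

With the sample data every line reads "1 7", since column 3 holds 5 and column 5 holds 2.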

ewgeny
