I need to transpose the x and y axes of a file of 450,000 × 15,000 tab-separated fields, so I first tried with a small 5 × 4 test file named A.txt:

x   column1 column2 column3
row1    0   1   2
row2    3   4   5
row3    6   7   8
row4    9   10  11

I tried this:

for i in {1..4}; do cut -f"$i" A.txt | paste -s; done > At.txt

but it does not work correctly.

The output is:

X   row1    row2    row3    row4
column1 0   3   6   9
column2 1   4   7   10
column3
    2
    5
    8
    11
Rui F Ribeiro
Jose

2 Answers

Your command works just fine assuming the input is a Unix text file with tab-delimited fields, and that GNU paste is used. On non-GNU systems, you would have to use

$ for i in {1..4}; do cut -f"$i" A.txt | paste -s - ; done
x       row1    row2    row3    row4
column1 0       3       6       9
column2 1       4       7       10
column3 2       5       8       11

Notice the - argument to paste which tells it to read standard input.

You most definitely do not want to run this on 450k columns though as that would require reading the file 450000 times. You'd be better off using some other solution for that.

See, for example, "Transposing rows and columns".
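
As an illustration of what such a single-pass alternative might look like, here is a minimal sketch using awk. It is not taken from the linked answer; the array name cell and the variable ncols are made up for this example. It assumes a POSIX awk and, importantly, that the whole table fits in memory, which may well not hold for the full 450,000 × 15,000 file.

```shell
# Build a small tab-delimited test file like the A.txt in the question.
printf 'x\tcolumn1\tcolumn2\tcolumn3\n' > A.txt
printf 'row1\t0\t1\t2\nrow2\t3\t4\t5\nrow3\t6\t7\t8\nrow4\t9\t10\t11\n' >> A.txt

# Transpose in a single pass over the file: store every field in memory,
# keyed by (row, column), then print each input column as an output row.
# "cell" and "ncols" are illustrative names, not part of any tool.
awk -F '\t' '
    {
        for (i = 1; i <= NF; i++) cell[NR, i] = $i
        if (NF > ncols) ncols = NF
    }
    END {
        for (i = 1; i <= ncols; i++) {
            line = cell[1, i]
            for (j = 2; j <= NR; j++) line = line "\t" cell[j, i]
            print line
        }
    }' A.txt > At.txt
```

The file is read once instead of once per column, at the cost of holding every field in memory at the same time.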


If the above command is run on a DOS text file, it would produce the following output in the terminal:

x       row1    row2    row3    row4
column1 0       3       6       9
column2 1       4       7       10
        11

Redirecting the output to a new file and opening that file in the vim editor would show

x   row1    row2    row3    row4
column1 0   3   6   9
column2 1   4   7   10
column3^M   2^M 5^M 8^M 11^M

where each ^M is a carriage return character (the extra character at the end of a DOS text line). These carriage returns make the cursor move back to the beginning of the line, which is why the only thing visible on the last line in the terminal is a tab and 11 (which overwrite the other columns).

Make sure that your input file is a Unix text file by running dos2unix A.txt.
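
If dos2unix is not available, the same check and conversion can be sketched with standard tools. The file names dos_sample.txt and unix_sample.txt below are hypothetical and only used for this illustration.

```shell
# Build a two-line sample with DOS (CRLF) line endings.
printf 'x\tcolumn1\r\nrow1\t0\r\n' > dos_sample.txt

# Count the lines that end in a carriage return; a non-zero count
# suggests DOS line endings.
cr=$(printf '\r')
grep -c "${cr}\$" dos_sample.txt

# Delete every carriage return to get a Unix text file.
tr -d '\r' < dos_sample.txt > unix_sample.txt
```

Note that tr -d '\r' removes every carriage return in the file, not only those at the end of a line, so like dos2unix it should only be applied to text files.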

Kusalananda
  • I am working with Cygwin, so I presume it is a GNU environment... I typed for i in {1..4}; do cut -f"$i" A.txt | paste -s - ; done > At.txt – Jose Aug 08 '18 at 13:41
  • but it does not transpose the file correctly – Jose Aug 08 '18 at 13:42
  • @Jose If you get some other output, then please post it in the text of your question. If the output that I get is wrong, then also post the expected output. – Kusalananda Aug 08 '18 at 13:44
  • When I open At.txt in Notepad I see everything transposed, but on a single line with no spaces or line breaks between the items ... – Jose Aug 08 '18 at 14:06
  • @Jose Notepad is a Windows text editor that assumes DOS-formatted text files. You should not use Notepad to edit or view Unix text files. – Kusalananda Aug 08 '18 at 14:07
  • Oops... sorry about the x; it replaces the name of the first column – Jose Aug 08 '18 at 14:09
  • Cygwin is a GNU environment. The problem may be line endings; MS Windows does line endings differently. Put the file through dos2unix first (only do this to text files). – ctrl-alt-delor Aug 08 '18 at 14:33
1

Cygwin is a GNU environment. The problem is line endings; MS Windows does line endings differently. Put the file through dos2unix first (only do this to text files).

I have now reproduced it. I pipe the output into od -ta because my terminal renders it differently from your DOS cmd (cmd is turning a carriage return into a line feed).

#unix2dos A.txt
#for i in {1..4}; do cut -f"$i" A.txt | paste -s; done | od -ta
0000000   x  ht   r   o   w   1  ht   r   o   w   2  ht   r   o   w   3
0000020  ht   r   o   w   4  nl   c   o   l   u   m   n   1  ht   0  ht
0000040   3  ht   6  ht   9  nl   c   o   l   u   m   n   2  ht   1  ht
0000060   4  ht   7  ht   1   0  nl   c   o   l   u   m   n   3  cr  ht
0000100   2  cr  ht   5  cr  ht   8  cr  ht   1   1  cr  nl

Explanation: cut sees the carriage return as part of the last field. Newline is the record delimiter, so the carriage return before it stays in the data.
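
The same effect can be seen on a single line. This is a small sketch with a made-up one-line input; the variable name last_field_dump is only illustrative.

```shell
# One DOS-style line: four tab-separated fields ending in CR LF.
# cut treats only the newline as the line terminator, so the
# carriage return stays attached to field 4.
last_field_dump=$(printf 'row1\t0\t1\t2\r\n' | cut -f4 | od -An -c)
printf '%s\n' "$last_field_dump"
```

The dump shows the field value followed by \r and then \n, so anything pasted after this field is preceded by a carriage return.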