
If I have a folder with the following file names:

cluster_sizes_0.txt
cluster_sizes_1.txt
cluster_sizes_2.txt

etc.

Each file contains a single column of values.

Is there a Linux command that would combine all of these files into cluster_all.txt?

The first column of the new file would correspond to cluster_sizes_0.txt, the second column to cluster_sizes_1.txt, and so on.

There could be as many as 200 cluster txt files, and the number changes from folder to folder. I am looking for a way to combine these files without copying each one by hand.

I also need to make sure they are pasted into the file in numerical order. The numbering may cause problems, since numbers below 10 are written as single digits with no leading zero.

For instance:

paste cluster_sizes_*.txt > cluster_all.txt doesn't paste them in order, because the shell sorts the names as text rather than by number. How can I fix the ordering without manually renaming all of the files?


1 Answer

The paste command merges files column by column. For example, given these three files, paste produces the following result:

$ cat file_1.txt
1a
1b
1c

$ cat file_2.txt
2a
2b
2c

$ cat file_3.txt
3a
3b
3c

$ paste -d, file_1.txt file_2.txt file_3.txt
1a,2a,3a
1b,2b,3b
1c,2c,3c

So the real question is how to get the files in order. We can cheat and let ls do the work for us: its -v option sorts numbers embedded in the names by value, so file_2.txt comes before file_10.txt.

$ ls     
file_1.txt   file_13.txt  file_17.txt  file_20.txt  file_6.txt
file_10.txt  file_14.txt  file_18.txt  file_3.txt   file_7.txt
file_11.txt  file_15.txt  file_19.txt  file_4.txt   file_8.txt
file_12.txt  file_16.txt  file_2.txt   file_5.txt   file_9.txt

$ ls -v
file_1.txt  file_5.txt  file_9.txt   file_13.txt  file_17.txt
file_2.txt  file_6.txt  file_10.txt  file_14.txt  file_18.txt
file_3.txt  file_7.txt  file_11.txt  file_15.txt  file_19.txt
file_4.txt  file_8.txt  file_12.txt  file_16.txt  file_20.txt

$ paste -d, $(ls -v file_*.txt)
1a,2a,3a,4a,5a,6a,7a,8a,9a,10a,11a,12a,13a,14a,15a,16a,17a,18a,19a,20a
1b,2b,3b,4b,5b,6b,7b,8b,9b,10b,11b,12b,13b,14b,15b,16b,17b,18b,19b,20b
1c,2c,3c,4c,5c,6c,7c,8c,9c,10c,11c,12c,13c,14c,15c,16c,17c,18c,19c,20c

Now, beware that parsing the output of ls is normally a bad idea. If any filename contains unexpected characters (e.g. whitespace or glob characters such as * and ?), the unquoted $(ls ...) is split and expanded by the shell and can break your script. But if you're confident the filenames are "good", this will work.
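If you would rather not parse ls at all, one possible alternative is to build the argument list yourself by counting. The following is only a sketch. It assumes the files are named cluster_sizes_0.txt, cluster_sizes_1.txt, ... as in the question, and that the numbering is contiguous and starts at 0 (the loop stops at the first missing number). The scratch directory and the demo file contents are made up for illustration.

```shell
#!/bin/sh
# Sketch: paste numbered files in numeric order without parsing ls.
# Assumption: cluster_sizes_N.txt exists for N = 0, 1, 2, ... with no gaps.
set -e

workdir=$(mktemp -d)    # hypothetical scratch directory for the demo
cd "$workdir"

# Create 12 demo files, each holding a two-line column ("0a", "0b", ...).
i=0
while [ "$i" -lt 12 ]; do
    printf '%sa\n%sb\n' "$i" "$i" > "cluster_sizes_$i.txt"
    i=$((i + 1))
done

# Collect the names in numeric order in the positional parameters.
set --
i=0
while [ -e "cluster_sizes_$i.txt" ]; do
    set -- "$@" "cluster_sizes_$i.txt"
    i=$((i + 1))
done

# Default separator is a TAB; add -d, for comma-separated output.
paste "$@" > cluster_all.txt
cat cluster_all.txt
```

Because each name is added as a separate quoted argument, nothing is split or glob-expanded along the way. The cost is the contiguity assumption: if a number is missing, every file after the gap is silently left out.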

  • Thanks, I have a question. As a practice run I put the files in order myself, so why does this not work? The columns don't seem to be separated as they should be. Note that the column length can differ from one file to another, so I don't know if that affects copying into Excel. – Jackson Hart Jul 28 '18 at 17:22
  • I don't understand your comment. I just gave file_2.txt 4 lines while the rest have 3, and the paste command produced ,2d,,,,,,,,,,,,,,,,,, as the last line, which is correct and expected. I suspect your method of "copying into Excel" may be broken. Remember, the default paste separator is a TAB, and multiple TABs may not copy cleanly. – Stephen Harris Jul 28 '18 at 17:26
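The unequal-length case from the comments can be reproduced on a small scale. This is a sketch with three made-up demo files in a scratch directory, using commas rather than the default TAB so the empty fields are easy to see.

```shell
#!/bin/sh
# Sketch: how paste pads columns when the input files differ in length.
set -e

workdir=$(mktemp -d)    # hypothetical scratch directory for the demo
cd "$workdir"

printf '1a\n1b\n1c\n'     > file_1.txt
printf '2a\n2b\n2c\n2d\n' > file_2.txt    # one line longer than the others
printf '3a\n3b\n3c\n'     > file_3.txt

# Files that run out of lines contribute an empty field.
paste -d, file_1.txt file_2.txt file_3.txt > combined.csv
cat combined.csv
# prints:
# 1a,2a,3a
# 1b,2b,3b
# 1c,2c,3c
# ,2d,
```

Every output line keeps the same number of separators, so 2d stays in the second column. With the default TAB separator the same padding shows up as consecutive TABs, which is where copying into a spreadsheet may go wrong.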