Split and edit resulting file in the same pipeline

Question

I have a script that checks a folder for new files every minute and parses them.

Lately I have had issues with big files. The parser hangs if the file is very big.

So I split any file that is too big. I am using this command for that:

split -n l/5 -d filename filename

I calculate the number of chunks by dividing the file size by a size the parser can handle.
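The chunk-count calculation described above could look roughly like this. It is only a sketch: the function name chunk_count and the max_bytes limit are made up for illustration, and it rounds up so that no chunk exceeds the limit.

```shell
#!/bin/sh
# chunk_count FILE MAX_BYTES
# Prints how many chunks are needed so that each chunk is at most
# MAX_BYTES bytes (ceiling division). Hypothetical helper, not part
# of the original script.
chunk_count() {
    file=$1
    max_bytes=$2
    size=$(wc -c < "$file")                     # file size in bytes
    n=$(( (size + max_bytes - 1) / max_bytes )) # round up
    [ "$n" -lt 1 ] && n=1                       # split needs at least 1 chunk
    echo "$n"
}
```

The result can then be passed to split, for example: split -n "l/$(chunk_count filename 1000000)" -d filename split_filename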

Now comes the tricky part. The first two lines of the file I am splitting are very important, and I need to add those two lines at the top of every resulting file.

It would be great if I could do this in the same command line, somehow processing the files that split produces. The size is variable, so I can end up with 20 new files or just 2, and I cannot tell in advance which resulting files belong to which original file.

Should the first two lines be added to every split file, or only to all but the first (which already includes those lines unless the number of chunks is extreme)? – jofel, Feb 08 '17 at 09:16
Recent versions of GNU split have a --filter option: does yours? — steeldriver, Feb 08 '17 at 13:04
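A possible way to use the --filter option mentioned in the comment above, assuming a GNU split that supports it. split runs the filter once per chunk and passes the output name in $FILE. To avoid a doubled header in the first chunk, this sketch splits only the body (everything after line 2), which costs one temporary copy of the file. The function name and the HEADER_SRC variable are invented for the example.

```shell
#!/bin/sh
# split_with_header FILE CHUNKS PREFIX
# Splits FILE (minus its 2-line header) into CHUNKS line-aligned pieces
# and lets the filter write "header + piece" into each output file.
# Requires GNU split with --filter; names here are hypothetical.
split_with_header() {
    src=$1
    chunks=$2
    prefix=$3
    body=$(mktemp) || return 1
    tail -n +3 "$src" > "$body"          # body without the header lines
    # HEADER_SRC is exported to split and inherited by the filter shell.
    HEADER_SRC=$src split -n "l/$chunks" -d \
        --filter='{ head -n 2 "$HEADER_SRC"; cat; } > "$FILE"' \
        "$body" "$prefix"
    status=$?
    rm -f "$body"
    return "$status"
}
```

Usage would be something like: split_with_header filename 5 split_filename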

sgargel · Answer 1 · 2017-02-08T09:24:13.137

0

Ugly but should work:

split -n l/5 -d filename split_filename && find . -name 'split_filename*' -exec sh -c 'head -n 2 filename | cat - "$1" > temp && mv temp "$1"' sh {} \;

edited Feb 08 '17 at 09:24

answered Feb 08 '17 at 09:13

Hello, thanks for the response. I tried this command: it splits the files and adds the first two lines to a file called "temp", but it does not add those lines to the resulting files. Maybe a step is missing? – lapinkoira Feb 08 '17 at 10:00
Worked for me, but there may be an issue with the && mv temp {} step, which moves the file called temp over the found file ({}) – sgargel Feb 08 '17 at 10:08

score 0 · Answer 2 · answered Feb 08 '17 at 09:19

You can use ed (provided neither of the first two lines of the file consists of a single dot):

split -n l/5 -d filename split_filename
for i in split_filename* ; do
   (echo 1i && head -n 2 filename && printf '.\nw\n') | ed -s "$i"
done

# if necessary, remove the doubled header from the first file
# (with -d the suffixes are numeric, so the first file is split_filename00):
sed -i "1,2d" split_filename00
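If ed is not available, a plain loop can do the same job. The sketch below skips the first chunk instead of fixing it afterwards, which assumes the header fits entirely inside the first chunk, and it assumes the prefix does not also match the source file name. The function name is invented for the example.

```shell
#!/bin/sh
# add_header_to_chunks FILE PREFIX
# Prepends the first two lines of FILE to every PREFIX* chunk except
# the first one, which is assumed to contain the header already.
# Hypothetical helper; run it after split has produced the chunks.
add_header_to_chunks() {
    src=$1
    prefix=$2
    first=1
    for chunk in "$prefix"*; do
        [ -f "$chunk" ] || continue      # glob matched nothing
        if [ "$first" -eq 1 ]; then
            first=0                      # first chunk keeps its own header
            continue
        fi
        # Build header + chunk next to the chunk, then replace it.
        { head -n 2 "$src"; cat "$chunk"; } > "$chunk.tmp" &&
            mv "$chunk.tmp" "$chunk" || return 1
    done
}
```

For example: split -n l/5 -d filename split_filename && add_header_to_chunks filename split_filename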
