
I'm trying to split a long csv into files of 500 lines each. I want the output files in a specific directory, and I want to leave off the first line of the csv.

I can use split and leave off the first line of the csv by piping the output of cat:

cat file.csv | tail -n +2 | split -l 500

And I can specify the output directory like so:

split -l 500 file.csv /mnt/outdir

But when I try something like this:

cat file.csv | tail -n +2 | split -l 500 /mnt/outdir

It thinks that /mnt/outdir is the file I am trying to split and tells me split: /mnt/outdir: Is a directory.

So how do I pipe output into the split command while also specifying an output directory?

1 Answer

Use - as the input filename, e.g.:

cat file.csv | tail -n +2 | split -l 500 - /mnt/outdir

But there's no need for cat here, since tail can read the file directly:

tail -n +2 file.csv | split -l 500 - /mnt/outdir

Alternatively, use /dev/stdin:

tail -n +2 file.csv | split -l 500 /dev/stdin /mnt/outdir

or process substitution:

split -l 500 <(tail -n +2 file.csv) /mnt/outdir
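To try the - form without touching real data, here is a minimal sketch that builds a throwaway CSV and splits it. The scratch directory, the sample rows, and the chunk_ prefix are all assumptions made up for illustration; they are not from the question.

```shell
#!/bin/sh
# Hypothetical scratch area; nothing here refers to the real /mnt/outdir.
workdir=$(mktemp -d)
mkdir "$workdir/outdir"

# Sample CSV: one header line followed by 1200 data rows.
awk 'BEGIN { print "id,value"; for (i = 1; i <= 1200; i++) print i ",row" i }' \
  > "$workdir/file.csv"

# Skip the header, then let split read standard input via "-".
# The prefix ends in "outdir/chunk_", so pieces are created inside outdir.
tail -n +2 "$workdir/file.csv" | split -l 500 - "$workdir/outdir/chunk_"

# Show how many lines each piece received.
wc -l "$workdir"/outdir/chunk_*
```

With 1200 data rows this should produce three pieces (chunk_aa, chunk_ab, chunk_ac) of 500, 500, and 200 lines, and the header line should appear in none of them.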

From man split (GNU version):

split [OPTION]... [FILE [PREFIX]]

DESCRIPTION

Output pieces of FILE to PREFIXaa, PREFIXab, ...; default size is 1000 lines, and default PREFIX is 'x'.

With no FILE, or when FILE is -, read standard input.

You can see from the way the synopsis is written, [FILE [PREFIX]], that if you use a PREFIX, you must also supply an input filename. If FILE and PREFIX were both optional and independent of each other, it would be written as [FILE] [PREFIX].
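One thing worth checking: PREFIX is a plain string prefix, so split simply appends aa, ab, ... to it. If /mnt/outdir is an existing directory, a prefix of /mnt/outdir creates /mnt/outdiraa and so on next to it, while /mnt/outdir/ (with a trailing slash) puts the pieces inside it. A small sketch of the difference, using a hypothetical temporary directory rather than /mnt:

```shell
#!/bin/sh
# Hypothetical scratch area standing in for /mnt.
workdir=$(mktemp -d)
mkdir "$workdir/outdir"

# Prefix without a trailing slash: pieces are created beside the directory,
# as outdiraa and outdirab.
printf '1\n2\n3\n4\n' | split -l 2 - "$workdir/outdir"

# Prefix with a trailing slash: pieces are created inside the directory,
# as outdir/aa and outdir/ab.
printf '1\n2\n3\n4\n' | split -l 2 - "$workdir/outdir/"

# List both locations to compare.
ls "$workdir" "$workdir/outdir"
```

So if the goal is files inside the directory, ending the prefix with a slash (or with a slash plus a short name) is probably what you want.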

cas
  • 78,579