2

I noticed something about the cut command today that I had somehow never noticed before, even though I have been using it for a long time.

From the string /users/developer/, I wanted to extract users. So I ran:

echo `pwd` | cut -d '/' -f1

This returned nothing. My assumption was that, since I had specified / as the delimiter, it would return the first field found after the delimiter.

After a bit of tweaking, I realized that I could get my desired output by changing the command to retrieve f2:

echo `pwd` | cut -d '/' -f2

This led me to think that a delimiter always expects something on its left side, which is why retrieving the first field returned nothing, while retrieving the second field gave me what I expected.

Logically, that makes sense for a delimiter. I just want to know: is that really how the delimiter parameter works in the cut command? In other words, is my inference correct (that a delimiter always expects something to be on its left)?

Incognito
  • 376

3 Answers

4

f1 is empty: it is whatever comes before the first /.

 echo "/some/path/to/some/location" | cut -d '/' -f1
      |
      +--- empty

f1 is what is between echo " and first /.


To put it another way: empty is not the same as non-existent. A string or field can contain no characters, as in "".

It is perhaps clearer if you look at something like this:

[In Data]   [Output fields]
             1  2  3  4  5 -f
A:B::D:E  => A, B,  , D, E
A::C:D:E  => A,  , C, D, E
:B:C:D:   =>  , B, C, D, 

Left, in between, or right – empty or not. A field is a field.
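The table above can be checked directly in a shell. This is only a minimal sketch: nth_field is a hypothetical helper name introduced here for illustration, not part of cut, and the path is the one from the question.

```shell
# Hypothetical helper: print field number $3 of string $1, split on delimiter $2.
nth_field() {
    printf '%s\n' "$1" | cut -d "$2" -f "$3"
}

nth_field '/users/developer/' '/' 1   # empty line: nothing precedes the first /
nth_field '/users/developer/' '/' 2   # users
nth_field '/users/developer/' '/' 3   # developer
nth_field 'A:B::D:E' ':' 3            # empty line: field 3 exists but has no characters
nth_field ':B:C:D:' ':' 5             # empty line: nothing follows the last :
```

In each "empty" case cut still prints a newline, so the field is there; it just contains no characters.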

Runium
  • 28,811
1

Nothing in the man page or the info page explicitly corroborates this, but the man page says:

   -f, --fields=LIST
          select only these fields;  also print any line that contains no delimiter character, unless the -s option is specified

which suggests that the first field is BEFORE the first delimiter (even if that delimiter actually doesn't exist).

For me, I never had any doubt it worked like this, but that may not be the kind of answer you need ;-).
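The behaviour quoted from the man page can be tried out with a short sketch like the one below. first_field is a hypothetical helper introduced only for this illustration; the -s, -d and -f options are the ones documented for cut.

```shell
# Hypothetical helper: first '/'-separated field of $1; extra cut options may follow.
first_field() {
    line=$1
    shift
    printf '%s\n' "$line" | cut "$@" -d '/' -f1
}

first_field 'no-delimiter-here'       # prints the whole line: no-delimiter-here
first_field 'no-delimiter-here' -s    # prints nothing: -s suppresses lines without a delimiter
first_field 'users/developer'         # prints users
first_field '/users/developer'        # prints an empty line: field 1 is empty
```

A line with no delimiter is treated as a single field and printed whole, which fits the reading that field 1 is whatever comes before the first delimiter.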

lgeorget
  • 13,914
1

The definition seems pretty clear on this. Look at Wikipedia's article on Delimiter.

A delimiter is a sequence of one or more characters used to specify the boundary between separate, independent regions in plain text or other data streams. An example of a delimiter is the comma character, which acts as a field delimiter in a sequence of comma-separated values.

Delimiters represent one of various means to specify boundaries in a data stream. Declarative notation, for example, is an alternate method that uses a length field at the start of a data stream to specify the number of characters that the data stream contains.

It says it right there in the first sentence: "...specify the boundary between separate, independent regions...". I would take that to mean exactly what you inferred.

slm
  • 369,824