Selecting a field when you cannot count from the start

Question

I have a long file (20000+ lines) where each line describes a book:

 book_number "title of the book" size type author_number

The element shown in quotes is a string that may contain spaces, and the others are numbers, except for type, which is a single word:

 23446 Raising Steam 537724 EPUB 4

I want to extract all the size fields, but with cut you cannot use negative numbers to count from the end of the fields produced by splitting with -d " ", so this does not do what I want:

 cut -d " " -f -2 books.txt

I cannot count from the front, as a title may contain any number of spaces (I did not make up this format; I would have used CSV or JSON, which require quoting).

Am I missing some option that allows using cut? What else could I use to get the third-from-last field with a one-line solution?

Sorry, did not notice that, do I need to delete this question? — user59952, Feb 09 '14 at 12:05
No reason to, it has just been marked as a duplicate, don't worry about it. — terdon, Feb 09 '14 at 14:35

score 1 · Accepted Answer · answered Feb 09 '14 at 12:03

With Python you can do this (note the -3, which indexes from the end of the list):

 python3 -c "for x in open('books.txt'): print(x.split()[-3])"

or with awk:

 awk '{ print ( $(NF-2) ) }' books.txt
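
To check the behaviour on titles of different lengths, here is a minimal sketch that runs the awk command against two sample lines. The second sample line and the temporary file are made up for illustration. The rev | cut | rev pipeline is an extra approach, not part of the commands above: it assumes rev (shipped with util-linux on most Linux systems) is installed and that fields are separated by exactly one space.

```shell
# Hypothetical sample data; the second line is invented for illustration.
tmp=$(mktemp)
printf '%s\n' \
  '23446 Raising Steam 537724 EPUB 4' \
  '10001 A Much Longer Example Title 12345 PDF 7' > "$tmp"

# awk: NF is the number of fields on the line, so $(NF-2) is the
# third field counted from the end, however many words the title has.
sizes_awk=$(awk '{ print $(NF-2) }' "$tmp")

# cut cannot count from the end, but reversing each line turns
# "third from the end" into "third from the start". The second rev
# restores the character order of the extracted field.
sizes_cut=$(rev "$tmp" | cut -d ' ' -f 3 | rev)

printf '%s\n' "$sizes_awk"
rm -f "$tmp"
```

Both variables should hold 537724 and 12345, one per line. Unlike the awk version, the cut variant breaks if a line has trailing spaces or doubled spaces among its last three fields.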

Timo

6,332

I like the Python one-liner, never used it that way. – user59952 Feb 09 '14 at 12:06

selecting a field when you cannot count from the start

1 Answers1