I have a long file (20000+ lines) in which each line describes a book:

 book_number "title of the book" size type author_number

The element shown in quotes is a string that may contain spaces (the quotes are not in the actual data). The others are numbers, except for type, which is a single word:

 23446 Raising Steam 537724 EPUB 4

I want to extract all the size fields, but cut does not accept negative numbers to count from the end of the fields produced by splitting with -d " ", so this does not work:

 cut -d " " -f -2 books.txt

I cannot count from the front, as the titles may contain any number of spaces (I did not make up this format; I would have used CSV or JSON, which require quoting).

Am I missing some option that would let me use cut? What else could I use to get the third field from the end with a one-line solution?

1 Answer

With Python you can do this (note the -3, which counts from the end of the split line):

 python3 -c "for x in open('books.txt'): print(x.split()[-3])"

or with awk:

 awk '{ print ( $(NF-2) ) }' books.txt
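
If the other fields are needed as well, the same idea of counting from the end can be written as a small Python function. This is only a sketch: the names `Book` and `parse_book` are made up for illustration, and it assumes every line ends with exactly three single-token fields (size, type, author_number) after the title.

```python
from typing import NamedTuple


class Book(NamedTuple):
    # Hypothetical record type; field names follow the format in the question.
    number: int
    title: str
    size: int
    type: str
    author: int


def parse_book(line: str) -> Book:
    # Split off the last three fields from the right, so spaces in the
    # title cannot shift the position of size, type and author_number.
    head, size, kind, author = line.rsplit(None, 3)
    # What remains is the book number followed by the (possibly multi-word) title.
    number, title = head.split(None, 1)
    return Book(int(number), title, int(size), kind, int(author))


if __name__ == "__main__":
    # Sample line taken from the question.
    print(parse_book("23446 Raising Steam 537724 EPUB 4").size)  # prints 537724
```

Splitting from the right with a fixed count keeps the title intact in one piece, which the one-liners above do not need but a fuller parser would.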
Timo