Code which works with a single document

pdftotext *.pdf - | grep therapy

You can use find, as described in the thread "How can I grep in PDF files?", but I would like to understand why the command above does not work with several files.

An alternative attempt with pdfgrep, which may add some benefit but is still early in development:

pdftotext *.pdf - | pdfgrep therapy
# Wrong syntax, so it fails with:
# Usage: pdfgrep [OPTION]... PATTERN FILE...
# Syntax Warning: Invalid Font Weight
# Syntax Warning: Invalid Font Weight

I would then like a fast way to jump to the specific PDF page when there is a good match. However, I have not found any evidence that such a feature exists.

OS: Debian 8.5
Linux kernel: 4.6 backports
Hardware: Asus Zenbook UX303UA
Poppler-utils: pdftotext

2 Answers

Just use pdfgrep directly:

pdfgrep -n therapy *.pdf

The -n option will display the page number of each match.
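
For the "jump to the page" part, the page number from -n can be handed to a viewer. The following is only a rough sketch under several assumptions: evince is the viewer (its --page-index option selects the page to show), pdfgrep is called on a single file so that each output line starts with the page number followed by a colon, and open_first_match is a hypothetical helper name, not something pdfgrep provides.

```shell
# open_first_match PATTERN FILE
# Hypothetical helper: open FILE in evince at the page of the first match.
# Assumes "pdfgrep -n" on a single file prints lines of the form "page:text".
open_first_match() {
    pattern=$1
    pdf=$2
    # Keep only the first match and take the field before the first colon.
    page=$(pdfgrep -n "$pattern" "$pdf" | head -n 1 | cut -d: -f1 | tr -d ' ')
    # No match (or no output at all): report failure without opening anything.
    [ -n "$page" ] || return 1
    evince --page-index="$page" "$pdf"
}
```

You would call it as open_first_match therapy report.pdf, where report.pdf stands for one of your own files. Other viewers use different options for the start page, so check the man page of the one you use.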

Stephen Kitt

You could try this:

pdfgrep therapy *.pdf

or

find /tmp -name '*.pdf' -exec pdfgrep test {} +

For example:

user@host $ pdfgrep test *.pdf 
1.pdf:test1
1.pdf:test2
1.pdf:test3
2.pdf:test1
2.pdf:test2
2.pdf:test3
test (copy).pdf:test1
test (copy).pdf:test2
test (copy).pdf:test3


user@host $ find /tmp -name '*.pdf' -exec pdfgrep test {} +
/tmp/test (copy).pdf:test1
/tmp/test (copy).pdf:test2
/tmp/test (copy).pdf:test3
/tmp/1.pdf:test1
/tmp/1.pdf:test2
/tmp/1.pdf:test3
/tmp/2.pdf:test1
/tmp/2.pdf:test2
/tmp/2.pdf:test3
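
As for why the pipeline in the question fails: pdftotext takes one input PDF and an optional output file, so when the shell expands *.pdf to several names it does not treat them all as inputs and prints its usage message instead. If you want to stay with pdftotext and grep, a minimal sketch is to convert one file at a time. It assumes GNU grep (for --label), and search_pdfs is a hypothetical helper name, not part of poppler-utils.

```shell
# search_pdfs PATTERN FILE...
# Hypothetical helper: run pdftotext once per PDF and grep its text output.
search_pdfs() {
    pattern=$1
    shift
    for pdf in "$@"; do
        # "-" makes pdftotext write the extracted text to stdout.
        # -H with --label (GNU grep) prefixes each match with the PDF name
        # instead of "(standard input)".
        pdftotext "$pdf" - 2>/dev/null | grep -H --label="$pdf" -- "$pattern"
    done
}
```

Calling search_pdfs therapy *.pdf then prints matches in the same file:text form as the pdfgrep output above, but without page numbers.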
Mustafa DOGRU