28

I was going through the list of files included in coreutils and I was able to come up with an example of how I could personally use all of the commands provided except for ptx. Can you give one or two (or three) examples of using ptx? The more varied the use case the better.

$ apropos ptx
ptx(1)         - produce a permuted index of file contents
dfc

5 Answers

16

@Joseph R.'s accepted answer with the history is good, but let's look at how it might be used.

ptx generates a permuted term index ("ptx") from text. An example is easiest to understand:

$ cat input
a
b
c

$ ptx -A -w 25 input
:1:            a b c
:2:        a   b c
:3:      a b   c

         ^^^^  ^ ^^^^-words to the input's right
         |     +-here is the actual input
         +-words to the input's left

Down the middle you see each word of the input in turn, with its left and right context on either side. The first word is "a". It occurs on line one and is followed by "b" and "c" to its right. The second word is "b", which occurs on line two with "a" to its left and "c" to its right. Finally, "c" occurs on line three and is preceded by "a" and "b".

Using this, you can find the line number and surrounding words of any word in a text. This sounds a lot like grep, eh? The difference is that ptx understands the structure of text: it works in logical units of words and sentences rather than lines. This makes the contextual output of ptx more relevant than grep's when dealing with English text.
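To make the keyword/left/right idea concrete, here is a rough sketch of the same rotation done with awk. It is not how ptx is implemented, and `kwic` is a made-up helper name; it only handles context within a single line and assumes one input stream, whereas ptx works on sentences and has many more options.

```shell
# kwic is a hypothetical helper, not part of coreutils.
# It prints one tab-separated row per word:
#   keyword, line number, left context, right context
# and sorts the rows by keyword, then by line number.
kwic() {
    awk '{
        for (i = 1; i <= NF; i++) {
            left = ""; right = ""
            # Everything before word i on this line is the left context.
            for (j = 1; j < i; j++) {
                sep = (j > 1) ? " " : ""
                left = left sep $j
            }
            # Everything after word i on this line is the right context.
            for (j = i + 1; j <= NF; j++) {
                sep = (j > i + 1) ? " " : ""
                right = right sep $j
            }
            printf "%s\t%d\t%s\t%s\n", $i, NR, left, right
        }
    }' "$@" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
}

# Demo: every word of the line becomes a keyword in turn.
printf 'a b c\n' | kwic
```

Each word of the input shows up once in the first column, which is the "permuted" part: the same line is listed again under every word it contains.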

Let's compare ptx and grep, using the first paragraph of James Ellroy's American Tabloid:

$ cat text
America was never innocent. We popped our cherry on the boat over and looked back with no regrets. You can’t ascribe our fall from grace to any single event or set of circumstances. You can’t lose what you lacked at conception.

Here's grep (with the colored matches manually marked up as /match/ instead):

$ grep -ni you text
1:America was never innocent. We popped our cherry on the boat over and looked back with no regrets. /You/ can’t ascribe our fall from grace to any single event or set of circumstances. /You/ can’t lose what /you/ lacked at conception.

Here's ptx:

$ ptx -Afo <(echo you) text
text:1:        /back with no regrets.   You can’t ascribe our fall/
text:1:     /or set of circumstances.   You can’t lose what you/
text:1:      /. You can’t lose what   you lacked at conception.

Because grep is line-oriented, and this paragraph is all one line, the grep output isn't quite as concise or helpful as the output from ptx.

bishop
11

Apparently, it was used to index the Unix Reference manual in the olden days.

In the References below, the Wikipedia article explains what a permuted index is (also called KWIC, or "Keyword in context") and ends with this cryptic remark:

Books composed of many short sections with their own descriptive headings, most notably collections of manual pages, often ended with a permuted index section, allowing the reader to easily find a section by any word from its heading. This practice is no longer common.

More searching reveals the remaining articles in the References, which explain more about how the Unix man pages used a permuted index. It seems the main issue they were dealing with is that the man pages had no continuous numbering.

From what I gather, the practice of using a permuted index is now arcane and obsolete.

References

Joseph R.
5

You might find this collection of examples interesting:

Pattern Matching and Permuted Term Indexing with Command Line Tools in Linux

2

A permuted index is also known as a concordance, and concordances are still relevant and quite useful. A good example is quickly identifying Bible verses when you only know a few words. Another example would be indexing all of Shakespeare's sonnets to enable similar quick lookup by keyword.
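As a small sketch of that kind of lookup, the snippet below indexes the first two lines of Sonnet 18 and asks ptx for a single keyword. The file names and the temporary directory are made up for illustration; `-A` adds file:line references, `-f` ignores case, and `-o` restricts the index to the words listed in a file, the same options used in the answer above.

```shell
# Hypothetical sample corpus: two lines standing in for a full collection.
dir=$(mktemp -d)
printf '%s\n' \
    "Shall I compare thee to a summer's day?" \
    "Thou art more lovely and more temperate:" > "$dir/sonnet.txt"

# Only index the words we want to look up (one per line).
printf 'lovely\n' > "$dir/keywords.txt"

# Build the concordance; each entry is prefixed with file:line:.
index=$(ptx -A -f -o "$dir/keywords.txt" "$dir/sonnet.txt")
printf '%s\n' "$index"

rm -r "$dir"
```

With a real corpus you would put the handful of words you remember into the keyword file and read the references off the left-hand column.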

fred
1

You can see an (old) example of an online permuted index here (click on the Permuted index link in the top-left frame).

As someone else has mentioned, this is not common anymore because of the capabilities of search engines and custom search apps.

HalosGhost
evb