2

I have a file (file_name) which contains exactly one occurrence of the string "Result: " at the start of a line. I want to print all the characters after the string "Result: " in that line until I encounter a '.' character. Which shell command should I use?

3 Answers

3

A modern version of GNU grep with Perl-compatible regex support (-P) will do:

grep -P -o '^Result: \K[^.]*' file_name

-o tells grep to print only the part of the line that matches. -P enables Perl-compatible regexes, in which \K resets the start of the reported match: the text before \K must still be present, but it is not included in the output. The effect is similar to a look-behind assertion.
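As a quick check, here is a minimal sketch that runs the command against a throwaway file. The sample contents and the variable names `tmpfile` and `result` are made up for illustration, and it assumes a grep built with -P support.

```shell
# Hypothetical sample input: exactly one line starts with "Result: ".
tmpfile=$(mktemp)
printf '%s\n' 'Header line.' 'Result: 42 apples. More text.' 'Footer' > "$tmpfile"

# \K drops "Result: " from the reported match;
# [^.]* stops just before the first period.
result=$(grep -P -o '^Result: \K[^.]*' "$tmpfile")
printf '%s\n' "$result"

rm -f "$tmpfile"
```

With this input, only the text between "Result: " and the first period should be printed.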

See also: Can grep output only specified groupings that match?

camh
2

You can use sed. Like this:

sed -e 's/Result:\(.*\)\..*$/\1/g' file_name

If you want to save the result in the same file, you can add -i to the sed arguments.

saeedn
  • 1
    Two problems with this. You need to anchor Result to the start of the line (^Result), and you need to use [^\.]* inside the group, otherwise if there are multiple periods, the group will match the longest match, up to the last period. – camh Nov 16 '11 at 13:11
  • 1
    @camh Problem statement says that "Result:" occurs only once at the start of line, so it doesn't make any difference. And if the asker's problem is to find the first period, you're right about the second problem :) – saeedn Nov 16 '11 at 13:19
  • 2
    Doesn't this command also print all the unaffected lines? – enzotib Nov 16 '11 at 14:08
  • Try this: sed -nr 's/^Result:([^.]*).*/\1/p' .. it resolves the "issues"... btw. "one occurrance of the string "Result: " at the start of a line" doesn't rule out the possibility of Result: occurring somewhere else in a line (it depends on exactly what he means), so putting the ^ in place is prudent... (ps. you don't need the trailing *, because .* is greedy.) – Peter.O Nov 16 '11 at 15:00
  • 1
    @camh: [\.] will match both \ and . ... Escaping is meaningless within the [square brackets] .. everything is literal between them.. – Peter.O Nov 16 '11 at 15:57
  • @saeedn: The question says there is only one occurrence of 'Result:' at the start of the line. There may be other occurrences that are not at the start of the line. That's how I read it, anyway. – camh Nov 16 '11 at 21:14
  • @fered: You ought to put that sed script into its own answer. It's the only correct sed answer on the page right now. – Jander Nov 17 '11 at 16:05
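The greedy-match point raised in these comments can be seen with a small sketch. The sample line and the variable names `line`, `greedy` and `first` are assumptions chosen for illustration; both commands use -n with the p flag so that non-matching lines would not be printed.

```shell
# Hypothetical input with more than one period after "Result: ".
line='Result: 3.14 is pi. Done.'

# Greedy .* inside the group runs up to the LAST period on the line.
greedy=$(printf '%s\n' "$line" | sed -n 's/^Result: \(.*\)\..*$/\1/p')

# [^.]* cannot cross a period, so the group stops before the FIRST one.
first=$(printf '%s\n' "$line" | sed -n 's/^Result: \([^.]*\).*/\1/p')

printf 'greedy: %s\nfirst:  %s\n' "$greedy" "$first"
```

The two results differ whenever the line contains more than one period, which is why the negated bracket expression is the safer choice here.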
0
<file_name sed -n 's/^Result: \([^.]*\)\..*/\1/p'
<file_name awk '/^Result:.*\./ {sub(/^Result: /,""); sub(/\..*/, ""); print}'
<file_name grep '^Result: .*\.' | sed 's/^Result: //' | sed 's/\..*//'

If the presence of a . is not required, change to

<file_name sed -n -e 's/\..*//' -e 's/^Result: //p'
<file_name awk '/^Result: / {sub(/^Result: /,""); print}'
<file_name grep '^Result: ' | sed 's/^Result: //'

See this answer for background for the sed solutions.
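To illustrate the difference between the two sed variants, here is a minimal sketch run against a "Result: " line that has no period. The file contents and the variable names `tmpfile`, `strict` and `lenient` are assumptions for illustration only.

```shell
# Hypothetical input: the "Result: " line contains no period.
tmpfile=$(mktemp)
printf '%s\n' 'Intro. text' 'Result: no period here' > "$tmpfile"

# Period required: the pattern cannot match, so nothing is printed.
strict=$(<"$tmpfile" sed -n 's/^Result: \([^.]*\)\..*/\1/p')

# Period optional: strip from the first period (if any), then print
# only the lines where the "Result: " prefix was removed.
lenient=$(<"$tmpfile" sed -n -e 's/\..*//' -e 's/^Result: //p')

printf 'strict:  [%s]\nlenient: [%s]\n' "$strict" "$lenient"
rm -f "$tmpfile"
```

The first variant silently produces no output for such a file, so which one to use depends on whether a missing period should count as "no result".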

  • What happens if there's no period on the "Result:" line, or no space after the colon in "Result:"? Also, the first sed script doesn't omit the non-"Result:" lines. – Jander Nov 17 '11 at 16:12
  • @Jander Thanks for the bug report. I took the presence of a . to be a requirement. If it's not, the awk and grep+sed solutions can easily be simplified; the sed solution becomes less compact but arguably easier to read. – Gilles 'SO- stop being evil' Nov 17 '11 at 16:33