12

I have a text file on Linux where the contents are like below:

help.helloworld.com:latest.world.com
dev.helloworld.com:latest.world.com

I want to get the contents before the colon like below:

help.helloworld.com
dev.helloworld.com

How can I do that within the terminal?

Joel Deleep
  • 239
  • 3
  • 12
  • 2
    The grep utility is used for looking for lines matching regular expressions. You could possibly use it here, but it would be more appropriate to use a tool that extracts data from fields given some delimiter, such as the cut utility. – Kusalananda Aug 27 '19 at 17:23
  • I've submitted an edit to take out the word "grep" and replace it with "find" in the title and "get" in the question body, to avoid the X/Y issue of assuming grep is the right tool to solve the actual problem. – Monty Harder Aug 28 '19 at 18:21
  • 1
    All I can say is that the contents before the colon is much better than the contents after the colon ;-). – Peter - Reinstate Monica Aug 30 '19 at 14:02

7 Answers

40

This is what cut is for:

$ cat file
help.helloworld.com:latest.world.com
dev.helloworld.com:latest.world.com
foo:baz:bar
foo

$ cut -d: -f1 file
help.helloworld.com
dev.helloworld.com
foo
foo

You just set the delimiter to : with -d: and tell it to only print the 1st field (-f1).
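As the last `foo` line in the sample shows, cut passes lines that contain no delimiter through unchanged. If such lines should be skipped instead, the -s option suppresses them. A minimal sketch, assuming inline sample data rather than a file (the variable name `result` is only for illustration):

```shell
# Feed three sample lines to cut; -s drops the line that has no colon.
result=$(printf '%s\n' \
  'help.helloworld.com:latest.world.com' \
  'foo:baz:bar' \
  'foo' | cut -d: -f1 -s)

# Show what was kept: the first field of each line that contained a colon.
printf '%s\n' "$result"
```

Without -s, the bare `foo` line would be printed as well, as in the output above.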

terdon
  • 242,166
20

Or an alternative:

$ grep -o '^[^:]*' file
help.helloworld.com
dev.helloworld.com

This matches the longest run of non-colon characters ([^:]*) starting at the beginning of each line (^), and -o prints only the matched part.

Freddy
  • 25,565
19

Would definitely recommend awk:

awk -F ':' '{print $1}' file

Uses : as a field separator and prints the first field.
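When a line has no colon, awk treats the whole line as the first field, so it is printed unchanged, and with several colons only the text before the first one is kept. A small sketch with inline sample data (the variable name `first_fields` is just an illustration):

```shell
# Three sample lines: one colon, two colons, and no colon at all.
first_fields=$(printf '%s\n' \
  'dev.helloworld.com:latest.world.com' \
  'foo:baz:bar' \
  'no.colon.com' | awk -F ':' '{print $1}')

# Print the first field of every line.
printf '%s\n' "$first_fields"
```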

Centimane
  • 4,490
5

updated answer

Considering the following file file.txt:

help.helloworld.com:latest.world.com
dev.helloworld.com:latest.world.com
no.colon.com
colon.at.the.end.com:

You can use sed to remove everything after the colon:

sed -e 's/:.*//' file.txt

This works for all the corner cases pointed out in the comments—if it ends in a colon, or if there is no colon, although these weren't mentioned in the question itself. Thanks to @Rakesh Sharma, @mirabilos, and @Freddy for their comments. Answering questions is a great way to learn.
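A quick way to check those corner cases without creating a file is to pipe sample lines straight into sed. This is only a sketch; the extra `foo:baz:bar` line and the variable name `trimmed` are additions for illustration:

```shell
# Sample lines: normal, no colon, trailing colon, and more than one colon.
trimmed=$(printf '%s\n' \
  'help.helloworld.com:latest.world.com' \
  'no.colon.com' \
  'colon.at.the.end.com:' \
  'foo:baz:bar' | sed -e 's/:.*//')

# Each line is cut at its first colon; lines without a colon are untouched.
printf '%s\n' "$trimmed"
```

The substitution starts at the leftmost colon and `.*` runs to the end of the line, so everything from the first colon onward is removed.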

kGdmioT
  • 205
4

Note: this requires GNU grep (for the -P option). It will not work with the default grep on e.g. macOS or any of the other BSDs.

Do you mean like this:

grep -oP '.*(?=:)' file

Output:

help.helloworld.com
dev.helloworld.com
  • 4
    If there are two or more colons on the line, this will print everything until the last one, so not what the OP needs. Try echo foo:bar:baz | grep -oP '.*(?=:)'. This will work for the OP's example, but not for the general case as described in the question. – terdon Aug 27 '19 at 17:19
  • there is only one colon and its working fine , but thanks for the update – Joel Deleep Aug 27 '19 at 17:25
-2

You could achieve this with bash string handling, by stripping the longest suffix that matches :* from each line as it is read:

while IFS= read -r line; do printf '%s\n' "${line%%:*}"; done < inputfile

This might be a useful alternative if you are parsing the file in a shell script (though I suspect using cut might be more efficient).
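The difference between the two suffix-removal operators matters when a line has more than one colon. A minimal sketch, using a made-up sample string and illustrative variable names:

```shell
line='foo:baz:bar'

# %% removes the longest suffix matching ":*", i.e. from the first colon on.
longest=${line%%:*}

# % removes the shortest suffix matching ":*", i.e. from the last colon on.
shortest=${line%:*}

printf '%s\n' "$longest" "$shortest"
```

Here `longest` holds `foo` and `shortest` holds `foo:baz`, so `%%` is the one that gives the text before the first colon.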

-2

In pure POSIX shell without using external commands, I'd do:

#!/bin/sh
# IFS=: applies only to read, so nothing has to be restored afterwards.
while IFS=: read -r a _; do
  printf '%s\n' "$a"
done < file.txt
Léa Gris
  • 477