
I am a complete novice when it comes to interfacing with computers, but I'm working on a project that I can't keep up with without making a script, so I need help.

I have two strings in a single line in my files that I want to replace using sed. The problem is they're very similar and I can't figure out how to replace them independently.

The line I want to replace is this:

*xyzfile 0 1 somefilebeingpointedto.xyz

I want to end up with this line:

*xyz 0 1

Since the 0 and 1 change from file to file and there's no conserved string before the '.xyz' in the filename, I don't know how to do this, or even how to simply replace the entire line.

What I have been trying to use is the following two sed lines:

sed -i 's/^.*xyzfile/\*xyz/' myfile.inp
sed -i 's/^.\.xyz/" "/' myfile.inp

The order doesn't make a difference, and it seems like sed is simply not treating the period as part of the string in the second command.

If there's a better way of accomplishing this, I'm all ears! Thanks

Pete
  • Closely related: https://unix.stackexchange.com/questions/32907/what-characters-do-i-need-to-escape-when-using-sed-in-a-sh-script – Jeff Schaller Aug 01 '19 at 16:48
  • You got good answers to why your script is failing but this is what you apparently want to do: sed 's/^\*xyzfile\(.*\) .*/*xyz\1/' file. – Ed Morton Aug 03 '19 at 20:14

2 Answers


The problem with your second command is that it isn't matching your line at all. It's trying to match ^.\.xyz, which is: <beginning of line><any single character>.xyz. But I suspect you're trying to match <some characters>.xyz<end of line>. So for starters you need to remove the ^, and then you need to figure out exactly how to define <some characters> for your situation.
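To see the difference, you can run both patterns against the sample line. This is only a sketch: the variable names (line, anchored, trailing) are made up for illustration, and ' [^ ]*\.xyz$' is just one possible definition of <some characters>, assuming the filename contains no spaces.

```shell
# Sample line from the question (hypothetical variable name).
line='*xyzfile 0 1 somefilebeingpointedto.xyz'

# Original pattern: anchored with ^, so ".xyz" would have to start at
# the second character of the line. It does not, so nothing is replaced.
anchored=$(printf '%s\n' "$line" | sed 's/^.\.xyz/" "/')

# Without ^: a space, then non-space characters, then ".xyz" at end of line.
trailing=$(printf '%s\n' "$line" | sed 's/ [^ ]*\.xyz$//')

printf '%s\n' "$anchored"   # the line comes back unchanged
printf '%s\n' "$trailing"   # the trailing filename is removed
```

The first command prints the line untouched, which is the behaviour described in the question; the second strips only the filename and leaves 'xyzfile' for a separate substitution.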

But you don't need two sed invocations, because sed can capture parts of a match and reuse them in the replacement. If you surround each part you want to keep with (escaped) parentheses, you can substitute it back in with \1 (or \2 for the second captured part, and so on).

So you want to strip the text 'file' and the trailing filename from this:

*xyzfile 0 1 somefilebeingpointedto.xyz

So the pattern I think you're looking for is (with the literal text 'file' in it):

<something to keep>file<something to keep><space><pattern without spaces until end of line>

We can match that with:

^\(.*\)file\(.*\) [^ ]*$

Notice that the two parts we want to keep are put inside (escaped) parentheses. If we didn't want to keep them for later, we could drop the parentheses: .*file.* [^ ]*

Putting that pattern into sed's substitute command, and keeping only the two captured parts in the replacement, gives the full command:

sed 's/^\(.*\)file\(.*\) [^ ]*$/\1\2/'
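As a quick check, here is that command applied to the sample line from the question. The variable names are hypothetical, and the line is piped in rather than read from myfile.inp so that no file is modified.

```shell
# Sample line from the question (hypothetical variable name).
line='*xyzfile 0 1 somefilebeingpointedto.xyz'

# \1 captures "*xyz", \2 captures " 0 1"; "file" and the last
# space-separated word (the filename) are dropped.
result=$(printf '%s\n' "$line" | sed 's/^\(.*\)file\(.*\) [^ ]*$/\1\2/')

printf '%s\n' "$result"   # prints: *xyz 0 1
```

Note that this pattern is quite general: it would also rewrite any other line that contains 'file' followed later by a space, so it may be worth checking the output before adding -i to edit the file in place.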
  • Only BRE requires escaping the parentheses. The ERE form of the regex does the same as yours: sed -E 's/^(.*)file(.*) [^ ]*$/\1\2/', and, as it is ERE, there is no need to escape the (). –  Aug 01 '19 at 16:07

The dot in a sed regex means any character, but only one character.
So the regex ^.\.xyz means: from the start of the line, match one character, then a literal dot, and then xyz. You may have meant ^.*\.xyz$, but that would match the whole line (and replace all of it). Instead, use the space as a delimiter (assuming the filenames do not contain spaces): ' [^ ]*\.xyz$', which means: starting from a space (" "), match any number (*) of non-space ([^ ]) characters up to the extension .xyz at the end of the line ($). You can apply both substitutions in one invocation by preceding each with -e:

sed -e 's/^.*xyzfile/*xyz/' -e 's/ [^ ]*\.xyz$//' myfile.inp

No need to escape the * in the right side of a substitution.
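Here is a small sketch of the two-expression command above running over several lines at once. The input text is invented for illustration (only the first *xyzfile line comes from the question), and it is piped in so that no file is changed.

```shell
# Hypothetical input: one unrelated line and two *xyzfile lines.
input='some other line
*xyzfile 0 1 somefilebeingpointedto.xyz
*xyzfile -1 2 another_geometry.xyz'

# First expression rewrites "xyzfile" to "*xyz" (matching from the line start),
# second expression removes the trailing " name.xyz".
output=$(printf '%s\n' "$input" |
    sed -e 's/^.*xyzfile/*xyz/' -e 's/ [^ ]*\.xyz$//')

printf '%s\n' "$output"
```

Lines that match neither expression pass through unchanged, and the 0 and 1 fields can be any values without spaces.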

That could be simplified by matching only xyzfile and leaving the leading * in place:

sed -e 's/xyzfile /xyz /' -e 's/ [^ ]*\.xyz$//' myfile.inp

Spaces in the filename

If the filename could contain spaces, the regex becomes more complex as there is no simple way to select (only) that part of the line.

If the second and third fields each have only one character, you can use a capture group and place it back with \1:

sed -e 's/xyzfile\( . .\) .*\.xyz$/xyz\1/' myfile.inp

In extended regex syntax:

sed -E -e 's/xyzfile( . .) .*\.xyz$/xyz\1/' myfile.inp

Or, if the fields could have several characters (except space):

sed -E -e 's/xyzfile( [^ ]* [^ ]*) .*\.xyz$/xyz\1/' myfile.inp
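To illustrate why the capture-group form is needed here, this sketch compares it with a space-delimited version on a made-up line whose filename contains spaces. The line and the variable names are assumptions for the example, not taken from the question.

```shell
# Hypothetical line: multi-character fields and a filename with spaces.
line='*xyzfile -1 2 my geometry file.xyz'

# Capture-group form: keep the two fields, drop everything after them.
combined=$(printf '%s\n' "$line" |
    sed -E -e 's/xyzfile( [^ ]* [^ ]*) .*\.xyz$/xyz\1/')

# Space-delimited form: only the last word of the filename is removed.
by_space=$(printf '%s\n' "$line" |
    sed -e 's/xyzfile /xyz /' -e 's/ [^ ]*\.xyz$//')

printf '%s\n' "$combined"   # prints: *xyz -1 2
printf '%s\n' "$by_space"   # prints: *xyz -1 2 my geometry
```

The single-character pattern ( . .) would not match this line at all, because -1 is two characters long.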

That leaves the line unchanged (filename included) if it does not contain xyzfile, for example:

*xyzffff 0 1 pointedto.xyz

In that case, apply each substitution independently:

sed -E -e 's/xyzfile /xyz /' -e 's/( [^ ]* [^ ]*) .*\.xyz$/\1/' myfile.inp
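A short sketch of the difference, using the two example lines from this answer. The variable names are hypothetical, and the input is piped in rather than read from myfile.inp.

```shell
# One line with xyzfile and one without.
input='*xyzfile 0 1 pointedto.xyz
*xyzffff 0 1 pointedto.xyz'

# Single combined substitution: the second line does not match at all.
combined=$(printf '%s\n' "$input" |
    sed -E -e 's/xyzfile( [^ ]* [^ ]*) .*\.xyz$/xyz\1/')

# Independent substitutions: the filename is removed from both lines.
independent=$(printf '%s\n' "$input" |
    sed -E -e 's/xyzfile /xyz /' -e 's/( [^ ]* [^ ]*) .*\.xyz$/\1/')

printf '%s\n' "$combined"
printf '%s\n' "$independent"
```

Be aware that the second independent expression is not tied to xyzfile, so it would also trim any other line that has at least three space-separated parts and ends in .xyz.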