-2

This works:

sudo sed 's/good times/bad times/' Chapter1.html > output/Chapter1.html

This does NOT work:

sudo sed 's/good times/bad times/' Chapter*.html > output/Chapter*.html

This does not work either:

sudo sed 's/good times/bad times/' *.html > output/*.html

As there are fifty chapters, can I get sed to work with wildcards?

Chris Davies
  • Yes, that is exactly what I am trying to do – MikeLieberman Dec 18 '22 at 12:44
  • When I put the * in the original post, it would not display. – MikeLieberman Dec 18 '22 at 12:45
  • Use a for loop ... there are plenty of duplicates... – don_crissti Dec 18 '22 at 12:45
  • OK, consider me stupid, which I may be, but I do not understand how to use the command below in my context. If 'f' is a number, where is the input file name?

    for f in outfile_n???.csv; do sed -n '100013,200013p' "$f" > ptally_"$f" done

    – MikeLieberman Dec 18 '22 at 13:02
  • To get this closer to your command, change ptally_ to output/ and realize that the item after -n is the sed command, so replace both that and the -n with your sed command... – user10489 Dec 18 '22 at 13:08
  • for 50 in *.html; do sed -n 's/good/bad/' "$f" > output/"$f" done ----- gives me: Bad for loop variable – MikeLieberman Dec 18 '22 at 13:15
  • @MikeLieberman the syntax is for variable in list; do something; done. Here, the list is created by expanding the glob (the "wildcard") and the variable can have whatever name you want. For example: for file in Chapter*.html; do sed 's/good times/bad times/' "$file" > output/"$file"; done. By the way, never run commands like this with sudo unless you are working on files owned by root. – terdon Dec 18 '22 at 13:25

3 Answers

3

As others have commented, this requires a loop, not just a wildcard.

For example, to do this with a shell for loop:

for f in ./*.html; do
  sed 's/good times/bad times/' "$f" > "output/$f"
done

This sets variable f to each .html filename in turn, executing the code inside the loop for each iteration of the loop. See below for why I used ./*.html instead of just *.html.

Notice how this has the bare-word f in the for statement itself (because that is where it is having its value set), but $f when the variable is used inside the loop (because that's where it's being expanded).

The variable expansions are also double-quoted, to ensure that they don't break the script (or worse) if they happen to contain white-space characters, or other characters with special meaning to the shell (such as ;, &, >, and many others). Failure to quote variables when they're being used is probably the number-one cause of shell scripting errors. To understand why, see "Why does my shell script choke on whitespace or other special characters?" and "$VAR vs ${VAR} and to quote or not to quote".
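To make the quoting point concrete, here is a small sketch that runs in a scratch directory. The directory and the file name "Chapter 1.html" are made up for the demonstration; the only point is that the name contains a space.

```shell
# Hypothetical demo: work in a scratch directory so no real files are touched.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir output
printf 'good times\n' > 'Chapter 1.html'   # note the space in the name

for f in ./*.html; do
  # Unquoted on purpose: the shell splits the value at the space.
  set -- $f
  unquoted_words=$#

  # Quoted: the value stays a single word.
  set -- "$f"
  quoted_words=$#

  # Quoted, so sed reads the right file and the redirection
  # creates "output/./Chapter 1.html".
  sed 's/good times/bad times/' "$f" > "output/$f"
done

quoted_result=$(cat 'output/Chapter 1.html')
echo "unquoted: $unquoted_words words, quoted: $quoted_words word"
echo "$quoted_result"
```

Without the quotes, sed would be handed two file names that don't exist instead of the one that does.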

You can use whatever variable name you like, e.g.

for Chapter in ./*.html; do
  sed 's/good times/bad times/' "$Chapter" > "output/$Chapter"
done

Also worth noting: if there are no .html files in the directory, the shell leaves the pattern unexpanded and sets f (or Chapter) to the literal string ./*.html (or *.html if you left out the ./ prefix), unless you first turn on bash's nullglob option with shopt -s nullglob. From man bash:

nullglob

If set, bash allows patterns which match no files (see Pathname Expansion
above) to expand to a null string, rather than themselves.
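nullglob is a bash option. If you can't rely on it, one common alternative is to skip any name that doesn't exist. The sketch below assumes a scratch directory with no .html files in it; the processed counter is only there to show what happened.

```shell
# Hypothetical demo: a scratch directory that contains no .html files.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir output

processed=0
for f in ./*.html; do
  # With no matches, f holds the literal pattern "./*.html".
  # No file has that name, so skip it.
  [ -e "$f" ] || continue
  sed 's/good times/bad times/' "$f" > "output/$f"
  processed=$((processed + 1))
done
echo "processed $processed file(s)"
```

Without the guard, sed would complain that it can't read ./*.html and the redirection would leave an empty file literally named *.html in output.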


BTW, I used ./*.html with the for loop instead of just *.html in order to protect against filenames that sed might otherwise interpret as one of its command-line options.

As @StéphaneChazelas mentioned in a comment, if a filename starting with -e and ending with #.html were in the directory, sed would interpret that as a sed script to be executed. This is unlikely (but excrement occurs, as does malice) but it's good to program defensively as much as possible.

By using ./*.html, sed no longer sees an argument like -e1,$d#.html from a file with that name (which is a completely valid filename), so it never treats it as -e followed by the script 1,$d and a comment. Instead it sees ./-e1,$d#.html, which is not going to be interpreted as one of sed's command-line options: sed's options don't start with ./.

Also: because $f starts with ./, the output for a filename like foo.html will be redirected to output/./foo.html. This is perfectly fine: extra ./ elements in a path still resolve to the same destination. Even something absurd like output/./././[a million more ./s]/foo.html is still just output/foo.html.
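A quick way to convince yourself, sketched in a scratch directory with the hostile-looking file name from the example above (the directory and the file contents are made up):

```shell
# Hypothetical demo: a valid file name that looks like a sed option.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir output
printf 'good times\n' > './-e1,$d#.html'

for f in ./*.html; do
  # sed receives "./-e1,$d#.html", which cannot be mistaken for an option.
  sed 's/good times/bad times/' "$f" > "output/$f"
done

# output/./NAME and output/NAME are the same file.
hostile_result=$(cat 'output/-e1,$d#.html')
echo "$hostile_result"
```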

If you are using GNU sed (which is the standard sed on Linux) or (almost?) any modern version of sed, you can use -- to indicate the end of option arguments instead:

for f in *.html; do
  sed 's/good times/bad times/' -- "$f" > "output/$f"
done

or do both:

for f in ./*.html; do
  sed 's/good times/bad times/' -- "$f" > "output/$f"
done
cas
  • Note well that '$var' in single quotes is not replaced, where "$var" in double quotes is replaced with the value of var. – user10489 Dec 19 '22 at 05:22
  • Yes, but the OP is a complete newbie to shell. It would do more harm than good to bombard with too much detail - I almost didn't bother mentioning nullglob because of that. prefixing the filenames with ./ should work for any sed, without relying on a gnu sed feature. – cas Dec 19 '22 at 09:14
  • -- is not GNU specific, sed -- '...' "$f" or sed -e '...' -- "$f" works in any sed, but in the former case, the -- is only needed in GNU sed when $POSIXLY_CORRECT is not in the environment. (may also be needed in implementations that try to emulate the GNU sed behaviour like busybox') – Stéphane Chazelas Dec 19 '22 at 09:45
  • I wouldn't bet that ALL versions of sed understand --. All, or almost all, modern versions of sed will. But there will still be some ancient versions still in active use on ancient systems that don't. If you're going to go to the trouble of being nitpickingly pedantic, then why quit early? BTW, i just tested and -- works with busybox sed. – cas Dec 19 '22 at 09:55
1

The shell expands wildcards on the command line, and the command being run neither sees them nor knows what to do with them.

So if you had the files a.html b.html and output/b.html output/c.html and you ran

sed ... *.html > output/*.html

the command that would actually run would be

sed ... a.html b.html > output/b.html output/c.html

which is a syntax error (can't redirect to two files) and nothing like what you probably intended.

The solution here is to use a for loop and replace the * with the loop's index variable. Some quoting will be necessary if there are spaces in any of the filenames. There are plenty of examples of how to do this right in the duplicates for this question.
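As a minimal sketch of both points (the scratch directory and the files a.html and b.html are made up for the demonstration): first it shows the names the shell produces from the glob before any command runs, then it runs the loop with the variable in place of the *.

```shell
# Hypothetical demo: a scratch directory with two small files.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir output
printf 'good times\n' > a.html
printf 'more good times\n' > b.html

# The shell expands the glob first; a command only ever sees the names.
set -- ./*.html
nargs=$#
echo "a command would receive $nargs arguments: $*"

# One sed invocation and one output file per input file.
for f in ./*.html; do
  sed 's/good times/bad times/' "$f" > "output/$f"
done
```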

terdon
user10489
  • Based on the comments, the OP isn't clear on how to make the loop, so it would be a good idea to edit your answer and demonstrate. – terdon Dec 18 '22 at 13:23
  • I do appreciate your answer, truly. But I must truly be dense. I read your: for file in Chapter.html; do sed 's/good times/bad times/' "$file" > output/"$file"; done. But that just leaves me scratching my head. Here is my annotated confusion. :-) for file [file? what file?] in Chapter.html; do sed 's/good times/bad times/' "$file" [is this a variable, as $file has not been defined?] > output/"$file"; done. – MikeLieberman Dec 18 '22 at 13:37
  • @user10489 There is no syntax error, output/c.html would just be passed to sed as another input file. (Still not what's intended, of course.) – DonHolgo Dec 18 '22 at 14:30
  • @MikeLieberman Yes, that's a variable. for file in ... makes the variable file loop over all file names Chapter*.html. – DonHolgo Dec 18 '22 at 14:33
  • @DonHolgo But both of these produce errors: (1) for Chapter in Chapter.html; do sed -n 's/good/bad/' "$f" > "output/$f" done; [with the error: line 2: output/: Is a directory] and (2) for Chapter in Chapter.html; do sed -n 's/good/bad/' "$f" > "$f" done; line 2: : No such file or directory – – MikeLieberman Dec 18 '22 at 14:58
  • @DonHolgo This doesn't produce error but it doesn't produce any results. for Chapter in Chapter*.html; do sed -n 's/Good/Bed/' "$Chapter" > 'output/$Chapter' done; – MikeLieberman Dec 18 '22 at 15:12
  • @MikeLieberman for Chapter in Chapter*.html starts a loop with variable Chapter, so you should use $Chapter instead of $f (which is empty) in the loop. The attempt with 'output/$Chapter' should have produced a file called $Chapter in the output directory due to the single quotes. – DonHolgo Dec 18 '22 at 19:16
  • @DonHolgo I get an error of --- line 3: `$Chapter': not a valid identifier

    for $Chapter in Chapter*.html; do sed -n 's/Good/Bad/' $Chapter > output/$Chapter done;

    – MikeLieberman Dec 19 '22 at 01:25
  • @MikeLieberman Use for Chapter in ..., NOT for $Chapter in .... Use $ only when using/expanding a variable, not when setting it...same as you would use varname=value rather than $varname=value. A for loop is a fancy way of setting a variable (to multiple values, one after the other in sequence). – cas Dec 19 '22 at 02:18
-2

Assuming you are in the working directory that contains the files, you could use find

$ find . -name 'Chapter*' -exec sed 's/good times/bad times/woutput/{}' {} 2> /dev/null \;
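Two caveats about the command above: sed's w flag writes only the lines on which a substitution was made, so the files in output would not be complete copies, and POSIX leaves it implementation-defined whether find replaces a {} that is embedded in a longer argument. A sketch that writes whole files, assuming a find that supports -maxdepth (GNU and BSD find do; it is not in POSIX), hands each name to a small sh script instead. The scratch directory and file contents are made up for the demonstration.

```shell
# Hypothetical demo: a scratch directory with one two-line chapter.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir output
printf 'good times\nother line\n' > Chapter1.html

# -maxdepth 1 keeps find from descending into output/.
# The file name arrives in the inner shell as $1, so it is never
# pasted into the sed script or the shell code itself.
find . -maxdepth 1 -name 'Chapter*.html' -exec sh -c '
  sed "s/good times/bad times/" "$1" > "output/$1"
' sh {} \;

find_result=$(cat output/Chapter1.html)
echo "$find_result"
```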
terdon
sseLtaH