
I am trying to capitalize the content of files whose names match the pattern "_base.txt", then write the results to another folder, using the original filename with "_cap.txt" appended.

I plan to run this on 3,000 files, so I am testing it on a few files first.

First, I was able to get the content capitalized, but the output was only printed and not redirected to any file.

BASE="/home/dir/input/"
find "$BASE" -type f -iname "*.txt" -exec awk '{print toupper($0)}' {} + 

Then I changed it as follows, which let me write the output to a new file whose name ends with "_cap.txt":

BASE="/home/dir/input/"
find "$BASE" -type f -iname "*_pv_bind_basefile.txt" -exec awk '{print toupper($0) >> (FILENAME "_cap.txt")}' {} +  

The problem showed up when I tried to write the output to a specific folder:

BASE="/home/dir/input/"
OUT_FOLDER="/home/dir/output/"
find "$BASE" -type f -iname "*_pv_bind_basefile.txt" -exec awk '{print toupper($0) >> ($OUT_FOLDER FILENAME "_cap.txt")}' {} +

I got the following error message

awk: illegal field $(), name "OUT_FOLDER" input record number 1, file /home/dir/input/TAB1.a1.001.txt source line number 1

I tried several variations to write the files out and failed miserably after a few hours... It must be something simple that I overlooked or don't know how to do. Any suggestions would be highly appreciated. Thank you.

Molly_K

1 Answer

Shell variables and Awk variables are different things.

If you export a variable to the environment, you can access it via awk's ENVIRON array, so you could do the following (note: I omitted the find, since it's not central to the issue):

export OUT_FOLDER="/home/dir/output/"
awk '{print toupper($0) >> (ENVIRON["OUT_FOLDER"] FILENAME "_cap.txt")}'

Alternatively, you can pass the shell variable's value into an awk variable using the -v option:

OUT_FOLDER="/home/dir/output/"
awk -v out_folder="$OUT_FOLDER" '{print toupper($0) >> (out_folder FILENAME "_cap.txt")}'
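
As a rough, self-contained illustration of the -v approach, here is a sketch that can be run in a scratch area. The `WORK` directory and the sample file `a_base.txt` are made up for the demo and are not from the question. awk is run from inside the input directory so that FILENAME is a bare name rather than a full path.

```shell
# Hypothetical scratch directories and sample file, created only for this demo.
WORK=$(mktemp -d)
mkdir -p "$WORK/input" "$WORK/output"
printf 'hello\nworld\n' > "$WORK/input/a_base.txt"

OUT_FOLDER="$WORK/output/"

# The shell expands "$OUT_FOLDER" outside the single quotes and hands the
# value to awk as the awk variable out_folder.
# The subshell keeps the cd from affecting the rest of the script.
(
  cd "$WORK/input" || exit 1
  awk -v out_folder="$OUT_FOLDER" \
    '{ print toupper($0) >> (out_folder FILENAME "_cap.txt") }' a_base.txt
)

# Show the capitalized copy written to the output folder.
cat "$WORK/output/a_base.txt_cap.txt"
```

The parentheses around the concatenation after `>>` are there because some awk implementations parse an unparenthesized concatenation in that position differently.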
steeldriver
  • Thanks @steeldriver it's very helpful. Now I just need to figure out how to add basename command into awk. :D – Molly_K Mar 23 '19 at 12:49
  • @Molly_K you could probably use the awk split function to implement a basename, e.g. n = split(FILENAME,p,"/"); print p[n]. But perhaps you should consider using -execdir in place of -exec in the find command instead. – steeldriver Mar 23 '19 at 14:33
  • Thank you @steeldriver, I followed your suggestion using split(FILENAME,p,"/"); print p[n], but I was not able to get it to work. I got the following error message: awk: bailing out at source line 1. I understand this is not part of the original question, and I am happy to open another question if you think that's more appropriate. The full command line I used was find "$BASE" -type f -iname "*sefile.txt" -exec awk '{print toupper($0) >> (ENVIRON["OUT_FOLDER"] (n = split(FILENAME,p,"/"); print p[n]) "_cap.txt")}' {} + I also tried putting split before print toupper($0), which also did not work. – Molly_K Mar 25 '19 at 14:36
  • @Molly_K print p[n] was just for illustration: to construct a filename from the value you would use something more like '{n = split(FILENAME,p,"/"); print toupper($0) >> (ENVIRON["OUT_FOLDER"] p[n] "_cap.txt")}'. But I recommend you use -execdir if possible so that FILENAME will be relative to its parent directory instead. – steeldriver Mar 25 '19 at 14:44
  • Thank you @steeldriver. I was able to execute the awk command under the proper directory using -execdir. Also, moving n = split(FILENAME,p,"/") in front of print toupper works as well. Thank you very much for the quick response. I have read the man page for find and am continuing to read the awk book by Dale Dougherty. – Molly_K Mar 25 '19 at 16:16
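
For reference, the pieces from this thread can be put together roughly as follows. This is only a sketch under assumptions: the `WORK` scratch directory and the two sample files are invented for the demo, and the `FNR == 1` block with `close()` is an addition that was not discussed above. It is there because some awk implementations limit how many output files can be open at once, which could matter with thousands of inputs.

```shell
# Hypothetical scratch tree with two sample inputs, created only for this demo.
WORK=$(mktemp -d)
mkdir -p "$WORK/input/sub" "$WORK/output"
printf 'alpha\n' > "$WORK/input/one_pv_bind_basefile.txt"
printf 'beta\n'  > "$WORK/input/sub/two_pv_bind_basefile.txt"

BASE="$WORK/input/"
export OUT_FOLDER="$WORK/output/"   # exported so awk can read it via ENVIRON

find "$BASE" -type f -iname "*_pv_bind_basefile.txt" -exec awk '
  FNR == 1 {
    # First record of each input file: close the previous output file,
    # then derive the new output name from the last path component.
    if (out != "") close(out)
    n = split(FILENAME, p, "/")
    out = ENVIRON["OUT_FOLDER"] p[n] "_cap.txt"
  }
  { print toupper($0) >> out }
' {} +

# List what was written to the output folder.
ls "$WORK/output"
```

Note that because only the last path component is kept, two input files with the same name in different subdirectories would be appended to the same output file.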