2

I am trying to work with awk

    datetime=`date +\%Y\%m\%d`
    logdatetime=`date +\%Y\%m\%d`
    foldermonth=`date +\%B_\%Y`
    folderday=`date +\%d`
    inputdir="~/sboper/Standalone/fsplit/GLMS"
    outputdir="~/sboper/StandAlone/input/$foldermonth"
    outputdirday="~/sboper/StandAlone/input/$foldermonth/GLMS_Daily/$folderday"

    awk -v outdir=$outputdirday 'BEGIN{ FS = "~" } ;{if( $12 == "A" ) filename="outdir/Customer_Create_Records.dat" ; print >> filename;close(filename)}' $inputdir/$sourcefile

    awk -v outdir=$outputdirday 'BEGIN{ FS = "~" } ;{if( $12 == "M" ) filename="outdir/Customer_Modify_Records.dat" ; print >> filename;close(filename)}' $inputdir/$sourcefile

However, the variable outdir=$outputdirday is not expanding to the path of my output directory.

I have declared the awk variable and did not use a shell variable inside the awk script, but it still gives me the error: cannot open Customer_Create_Records.dat.

Where am I going wrong with this script? When I put the complete path in place of 'outdir', it creates the new files and moves the records, but it doesn't work when the path comes from a variable. How can I create new files in the directory I want through awk?

A.M.
  • Awk can read files; don't pipe cat into it, it is unnecessary. Also try quoting the variable. – 123 Jun 07 '16 at 12:48
  • Are you sure your shell variable $outputdirday is correctly defined at that point? – Michael Vehrs Jun 07 '16 at 12:57
  • This is how I defined it: datetime=date +\%Y\%m\%d logdatetime=date +\%Y\%m\%d foldermonth=date +\%B_\%Y folderday=date +\%d inputdir="~/sboper/Standalone/fsplit/GLMS" outputdir="~/sboper/StandAlone/input/$foldermonth" outputdirday="~/sboper/CMStandAlone/input/$foldermonth/GLMS_Daily/$folderday" – A.M. Jun 07 '16 at 13:06
  • @123 - it was giving me syntax error when i used filename and gave another file to read as input - cat removed that error. Also I tried quoting the variable, still same results. – A.M. Jun 07 '16 at 13:37

3 Answers

3

Wow.  Where do I begin?

  • You need to learn how to debug.  How to debug a bash script? is a highly upvoted Stack Exchange reference on the topic, with contributions from several of our most prolific members.  You can find other information on this site, and lots more on the web.  For instance,
    • Simplify.
      • Get rid of folderday, foldermonth and the `date …` assignments; just construct outputdirday with constant values.
      • Avoid confusingly similar variable names.  For example, outdir is equal to outputdirday, but outputdir is something different.
      • Eliminate datetime, logdatetime, and outputdir, which are not even used in the script.
      • Eliminate redundancy.  You show two awk commands that are identical except for a trivial difference.  Does one of them work and one of them fail?  Do they fail in different ways?  I don’t think so.  Then don’t show them both.
      • Don’t use long, eight-level pathnames (e.g., ~/sboper/StandAlone/input/$foldermonth/GLMS_Daily/$folderday/Customer_Create_Records.dat) unless you really really need to.
    • For presentation purposes (i.e., when presenting your problem to others), avoid confusing naming conventions, like naming your output directory …/input/….
    • For presentation purposes, try to avoid lines that are wider than the screen (so people have to scroll to read them in their entirety).  Most systems have ways to let you split commands and programs into multiple lines.
    • Do you really have two directories — one called Standalone and one called StandAlone — in your ~/sboper directory?
    • If the end result isn’t what you want, find out what’s happening in the middle of the process.  Display intermediate values.  For example,
      • outputdirday.  When I run your script and do echo "$outputdirday" before the awk, I get
        ~/sboper/StandAlone/input/June_2016/GLMS_Daily/08

        I.e., because the ~ was in quotes, it didn’t get expanded to my home directory.  I had to change the assignment to

        outputdirday="$HOME/sboper/StandAlone/input/$foldermonth/GLMS_Daily/$folderday"

        to get it to work.

      • outdir.  You say it isn’t getting set correctly, but have you tried to print it?  When I do, I get the same value as I do for outputdirday.
      • filename.  This is where (part of) the problem is; see below.
  • When you’re asking for help, you should
    • Simplify the problem, as discussed above.
    • Describe what your command is supposed to be doing.
    • Show what your input file looks like.
    • Show what you expect your output to look like.
    • Identify the versions of the software you are using.  While I believe that I understand what’s going wrong with your script, I can’t reproduce your errors exactly (so, I guess I’m running a different version of awk).  It probably isn’t really important, but you don’t even say what operating system you’re on.
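
The tilde behaviour described above can be checked in isolation. This is a minimal sketch, assuming a POSIX shell with HOME set; the variable names quoted_path, unquoted_path and home_path are made up for the illustration, and the directory names are only placeholders.

```shell
#!/bin/sh
# Hypothetical variable names; the directories do not need to exist.
quoted_path="~/sboper/StandAlone/input"      # tilde inside quotes stays a literal "~"
unquoted_path=~/sboper/StandAlone/input      # unquoted tilde at the start of the value expands
home_path="$HOME/sboper/StandAlone/input"    # $HOME expands inside double quotes

# Display the intermediate values before handing any of them to awk.
printf 'quoted:   %s\n' "$quoted_path"
printf 'unquoted: %s\n' "$unquoted_path"
printf 'home:     %s\n' "$home_path"
```

Only the first value keeps the literal ~, which is why awk cannot open a file below it.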


Here’s the important part:

↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓

  • If, as I suggested above, you had broken the main action block in your awk script down into shorter lines, you might have realized that it was equivalent to
    {
          if ( $12 == "A" )
                filename="outdir/Customer_Create_Records.dat"
          print >> filename
          close(filename)
    }

    As Michael Vehrs pointed out in his answer, this sets filename conditionally, and then writes every input line to filename unconditionally.  As he suggests, you need to do something like

    {
          if ( $12 == "A" ) {
                filename="outdir/Customer_Create_Records.dat"
                print >> filename
                close(filename)
          }
    }
  • If, as I suggested above, you had printed the value of filename after setting it, you would have seen that it was
    outdir/Customer_Create_Records.dat

    because "outdir" (in quotes) means the string outdir, and not the value of the variable outdir.  You need to do

    filename = outdir "/Customer_Create_Records.dat"

↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
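
The difference between the quoted string and the variable can be seen without touching any files. A small sketch, assuming any POSIX awk; /tmp/out is a placeholder directory name and nothing is written to it.

```shell
#!/bin/sh
# /tmp/out is a placeholder; both commands only print the name they would use.
literal=$(awk -v outdir="/tmp/out" 'BEGIN { filename = "outdir/Customer_Create_Records.dat"; print filename }')
joined=$(awk -v outdir="/tmp/out" 'BEGIN { filename = outdir "/Customer_Create_Records.dat"; print filename }')

printf 'literal: %s\n' "$literal"   # the word outdir taken as text
printf 'joined:  %s\n' "$joined"    # the value of the awk variable outdir, then the file name
```

In awk, placing two expressions next to each other concatenates them, so the variable has to sit outside the quotes.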

Almost as important:

↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓

  • You should always quote your shell variable references (e.g., "$outputdirday", "$inputdir", and "$sourcefile") unless you have a good reason not to, and you’re sure you know what you’re doing.  That doesn’t appear to be the cause of this problem, but eventually it will get you into trouble.
  • Just for clarity, you might want to change `…` to $(…).

↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
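
To illustrate why the quoting matters, here is a minimal sketch, assuming the default IFS. The helper count_args and the directory value are hypothetical; the value contains a space on purpose.

```shell
#!/bin/sh
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

dir="My Documents/GLMS"          # placeholder value containing a space

unquoted=$(count_args $dir)      # word splitting turns the value into two arguments
quoted=$(count_args "$dir")      # quoting keeps it as one argument
today=$(date +%Y%m%d)            # $(...) form; no backslashes needed before %

printf 'unquoted: %s, quoted: %s, today: %s\n' "$unquoted" "$quoted" "$today"
```

A command that receives two arguments where one was intended (awk, cp, mkdir) will operate on the wrong names.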

  • While you should always quote your shell variable references unless you have a good reason not to, the same does not apply to backslashes.  You should use them only when you have a reason to.  For example, date +%Y%m%d and date +%B_%Y, etc., work just fine.
  • As I suggested earlier, for what you appear to be doing, you don’t need two awk invocations.  You can do the whole thing with

    awk -v outdir="$outputdirday" '
        BEGIN { FS = "~"
                filenameA = outdir "/Customer_Create_Records.dat"
                filenameM = outdir "/Customer_Modify_Records.dat"
              }
              {
                if ($12 == "A") print >> filenameA
                if ($12 == "M") print >> filenameM
              }
        ' "$inputdir/$sourcefile"
    

    or

    awk -v outdir="$outputdirday" '
        BEGIN { FS = "~"
                filenameA = outdir "/Customer_Create_Records.dat"
                filenameM = outdir "/Customer_Modify_Records.dat"
              }
        $12 == "A" { print >> filenameA}
        $12 == "M" { print >> filenameM}
        '    "$inputdir/$sourcefile"
    

    As far as I can tell, you don’t need to close the file(s) at all; files are typically closed automatically when a program exits.
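
The single-invocation version can be tried on throwaway data before pointing it at real files. This is only a sketch: the three sample records, the temporary directory from mktemp, and the names workdir and inputfile are all made up for the test.

```shell
#!/bin/sh
# Sample data and temporary paths are made up for this illustration.
workdir=$(mktemp -d)
inputfile="$workdir/sample.txt"

# Three records with 12 "~"-separated fields; field 12 is A, M, or X.
printf '%s\n' \
    'c1~2~3~4~5~6~7~8~9~10~11~A' \
    'c2~2~3~4~5~6~7~8~9~10~11~M' \
    'c3~2~3~4~5~6~7~8~9~10~11~X' > "$inputfile"

awk -v outdir="$workdir" '
    BEGIN { FS = "~"
            filenameA = outdir "/Customer_Create_Records.dat"
            filenameM = outdir "/Customer_Modify_Records.dat"
          }
    $12 == "A" { print >> filenameA }
    $12 == "M" { print >> filenameM }
    ' "$inputfile"

# Read the results back, then remove the temporary directory.
created=$(cat "$workdir/Customer_Create_Records.dat")
modified=$(cat "$workdir/Customer_Modify_Records.dat")
printf 'create: %s\nmodify: %s\n' "$created" "$modified"
rm -r "$workdir"
```

The record whose twelfth field is X matches neither pattern and is written nowhere.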

  • Thanks a lot for all the effort you have taken to answer my queries, i am a shell script beginner and your post will definitely help. Just FYI - this was a small snippet from complete script so some of the details may have annoyed you. – A.M. Jun 10 '16 at 13:25
2

Your question is not very clear to me, but I think you may intend:

if( $12 == "A" ) {
    filename="outdir/Customer_Create_Records.dat" ;
    print >> filename;close(filename)
}

Without the curly braces, the "then" clause of the conditional ends at the first semicolon. As a consequence, you might attempt to write lines to your output file before its name has been defined.
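
The effect of the missing braces can be reproduced with two made-up records. In this sketch the second field stands in for $12 of the real data, and the variable tag stands in for filename; both are assumptions for the illustration.

```shell
#!/bin/sh
# Two made-up records; the second field stands in for $12 of the real data.
input='first~A
second~B'

# Without braces: only the assignment is conditional, print runs for every line.
without=$(printf '%s\n' "$input" | awk -F'~' '{ if ($2 == "A") tag = "create"; print tag ":" $1 }')

# With braces: both statements run only when the condition holds.
with=$(printf '%s\n' "$input" | awk -F'~' '{ if ($2 == "A") { tag = "create"; print tag ":" $1 } }')

printf 'without braces:\n%s\n' "$without"
printf 'with braces:\n%s\n' "$with"
```

Without the braces the second record is printed too, using the value left over from the first record. Had the first record not matched, tag would still be empty at the first print, which corresponds to writing to a file whose name has not been set.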

  • Congratulations; you’ve found part of the problem (which I missed).  But, IMO, you haven’t explained it as well as you could; somebody could read your answer and believe that it has something to do with splitting the statements into separate lines. Please [edit] your answer to make it clearer and more complete. – G-Man Says 'Reinstate Monica' Jun 09 '16 at 01:49
1

Problem was both with my awk syntax and the input file i was providing to awk. "$inputdir/$sourcefile" - this was not expanding into a single filename.

Thanks to everyone who posted an answer; it helped me understand my mistakes. The resulting code looks like the below (this is a snippet from a bigger script, so not all the variables I declared relate to the snippet).

    timestamp=`date +\%Y\%m\%d\%H\%M\%S`
    logfile=/udd001/sboper/CMStandAlone/input/logs/SBSA_GLMS_CUST_$timestamp.log
    datetime=`date +\%Y\%m\%d`
    logdatetime=`date +\%Y\%m\%d`
    foldermonth=`date +\%B_\%Y`
    folderday=`date +\%d`
    inputdir="/udd001/sboper/CMStandalone/fsplit/GLMS"
    outputdir="/udd001/sboper/CMStandAlone/input/$foldermonth"
    outputdirday="/udd001/sboper/CMStandAlone/input/$foldermonth/GLMS_Daily/$folderday"
    inputfile=$( ls -1 /udd001/sboper/CMStandalone/fsplit/GLMS/GCP1_cdf_001_$datetime*.txt )
    sourcefile=`basename $inputfile`

    if [ ! -d "$outputdirday" ]; then
            mkdir -p "$outputdirday"
            fi
                    if [ -f "$inputdir/$sourcefile" ]
                    then
                    cp "$inputdir/$sourcefile" "$outputdirday"

                    echo "output directory : $outputdirday"
                     awk -v outdir="$outputdirday" '
                        BEGIN { FS = "~"
                        filenameA = outdir"/SBSA_Amdocs_Customer_Create_Records.dat"
                        filenameM = outdir"/SBSA_Amdocs_Customer_Modify_Records.dat"
                        }
                        $12 == "A" { print >> filenameA}
                        $12 == "M" { print >> filenameM}
                        '    "$inputdir/$sourcefile"

        fi        
A.M.
  • 1
    (1) *You should always quote your shell variable references (e.g., "$inputdir", "$inputfile", "$outputdirday", and "$sourcefile") unless you have a good reason not to, and you’re sure you know what you’re doing.* (2) You should probably use backslashes only when you have a reason to. For example, date +%Y%m%d and date +%B_%Y, etc., work just fine. (3) You should avoid parsing the output from ls (i.e., doing $(ls …)). (4) Your new code defines a variable called filename and then never uses it. (5) When you post code, please indent it properly. – G-Man Says 'Reinstate Monica' Jun 11 '16 at 03:29
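
For point (3), one possible way to pick up the input file without parsing ls is to let the shell expand the pattern in a for loop. This is a sketch only: the temporary directory, the sample file name and the date value are placeholders, and taking the first match is an assumption about what the script wants.

```shell
#!/bin/sh
# Placeholder directory and file name, created only for this illustration.
workdir=$(mktemp -d)
touch "$workdir/GCP1_cdf_001_20160607_a.txt"
datetime=20160607

inputfile=""
for candidate in "$workdir"/GCP1_cdf_001_"$datetime"*.txt; do
    # An unmatched glob stays as the literal pattern, so check the file exists.
    if [ -f "$candidate" ]; then
        inputfile=$candidate
        break
    fi
done

sourcefile=$(basename "$inputfile")
printf 'sourcefile: %s\n' "$sourcefile"
rm -r "$workdir"
```

If no file matches, inputfile stays empty, so a real script should test for that before calling basename or awk.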