As noted by steeldriver, the width of c can never be longer than your limit:
c = substr($0, 1, 5)
The length of c can never be > 5.
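A quick way to convince yourself, as a minimal sketch: the two input lines and the shell variable name out are made up for illustration. It takes the first 5 characters of each line, trims trailing spaces with sub(), and prints the resulting length, which cannot exceed 5.

```shell
# Hypothetical input lines; only the first 5 characters of each are kept.
out=$(printf 'abcdefghijklmn\nab   cdefg\n' | awk '
{
    c = substr($0, 1, 5)     # at most 5 characters
    sub(/ +$/, "", c)        # trimming can only make c shorter
    print length(c)
}')
printf '%s\n' "$out"
```

Whatever the input, a test such as length(c) > 5 placed after these two statements can never be true.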
Beyond that, the output is blank / empty because there is a syntax error in the awk script. The error message should be printed to the shell unless you do something like 2>/dev/null.
That no longer applies after the latest update, but from what I can see it was not corrected by you. Just to be clear:
if( (length(c) > p && NR > 1 )
# ^
# +--- Never closed.
Besides that, your edit also reveals more. You do not need \ to continue the script on the next line.
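As a rough sketch of what that looks like (the input lines and the threshold variable p are invented for illustration): inside the single-quoted program a newline simply ends a statement, and awk also accepts a line break right after {, && or ||, so a condition can be split across lines without any backslash.

```shell
# Made-up input: a header line, one long line, one short line.
out=$(printf 'header\nabcdefg\nabc\n' | awk -v p=5 '
{
    if (length($0) > p &&
        NR > 1) {
        print "too long: " $0
    }
}')
printf '%s\n' "$out"
```

Backslashes are only needed outside the quotes, where it is the shell that has to be told the command continues, for example when each -v option is put on its own line.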
Using semicolons at the end of statements is OK, but for the code to be more readable do not mix styles. Either use ; at the end of all statements, or none, unless, of course, you for some reason have more than one statement on a line. So:
Not:
printf "%s: %d", $1, $2;
++foo
++bar;
printf "%s: %d", $3, $4
But:
printf "%s: %d", $1, $2
++foo
++bar
printf "%s: %d", $3, $4
Or (not widely used from what I have seen):
printf "%s: %d", $1, $2;
++foo;
++bar;
printf "%s: %d", $3, $4;
There is also the question of the concept itself: using substr() on $0 and trimming the result with sub().
The default field separator of awk is <space>. This is treated differently than other single-character delimiters. That is: runs of multiple blanks are merged into one separator. Thus both lines in:
A B C
A    B      C
Result in:
$1 == A
$2 == B
$3 == C
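To see the difference, here is a small sketch that counts fields with the default separator and with an explicit single-space regex, -F'[ ]'. The input lines and the shell variable names are made up for illustration.

```shell
# Two made-up lines: single spaces, then runs of two and three spaces.
input=$(printf 'A B C\nA  B   C\n')

# Default FS: a run of blanks counts as one separator.
default_nf=$(printf '%s\n' "$input" | awk '{ print NF }')

# FS given as the regex [ ]: every single space is a separator,
# so consecutive spaces produce empty fields.
single_nf=$(printf '%s\n' "$input" | awk -F'[ ]' '{ print NF }')

printf 'default:\n%s\nsingle space:\n%s\n' "$default_nf" "$single_nf"
```

With the default separator both lines have 3 fields; with [ ] the second line is split into 6, three of them empty.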
As for the issue at hand, you could possibly do something like this:
awk \
    -v width_max=5 \
    -v field_validate=1 \
'
BEGIN {
    err_count = 0
}
# Skip the header line.
$1 == "header" {
    next
}
# Report lines where the field is missing or wider than width_max.
NF < field_validate || length($field_validate) > width_max {
    printf "%s:%d:%d:%s\n", FILENAME, NF, FNR, $0 > "/dev/stderr"
    ++err_count
}
# Print only the error count on stdout.
END {
    printf "%d", err_count
}
' sample
Note that you would perhaps want to do the NF check as a separate rule. Something like:
NF != field_count {
# NF does not match with required fields
}
where field_count is a variable you define.
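Putting the two checks side by side could look roughly like the sketch below. The variable names follow the ones used above; the sample data, the NR == 1 header test and the wording of the messages are invented for illustration. The messages go to stderr, so the shell variable err_count only receives the number printed in END.

```shell
err_count=$(printf 'header x y\nAA BB CC\nAAAAAAA BB CC\nAA BB\n' | awk \
    -v field_count=3 \
    -v width_max=5 \
    -v field_validate=1 \
'
NR == 1 { next }                       # skip the header line
NF != field_count {
    printf "line %d: expected %d fields, got %d\n", NR, field_count, NF > "/dev/stderr"
    ++err_count
    next
}
length($field_validate) > width_max {
    printf "line %d: field %d is too wide\n", NR, field_validate > "/dev/stderr"
    ++err_count
}
END { printf "%d", err_count }
')
printf '%s\n' "$err_count"
```

Because END uses printf "%d", the captured value is a number even when no rule ever incremented err_count.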
Here is a simple example script you can experiment with in regard to FS, NF, etc.:
awk -v field_count=3 \
'
NF != field_count {
    printf "NF mismatch %d != %d\n", NF, field_count
}
{
    printf "<%s><%s><%s>\n", $1, $2, $3
}
' <<EOF
AA BB CC
AA  BB  CC
AA   BB    CC
AA BB
AA BB CC DD
EOF
[…] if. Aside from that, why would the length of substr($0,m,p) ever be greater than p? – steeldriver May 18 '21 at 23:36
[…] gawk -o- 'script' on your awk scripts before posting if you don't know how to format them reasonably. – Ed Morton May 19 '21 at 11:46
"the variable error_count is always giving me blank instead of zero" - I can now see that you have no variable named error_count in your code, and if you meant the awk variable named count instead, it's impossible for that to print as blank since you set it to 0 with -v count=0 on the command line and only ever increment it in your code. And if you meant the shell variable err_count instead, it's also impossible for that to be null since it's set to the value printed by the awk command, which will always be numeric unless the awk command fails to open the input or similar. – Ed Morton May 19 '21 at 14:40
[…] echo "$error_count" after your code runs but you have no such variable and actually meant to do echo "$err_count" instead. – Ed Morton May 19 '21 at 14:45
c=substr($0,m,p) is creating a string c of length p, then sub(" +$", "", c) is removing any spaces from c, and then length(c) > p is testing if the resulting c is longer than length p. It simply cannot be. c MUST be length p or less as it starts out as length p and then you may remove chars from it but you never add chars to it. Start with 10 apples and then remove 0 to 10 of them and then see if you now have 11 or more apples. – Ed Morton May 19 '21 at 14:50
[…] echo "$error_count" or similar. If you had included that I expect you'd have got an answer immediately when you asked the question. It's important to include the line where the failure occurs when you ask a question in future, otherwise people are just guessing at what the problem might be. I posted my comment as an answer now. – Ed Morton May 19 '21 at 18:20