0

I am trying to write a small bash script that runs a command on each of the files listed in a file, which was generated with the find command.

I want to keep track of where the script stopped (it tends to crash) so I can resume from there. I managed to read my file and get the lines, but I'm currently stuck at the for loop. I want a C-style for loop that starts at the last line I stopped at, increments by one, and runs as long as the counter is smaller than the number of lines. I got this:

#!/bin/bash

LINES=$(wc -l < file.txt)
LASTLINE=$(grep -P '### Stop marker ###' file.txt | wc -l)
STARTFROM=$(($LINES - $LASTLINE))

for ((i = STARTFROM; i < LINES; i++)); do
    echo "we are processing file number $i"
    file=sed -n $i'p' file.txt
    ocrmypdf [some stuff] -input $file
done

Here is an excerpt of what my file.txt looks like:

./input_folder/hard_blurry.pdf
./input_folder/l_ordre_malte.pdf
### Stop marker ###
./input_folder/single_page.pdf
./input_folder/very_hard.pdf

When I run this I get... nothing. Bash doesn't enter the loop at all. I tried setting the integers directly, and it worked, which suggests to me that the variables are being read as strings.

I tried all of these ways of writing my variables:

for ((i = STARTFROM; i < LINES; i++));
for ((i = $((STARTFROM)); i < $((LINES)); i++));
for ((i = $(echo STARTFROM); i < $(echo LINES); i++));

and nothing works. I'm surprised that no errors are thrown either. My OS is Ubuntu 20.04.

The content of file.txt is the paths to the files I want to work with.

Any ideas? Thanks

Orsu
  • 111
  • Actually, this is not the way I would do it in a shell script. Are you aware that you are scanning the file several times for grepping, for line counting and finally for the actual task? I'd suggest to do a while loop over reading the file, set a flag on the Stop marker line and perform "some stuff" after the marker was set. – Philippos Apr 19 '21 at 06:53
  • 1
    LASTLINE is always 1, or how many stop markers do you have? – pLumo Apr 19 '21 at 07:07
  • I don't really understand what you're trying to do. Please provide some example input and expected output. What are you doing @ //some stuff ? Also make sure to read What is the XY Problem? and tell us about your "little bash script" which fails. – pLumo Apr 19 '21 at 07:08
  • @pLumo LASTLINE is supposed to be the last occurrence of my flag, so an int at most as big as the number of lines. "some stuff" can be anything; my problem is that I don't enter the loop. Currently it's echo "i is $i" – Orsu Apr 19 '21 at 07:18
  • @Philippos How would you do it? Please suggest something, if it's a problem you've already encountered. As for the grep, is it really repeated? I tried to only get the value returned by it, since I don't actually need to grep more than once each time I run the script. I'm also concerned about huge files – Orsu Apr 19 '21 at 07:24
  • 4
    LASTLINE=$(grep -P '### Stop marker ###' file.txt | wc -l) is not the line number, but the number of Stop markers in your file. And -P is not needed, you have a fixed string. – pLumo Apr 19 '21 at 07:33
  • You should echo your variables before the for loop... $LASTLINE might be 0 (e.g. because your stop marker is not found). Then you'd have LINES == STARTFROM, which would explain why the loop is not entered. – pLumo Apr 19 '21 at 07:40
  • 3
    Why is using a shell loop to process text considered bad practice?. Use perl or awk or python (or anything except shell) for processing text files, shell is precisely the wrong tool for the job....shell is a great wrapper for running other programs to process text, but terrible at processing text itself. – cas Apr 19 '21 at 07:41
  • BTW, to get the line number where a string occurs in a file, you can use grep -n. e.g. grep -n -F '### Stop marker ###' file.txt | head -n 1 | cut -d: -f1. But it's much easier to use awk. e.g. awk '/### Stop marker ###/ {print NR; exit}' file.txt – cas Apr 19 '21 at 07:49
  • @cas I think you're right. I was trying to refine a script to save some time, but I'll probably be better off using python directly – Orsu Apr 19 '21 at 07:50
  • 1
    If you want to track errors in your bash scripts, these could be useful: set -o errexit (abort on nonzero exit status), set -o nounset (abort on unbound variable) and set -o pipefail (don't hide errors within pipes). I actually have vim add those as an automatic header whenever I create a new *.sh file. Two of those commands are explained here if you want more info: https://ricma.co/posts/tech/tutorials/bash-tip-tricks/ . – jeremy Apr 19 '21 at 07:52
  • 1
    While I agree with the comments about incorrect shell scripting or bad shell practice, you could use set -x to print all commands the shell executes, including the values of variables in those commands. Switch debug output off with set +x. – berndbausch Apr 19 '21 at 11:47
  • 1
    @Orsu, I meant something like this: process=false; while read line; do if $process; then YOUR_PROGRAM_TO_DO_STUFF_WITH "$line"; fi; if [[ "$line" == "### Stop marker ###" ]]; then process=true; fi; done < file.txt. That said, if that "some stuff" can be done with python, use python for the file processing as well. – Philippos Apr 19 '21 at 11:50
  • 1
    Always paste your script into https://shellcheck.net, a syntax checker, or install shellcheck locally. Make using shellcheck part of your development process. – waltinator Apr 19 '21 at 17:41
  • You should be running "some stuff" directly from find using -exec. It is unclear what you want to achieve with your script. The computations that you seem to want to carry out make no sense from the point of view of wanting to continue processing from some stop point. – Kusalananda Apr 21 '21 at 06:56
  • @Orsu, looking at the lack of answers you're getting, I wonder, could you [edit] the question to add a short sample of the input file your script reads, and what exactly it's supposed to do with it? With all the important parts included, like those stop markers. Just some dummy data in place of the parts you already know how to do, of course. – ilkkachu Apr 21 '21 at 16:04
  • @ilkkachu sure, i'm doing it – Orsu Apr 21 '21 at 16:20

2 Answers

2
LASTLINE=$(grep -P '### Stop marker ###' file.txt | wc -l)

This will tell you how many lines match that pattern, but not where they are. If there's one marker in the file, this returns 1. You'd need to use something like grep -n (--line-number) to get the line numbers.
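As a rough sketch of that idea, the following finds the line number of the last marker rather than the number of markers. The temporary sample list and the lowercase variable names (list, marker_line, total) are made up for illustration; lowercase names also avoid clashing with LINES, which bash itself maintains.

```shell
#!/bin/bash
# Hypothetical sample list, mirroring the excerpt in the question.
list=$(mktemp)
printf '%s\n' \
    './input_folder/hard_blurry.pdf' \
    './input_folder/l_ordre_malte.pdf' \
    '### Stop marker ###' \
    './input_folder/single_page.pdf' \
    './input_folder/very_hard.pdf' > "$list"

# -n prefixes each match with its line number, -F treats the pattern as a
# fixed string, -x requires the whole line to match.
# tail -n 1 keeps the last match; cut keeps the number before the first colon.
marker_line=$(grep -nFx '### Stop marker ###' "$list" | tail -n 1 | cut -d: -f1)

# Arithmetic expansion strips any padding that wc may add on some systems.
total=$(( $(wc -l < "$list") ))

# marker_line is empty when no marker exists, so default to 0 for display.
echo "last marker on line ${marker_line:-0} of $total"
rm -f "$list"
```

With the marker's line number in hand, the first line still to be processed is simply the one after it.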

file=sed -n $i'p' file.txt

That should probably be file=$(sed ...), i.e. with a command substitution to capture the output of sed. But if you do that in a loop, sed reads the file again on every iteration, which is wasteful and will take ages if the file is long.
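For illustration only, this is what the corrected assignment could look like for a single line. The sample file and the value of i are invented for the sketch.

```shell
#!/bin/bash
# Hypothetical three-line sample file.
list=$(mktemp)
printf '%s\n' './a.pdf' './b.pdf' './c.pdf' > "$list"

i=2
# $( ... ) runs sed and captures what it prints; "${i}p" prints only line i.
# Without the command substitution, the shell would instead try to run a
# command named "-n" with file=sed in its environment.
file=$(sed -n "${i}p" "$list")

echo "line $i is '$file'"
rm -f "$list"
```

This is fine for fetching one line, but inside a loop it has the re-reading cost described above.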

That's the sort of thing the question Why is using a shell loop to process text considered bad practice? that was linked here earlier refers to. Note that the bad practice is processing, modifying text. Running commands based on some data in a file using the shell is just fine; the shell exists to run commands.

So, just loop over the file once and detect the stop marker in the shell:

#!/bin/bash
i=0
while IFS= read -r line; do 
    if [[ $line == '### Stop marker ###' ]]; then
        break;
    fi
    i=$((i + 1))
    echo "line $i, do some stuff with '$line'"
done < file.txt

([[ .. ]] is a Ksh-ism. It could be replaced with case in a POSIX shell.)
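A minimal sketch of that case variant, which should run under any POSIX sh. The sample file contents are invented for the example.

```shell
#!/bin/sh
# Hypothetical sample list with one stop marker.
list=$(mktemp)
printf '%s\n' './a.pdf' './b.pdf' '### Stop marker ###' './c.pdf' > "$list"

i=0
while IFS= read -r line; do
    # case does the string comparison that [[ ... == ... ]] did above.
    case $line in
        '### Stop marker ###') break ;;
    esac
    i=$((i + 1))
    echo "line $i, do some stuff with '$line'"
done < "$list"
rm -f "$list"
```

Because the loop reads from a redirection rather than a pipe, i keeps its value after the loop ends.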

Or, have some external text processing tool deal with the stop marker and have the shell just run the commands:

#!/bin/sh
i=0
< file.txt sed -n -e '/### Stop marker ###/q' -e p |
while IFS= read -r line; do
    i=$((i + 1))
    echo "line $i, do some stuff with '$line'"
done

If you really wanted to do a for (i = 0; i < end; i++) style loop, you could read the whole file into an array first, but unless you need random access to the lines, that's completely unnecessary. Streaming through the file is far more natural.
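For completeness, here is a sketch of the array approach, assuming bash 4 or later for mapfile. It resumes after the last marker, which is my reading of what the question is after; the sample file and the names entries, start and processed are hypothetical.

```shell
#!/bin/bash
# Hypothetical sample list, mirroring the excerpt in the question.
list=$(mktemp)
printf '%s\n' \
    './input_folder/hard_blurry.pdf' \
    './input_folder/l_ordre_malte.pdf' \
    '### Stop marker ###' \
    './input_folder/single_page.pdf' \
    './input_folder/very_hard.pdf' > "$list"

# mapfile reads one line per array element; -t strips the trailing newlines.
mapfile -t entries < "$list"

# Find the index just past the last stop marker (stays 0 if there is none).
start=0
for ((i = 0; i < ${#entries[@]}; i++)); do
    if [[ ${entries[i]} == '### Stop marker ###' ]]; then
        start=$((i + 1))
    fi
done

# C-style loop over the remaining entries; the echo stands in for the real work.
processed=()
for ((i = start; i < ${#entries[@]}; i++)); do
    echo "entry $((i + 1)): would process '${entries[i]}'"
    processed+=("${entries[i]}")
done
rm -f "$list"
```

The file is read only once here, at the cost of holding every line in memory.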

ilkkachu
  • 138,973
-2

You can try seq like this to get a range from variables:

#!/bin/bash
LINES=1
LASTLINE=10
for i in $(seq $LINES $LASTLINE )
do
 echo $i
done

output:

1
2
3
4
5
6
7
8
9
10
nextloop
  • 166
  • 1
    Doesn't work in bash, though it does in ksh93 and zsh (which are not allowed by the Q). There are at least dozens of existing Qs/As that cover this, so you evidently haven't read any of them or tested. – dave_thompson_085 Apr 21 '21 at 02:38
  • Yes, in bash {$var1..$var2} does not work; you should use seq to generate the numbers – nextloop Apr 21 '21 at 15:47
  • This answer describes a loop structure but doesn't really address OP question. – bu5hman Apr 23 '21 at 09:14