
I'm using some old Fortran code with peculiar memory handling. To make a long story short, it runs on my local machine but fails on the remote one. This is why I would like to ssh to my local computer, run the code there, and copy the results back to the cluster I'm running my calculations on.

I already found exactly the same question on this forum:

EDIT #1

After the comment by @Anthon, I corrected my script, but unfortunately a new error occurred. NOTE: I am using ssh keys, so no passwords are needed.

My new script:

#! /bin/bash
# start from the machine_where_the_resutlst_are_needed

ssh usr@machene_wehere_i_run_the_code /home/run_dir_script/run.sh inp 8

# the ssh line above does the work by running a script; 8 jobs are run
# by sending them to the background

scp -p usr@machene_wehere_i_run_the_code:/home/run_dir_script/results \
  user@machine_where_the_resutlst_are_needed:~/

echo "I am back"

My problem is that run.sh is a master script calling other shell scripts, and they don't run properly. I get the following message:

/home/run_dir_script/run.sh: line 59: /home/run_dir_script/merge_tabs.sh: No such file or directory

Minimal Example:

Here is a condensed example of what I am doing

Example run.sh

#! /usr/bin/bash

pwd
echo "Run the code"
./HELLO_WORLD

The above script is run by

ssh usr@machene_wehere_i_run_the_code /home/run_dir_script/run.sh    

For completeness, the Fortran source for ./HELLO_WORLD:

program main
write(*,*) 'Hello World'
stop
end

Compile with gfortran -o HELLO_WORLD hello_world.F90

And here is the output

/home/run_dir_script/
Run the code
/home/run_dir_script/test.sh: line 5: ./home/HELLO_WORLD: No such file or directory

Remark:

The following will run `HELLO_WORLD` on the remote machine
ssh usr@machene_wehere_i_run_the_code /home/run_dir_script/HELLO_WORLD

So calling the code directly works fine. Calling it via the script fails.

Possible Solution:

This fails because after ssh I land in my remote machine's $HOME, so the relative path ./HELLO_WORLD is resolved there and not in the directory where the script lives.

Therefore, before executing the script, I have to cd into the proper directory. The correct method, besides using absolute paths everywhere, is:

 ssh usr@machene_wehere_i_run_the_code "cd /home/run_dir_script ; ./run.sh"

Another useful remark is that the variables defined in my .bashrc are undefined in this kind of session, so one has to be careful.

With this change it works.
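The effect described above can be reproduced locally without ssh. The following is a minimal sketch, not the original run.sh: it creates a throwaway directory (standing in for /home/run_dir_script), a shell stand-in for the HELLO_WORLD binary, and two hypothetical wrapper scripts, run_relative.sh and run_anchored.sh. The first relies on the caller's working directory; the second changes into its own directory first, which is an alternative to putting the cd into the ssh command.

```shell
#!/bin/bash
# Sketch: a relative path inside a script is resolved against the caller's
# working directory, not against the directory the script lives in.
demo_dir=$(mktemp -d)   # stands in for /home/run_dir_script

# Stand-in for the HELLO_WORLD binary (a shell script here, for portability).
cat > "$demo_dir/HELLO_WORLD" <<'EOF'
#!/bin/bash
echo "Hello World"
EOF

# Variant 1 (hypothetical name): depends on the caller's working directory.
cat > "$demo_dir/run_relative.sh" <<'EOF'
#!/bin/bash
./HELLO_WORLD
EOF

# Variant 2 (hypothetical name): changes into its own directory first.
cat > "$demo_dir/run_anchored.sh" <<'EOF'
#!/bin/bash
cd "$(dirname "$0")" || exit 1
./HELLO_WORLD
EOF

chmod +x "$demo_dir/HELLO_WORLD" "$demo_dir/run_relative.sh" "$demo_dir/run_anchored.sh"

# Call both from a different directory, as happens when ssh starts in $HOME.
relative_out=$(cd / && "$demo_dir/run_relative.sh" 2>&1)
relative_status=$?
anchored_out=$(cd / && "$demo_dir/run_anchored.sh" 2>&1)
anchored_status=$?

echo "relative: exit status $relative_status"
echo "anchored: exit status $anchored_status, output: $anchored_out"

rm -rf "$demo_dir"
```

The relative variant fails with the same "No such file or directory" message, while the anchored variant finds the program regardless of where it is called from.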

  • What does line 59 of /home/run_dir_script/run.sh (on the remote machine) say? What is the output of ls -l /home/run_dir_script/merge_tabs.sh (on the remote machine)? – Marki Nov 16 '14 at 14:24
  • At line 59 the script merge_tabs.sh is run. Basically I can call a script on the remote machine, but this script can't call other scripts. All scripts and binaries have permissions 777, therefore, permissions should not be causing this problem – Alexander Cska Nov 16 '14 at 14:41
  • Doesn't answer the question about what ls -l /home/run_dir_script/merge_tabs.sh says. ;-) – Marki Nov 16 '14 at 14:43
  • The output is: -rwxrwxrwx 1 user users 374 Nov 14 15:41 /home/run_dir_script/merge_tabs.sh*. Actually this problem is not only for scripts but for all codes run within run.sh. I wrote a small Fortran code that prints "Hello World" and run.sh issued the same error. – Alexander Cska Nov 16 '14 at 14:49
  • I suppose the partition is not mounted "noexec" or the like? (Seems not as the initial script correctly executes.) "No such file or directory" is pretty clear. Are you sure there's no typo in there somewhere? As a sidenote, I'd refrain from making everything writable by everyone. – Marki Nov 16 '14 at 14:51
  • @Marki - If it was set noexec then how does the initial script run? Though I agree w/ you the error is pretty self explanatory. – slm Nov 16 '14 at 15:06
  • @AlexanderCska - does the ./HELLO_WORLD run when you simply login to the remote and execute it? – slm Nov 16 '14 at 15:13
  • Yes without any problems. It also runs if I just call it by ssh usr@machene_wehere_i_run_the_code /home/run_dir_script/HELLO_WORLD. – Alexander Cska Nov 16 '14 at 15:15
  • @AlexanderCska - change to #!/bin/bash. – slm Nov 16 '14 at 15:19
  • Still the same. #!/bin/bash -l didn't work either. – Alexander Cska Nov 16 '14 at 15:21
  • @AlexanderCska - does the script work if you swap out ./HELLO_WORLD with just ls? – slm Nov 16 '14 at 15:23
  • @slm After countless experiments I discovered that the script needs the hardcoded absolute path. So /home/run_dir_script/HELLO_WORLD should replace line 5 in run.sh. It is quite strange that the path should be hardcoded. For instance $(pwd)/HELLO_WORLD won't work. Can you try running my minimal example on some remote machine of yours? I am curious what your test would yield. – Alexander Cska Nov 16 '14 at 15:36
  • @AlexanderCska - That's something unique to your env. I can use relative paths just fine. It may be something with the fact that it's Fortran. I have no Fortran code to confirm this but suspect that it's just your setup. – slm Nov 16 '14 at 15:37
  • Fortran is not the problem. I could have generated the same errors with a bash script doing echo "Hello World", or I could have used C or Java or whatever. In this case somehow the shell can't find the path, but I have no idea why. – Alexander Cska Nov 16 '14 at 16:01

2 Answers


I would attempt to put the arguments to ssh in double quotes.

ssh usr@machene_wehere_i_run_the_code "/home/run_dir_script/run.sh inp 8"

Also based on that error message it sounds like the script cannot find this script:

/home/run_dir_script/run.sh: line 59: /home/run_dir_script/merge_tabs.sh: No such file or directory

Also I'd block the scp from happening if the ssh doesn't return a successful status:

ssh usr@machene_wehere_i_run_the_code "/home/run_dir_script/run.sh inp 8"
status=$?

if [ "$status" -eq 0 ]; then
  scp -p usr@machene_wehere_i_run_the_code:/home/run_dir_script/results \
    user@machine_where_the_resutlst_are_needed:~/
fi

The bottom line, though, is that your script has trouble locating the subordinate scripts on the remote system. Some variables may be set when you log in and run your script interactively but not when you run your script through a single ssh command.

For these I would compare the output of env using both methods.
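As a rough sketch of that comparison: the helper below, missing_vars, is a hypothetical function (not an existing tool) that lists variable names present in one env dump but absent from another. The two dumps would come from running `env > login_env.txt` in an interactive login on the remote machine and `ssh usr@machene_wehere_i_run_the_code env > ssh_env.txt` from the other side; here they are replaced by small made-up sample files, and MY_LIB_DIR is an invented variable name. The helper assumes one variable per line, so multi-line values would confuse it.

```shell
#!/bin/bash
# Hypothetical helper: print the names of variables that appear in the
# first env dump but not in the second.
missing_vars() {
  names_a=$(mktemp)
  names_b=$(mktemp)
  cut -d= -f1 "$1" | sort -u > "$names_a"   # keep only the variable names
  cut -d= -f1 "$2" | sort -u > "$names_b"
  comm -23 "$names_a" "$names_b"            # lines unique to the first file
  rm -f "$names_a" "$names_b"
}

# Made-up sample dumps standing in for the real output of env.
work=$(mktemp -d)
printf 'HOME=/home/usr\nPATH=/usr/bin\nMY_LIB_DIR=/opt/lib\n' > "$work/login_env.txt"
printf 'HOME=/home/usr\nPATH=/usr/bin\n' > "$work/ssh_env.txt"

missing=$(missing_vars "$work/login_env.txt" "$work/ssh_env.txt")
echo "Set in the login session but not over ssh: $missing"

rm -rf "$work"
```

Any name this prints is a candidate for something the script silently depends on.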

slm

There is nothing on the line after ssh -X usr@machene_wehere_i_run_the_code in your code. So that command logs in on machene_wehere_i_run_the_code and does nothing.

In the example ssh call in the accepted answer of the question you quote there is an extra parameter:

ssh user@host path_to_script

and the path_to_script is missing in yours.

Anthon
  • OK your comment solved part of the problem. I changed the script according to your prescription and I was able to get my run.sh going. I also edited my question. Unfortunately run.sh calls other scripts inside and they don't run properly. – Alexander Cska Nov 16 '14 at 14:13
  • @AlexanderCska What are the permissions on the HELLO_WORD file? – Anthon Nov 16 '14 at 15:33
  • 1
    @AlexanderCska You should not change your question except for providing additional detail to get the original problem solved if there are other issues, start a new question. You just invalidate any answers (correct or not) that answered the orginal problem by changing the question (which includes extending a question with "now I have the following problem"). – Anthon Nov 16 '14 at 15:38
  • I apologize for this. Well HELLO_WORLD is a fortran code and by default has permissions -rwxr-xr-x. Anyway I fiugred out that the the absolute path is needed, that is /home/run_dir_script/HELLO_WORLD. I also figured out that I have to hardcode the path, therefore, $(pwd)/HELLO_WORLD won't work. – Alexander Cska Nov 16 '14 at 16:06
  • @AlexanderCska Not a big problem, just something to watch. It is kind of strange that you need to give the absolute path, I have no explanation for that, that is why I did not suggest it. – Anthon Nov 16 '14 at 16:15
  • Hi, I overlooked something. When I run ssh usr@machene_wehere_i_run_the_code /home/run_dir_script/run.sh, it lands in $HOME and looks for everything there. Of course my stuff is located in /home/run_dir_script/. When using $(pwd) I get /home/, which is wrong. Therefore I need to tell the ssh command to go to the proper directory: ssh usr@machene_wehere_i_run_the_code "cd /home/run_dir_script ; ./run.sh". Bim bam boom, and it works. – Alexander Cska Nov 16 '14 at 16:30
  • @AlexanderCska But then the output from the script as in your question cannot be correct, (/home/run_dir_script/\n Run the code) or am I missing something? Anyway good it is resolved. – Anthon Nov 16 '14 at 16:52
  • Yes, I corrected it. I made a mistake when copy-pasting. Actually I would like to use the opportunity to ask you one last question. Is it possible to execute long-running codes like this? My code takes about 5 min to run, and it just stops after a few seconds. – Alexander Cska Nov 16 '14 at 17:34
  • @AlexanderCska That should work. Is the connection broken when it stops? Is it waiting for interaction? Try starting the program in a tmux session to see whether it expects some interaction or stdin input, which of course would not work over the network. – Anthon Nov 16 '14 at 18:50
  • Hi, the problem was undefined variables. I source a script in my .bashrc. Unfortunately it looks as if the remote run via ssh is not executing the .bashrc. Therefore I added an extra command at the beginning to source the file, and it all works, clean as a whistle. Last but not least, thank you for your help. Actually it was the main precursor that guided me on the path to success. – Alexander Cska Nov 16 '14 at 19:00
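The fix from the last comment can be sketched as follows. This is an illustration under assumptions, not the actual run.sh: code_env.sh is a hypothetical settings file standing in for whatever .bashrc sources, and MY_LIB_DIR is an invented variable. Whether bash reads ~/.bashrc for a command passed to ssh depends on the setup, and many default .bashrc files return early for non-interactive shells, so sourcing the needed file explicitly avoids relying on either behaviour. The sketch uses `env -i` to start the script with an empty environment, which roughly imitates a session where those variables are missing.

```shell
#!/bin/bash
# Sketch: a script that loads its own settings instead of relying on .bashrc.
work=$(mktemp -d)

# Hypothetical settings file, standing in for the script sourced from .bashrc.
cat > "$work/code_env.sh" <<'EOF'
export MY_LIB_DIR=/opt/lib
EOF

# The script locates its own directory and sources the settings from there.
cat > "$work/run.sh" <<'EOF'
#!/bin/bash
here=$(cd "$(dirname "$0")" && pwd)
. "$here/code_env.sh"
echo "MY_LIB_DIR=$MY_LIB_DIR"
EOF

# env -i clears the environment; PATH is passed through so bash and
# dirname can still be found.
result=$(env -i PATH="$PATH" bash "$work/run.sh")
echo "$result"

rm -rf "$work"
```

Because the settings travel with the script, the same run.sh behaves the same way in an interactive login and when it is started through ssh.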