
Our school's HPC cluster has no scheduler, so there is no job queue. Hence, I cannot automate parallel job submission with qsub or sbatch.

To "submit" a job, I have been using screen: type screen, press Enter, type ./runMyJob.sh, then press CTRL+a followed by d to detach.

Now I wish to automate/script the process of starting several parallel screen sessions, then running a job in each session, and finally detaching all the screen sessions.

As you can see, the manual procedure involves pressing Enter and CTRL+a followed by d. How do I script these key presses?

P.S.: Any suggestion that helps achieve parallel job submission on an HPC without a scheduler is very welcome!

2 Answers


Don't think of it in terms of pressing keys; think of it in terms of accomplishing a task. Pressing keys is a way to do things interactively. The way to accomplish things automatically is to script them.

To start a job in screen and detach it immediately, run

screen -md ./runMyJob.sh

If you want to make your jobs easier to find, you can pass the option -S to set a session name.

For example, you can write the following script, which uses the name of the job executable as the session name:

#!/bin/sh
screen -md -S "${1##*/}" "$@"

Call it something like submit, put it in a directory on your PATH ("Single-user binary installation location?" and "How to add home directory path to be discovered by Unix which command?" may help), and make it executable (chmod +x ~/bin/submit). To start a job, run

submit ./runMyJob.sh
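
To start several jobs at once, the same idea can be wrapped in a loop. The following is a minimal sketch, not a tested recipe for your cluster: the function name submit_all and the SCREEN_BIN variable are made up for illustration (SCREEN_BIN only exists so that a dry run with a stand-in command is possible), and it uses the common -dm spelling of the detach options.

```shell
#!/bin/sh
# Start each job given as an argument in its own detached screen session.
# SCREEN_BIN is a hypothetical override; it defaults to the real screen.
submit_all() {
    for job in "$@"; do
        name=${job##*/}        # session name = base name, no "/" allowed
        "${SCREEN_BIN:-screen}" -dm -S "$name" "$job"
    done
}

submit_all "$@"
```

Saved under a name such as submit-all, it could be called as submit-all ./job1.sh ./job2.sh ./job3.sh. While the jobs are running, screen -ls should list one session per job.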

For parallel execution, you may want to investigate GNU parallel.
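
As a rough illustration of what that could look like, the sketch below assumes GNU parallel is installed (not the unrelated moreutils tool of the same name). The function name run_parallel and the MAX_JOBS variable are invented for this example; -j and ::: are GNU parallel's own options for the job limit and the argument list.

```shell
#!/bin/sh
# Run every command given as an argument, at most MAX_JOBS at a time.
# MAX_JOBS is a hypothetical variable for this sketch; it defaults to 4.
run_parallel() {
    # With no command before ":::", each argument is itself run as a command.
    parallel -j "${MAX_JOBS:-4}" ::: "$@"
}

if [ "$#" -gt 0 ]; then
    run_parallel "$@"
fi
```

Since parallel stays in the foreground until all jobs have finished, it may make sense to start it inside a single detached screen session so that it survives a logout.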

Note that a job submission framework does more than start jobs. It also arranges for them to run where there is available CPU time and memory, and to send logs to the submitters. You should request that your administrators set up a proper job submission framework.
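
Until then, at least the logging part can be approximated by hand. This is only a sketch under assumptions: submit_logged and SCREEN_BIN are hypothetical names, and the log file is simply named after the session and written to the current directory.

```shell
#!/bin/sh
# submit_logged JOB [ARGS...]: run JOB in a detached screen session and
# append its stdout and stderr to <session name>.log in the current directory.
# SCREEN_BIN is a hypothetical override; it defaults to the real screen.
submit_logged() {
    name=${1##*/}
    # Inside sh -c, $0 is the session name and "$@" is the job with its arguments.
    "${SCREEN_BIN:-screen}" -dm -S "$name" \
        sh -c '"$@" >>"$0.log" 2>&1' "$name" "$@"
}

if [ "$#" -gt 0 ]; then
    submit_logged "$@"
fi
```

This does nothing about placing jobs where CPU time and memory are available, which is the part a real scheduler is for.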

  • This is truly enlightening. But I got me@pc:~/code> screen -md ./myJob.sh Must run suid root for multiuser support. – Sibbs Gambling Oct 08 '14 at 12:55
  • @FarticlePilter This error doesn't make sense for this command line since it doesn't invoke multiuser support. What is in your ~/.screenrc, and in your system screenrc (typically /etc/screenrc or /etc/screen/screenrc)? – Gilles 'SO- stop being evil' Oct 08 '14 at 15:16
  • Strangely, I do not have ~/.screenrc, but I do have /etc/screenrc. It says `# this is the global screenrc file. Handle with care.

    termcapinfo xterm* G0:is=\E[?4l\E>:ti@:te@ termcapinfo linux me=\E[m:AX` Oh, btw, I am doing this on a HPC server.

    – Sibbs Gambling Oct 08 '14 at 15:21
  • @FarticlePilter Ah, got it. You actually ran my script and not screen -md ./myJob.sh, right? There mustn't be a / in the session name. I've fixed my script to use only the base name of the executable as the session name and not the full path. – Gilles 'SO- stop being evil' Oct 08 '14 at 16:24

Enter, Ctrl-a and d generate normal ASCII codes.
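
To make that concrete, here is a small sketch that emits the three bytes and dumps them in hex. It assumes Enter is delivered as a newline (0x0a); Ctrl-a is 0x01 and d is the ordinary letter (0x64). The function name key_bytes is made up for this example.

```shell
#!/bin/sh
# Emit the bytes for Enter (as a newline), Ctrl-a and d.
key_bytes() {
    printf '\n\001d'
}

# Dump them as hexadecimal bytes: 0a, 01 and 64.
key_bytes | od -An -tx1
```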

So a possible solution could be a program that creates an unnamed pipe (pipe()), then fork()s a child process that first binds the read end of the pipe to its standard input and then executes screen (execve() or similar). Once that program is running, you can write the command lines required to start a job to the write end of the pipe.

How you put the tasks into that program is another topic. You could, for example, write a small scheduler yourself (something like a queue of jobs and some code that starts at most N processes in parallel).
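
A very small version of such a scheduler might look like the sketch below. It is a simplification: it works in batches (start N jobs, wait for all of them, start the next N) rather than refilling a slot as soon as one job ends, and the function name run_queue is invented for this example. Each job is passed as a command-line string.

```shell
#!/bin/sh
# run_queue N JOB...: run the given command lines, at most N at a time.
run_queue() {
    max=$1
    shift
    running=0
    for job in "$@"; do
        sh -c "$job" &              # start the job in the background
        running=$((running + 1))
        if [ "$running" -ge "$max" ]; then
            wait                    # block until the whole batch has finished
            running=0
        fi
    done
    wait                            # wait for the last, possibly partial, batch
}

if [ "$#" -ge 2 ]; then
    run_queue "$@"
fi
```

For example, run_queue 2 ./job1.sh ./job2.sh ./job3.sh would run the first two jobs together and the third one afterwards.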

== EDIT ==

For Linux (or maybe some UN*Xes, too), the program could look like the following:

#include <sys/types.h>
#include <sys/wait.h>        /* waitpid() */
#include <unistd.h>

int main(void) {
    int fds[2] = {0};
    pid_t pid = 0;

    pipe(fds);               /* Creates an unnamed pipe */
    pid = fork();            /* Forks a new process */
    if (pid == 0) {
        static char *argv[] = {"/usr/bin/screen", NULL};
                             /* Note: The array might need to be changed,
                              *       depending on your requirements
                              *       (eg. command-line arguments)
                              */
        close(fds[1]);       /* The child only reads from the pipe */
        dup2(fds[0], STDIN_FILENO);
                             /* Binds the read end of the pipe to stdin */
        execv(argv[0], argv);
                             /* execv() keeps the parent's environment */
        _exit(127);          /* Only reached if execv() failed */
    }
    close(fds[0]);           /* The parent only writes to the pipe */
    sleep(1);                /* Waits for the child process to execute
                              * screen */
    char const data[] = "./MyJob.sh\n\x01" "d";
                             /* Note: \x01 is the ASCII code of C-a. The
                              *       literal is split so that the 'd' is
                              *       not parsed as part of the hex escape.
                              */
    write(fds[1], data, sizeof(data) - 1);
                             /* Writes the name of the job along with the
                              * control codes to the child process
                              * (without the terminating NUL byte)
                              */
    int retcode = 0;
    waitpid(pid, &retcode, 0);
                             /* Waits for the child process to terminate */
                             /* Note: WEXITSTATUS(retcode) gets the exit
                              *       status of the child process */
    return 0;
}

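This program only illustrates the idea; it lacks the necessary error handling. Note also that screen may refuse to start if its standard input is not a terminal, in which case a pseudo-terminal would be needed instead of a plain pipe.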

Abrixas2
  • Sorry but I didn't get it. Could you please write a sample, assuming my job is MyJob.sh? Thanks a lot – Sibbs Gambling Oct 04 '14 at 16:36
  • @FarticlePilter: Added example. Please note that I have not tested that code, so it might not work without some changes (additional to those needed to do proper error handling). – Abrixas2 Oct 04 '14 at 17:01