
I have lots of dump files in two different directories that need to be uploaded with rclone, and I want to run rclone for each dump file simultaneously to save time.

Dump Files:

Dir1:

-  /u05/expdpdump/exppdb/dir1/NoTDE_CDB_FULL1_01.dmp
-  /u05/expdpdump/exppdb/dir1/NoTDE_CDB_FULL1_02.dmp
-  /u05/expdpdump/exppdb/dir1/NoTDE_CDB_FULL1_03.dmp
-  /u05/expdpdump/exppdb/dir1/NoTDE_CDB_FULL1_04.dmp

Dir2:

-  /u05/expdpdump/exppdb/dir2/NoTDE_CDB_FULL2_01.dmp
-  /u05/expdpdump/exppdb/dir2/NoTDE_CDB_FULL2_02.dmp
-  /u05/expdpdump/exppdb/dir2/NoTDE_CDB_FULL2_03.dmp
-  /u05/expdpdump/exppdb/dir2/NoTDE_CDB_FULL2_04.dmp

The following rclone command needs to run for each dump file from each directory in the background:

```
rclone sync /u05/expdpdump/exppdb/NoTDE_CDB_FULL_01.dmp NoTDE_Mig1:IC_dbbackup_config_datapump_xxxxxx
```

I managed to get the desired output to run rclone for each file. However, I want to track the start and end time for each file as well as the overall elapsed time for the rclone jobs running in the background. Using time in the for loop only gives the elapsed time per file; I also need the overall start time, end time, and elapsed time after all jobs have completed.


```
dumpdir1="/u05/expdpdump/exppdb/dir1"

for i in "$dumpdir1"/*.dmp
do
    echo "time rclone sync $i NoTDE_Mig1:IC_dbbackup_config_datapump_v00rcfh_iad3p2 &"
done

dumpdir2="/u05/expdpdump/exppdb/dir2"

for i in "$dumpdir2"/*.dmp
do
    echo "time rclone sync $i NoTDE_Mig1:IC_dbbackup_config_datapump_v00rcfh_iad3p2 &"
done
```
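
What I am after, roughly, is something like this sketch (same remote as above; the date/wait bookkeeping is my own untested addition):

```bash
#!/bin/bash
# Sketch only: per-file start/end times plus overall elapsed time
# for rclone jobs running in the background.
remote="NoTDE_Mig1:IC_dbbackup_config_datapump_v00rcfh_iad3p2"

overall_start=$(date +%s)

for i in /u05/expdpdump/exppdb/dir{1,2}/*.dmp
do
    (
        echo "$(date '+%F %T') START $i"
        rclone sync "$i" "$remote"
        echo "$(date '+%F %T') END   $i"
    ) &
done

wait    # block until every background rclone job has finished

echo "Overall elapsed: $(( $(date +%s) - overall_start )) seconds"
```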


  • @user1133275 why would this be a duplicate of a question that explains the differences between parallel and xargs? Yes, both of those tools could be used here, but the question is about tracking the start and end time. – terdon Mar 29 '19 at 11:18

1 Answer


Something like this:

```
# `time` will give the total time
time parallel -j0 --joblog mylog rclone sync {} NoTDE_Mig1:IC_dbbackup_config_datapump_v00rcfh_iad3p2 ::: /u05/expdpdump/exppdb/dir{1,2}/*.dmp
# the log contains the time per file
cat mylog
```
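
For reference, --joblog writes one tab-separated line per job; assuming the usual column order (Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command; check your version's man page), the per-file start time and runtime can be pulled out like this:

```bash
# Sketch: print start time (epoch seconds) and runtime per command from the job log.
# Assumes the default joblog column order; verify with `man parallel`.
awk -F'\t' 'NR > 1 { printf "%s  start=%s  runtime=%ss\n", $NF, $3, $4 }' mylog
```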

Depending on your disks, you may get better performance if you do not run all commands in parallel. To limit the number of jobs running in parallel to 5, replace -j0 with -j5.

Ole Tange
  • Thanks Ole, does your suggestion provide the elapsed time for each file as well as the overall elapsed time for all rclone jobs? I see that prefixing time in the for loop will give me the elapsed time per file; however, I am not sure I can track the overall start and end time. – CoolChap007 Mar 27 '19 at 17:11
  • I get the following error

    -bash: parallel: command not found

    – CoolChap007 Mar 27 '19 at 17:18
  • @CoolChap007 so install it then! – Chris Davies Mar 27 '19 at 20:11
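
(In case it helps anyone else hitting that error: GNU parallel is usually available from the distribution's repositories, so something along these lines should install it; the package names below are the common ones and may differ on your system.)

```bash
# Install GNU parallel from the distro repositories (package names may vary).
sudo yum install parallel      # RHEL / Oracle Linux (EPEL may be needed)
sudo apt install parallel      # Debian / Ubuntu
```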