I am running a shell script on machineA that copies files from machineB and machineC to machineA.
If a file is not on machineB, then it is guaranteed to be on machineC. So I try to copy each file from machineB first, and if it is not there, I go to machineC to copy the same file.
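The machineB-first, machineC-second rule described above can be sketched as a small function. This is only a sketch under assumptions: `fetch_one` and `copy_with_fallback` are hypothetical names that do not appear in my script, and plain `rsync -a` is assumed as the transfer command. Note that any rsync failure (not only a missing file) would trigger the fallback.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper around the transfer command (assumption: rsync -a).
# $1 = remote source (user@host:path), $2 = local destination directory.
fetch_one() {
    rsync -a "$1" "$2/" 2>/dev/null
}

# Try each host in the order given and stop at the first one that succeeds.
# $1 = remote file path, $2 = local destination, remaining args = hosts.
copy_with_fallback() {
    local file="$1" dest="$2" host
    shift 2
    for host in "$@"; do
        if fetch_one "david@${host}:${file}" "$dest"; then
            return 0
        fi
    done
    echo "could not copy $file from any of: $*" >&2
    return 1
}
```

A call such as `copy_with_fallback /data/pe_t1_snapshot/20140317/t1_weekly_1680_0_200003_5.data /export/home/david/dist/primary machineB machineC` would then contact machineC only when the copy from machineB fails.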
On machineB and machineC there will be a folder named in the format YYYYMMDD inside this folder:
/data/pe_t1_snapshot
Whichever folder has the latest date in the YYYYMMDD format inside the above folder is the one I copy from. For example, if the latest date folder inside /data/pe_t1_snapshot is 20140317, then the full path is:
/data/pe_t1_snapshot/20140317
That is where I need to start copying the files from on machineB and machineC. I need to copy around 400 files to machineA from machineB and machineC, and each file is 2.5 GB.
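Picking the latest YYYYMMDD folder could be sketched as below. This is an assumption-laden sketch, not what my script does: the script sorts by modification time with `ls -dt1`, while this hypothetical `latest_snapshot` helper sorts by name, which gives the same order as the dates only because every folder name is assumed to be exactly eight digits.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read directory names on stdin, one per line,
# keep only names that are exactly eight digits, and print the largest.
# For YYYYMMDD names, lexical order and chronological order are the same.
latest_snapshot() {
    grep -E '^[0-9]{8}$' | sort | tail -n 1
}

# Example call (commented out because it needs ssh access to machineB):
# latest=$(ssh david@machineB ls /data/pe_t1_snapshot | latest_snapshot)
# echo "/data/pe_t1_snapshot/$latest"
```

Sorting by name rather than by modification time means that touching an old folder would not change which one is picked.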
Earlier, I was copying the files to machineA one by one, which is really slow. Is there any way I can copy "three" files at once to machineA using threads in a bash shell script?
Below is my shell script, which copies the files one by one to machineA from machineB and machineC.
#!/usr/bin/env bash
readonly PRIMARY=/export/home/david/dist/primary
readonly FILERS_LOCATION=(machineB machineC)
readonly MEMORY_MAPPED_LOCATION=/data/pe_t1_snapshot
PRIMARY_PARTITION=(0 548 272 4 544 276 8 556 280 12 552 284 16 256 564 20 260 560 24 264 572) # this will have more file numbers around 200
dir1=$(ssh -o "StrictHostKeyChecking no" david@${FILERS_LOCATION[0]} ls -dt1 "$MEMORY_MAPPED_LOCATION"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] | head -n1)
dir2=$(ssh -o "StrictHostKeyChecking no" david@${FILERS_LOCATION[1]} ls -dt1 "$MEMORY_MAPPED_LOCATION"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] | head -n1)
## Build your list of filenames before the loop.
for n in "${PRIMARY_PARTITION[@]}"
do
    primary_files="$primary_files :$dir1"/t1_weekly_1680_"$n"_200003_5.data
done
if [ "$dir1" = "$dir2" ]
then
    find "$PRIMARY" -mindepth 1 -delete
    rsync -avz david@${FILERS_LOCATION[0]}"${primary_files}" $PRIMARY/ 2>/dev/null
    rsync -avz david@${FILERS_LOCATION[1]}"${primary_files}" $PRIMARY/ 2>/dev/null
fi
So I am thinking: instead of copying one file at a time, why not copy "three" files at once, and as soon as those three files are done, move on to the next three files in the list and copy them at the same time?
I tried opening three PuTTY instances and copying one file from each instance at the same time. All three files were copied in ~50 seconds, which was fast for me. For this reason, I am trying to copy three files at once instead of one file at a time.
Is this possible? If yes, can anyone provide an example? I just want to give it a shot and see how it works out.
@terdon helped me with the above solution, but I wanted to try copying three files at once to see how it behaves.
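To make the three-at-a-time idea concrete, here is a rough sketch of what I have in mind, using background jobs and `wait`. Everything in it is an assumption for illustration: `copy_one` and `copy_in_batches` are hypothetical names, the snapshot folder is hard-coded to the 20140317 example, and plain `rsync -a` is assumed for the transfer.

```shell
#!/usr/bin/env bash

# Assumed values, taken from the script and example above; adjust as needed.
PRIMARY=/export/home/david/dist/primary
SNAPSHOT_DIR=/data/pe_t1_snapshot/20140317

# Hypothetical helper: copy one partition file, machineB first, then machineC.
copy_one() {
    local file="$SNAPSHOT_DIR/t1_weekly_1680_${1}_200003_5.data"
    rsync -a "david@machineB:$file" "$PRIMARY/" 2>/dev/null ||
        rsync -a "david@machineC:$file" "$PRIMARY/"
}

# Start up to $1 copies in the background, wait for that whole batch to
# finish, then start the next batch. Remaining args are partition numbers.
copy_in_batches() {
    local batch_size="$1" running=0 n
    shift
    for n in "$@"; do
        copy_one "$n" &
        running=$((running + 1))
        if [ "$running" -ge "$batch_size" ]; then
            wait            # block until every job of this batch is done
            running=0
        fi
    done
    wait                    # pick up the last, possibly smaller, batch
}

# Example call (commented out so that nothing is copied by accident):
# copy_in_batches 3 0 548 272 4 544 276
```

A bare `wait` discards the exit status of the individual copies, so failed files would have to be detected separately. Each batch also lasts as long as its slowest file; something like `xargs -P 3` would instead keep three copies running continuously.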
Update:
Below is a simplified version of the above shell script. It tries to copy files from machineB and machineC to machineA, as I am running the script on machineA. It tries to copy the file numbers that are present in PRIMARY_PARTITION.
#!/usr/bin/env bash
readonly PRIMARY=/export/home/david/dist/primary
readonly FILERS_LOCATION=(machineB machineC)
readonly MEMORY_MAPPED_LOCATION=/data/pe_t1_snapshot
PRIMARY_PARTITION=(0 548 272 4 544 276 8 556 280 12 552 284 16 256 564 20 260 560 24 264 572) # this will have more file numbers around 200
dir1=/data/pe_t1_snapshot/20140414
dir2=/data/pe_t1_snapshot/20140414
## Build your list of filenames before the loop.
for n in "${PRIMARY_PARTITION[@]}"
do
    primary_files="$primary_files :$dir1"/t1_weekly_1680_"$n"_200003_5.data
done
if [ "$dir1" = "$dir2" ]
then
    # delete the files first and then copy it.
    find "$PRIMARY" -mindepth 1 -delete
    rsync -avz david@${FILERS_LOCATION[0]}"${primary_files}" $PRIMARY/
    rsync -avz david@${FILERS_LOCATION[1]}"${primary_files}" $PRIMARY/
fi
ls output. – l0b0 Apr 21 '14 at 08:41
readonly declarations... – l0b0 Apr 21 '14 at 08:45
machineB and machineC into machineA, so I need to download three files at a time from machineB and machineC into machineA. If the files are not on machineB, then they should be on machineC for sure. So I will try to copy a file from machineB first; if it is not there, I will go to machineC to copy the same file. Let me know if anything is not clear. – arsenal Apr 21 '14 at 18:51