
I have a text file with newline-delimited strings. I need to process each line as follows: shuffle the order of its tokens, using space as the delimiter.

For example:

Input: A B C

Output: C A B

Running the command/script repeatedly should of course produce a different order each time.

My current solution (for a single text line):

$ cat <file> | tr " " "\n" | shuf | tr "\n" " "

Is there a nicer (better) command-line combination to process a text file with multiple lines?

αғsнιη

7 Answers

6

POSIXly, you could do it with awk relatively efficiently (certainly more efficiently than running at least one GNU shuf utility for each line of the input) as:

awk '
  BEGIN {srand()}    # seed the PRNG (by default from the current epoch time)
  {
    # shuffle the fields of this line by repeated random swaps
    for (i = 1; i <= NF; i++) {
      r = int(rand() * NF) + 1    # random field index in [1, NF]
      x = $r; $r = $i; $i = x     # swap field i with field r
    }
    print
  }' < your-file

(Note that in most awk implementations, running the same command twice within the same second is likely to give the same result, as the default random seed used by srand() is generally based on the current epoch time in seconds.)
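If that matters, one workaround (a sketch, assuming a shell such as bash that provides $RANDOM) is to pass an explicit seed in, so that two runs within the same second still differ:

awk -v seed="$RANDOM" '
  BEGIN {srand(seed)}    # explicit seed instead of the time-based default
  {
    for (i = 1; i <= NF; i++) {
      r = int(rand() * NF) + 1
      x = $r; $r = $i; $i = x
    }
    print
  }' < your-file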

4

Your original command can be simplified to

shuf -e A B C | tr "\n" " " && echo ""

or

shuffled=( $(shuf -e A B C) ) ; echo ${shuffled[*]}

I think this is a little less hacky, and it was also faster in my rudimentary tests.
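If you want to run a rough comparison yourself (a sketch; absolute timings will of course vary per system):

time for i in {1..1000}; do shuf -e A B C | tr "\n" " " > /dev/null; done
time for i in {1..1000}; do shuffled=( $(shuf -e A B C) ); echo "${shuffled[*]}" > /dev/null; done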

If you have a file at ~/test which contains

A B C
D E F

You can shuffle and echo each line with the following command:

while IFS= read -r line; do shuffled=( $(shuf -e $line) ) ; echo ${shuffled[*]} ; done < ~/test

or in script form:

#!/bin/bash
while IFS= read -r line
    do shuffled=( $(shuf -e $line) )
    echo ${shuffled[*]}
done < ~/test

You might want to replace ~/test with $1 so you can pass the file name as an argument to the script, as in the sketch below.
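For instance (a minimal sketch; the script name shuffle-lines.sh is made up):

#!/bin/bash
# Usage: ./shuffle-lines.sh ~/test
while IFS= read -r line
    do shuffled=( $(shuf -e $line) )
    echo ${shuffled[*]}
done < "$1"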

result:

B C A
E D F

How this works:

shuf -e appears to split on spaces as well as newlines, but only because the shell passes A B C to it as three separate arguments.

So shuf -e A B C will shuffle A, B and C, but shuf -e "A B C" will not, because it only sees a single argument.
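A quick demonstration (the shuffled ordering is random, so yours will likely differ):

$ shuf -e A B C
B
C
A
$ shuf -e "A B C"
A B C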

We can use this to read each line into an array and then print it out again with echo.

while IFS= read -r line;

Reads each line into $line as it is fed into the loop with <.

do shuffled=( $(shuf -e $line) )

Builds an array in the $shuffled variable from each line, by expanding shuf -e $line to, literally, shuf -e A B C.

echo ${shuffled[*]}

Echoes our array, by default printing the elements separated by single spaces.

< ~/test

feeds lines from ~/test into our loop.
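One caveat: because $line is left unquoted so that the shell word-splits it, glob characters such as * in the input would also be expanded. A sketch of a guard is to disable globbing around the loop:

#!/bin/bash
set -f    # disable filename globbing so tokens like * pass through literally
while IFS= read -r line
    do shuffled=( $(shuf -e $line) )
    echo ${shuffled[*]}
done < ~/test
set +f    # restore globbing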

Zhenhir
3

Given

$ cat file
A B C
D E F
G H I J

then using shuffle from perl's List::Util module (-a autosplits each line into @F, -p prints $_ after each line, and -l handles the newlines):

$ perl -MList::Util=shuffle -alpe '$_ = join " ", shuffle @F' file
C B A
E D F
I J G H

With bash read -a and shuf (but very inefficient, as it runs three utilities per line, two of which are not builtins):

$ while read -ra arr; do shuf -e -- "${arr[@]}" | paste -sd ' ' -; done < file
A C B
F E D
J I G H
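If that per-line overhead matters, a sketch of a pure-bash alternative (no external utilities at all) is a Fisher-Yates shuffle over the words of each line:

while read -ra arr; do
    for ((i = ${#arr[@]} - 1; i > 0; i--)); do
        j=$((RANDOM % (i + 1)))           # random index in [0, i]
        tmp=${arr[i]}; arr[i]=${arr[j]}; arr[j]=$tmp
    done
    echo "${arr[*]}"
done < file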
steeldriver
1

To pass the tokens of a line as individual parameters:

shuf -e one two three four is what you need.

shuf -e $(cat <file>) | tr "\n" " " for a file with one line, as in your example.

For multiple lines:

while read -r line; do shuf -e $line | tr "\n" " " && echo; done < <file>

Quora Feans
1

While, like @steeldriver, I would use a proper text-processing tool such as perl for this job, I'll mention a hacky way with the zsh shell:

while read -rA words; do
  print -r -- /(e['reply=($words)']noe['REPLY=$RANDOM'])
done < your-file

It's a bit of a hack. We end up using filename generation so as to be able to use the o glob qualifier, which lets us implement arbitrary sorting orders.

Here, we're globbing / (which we know always exists), using the e glob qualifier to replace it with the contents of our array, and then ordering the result numerically (n) based on the REPLY=$RANDOM expression (oe[...]).
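Breaking that glob down piece by piece (a rough annotation):

#   /                      a pattern that always matches (the root directory)
#   e['reply=($words)']    replace the match with the elements of $words
#   n                      compare the sort keys numerically
#   oe['REPLY=$RANDOM']    order by the value each word's REPLY=$RANDOM yields,
#                          i.e. a fresh random key per word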

0

Here's one way to do it using the "much maligned" C shell (note that it prints the shuffled tokens one per line):

% foreach line ( "`cat input.txt`" )
     set tokens = ( $line:x )
     foreach ran_idx ( `seq $#tokens | shuf` )
         printf '%s\n' ${tokens[$ran_idx]:q}
     end
end
0

Here's a simpler one: put your string into an array and use shuf to shuffle it.

SA=($"A B C")
shuf -e ${SA[@]}
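Note that shuf -e prints one token per line; to get them back onto a single line (a small sketch), let an unquoted command substitution rejoin them:

echo $(shuf -e "${SA[@]}")    # word-split shuf's output; echo rejoins with spaces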
doca