I have a lot of files named like this:

n2+_PiU_w4_5348757.out
n2+_PiU_w2_5348755.out
n2+_PiU_w1_5348742.out
n2+_PiU_w1_5348729.out
n2+_PiU_w1_5348696.out
n2+_PiU_st3_w3_part6_5630814.out
n2+_PiU_st3_w3_part6_5630721.out
n2+_PiU_st3_w3_part5_5630720.out
n2+_PiU_st3_w3_part4_5630813.out

The point is, their names can be completely different and I need to sort them by the number before .out, i.e. by their ID.

I had a look at some similar questions (Sort based on the third column, Linux sort last column), but I was not able to adapt their sed or awk solutions to my needs.

Would you, please, provide some way to sort them? Preferably using bash.

don_crissti
Eenoku
  • Based on the fact that you've accepted Roman's answer it looks like you just wanted to process a text file (its content doesn't really matter). As you can see, the other answer assumes you wanted to process file names (via shell glob). Next time please be more explicit. And btw, bash is not a text editor. – don_crissti Mar 03 '18 at 21:05
  • @don_crissti I wanted to process file names - I didn't know how to sort them, but I'm able to list filenames and send them via pipe. – Eenoku Mar 03 '18 at 22:17

3 Answers

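With GNU awk 4.0 or later, using an associative array keyed on the numeric (second-to-last) field and traversed in numeric index order (names that share an ID overwrite one another, so the IDs are assumed to be unique):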

printf '%s\0' * | gawk '
  BEGIN {
    RS="\000"; FS="[_.]"; 
    PROCINFO["sorted_in"]="@ind_num_asc"
  } 
  {
    a[$(NF-1)]=$0
  } 
  END {
    for (k in a) print a[k]
}' 

For example:

printf '%s\0' * | gawk 'BEGIN{RS="\000"; FS="[_.]"; PROCINFO["sorted_in"]="@ind_num_asc"} {a[$(NF-1)]=$0} END {for (k in a) print a[k]}' 
n2+_PiU_w1_5348696.out
n2+_PiU_w1_5348729.out
n2+_PiU_w1_5348742.out
n2+_PiU_w2_5348755.out
n2+_PiU_w4_5348757.out
n2+_PiU_st3_w3_part5_5630720.out
n2+_PiU_st3_w3_part6_5630721.out
n2+_PiU_st3_w3_part4_5630813.out
n2+_PiU_st3_w3_part6_5630814.out

Similarly, using a perl hash:

printf '%s\0' * | perl -F'[_.]' -aln0e '
  $h{$F[$#F-1]} = $_ }{ for $k (sort { $a <=> $b } keys %h) {print $h{$k}}
'
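Both the gawk array and the perl hash keep only one name per ID, so it can be worth checking for repeated IDs first. A minimal sketch, assuming names arrive one per line on stdin and follow the `..._ID.out` pattern; `dup_ids` is a hypothetical helper name, not something from the answer:

```shell
# Hypothetical helper: read file names (one per line) on stdin and
# print every ID (the field between the last "_" and ".out") that
# occurs more than once.
dup_ids() {
  # Split on "_" or "."; the ID is the second-to-last field.
  awk -F'[_.]' '{ print $(NF-1) }' |
    sort -n |   # bring equal IDs next to each other
    uniq -d     # keep one copy of each repeated ID
}
```

If `printf '%s\n' * | dup_ids` prints nothing, every ID is unique and the array-based commands above will not drop any name.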
steeldriver

With zsh globs:

$ printf '%s\n' *_<->.out(noe'(REPLY=${REPLY##*_})')
n2+_PiU_w1_5348696.out
n2+_PiU_w1_5348729.out
n2+_PiU_w1_5348742.out
n2+_PiU_w2_5348755.out
n2+_PiU_w4_5348757.out
n2+_PiU_st3_w3_part5_5630720.out
n2+_PiU_st3_w3_part6_5630721.out
n2+_PiU_st3_w3_part4_5630813.out
n2+_PiU_st3_w3_part6_5630814.out
  • <->: any sequence of digits (<x-y> with no bounds)
  • (...): glob qualifier
  • n: numerical order
  • oe'(code)': order based on the evaluation of code; here REPLY=${REPLY##*_} makes the sort key the part after the last _
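Bash has no sorting glob qualifiers, but the same `${var##*_}` expansion works there and in any POSIX shell, so the key can be extracted in a loop and handed to `sort`. A rough sketch, assuming file names contain no newlines; `list_by_id` is a hypothetical name chosen for illustration:

```shell
# Hypothetical helper: list the *_<digits>.out files of the current
# directory, ordered numerically by the trailing ID.
list_by_id() {
  for f in *_*.out; do
    id=${f##*_}     # drop everything up to the last "_": 5348696.out
    id=${id%.out}   # drop the suffix: 5348696
    case $id in
      ''|*[!0-9]*) continue ;;  # skip non-numeric IDs (and an unmatched glob)
    esac
    printf '%s\t%s\n' "$id" "$f"   # decorate: ID, tab, name
  done |
    sort -n -k1,1 |   # numeric sort on the ID column
    cut -f2-          # undecorate: keep only the name
}
```

Calling `list_by_id` in the directory that holds the files prints one name per line, lowest ID first.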

awk + sort + cut combination:

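awk -F'_' '{ $0=$NF OFS $0 }1' files_list.txt | sort -n | cut -d' ' -f2-
  • -F'_' - field separator
  • $NF - last field (e.g. 5348696.out)
  • $0=$NF OFS $0 - prepend the last field $NF to the current record $0, so that the ID comes first and sorting is straightforward (e.g. 5348757.out n2+_PiU_w4_5348757.out)
  • sort -n - sort numerically on the leading ID; a plain sort compares the IDs as text, which gives the same order only while all IDs have the same number of digits
  • cut -d' ' -f2- - keep the fields from the 2nd onwards, i.e. remove the prepended key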

The output:

n2+_PiU_w1_5348696.out
n2+_PiU_w1_5348729.out
n2+_PiU_w1_5348742.out
n2+_PiU_w2_5348755.out
n2+_PiU_w4_5348757.out
n2+_PiU_st3_w3_part5_5630720.out
n2+_PiU_st3_w3_part6_5630721.out
n2+_PiU_st3_w3_part4_5630813.out
n2+_PiU_st3_w3_part6_5630814.out
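
Since the asker mentions piping a list of names, the same decorate, sort, undecorate idea can be wrapped in a small filter. This is only a sketch: `sort_by_id` is a hypothetical name, the names are assumed to end in `_ID.out` and to contain no newlines, and a tab is used as the separator so that names containing spaces survive.

```shell
# Hypothetical helper: read file names (one per line) on stdin and
# print them ordered numerically by the ID before ".out".
sort_by_id() {
  tab=$(printf '\t')
  # Decorate: the ID is the second-to-last field when splitting on "_" or ".".
  awk -F'[_.]' '{ print $(NF-1) "\t" $0 }' |
    sort -t "$tab" -k1,1n |   # numeric sort on the ID column only
    cut -f2-                  # undecorate: drop the ID column
}
```

Usage could look like `ls | sort_by_id` or `sort_by_id < files_list.txt`. Because the comparison is numeric, IDs of different lengths (say 9 and 10) also come out in the expected order.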