I have a file called data whose contents are:
id,col1,col2
0,-0.3479417882673812,0.5664382596767175
1,-0.26800930980980764,0.2952025161991604
2,-0.4159790791116641,-1.3375045524610152
3,-0.7859665489205871,-0.6428101880909471
4,-1.3922759043388822,-1.676262144826317
5,-1.2471867496427498,-0.4912119581361516
6,1.443385383041667,1.6974039491263593
7,-2.058899802821969,2.0607628464079917
8,-0.10641338441541626,0.035929568275064216
9,-0.517273684861199,-0.6184800988804992
10,-0.9934859021679552,1.0577312348984502
11,0.5923834706792905,-0.6693757541250825
12,0.8657741917554445,-0.6876271057571398
13,-1.2061097548360489,-0.7402582563022937
14,0.78768021182158,-0.38607117005262315
Sorting numerically (-n) on the first column gives:
$ sort -nk1 -t"," data
0,-0.3479417882673812,0.5664382596767175
id,col1,col2
1,-0.26800930980980764,0.2952025161991604
2,-0.4159790791116641,-1.3375045524610152
3,-0.7859665489205871,-0.6428101880909471
4,-1.3922759043388822,-1.676262144826317
5,-1.2471867496427498,-0.4912119581361516
7,-2.058899802821969,2.0607628464079917
8,-0.10641338441541626,0.035929568275064216
9,-0.517273684861199,-0.6184800988804992
10,-0.9934859021679552,1.0577312348984502
13,-1.2061097548360489,-0.7402582563022937
6,1.443385383041667,1.6974039491263593
11,0.5923834706792905,-0.6693757541250825
12,0.8657741917554445,-0.6876271057571398
14,0.78768021182158,-0.38607117005262315
This is absolutely bizarre to me. I read in the man page that -n is supposed to be a numerical sort. Why would id be placed in between numbers? How is it that 10 is larger than 9, but smaller than 6, all the while 11 being greater than them all?
The -g option seems to work as I want (and as I think is natural), but this -n option totally escapes me. What's this about? I think it can be related to locale, but once I specify the delimiter as being "," I don't think that would explain it.
sort -t, -n -k1,1 is not working for me, it's placing 0 above id. Also, does your answer explain why 10 is larger than 9, but smaller than 6, all the while 11 being greater than them all? It's a genuine question, I'm not able to answer this myself from reading your answer. – user347221 Apr 15 '19 at 16:18
-t"," to use it as the field delimiter. – Barmar Apr 15 '19 at 16:57
61.4433853830416671 in the input file? I see 6,1.443385383041667,1.6974039491263593. – Barmar Apr 15 '19 at 16:59
-t"," after the key specification instead of before it? – Barmar Apr 15 '19 at 17:00
-t, but sort on the full line (-k1, which is superfluous as that's the default) instead of the first field (-k1,1). 6,1.443385383041667 is interpreted by sort -n as 61.4433853830416671 because that , thousand separator is ignored. – Stéphane Chazelas Apr 15 '19 at 17:10
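To make the point in the last comment concrete, here is a minimal sketch of two ways to get the expected order. It assumes a POSIX shell with the usual sort, head, tail and mktemp utilities, and it uses a small made-up sample (not the file from the question) written to a temporary file. The header handling with head and tail is just one common idiom, not something taken from the thread.

```shell
# Hypothetical sample: a few rows shaped like the question's file.
tmp=$(mktemp)
printf '%s\n' \
  'id,col1,col2' \
  '9,-0.5,-0.6' \
  '10,-0.9,1.0' \
  '6,1.4,1.6' \
  '11,0.5,-0.6' > "$tmp"

# Option 1: end the key at field 1 (-k1,1) so the comma never reaches the
# numeric parser; the header is split off first so it stays on top.
sorted_key=$( { head -n 1 "$tmp"; tail -n +2 "$tmp" | sort -t, -k1,1n; } )

# Option 2: in the C locale there is no thousands separator, so even a
# whole-line numeric sort stops reading the number at the first comma.
sorted_c=$( tail -n +2 "$tmp" | LC_ALL=C sort -n )

printf '%s\n' "$sorted_key"
rm -f "$tmp"
```

With -k1 alone the key runs from field 1 to the end of the line, so in a locale where , is the thousands separator the row 6,1.4,... is read as a number above 60, which would explain why it lands after 10. If the header is left in the input, it compares as 0 under -n, so it ties with the 0 row instead of staying on top.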