cmd:sort 

Usage 

file * | sort -t: -b -k 2       # separator ':', ignore leading blanks, key starts at the 2nd field
file * | sort -t: -b +1         # ditto, using the deprecated zero-based +POS form
[Note]

The -b option matters most with -k: without it, blanks at the start of a field count as part of the key, and the results are very confusing.
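
The effect of -b is easiest to see on a tiny input. The following is a minimal sketch, assuming GNU sort and the C locale (forced with LC_ALL=C); the three sample lines are made up for illustration.

```shell
# Hypothetical sample: field 2 (after ':') starts with 2, 1 and 3 blanks.
data='x:  b
y: a
z:   c'

# Without -b the leading blanks are part of the key, and a blank sorts
# before any letter, so the line with the most blanks comes first.
without_b=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k 2)

# With -b the blanks are skipped and the letters decide the order.
with_b=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -b -k 2)

printf 'without -b:\n%s\n\nwith -b:\n%s\n' "$without_b" "$with_b"
```

Here the order without -b is z, x, y, and with -b it is y, x, z.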

cat passwd | sort -u -t : -k 5

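Sort on the name field (field 5) and drop duplicates. Note that -k 5 runs from field 5 to the end of the line; use -k 5,5 to compare the name field alone.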
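
How far the key extends changes what -u treats as a duplicate. A small sketch, assuming GNU sort and the C locale; the passwd-style records are invented for the example and are not real accounts.

```shell
# Hypothetical passwd-style records; two accounts share the name "Alice Smith".
passwd_sample='alice:x:1001:1001:Alice Smith:/home/alice:/bin/sh
asmith:x:1002:1002:Alice Smith:/home/asmith:/bin/bash
bob:x:1003:1003:Bob Jones:/home/bob:/bin/sh'

# -k 5 : the key runs from field 5 to the end of the line, so the two
#        Alice records differ (home directory, shell) and both survive -u.
count_to_eol=$(printf '%s\n' "$passwd_sample" | LC_ALL=C sort -u -t: -k 5 | wc -l)

# -k 5,5 : the key is field 5 only, so -u keeps one record per name.
count_name_only=$(printf '%s\n' "$passwd_sample" | LC_ALL=C sort -u -t: -k 5,5 | wc -l)

echo "lines kept with -k 5:   $count_to_eol"
echo "lines kept with -k 5,5: $count_name_only"
```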

Help 

-f     fold lower case to upper case characters in keys
-n     compare according to string numerical value; implies -b
-r     reverse the result of comparisons
Field Separator Options
   The treatment of field separators can be altered using the
   following options:
-b     ignore leading blanks in sort fields or keys
-t char   use char as the field separator character
Sort Key Options
   Sort keys can be specified using the options:
+POS1 [-POS2]
       start a key at POS1, end it *before* POS2 (obsolescent); field
       numbers and character offsets are numbered starting with zero
       (contrast with the -k option)
-k POS1[,POS2]
       start a key at POS1, end it *at* POS2; field numbers and
       character offsets are numbered starting with one (contrast
       with the zero-based +POS form)
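
The one-based FIELD.CHAR numbering of -k, and the fact that the key ends *at* POS2, can be checked with a two-line input. This is only a sketch, assuming GNU sort and the C locale; the sample strings are made up.

```shell
# Hypothetical two-line sample; the interesting characters are at offsets 3 and 4.
data='xxb1
xxa2'

# Key is character 3 of field 1 only ("b" versus "a").
by_char3=$(printf '%s\n' "$data" | LC_ALL=C sort -k 1.3,1.3)

# Key is character 4 of field 1 only ("1" versus "2").
by_char4=$(printf '%s\n' "$data" | LC_ALL=C sort -k 1.4,1.4)

printf 'by char 3:\n%s\n\nby char 4:\n%s\n' "$by_char3" "$by_char4"
```

Sorting on character 3 puts xxa2 first; sorting on character 4 puts xxb1 first.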

Fields & blanks 

Try to sort on the device (field 2): first by the disk letter, then numerically by the partition number.

$ cat $tf.3  | sort -b -k2.8,2.8 -k2.9n
cache   /dev/hda13      /lfs/cache
export.img      /dev/hda10      /export.img
os1     /dev/hda6       /lfs/os1
os3     /dev/hda8       /lfs/os3
rh8     /dev/hda3       /lfs/rh8
toBurn  /dev/hda12      /lfs/toBurn
bp2     /dev/hdd2       /lfs/bp2
export  /dev/hdd7       /export
vars1   /dev/hdd5       /vars

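-- wrong order. With GNU sort, the global -b is inherited only by keys that carry no options of their own; -k2.9n carries n, so its offset still counts the leading tab and the numeric key starts at the disk letter instead of the digits.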

$ cat $tf.3 | cut -f 2 | sort -k1.8,1.8 -k1.9n
/dev/hda3
/dev/hda6
/dev/hda8
/dev/hda10
/dev/hda12
/dev/hda13
/dev/hdd2
/dev/hdd5
/dev/hdd7

-- right order, meaning the offsets .8 and .9 are right when the field has no leading blank.

$ cat $tf.3  | sort -k2.9,2.9 -k2.10n
rh8     /dev/hda3       /lfs/rh8
os1     /dev/hda6       /lfs/os1
os3     /dev/hda8       /lfs/os3
export.img      /dev/hda10      /export.img
toBurn  /dev/hda12      /lfs/toBurn
cache   /dev/hda13      /lfs/cache
bp2     /dev/hdd2       /lfs/bp2
vars1   /dev/hdd5       /vars
export  /dev/hdd7       /export

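-- right order. Why is there one more shift for field 2 than for field 1? Because, without -t, the blank (here a tab) in front of a field counts as that field's first character; field 1 has no such blank.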

$ cat $tf.3 | sort -t"$t" -k2.8,2.8 -k2.9n
rh8     /dev/hda3       /lfs/rh8
os1     /dev/hda6       /lfs/os1
os3     /dev/hda8       /lfs/os3
export.img      /dev/hda10      /export.img
toBurn  /dev/hda12      /lfs/toBurn
cache   /dev/hda13      /lfs/cache
bp2     /dev/hdd2       /lfs/bp2
vars1   /dev/hdd5       /vars
export  /dev/hdd7       /export

— right order
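
The two ways of getting the offsets right in the listings above can be put side by side. A minimal sketch, assuming GNU sort, the C locale, and tab-separated input; it uses four of the lines shown above, and the variable names are mine.

```shell
tab=$(printf '\t')   # a literal tab, playing the role of $t above

# Four of the tab-separated lines from the listings above.
data=$(printf '%s\t%s\t%s\n' \
    cache  /dev/hda13 /lfs/cache \
    rh8    /dev/hda3  /lfs/rh8 \
    bp2    /dev/hdd2  /lfs/bp2 \
    export /dev/hdd7  /export)

# Option 1: make the tab the separator, so offsets start at the "/".
with_t=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2.8,2.8 -k 2.9n)

# Option 2: keep the default separators but put b on every key position,
# so the leading tab is skipped before the character offset is counted.
with_b=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2.8b,2.8b -k 2.9bn)

printf '%s\n' "$with_t" | cut -f 2
```

Both commands give the order hda3, hda13, hdd2, hdd7.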

Any way to sort this? 

Example 1. A \t delimiter has to be specified explicitly (with -t)!

Newsgroups: comp.unix.shell

 > > Is there any way to sort the following files into the right order?
 > > thanks
 > >
 > > syntaxref11.html
 > > syntaxref1110.html
[...]
 > > syntaxref1115.html
 > > syntaxref112.html
[...]
 > > syntaxref119.html
 >
 > ls |sort -t. -n +0.9
 > or
 > ls |sort -t. -n -k 1.10
 >
 > The -t. tells sort that the field separator character is a full stop, the -n
 > specifies an arithmetic sort and +0.9 tells sort that the "key" is plus 0
 > fields (i.e. the 1st field) and plus 9 characters within the field. Similarly
 > the -k 1.10 tells sort that the "key" is the 1st field, starting at the 10th
 > character within that 1st field.
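
The -k form of the answer can be tried without any files on disk. A sketch, assuming GNU sort and the C locale; it feeds in only the file names that appear in the question (the ones elided with [...] are left out) and ends the key at field 1, a small tightening of the quoted command.

```shell
# The file names from the question, fed in directly instead of via ls.
names='syntaxref11.html
syntaxref1110.html
syntaxref1115.html
syntaxref112.html
syntaxref119.html'

# Field 1 is everything before the first '.'; the key starts at its 10th
# character (just past "syntaxref") and ends with field 1; n = numeric.
sorted=$(printf '%s\n' "$names" | LC_ALL=C sort -t. -k 1.10,1n)

printf '%s\n' "$sorted"
```

The result runs 11, 112, 119, 1110, 1115.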

Here is a summary of the solutions so far:

  • Use sed or awk to pick out the number field, sort on it, then rebuild the string, as suggested by Charles and Michael. This is a two-step solution.
  • Use sort -n +0.9, as suggested by Chris. The problem is that the +0.9 form is deprecated.
  • Use sort -t. -n -k 1.10, as suggested by Richard. This is the winning solution. My 2c comment is that the -t. is not necessary *for this task*, but it is definitely needed in a complete solution for anything more difficult.
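
The two-step approach from the first bullet might look like the following. This is my own sketch of the idea, not the code Charles or Michael posted; it assumes GNU sort, a POSIX awk, the C locale, and file names that contain no tabs. The helper name decorate is made up.

```shell
names='syntaxref11.html
syntaxref1110.html
syntaxref1115.html
syntaxref112.html
syntaxref119.html'

# Step 1 (decorate): prefix each name with its digits and a tab.
decorate() {
    awk '{ n = $0; gsub(/[^0-9]/, "", n); print n "\t" $0 }'
}

# Step 2: sort numerically on the prefix, then cut the prefix off again.
sorted=$(printf '%s\n' "$names" | decorate | LC_ALL=C sort -n | cut -f 2-)

printf '%s\n' "$sorted"
```

Compared with sort -k, this needs no knowledge of where the number sits in the name, at the cost of two extra processes.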

Perl solution: 

perl -e 'sub d {($_[0] =~ /(\d+)/)[0]} print sort {d($a) <=> d($b)} <>' <infile >outfile

Good for about 50 files, but for more, you'd want to cache the regex part using a Schwartzian Transform, an Orcish maneuver, or a GRT sort.

Randal L. Schwartz

gnu sort, field selection bug? 

I think I have found a bug in the field selection algorithm of sort. Examples follow:

$ ls -1 sfa*
sfa1001ext
sfa1002ext
sfa100ext
sfa10ext
sfa1ext
sfa200ext
sfa20ext
sfa2ext
sfa300ext
sfa30ext
sfa3ext

The goal is to sort on the number. Before we get into it, let's look at some warm-up exercises first:

$ ls sfa* | sort -k 1.4
sfa1001ext
sfa1002ext
sfa100ext
sfa10ext
sfa1ext
sfa200ext
sfa20ext
sfa2ext
sfa300ext
sfa30ext
sfa3ext
$ ls sfa* | sort -n -k 1.4
sfa1ext
sfa2ext
sfa3ext
sfa10ext
sfa20ext
sfa30ext
sfa100ext
sfa200ext
sfa300ext
sfa1001ext
sfa1002ext

— so far so good

$ ls sfa* | sort -n -k 1.4,1.4
sfa1001ext
sfa1002ext
sfa100ext
sfa10ext
sfa1ext
sfa200ext
sfa20ext
sfa2ext
sfa300ext
sfa30ext
sfa3ext

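-- There is the bug! The output should be the same as the previous one.

[Note]

This is in fact expected behaviour, not a bug: -k 1.4,1.4 ends the key at character 4, so only the first digit is compared numerically, and lines with equal keys fall back to a whole-line comparison.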
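
The difference between the two key specifications shows up even on three names. A minimal sketch, assuming GNU sort and the C locale, using three of the sfa names listed above.

```shell
names='sfa10ext
sfa2ext
sfa1ext'

# Key is the single character at offset 4, so "10" and "1" both compare
# as 1; the tie is then broken by comparing the whole lines.
one_char=$(printf '%s\n' "$names" | LC_ALL=C sort -n -k 1.4,1.4)

# Key runs from offset 4 to the end of the line, so -n reads the full
# number: 1, 2, 10.
to_eol=$(printf '%s\n' "$names" | LC_ALL=C sort -n -k 1.4)

printf '%s\n' '-k 1.4,1.4:' "$one_char" '' '-k 1.4:' "$to_eol"
```

With -k 1.4,1.4 the order is sfa10ext, sfa1ext, sfa2ext; with -k 1.4 it is sfa1ext, sfa2ext, sfa10ext.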

My sort comes along with RH8:

$ sort --v
sort (textutils) 2.0.21
Written by Mike Haertel and Paul Eggert.

The impact of this bug is that it becomes impossible to sort in certain circumstances. For example:

$ ls -1 sfb*
sfb10-B
sfb10000-A
sfb10001-A
sfb10002-A
sfb11-B
sfb12-B
sfb8-B
sfb9-B
sfb9998-A
sfb9999-A

We need to sort the list first by the characters then by the numbers. There is no way to do it with the current sort program:

$ ls -1 sfb* | sort -t- -k2,2 -n -k1.4,1.4
sfb10-B
sfb10000-A
sfb10001-A
sfb10002-A
sfb11-B
sfb12-B
sfb8-B
sfb9-B
sfb9998-A
sfb9999-A

— this should be the right way, but the result is wrong.

gnu sort, field selection bug? 

> We need to sort the list first by the characters then by the
> numbers. There is no way to do it with the current sort program:
>
> $ ls -1 sfb* | sort -t- -k2,2 -n -k1.4,1.4
> sfb10-B
> sfb10000-A
> sfb10001-A
> sfb10002-A
> sfb11-B
> sfb12-B
> sfb8-B
> sfb9-B
> sfb9998-A
> sfb9999-A
>
>   -- this should be the right way, but the result is wrong.

I think your problem is where you're applying a global "n" option.

Try ls -1 sfb* | sort -t- -k2 -k1.4n

sfb9998-A
sfb9999-A
sfb10000-A
sfb10001-A
sfb10002-A
sfb8-B
sfb9-B
sfb10-B
sfb11-B
sfb12-B

which is, I think, what you want.
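
The suggestion works because n is attached to the second key alone, so the letter key is still compared as text. A sketch of a slightly tighter variant, assuming GNU sort and the C locale; it limits each key to its own field (-k 2,2 and -k 1.4,1n), which is my addition, and uses four of the sfb names from the post.

```shell
names='sfb10-B
sfb10000-A
sfb8-B
sfb9999-A'

# Key 1: field 2 (the letter after '-'), compared as text.
# Key 2: field 1 from character 4 to the end of that field, compared
#        numerically because n is attached to this key only.
sorted=$(printf '%s\n' "$names" | LC_ALL=C sort -t- -k 2,2 -k 1.4,1n)

printf '%s\n' "$sorted"
```

The A names come first (9999, then 10000), followed by the B names (8, then 10).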

documented on: Wed 04-14-99 02:24:46