Re: [Nolug] Sorting lists of domains

From: Dustin Puryear <dpuryear_at_usa.net>
Date: Wed, 18 Jun 2003 22:36:45 -0500
Message-Id: <5.1.0.14.0.20030618220107.031ffb90@pop.netaddress.com>

At 07:00 PM 6/18/2003 -0500, you wrote:
>Doesn't work on my test set:
>
>h.net
>3.e.edu
>peaf.fde.a.com
>c.mil

Ah yes. The output of the shell script for your data set is:

c5.a.com
peaf.fde.a.com
x1.a.com
d.com
3.e.edu
2.b.gov
c.mil
4.f.net
h.net
g.uk

The script is not handling d.com properly. It reverses each FQDN
like so:

net.h
edu.e.3
com.a.fde.peaf
mil.c
com.d
com.a.x1
net.f.4
uk.g
gov.b.2
com.a.c5

and then sorts that. Reversing the whole name ensures that the TLD gets
compared first. Thinking about it further, I realize that I only need to
move the TLD to the front and can leave the rest of the name in its
original order. My new solution still handles an arbitrary number of
subdomains, and it handles all domains properly. Unfortunately, it's big!

#!/bin/sh

PATH=/bin:/usr/bin

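reverse() {
         # Move the TLD to the front: peaf.fde.a.com -> com.peaf.fde.a
         rm -f $2
         for s in `cat $1`; do
                 # Stripping everything but the dots leaves the dots plus a
                 # newline, so wc -c gives the number of fields (dots + 1).
                 dotcnt=`echo $s | sed -e 's/[^.]//g' | wc -c |
                         sed -e 's/ *//g'`
                 tld=`echo $s | cut -d. -f$dotcnt`
                 rest=`echo $s | cut -d. -f1-\`expr $dotcnt - 1\``
                 echo "$tld.$rest" >> $2
         done
}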

unreverse() {
         # Move the leading TLD back to the end: com.peaf.fde.a -> peaf.fde.a.com
         rm -f $2
         for s in `cat $1`; do
                 echo "`echo $s | cut -d. -f 2-`.`echo $s | cut -d. -f 1`" >> $2
         done
}

reverse text text.1
sort text.1 > text.2
unreverse text.2 text.3

I'd like to see a solution that cuts this by a few lines. Also, there has
to be a better way to determine $dotcnt.
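
One possible shorter version, offered only as a sketch: sed can move the
last label to the front with a single substitution, which removes the need
for $dotcnt and for the temporary files. The function names sort_domains
and count_fields are made up for this example, and LC_ALL=C is an added
assumption so that the ordering does not depend on the locale. If the field
count is still wanted, awk can report it directly as NF.

```shell
#!/bin/sh
# Sketch only: sort_domains and count_fields are hypothetical names,
# not part of the original script.

# Reads one FQDN per line on stdin and writes them sorted by TLD first.
sort_domains() {
        # 1. Move the last label (the TLD) to the front: a.b.com -> com.a.b
        # 2. Sort (LC_ALL=C is an assumption, for locale-independent order).
        # 3. Move the first label back to the end:       com.a.b -> a.b.com
        sed -e 's/^\(.*\)\.\([^.]*\)$/\2.\1/' |
        LC_ALL=C sort |
        sed -e 's/^\([^.]*\)\.\(.*\)$/\2.\1/'
}

# Prints the number of dot-separated fields in $1, which is the value
# the original script stores in $dotcnt.
count_fields() {
        echo "$1" | awk -F. '{ print NF }'
}
```

With the domain list in a file named text, as in the script above, this
would be run as "sort_domains < text". Names without any dot pass through
both sed commands unchanged.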

---
Dustin Puryear <dustin@puryear-it.com>
Puryear Information Technology
Windows, UNIX, and IT Consulting
http://www.puryear-it.com
___________________
Nolug mailing list
nolug@nolug.org
Received on 06/18/03
