At 07:00 PM 6/18/2003 -0500, you wrote:
>Doesn't work on my test set:
>
>h.net
>3.e.edu
>peaf.fde.a.com
>c.mil
Ah yes. The output of the shell script for your data set is:
c5.a.com
peaf.fde.a.com
x1.a.com
d.com
3.e.edu
2.b.gov
c.mil
4.f.net
h.net
g.uk
The script is not handling d.com properly. It reverses the labels of each
FQDN like so:
net.h
edu.e.3
com.a.fde.peaf
mil.c
com.d
com.a.x1
net.f.4
uk.g
gov.b.2
com.a.c5
And then sorts that. Reversing the labels this way ensures that the TLD gets
sorted first. Thinking about it further, I realize that I only need to move
the TLD to the front and not worry about reversing the rest. My new solution
still handles an arbitrary number of subdomains, and it handles all domains
properly. Unfortunately, it's big!
#!/bin/sh
PATH=/bin:/usr/bin
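reverse() {
    # Move the TLD to the front: peaf.fde.a.com -> com.peaf.fde.a
    rm -f $2
    for s in `cat $1`; do
        # wc -c also counts the newline, so dotcnt = dots + 1 = field count
        dotcnt=`echo $s | sed -e 's/[^.]//g' | wc -c | sed -e 's/ *//g'`
        echo "`echo $s | cut -d. -f$dotcnt`.`echo $s | cut -d. -f1-\`expr $dotcnt - 1\``" >> $2
    done
}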
unreverse() {
    # Move the TLD back to the end: com.peaf.fde.a -> peaf.fde.a.com
    rm -f $2
    for s in `cat $1`; do
        echo "`echo $s | cut -d. -f 2-`.`echo $s | cut -d. -f 1`" >> $2
    done
}
reverse text text.1
sort text.1 > text.2
unreverse text.2 text.3
I'd like to see a solution that cuts this by a few lines. Also, there has
to be a better way to determine $dotcnt.
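One possible shorter variant, as a rough sketch only: it assumes sed's greedy
matching can peel off the last label, which would make $dotcnt unnecessary.
The function names (tld_first, tld_last, sort_by_tld, count_labels) are made
up for illustration and are not part of the script above. count_labels shows
one alternative way to get the same value as $dotcnt, using awk's NF.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not from the original script.

# Move the last label (the TLD) to the front: peaf.fde.a.com -> com.peaf.fde.a
# The greedy \(.*\) swallows everything up to the final dot.
tld_first() {
    sed -e 's/^\(.*\)\.\([^.]*\)$/\2.\1/'
}

# Undo tld_first: com.peaf.fde.a -> peaf.fde.a.com
tld_last() {
    sed -e 's/^\([^.]*\)\.\(.*\)$/\2.\1/'
}

# Read host names on stdin, sort by TLD and then by the rest of the name.
sort_by_tld() {
    tld_first | sort | tld_last
}

# Count the dot-separated labels of one name (same value as $dotcnt).
count_labels() {
    echo "$1" | awk -F. '{ print NF }'
}
```

With these defined, "sort_by_tld < text > text.3" would stand in for the
three steps above without temporary files, and "count_labels peaf.fde.a.com"
prints 4. Names without any dot pass through both sed commands unchanged.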
---
Dustin Puryear <dustin@puryear-it.com>
Puryear Information Technology
Windows, UNIX, and IT Consulting
http://www.puryear-it.com
___________________
Nolug mailing list
nolug@nolug.org
Received on 06/18/03