From: Alfie Costa (agcosta@gis.net)
Date: Mon Feb 28 2000 - 19:30:22 CET
On 23 Feb 00, at 1:37, Alfie Costa <mulinux@sunsite.auc.dk> wrote:
> Ugly kludges in progress...
A new less ugly and debugged /bin/file is attached to this message. Based on
timing how long "file *" takes in various directories, it's around 2-10 times
faster than the original rustic 'file'. Some notes...
It's got a "-d" switch, which with good luck shouldn't be necessary. To peruse
the debug output, try:
file -d foo 2> debug.txt
less debug.txt
The search order, for which magic numbers or strings to try first, has been
changed. This version of 'file' looks for script files first, because those
and text files are what Midnight Commander's F3 is probably most used for.
Checking for plain text files is still slowest of all though, which can
probably be improved.
Same as last time, no temporary files are written.
TestString() does its compares using variables. At first it might seem that
this could cause trouble in cases like this:
A="teststring"
B="test\0\0string" # the \0's are supposed to be nulls.
/bin/ash would remove the nulls and say these strings are the same. However,
since 'dd' is used to get the data, and 'dd' uses a count of how many bytes to
get, what would really happen is:
A="teststring"
B="test\0\0stri" # note the "ng" is missing
Only the first 10 of the original 12 bytes get read, and then /bin/ash would
strip the nulls from this 10-byte 'B', leaving the 8-byte "teststri", which is
not the same as 'A'. So it really is safe to compare text strings.
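Here is a small sketch of that argument, in case anyone wants to try it. It is
not part of the attached script: it feeds 'dd' from a pipe rather than a file
and uses 'printf' to make the nulls, both of which are my own assumptions to
keep the demo self-contained.

```shell
#!/bin/sh
# Sketch only: the data has nulls in the middle of the expected string.
A="teststring"                       # the 10-byte string we are looking for

# Read exactly 10 bytes, as TestString would for a 10-byte pattern.
# The shell drops the nulls when storing the result; some shells warn
# about that on stderr, hence the outer redirect.
{ B=`printf 'test\0\0string' | dd bs=1 count=10 2>/dev/null`; } 2>/dev/null

echo "A=$A"
echo "B=$B"
if test "$A" = "$B"
then
    echo "same"
else
    echo "different"
fi
```

On the shells I would expect this to be run on, B ends up as "teststri" and
the verdict is "different".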
TestOctal(). The trick to compare binary data is to do it one byte at a time,
and to add some extra character to the variable so that null strings don't
cause any trouble. Example:
null=\\000 # set the variable null to '\000' (octal)
foo=.`echo -e $null` # set foo to a period, followed by whatever ASCII
# char was in null.
koo="$foo" # duplicate foo for a demo
test "$foo" = "$koo" # compare them
echo $? # outputs a zero if true.
Other than nulls, variables seem to be able to hold the other 255 possible
characters just fine.
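A stand-alone sketch of the same trick follows. The helper name byte_is is
made up for this message, it reads standard input instead of a file, and it
uses 'printf' where the script uses 'echo -e'; all of that is assumption on my
part, not what the attached script does.

```shell
#!/bin/sh
# byte_is OFFSET OCTAL  (hypothetical helper)
# Succeeds if the byte at OFFSET on stdin has the given octal value.
byte_is()
{
    # The leading period keeps both sides non-empty even when the byte
    # is a null, which the shell cannot store in a variable.
    want=.$(printf '%b' "\\0$2")
    got=.$(dd bs=1 skip="$1" count=1 2>/dev/null)
    test "$got" = "$want"
} 2>/dev/null      # hide any "ignored null byte" warning from the shell

# First three bytes of a gzip header: 037 0213 010 (octal).
printf '\037\213\010' | byte_is 0 037 && echo "byte 0 matches"
printf '\037\213\010' | byte_is 1 213 && echo "byte 1 matches"
printf '\037\213\010' | byte_is 2 000 || echo "byte 2 is not a null"
```

One caveat: command substitution also strips trailing newlines, so a newline
byte (012) collapses to the bare period just like a null does, and this
comparison cannot tell the two apart.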
Spaces... Spaces are disgustingly important when assigning variables and
comparing them. Unlike some other computer languages, if the spacing isn't
perfect, things go wrong. This doesn't even have anything to do with quoting.
Examples:
foo=5 # good, no spaces when assigning a variable.
test $foo=5 # bad, test sees the single argument "5=5"...
echo $? # always 0, since any non-empty string counts as true.
test $foo = 5 # good, test and [ ... ] must have spaces.
echo $? # OK now, 0 only when foo really is 5.
A surprising case:
test 6=7 # bad, no spaces
echo $? # 0 (true!), because "6=7" is just a non-empty string
test 6 = 7 # OK
echo $? # 1 (false), as it should be
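To see all of these side by side, something like the following can be pasted
into a shell. The check function is only a throwaway helper invented for this
demo; it runs a command and prints whether it came back true or false.

```shell
#!/bin/sh
# check CMD ARGS...  (hypothetical helper): run the command, report the result
check()
{
    if "$@"
    then
        echo "$* -> true (0)"
    else
        echo "$* -> false ($?)"
    fi
}

foo=9
check test $foo=5       # one argument "9=5": non-empty, so true
check test $foo = 5     # three arguments: a real comparison, false
check test 6=7          # true, surprisingly
check test 6 = 7        # false
```

The two lines without spaces come back true no matter what is being compared,
which is why the spacing matters so much.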
Hope this is useful...
#!/bin/ash
# rustic `file` (by M. Andreoli)
# [ with dd
# (2/28/00 provincial gentrification by A. Costa)
#Syntax
opt=$1
case Z$opt in
Z-d) set -x;shift;; # debug mode...
Z-h|Z) echo "Usage (mu-file): file [-d] [files]" ; exit ;;
*)
esac
# Functions
# compare with string
TestString()
{
f=$1
offset=$2; n=$3 ; string1="$4"
string2=`dd if=$f skip=$offset bs=1c count=$n 2>/dev/null`
test "$string1" = "$string2"
}
# compare with octal \\0x \\0y \\0z ...
TestOctal()
{
f=$1
offset=$2; n=$3
shift 3
for code in $@
do
p=.`echo -e $code` # note the leading period to tame the nulls
m=.`dd if=$f skip=$offset bs=1c count=1 2>/dev/null`
test "$m" != "$p" && return 1
offset=`expr $offset + 1`
done
}
for f in "$@"
do
# special
[ -d "$f" ] && echo "$f: directory" && continue
[ -L "$f" ] && echo "$f: symbolic link" && continue
[ ! -f "$f" ] && echo "$f: not existent" && continue
# script
if TestString $f 0 1 '#' # be lazy
then
TestString $f 0 9 '#!/bin/sh' && echo "$f: Bourne shell script text" && continue
TestString $f 0 10 '#!/bin/ash' && echo "$f: ash script text" && continue
TestString $f 0 11 '#!/bin/bash' \
&& echo "$f: Bash shell script text" && continue
TestString $f 0 10 '#!/bin/csh' && echo "$f: C shell script text" && continue
TestString $f 0 10 '#!/bin/ksh' && echo "$f: Korn shell script text" && continue
TestString $f 0 11 '#!/bin/perl' && echo "$f: perl command text" && continue
TestString $f 0 7 '#!/bin/' && echo "$f: script text" && continue
fi
# Linux
TestString $f 1 3 ELF \
&& echo -e "$f: ELF executable" && continue
TestString $f 1080 1 'S' \
&& echo "$f: Linux/i386 ext2 filesystem [probable :(]" && continue
TestString $f 4086 10 'SWAP-SPACE' \
&& echo "$f: Linux/i386 swap file" && continue
# compressed
TestString $f 257 5 ustar && echo "$f: TAR archive" && continue
TestOctal $f 0 2 \\037 \\0213 && echo "$f: gzip compressed data" && continue
TestString $f 0 2 BZ && echo "$f: bzip compressed data" && continue
TestString $f 0 2 PK && echo "$f: Zip archive data" && continue
TestString $f 0 4 'Rar!' && echo "$f: RAR archive data" && continue
# text
TestString $f 0 5 '%PDF-' && echo "$f: PDF document" && continue
TestString $f 0 2 '%!' \
&& echo "$f: PostScript document text" && continue
TestOctal $f 0 2 \\0367 \\002 && echo "$f: TeX DVI file" && continue
# Audio
TestString $f 0 4 MThd && echo "$f: Standard MIDI data" && continue
if TestString $f 0 4 RIFF
then
echo -n "$f: Microsoft RIFF"
if TestString $f 8 4 WAVE
then
echo ", WAVE audio data"
else
echo
fi
continue
fi
# image
TestOctal $f 0 2 \\0377 \\0330 && echo "$f: JPEG image data" && continue
TestString $f 0 4 GIF8 && echo "$f: GIF image data" && continue
TestString $f 0 2 BM && echo "$f: PC bitmap data" && continue
TestOctal $f 0 4 \\0115 \\0115 \\0 \\052 \
&& echo "$f: TIFF image data, big-endian" && continue
TestOctal $f 0 4 \\0111 \\0111 \\052 \\0 \
&& echo "$f: TIFF image data, little-endian" && continue
# binary
TestString $f 0 2 'MZ' && echo "$f: MS-DOS executable (EXE)" && continue
# HP48
TestString $f 0 7 'HPHP48-' && echo "$f: HP48 binary" && continue
TestString $f 0 5 '%%HP:' && echo "$f: HP48 text" && continue
echo "$f: ASCII text or data"
done