Thursday, March 31, 2016

Shell scripting can be a very powerful tool, but critics often accuse it of being slow, and to some extent the critics are correct.  Most POSIX-compliant shell scripts (#!/bin/sh ...) rely heavily on external tools like sed, grep, cut, awk and tr, and many bash scripts (#!/bin/bash ...) are written by beginners with minimal knowledge of builtins.

Even some widely known projects have fallen victim to the shortcomings of the POSIX shell (remember hotplug?).  In many cases, calls to external programs (which take ~2-3 ms each on my system) can be replaced by much faster shell builtins with only a couple more lines of code.

I won't get into the use of bashisms in scripts that purport to be POSIX; instead I will just cover some common builtins, available in most modern shells (not necessarily POSIX), that can replace calls to external programs and speed up your scripts by as much as 3000% on files under ~1000 lines.


grep replacement - all POSIX sh compliant shells

grep "$string" "$filename"

becomes:

while IFS= read -r LINE || [ "$LINE" ]; do
  case "$LINE" in
    *"$string"*) printf '%s\n' "$LINE";;
  esac
done < "$filename"
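The test after || is there for files whose last line has no trailing newline: read returns non-zero at end of file, but the variable still holds the partial line. Here is a small sketch of the same loop wrapped in a function so it can be reused; the name grep_sh and its argument order are just placeholders I made up, and like the loop above it matches a fixed string rather than a regex.

```shell
# grep_sh STRING FILE  (hypothetical helper, not a standard command)
# Prints every line of FILE that contains STRING, using only builtins.
grep_sh() {
  # IFS= keeps leading/trailing blanks, -r keeps backslashes as they are;
  # [ -n "$line" ] catches a final line that lacks a trailing newline
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      *"$1"*) printf '%s\n' "$line" ;;
    esac
  done < "$2"
}
```

Usage would look like: grep_sh "$string" "$filename"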


sed replacement - bash and busybox ash/hush

sed "s/match/replace/g" "$filename"

becomes:

while IFS= read -r LINE || [ "$LINE" ]; do
  echo "${LINE//match/replace}"
done < "$filename"

Notes:
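For shells without ${parameter/match/replace} you can instead loop over the substring operators common to the majority of shells:
 ${parameter#match}    removes the shortest match from the start
 ${parameter##match}   removes the longest match from the start
 ${parameter%match}    removes the shortest match from the end
 ${parameter%%match}   removes the longest match from the end

Keep in mind that the shell matches glob patterns here, not regular expressions.
Similar methods can be used to replace tr as well.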
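
As a rough sketch of what "iterations of substring manipulation" can look like in a shell that only has the #, ##, % and %% operators: the function below replaces every literal occurrence of a string, one match per pass. The name replace_all and its argument order are my own invention for illustration, and the variables it sets are global because plain sh has no local.

```shell
# replace_all STRING MATCH REPLACEMENT  (hypothetical helper)
# Prints STRING with every literal MATCH replaced by REPLACEMENT.
replace_all() {
  rest=$1
  out=
  # an empty MATCH would never be consumed, so just print the input
  [ -n "$2" ] || { printf '%s\n' "$1"; return 0; }
  while :; do
    case $rest in
      *"$2"*)
        out=$out${rest%%"$2"*}$3   # text before the first match + replacement
        rest=${rest#*"$2"}         # carry on after that first match
        ;;
      *) break ;;
    esac
  done
  printf '%s\n' "$out$rest"
}
```

Called as replace_all "$LINE" match replace inside the read loop, this stands in for the sed example; with single characters it also covers simple uses of tr.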


cut replacement - all POSIX sh compliant shells

cut -d "$separator" -f 2 "$filename"

becomes:

while IFS="$separator" read -r F1 F2 REMAINDER || [ "$F1" ]; do
  echo "$F2"
done < "$filename"
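
The same IFS plus read trick also works on a string you already have in a variable, where you might otherwise write echo "$line" | cut -d: -f3. A minimal sketch, using a made-up passwd-style sample line and placeholder variable names:

```shell
# sample data only; in a real script this would come from somewhere else
line='root:x:0:0:root:/root:/bin/sh'

# IFS=: applies only to this one read; the here-document feeds it the string
IFS=: read -r user pass uid rest <<EOF
$line
EOF

echo "$user has uid $uid"
```

The last variable named on the read line soaks up all the remaining fields, separators included, which is why a throwaway name like rest is handy.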

Are you noticing a pattern here?  Many 

Monday, December 19, 2011

simple script debugging

#DEBUG=:    #turns off debugging
DEBUG="eval echo \$LINENO:"        #turns on debugging messages

$DEBUG some message with appropriate $variables
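
A small sketch of the toggle in use, assuming a throwaway function called count_args that I made up for the demonstration. With DEBUG set to ":" the whole line becomes a no-op, although its arguments are still expanded. Sending the message to stderr keeps it out of anything that captures the function's output.

```shell
DEBUG=:                          # debugging off: ":" ignores its arguments
#DEBUG="eval echo \$LINENO:"     # debugging on

# hypothetical example function
count_args() {
  # the message goes to stderr so callers capturing stdout are unaffected
  $DEBUG "count_args called with $# arguments" >&2
  echo "$#"
}

count_args a b c
```

Because the "on" form goes through eval, it is safest to keep shell metacharacters such as ; ( ) and quotes out of the message text.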

Monday, October 31, 2011

localizing shell scripts without bashisms, gettext or ... anything

I will explain the basic code that is used first, just in case; experts can skip ahead to the code block.

1. First we will deal with variable usage.
To print a variable VAR you can use echo:

echo $VAR
But let's say we don't know whether VAR will be empty, and we want to fall back to some default.

If you have done a lot of scripting, you may already be thinking of something like:

[ "$VAR" ] && echo $VAR || echo "default message"
which is just a shorter way of doing:
if [ "$VAR" ]
then
echo $VAR
else
echo default message
fi
Both are more than you need, though you will see them a lot, when instead you can do this:

echo "${VAR:-default message}"
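
To make the behaviour concrete, here is a short sketch of what the expansion yields for an unset, an empty and a set variable. The names a, b, c and d are only placeholders to hold the results.

```shell
unset VAR
a=${VAR:-default message}   # VAR is unset        -> the default is used
VAR=
b=${VAR:-default message}   # VAR is set but empty -> the default is used
c=${VAR-default message}    # no colon: an empty VAR counts as set and is kept
VAR=hello
d=${VAR:-default message}   # VAR has a value      -> the value is used

printf '[%s] [%s] [%s] [%s]\n' "$a" "$b" "$c" "$d"
```

Note that neither form changes VAR itself; the default is only substituted at the point of use.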


2. Next we will try to understand substring manipulation
Assume we have a variable LANG=en_US
and we only need the part before the "_" (the actual language)

In many scripts you will see something like this:
myLANG=`echo $LANG |cut -d "_" -f1 `
cut is not a shell builtin, so it will slow down the script (not drastically, but it will), and it can usually be avoided by using this instead:
myLANG=${LANG%_*}
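
For reference, a quick sketch of the four substring operators applied to a locale-style string. The variable names and the sample value are assumptions for illustration; a copy is used so the real LANG is left alone.

```shell
loc=en_US.UTF-8        # sample value; any LANG-style string will do

lang=${loc%_*}         # drop the shortest suffix matching "_*"   (language)
region=${loc#*_}       # drop the shortest prefix matching "*_"
region=${region%%.*}   # then drop the longest suffix matching ".*" (region)
charset=${loc##*.}     # drop the longest prefix matching "*."     (encoding)

echo "$lang $region $charset"
```

If the string contains no "_" at all (LANG=C for example), ${loc%_*} simply returns it unchanged.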

3. Sourcing a file: you can include entire text files' worth of code as simply as:

#this file contains all of my functions
. /usr/local/share/myprogram/myfunctions

Now we get to the whole point: localizing our bash/shell script using only variables, substring manipulation and sourcing a file.  In this case we will use just 2 languages and a single variable, but this can be expanded to as many variables and languages as you would like.

cat /usr/share/locale/en/myprog
VAR="Hello World"

cat /usr/share/locale/es/myprog
VAR="Hola Mundo"

cat /usr/sbin/myprog
#!/bin/sh

LANGPATH=/usr/share/locale/${LANG%_*}/myprog
[ -f "$LANGPATH" ] && . "$LANGPATH"
echo "${VAR:-Hello World}"

Note that the "en" locale is kind of redundant in this case.
You can use the en locale as a template and eliminate the default string in your variable by instead doing this:

[ -f "$LANGPATH" ] && . "$LANGPATH" || . /usr/share/locale/en/myprog

or simply declare all the English variables within the script prior to loading locales:

VAR="Hello World"
[ -f "$LANGPATH" ] && . "$LANGPATH"
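
Putting the pieces together, here is one possible sketch of a loader that tries the full locale first (en_US), then the bare language (en), then English as a last resort. The function name load_messages and its three arguments are made up for this example, and its variables are global since plain sh has no local.

```shell
# load_messages DIR PROG LOCALE  (hypothetical helper)
# Sources DIR/<locale>/PROG for the best matching locale it can find.
load_messages() {
  dir=$1
  prog=$2
  loc=${3%%.*}                      # drop an encoding suffix such as ".UTF-8"
  for candidate in "$loc" "${loc%_*}" en; do
    if [ -f "$dir/$candidate/$prog" ]; then
      . "$dir/$candidate/$prog"     # the file just assigns message variables
      return 0
    fi
  done
  return 1                          # nothing found; caller keeps its defaults
}
```

In the script above it could be called as load_messages /usr/share/locale myprog "$LANG" right after the English defaults are declared.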

Friday, October 28, 2011

Stop waiting on wait

If you have gone through one of your scripts and concluded that the only way to make sure that all necessary child processes have completed is to use either "wait" or sleep <really large number>, I have something for you. If not, just ignore this; it's probably only for a few niche scripters anyway.

Here is a function I call wait_pids in an example script:

#!/bin/sh
#copyright 2011 Brad Conroy - redistributable under the UIUC license
#wait_pids is a function to replace wait when you only need to wait for some,
#not all, child processes (ex. speeding up init, or other custom scripts)

wait_pids(){
  #this squeezes separators into spaces
  PIDS=`echo $@`

  #string replacement is not POSIX but is in most shells, so use it vs. sed
  #the result is a test like: [ -d /proc/123 ] || [ -d /proc/456 ]
  PIDS="[ -d /proc/${PIDS// / ] || [ -d /proc/} ]"

  #each process gets a directory with its process id in /proc
  while eval "$PIDS" ; do

    #we don't like the taste of cpus, so let's not eat them - feel free to tweak
    usleep 1000

    #uncomment/tweak the next line if you want to see an indicator while waiting
    #printf .
  done
}

#just a test program to fork so we can get a test pid
xmessage 11111111111111111111111 &
#the $! gives the process id of the last command
FIRST=$!

#a second program so we know whether it works for multiple processes
xmessage 2222222222222222222222222 &
SECOND=$!

wait_pids $FIRST $SECOND

Note that any amount of code can go between starting the child processes and the call to wait_pids.
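
For systems without /proc or string replacement, a variation on the same polling idea can be sketched with kill -0, which sends no signal and only reports whether the process can be signalled. This is an alternative technique, not the function above; the name wait_pids_portable is made up, and the one-second sleep is just the coarsest interval that plain sleep guarantees everywhere.

```shell
# wait_pids_portable PID...  (hypothetical name)
# Returns once none of the given processes can be signalled any more.
wait_pids_portable() {
  for pid in "$@"; do
    # kill -0 succeeds while the process exists and we may signal it;
    # a process owned by another user therefore looks "gone" to us
    while kill -0 "$pid" 2>/dev/null; do
      sleep 1
    done
  done
}

# example: start two background jobs and wait for just these two
sleep 1 &
FIRST=$!
sleep 1 &
SECOND=$!
wait_pids_portable "$FIRST" "$SECOND"
```

When the processes are direct children of the script, the builtin also takes arguments (wait "$FIRST" "$SECOND"), so polling mainly earns its keep for processes the script did not start itself.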

Friday, October 21, 2011

unbloated resources in C

Here is a list of alternative libraries written in C, mostly with liberal (BSDish) licenses 

Ssl/encryption ... libtomcrypt
Imaging ... stb_image (nothings.org) or nanojpeg+lodepng+webp
Ecmascript (aka javascript) ... see-3.1.1424.tar.gz (currently unmaintained)
OpenGL ... tinyGL <<== SDL implementation
Html5 ... hubbub
Css ... libcss
Svg ... libtinysvg
Lua ... stua (nothings.org)
Freetype ... stb_freetype
Tcl ... jimtcl
Ogg ... stb_ogg
Gcc ... llvm+clang or tinycc (lgpl)
Perl ... microperl (distributed with perl)
Python ... tinypy
GUI ... sdl, agar, picogui, anttweakbar
Gnu-utils ... Google's toolbox, asmutils (gpl2), busybox (gpl2), embutils (gpl2), toybox (gpl2)...
Video ... webm, theora
glibc ... bionic, musl (lgpl), uclibc (lgpl), dietlibc (gpl2), newlib or a bsd libc...

If you really want to use C++ without the bloat of libstdc++, try one of these standard template libraries: 
... libcxx, uclibc++, stlport, eastl, ustl, stdcxx, ... the sgi stl 

more to follow 

Thursday, October 20, 2011

getting an ip address

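I sometimes need to know my local IP address for sharing data on my local network.
Here is a one-liner that does the trick by reading the kernel's ARP table (note that /proc/net/arp lists the hosts your machine has recently talked to on the local network, so check that the output is the address you are after):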


grep ":" /proc/net/arp |awk '{print $1}'
And a faster way as a function using only shell builtins:
get_ip(){
while read A || [ "${A}" ]; do
    case "${A}" in
        [0-9]*)
            echo "${A%% *}"
        ;;
    esac
done </proc/net/arp
}
The second one seems like it would be slower, because it is more code.  Right?  Not really, once you count grep and awk as "code".  They take a not-insignificant time to be found, loaded and run, but let's be honest, you won't notice this if you are running it by itself from a prompt.  It could, however, come in handy for speeding up the boot process on a connected device.
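
If you want to try the builtin version on sample data rather than the live table, here is a sketch of a variant that takes the file as an optional argument. The name arp_ips is made up; it lets read split off the first column instead of trimming it afterwards.

```shell
# arp_ips [FILE]  (hypothetical helper; FILE defaults to /proc/net/arp)
# Prints the first column of every line that starts with a digit.
arp_ips() {
  # read puts the first whitespace-separated field in ip, the rest in _rest
  while read -r ip _rest || [ -n "$ip" ]; do
    case $ip in
      [0-9]*) printf '%s\n' "$ip" ;;   # skips the "IP address ..." header
    esac
  done < "${1:-/proc/net/arp}"
}
```

With no argument it reads /proc/net/arp, just like get_ip.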


But what if we have a local dynamic IP address and want to know what our "real" IP address is?
Keith Hatfield has posted a server-side PHP script and a shell script to do just that here:
http://www.keithscode.com/blog-items/wan-ip-from-a-bash-script.html

the server side php is pretty simple:
<?php echo $_SERVER['REMOTE_ADDR']; ?>
and the shell script is:
wget -q -O - http://some.url.com 2>/dev/null
(Note: Keith's demo is at http://ip.keithscode.com)
There are several other places to get this information, such as:
http://automation.whatismyip.com/n09230945.asp    (NOTE: once every 300 seconds)
checkip.dyndns.org (NOTE: has html formatting)
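
For a service that wraps the address in HTML, the markup can be peeled off with the same substring operators instead of piping through sed. This is only a sketch: the function name strip_html_ip is made up, and it assumes the reply looks like "...Current IP Address: 1.2.3.4</body>...", that is, the address follows the last ": " and is followed directly by a tag.

```shell
# strip_html_ip TEXT  (hypothetical helper)
# Assumes TEXT contains "...: <address><tag...", which is an assumption
# about the reply format, not something the service guarantees.
strip_html_ip() {
  ip=${1##*: }     # drop everything up to and including the last ": "
  ip=${ip%%<*}     # drop the first "<" and everything after it
  printf '%s\n' "$ip"
}

# sample reply for illustration; 203.0.113.7 is a documentation address
reply='<html><body>Current IP Address: 203.0.113.7</body></html>'
strip_html_ip "$reply"
```

In a script, the reply variable would be filled from the wget command shown earlier.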