Thursday, March 31, 2016

Shell scripting can be a very powerful tool, but critics often accuse it of being slow, and to some extent the critics are correct. Most POSIX-compliant shell scripts (#!/bin/sh ...) rely heavily on external tools like sed, grep, cut, awk and tr, and many bash scripts (#!/bin/bash ...) are written by beginners with minimal knowledge of builtins.

Even some widely known projects have fallen victim to the shortcomings of the POSIX shell (remember hotplug?). In many cases, calls to external programs (which take ~2-3ms each on my system) can be replaced by much faster shell builtins with only a couple more lines of code.

I won't get into the use of bashisms in scripts that purport to be POSIX; instead I will just cover some common builtins, available in most modern shells (not necessarily POSIX), that can speed up your scripts by as much as 3000% when they replace calls to external programs on files shorter than ~1000 lines.


grep replacement - all POSIX sh compliant shells

grep "$string" "$filename"

becomes:

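# The [ -n "$LINE" ] test picks up a last line that has no trailing newline.
while IFS= read -r LINE || [ -n "$LINE" ]; do
  case "$LINE" in
    *"$string"*) printf '%s\n' "$LINE" ;;
  esac
done < "$filename"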
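
As a rough sketch, the loop above could be wrapped in a function so it can be reused throughout a script. The name sh_grep is made up for this example. Note that a quoted "$string" in a case pattern is matched literally, so this behaves like grep -F (fixed strings) rather than a regex search.

```shell
# sh_grep is a hypothetical helper name, not a standard command.
# Prints every line of FILE that contains the fixed string STRING.
sh_grep() {
  string=$1
  filename=$2
  # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
  # [ -n "$LINE" ] handles a final line with no trailing newline.
  while IFS= read -r LINE || [ -n "$LINE" ]; do
    case "$LINE" in
      *"$string"*) printf '%s\n' "$LINE" ;;
    esac
  done < "$filename"
}
```

It would be called as: sh_grep "$string" "$filename"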


sed replacement - bash and busybox ash/hush

sed "s/match/replace/g" "$filename"

becomes:

# ${LINE//match/replace} replaces every occurrence, like the g flag in sed.
while IFS= read -r LINE || [ -n "$LINE" ]; do
  printf '%s\n' "${LINE//match/replace}"
done < "$filename"

Notes:
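For shells without ${parameter/match/replace} you can instead use repeated substring removal, which the majority of shells support:
 ${parameter#match}   removes the shortest match from the start
 ${parameter##match}  removes the longest match from the start
 ${parameter%match}   removes the shortest match from the end
 ${parameter%%match}  removes the longest match from the end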

Keep in mind that the shell matches with glob patterns ("globbing"), not regular expressions.
Similar methods can be used to replace tr as well.
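
Here is one way the substring-removal approach might look in a plain POSIX shell. This is only a sketch: replace_all is a name invented for this example, and it treats the match as a literal string rather than a glob or regex.

```shell
# replace_all is a hypothetical helper: prints STRING with every literal
# occurrence of MATCH replaced by REPLACE, using only POSIX expansions.
replace_all() {
  rest=$1 match=$2 replace=$3 result=
  # An empty MATCH would never shrink $rest, so print the input unchanged.
  [ -n "$match" ] || { printf '%s\n' "$rest"; return; }
  while :; do
    case "$rest" in
      *"$match"*)
        # ${rest%%"$match"*} is the text before the first occurrence;
        # ${rest#*"$match"} is the text after it.
        result=$result${rest%%"$match"*}$replace
        rest=${rest#*"$match"}
        ;;
      *) break ;;
    esac
  done
  printf '%s\n' "$result$rest"
}
```

Inside the read loop you would call replace_all "$LINE" match replace in place of the ${LINE/match/replace} expansion. With a single character as the match, for example replace_all "$LINE" ' ' '_', it does roughly the job of tr ' ' '_'.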


cut replacement - all POSIX sh compliant shells

cut -d "$separator" -f 2 "$filename"

becomes:

# IFS is changed only for the read command itself.
while IFS=$separator read -r F1 F2 REMAINDER || [ -n "$F1$F2$REMAINDER" ]; do
  printf '%s\n' "$F2"
done < "$filename"
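
The same loop could be packaged as a function, sketched below. The name sh_cut2 is invented for this example. Two differences from cut are worth knowing: cut prints the whole line when the separator is missing, while this prints an empty line, and a whitespace separator makes read collapse runs of it.

```shell
# sh_cut2 is a hypothetical helper: prints the second SEPARATOR-delimited
# field of every line in FILE, roughly like: cut -d "$separator" -f 2
sh_cut2() {
  separator=$1
  filename=$2
  # IFS is set only for the read command, so the rest of the script
  # keeps its normal field splitting.
  while IFS=$separator read -r F1 F2 REMAINDER || [ -n "$F1$F2$REMAINDER" ]; do
    printf '%s\n' "$F2"
  done < "$filename"
}
```

It would be called as: sh_cut2 : /etc/passwd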

Are you noticing a pattern here?  Many 
