In my .profile
, I call a script to determine the width of a string on a terminal. I use this when logging in on the console of a machine where I don't trust the system-set LC_CTYPE
, or when I log in remotely and can't trust LC_CTYPE
to match the remote side. My script queries the terminal, rather than calling any library, because that was the whole point in my use case: determine the encoding of the terminal.
This is fragile in several ways:
- it modifies the display, so it isn't very nice user experience;
- there's a race condition if another program displays something at the wrong time;
- it locks up if the terminal doesn't respond. (A few years ago I asked how to improve on this, but it hasn't been much of an issue in practice so I never got around to switching to that solution. The only case I encountered of a terminal that doesn't respond was a Windows Emacs accessing remote files from a Linux machine with the
plink
method, and I solved it by using the plinkx
method instead.)
This may or may not match your use case.
#! /bin/sh
if [ z"$ZSH_VERSION" = z ]; then :; else
emulate sh 2>/dev/null
fi
set -e
help_and_exit () {
cat <<EOF
Usage: $0 {-NUMBER|TEXT}
Find out the width of TEXT on the terminal.
LIMITATION: this program has been designed to work in an xterm. Only
xterm and sufficiently compatible terminals will work. If you think
this program may be blocked waiting for input from the the terminal,
try entering the characters "0n0n" (digit 0, lowercase letter n,
repeat).
Display TEXT and erase it. Find out the position of the cursor before
and after displaying TEXT so as to compute the width of TEXT. The width
is returned as the exit code of the program. A value of 100 is returned if
the text is wider than 100 columns.
TEXT may contain backslash-escapes: \\0DDD represents the byte whose numeric
value is DDD in octal. Use '\\\\' to include a single backslash character.
You may use -NUMBER instead of TEXT (if TEXT begins with a dash, use
"-- TEXT"). This selects one of the built-in texts that are designed
to discriminate between common encodings. The following table lists
supported values of NUMBER (leftmost column) and the widths of the
sample text in several encodings.
1 ASCII=0 UTF-8=2 latinN=3 8bits=4
EOF
exit
}
builtin_text () {
case $1 in
-*[!0-9]*)
echo 1>&2 "$0: bad number: $1"
exit 119;;
-1) # UTF8: {\'E\'e}; latin1: {\~A\~A\copyright}; ASCII: {}
text='\0303\0211\0303\0251';;
*)
echo 1>&2 "$0: there is no text number $1. Stop."
exit 118;;
esac
}
text=
if [ $# -eq 0 ]; then
help_and_exit 1>&2
fi
case "$1" in
--) shift;;
-h|--help) help_and_exit;;
-[0-9]) builtin_text "$1";;
-*)
echo 1>&2 "$0: unknown option: $1"
exit 119
esac
if [ z"$text" = z ]; then
text="$1"
fi
printf "" # test that it is there (abort on very old systems)
csi='\033['
dsr_cpr="${csi}6n" # Device Status Report --- Report Cursor Position
dsr_ok="${csi}5n" # Device Status Report --- Status Report
stty_save=`stty -g`
if [ z"$stty_save" = z ]; then
echo 1>&2 "$0: \`stty -g' failed ($?)."
exit 3
fi
initial_x=
final_x=
delta_x=
cleanup () {
set +e
# Restore terminal settings
stty "$stty_save"
# Restore cursor position (unless something unexpected happened)
if [ z"$2" = z ]; then
if [ z"$initial_report" = z ]; then :; else
x=`expr "${initial_report}" : "\\(.*\\)0"`
printf "%b" "${csi}${x}H"
fi
fi
if [ z"$1" = z ]; then
# cleanup was called explicitly, so don't exit.
# We use `trap : 0' rather than `trap - 0' because the latter doesn't
# work in older Bourne shells.
trap : 0
return
fi
exit $1
}
trap 'cleanup 120 no' 0
trap 'cleanup 129' 1
trap 'cleanup 130' 2
trap 'cleanup 131' 3
trap 'cleanup 143' 15
stty eol 0 eof n -echo
printf "%b" "$dsr_cpr$dsr_ok"
initial_report=`tr -dc \;0123456789`
# Get the initial cursor position. Time out if the terminal does not reply
# within 1 second. The trick of calling tr and sleep in a pipeline to put
# them in a process group, and using "kill 0" to kill the whole process
# group, was suggested by Stephane Gimenez at
# https://unix.stackexchange.com/questions/10698/timing-out-in-a-shell-script
#trap : 14
#set +e
#initial_report=`sh -c 'ps -t $(tty) -o pid,ppid,pgid,command >/tmp/p;
# { tr -dc \;0123456789 >&3; kill -14 0; } |
# { sleep 1; kill -14 0; }' 3>&1`
#set -e
#initial_report=`{ sleep 1; kill 0; } |
# { tr -dc \;0123456789 </dev/tty; kill 0; }`
if [ z"$initial_report" = z"" ]; then
# We couldn't read the initial cursor position, so abort.
cleanup 120
fi
# Write some text and get the final cursor position.
printf "%b%b" "$text" "$dsr_cpr$dsr_ok"
final_report=`tr -dc \;0123456789`
initial_x=`expr "$initial_report" : "[0-9][0-9]*;\\([0-9][0-9]*\\)0" || test $? -eq 1`
final_x=`expr "$final_report" : "[0-9][0-9]*;\\([0-9][0-9]*\\)0" || test $? -eq 1`
delta_x=`expr "$final_x" - "$initial_x" || test $? -eq 1`
cleanup
# Zsh has function-local EXIT traps, even in sh emulation mode. This
# is a long-standing bug.
trap : 0
if [ $delta_x -gt 100 ]; then
delta_x=100
fi
exit $delta_x
The script returns the width in its return status, clipped to 100. Sample usage:
widthof -1
case $? in
0) export LC_CTYPE=C;; # 7-bit charset
2) locale_search .utf8 .UTF-8;; # utf8
3) locale_search .iso88591 .ISO8859-1 .latin1 '';; # 8-bit with nonprintable 128-159, we assume latin1
4) locale_search .iso88591 .ISO8859-1 .latin1 '';; # some full 8-bit charset, we assume latin1
*) export LC_CTYPE=C;; # weird charset
esac