======================================================================
                Top Ten Reasons not to use the C shell
======================================================================
Written by Bruce Barnett
With MAJOR help from
Peter Samuelson
Chris F.A. Johnson
and of course Tom Christiansen

September 22, 2001

In the late 80's, the C shell was the most popular interactive shell.
The Bourne shell was too "bare-bones." The Korn shell had to be
purchased, and the Bourne Again shell wasn't created yet. I've used
the C shell for years, and on the surface it has a lot of good points.
It has arrays (the Bourne shell only has one). It has test(1),
basename(1) and expr(1) built-in, while the Bourne shell needed
external programs. UNIX was hard enough to learn, and spending months
to learn two shells seemed silly when the C shell seemed adequate for
the job. So many have decided that since they were using the C shell
for their interactive session, why not use it for writing scripts.

THIS IS A *BIG* MISTAKE.

Oh - it's okay for a 5-line script. The world isn't going to end if
you use it. However, many of the posters on USENET treat it as such.
I've used the C shell for very large scripts and it worked fine in
most cases. There are ugly parts, and work-arounds. But as your script
grows in sophistication, you will need more work-arounds and
eventually you will find yourself bashing your head against a wall
trying to work around the problem.

I know of many people who have read Tom Christiansen's essay about the
C shell (http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/ ), and
they were not really convinced. A lot of Tom's examples were really
obscure, and frankly I've always felt Tom's argument wasn't as
convincing as it could be. So I decided to write my own version of
this essay - as a gentle argument to a current C shell programmer from
a former C shell fan.

[Note - since I compare shells, it can be confusing. If the line
starts with a "%" then I'm using the C shell.
If it starts with a "$" then it is the Bourne shell.]

-------------------------------------
Top Ten reasons not to use the C shell
-------------------------------------
        1. The Ad Hoc Parser
        2. Multiple-line quoting difficult
        3. Quoting can be confusing and inconsistent
        4. If/while/foreach/read cannot use redirection
        5. Getting input a line at a time
        6. Aliases are line oriented
        7. Limited file I/O redirection
        8. Poor management of signals and subprocesses
        9. Fewer ways to test for missing variables
        10. Inconsistent use of variables and commands.

1. The Ad Hoc Parser

The biggest problem of the C shell is its ad hoc parser. Now this
information won't make you immediately switch shells. But it's the
biggest reason to do so. Many of the other items listed are based on
this problem. Perhaps I should elaborate.

The parser is the code that converts the shell commands into
variables, expressions, strings, etc. High-quality programs have a
full-fledged parser that converts the input into tokens, verifies the
tokens are in the right order, and then executes the tokens. The C
shell does not do this. It parses as it executes. You can have
expressions in many types of instructions:

        % if ( expression )
        % set variable = ( expression )
        % while ( expression )

They should be treated the same. They are not. You may find out that

        % if ( 1 )

works, but

        % if(1)

doesn't. You never know when you will find a new bug. As I write this
(September 2001) I ported a C shell script to another UNIX system. (It
was my .login script, okay? Sheesh!) Anyhow I got an error "Variable
name must begin with a letter" somewhere in the dozen files used when
I log in. I finally traced the problem down to the following "syntax"
error:

        % if (! $?variable ) ...

Adding a space before the "!" character fixed the "error." The
examples in the manual page don't mention that spaces are required.
Sigh...

Most of the flaws are due to the ad hoc parser.
For instance,

        % if ( $?A ) set B = $A

If variable A is defined, then set B to $A. Sounds good. The problem?
If A is not defined, you get "A: Undefined variable." The C shell
expands the variables on the whole line before it evaluates the test.

If you want to check a Bourne shell script for syntax errors, use
"sh -n." This doesn't execute the script, but it does check for all
syntax errors. What a wonderful idea. Does the C shell have this
feature? Of course not. Errors aren't found until they are EXECUTED.
For instance, the code

        % if ( $zero ) then
        %    while
        %    end
        % endif

will execute with no complaints. However, if $zero becomes one, then
you get the syntax error:

        while: Too few arguments.

In other words, you can have a script that works fine for months, and
THEN gets a syntax error. Your customers will love this
"professionalism." And we are just getting warmed up. It's a time
bomb, gang... Tick... Tick... Tick...

2. Multiple-line quoting difficult

The C shell complains if strings are longer than a line. If you are
typing at a terminal, and only type one quote, it's nice to have an
error instead of a strange prompt. However, for shell programming - it
stinks like a bloated skunk.

Here is a simple 'awk' script that adds two to the first value of each
line. I broke this simple script into three lines, because many awk
scripts are several lines long. I could put it on one line, but that's
not the point. Cut me some slack, okay?

        #!/bin/awk -f
        {print $1 + \
                2;
        }

Calling this from a Bourne shell is simple:

        #!/bin/sh
        awk '
        {print $1 + \
                2;
        }
        '

They look the SAME! What a novel concept. Now look at the C shell
version.

        #!/bin/csh -f
        awk '{print $1 + \\
                2 ;\
        }'

An extra backslash is needed. One line has two backslashes, and the
second has one. Suppose you want to set the output to a variable.
Sounds simple? Perhaps. Look how it changes:

        #!/bin/csh -f
        set a = `echo 7 | awk '{print $1 + \\\
                2 ;\\
        }'`

Now you need three backslashes! And the second line only has two.
Keeping track of those backslashes can drive you crazy when you have
large awk and sed scripts.
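For comparison, the same capture can be sketched in the Bourne shell.
This is an illustration, not one of the original examples: the
variable name "a" and the input value 7 are carried over from the C
shell version above, and the awk program is the unmodified three-line
script.

```shell
#!/bin/sh
# Sketch only: capture the output of the multi-line awk script in a variable.
# Inside backquotes a backslash stays literal unless it precedes $, ` or \,
# and the single quotes pass the program text to awk untouched, so the
# script needs no extra backslashes.
a=`echo 7 | awk '
{print $1 + \
	2;
}
'`
echo "$a"
```

The awk text is byte-for-byte the same as in the stand-alone version,
so it can be pasted in and out of the script without editing. With the
input 7, the script prints 9.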
And you can't simply cut and paste scripts from different shells - if
you use the C shell.

Also note that if you WANT to include a newline in a string, strange
things happen:

        % set a = 'a \
        b'
        % echo $a
        a b

The newline goes away. Suppose you really want a newline in the
string. Will another backslash work?

        % set a = 'a \\
        b'
        % echo $a
        a \ b

That didn't work. Suppose you decide to quote the variable:

        % set a = 'a \
        b'
        % echo "$a"
        Unmatched ".

Syntax error!? How bizarre. There is a solution - use the :q quote
modifier.

        % set a = 'a \
        b'
        % echo $a:q
        a
        b

This can get VERY complicated when you want to make aliases include
backslash characters. More on this later. Heh. Heh.

3. Quoting can be confusing and inconsistent

The Bourne shell has three types of quotes:

        "........" - only $, `, and \ are special.
        '.......'  - Nothing is special (this includes the backslash)
        \.         - The next character is not special
                     (Exception: a newline)

That's it. Very few exceptions. The C shell is another matter. Take
the backslash. The Bourne shell uses the backslash to escape
everything except the newline. In the C shell, a backslash inside
double quotes does not escape the backslash or the dollar sign. Try
typing:

        % echo "\$HOME"

and it will print

        \/home/barnett

So there is no way to escape a variable in a double quote. What about
single quotes? Well, in this case the "!" character is special, as is
the "~" character. Using single quotes (the strong quotes) the command

        % echo '!1'

will give you the error

        1: Event not found.

A backslash is needed because the single quotes don't work. Now
suppose you type

        % set a = "~"
        % echo $a
        % echo '$a'
        % echo "$a"

The echo commands output THREE different values. So no matter what
type of quotes you use, there are exceptions. Those exceptions can
drive you mad.

And then there's dealing with spaces. If you call a C shell script,
and pass it an argument with a space:

        % myscript "a b" c

Now guess what the following script will print.
        #!/bin/csh -f
        echo $#
        set b = ( $* )
        echo $#b

It prints "2" and then "3". Double quotes don't help. It's time to use
the fourth form of quoting - which is only useful when displaying (not
set) the value:

        % set b = ( $*:q )

Got it? It gets worse. Try to pass backslashes to an alias. You need
billions and billions of them. Okay. I exaggerate. A little. But look
at Dan Bernstein's two aliases used to get quoting correct in aliases:

        % alias quote "/bin/sed -e 's/\\!/\\\\\!/g' \\
            -e 's/'\\\''/'\\\'\\\\\\\'\\\''/g' \\
            -e 's/^/'\''/' \\
            -e 's/"\$"/'\''/'"
        % alias makealias "quote | /bin/sed 's/^/alias \!:1 /' \!:2*"

Larry Wall calls this backslashitis. What a royal pain.
Tick.. Tick.. Tick..

4. If/while/foreach/read cannot use redirection

The Bourne shell allows complex commands to be combined with pipes.
The C shell doesn't. Suppose you want to choose an argument to grep.
Example:

        % if ( $a ) then
        %    grep xxx
        % else
        %    grep yyy
        % endif

No problem as long as the text you are grepping is piped into the
script. But what if you want to create the data stream in the script?
Suppose you change the first line to be

        % cat $file | if ($a ) then

Guess what? The file $file is COMPLETELY ignored. Instead, the script
uses standard input. The only standard input the "if" command can use
MUST be specified outside of the script. Therefore what can be done in
one Bourne shell file has to be done in several C shell scripts -
because a single script can't be used. The 'while' command is the same
way. For instance the following command outputs the time with hyphens
between the numbers instead of colons:

        $ date | tr ':' ' ' | while read a b c d e f g
        $ do
        $    echo The time is $d-$e-$f
        $ done

You can use < as well as pipes. In other words, *ANY* command in the
Bourne shell can have the datastream redirected. That's because it has
a REAL parser [rimshot]. Speaking of which, the Bourne shell allows
you to combine several lines onto a single line as long as semicolons
are placed between.
This includes complex commands. For example - the following is
perfectly fine with the Bourne shell:

        $ if true;then grep a;else grep b; fi

This has several advantages. Commands in a makefile - see make(1) -
have to be on one line. Trying to put a C shell "if" command in a
makefile is painful. Also - if your shell allows you to recall and
edit previous commands, then you can use complex commands and edit
them. The C shell allows you to repeat only the first part of a
complex command, like the single line with the "if" statement. It's
much nicer recalling and editing the entire complex command. But
that's for interactive shells, and outside the scope of this essay.

5. Getting input a line at a time

Suppose you want to read one line from a file. This simple task is
very difficult for the C shell. The C shell provides one way to read a
line:

        % set ans = $<

The trouble is - this ALWAYS reads from standard input. If a terminal
is attached to standard input, then it reads from the terminal. If a
file is attached to the script, then it reads the file.

But what do you do if you want to specify the filename in the script?
You can use "head -1" to get a line, but how do you read the next
line? You can create a temporary file, and read and delete the first
line. How ugly and extremely inefficient.

Now what if you want to read a file, and ask the user something during
this? As an example - suppose you want to read a list of files from a
pipe, and ask the user what to do with some of them? Can't do this
with the C shell - $< reads from standard input. Always. The Bourne
shell does allow this. Simply use

        $ read ans