Why doesn't shell automatically fix “useless use of cat”?

General Tech Bugs & Fixes 2 years ago

0 2 0 0 0 tuteeHUB earn credit +10 pts

5 Star Rating 1 Rating

Posted on 16 Aug 2022, this text provides information on Bugs & Fixes related to General Tech. Please note that while accuracy is prioritized, the data presented might not be entirely correct or up-to-date. This information is offered for general knowledge and informational purposes only, and should not be considered as a substitute for professional advice.

Take Quiz To Earn Credits!

Turn Your Knowledge into Earnings.

tuteehub_quiz

Answers (2)

Post Answer
profilepic.png
manpreet Tuteehub forum best answer Best Answer 2 years ago

Many people use oneliners and scripts containing code along the lines

cat "$MYFILE" | command1 | command2 > "$OUTPUT"

The first cat is often called "useless use of cat" because technically it requires starting a new process (often /usr/bin/cat) where this could be avoided if the command had been

< "$MYFILE" command1 | command2 > "$OUTPUT"

because then shell only needs to start command1 and simply point its stdin to the given file.

Why doesn't the shell do this conversion automatically? I feel that the "useless use of cat" syntax is easier to read and shell should have enough information to get rid of useless cat automatically. The cat is defined in POSIX standard so shell should be allowed to implement it internally instead of using a binary in path. The shell could even contain implementation only for exactly one argument version and fallback to binary in path.

profilepic.png
manpreet 2 years ago

 

"Useless use of cat" is more about how you write your code than about what actually runs when you execute the script. It's a sort of design anti-pattern, a way of going about something that could probably be done in a more efficient manner. It's a failure in understanding of how to best combine the given tools to create a new tool. I'd argue that stringing several sed and/or awk commands together in a pipeline also could be said to be a symptom of this same anti-pattern.

Fixing instances of "useless use of cat" in a script is a primarily matter of fixing the source code of the script manually. A tool such as ShellCheck can help with this by pointing them out:

$ cat script.sh
#!/bin/sh
cat file | cat
$ shellcheck script.sh

In script.sh line 2:
cat file | cat
    ^-- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.

Getting the shell to do this automatically would be difficult due to the nature of shell scripts. The way a script executes depends on the environment inherited from its parent process, and on the specific implementation of the available external commands.

The shell does not necessarily know what cat is. It could potentially be any command from anywhere in your $PATH, or a function.

If it was a built-in command (which it may be in some shells), it would have the ability to reorganise the pipeline as it would know of the semantics of its built-in cat command. Before doing that, it would additionally have to make assumptions about the next command in the pipeline, after the original cat.

Note that reading from standard input behaves slightly differently when it's connected to a pipe and when it's connected to a file. A pipe is not seekable, so depending on what the next command in the pipeline does, it may or may not behave differently if the pipeline was rearranged (it may detect whether the input is seekable and decide to do things differently if it is or if it isn't, in any case it would then behave differently).

This question is similar (in a very general sense) to "Are there any compilers that attempt to fix syntax errors on their own?" (at the Software Engineering StackExchange site), although that question is obviously about syntax errors, not useless design patterns. The idea about automatically changing the code based on intent is largely the same though.


0 views   0 shares

No matter what stage you're at in your education or career, TuteeHub will help you reach the next level that you're aiming for. Simply,Choose a subject/topic and get started in self-paced practice sessions to improve your knowledge and scores.