stackademic

Text Processing

Search, filter, and transform text with grep, sed, and friends

Overview

Unix treats text as a universal interface, so a small set of text tools can slice through logs, config files, and command output. grep searches, sort and uniq organize, and cut, sed, and awk extract and transform. Mastering these turns tedious manual editing into one-line commands.

Syntax / Usage

Combine searching, sorting, and field extraction, often within a pipeline.

grep "TODO" notes.txt          # print lines matching a pattern
grep -ri "error" logs/         # recursive, case-insensitive search
sort names.txt                 # sort lines alphabetically
sort names.txt | uniq          # drop duplicate lines (uniq only collapses adjacent repeats, hence the sort)
wc -l file.txt                 # count lines
cut -d, -f1 data.csv           # extract the first comma-separated field
sed 's/foo/bar/g' file.txt     # replace all "foo" with "bar"
awk '{ print $2 }' data.txt    # print the second whitespace field
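As a rough sketch of how these commands combine in a pipeline, the snippet below counts error lines per component. The sample data, its level,component,message layout, and the variable names are assumptions made up for illustration.

```shell
# Hypothetical sample data in the form level,component,message.
data=$(mktemp)
printf '%s\n' \
  'ERROR,db,timeout' \
  'INFO,web,started' \
  'error,web,bad gateway' \
  'ERROR,db,connection refused' > "$data"

# grep keeps lines starting with "error," in any case, cut takes the
# second comma-separated field, sort groups identical names together,
# uniq -c counts each group, and awk reshapes "count name" to "name=count".
summary=$(grep -i '^error,' "$data" | cut -d, -f2 | sort | uniq -c | awk '{ print $2 "=" $1 }')
echo "$summary"

rm -f "$data"
```

Each stage reads the previous stage's output, so any step can be checked in isolation by cutting the pipeline short at that point.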

Examples

List the ten most frequent IP addresses in a log file, assuming the address is the first whitespace-separated field on each line:

awk '{ print $1 }' access.log | sort | uniq -c | sort -rn | head
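To see what each stage contributes, the same pipeline can be run on a small made-up log. The file contents and the top_ip variable are assumptions for illustration, not a real access log.

```shell
# Hypothetical sample log; only the first field (client IP) matters here.
log=$(mktemp)
printf '%s\n' \
  '10.0.0.1 GET /' \
  '10.0.0.2 GET /about' \
  '10.0.0.1 GET /contact' \
  '10.0.0.3 GET /' \
  '10.0.0.1 GET /about' \
  '10.0.0.2 GET /' > "$log"

# sort puts equal IPs next to each other, uniq -c prefixes each distinct IP
# with its count, and sort -rn orders by that count, largest first.
ranked=$(awk '{ print $1 }' "$log" | sort | uniq -c | sort -rn)
echo "$ranked"

# Keep only the address from the top-ranked line.
top_ip=$(echo "$ranked" | head -n 1 | awk '{ print $2 }')
echo "$top_ip"

rm -f "$log"
```

The first sort is what makes uniq -c count correctly; without it, the same address could be reported several times with partial counts.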

Replace every occurrence of a value in a config file and write the result to a new file, leaving the original untouched:

sed 's/localhost/127.0.0.1/g' config.env > config.local.env
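A minimal sketch of that behavior, using a throwaway directory and made-up config keys (DB_HOST, CACHE_HOST, and PORT are assumptions, not keys from any real project):

```shell
# Work in a temporary directory so no real files are touched.
dir=$(mktemp -d)
printf 'DB_HOST=localhost\nCACHE_HOST=localhost\nPORT=8080\n' > "$dir/config.env"

# sed writes the edited text to standard output; the redirect saves it
# under a different name, so config.env itself is not modified.
sed 's/localhost/127.0.0.1/g' "$dir/config.env" > "$dir/config.local.env"

original=$(cat "$dir/config.env")
rewritten=$(cat "$dir/config.local.env")
echo "$rewritten"

rm -rf "$dir"
```

Writing to a different file name matters: redirecting the output back onto the input file would make the shell truncate it before sed reads it.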

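Extract the email column from a CSV, assuming it is the third field and no field contains a quoted comma (cut splits on every comma and does not understand CSV quoting):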

cut -d, -f3 users.csv | grep "@"
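The following sketch runs the same command on a small made-up file. The column layout (id,name,email) and the addresses are assumptions for illustration.

```shell
# Hypothetical CSV with a header row and the email in the third column.
csv=$(mktemp)
printf '%s\n' \
  'id,name,email' \
  '1,Ada,ada@example.com' \
  '2,Bob,bob@example.com' > "$csv"

# cut extracts field 3 from every line, including the header;
# grep "@" then drops the header because "email" contains no "@".
emails=$(cut -d, -f3 "$csv" | grep "@")
echo "$emails"

rm -f "$csv"
```

Here grep acts as a cheap filter for the header row and for empty cells, not as real email validation.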

Common Mistakes

  • Expecting uniq to catch all duplicates without sorting first (it only compares adjacent lines)
  • Forgetting the g flag in sed, which then replaces only the first match per line
  • Not quoting patterns that contain spaces or special characters, so the shell splits or expands them before the tool sees them
  • Using the wrong delimiter in cut (default is tab, not comma)
  • Confusing grep regular expressions with shell glob patterns
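
The pitfalls above can be reproduced on tiny inputs. This is a sketch with made-up data; the variable names are only labels for the comparison.

```shell
# 1. uniq without sort: the two "b" lines are not adjacent, so both survive.
uniq_only=$(printf 'b\na\nb\n' | uniq)
sorted_uniq=$(printf 'b\na\nb\n' | sort | uniq)

# 2. sed without the g flag replaces only the first match on each line.
first_only=$(echo 'foo foo' | sed 's/foo/bar/')
every_match=$(echo 'foo foo' | sed 's/foo/bar/g')

# 3. Quoting keeps a pattern with a space as one argument to grep.
quoted_hits=$(printf 'two words\ntwo\n' | grep -c 'two words')

# 4. cut splits on tabs by default; a line with no tab is printed whole.
no_delim=$(echo 'a,b,c' | cut -f1)
with_delim=$(echo 'a,b,c' | cut -d, -f1)

# 5. In a regex, "a*" means zero or more "a", not "a followed by anything".
#    With -x (whole-line match), ca*t matches "ct" and "cat" but not "cart",
#    whereas the shell glob ca*t would match "cat" and "cart" but not "ct".
regex_hits=$(printf 'ct\ncat\ncart\n' | grep -x 'ca*t')

printf '%s\n' "$uniq_only" "$sorted_uniq" "$first_only" "$every_match" \
  "$quoted_hits" "$no_delim" "$with_delim" "$regex_hits"
```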

See Also

command-line-pipes-and-redirection command-line-files-and-directories