Sed, a stream editor
21 Oct 2013

It is always important to keep DRY ("Don't repeat yourself") in mind, especially for those who work with large numbers of text files every day. As a Physics PhD, I run a lot of simulations by feeding input files to packages and then analyzing the generated outputs. One tool that keeps me from repeating myself when working with these files is sed (stream editor). It is a marvelous utility. However, I had never learned its real power.

sed is short for stream editor. It works on streams of characters in a non-interactive, line-by-line way. If that sounds strange, think of it as a pipe: your input flows through the pipe, where the magic happens.
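As a minimal sketch of the pipe picture, the snippet below sends one line through sed and captures what comes out the other end. The s (substitute) command and the sample text are only illustrations and are not covered in this post.

```shell
# Text flows in on standard input, is transformed by the sed script,
# and flows out on standard output.
piped=$(printf 'hello world\n' | sed 's/world/sed/')
echo "$piped"   # prints: hello sed
```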

How sed works

When it comes to “how it works” questions, it’s always best to read the manual:

sed operates by performing the following cycle on each line of input: first, sed reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only executed if the condition is verified before the command is to be executed.

When the end of the script is reached, unless the -n option is in use, the contents of pattern space are printed out to the output stream, adding back the trailing newline if it was removed. Then the next cycle starts for the next input line.

Unless special commands (like D) are used, the pattern space is deleted between two cycles. The hold space, on the other hand, keeps its data between cycles (see commands h, H, x, g, G to move data between both buffers).

To summarize, sed works in the following way:

  1. Read a line, remove the trailing newline, and put it in the pattern space.
  2. Execute each command that either has no address or whose address matches the current line.
  3. Unless the -n option is in use, print the contents of the pattern space to the output stream, adding back the trailing newline.
  4. Proceed to the next line.
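The cycle above can be observed with a small experiment. This is only a sketch on made-up two-line input; the p (print) and s (substitute) commands are used for illustration.

```shell
# Two-line sample input (hypothetical data).
input=$(printf 'alpha\nbeta\n')

# 'p' prints the pattern space explicitly during step 2; the automatic
# print in step 3 then prints it again, so every line appears twice.
doubled=$(printf '%s\n' "$input" | sed 'p')
echo "$doubled"

# With the address 2, the substitution is executed only in the cycle
# for line 2; line 1 passes through untouched.
addressed=$(printf '%s\n' "$input" | sed '2s/beta/BETA/')
echo "$addressed"
```

The first command prints alpha, alpha, beta, beta; the second prints alpha, BETA.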

Usage

I will not go through all the details of the power of sed. You can find them all in this wonderful tutorial by Bruce Barnett. What I'm going to cover are the simple yet essential options that I find most helpful.

Basic Usage

sed [options] commands [file-to-edit]

sed applies commands to each line of file-to-edit, or to standard input if no file is given. options change sed's default behavior.
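A minimal sketch of this form, run against a throwaway file. The file name run.in and its contents are hypothetical stand-ins for a simulation input file.

```shell
# Create a hypothetical input file in a temporary directory.
workdir=$(mktemp -d)
printf 'temperature = 300\nsteps = 1000\n' > "$workdir/run.in"

# sed [options] commands [file-to-edit], here with no options.
result=$(sed 's/300/350/' "$workdir/run.in")
echo "$result"

# The file on disk is untouched; only the output stream differs.
original=$(cat "$workdir/run.in")
echo "$original"

rm -r "$workdir"
```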

Suppress auto printing

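By default sed prints the pattern space to the standard output at the end of every cycle, so every input line shows up in the output whether or not a command changed it. Use the -n option to suppress this automatic printing; then only lines that are printed explicitly, for example with the p command, appear in the output.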
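To make the difference concrete, here is a small sketch comparing the same script with and without -n. The three-line input is made up for illustration.

```shell
# Without -n, 'p' adds a second copy of line 2 on top of the automatic output.
with_auto=$(printf 'one\ntwo\nthree\n' | sed '2p')
echo "$with_auto"

# With -n, only what 'p' prints explicitly reaches the output.
only_two=$(printf 'one\ntwo\nthree\n' | sed -n '2p')
echo "$only_two"
```

The first command prints one, two, two, three; the second prints only two.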

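Edit in place

By default sed keeps the original file unchanged. Use -i.extension (or, in GNU sed, the long form --in-place=.extension) to edit the file in place and save a backup of the original as file-to-edit.extension. Note that the suffix is attached directly to -i: writing -i=.extension would make the backup suffix "=.extension".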
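A short sketch of in-place editing with a backup, assuming GNU sed. The file name run.in and the suffix .bak are arbitrary choices for the example.

```shell
# Create a hypothetical input file in a temporary directory.
workdir=$(mktemp -d)
printf 'steps = 1000\n' > "$workdir/run.in"

# The suffix is attached directly to -i, with no '=' and no space.
sed -i.bak 's/1000/2000/' "$workdir/run.in"

edited=$(cat "$workdir/run.in")       # the file after editing
backup=$(cat "$workdir/run.in.bak")   # the untouched copy saved by sed
echo "edited: $edited"
echo "backup: $backup"

rm -r "$workdir"
```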

Lines to edit

Put an address before a command to limit which lines the command is applied to. Common addressing rules are:

  • number, matches only the line with that line number.
  • first~step, selects every stepth line starting with the line first, i.e. all lines whose line number equals first + n * step, where n is a non-negative integer (a GNU extension).
  • $, selects the last line of the last file of input; it selects the last line of each file when the -i or -s options are specified.
  • first,+N, matches the line first and the N lines following it (a GNU extension).
  • first,~N, matches the line first and the lines following it until the next line whose input line number is a multiple of N (a GNU extension).
  • /regexp/, selects any line which matches the regular expression regexp.
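The addressing rules above can be tried out on the numbers 1 to 10. This is a sketch that assumes GNU sed, since first~step, first,+N and first,~N are GNU extensions; the helper names lines and join are made up for the example.

```shell
# Sample input: the numbers 1 to 10, one per line.
lines() { seq 1 10; }
# Join the selected lines with spaces so each result fits on one line.
join() { paste -sd ' ' -; }

single=$(lines | sed -n '3p' | join)       # number: line 3 only
stepped=$(lines | sed -n '1~3p' | join)    # first~step: lines 1, 4, 7, 10
last=$(lines | sed -n '$p' | join)         # $: the last line
plus=$(lines | sed -n '2,+3p' | join)      # first,+N: line 2 and the 3 lines after it
multiple=$(lines | sed -n '5,~4p' | join)  # first,~N: line 5 up to the next multiple of 4
regex=$(lines | sed -n '/^1/p' | join)     # /regexp/: lines that start with 1

printf '%s\n' "$single" "$stepped" "$last" "$plus" "$multiple" "$regex"
```

Each command combines -n with p, so only the lines selected by the address are printed.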

Til next time,
Jianfeng at 22:34