Sed, a stream editor
21 Oct 2013
It is always important to keep DRY ("Don't repeat yourself") in mind, especially for those
who work with large numbers of text files every day. As a Physics PhD, I run a lot of simulations
by feeding input files to packages and then analyzing the generated outputs. One thing that helps me stay DRY when
interacting with these files is sed (stream editor). It is a marvelous utility. However, I never learned its real power.
sed is short for stream editor. It works with streams of characters in a non-interactive, line-by-line
way. If that sounds strange, think of it as a pipe: your input flows through the pipe, where the magic happens.
When it comes to “how it works” questions, it is always best to read the manual:
sed operates by performing the following cycle on each line of input: first, sed reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only executed if the condition is verified before the command is to be executed.
When the end of the script is reached, unless the -n option is in use, the contents of pattern space are printed out to the output stream, adding back the trailing newline if it was removed. Then the next cycle starts for the next input line.
Unless special commands (like D) are used, the pattern space is deleted between two cycles. The hold space, on the other hand, keeps its data between cycles (see commands h, H, x, g, G to move data between both buffers).
In short, sed works in the following way:
- Read a line, remove the trailing newline, and put it in the pattern space.
- Execute each command that has no address or whose address matches the current line.
- Print the contents of the pattern space to the output stream (unless -n is in use), adding back the trailing newline.
- Proceed to the next line.
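The cycle above can be watched in action with a few lines of input. This is only a small sketch: the sample words are made up, and the s (substitute) command is used simply because its effect is easy to see.

```shell
# Three lines of sample input (hypothetical data, for illustration only).
input=$(printf 'alpha\nbeta\ngamma\n')

# No address: the s command runs in every cycle, and each line is
# auto-printed when the end of the script is reached.
all=$(printf '%s\n' "$input" | sed 's/a/A/')
printf '%s\n' "$all"

# Address 2: the command runs only in the cycle for line 2. The other
# lines still pass through the pattern space and are printed unchanged.
second=$(printf '%s\n' "$input" | sed '2s/a/A/')
printf '%s\n' "$second"
```

In the first run the first "a" of every line is replaced. In the second run only line 2 changes.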
I will not go through all the details of the power of sed. You can find them all in
this wonderful tutorial by Bruce Barnett. What
I’m going to cover are the simple yet essential options that I find most helpful.
sed [options] commands [file-to-edit]
sed applies commands to the file to be edited.
options are used to change sed's default behaviors.
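As a quick sketch of that synopsis, the snippet below runs one command against a file and once against standard input. The directory, the file name input.txt and its contents are hypothetical and stand in for a real simulation input.

```shell
# Work in a throwaway directory; input.txt is a hypothetical file name.
workdir=$(mktemp -d)
printf 'temperature 300\npressure 1\n' > "$workdir/input.txt"

# commands = 's/300/350/', file-to-edit = input.txt, no options.
from_file=$(sed 's/300/350/' "$workdir/input.txt")
printf '%s\n' "$from_file"

# Without a file argument, sed reads standard input instead.
from_stdin=$(printf 'temperature 300\n' | sed 's/300/350/')
printf '%s\n' "$from_stdin"

# The file on disk is untouched: the result only went to standard output.
cat "$workdir/input.txt"
```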
Suppress auto printing
By default sed prints the pattern space to the standard output at the end of every cycle. Use
the -n option to suppress this automatic printing, so that a line is printed only when a command such as p asks for it.
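A small comparison makes the effect of -n visible. The three input words are made up, and p is the command that prints the pattern space explicitly.

```shell
input=$(printf 'one\ntwo\nthree\n')

# Without -n: p prints line 2, and auto-printing prints every line,
# so line 2 shows up twice.
with_autoprint=$(printf '%s\n' "$input" | sed '2p')
printf '%s\n' "$with_autoprint"

# With -n: auto-printing is off, so only the explicit p produces output.
without_autoprint=$(printf '%s\n' "$input" | sed -n '2p')
printf '%s\n' "$without_autoprint"
```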
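Edit in place
sed keeps the original file unchanged. Use
-i.extension (the suffix is attached directly to -i, with no = sign) to edit
the file in place and save a backup copy whose name is the original file name with .extension appended.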
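Here is a minimal sketch of in-place editing with a backup, assuming GNU sed. The file name run.in and its contents are hypothetical, and the backup suffix .bak is attached directly to -i.

```shell
# Work in a throwaway directory; run.in is a hypothetical input file.
workdir=$(mktemp -d)
printf 'steps 1000\n' > "$workdir/run.in"

# Edit run.in in place; the original content is saved as run.in.bak.
sed -i.bak 's/1000/5000/' "$workdir/run.in"

cat "$workdir/run.in"      # edited content
cat "$workdir/run.in.bak"  # original content
```

Keeping the backup around is a cheap safety net while a substitution is still being tuned.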
Lines to edit
Put an address before a command to limit the lines to which the command is applied. Common addressing rules are:
- number: matches only the line with that line number.
- first~step: selects every step-th line starting with line first, i.e. all lines whose line number equals first + n * step, where n is a non-negative integer.
- $: selects the last line of the last file of input, or the last line of each file when the -i or -s options are specified.
- first,+N: matches line first and the N lines following it.
- first,~N: matches line first and the lines following it until the next line whose input line number is a multiple of N.
- /regexp/: selects any line that matches the regular expression regexp.
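The addressing rules can be tried out on ten numbered lines, where every output line shows its own line number. This sketch assumes GNU sed, since the first~step, first,+N and first,~N forms are GNU extensions. The variable names are made up for the example.

```shell
# Ten lines containing 1..10, so the content equals the line number.
numbers=$(seq 1 10)

single=$(printf '%s\n' "$numbers" | sed -n '3p')       # line 3 only
stepped=$(printf '%s\n' "$numbers" | sed -n '2~3p')    # lines 2, 5, 8
last=$(printf '%s\n' "$numbers" | sed -n '$p')         # last line, 10
plus=$(printf '%s\n' "$numbers" | sed -n '2,+2p')      # line 2 and the next 2 lines
multiple=$(printf '%s\n' "$numbers" | sed -n '5,~4p')  # line 5 up to line 8 (a multiple of 4)
regex=$(printf '%s\n' "$numbers" | sed -n '/^1/p')     # lines starting with "1"

# Unquoted expansion joins each multi-line result onto a single line.
echo "single:"   $single
echo "stepped:"  $stepped
echo "last:"     $last
echo "plus:"     $plus
echo "multiple:" $multiple
echo "regex:"    $regex
```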
Til next time,
Jianfeng at 22:34