---
category: tool
tool: sed
filename: learnsed.sed
contributors:
     - ["Diomidis Spinellis", "https://www.spinellis.gr"]

---

__Sed__ is a standard tool on every POSIX-compliant UNIX system.
It's like an editor, such as Vim, Visual Studio Code, Atom, or Sublime.
However, rather than typing the commands interactively, you
provide them on the command line or in a file.
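
As a minimal sketch of these two ways of supplying commands, the shell snippet below runs the same one-command script (`3d`, which deletes the third input line) once as a command-line argument and once from a file loaded with `-f`. The sample input, the variable names, and the temporary script file are assumptions made for illustration only.

```shell
# Hypothetical sample input: five short lines.
input=$(printf 'one\ntwo\nthree\nfour\nfive\n')

# 1. The script is given directly on the command line.
from_cli=$(printf '%s\n' "$input" | sed '3d')

# 2. The same script is stored in a temporary file and loaded with -f.
script=$(mktemp)
printf '3d\n' > "$script"
from_file=$(printf '%s\n' "$input" | sed -f "$script")
rm -f "$script"

# Both runs should print the input without its third line.
printf '%s\n' "$from_cli"
printf '%s\n' "$from_file"
```

Putting the script in a file tends to pay off once it grows beyond a couple of commands, because it avoids shell quoting issues.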

_Sed_'s advantages over an interactive editor are that it can easily
automate text processing tasks, and that it can efficiently process
huge (terabyte-sized) files.
It can perform more complex tasks than _grep_ and for many text
processing tasks its commands are much shorter than what you would
write in _awk_, _Perl_, or _Python_.

_Sed_ works by reading a line of text (by default from its standard
input, unless some files are specified as arguments), processing
it with the specified commands, and then outputting the result
on its standard output.
You can suppress the default output by specifying the `-n` command-line
argument.
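
The read, process, output cycle and the effect of `-n` can be observed with a small experiment. The sketch below assumes a two-line sample input and uses hypothetical variable names; it relies only on the substitute (`s`) and print (`p`) commands.

```shell
# Default behaviour: each line is output after the script has processed it.
default_out=$(printf 'a\nb\n' | sed 's/a/A/')

# With -n the automatic output is suppressed, so nothing is printed.
silent_out=$(printf 'a\nb\n' | sed -n 's/a/A/')

# With -n, only lines that a command explicitly prints appear (here line 2).
explicit_out=$(printf 'a\nb\n' | sed -n '2p')

# Without -n, p adds a second copy of line 2 to the default output.
doubled_out=$(printf 'a\nb\n' | sed '2p')

printf '%s\n' "$default_out" "$explicit_out" "$doubled_out"
```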

```perl
#!/usr/bin/sed -f
# Files that begin with the above line and are given execute permission
# can be run as regular scripts.

# Comments are like this.

# Commands consist of a single letter and many can be preceded
# by a specification of the lines to which they apply.

# Delete the input's third line.
3d

# The same command specified on the command line as an argument to sed:
# sed 3d

# For many commands the specification can consist of two addresses,
# which select an inclusive range.
# Addresses can be specified numerically ($ is the last line) or through
# regular expressions delimited by /.

# Delete lines 1-10
1,10d

# Lines can also be specified as regular expressions, delimited by /.

# Delete empty lines.
/^$/d

# Delete blocks starting with SPOILER-BEGIN and ending with SPOILER-END.
/SPOILER-BEGIN/,/SPOILER-END/d

# A command without an address is applied to all lines.

# List lines in a visually unambiguous form (e.g. tab appears as \t).
l

# A command prefixed by ! will apply to non-matching lines.
# Keep only lines starting with a #.
/^#/!d

# Below are examples of the most often-used commands.

# Substitute the first occurrence in a line of John with Mary.
s/John/Mary/

# Remove all underscore characters (global substitution).
s/_//g

# Remove all HTML tags.
s/<[^>]*>//g

# In the replacement string & is the regular expression matched.

# Put each line inside double quotes.
s/.*/"&"/

# In the matched regular expression, \(pattern\) captures the text that
# the pattern matches.
# In the replacement string, \1 refers to the first captured group, \2 to
# the second, and so on. \u converts the following character to uppercase
# and \l to lowercase (both are GNU sed extensions).

# Convert snake_case_identifiers into camelCaseIdentifiers.
s/_\(.\)/\u\1/g


# The p (print) command is typically used together with the -n
# command-line option, which disables the default printing of each line.
# Output all lines between ``` and ``` (inclusive).
/```/,/```/p


# The y command maps characters from one set to another.
# Swap decimal and thousand separators (1,234,343.55 becomes 1.234.343,55).
y/.,/,./

# Quit after printing the line starting with END.
/^END/q

# You can stop reading here, and still get 80% of sed's benefits.
# Below are examples of how you can specify multiple sed commands.

# You can apply multiple commands by separating them with a newline or
# a semicolon.

# Delete the first and the last line.
1d
$d

# Delete the first and the last line.
1d;$d


# You can group commands in { } blocks.

# Convert first line to uppercase and print it.
1 {
  s/./\u&/g
  p
}

# Convert first line to uppercase and print it (less readable one-liner).
1{s/./\u&/g;p;}


# You can also stop reading here, if you're not interested in creating
# sed script files.

# Below are more advanced commands.  You typically put these in a file
# rather than specify them on a command line.  If you have to use
# many of these commands in a script, consider using a general purpose
# scripting language, such as Python or Perl.

# Append a line containing "profile();" after each line ending with ";".
/;$/a\
profile();

# Insert a line containing "profile();" before each line ending with ";".
/;$/i\
profile();

# Replace each REDACTED block with a single line containing [REDACTED].
/REDACTED-BEGIN/,/REDACTED-END/c\
[REDACTED]

# Replace the tag "<ourstyle>" by reading and outputting the file style.css.
/<ourstyle>/ {
  r style.css
  d
}

# Change each line inside REDACTED blocks into [REDACTED].
# Also write (append) a copy of the redacted text in the file redacted.txt.
/REDACTED-BEGIN/,/REDACTED-END/ {
  w redacted.txt
  c\
  [REDACTED]
}

# All operations described so far operate on a buffer called "pattern space".
# In addition, sed offers another buffer called "hold space".
# The following commands operate on the two, and can be used to keep
# state or combine multiple lines.

# Replace the contents of the pattern space with the contents of
# the hold space.
g

# Append a newline character followed by the contents of the hold
# space to the pattern space.
G

# Replace the contents of the hold space with the contents of the
# pattern space.
h

# Append a newline character followed by the contents of the
# pattern space to the hold space.
H

# Delete the initial segment of the pattern space through the first
# newline character and start the next cycle.
D

# Write the pattern space to the standard output if the default
# output has not been suppressed, and replace the pattern space
# with the next line of input.
n

# Append the next line of input to the pattern space, using an
# embedded newline character to separate the appended material from
# the original contents.  Note that the current line number
# changes.
N

# Write the pattern space, up to the first newline character, to the
# standard output.
P

# Swap the contents of the pattern and hold spaces.
x

# Here is a complete example of some of the buffer commands.
# Move the file's first line to its end.
1 {
  h
  d
}

$ {
  p
  x
}


# Three sed commands influence a script's control flow.

# Name this script position "my_label", to which the "b" and
# "t" commands may branch.
:my_label

# Continue executing commands from the position of my_label.
b my_label

# Branch to the end of the script.
b

# Branch to my_label if any substitutions have been made since the most
# recent reading of an input line or execution of a "t" (test) function.
t my_label

# Here is a complete example of branching.
# Join lines that end with a backslash into a single space-separated one.

# Name this position "loop"
: loop
# On lines ending with a backslash
/\\$/ {
  # Read the next line and append it to the pattern space
  N
  # Substitute backslash newline with a space
  s/\\\n/ /
  # Branch to the top for testing this line's ending
  b loop
}
```
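
The two complete examples above (moving the first line to the end, and joining backslash-continued lines) can be tried from a shell as sketched below. The sample inputs and the variable names `moved` and `joined` are assumptions for illustration. The sample input avoids a trailing backslash on the very last line, because implementations may differ in how `N` behaves when no next line exists.

```shell
# Move the first line to the end using the hold space.
# Line 1 is copied to the hold space and deleted; on the last line,
# p prints it, x swaps in the held first line, and the default output
# then prints that first line.
moved=$(printf 'first\nsecond\nthird\n' | sed -e '1{h;d;}' -e '${p;x;}')

# Join lines ending with a backslash into one space-separated line.
joined=$(printf 'a\\\nb\\\nc\nd\n' | sed '
:loop
/\\$/ {
  N
  s/\\\n/ /
  b loop
}')

printf '%s\n' "$moved"
printf '%s\n' "$joined"
```

Each command inside the braces is terminated with a semicolon or a newline, which is a cautious choice for sed versions that need a terminator before the closing brace.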

Further Reading:

* [The Open Group: sed - stream editor](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html)
  The POSIX standard regarding sed.
  Follow this for maximum portability.
* [FreeBSD sed -- stream editor](https://www.freebsd.org/cgi/man.cgi?query=sed&sektion=&n=1)
  The BSD manual page.
  This version of sed runs on BSD systems and macOS.
* [Project GNU: sed, a stream editor](https://www.gnu.org/software/sed/manual/sed.html)
  The GNU manual page. GNU sed is found on most Linux systems.
* [Lee E. McMahon: SED -- A Non-interactive Text Editor](https://wolfram.schneider.org/bsd/7thEdManVol2/sed/sed.pdf)
  The original sed documentation.
* [A collection of sed resources](http://sed.sourceforge.net/)
* [The sed FAQ](http://sed.sourceforge.net/sedfaq.html)