Shell notes: sed, tr, cut, paste

A beginner's primer on sed, tr, cut, and paste.

sed

sed: stream editor.

purpose

A powerful tool for processing text streams.

format
sed command file

Here, command is an ed-style command. (ed was written by Ken Thompson and is an ancestor of vi; see the ed wiki for details.)

examples
$ cat fruits
apples I have two apples.
bananas I have three bananas.
cherries I have lots of cherries.
cucumber I have a cucumber.
pear I have no pear.
watermelon I like watermelon.

$ sed 's/like/have/' fruits   # replace the first "like" on each line with "have"
apples I have two apples.
bananas I have three bananas.
cherries I have lots of cherries.
cucumber I have a cucumber.
pear I have no pear.
watermelon I have watermelon.

$ sed 's/apples/lemons/g' fruits  # g stands for global: replace every "apples" with "lemons"
lemons I have two lemons.
bananas I have three bananas.
cherries I have lots of cherries.
cucumber I have a cucumber.
pear I have no pear.
watermelon I like watermelon.
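
Two variations on the s command are worth sketching here. The sample strings below are made up for illustration and are not part of the fruits file. The first uses | instead of / as the delimiter, which avoids escaping slashes in paths; the second puts an address in front of s so that only matching lines are edited.

```shell
# Hypothetical sample line containing slashes.
line='/usr/bin/false is the shell'

# Any character may follow s as the delimiter; | keeps the path readable.
result=$(printf '%s\n' "$line" | sed 's|/usr/bin|/bin|')
printf '%s\n' "$result"

# Restrict the substitution to lines that start with "pear".
only_pear=$(printf '%s\n' 'apples I have two apples.' 'pear I have no pear.' \
  | sed '/^pear/s/no/one/')
printf '%s\n' "$only_pear"
```

The first command prints "/bin/false is the shell"; the second leaves the apples line alone and changes only the pear line.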

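sed does not modify the original file. To write the result back to the original file, redirect the output to a temporary file and then move it over the original: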

$ cat fruits
apples I have two apples.
bananas I have three bananas.
cherries I have lots of cherries.
cucumber I have a cucumber.
pear I have no pear.
watermelon I like watermelon.

$ sed 's/like/have/' fruits > temp
$ mv temp fruits
$ cat fruits
apples I have two apples.
bananas I have three bananas.
cherries I have lots of cherries.
cucumber I have a cucumber.
pear I have no pear.
watermelon I have watermelon.
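
A slightly more defensive version of the same idea is sketched below. It assumes the mktemp utility is available (it is common but not part of POSIX), and it works in a scratch directory with a shortened, hypothetical copy of fruits. The original is replaced only if sed exits successfully.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
printf '%s\n' 'pear I have no pear.' 'watermelon I like watermelon.' > "$workdir/fruits"

# Write to a uniquely named temporary file, then replace the original
# only when sed succeeded (&& stops the chain on failure).
tmp=$(mktemp "$workdir/fruits.XXXXXX")
sed 's/like/have/' "$workdir/fruits" > "$tmp" && mv "$tmp" "$workdir/fruits"

updated=$(cat "$workdir/fruits")
printf '%s\n' "$updated"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Many sed implementations also offer an -i option for in-place editing, but its syntax differs between GNU and BSD sed, so the temporary-file approach is the more portable one.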
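
-n option

By default, sed prints every line. The -n option suppresses that automatic output, so only lines you explicitly print (for example with the p command) appear.

Example: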

$ sed -n '1,2p' fruits                            # print lines 1 and 2
apples I have two apples.
bananas I have three bananas.

$ sed -n '1,$p' fruits                            # print line 1 through the last line, i.e. every line
apples I have two apples.
bananas I have three bananas.
cherries I have lots of cherries.
cucumber I have a cucumber.
pear I have no pear.
watermelon I have watermelon.

$ sed -n '/pear/p' fruits                     # print the lines that contain pear
pear I have no pear.

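$ sed -n l fruits       # that is the letter l after -n: prints every line, showing non-printing characters as escapes (such as \t or a \ooo octal value) and marking each line end with $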
apples I have two apples.$
bananas I have three bananas.$
cherries I have lots of cherries.$
cucumber I have a cucumber.$
pear I have no pear.$
watermelon I have watermelon.$

For common characters and their octal values, see AsciiTable.
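
As a small sketch of how -n combines with other commands: the p flag on s prints only the lines where a substitution actually happened, and l makes a tab visible. The input strings here are illustrative samples, not the full fruits file.

```shell
# With -n, only lines changed by the substitution are printed (p flag).
changed=$(printf '%s\n' 'pear I have no pear.' 'watermelon I like watermelon.' \
  | sed -n 's/like/have/p')
printf '%s\n' "$changed"

# l shows the tab as \t and marks the end of the line with $.
visible=$(printf 'a\tb\n' | sed -n l)
printf '%s\n' "$visible"
```

The first command prints only the watermelon line; the second prints a\tb$.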

delete lines

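Use d to delete lines. Note that d is a command, not an option. (As before, d does not remove anything from the original file; the specified lines are only omitted from the output.)

Example:

$ sed '2,3d' fruits                            # delete lines 2 and 3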
apples I have two apples.
cucumber I have a cucumber.
pear I have no pear.
watermelon I have watermelon.

$ sed '/watermelon/d' fruits    # delete the lines that contain watermelon
apples I have two apples.
bananas I have three bananas.
cherries I have lots of cherries.
cucumber I have a cucumber.
pear I have no pear.
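
The same d command covers a few everyday cleanup tasks. The sketch below uses a made-up five-line input (a header, some blank lines, two data lines) to show deleting blank lines and combining two deletions with -e.

```shell
# Hypothetical input: a header line, blank lines, and two data lines.
text=$(printf '%s\n' 'header' '' 'apples' '' 'pear')

# /^$/ matches empty lines, so this drops every blank line.
no_blank=$(printf '%s\n' "$text" | sed '/^$/d')
printf '%s\n' "$no_blank"

# Several commands can be combined with -e: drop line 1 and all blank lines.
data_only=$(printf '%s\n' "$text" | sed -e '1d' -e '/^$/d')
printf '%s\n' "$data_only"
```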

tr

purpose

Translates characters, or deletes them with the -d option. A common use is case conversion.

format
tr from-chars to-chars

Special characters such as tab can be written using their octal values, for example '\11' for tab.

examples

Let's convert the contents of the fruits file to uppercase:

$ tr '[a-z]' '[A-Z]' < fruits
APPLES I HAVE TWO APPLES.
BANANAS I HAVE THREE BANANAS.
CHERRIES I HAVE LOTS OF CHERRIES.
CUCUMBER I HAVE A CUCUMBER.
PEAR I HAVE NO PEAR.
WATERMELON I HAVE WATERMELON.

$ tr '[a-z]' '[A-Z]' < fruits | tr '[A-Z]' '[a-z]'
apples i have two apples.
bananas i have three bananas.
cherries i have lots of cherries.
cucumber i have a cucumber.
pear i have no pear.
watermelon i have watermelon.

$ date | tr ' ' '\12'  # convert spaces to newlines (\12 is the octal value of newline)
Sat
Aug
10
18:40:01
CST
2019
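
Octal escapes work on either side of the translation. As a brief sketch with made-up input strings: '\11' stands for tab and '\12' for newline.

```shell
# Replace each tab (octal 11) with a single space.
tabbed=$(printf 'pizza\t4\tgood\n')
spaced=$(printf '%s\n' "$tabbed" | tr '\11' ' ')
printf '%s\n' "$spaced"

# Split a colon-separated list into one entry per line (octal 12 is newline).
entries=$(printf '%s\n' '/bin:/usr/bin:/usr/local/bin' | tr ':' '\12')
printf '%s\n' "$entries"
```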
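
-s option

s stands for squeeze: after translation, any run of consecutive repeated characters from to-chars is squeezed into a single character.

A common use is collapsing runs of unnecessary spaces.

Example: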

$ cat lotsspaces
so    you   just   get
this    one!
Open         the    door, please!

$ tr -s ' ' ' ' <  lotsspaces
so you just get
this one!
Open the door, please!
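
Because squeezing happens after translation, -s can translate and collapse in one step. The sketch below reuses the first line of lotsspaces as a literal string: the first command squeezes spaces using a single character set, and the second turns each run of spaces into one newline, giving one word per line.

```shell
line='so    you   just   get'

# With a single set, tr -s only squeezes repeats of those characters.
squeezed=$(printf '%s\n' "$line" | tr -s ' ')
printf '%s\n' "$squeezed"

# Translate spaces to newlines, then squeeze the resulting runs of newlines.
words=$(printf '%s\n' "$line" | tr -s ' ' '\12')
printf '%s\n' "$words"
```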

-d option

With -d, tr deletes the specified characters from the input stream.

The format is:

tr -d from-chars

Example:

$ cat lotsspaces
so    you   just   get
this    one!
Open         the    door, please!

$ tr -d ' ' < lotsspaces   # delete the spaces
soyoujustget
thisone!
Openthedoor,please!

Keep in mind that tr works only on single characters. To replace a multi-character string, such as apples with lemons, use sed instead.
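
The difference is easy to see on a small made-up input. In this sketch, tr maps every a to x and every n to y independently, while sed replaces the two-character string an as a unit.

```shell
# tr: character-by-character mapping (a -> x, n -> y).
by_char=$(printf '%s\n' 'banana' | tr 'an' 'xy')
printf '%s\n' "$by_char"     # bxyxyx

# sed: replaces the string "an" as a whole, so the final lone "a" survives.
by_string=$(printf '%s\n' 'banana' | sed 's/an/xy/g')
printf '%s\n' "$by_string"   # bxyxya
```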

cut

purpose

Extracts selected fields of data from a file or from the output of a command.

format
cut -cchars file
examples
$ cat food                       # the columns are separated by single tabs
pizza    4    good
chesses    5    not so bad
coco    10    great!
chips    99    yeah!
sweets    9    I like them!

$ cut -c1-6 food                 # extract the first 6 characters
pizza
chesse
coco    1
chips
sweets

$ cut -c1-6,10- food    # extract the first 6 characters plus everything from the 10th character onward
pizza    ood
chesse    not so bad
coco    1reat!
chips    yeah!
sweetsI like them!
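
Character positions are most useful on fixed-width text. As a sketch, the timestamp below is a made-up literal string; in practice it could come from a command's output through a pipe.

```shell
stamp='2019-08-10 18:40:01'

# Characters 1-4 hold the year.
year=$(printf '%s\n' "$stamp" | cut -c1-4)
printf '%s\n' "$year"        # 2019

# Characters 6-10 hold the month and day.
month_day=$(printf '%s\n' "$stamp" | cut -c6-10)
printf '%s\n' "$month_day"   # 08-10

# An open-ended range runs to the end of the line.
time_part=$(printf '%s\n' "$stamp" | cut -c12-)
printf '%s\n' "$time_part"   # 18:40:01
```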

-d -f options

-d: specifies the field delimiter

-f: specifies the field(s) to extract

An example makes this clear.

$ cat users                                 # an excerpt from /etc/passwd
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
root:*:0:0:System Administrator:/var/root:/bin/sh
daemon:*:1:1:System Services:/var/root:/usr/bin/false
_uucp:*:4:4:Unix to Unix Copy Protocol:/var/spool/uucp:/usr/sbin/uucico
_taskgated:*:13:13:Task Gate Daemon:/var/empty:/usr/bin/false
_networkd:*:24:24:Network Services:/var/networkd:/usr/bin/false
_installassistant:*:25:25:Install Assistant:/var/empty:/usr/bin/false

$ cut -d: -f1 users                # extract the user names
nobody
root
daemon
_uucp
_taskgated
_networkd
_installassistant

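-d accepts other delimiters, but the delimiter must be a single character. Writing a tab as '\t' fails with "bad delimiter", because cut sees two characters (a backslash and a t). Tab is already the default delimiter for -f, so in that case -d can simply be omitted; alternatively, consider using sed.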
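
Two further points can be sketched briefly: -f accepts a comma-separated list of fields, and when -d is omitted cut splits on tabs. The root entry is taken from the users excerpt above; the tab-separated line is a made-up sample.

```shell
entry='root:*:0:0:System Administrator:/var/root:/bin/sh'

# Fields 1 and 7 (user name and login shell); output keeps the : delimiter.
name_shell=$(printf '%s\n' "$entry" | cut -d: -f1,7)
printf '%s\n' "$name_shell"   # root:/bin/sh

# No -d given: fields are split on tabs, so -f2 is the middle column.
score=$(printf 'pizza\t4\tgood\n' | cut -f2)
printf '%s\n' "$score"        # 4
```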

paste

purpose

The opposite of cut: paste joins the corresponding lines of several files side by side.

format
paste files
examples
$ cat students
张三
李四
王五
赵六
$ cat scores
89
90
100
89

$ paste students scores             # separated by a tab by default; sed -n l reveals the delimiter
张三    89
李四    90
王五    100
赵六    89

$ paste students scores | sed -n l # the delimiter is \t
张三\t89$
李四\t90$
王五\t100$
赵六\t89$

-d option

As with cut, -d specifies the delimiter; the default is tab. Continuing the example above:

$ paste -d':' students scores
张三:89
李四:90
王五:100
赵六:89

$ paste -d'>' students scores
张三>89
李四>90
王五>100
赵六>89

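Note that each delimiter is a single character. If -d is given several characters, paste treats them as a list and cycles through it, one delimiter per gap between adjacent files. With only two files there is a single gap per line, so only the first character is ever used: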

$ paste -d'>++>>' students scores
张三>89
李四>90
王五>100
赵六>89
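
With two files, only the first character of the list is ever consumed. With three or more files, paste works through the list in order, restarting on every output line. The sketch below assumes mktemp is available and uses three small hypothetical files created in a scratch directory.

```shell
# Build three small hypothetical input files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a' 'b' > "$workdir/names"
printf '%s\n' '1' '2' > "$workdir/scores"
printf '%s\n' 'x' 'y' > "$workdir/grades"

# ':' goes between files 1 and 2, ',' between files 2 and 3.
joined=$(paste -d':,' "$workdir/names" "$workdir/scores" "$workdir/grades")
printf '%s\n' "$joined"

rm -rf "$workdir"
```

This prints a:1,x followed by b:2,y.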

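To be safe, wrap the delimiter in single quotes. If you don't believe it, try the following (on a copy you can afford to lose) :P

$ paste -d> students scores   # the shell treats the unquoted > as a redirection and truncates students
$ cat students                # the original contents are gone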

-s option

Joins all the lines of each given file into a single line (pastes lines together).

For example:

$ paste -s students                  # the default delimiter is a tab
张三    李四    王五    赵六

$ paste -d';' -s students  # use ; as the delimiter
张三;李四;王五;赵六

$ ls | paste -d';' -s -   # when the input comes from a pipe (standard input), use -
food;scores;students;users
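
The - operand can also be repeated. As a sketch with made-up numeric input: a single - with -s joins everything into one line, while two - operands make paste read standard input alternately and lay the lines out in two columns.

```shell
# Join all input lines into one comma-separated line.
csv=$(printf '%s\n' 1 2 3 4 | paste -s -d',' -)
printf '%s\n' "$csv"     # 1,2,3,4

# Two - operands: consecutive input lines become the two columns of a row.
pairs=$(printf '%s\n' 1 2 3 4 | paste -d' ' - -)
printf '%s\n' "$pairs"
```

The second command prints "1 2" and then "3 4".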

references

ed wiki

AsciiTable

Shell Programming in Unix, Linux and OS X