且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

如何使用sed将一行中的第一个单词与最后一个单词进行比较?

更新时间:2023-09-01 18:55:04

使用 awk.

$ awk '$1==$NF' file
abc2 1 def2 3 abc2
 cd eabc1d rq12345 cd
a a

通过 sed,

$ sed -n '/^ *\([^[:space:]]\+\)\b.* \1 *$/p' file
abc2 1 def2 3 abc2
 cd eabc1d rq12345 cd
a a

正则表达式解释:

^ - 断言我们在开始.

^ - Asserts that we re at the start.

* - 匹配零个或多个空格字符.

<space>* - Matches zero or more space characters.

\(...\) - 称为捕获组.与捕获组中存在的模式匹配的字符将存储在相应的组索引中.我们可以稍后通过反向引用来引用这些字符.

\(...\) - Called capturing group. Characters which are matched by the pattern present inside the capturing group will be stored inside the corresponding group index. We could refer these characters later through back-referencing.

[^[:space:]] 匹配一个非空格字符.[^[:space:]]\+ 匹配一个或多个非空格字符.\([^[:space:]]\+\) 现在匹配的字符被第一个捕获组捕获.

[^[:space:]] matches a non-space character. [^[:space:]]\+ matches one or more non-space characters. \([^[:space:]]\+\) now the matched characters are captured by the first capturing group.

\b 称为词边界,它匹配一个单词字符和一个非单词字符.这会强制 [^[:space:]]\+ 匹配上例中的空格.

\b called word boundary which matches between a word character and a non-word character. This forces the [^[:space:]]\+ to match upto a space in the above example.

.* 匹配任意字符零次或多次.

.* matches any character zero or more times.

\1\1这里指的是组索引1里面的字符.\1保证第一个字符前必须有一个空格.

<space>\1, \1 here refers to the characters inside group index 1. <space>\1 ensures that there must be a space exists before first character.

* 匹配零个或多个空格.

<space>* matches zero or more spaces.

$ 断言我们到了最后.

请注意,如果输入包含除空格字符以外的非单词字符,则上述 sed 可能会失败.

Note that the above sed may fail if the input contain non-word characters except space characters.