Regular Expressions

Original poster: 阿龍
Posted: 2007-05-08

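Using wildcard characters is called globbing. bash understands only a few wildcards, but they are enough for globbing.

To extend the glob idea to text in general (files, text streams, string variables in a program, and so on), you need regular expressions (abbreviated regex).

Many tools handle regular expressions, such as grep, sed, awk, Perl, and Python, but for the LPI exam grep and sed are the most important.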

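Usage:

Anchors: describe a position rather than a character

Regex   Description
^       Matches the start of a line. ^ acts as an anchor only when it is the leftmost character of the regex.
$       Matches the end of a line. $ acts as an anchor only when it is the rightmost character of the regex.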

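[root@localhost ynie]# grep -n '^kenny' test
Lists every line of the file test that begins with kenny, prefixed with its line number.

[root@localhost ynie]# grep -n 'kenny$' test
Lists every line of the file test that ends with kenny, prefixed with its line number.

[root@localhost ynie]# grep -c '^$' test
Finds the empty lines in the file test and prints how many there are.

[root@localhost ynie]# grep '^kenny$' test
Finds every line of the file test that consists of the single word kenny.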
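As a quick check of the four anchor patterns above, the following sketch builds a small throwaway file and runs each grep against it. The file contents and the variable names (demo, starts, ends, empty, exact) are made up for illustration and are not part of the original post.

```shell
#!/bin/sh
# Hypothetical sample data: five lines, the third one empty.
demo=$(mktemp)
printf '%s\n' 'kenny was here' 'hello kenny' '' 'kenny' 'say kenny twice' > "$demo"

starts=$(grep -n '^kenny' "$demo")   # lines that begin with kenny
ends=$(grep -n 'kenny$' "$demo")     # lines that end with kenny
empty=$(grep -c '^$' "$demo")        # number of empty lines
exact=$(grep -n '^kenny$' "$demo")   # lines that are exactly kenny

printf 'starts:\n%s\nends:\n%s\nempty: %s\nexact: %s\n' \
    "$starts" "$ends" "$empty" "$exact"
rm -f "$demo"
```

Note that the last line of the sample contains kenny but matches none of the anchored patterns, because kenny is neither at its start nor at its end.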

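Character sets: put the characters to match into groups and ranges

Regex      Description
[abc]      Matches any one of the characters a, b, or c
[a-z]      Matches any one character in the range a to z
[^abc]     Negated set: matches any one character other than a, b, or c
[^a-z]     Negated set: matches any one character outside the range a to z
\<word\>   Matches word only as a whole word (\< marks the start of a word, \> the end)
.          (a single dot) Matches any one character except a newline
\          A backslash removes the special meaning of the character that follows it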

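[root@localhost ynie]# grep '[Kk]enny' file
Finds every line of file that contains Kenny or kenny.

[root@localhost ynie]# grep '[0-9][0-9]' file
Finds every line of file that contains two consecutive digits.

[root@localhost ynie]# grep '^[^0-9]' file
Finds every line of file whose first character is not a digit.

[root@localhost ynie]# grep '[^A]$' file
Finds every line of file whose last character is something other than A.

[root@localhost ynie]# grep '\<[Kk]enny\>' file
Finds every line of file that contains Kenny or kenny as a whole word.

[root@localhost ynie]# grep 'a.b' file
Finds every line of file in which exactly one character sits between a and b. Any character matches, including a space, but ab on its own does not.

[root@localhost ynie]# grep '\.' file
Finds every line of file that contains a period. The backslash removes the special meaning of the dot.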
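The character-set examples can be tried on a scratch file as in the sketch below. The sample lines and variable names are invented for illustration, and the \< \> word anchors are assumed to be supported by the installed grep (they are in GNU grep).

```shell
#!/bin/sh
# Hypothetical sample data; one entry per line.
demo=$(mktemp)
printf '%s\n' 'Kenny 42' 'kenny' 'kennyboy' '7up' 'a b' 'ab' 'a.b' > "$demo"

either_case=$(grep -c '[Kk]enny' "$demo")      # Kenny 42, kenny, kennyboy
whole_word=$(grep -c '\<[Kk]enny\>' "$demo")   # kennyboy is excluded
two_digits=$(grep '[0-9][0-9]' "$demo")        # only the line containing 42
non_digit_start=$(grep -c '^[^0-9]' "$demo")   # every line except 7up
any_between=$(grep 'a.b' "$demo")              # "a b" and "a.b", but not "ab"
literal_dot=$(grep '\.' "$demo")               # only "a.b"

echo "either_case=$either_case whole_word=$whole_word non_digit_start=$non_digit_start"
echo "two_digits=$two_digits"
echo "any_between=$any_between"
echo "literal_dot=$literal_dot"
rm -f "$demo"
```

The any_between result is the one worth looking at: the line "a b" matches because the dot accepts a space like any other character.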

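Modifiers (quantifiers)

Regex     Description
*         The preceding character occurs zero or more times
?         The preceding character occurs zero or one time; written this way it works only with grep -E
+         The preceding character occurs one or more times; written this way it works only with grep -E
\{n,m\}   How many times the preceding item repeats: \{n\} means exactly n times, \{n,\} at least n times, and \{n,m\} from n to m times
|         Alternation: matches either the regex before it or the regex after it; written this way it works only with grep -E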

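[root@localhost ynie]# grep 'abc*' file
Finds ab, abc, abcc, abccc, and so on in file; c occurs zero or more times.

[root@localhost ynie]# grep -E 'abc?' file
Finds ab or abc in file; the question mark means the preceding character occurs zero or one time.

[root@localhost ynie]# grep -E 'abc+' file
Finds abc, abcc, abcccc, and so on in file; + means the preceding character occurs one or more times.

[root@localhost ynie]# grep -E '[0-9]+' file
Finds every line of file that contains at least one digit.

[root@localhost ynie]# grep '^a\{1,3\}$' file
Finds every line of file that consists only of a, aa, or aaa (a repeated one to three times).

[root@localhost ynie]# grep '[0-9]\{1,3\}' file
Finds every line of file that contains a run of one to three digits. Because the pattern is not anchored, any line with at least one digit matches.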
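To see how the quantifiers differ, it helps to force whole-line matches, since an unanchored abc? also "finds" abccc. The sketch below does that with grep -x (match the entire line), an option the post does not use; the sample data, the helper name match, and the variable names are likewise assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; one candidate string per line.
demo=$(mktemp)
printf '%s\n' 'ab' 'abc' 'abccc' 'a' 'aaa' 'aaaa' > "$demo"

# match GREP_ARGS...: run grep on the sample and join the hits with spaces.
match() {
    grep "$@" "$demo" | paste -s -d ' ' -
}

star=$(match -x 'abc*')              # c zero or more times
opt=$(match -xE 'abc?')              # c zero or one time
plus=$(match -xE 'abc+')             # c one or more times
interval=$(match '^a\{1,3\}$')       # a repeated one to three times
alt=$(match -E '^(ab|aaaa)$')        # either alternative, whole line

echo "star:     $star"
echo "opt:      $opt"
echo "plus:     $plus"
echo "interval: $interval"
echo "alt:      $alt"
rm -f "$demo"
```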

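Commonly used regexes

Regex                              Matches
[A-Za-z]                           Any single letter
[^A-Za-z0-9]                       Any single character that is neither a letter nor a digit
[A-Z][a-z]*                        One uppercase letter followed by zero or more lowercase letters
[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}   A US Social Security number such as 123-45-4567 (three digits, a hyphen, two digits, a hyphen, four digits)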
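A minimal sketch of these patterns in use, feeding a few invented strings to grep on standard input. The variable names and the sample strings are assumptions for illustration only.

```shell
#!/bin/sh
# The Social Security number pattern from the list, in basic regex syntax.
ssn='[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}'

# Only the first line contains three digits, a hyphen, two digits, a hyphen, four digits.
hits=$(printf '%s\n' 'id 123-45-4567 ok' '12-345-6789' 'no digits here' | grep -n "$ssn")

# Anchored, so the whole line must be a capital followed by lowercase letters.
capital=$(printf '%s\n' 'Kenny' 'kenny' 'K' | grep -c '^[A-Z][a-z]*$')

# Lines that contain at least one character that is not a letter or a digit.
symbols=$(printf '%s\n' 'abc123' 'a-b' 'x y' | grep -c '[^A-Za-z0-9]')

echo "hits=$hits capital=$capital symbols=$symbols"
```

Without the ^ and $ anchors, [A-Z][a-z]* would match any line that merely contains an uppercase letter, so the anchors matter when a whole field is being validated.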

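Comparison with wildcards

Wildcard        Description
*               Matches anything, that is, zero or more characters. For example, a* matches a, aa, abc, and abcd: an a followed by any characters or by nothing at all.
?               Matches exactly one character. For example, a? matches aa, ab, ac, and so on; an a followed by more than one character, or by none, does not match.
[characters]    Matches any one of the characters listed in the brackets. With a[bc], ab and ac match but ad does not.
[!characters]   A leading ! (exclamation mark) negates the set: any one character not listed in the brackets matches. a[!bc] matches a followed by any one character other than b or c, so ad matches a[!bc].
[a-z]           A range of characters; the - itself is not included. For example, a[0-9] matches a0, a1, a2, and so on up to a9. To allow any English letter after a, write a[a-zA-Z].
[!a-z]          Negated range: matches any one character outside the range in the brackets.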
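The difference between the two notations shows up when the same pattern text is read once as a wildcard and once as a regex. The sketch below uses two hypothetical helper functions, glob_match and regex_match, which are not part of any tool; the first relies on the shell's case statement, the second on grep -x.

```shell
#!/bin/sh
# glob_match STRING PATTERN: shell wildcard match via case.
# The pattern is left unquoted on purpose so that it acts as a wildcard.
glob_match() {
    case $1 in
        $2) echo yes ;;
        *)  echo no ;;
    esac
}

# regex_match STRING REGEX: whole-line basic regex match via grep -x.
regex_match() {
    if printf '%s\n' "$1" | grep -qx -- "$2"; then echo yes; else echo no; fi
}

g1=$(glob_match abcd 'a*')       # yes: a followed by anything
r1=$(regex_match abcd 'a*')      # no: as a regex, a* means zero or more a's
r2=$(regex_match aaa 'a*')       # yes
r3=$(regex_match abcd 'a.*')     # yes: .* is the regex spelling of the wildcard *
g2=$(glob_match a 'a?')          # no: ? needs exactly one more character
g3=$(glob_match ad 'a[!bc]')     # yes
g4=$(glob_match ab 'a[!bc]')     # no
r4=$(regex_match ad 'a[^bc]')    # yes: regex negates with ^ instead of !

echo "$g1 $r1 $r2 $r3 $g2 $g3 $g4 $r4"
```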

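grep

Syntax:

grep [option] regex [file]

Description:

Prints every line of the file that matches regex. When several files are given at once, grep prefixes each matching line with its file name so you can tell them apart (-h turns the prefix off).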

Common options

-c, --count

Suppress normal output; instead print a count of matching lines for each input file. With the -v, --invert-match option (see below), count non-matching lines.

Only counts the matching lines; their contents are not shown.

-h, --no-filename

Suppress the prefixing of filenames on output when multiple files are searched.

Shows every matching line, but does not prefix the file name when several files are given.

-i, --ignore-case

Ignore case distinctions in both the PATTERN and the input files.

Ignores case; for example, kenny matches both kenny and KENNY.

-n, --line-number

Prefix each line of output with the line number within its input file.

Prefixes each matching line with its line number; when several files are given, both the file name and the line number are prefixed.

-v, --invert-match

Invert the sense of matching, to select non-matching lines.

Shows every line that does not match the regular expression.

-E, --extended-regexp

Interpret PATTERN as an extended regular expression (see below).

As regular expression syntax grew, grep was extended into a new command, egrep (Extended grep); grep -E is the same as egrep.
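The options above can be compared side by side on two small files, as in the following sketch. The directory, the file names a.txt and b.txt, their contents, and the variable names are all invented for illustration.

```shell
#!/bin/sh
# Two hypothetical input files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'kenny' 'KENNY' 'bob' > "$dir/a.txt"
printf '%s\n' 'Kenny rocks' 'alice' > "$dir/b.txt"

# -c with -i: one count per file, case ignored.
counts=$(cd "$dir" && grep -ic 'kenny' a.txt b.txt)
# -h: matching lines without the file name prefix (case-sensitive here).
plain=$(cd "$dir" && grep -h 'kenny' a.txt b.txt)
# -n with several files: file name, then line number, then the line.
numbered=$(cd "$dir" && grep -in 'kenny' a.txt b.txt)
# -v: lines that do NOT match.
inverted=$(grep -vi 'kenny' "$dir/a.txt")
# -E: extended syntax, here alternation with |.
either=$(cd "$dir" && grep -hE 'bob|alice' a.txt b.txt)

printf '%s\n---\n%s\n---\n%s\n---\n%s\n---\n%s\n' \
    "$counts" "$plain" "$numbered" "$inverted" "$either"
rm -rf "$dir"
```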

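sed

Syntax: sed [address]command file

Address: decides which lines are processed

1. Line numbers, for example n,m means lines n through m

2. /regex/

Editing command: the command follows the address (if there is one). Apart from any arguments it takes, an editing command is usually a single letter.

d → deletes the whole line

s → substitutes text; its syntax is:

s/pattern/replacement/flags

Flags for s: g → replaces every match on a line, not only the first one

n → replaces the nth match of pattern on a line; the default is 1

y → transliterates characters, much like the tr command (both character lists must have the same length)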

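[root@localhost ynie]# sed '/^A/d' file
Deletes every line of file that begins with A (the address is the regex /^A/; the editing command is d).

[root@localhost ynie]# sed 'y/abc/xyz/' file
Replaces every a with x, every b with y, and every c with z in file (no address, only an editing command).

[root@localhost ynie]# sed '1,3d' file
Deletes lines 1 through 3 of file.

[root@localhost ynie]# sed 's/^$/ok/g' file
Replaces every empty line of file with ok.

[root@localhost ynie]# sed 's/"//g' file
Removes every double quote from file.

sed -f scriptfile
Reads the editing commands from a script file.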
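The sed commands above can be exercised on a scratch file as sketched below. The file contents, the two-line script, and the variable names are assumptions made for illustration; sed writes its result to standard output and leaves the input file unchanged.

```shell
#!/bin/sh
# Hypothetical four-line input; the third line is empty.
f=$(mktemp)
script=$(mktemp)
printf '%s\n' 'Apple' 'banana cab' '' 'say "hi"' > "$f"

rest=$(sed '1,3d' "$f")                                      # line-number address
filled=$(sed 's/^$/ok/' "$f")                                # empty lines become ok
translit=$(printf '%s\n' 'banana cab' | sed 'y/abc/xyz/')    # a->x, b->y, c->z
unquoted=$(printf '%s\n' 'say "hi"' | sed 's/"//g')          # strip double quotes

# Flags of s: no flag (first match), a number (nth match), g (every match).
first=$(printf '%s\n' 'a-a-a' | sed 's/a/X/')
second=$(printf '%s\n' 'a-a-a' | sed 's/a/X/2')
every=$(printf '%s\n' 'a-a-a' | sed 's/a/X/g')

# sed -f: the same kind of commands, one per line, kept in a script file.
printf '%s\n' '/^A/d' 's/"//g' > "$script"
scripted=$(sed -f "$script" "$f")

echo "$translit | $unquoted | $first $second $every"
rm -f "$f" "$script"
```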

How could you get the following information: Which GID has the default group of user foo?
Choose the best answer.

a. defgrp foo

b. defgrp -n foo

c. grep foo /etc/passwd | cut -d: -f4

d. getuserinfo -gid foo

e. grep foo /etc/group | cut -d: -f3

Ans:c

How could you display any line of text from the file foo which starts with an uppercase letter?
Choose the best answer.

a. grep [A-Z] foo

b. grep "[A-Z]" foo

c. grep "$[A-Z]" foo

d. grep "^[A-Z]" foo

e. grep "+[A-Z]" foo

Ans:d

The user bertha has marked an important line of one of her text files with an asterisk (*). But now she has forgotten the name of the file. How could you find this file, assuming it is located in bertha's home directory? Choose the best answer.

a. grep * /home/bertha/*

b. grep \* /home/bertha/*

c. grep "/*" /home/bertha/*

d. grep --key=asterix /home/bertha/*

e. grep 0x2A /home/bertha/*

Ans:b

Which of the following commands could be used to search for a particular term inside a text file without opening the file?
Choose every correct answer.

a. grep

b. vi

c. ex

d. less

e. sed

Ans:a,e

How could you display all lines of text from the file foo which are not empty?
Choose the best answer.

a. grep ".*" foo

b. grep -v ^$ foo

c. grep -v ^\r\n foo

d. grep -v \r\n foo

e. grep -v "[]" foo

Ans:b

What command would you use to copy all files in the current directory whose names begin with a number to /tmp?
Choose the best answer.

a. cp [:num:]* /tmp

b. cp [0-9]* /tmp

c. cp [0-9] /tmp

d. cp [0-9*] /tmp

e. cp [0-9].* /tmp

Ans:b

You have written a really long letter, and after you are done you notice that you used the name "Jake" many times but forgot to capitalize it in many instances. Which command would replace "Peter" with "Jake" in all instances and generate a new file for printing?

a. sed 's/Jake/Peter/g' letter > newletter

b. sed '/Peter/Jake' letter > newletter

c. sed s/Peter/Jake/g letter < newletter

d. sed 's/Peter/Jake/g' letter > newletter

Ans:d

What statement concerning the following wildcard is correct?
[A-Z]\*
Choose the best answer.

a. All files beginning with an uppercase letter followed by one *

b. All files beginning with a non-numeric letter

c. All files beginning with an uppercase letter followed by the backslash

d. All files without numbers in their names

e. All files beginning with one of the letters A, Z or -

Ans:a

Which one of the following file globs matches "Book" and "book", but not "book.net" or "bookmark"?

a. [B/book]

b. [Bb]ook

c. \B\book

d. ?ews

e. [Bb]ook?

Ans:b

The regular expression to find all lines beginning with book is

a. /book*/

b. /book$/

c. /^book/

d. /book/

Ans:c

The regular expression to find 60*30 is

a. 60*30$

b. 60\*30

c. ^60*30

d. 60*30

Ans:b


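Reply #1: 阿保
Posted: 2007-05-08

Regular expressions

Rules and operators
"*" matches zero or more occurrences

"|" alternation: matches either the expression on its left or the one on its right

"+" matches one or more occurrences

"?" matches zero or one occurrence

"( )" groups expressions and sets the order of evaluation

"[ ]" defines the set of characters that may appear

"{ }" sets the repetition count

"/ /" delimits a PCRE regular expression

"^ $" anchors for the start and the end

"." wildcard; stands for any character

"\" escape character; it must precede the special characters ^.$()|*+?{ to match them literally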

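Regular expression examples
{2,4}, {3}, {3,}
2 to 4 repetitions, exactly 3 repetitions, and 3 or more repetitions of the preceding item, respectively

[a-z]
A lowercase English letter

[A-Z]
An uppercase English letter

[^A-Z]
Anything other than an uppercase English letter

[A-Za-z0-9_]
An English letter of either case, a digit, or the underscore

[A-Za-z]
An English letter of either case

[0-9]
A digit

[^0-9]
Anything other than a digit

[0-9A-Za-z]
An English letter of either case or a digit

[^A-Za-z0-9]
Anything other than an English letter or a digit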
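The grouping, repetition, and character-class notation in this reply can be tried with grep -E, as in the sketch below. This assumes POSIX extended regular expressions rather than PCRE, which share this part of the syntax; the sample strings and variable names are invented for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; one candidate string per line.
demo=$(mktemp)
printf '%s\n' 'abab' 'ab' 'user_01' 'x!' '2007' > "$demo"

pairs=$(grep -Ex '(ab){2,4}' "$demo")               # the group ab repeated 2 to 4 times
long_ids=$(grep -Ecx '[A-Za-z0-9_]{3,}' "$demo")    # 3 or more letters, digits, underscores
digits_only=$(grep -Ex '[0-9]+' "$demo")            # one or more digits, nothing else
has_symbol=$(grep -Ec '[^A-Za-z0-9]' "$demo")       # contains a non-alphanumeric character
either=$(grep -E '^(ab|x!)$' "$demo")               # alternation inside a group

echo "pairs=$pairs long_ids=$long_ids digits_only=$digits_only has_symbol=$has_symbol"
echo "$either"
rm -f "$demo"
```

The has_symbol count includes user_01 because the underscore is not part of [A-Za-z0-9], which is the practical difference between that class and [A-Za-z0-9_].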

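PCRE regular expressions
\d
A digit; same as [0-9]

\D
Anything other than a digit; same as [^0-9]

\w
An English letter of either case, a digit, or the underscore; same as [A-Za-z0-9_]

\W
Anything other than an English letter, a digit, or the underscore; same as [^A-Za-z0-9_]

\b
A word boundary, that is, the position between a word and a non-word character such as a space.
For example, ya\b matches the ya in "nahoya" but not the ya in "nahoyabe"

\B
A non-word boundary.
For example, ya\B matches the ya in "nahoyabe" but not the ya in "nahoya"

\s
A whitespace character such as a space or a tab; same as [ \f\n\r\t\v]

\S
Anything other than a whitespace character; same as [^ \f\n\r\t\v]

\n
A newline character

\t
A tab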

\/
A literal slash (/)

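Reply #2: ynie
Posted: 2008-03-02

I see that the content of my old web page has been pasted here word for word.
Thank you for visiting ^^

Reply #3: 小傑
Posted: 2008-03-02

Would it be all right to post a link to your page here? @@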
