Simple Regex

Find

  1. Finditer
    import re
    reg = r'\s(\w*?)='               # word between a whitespace character and =
    line = '<lalala="x" lala="y" la="z">'
    for match in re.finditer(reg, line, re.S):
        print(match.group(1), end="")
  2. Findall
    reg = '"(.*?)"'                  # text between " and "
    line = '<lalala="x" lala="y" la="z">'
    ret = re.findall(reg, line)      # list of the captured strings
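
A minimal sketch of how the two calls differ, reusing the same input line: re.finditer yields match objects (so positions are available too), while re.findall returns only the captured strings. The variable names are just for illustration. Note that the first attribute, lalala, is not picked up by the first pattern because it follows < rather than a space.

```python
import re

line = '<lalala="x" lala="y" la="z">'

# finditer yields match objects, so each hit also carries its position.
names = [(m.group(1), m.start(1)) for m in re.finditer(r'\s(\w*?)=', line)]

# findall returns only the captured strings (one group -> list of strings).
values = re.findall(r'"(.*?)"', line)

print(names)   # [('lala', 12), ('la', 21)]
print(values)  # ['x', 'y', 'z']
```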

 

Notes:
. wildcard: any single character (except a newline, unless re.S is set)
* preceding item repeated 0 or more times
? preceding item occurs 0 or 1 time
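
The three symbols above can be checked with a few small patterns. This is only a sketch: the helper name matches and the sample strings are made up for illustration.

```python
import re

def matches(pattern, text, flags=0):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text, flags) is not None

# '.' matches any single character, but not a newline unless re.S is given.
print(matches(r'a.c', 'abc'))          # True
print(matches(r'a.c', 'a\nc'))         # False
print(matches(r'a.c', 'a\nc', re.S))   # True

# '*' allows zero or more repetitions of the preceding item.
print(matches(r'ab*', 'a'))            # True
print(matches(r'ab*', 'abbb'))         # True

# '?' allows zero or one occurrence of the preceding item.
print(matches(r'colou?r', 'color'))    # True
print(matches(r'colou?r', 'colouur'))  # False
```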

*? is used to make the match lazy (match the shortest one), because there are many " characters in the input string line. If the ? is left out, .* is greedy and matches the longest possible answer (from the first " to the last ").
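
To see the effect described above, here is a small sketch that runs the lazy pattern and its greedy counterpart on the same line (the variable names lazy and greedy are only for illustration):

```python
import re

line = '<lalala="x" lala="y" la="z">'

lazy = re.findall(r'"(.*?)"', line)    # stops at the nearest closing quote
greedy = re.findall(r'"(.*)"', line)   # runs on to the last quote in the line

print(lazy)    # ['x', 'y', 'z']
print(greedy)  # ['x" lala="y" la="z']
```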

Quantifiers are greedy by default
e.g. for the string "123", the regex \d+ will match 123, instead of just 1
(additional note: + means 1 or more occurrences)

lazy = as few as possible = shortest match = reluctant
A quantifier is made lazy by adding ? after it
string: 123EEE
reg = \w*?E
match = 123E (the greedy \w*E would match 123EEE)
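
Both examples can be tried side by side. This is a rough sketch using re.search, which returns the first match; the variable names are illustrative only.

```python
import re

# Greedy by default: \d+ takes as many digits as it can.
greedy_digits = re.search(r'\d+', '123').group()
# Adding ? makes it lazy: one digit is enough to satisfy +.
lazy_digits = re.search(r'\d+?', '123').group()

# Lazy \w*? grows one character at a time until an E can follow.
lazy_e = re.search(r'\w*?E', '123EEE').group()
# Greedy \w* grabs everything, then backs off just enough to leave a final E.
greedy_e = re.search(r'\w*E', '123EEE').group()

print(greedy_digits, lazy_digits)  # 123 1
print(lazy_e, greedy_e)            # 123E 123EEE
```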

More on *? and the other quantifiers:

http://www.rexegg.com/regex-quantifiers.html

 
