
Python Re Vs Html5 Re

It seems that I have found a bug in Python: (Python 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)] on win32) >>> re.match('0[5-7][5-9][0-9]{7}', '0775123456')

Solution 1:

This is by design. re.match matches only at the beginning of a string, as opposed to re.search, which matches anywhere in a string. Any characters that follow the matched portion are ignored. See http://docs.python.org/library/re.html#match for more details.
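A minimal sketch of the difference, assuming Python 3 and using the pattern from the question. The sample strings are made up for illustration:

```python
import re

PATTERN = r'0[5-7][5-9][0-9]{7}'

# re.match only tries the pattern at position 0 and ignores whatever
# follows the matched portion.
at_start = re.match(PATTERN, '0775123456 extra text')

# re.match fails when the matching text is not at the very beginning.
not_at_start = re.match(PATTERN, 'call 0775123456')

# re.search scans the string for the first position where the pattern matches.
found_anywhere = re.search(PATTERN, 'call 0775123456')

print(at_start.group())
print(not_at_start)
print(found_anywhere.group())
```

Neither function requires the pattern to consume the entire string.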

Other tools that use regular expressions, such as grep and Perl, behave the same way: a pattern does not have to consume the whole input unless it is anchored. Regular expressions are primarily used for searching text.

If you want an exact match, you have to end the pattern with a dollar sign ($), as you noted yourself.
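As a sketch of what exact matching can look like, the following assumes Python 3.4 or later, because re.fullmatch does not exist in the 2.7 interpreter from the question. The input strings are hypothetical:

```python
import re

PATTERN = r'0[5-7][5-9][0-9]{7}'

too_long = '07751234567890'  # hypothetical input with extra trailing digits

# Without an anchor, the first 10 characters match and the rest is ignored.
unanchored = re.match(PATTERN, too_long)

# With a trailing $, the match has to reach the end of the string.
anchored = re.match(PATTERN + '$', too_long)

# re.fullmatch requires the whole string to match, so no anchor is needed.
full = re.fullmatch(PATTERN, too_long)
exact = re.fullmatch(PATTERN, '0775123456')

# $ also matches just before a single trailing newline; \Z does not.
with_newline = re.match(PATTERN + '$', '0775123456\n')
strict = re.match(PATTERN + r'\Z', '0775123456\n')

print(unanchored, anchored, full, exact, with_newline, strict)
```

If a trailing newline must be rejected as well, \Z or re.fullmatch is the stricter choice.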

Solution 2:

As others have said, this is not a bug. Your regex doesn't make it clear why '$' seems to fix it, but this example should:

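import re

# 'fo{2}d' requires exactly two o's followed directly by a d.
print('food: %s' % re.match('fo{2}d', 'food'))      # match found!

# Four o's: after two o's the next character is 'o', not 'd'.
print('fooood: %s' % re.match('fo{2}d', 'fooood'))  # no match!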

Anything after the {} will give you the behavior you want. If you want the string to end after the digits, then $ is the appropriate thing to add there.

Solution 3:

The documentation for re.match() states:

If zero or more characters at the beginning of string match the regular expression pattern…

(Emphasis mine.)

This means that all the characters in the string after the match completes are ignored. For instance, the following would also work:

>>> re.match("0[5-7][5-9][0-9]{7}", "0775123456abc")
<_sre.SRE_Match at 0x10ee2e8b8>

Regular expressions themselves don't specify where, if anywhere, a match has to be anchored. For instance, Python also has re.search(), which by default isn't anchored at the beginning of the string at all. You can always specify anchoring explicitly using the ^ and $ metacharacters.
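To illustrate explicit anchoring, here is a small sketch assuming Python 3, with made-up test strings. It shows re.search first without anchors, then with ^, then with both ^ and $:

```python
import re

PATTERN = r'0[5-7][5-9][0-9]{7}'
text = 'id 0775123456abc'  # hypothetical input with text on both sides

# No anchors: the number is found in the middle of the string.
unanchored = re.search(PATTERN, text)

# ^ pins the match to the start, so the leading 'id ' makes it fail.
start_anchored = re.search('^' + PATTERN, text)

# ^ and $ together pin both ends: only the bare number is accepted.
both_anchored = re.search('^' + PATTERN + '$', '0775123456')
both_anchored_extra = re.search('^' + PATTERN + '$', '0775123456abc')

print(unanchored is not None)
print(start_anchored is not None)
print(both_anchored is not None)
print(both_anchored_extra is not None)
```

Without the re.MULTILINE flag, ^ matches only at the start of the string, so re.search with a leading ^ behaves like re.match here.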
