Python matching regex multiple times in a row (not the findall way)
This question is not asking about finding 'a' multiple times in a string, etc. What I want to do is match the regexp

    [ a-zA-Z0-9]{1,3}\.

multiple times in a row. One way of doing this is with |:

    '[ a-zA-Z0-9]{1,3}\.[ a-zA-Z0-9]{1,3}\.[ a-zA-Z0-9]{1,3}\.[ a-zA-Z0-9]{1,3}\.|[ a-zA-Z0-9]{1,3}\.[ a-zA-Z0-9]{1,3}\.[ a-zA-Z0-9]{1,3}\.|[ a-zA-Z0-9]{1,3}\.[ a-zA-Z0-9]{1,3}\.'

so that it matches the regexp 4, 3, or 2 times. It matches stuff like:

    a. v. b.
    m a.b.

Is there a way to make this more coding-like?
I tried doing

    ([ a-zA-Z0-9]{1,3}\.){2,4}

but the functionality is not the same as I expected. This one matches:

    regex.findall(string)
    [u' b.', u'b.']

The string is:

    a. v. b. split them a.b. split somethinf words. more words, ten

Is there a way to do this? The goal is to match possible English abbreviations and names such as Mary J. E., things that a sentence tokenizer recognizes as sentence punctuation when they are not.
I want to match all of this:

    u.s. , c.v.a.b. , a. v. p.
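
The behaviour described in the question can be reproduced with a short sketch. When a pattern contains a capture group, `re.findall` returns the text of the group rather than the whole match, and a repeated group keeps only its last repetition. The names `pattern` and `text` and the shortened sample string are assumptions made for illustration, not part of the original post.

```python
import re

# The compact pattern from the question: 2 to 4 repetitions of the unit.
pattern = re.compile(r'([ a-zA-Z0-9]{1,3}\.){2,4}')

# Hypothetical, shortened sample based on the string in the question.
text = 'a. v. b. split them a.b. split'

m = pattern.search(text)
print(m.group(0))             # the whole match: 'a. v. b.'
print(m.group(1))             # only the last repetition of the group: ' b.'
print(pattern.findall(text))  # findall reports group 1 only: [' b.', 'b.']
```

So the pattern itself matches the full run; it is only the way `findall` reports groups that hides it.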
First of all, your regex works as you expect:

    >>> s = "aa2.jhf.jev.d23.llo."
    >>> import re
    >>> re.search(r'([ a-zA-Z0-9]{1,3}\.){2,4}', s).group(0)
    'aa2.jhf.jev.d23.'

But if you want to match substrings like u.s. , c.v.a.b. , a. v. p., you need to put the whole regex in a capture group:
    >>> s = 'a. v. b. split them a.b. split somethinf words. say'
    >>> re.findall(r'(([ a-zA-Z0-9]{1,3}\.){2,4})', s)
    [('a. v. b.', ' b.'), ('m a.b.', 'b.')]

Then use a list comprehension to get the first element of each match:

    >>> [i[0] for i in re.findall(r'(([ a-zA-Z0-9]{1,3}\.){2,4})', s)]
    ['a. v. b.', 'm a.b.']
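
As an alternative to the extra capture group and list comprehension, a minimal sketch using a non-capturing group `(?:...)` is shown below. This is a swapped-in technique, not the method from the answer above: with no capture groups in the pattern, `re.findall` returns the whole match directly. The names `ABBREV`, `find_dotted_runs`, and `find_dotted_runs_with_spans` are hypothetical, and the sample string is assumed to be the one used in the answer.

```python
import re

# Same pattern as above, but the repeated unit is a non-capturing group,
# so findall has no groups to report and returns each full match.
ABBREV = re.compile(r'(?:[ a-zA-Z0-9]{1,3}\.){2,4}')


def find_dotted_runs(text):
    """Return every run of 2 to 4 consecutive dotted tokens in text."""
    return ABBREV.findall(text)


def find_dotted_runs_with_spans(text):
    """Like find_dotted_runs, but also return (start, end) offsets."""
    return [(m.group(0), m.span()) for m in ABBREV.finditer(text)]


s = 'a. v. b. split them a.b. split somethinf words. say'
print(find_dotted_runs(s))  # ['a. v. b.', 'm a.b.']
for matched, (start, end) in find_dotted_runs_with_spans(s):
    print(matched, start, end)
```

The second match still begins with "m " from "them", because the character class allows spaces and up to three characters per token. That is a property of the original pattern, which this sketch leaves unchanged.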