regex - Why does the * in a regular expression indicate that the previous character may or may not be present?
Take this example I found in a blog: "How about searching for the word apple when it is spelled wrong in a given file? Apple may be misspelled as ale, aple, appple, apppple, apppppple, etc. To find all these patterns:
grep 'ap*le' filename
Readers should observe that the above pattern will also match the word ale, as * indicates 0 or more occurrences of the previous character."
Now it's saying that "ale" is accepted when we have ap*le. Aren't "ap" and "le" fixed?
The * quantifier means 0 or more occurrences of the previous pattern, in this case the single literal p. You can state the same * quantifier as:
ap{0,}le
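As a quick check of that claim, here is a minimal sketch in Python. The `re` module and the word list are my choices for illustration, not something the blog uses. It relies on `re.fullmatch`, so each word has to match the pattern completely, whereas grep reports any line that contains a match.

```python
import re

words = ["ale", "aple", "apple", "appple", "apppple", "able"]

# 'p*' means zero or more 'p'; only the 'p' is quantified, 'a' and 'le' stay fixed.
star = re.compile(r"ap*le")
# '{0,}' is the explicit spelling of the same quantifier.
braces = re.compile(r"ap{0,}le")

star_hits = [w for w in words if star.fullmatch(w)]
brace_hits = [w for w in words if braces.fullmatch(w)]

print(star_hits)
print(brace_hits)
```

Both lists should contain every word except able, because "ale" is simply the case of zero 'p's.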
The interesting question is 'what is the previous pattern?' It is helpful to put the pattern in a group to aid understanding of what the 'previous pattern' is.
Consider wanting to find all of:
ale, aple, appple, apppple, apppppple, able, abbbbbbble
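Your first try might be:
/ap|b*le/
This regex is wrong. The | was meant to choose between a literal 'p' and a literal 'b', but it splits the whole pattern instead, so the first alternative is 'ap' and the second is 'b*le'.
What you want in this case is: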
/a(?:p|b)*le/
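To see the difference between the two patterns, a small Python sketch follows. The word list, including the extra strings "ap" and "ble", is an assumption added for illustration, and `re.fullmatch` is used so that only whole-word matches count.

```python
import re

words = ["ale", "aple", "appple", "able", "abbbbbbble", "ap", "ble"]

# Without a group, '|' splits the whole pattern: either 'ap' or 'b*le'.
wrong = re.compile(r"ap|b*le")
# The non-capturing group limits the choice to 'p' or 'b', and '*' repeats the group.
right = re.compile(r"a(?:p|b)*le")

wrong_hits = [w for w in words if wrong.fullmatch(w)]
right_hits = [w for w in words if right.fullmatch(w)]

print(wrong_hits)
print(right_hits)
```

The ungrouped pattern should accept only "ap" and "ble", while the grouped one accepts the five intended words. Note that the grouped pattern also accepts mixtures such as "apbple", since each repetition picks 'p' or 'b' independently.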
If you do not want to match ale, but only aple, appple, apppple, apppppple, use + instead of *, which means 1 or more:
/ap+le/
and the equivalent /ap{1,}le/
And if you want to match aple and appple, but leave out the variants with more than 3 'p's, use an additional max quantifier:
/ap{1,3}le/
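The effect of the lower and upper bounds can be checked with a short Python sketch. The word list is assumed for illustration, and `re.fullmatch` requires the whole word to match.

```python
import re

words = ["ale", "aple", "apple", "appple", "apppple", "apppppple"]

one_or_more = re.compile(r"ap+le")        # at least one 'p'
one_to_three = re.compile(r"ap{1,3}le")   # between one and three 'p's

plus_hits = [w for w in words if one_or_more.fullmatch(w)]
bounded_hits = [w for w in words if one_to_three.fullmatch(w)]

print(plus_hits)
print(bounded_hits)
```

With +, only ale should drop out. With {1,3}, the words with four or more 'p's drop out as well.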
All the variants above also match apple, correctly spelled. If you want aple and appple, but not apple, to match, use alternation:
/a(?:p|p{3})le/
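A minimal Python sketch of this last pattern, again with an assumed word list and `re.fullmatch` for whole-word matching:

```python
import re

words = ["ale", "aple", "apple", "appple", "apppple"]

# Exactly one 'p' or exactly three 'p's between 'a' and 'le'.
one_or_three = re.compile(r"a(?:p|p{3})le")

hits = [w for w in words if one_or_three.fullmatch(w)]
print(hits)
```

Only aple and appple should be reported. The correctly spelled apple, with two 'p's, fits neither alternative.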