regex - Why does * in a regular expression mean its previous character may or may not be present?


Take this example I found in a blog: "How about searching for the word apple when it was spelled wrong in a given file, where apple is misspelled as ale, aple, appple, apppple, apppppple etc.? To find all these patterns:

grep 'ap*le' filename 

Readers should observe that the above pattern will also match the word ale, as * indicates 0 or more occurrences of the previous character."

Now it's saying that "ale" is accepted by ap*le. Aren't "ap" and "le" fixed?

The * quantifier means 0 or more times for the previous pattern, which in this case is the single literal p. You can state the same thing as * with an explicit range quantifier:

 ap{0,}le 
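As a rough check of that equivalence, here is a minimal Python sketch using the standard `re` module. The word list is made up for illustration, and `fullmatch` is used so that each word has to match in its entirety (grep, by contrast, looks for the pattern anywhere in a line).

```python
import re

# Illustrative word list (not from the original post)
words = ["ale", "aple", "apple", "appple", "able", "le"]

star = re.compile(r"ap*le")        # 'p' repeated zero or more times
braces = re.compile(r"ap{0,}le")   # the same quantifier written as a range


def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches from start to end."""
    return [w for w in candidates if pattern.fullmatch(w)]


star_hits = full_matches(star, words)
brace_hits = full_matches(braces, words)

print(star_hits)                # ['ale', 'aple', 'apple', 'appple']
print(star_hits == brace_hits)  # True
```

Only the p is optional: the a and the le are still required, which is why "le" and "able" are rejected.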

The interesting question is "what is the previous pattern?" It can be helpful to put the pattern in a group to aid understanding of what the "previous pattern" is.

Consider wanting to find all of:

 ale, aple, appple, apppple, apppppple, able, abbbbbbble 

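Your first try might be:

 /ap|b*le/      # wrong

This regex reads as: literal 'ap' as the first alternative, or zero or more literal 'b' followed by 'le'. The | splits the whole pattern in two, and the * applies only to the 'b'.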


What you want in this case is:

 /a(?:p|b)*le/ 
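To see the difference between the two patterns, here is a small Python sketch with the standard `re` module. The word list is invented for illustration; "ap" and "ble" are included only to show what the ungrouped pattern actually accepts.

```python
import re

# Illustrative inputs (assumed, not from the original post)
words = ["ap", "ble", "ale", "aple", "able", "abbbbbbble", "axle"]

wrong = re.compile(r"ap|b*le")        # alternatives: 'ap'  OR  'b*le'
right = re.compile(r"a(?:p|b)*le")    # 'a', then any run of 'p' or 'b', then 'le'

wrong_hits = [w for w in words if wrong.fullmatch(w)]
right_hits = [w for w in words if right.fullmatch(w)]

print(wrong_hits)  # ['ap', 'ble']
print(right_hits)  # ['ale', 'aple', 'able', 'abbbbbbble']

# With an unanchored search (what grep does), the ungrouped pattern finds
# the bare 'le' inside almost any word, for example "axle".
wrong_finds_axle = wrong.search("axle") is not None
print(wrong_finds_axle)  # True
```

Note that `(?:p|b)*` repeats the choice each time, so the grouped pattern also accepts mixed runs such as "apbple". If that is not wanted, `a(?:p*|b*)le` keeps each run to a single letter.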


If you do not want to match ale, but do want to match aple, appple, apppple, apppppple, use + instead of *, which means 1 or more:

/ap+le/ 

This is equivalent to /ap{1,}le/
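A quick Python sketch (standard `re` module, illustrative word list) of how + differs from * on these inputs:

```python
import re

# Illustrative inputs (assumed, not from the original post)
words = ["ale", "aple", "apple", "apppppple"]

plus = re.compile(r"ap+le")        # at least one 'p'
braces = re.compile(r"ap{1,}le")   # the same quantifier written as a range

plus_hits = [w for w in words if plus.fullmatch(w)]
brace_hits = [w for w in words if braces.fullmatch(w)]

print(plus_hits)                # ['aple', 'apple', 'apppppple']
print(plus_hits == brace_hits)  # True
```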


And if you want to match aple and appple but leave out variants with more than 3 'p's, use an additional max quantifier:

/ap{1,3}le/ 

All the variants above also match apple, the correct spelling. If you want to match aple and appple but not apple, use alternation:

/a(?:p|p{3})le/ 
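The bounded quantifier and the alternation can be compared side by side. This is a minimal Python sketch with the standard `re` module and an illustrative word list:

```python
import re

# Illustrative inputs (assumed, not from the original post)
words = ["ale", "aple", "apple", "appple", "apppple"]

bounded = re.compile(r"ap{1,3}le")          # one to three 'p's
one_or_three = re.compile(r"a(?:p|p{3})le") # exactly one OR exactly three 'p's

bounded_hits = [w for w in words if bounded.fullmatch(w)]
one_or_three_hits = [w for w in words if one_or_three.fullmatch(w)]

print(bounded_hits)       # ['aple', 'apple', 'appple']
print(one_or_three_hits)  # ['aple', 'appple']
```

The alternation drops the correctly spelled apple because two 'p's fit neither branch.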


