python - regex to match words of a length specified within the string


I am trying to parse the text output of samtools mpileup. I start with a string:

s = '.$......+2ag.+2ag.+2aggg' 

Whenever I have a + followed by an integer n, I want to select the n characters following that integer and replace the whole thing with *. So for this test case I would have:

'.$......+2ag.+2ag.+2aggg' ---> '.$......*.*.*gg'  

I have the regex \+[0-9]+[ACGTNacgtn]+, but it is greedy: it results in the output .$......*.*.* , so the trailing g's are lost as well. How can I select n characters when n is not known ahead of time and is only specified in the string itself?
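To make the failure concrete, here is a minimal sketch of the plain substitution described above. The character class is written as [ACGTNacgtn] on the assumption that both cases were intended, and the variable names pattern and greedy are only for illustration.

```python
import re

s = '.$......+2ag.+2ag.+2aggg'

# The character class matches as many bases as it can, so the two
# extra g's after the final "+2ag" are swallowed by the match as well.
pattern = r'\+[0-9]+[ACGTNacgtn]+'
greedy = re.sub(pattern, '*', s)

print(greedy)  # .$......*.*.*
```

The pattern has no way to stop after exactly n bases, because n is only known once the digits have been read.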

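The repl argument of re.sub can be either a string or a function. If it is a function, it is called with the match object for each match and must return the replacement string.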

So you can do complex things with a replacement function:

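import re

def removechars(m):
    x = m.group()  # the full match, e.g. '+2aggg'
    n = re.match(r'\+(\d+).*', x).group(1)  # digit part, still a string
    # skip the '+', the digits and the n inserted bases; keep whatever is left
    return '*' + x[1 + len(n) + int(n):]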

This solves the problem:

>>> re.sub(r'\+[0-9]+[ACGTNacgtn]+', removechars, s)
'.$......*.*.*gg'
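As a variation on the same idea (a sketch, not the only way to do it), the count can be captured directly in the pattern, so the callback does not need a second re.match. The names replace_insertions and _repl are made up for this example, and the character class [ACGTNacgtn] assumes that both upper- and lower-case bases should be accepted.

```python
import re

def replace_insertions(pileup):
    """Replace each '+<n><n bases>' run with '*', keeping any extra bases."""
    def _repl(m):
        n = int(m.group(1))   # the declared insertion length
        bases = m.group(2)    # every base the greedy class matched
        # Drop only the first n bases; the rest are ordinary read bases.
        return '*' + bases[n:]

    # Group 1 captures the digits, group 2 the run of bases after them.
    return re.sub(r'\+(\d+)([ACGTNacgtn]+)', _repl, pileup)

result = replace_insertions('.$......+2ag.+2ag.+2aggg')
print(result)  # .$......*.*.*gg
```

Because the length is converted with int(), multi-digit counts such as +10 are handled in the same way as single-digit ones.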
