How to split on \t without also splitting on newlines in Python


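I have text files in one folder that look like this (each word and its tag are separated by a tab, and sentences are separated by an empty line):

    my	o
    name	o
    is	o
    alex	b
    .	o

    i	o
    am	o
    from	o
    london	b
    .	o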

This is my code:

    import re

    def read_file(filename):
        # split the file into blocks at the empty lines
        file = open(filename).read().strip().split("\n\n")
        lines = []
        for line in file:
            # splits on tabs AND newlines, so the pairs are flattened
            lines.append(re.split(r'\t|\n', line))
        return lines

    train_sents = read_file("train.txt")
    train_sents[0]

The output is:

    ['my', 'o', 'name', 'o', 'is', 'o', 'alex', 'b', '.', 'o']

My question: is it possible to split on \t without also splitting on the newline, so that each word stays paired with its tag? The desired output looks like this:

    [('my', 'o'), ('name', 'o'), ('is', 'o'), ('alex', 'b'), ('.', 'o')]

Just split each line:

    with open(filename) as f:
        print([tuple(line.split()) for line in f])

    [('my', 'o'), ('name', 'o'), ('is', 'o'), ('alex', 'b'), ('.', 'o')]
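One caveat: line.split() with no arguments splits on any run of whitespace, so a word that itself contains a space would be broken up. If the file is strictly tab-separated, a variant that splits only on the tab might look like the sketch below. The in-memory sample and the helper name split_tab_lines are made up for illustration and are not part of the original answer.

```python
import io

# Hypothetical sample standing in for the file; each line is "word<TAB>tag".
sample = "my\to\nname\to\nis\to\nalex\tb\n.\to\n"

def split_tab_lines(f):
    """Return one tuple per non-empty line, splitting only on tabs."""
    pairs = []
    for line in f:
        line = line.rstrip("\n")  # drop the newline but keep inner spaces
        if line:  # skip blank lines
            pairs.append(tuple(line.split("\t")))
    return pairs

pairs = split_tab_lines(io.StringIO(sample))
print(pairs)
```

With a real file, the io.StringIO object would simply be replaced by the file handle from open().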

To separate the sentences at the empty lines, append each pair to the last sublist, and start a new sublist whenever you meet an empty line:

    with open(infile) as f:
        l = [[]]
        for line in f:
            if line.strip():
                # non-empty line: add the pair to the current sentence
                l[-1].append(tuple(line.split()))
            else:
                # empty line: start a new sentence
                l.append([])
    print(l[0])
    print(l[1])

    [('my', 'o'), ('name', 'o'), ('is', 'o'), ('alex', 'b'), ('.', 'o')]
    [('i', 'o'), ('am', 'o'), ('from', 'o'), ('london', 'b'), ('.', 'o')]
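The loop above opens a new sublist for every empty line, so consecutive or trailing empty lines would leave empty sublists behind. A minimal sketch of the same idea wrapped in a function that avoids this is shown below. The function name read_sentences and the sample text are assumptions for illustration only.

```python
import io

def read_sentences(f):
    """Group (word, tag) tuples into one list per blank-line-separated block."""
    sentences = [[]]
    for line in f:
        if line.strip():
            sentences[-1].append(tuple(line.split()))
        elif sentences[-1]:
            # open a new sublist only if the current one already has content
            sentences.append([])
    if not sentences[-1]:
        # drop the empty sublist left by trailing blank lines or empty input
        sentences.pop()
    return sentences

# Hypothetical sample with a doubled blank line and a trailing blank line.
sample = "my\to\nname\to\n\n\ni\to\nam\to\n\n"
sentences = read_sentences(io.StringIO(sample))
print(sentences)
```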

You can also use itertools.groupby, grouping the lines with the empty lines as the delimiter:

    from itertools import groupby

    with open(infile) as f:
        # k is True for runs of non-empty lines, False for runs of empty ones
        print([list(map(str.split, v))
               for k, v in groupby(f, key=lambda x: x.strip() != "") if k])

    [[['my', 'o'], ['name', 'o'], ['is', 'o'], ['alex', 'b'], ['.', 'o']], [['i', 'o'], ['am', 'o'], ['from', 'o'], ['london', 'b'], ['.', 'o']]]

You can map the inner lists to tuple if you need tuples rather than lists.
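For example, the groupby approach with tuples could be sketched as follows. The helper name group_pairs and the shortened sample are assumptions made for this illustration.

```python
import io
from itertools import groupby

# Hypothetical shortened sample; an empty line separates the two sentences.
sample = "my\to\nalex\tb\n\ni\to\nlondon\tb\n"

def group_pairs(f):
    """Return a list of sentences, each a list of (word, tag) tuples."""
    return [
        [tuple(line.split()) for line in group]
        for has_text, group in groupby(f, key=lambda line: line.strip() != "")
        if has_text  # keep runs of non-empty lines, discard the separators
    ]

grouped = group_pairs(io.StringIO(sample))
print(grouped)
```

Because groupby merges consecutive lines with the same key, several empty lines in a row act as a single separator here.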

