csv - Python Error; UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026'


I am trying to extract data from a JSON file that contains tweets and write it to a CSV file. The file contains all kinds of characters, which I'm guessing is why I get this error message:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026'

I guess I have to convert the output to UTF-8 before writing the CSV file, but I have not been able to do that. I have found similar questions here on Stack Overflow, but I've not been able to adapt the solutions to my problem. (I should add that I'm not very familiar with Python. I'm a social scientist, not a programmer.)

    import csv
    import json

    fieldnames = ['id', 'text']

    with open('my_source_file', 'r') as f, open('my_output', 'a') as out:

        writer = csv.DictWriter(
            out, fieldnames=fieldnames, delimiter=',', quoting=csv.QUOTE_ALL)

        # The source file holds one JSON-encoded tweet per line.
        for line in f:
            tweet = json.loads(line)
            user = tweet['user']
            output = {
                'text': tweet['text'],
                'id': tweet['id'],
            }
            writer.writerow(output)

You need to encode the text as UTF-8:

    for line in f:
        tweet = json.loads(line)
        user = tweet['user']
        output = {
            # Encode the unicode text to a UTF-8 byte string before writing.
            'text': tweet['text'].encode("utf-8"),
            'id': tweet['id'],
        }
        writer.writerow(output)
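To see why the explicit encode helps, here is a minimal sketch that isolates the character from the error message. The variable names are only illustrative and are not part of the question's code. U+2026 (the horizontal ellipsis) has no ASCII representation, so encoding it with the ASCII codec fails, while UTF-8 turns it into three bytes.

```python
# U+2026 HORIZONTAL ELLIPSIS, the character named in the error message.
ellipsis = u"\u2026"

# Encoding with the ASCII codec fails because the character is outside ASCII.
try:
    ellipsis.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding with UTF-8 succeeds and yields a three-byte sequence.
utf8_bytes = ellipsis.encode("utf-8")

print(ascii_ok)
print(repr(utf8_bytes))
```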

The csv module does not support writing Unicode in Python 2:

Note: This version of the csv module doesn’t support Unicode input. Also, there are currently some issues regarding ASCII NUL characters. Accordingly, all input should be UTF-8 or printable ASCII to be safe; see the examples in section Examples.
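If moving to Python 3 is an option, the limitation quoted above no longer applies: the csv module there works with text, and the encoding is chosen when the file is opened. The following is only a sketch under that assumption. The function name `tweets_to_csv` is hypothetical, and it assumes the source file is UTF-8 with one JSON-encoded tweet per line, as in the question.

```python
import csv
import json

FIELDNAMES = ["id", "text"]


def tweets_to_csv(source_path, output_path):
    """Append the id and text of each tweet in source_path to a CSV file.

    Hypothetical helper for Python 3; assumes one JSON object per line.
    """
    # newline="" lets the csv module control line endings itself.
    with open(source_path, "r", encoding="utf-8") as f, \
            open(output_path, "a", encoding="utf-8", newline="") as out:
        writer = csv.DictWriter(
            out, fieldnames=FIELDNAMES, delimiter=",", quoting=csv.QUOTE_ALL)
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            tweet = json.loads(line)
            # No manual .encode() here: the file object encodes on write.
            writer.writerow({"id": tweet["id"], "text": tweet["text"]})
```

It could be called as `tweets_to_csv('my_source_file', 'my_output')`, using the file names from the question.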

