Python 3.4 reading from CSV formats


OK, I have code in Python where I'm importing a CSV file. The problem is that the columns in the CSV file aren't all basic numbers. There is one column of text holding either "int" or "ext", and there is a column in o'clock format, ranging from "0:00" to "11:59". I also have a third column that is a normal number, a distance in "00.00" format.

My question is how to go about plotting distance vs. o'clock, and changing the colors of the dots in the scatterplot based on whether a row is int or ext.

My first problem is how to make the program read the o'clock format and the text formats from the CSV.

Any ideas or suggestions? Thanks in advance.

Here is a sample of the CSV I'm trying to import:

    ml  int  .10  534.15  0:00
    ml  ext  .25  654.23  3:00
    ml  int  .35  743.12  6:30

I want to plot the 4th column on the x axis and the 5th column on the y axis. I also want to color code the scatter plot dots red or blue depending on whether the row is int or ext.

Here is a sample of the code I have so far:

    import matplotlib.pyplot as plt
    from matplotlib import style
    import numpy as np

    style.use('ggplot')

    a, b, c, d = np.loadtxt('numbers.csv',
                            unpack=True,
                            delimiter=',')

    plt.scatter(a, b)

    plt.title('charts')
    plt.ylabel('y axis')
    plt.xlabel('x axis')

    plt.show()

Reading in your example CSV using pandas:

    import pandas as pd
    import matplotlib.pyplot as plt
    import datetime

    # The sample file is tab-separated and has no header row.
    data = pd.read_csv('data.csv', sep='\t', header=None)
    print(data)

This prints:

        0    1     2       3     4
    0  ml  int  0.10  534.15  0:00
    1  ml  ext  0.25  654.23  3:00
    2  ml  int  0.35  743.12  6:30

Then separate the 'int' rows from the 'ext' rows:

    ints = data[data[1] == 'int']
    exts = data[data[1] == 'ext']

Change the times to datetime objects and grab the distances:

    # Parse each 'H:MM' string and keep only the time-of-day part.
    int_times = [datetime.datetime.time(datetime.datetime.strptime(t, '%H:%M')) for t in ints[4]]
    ext_times = [datetime.datetime.time(datetime.datetime.strptime(t, '%H:%M')) for t in exts[4]]
    int_dist = [d for d in ints[3]]
    ext_dist = [d for d in exts[3]]
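If plotting time objects directly gives you trouble, one alternative is to turn each clock string into a plain number first. This is only a minimal sketch using the standard library; `clock_to_hours` is a helper name I made up for illustration, and it assumes every value looks like 'H:MM' as in your sample.

```python
import datetime


def clock_to_hours(text):
    """Convert an 'H:MM' clock string such as '6:30' to decimal hours."""
    # strptime accepts single-digit hours with %H, so '0:00' parses fine.
    parsed = datetime.datetime.strptime(text.strip(), '%H:%M')
    return parsed.hour + parsed.minute / 60.0


# The three time values from the sample data in the question.
sample_times = ['0:00', '3:00', '6:30']
hours = [clock_to_hours(t) for t in sample_times]
print(hours)
```

The resulting floats can be passed to a scatter call like any other numeric column, at the cost of losing the clock-style tick labels unless you set them yourself.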

Then draw a scatter plot for 'int' and 'ext' each:

    fig, ax = plt.subplots()
    ax.scatter(int_dist, int_times, c='orange', s=150)
    ax.scatter(ext_dist, ext_times, c='black', s=150)
    plt.legend(['int', 'ext'], loc=4)
    plt.xlabel('distance')
    plt.show()

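The question asked for red and blue dots specifically. A rough sketch of one way to get there is to build a list with one color per row and hand it to a single scatter call. The `COLOR_BY_KIND` mapping and the in-memory sample below are my own assumptions for illustration, and the parsing uses only the standard `csv` module so it runs without pandas.

```python
import csv
import io

# Sample rows from the question, joined with tabs to mimic the file on disk.
ROWS = [
    ['ml', 'int', '.10', '534.15', '0:00'],
    ['ml', 'ext', '.25', '654.23', '3:00'],
    ['ml', 'int', '.35', '743.12', '6:30'],
]
SAMPLE = '\n'.join('\t'.join(row) for row in ROWS)

# Hypothetical mapping from the text column to a dot color.
COLOR_BY_KIND = {'int': 'red', 'ext': 'blue'}

distances = []
colors = []
for row in csv.reader(io.StringIO(SAMPLE), delimiter='\t'):
    distances.append(float(row[3]))       # 4th column: distance
    colors.append(COLOR_BY_KIND[row[1]])  # 2nd column: 'int' or 'ext'

print(distances)
print(colors)
```

With matplotlib available, the `colors` list could then be passed as `c=colors` to `ax.scatter` together with the distances and whichever y values you choose. Note that a single scatter call does not produce separate legend entries, which is one reason to keep two calls as in the code above.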

Edit: adding code to answer a question in the comments regarding how to change the time to a 12-hour format (ranging from 0:00 to 11:59) and strip the seconds.

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np

    data = pd.read_csv('data.csv', header=None)
    ints = data[data[1] == 'int']
    exts = data[data[1] == 'ext']
    # Use each row's position as its y value and label the ticks with the raw time strings.
    int_index = data[data[1] == 'int'].index
    ext_index = data[data[1] == 'ext'].index
    time = [t for t in data[4]]
    int_dist = [d for d in ints[3]]
    ext_dist = [d for d in exts[3]]

    fig, ax = plt.subplots()
    ax.scatter(int_dist, int_index, c='orange', s=150)
    ax.scatter(ext_dist, ext_index, c='black', s=150)
    ax.set_yticks(np.arange(len(data[4])))
    ax.set_yticklabels(time)
    plt.legend(['int', 'ext'], loc=4)
    plt.xlabel('distance')
    plt.ylabel('time')
    plt.show()


