Python 3.4 reading from CSV formats
OK, I have some code in Python where I'm importing a CSV file. The problem is that some of the columns in the CSV file aren't basic numbers. There is one column that is text in the format "int, ext", and there is a column in o'clock format, ranging from "0:00" to "11:59". I also have a third column that is a normal number, a distance in "00.00" format.
My question is: how do I go about plotting distance vs. o'clock, changing the colors of the dots on the scatter plot based on whether each one is int or ext?
My first problem is how to make the program read the o'clock format and the text formats from the CSV.
Any ideas or suggestions? Thanks in advance.
Here is a sample of the CSV I'm trying to import:

    ml int .10 534.15 0:00
    ml ext .25 654.23 3:00
    ml int .35 743.12 6:30
I want to plot the 4th column on the x axis and the 5th column on the y axis. I also want to color-code the scatter plot dots red or blue depending on whether each one is int or ext.
Here is a sample of the code I have so far:

    import matplotlib.pyplot as plt
    from matplotlib import style
    import numpy as np

    style.use('ggplot')

    a, b, c, d = np.loadtxt('numbers.csv', unpack=True, delimiter=',')

    plt.scatter(a, b)
    plt.title('charts')
    plt.ylabel('y axis')
    plt.xlabel('x axis')
    plt.show()
Reading in the example CSV using pandas:

    import pandas as pd
    import matplotlib.pyplot as plt
    import datetime

    # No header row in the file, so the columns are numbered 0..4
    data = pd.read_csv('data.csv', sep='\t', header=None)
    print(data)
This prints:

        0    1     2       3     4
    0  ml  int  0.10  534.15  0:00
    1  ml  ext  0.25  654.23  3:00
    2  ml  int  0.35  743.12  6:30
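If pandas is not available, the same rows can be read with plain Python. This is only a minimal sketch, assuming the fields are separated by whitespace as in the sample from the question; the names `SAMPLE` and `parse_rows` are made up for illustration.

```python
import io

# Sample rows copied from the question; a real script would open the file instead.
SAMPLE = """ml int .10 534.15 0:00
ml ext .25 654.23 3:00
ml int .35 743.12 6:30
"""

def parse_rows(lines):
    """Split each line on whitespace and convert the numeric columns to float.

    The text column ('int'/'ext') and the time column ('H:MM') stay as strings.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        name, kind, value, distance, clock = fields
        rows.append((name, kind, float(value), float(distance), clock))
    return rows

rows = parse_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

For a comma-delimited file, `csv.reader` from the standard library could replace the `line.split()` step.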
Then separate 'int' from 'ext':

    ints = data[data[1] == 'int']
    exts = data[data[1] == 'ext']
Change the times to datetime and grab the distances:

    # Column 4 holds the 'H:MM' strings, column 3 the distances
    int_times = [datetime.datetime.strptime(t, '%H:%M').time() for t in ints[4]]
    ext_times = [datetime.datetime.strptime(t, '%H:%M').time() for t in exts[4]]
    int_dist = [d for d in ints[3]]
    ext_dist = [d for d in exts[3]]
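Depending on the matplotlib version, `datetime.time` objects may not be accepted directly as axis values. One possible workaround, sketched here as an assumption rather than as part of the original answer, is to convert each "H:MM" string to decimal hours and plot those numbers instead. The helper name `clock_to_hours` is hypothetical.

```python
from datetime import datetime

def clock_to_hours(text):
    """Convert an 'H:MM' string such as '6:30' to hours as a float (6.5)."""
    parsed = datetime.strptime(text, '%H:%M')
    return parsed.hour + parsed.minute / 60.0

times = ['0:00', '3:00', '6:30']  # the time column from the sample data
hours = [clock_to_hours(t) for t in times]
print(hours)  # [0.0, 3.0, 6.5]
```

Plain floats sort and scale correctly on a numeric axis, at the cost of needing a custom tick label if the axis should still read as a clock.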
Then plot a scatter plot for 'int' and 'ext' each:

    fig, ax = plt.subplots()
    ax.scatter(int_dist, int_times, c='orange', s=150)
    ax.scatter(ext_dist, ext_times, c='black', s=150)
    plt.legend(['int', 'ext'], loc=4)
    plt.xlabel('distance')
    plt.show()
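The question asked for red and blue dots. Besides calling `scatter` once per group, another option is to build one color per row and pass the list as `c` in a single call. The following is a rough sketch under that assumption; `colors_for` and `plot_colored` are illustrative names, and the y values are assumed to be plain numbers (for example decimal hours).

```python
def colors_for(kinds, palette=None):
    """Return one color name per label: 'red' for 'int', 'blue' for 'ext' by default."""
    if palette is None:
        palette = {'int': 'red', 'ext': 'blue'}
    return [palette[k] for k in kinds]

def plot_colored(distances, hours, kinds):
    """Draw all points in one scatter call, colored by their label."""
    # Imported here so that colors_for can be used without matplotlib installed.
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.scatter(distances, hours, c=colors_for(kinds), s=150)
    ax.set_xlabel('distance')
    ax.set_ylabel('time (hours)')
    plt.show()

kinds = ['int', 'ext', 'int']  # the second column of the sample data
print(colors_for(kinds))  # ['red', 'blue', 'red']
```

Calling `plot_colored([534.15, 654.23, 743.12], [0.0, 3.0, 6.5], kinds)` would draw the three sample points. A single `scatter` call produces only one legend entry, so the two-call version is simpler when a per-group legend is needed.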
Edit: adding code to answer the question in the comments about how to show the time in 12-hour format (ranging from 0:00 to 11:59) and strip the seconds.

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np

    # Default comma separator here; pass sep='\t' for a tab-separated file
    data = pd.read_csv('data.csv', header=None)

    ints = data[data[1] == 'int']
    exts = data[data[1] == 'ext']

    # Use the row index as the y position and label the ticks with the raw time strings
    int_index = data[data[1] == 'int'].index
    ext_index = data[data[1] == 'ext'].index
    time = [t for t in data[4]]
    int_dist = [d for d in ints[3]]
    ext_dist = [d for d in exts[3]]

    fig, ax = plt.subplots()
    ax.scatter(int_dist, int_index, c='orange', s=150)
    ax.scatter(ext_dist, ext_index, c='black', s=150)
    ax.set_yticks(np.arange(len(data[4])))
    ax.set_yticklabels(time)
    plt.legend(['int', 'ext'], loc=4)
    plt.xlabel('distance')
    plt.ylabel('time')
    plt.show()
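Labeling ticks by row index spaces the points evenly regardless of the actual time gaps. If the y values are numeric hours instead, the tick text can be produced by a small formatter. This is a sketch under that assumption; `format_clock` is a hypothetical helper, and it uses `str.format` so that it also runs on Python 3.4.

```python
def format_clock(hours, pos=None):
    """Format decimal hours as 'H:MM' on a 12-hour dial (0:00 to 11:59), without seconds.

    The unused 'pos' argument matches the (value, position) call made by
    matplotlib.ticker.FuncFormatter.
    """
    total_minutes = int(round(hours * 60))
    hour, minute = divmod(total_minutes, 60)
    return '{}:{:02d}'.format(hour % 12, minute)

print(format_clock(6.5))    # 6:30
print(format_clock(15.25))  # 3:15
```

It can be attached to an axis with `ax.yaxis.set_major_formatter(matplotlib.ticker.FuncFormatter(format_clock))`, which keeps the points at their true time positions while the labels stay in clock form.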