Working with datetime in Python


I have a file with the following format:

20150426010203 name1
20150426010303 name2
20150426010307 name3
20150426010409 name1
20150426010503 name4
20150426010510 name1

I am interested in finding the time differences between appearances of name1 in the list, and then calculating the frequency of such differences (for example, delta time = 1s appeared 20 times, delta time = 30s appeared 1 time, etc.). The second problem is how to find the number of events per minute/hour/day.

I found the time differences by using

pd.to_datetime(pd.Series([time]))

to convert each string to a datetime, and placed the values in a list named 'times'. Then I iterated through the list:

new = [x - times[i - 1] for i, x in enumerate(times)][1:]

and the resulting list looks like this:

dtype: timedelta64[ns], 0   00:00:50
dtype: timedelta64[ns], 0   00:00:10
dtype: timedelta64[ns], 0   00:00:51
dtype: timedelta64[ns], 0   00:00:09
dtype: timedelta64[ns], 0   00:00:50
dtype: timedelta64[ns], 0   00:00:11

Any further attempt to calculate the frequency results in a "TypeError: 'Series' objects are mutable, thus they cannot be hashed" error. I am also not sure how to calculate the number of events per minute or any other time unit.

Obviously, I don't have a lot of experience with datetime in Python, so any pointers would be appreciated.

Use resample and sum the number of events per time period. Examples are below.

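I gather you want the intervals for individuals (for name1: the interval from the 1st to the 2nd event, and the interval from his/her 2nd to 3rd event). For that you need to group by name and then difference the times within each group. In your dataset, only name1 has more than one event, and two events are necessary for a person-centric interval.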

Quick and dirty ...

import pandas as pd
from io import StringIO  # Python 3; in Python 2 use: from StringIO import StringIO

# --- data for a DataFrame you can play with ...
#     First, put the data in a multi-line string (I would read it from a file
#     if I had it in a file - for our purposes a string will do).
data = """time name
20150426010203 name1
20150426010303 name2
20150426010307 name3
20150426010409 name1
20150426010503 name4
20150426010510 name1"""
#    Second, use StringIO and pandas.read_csv to pretend
#    we are reading from a file.
df = pd.read_csv(StringIO(data), header=0, index_col=0, sep=r'\s+')
#    Third, because pandas did not recognise the date-time format
#    of the column I made the index, force the values to be
#    converted to pandas Timestamps so the index becomes a DatetimeIndex.
df.index = pd.to_datetime(df.index.astype(str), format='%Y%m%d%H%M%S')

# number of events per minute
df['event'] = 1
# sum events per time-period (recent pandas: call .sum() instead of how=sum)
dfepm = df[['event']].resample('1min').sum()

# number of events per hour
dfeph = df[['event']].resample('1h').sum()

# time differences by name
del df['event']  # don't need this anymore
df['time'] = df.index
df['time_diff_by_name'] = df.groupby('name')['time'].diff()
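The question also asked how often each time difference occurs, which the code above stops short of. A minimal sketch of one way to do it follows, assuming the same whitespace-separated sample data; the names name1_diffs and delta_counts are made up for illustration. The TypeError in the question most likely arose because each element of 'times' was a one-element Series, and Series objects cannot be hashed or counted. Keeping the differences in a single column avoids that, and value_counts() then does the tallying.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the question (assumed whitespace-separated).
data = """time name
20150426010203 name1
20150426010303 name2
20150426010307 name3
20150426010409 name1
20150426010503 name4
20150426010510 name1"""

# Read the timestamps as strings so the format string applies cleanly.
df = pd.read_csv(StringIO(data), sep=r"\s+", dtype={"time": str})
df["time"] = pd.to_datetime(df["time"], format="%Y%m%d%H%M%S")

# Interval between consecutive events of the same name (NaT for a first event).
df["time_diff_by_name"] = df.groupby("name")["time"].diff()

# Frequency of each interval length, in seconds, for name1 only.
# value_counts() drops the missing first-event entry by default.
name1_diffs = df.loc[df["name"] == "name1", "time_diff_by_name"]
delta_counts = name1_diffs.dt.total_seconds().value_counts().sort_index()
print(delta_counts)
```

Dropping the name1 filter would give the frequency of intervals across all names instead.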
