python - Calculating Active dates based on gap length using Pandas Dataframes -


I'm relatively new to pandas and am trying to figure out the best way of calculating some information; any help is appreciated. I have a dataframe that looks like so:

id     activity_date
1      2015-01-01
1      2015-01-02
1      2015-01-03
2      2015-01-02
2      2015-01-05
3      2015-01-10

and I want to calculate the following information: "how many days was each account active?" I understand how to count this, but I want to apply the following restriction: "if there are n days between activity dates, only count the days before the gap".

For example, with n = 5 the following should return a count of 4 active days, not 6:

id     activity_date
1      2015-01-01
1      2015-01-02
1      2015-01-04
1      2015-01-06
1      2015-01-14
1      2015-01-15
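To experiment with this example, the table above can be rebuilt as a DataFrame. This is a minimal sketch, assuming the columns are named `id` and `activity_date` as in the question and that the dates are parsed with `pd.to_datetime`; the `gaps` variable is a name introduced here for illustration only.

```python
import pandas as pd

# Sample data from the example above; column names follow the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 1],
    "activity_date": pd.to_datetime([
        "2015-01-01", "2015-01-02", "2015-01-04",
        "2015-01-06", "2015-01-14", "2015-01-15",
    ]),
})

# Gap in whole days between each activity date and the previous one.
# The first row has no predecessor, so its gap is NaN.
gaps = df["activity_date"].diff().dt.days
```

Only the jump from 2015-01-06 to 2015-01-14 is longer than 5 days, which is why the expected count is 4.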

After understanding what you want, this is simpler: calculate whether the difference between the current and previous rows is larger than 5 days, which gives a boolean Series, use that to filter the df, and then use the index value to perform the slicing:

In [57]:
inactive_index = df[df['activity_date'].diff() > pd.Timedelta(days=5)]
inactive_index

Out[57]:
   id activity_date
4   1    2015-01-14

In [18]:
inactive_index.index

Out[18]: Int64Index([4], dtype='int64')

In [58]:
df.iloc[:inactive_index.index[0]]

Out[58]:
   id activity_date
0   1    2015-01-01
1   1    2015-01-02
2   1    2015-01-04
3   1    2015-01-06
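The slicing above handles a single account. One possible way to extend it to every `id` is to apply the same gap test per group. The sketch below is not part of the original answer: `active_days` is a hypothetical helper, the sample frame combines the rows for ids 2 and 3 from the first table with the id 1 example, and a gap is assumed to count only when it is strictly longer than n days, matching the `>` comparison used above.

```python
import pandas as pd


def active_days(dates: pd.Series, n: int = 5) -> int:
    """Count activity dates that occur before the first gap longer than n days."""
    dates = dates.sort_values()
    too_long = dates.diff() > pd.Timedelta(days=n)
    # cumsum() stays at 0 until the first over-long gap, then is >= 1.
    return int((too_long.cumsum() == 0).sum())


# Assumed sample data: id 1 from the n = 5 example, ids 2 and 3 from the first table.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 1, 2, 2, 3],
    "activity_date": pd.to_datetime([
        "2015-01-01", "2015-01-02", "2015-01-04",
        "2015-01-06", "2015-01-14", "2015-01-15",
        "2015-01-02", "2015-01-05", "2015-01-10",
    ]),
})

# One count per account, indexed by id.
result = df.groupby("id")["activity_date"].apply(lambda s: active_days(s, n=5))
```

With this data, `result` holds 4 for id 1, 2 for id 2, and 1 for id 3. Sorting inside the helper means the input rows do not have to be in date order.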
