python - Adding a data frame column with len() of another column's values


I'm having a problem trying to get a character count column of the string values in another column, and haven't figured out how to do it efficiently.

for index in range(len(df)):
    df['char_length'][index] = len(df['string'][index])

This apparently involves first creating a column of nulls and then rewriting it, and it takes a long time on my data set. What's the most effective way of getting something like

'string'     'char_length'
abcd          4
abcde         5

I've checked around quite a bit, but haven't been able to figure it out.

pandas has a vectorised string method for this: str.len(). To create the new column you can write:

df['char_length'] = df['string'].str.len() 

For example:

>>> df
  string
0   abcd
1  abcde

>>> df['char_length'] = df['string'].str.len()
>>> df
  string  char_length
0   abcd            4
1  abcde            5

This should be considerably faster than looping over the DataFrame with a Python for loop.
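One rough way to check the difference on your own machine is to time both approaches. The sketch below is only illustrative: the sample data, its size, and the function names with_loop and with_str_len are made up for the comparison, and the loop only reads values row by row (as the question's code does) rather than assigning them. Actual timings will vary with the data and the pandas version.

```python
import timeit

import pandas as pd

# Hypothetical data set: 10,000 short strings in a column named 'string'
df = pd.DataFrame({'string': ['abcd', 'abcde'] * 5_000})


def with_loop():
    # Row-by-row lookup by index, similar to the loop in the question
    return [len(df['string'][index]) for index in range(len(df))]


def with_str_len():
    # Vectorised string method applied to the whole column at once
    return df['string'].str.len()


loop_time = timeit.timeit(with_loop, number=3)
str_len_time = timeit.timeit(with_str_len, number=3)
print(f"loop: {loop_time:.4f}s  str.len(): {str_len_time:.4f}s")
```

Both functions produce the same lengths, so the only thing being compared is how long each one takes.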

Many other familiar string methods from Python have been introduced to pandas. For example, lower (for converting to lowercase letters), count for counting occurrences of a particular substring, and replace for swapping one substring with another.
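As a small illustration of those three methods, here is a sketch using made-up sample data (the values and the variable names lowered, a_counts and swapped are assumptions for the example). Passing regex=False to replace makes it treat the pattern as a literal substring.

```python
import pandas as pd

# Hypothetical sample data; the column name 'string' follows the example above
df = pd.DataFrame({'string': ['Abcd', 'abcde', 'banana']})

# Convert every value to lowercase
lowered = df['string'].str.lower()

# Count occurrences of the substring 'a' in each value (case-sensitive)
a_counts = df['string'].str.count('a')

# Swap each literal 'a' for 'X'
swapped = df['string'].str.replace('a', 'X', regex=False)

print(lowered.tolist())
print(a_counts.tolist())
print(swapped.tolist())
```

Each call returns a new Series aligned with the original index, so the result can be assigned straight to a new column in the same way as df['char_length'].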

