python 2.7 - Grouping data by value ranges -


i have csv file shows parts on order. columns include days late, qty , commodity.

i need group data days late , commodity sum of qty. days late needs grouped ranges.

>56 >35 , <= 56 >14 , <= 35 >0 , <=14 

i hoping use dict how.

{'red':'>56,'amber':'>35 , <= 56','yellow':'>14 , <= 35','white':'>0 , <=14'} 

i looking result this

        red  amber  yellow  white strsub  56   60     74      40 botdwg  20   67     87      34 

i new pandas don't know if possible @ all. provide advice.

thanks

suppose start data:

df = pd.dataframe({'id': ('strsub botdwg'.split())*4,                    'days late': [60, 60, 50, 50, 20, 20, 10, 10],                    'quantity': [56, 20, 60, 67, 74, 87, 40, 34]}) #    days late      id  quantity # 0         60  strsub        56 # 1         60  botdwg        20 # 2         50  strsub        60 # 3         50  botdwg        67 # 4         20  strsub        74 # 5         20  botdwg        87 # 6         10  strsub        40 # 7         10  botdwg        34 

then can find status category using pd.cut. note default, pd.cut splits series df['days late'] categories half-open intervals, (-1, 14], (14, 35], (35, 56], (56, 365]:

df['status'] = pd.cut(df['days late'], bins=[-1, 14, 35, 56, 365], labels=false) labels = np.array('white yellow amber red'.split()) df['status'] = labels[df['status']] del df['days late'] print(df) #        id  quantity  status # 0  strsub        56     red # 1  botdwg        20     red # 2  strsub        60   amber # 3  botdwg        67   amber # 4  strsub        74  yellow # 5  botdwg        87  yellow # 6  strsub        40   white # 7  botdwg        34   white 

now use pivot dataframe in desired form:

df = df.pivot(index='id', columns='status', values='quantity') 

and use reindex obtain desired order rows , columns:

df = df.reindex(columns=labels[::-1], index=df.index[::-1]) 

thus,

import numpy np import pandas pd  df = pd.dataframe({'id': ('strsub botdwg'.split())*4,                    'days late': [60, 60, 50, 50, 20, 20, 10, 10],                    'quantity': [56, 20, 60, 67, 74, 87, 40, 34]}) df['status'] = pd.cut(df['days late'], bins=[-1, 14, 35, 56, 365], labels=false) labels = np.array('white yellow amber red'.split()) df['status'] = labels[df['status']] del df['days late'] df = df.pivot(index='id', columns='status', values='quantity') df = df.reindex(columns=labels[::-1], index=df.index[::-1]) print(df) 

yields

        red  amber  yellow  white id                                strsub   56     60      74     40 botdwg   20     67      87     34 

Comments

Popular posts from this blog

linux - Does gcc have any options to add version info in ELF binary file? -

javascript - Clean way to programmatically use CSS transitions from JS? -

android - send complex objects as post php java -