python - Pandas DataFrame from WB WDI data: combine year columns into "year" variable and merge rows -
i have dataset (.tsv file) following columns. (it's world bank's new wdi all-in all-time single-download dataset. nice!)
country countrycode varname 1960 1961 1962 afghanistan afg gdp 5.6 5.7 5.8 afghanistan afg gini .77 .78 .75 afghanistan afg educ 8.1 8.2 8.3 afghanistan afg pop 888 889 890 albania alb gdp 6.6 6.7 6.8 albania alb gini .45 .46 .47 albania alb educ 6.2 6.3 6.4 albania alb pop 777 778 779
i need pandas dataframe ['gdp','gini','edu','pop'] columns, along ['country', 'countrycode', 'year']. values "year" columns! , i'd there 1 row each country-year combination.
for instance, columns , first row be
country countrycode year gdp gini educ pop afghanistan afg 1960 5.6 .77 8.1 888
this seems complex pivot or opposite-of-"melt", cannot figure out.
in [59]: df out[59]: country countrycode varname 1960 1961 1962 0 afghanistan afg gdp 5.60 5.70 5.80 1 afghanistan afg gini 0.77 0.78 0.75 2 afghanistan afg educ 8.10 8.20 8.30 3 afghanistan afg pop 888.00 889.00 890.00 4 albania alb gdp 6.60 6.70 6.80 5 albania alb gini 0.45 0.46 0.47 6 albania alb educ 6.20 6.30 6.40 7 albania alb pop 777.00 778.00 779.00 in [60]: df = df.set_index(['country', 'countrycode', 'varname']) in [61]: df.columns.name = 'year' in [62]: df.stack().unstack('varname') out[62]: varname gdp gini educ pop country countrycode year afghanistan afg 1960 5.6 0.77 8.1 888 1961 5.7 0.78 8.2 889 1962 5.8 0.75 8.3 890 albania alb 1960 6.6 0.45 6.2 777 1961 6.7 0.46 6.3 778 1962 6.8 0.47 6.4 779
the latter frame mutliindex, can reset_index move multiindex regular columns.
Comments
Post a Comment