Python Data Analysis 1

pandas data structures: DataFrame and Series

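DataFrame: a two-dimensional, table-like pandas structure. Here it is used to turn records parsed from JSON strings into tabular form.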
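
As a rough sketch of that conversion, the snippet below parses a few JSON strings and passes the resulting dicts to pd.DataFrame. The sample lines are made up for illustration and are not the data file used in this post; only the column names tz and c are borrowed from the output shown further down.

```python
import json

import pandas as pd

# Hypothetical sample input: one JSON object per line (not the post's real data)
lines = [
    '{"tz": "America/New_York", "c": "US"}',
    '{"tz": "America/Denver", "c": "US"}',
    '{"c": "BR"}',  # this record has no "tz" key
]

# Parse each JSON string into a dict, then build the table from the list of dicts
records = [json.loads(line) for line in lines]
frame = pd.DataFrame(records)

# Keys missing from a record show up as NaN in the corresponding column
print(frame)
```

Each dict becomes one row and each key becomes one column, which is why a record without a tz key ends up with a missing value in that column.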

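If frame is a DataFrame, you can think of it as a table.

frame['column name'] returns a single column of that table, which is a Series.

series.value_counts() returns how many times each distinct value occurs in the Series, sorted from most to least frequent by default.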

 
In [64]: frame
Out[64]: 
                                                   a              al   c  \
0  Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKi...  en-US,en;q=0.8  US   
1                             GoogleMaps/RochesterNY             NaN  US   
2  Mozilla/4.0 (compatible; MSIE 8.0; Windows NT ...           en-US  US   
3  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8)...           pt-br  BR   
4  Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKi...  en-US,en;q=0.8  US   

           cy       g  gr       h          hc         hh         l  \
0     Danvers  A6qOVH  MA  wfLQtf  1331822918  1.usa.gov   orofrog   
1       Provo  mwszkS  UT  mwszkS  1308262393       j.mp     bitly   
2  Washington  xxr3Qb  DC  xxr3Qb  1331919941  1.usa.gov     bitly   
3        Braz  zCaLwp  27  zUtuOu  1331923068  1.usa.gov  alelex88   
4  Shrewsbury  9b6kNl  MA  9b6kNl  1273672411     bit.ly     bitly   

                         ll  nk  \
0   [42.576698, -70.954903]   1   
1  [40.218102, -111.613297]   0   
2     [38.9007, -77.043098]   1   
3  [-23.549999, -46.616699]   0   
4   [42.286499, -71.714699]   0   

                                                   r           t  \
0  http://www.facebook.com/l/7AQEFzjSi/1.usa.gov/...  1331923247   
1                           http://www.AwareMap.com/  1331923249   
2                               http://t.co/03elZC4Q  1331923250   
3                                             direct  1331923249   
4                http://www.shrewsbury-ma.gov/selco/  1331923251   

                  tz                                                  u  
0   America/New_York        http://www.ncbi.nlm.nih.gov/pubmed/22415991  
1     America/Denver        http://www.monroecounty.gov/etc/911/rss.php  
2   America/New_York  http://boxer.senate.gov/en/press/releases/0316...  
3  America/Sao_Paulo            http://apod.nasa.gov/apod/ap120312.html  
4   America/New_York  http://www.shrewsbury-ma.gov/egov/gallery/1341...  

In [65]: frame['tz']
Out[65]: 
0     America/New_York
1       America/Denver
2     America/New_York
3    America/Sao_Paulo
4     America/New_York
Name: tz, dtype: object

In [66]: frame['tz'].value_counts()
Out[66]: 
America/New_York     3
America/Sao_Paulo    1
America/Denver       1
Name: tz, dtype: int64
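
One detail worth knowing: by default value_counts() leaves missing values (NaN) out of the result, while an empty string is counted like any other value. The sketch below uses a small made-up Series (not the post's data) to show the difference; dropna=False is the pandas option that includes NaN in the counts.

```python
import numpy as np
import pandas as pd

# Hypothetical time zone column with one NaN and one empty string
tz = pd.Series(
    ["America/New_York", "America/Denver", np.nan, "", "America/New_York"],
    name="tz",
)

# Default behaviour: NaN is excluded, the empty string is counted as a value
default_counts = tz.value_counts()

# dropna=False keeps NaN as its own entry in the result
all_counts = tz.value_counts(dropna=False)

print(default_counts)
print(all_counts)
```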


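Two ways to fill in unknown values (they handle different cases, so both are applied)

# Replace missing values (NaN) with the label "Missing"
clean_tz = frame['tz'].fillna("Missing")

# Replace empty strings with the label "unknown"
clean_tz[clean_tz == ''] = "unknown"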
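
To see both steps working together, here is a self-contained sketch on a small made-up frame (the values are assumptions for illustration, not the post's data). It applies the same two lines and then counts the cleaned column.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN and two empty strings in the tz column
frame = pd.DataFrame(
    {"tz": ["America/New_York", np.nan, "", "America/New_York", ""]}
)

# Step 1: fillna returns a new Series in which NaN is replaced by "Missing"
clean_tz = frame["tz"].fillna("Missing")

# Step 2: boolean indexing overwrites the empty strings with "unknown"
clean_tz[clean_tz == ""] = "unknown"

# Both kinds of unknown value now appear as ordinary labels in the counts
tz_counts = clean_tz.value_counts()
print(tz_counts)
```

Because fillna returns a new Series, the original frame['tz'] column still contains its NaN after these steps; only clean_tz is modified.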
