# Viewing and Indexing Data

```python
import pandas as pd
import numpy as np
```

## Creating Data

```python
# A Series with one missing value (separate from the DataFrame built next)
ser = pd.Series([1, 2, 3, np.nan, 5])
date = pd.date_range('20170101', periods=7)
# Values are random, so the printed numbers differ between runs
s = pd.DataFrame(np.random.rand(7, 4), index=date, columns=['A', 'B', 'C', 'D'])
s
'''
                   A         B         C         D
2017-01-01  0.756081  0.273180  0.420455  0.632778
2017-01-02  0.823846  0.020843  0.365796  0.070497
2017-01-03  0.746543  0.182495  0.183828  0.290980
2017-01-04  0.480208  0.934172  0.558363  0.443208
2017-01-05  0.198488  0.566854  0.064392  0.106457
2017-01-06  0.639944  0.017557  0.744136  0.133560
2017-01-07  0.449300  0.792805  0.098290  0.716538
'''
```

## Viewing Data

### Selecting Columns

Three forms are covered here: `s['A']`, `s[['A']]`, and `s[['A','C']]`.

```python
# Select one column by name; a single label returns a Series
print(s['A'],'\n',s['A'].shape)
'''
2017-01-01    0.362362
2017-01-02    0.261357
2017-01-03    0.567673
2017-01-04    0.971732
2017-01-05    0.163707
2017-01-06    0.505815
2017-01-07    0.739000
Freq: D, Name: A, dtype: float64 
 (7,)
'''
```

```python
print(s[['A']],'\n',s[['A']].shape)
'''
                   A
2017-01-01  0.362362
2017-01-02  0.261357
2017-01-03  0.567673
2017-01-04  0.971732
2017-01-05  0.163707
2017-01-06  0.505815
2017-01-07  0.739000 
(7, 1)
'''
```

Select several columns by passing a list of names:

```python
print(s[['A','C']],'\n',s[['A','C']].shape)
'''
                   A         C
2017-01-01  0.325836  0.286616
2017-01-02  0.915091  0.907306
2017-01-03  0.540354  0.533346
2017-01-04  0.359208  0.466159
2017-01-05  0.513239  0.329017
2017-01-06  0.793126  0.382535
2017-01-07  0.444599  0.583308 
(7, 2)
'''
```

A column can also be accessed as an attribute, though this retrieves only one column at a time:

```python
# Equivalent to s['A']
s.A
'''
2017-01-01    0.756081
2017-01-02    0.823846
2017-01-03    0.746543
2017-01-04    0.480208
2017-01-05    0.198488
2017-01-06    0.639944
2017-01-07    0.449300
Freq: D, Name: A, dtype: float64
'''
```
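
The column-access forms above differ mainly in the type they return. The following is a minimal sketch, assuming a small hypothetical frame `demo` with fixed values (not the random data used on this page), so the results are reproducible:

```python
import pandas as pd

# Hypothetical frame with fixed values, used only for illustration
demo = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0], "C": [7.0, 8.0, 9.0]},
    index=pd.date_range("20170101", periods=3),
)

col_series = demo["A"]       # single label: returns a Series
col_frame = demo[["A"]]      # list with one label: returns a DataFrame
two_cols = demo[["A", "C"]]  # list of labels: returns a DataFrame
col_attr = demo.A            # attribute access: same data as demo["A"]

print(type(col_series).__name__, col_series.shape)
print(type(col_frame).__name__, col_frame.shape)
print(two_cols.shape)
print(col_attr.equals(col_series))
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method, so the bracket form is usually the safer choice.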

### Selecting Rows

As with columns, rows can be selected with bracket notation, here using a slice:

```python
# First 2 rows; the column header is printed but does not count as a row
print(s[:2])
'''
                   A         B         C         D
2017-01-01  0.756081  0.273180  0.420455  0.632778
2017-01-02  0.823846  0.020843  0.365796  0.070497
'''
```
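
To make the row-slicing behaviour concrete, here is a small sketch on a hypothetical frame `demo` with known values. It assumes a date index like the one above and shows that a positional slice excludes its end, while a label slice on the index includes both endpoints:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 4 rows, 3 columns, values 0..11
demo = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=pd.date_range("20170101", periods=4),
    columns=["A", "B", "C"],
)

first_two = demo[:2]                        # positional slice: rows 0 and 1
by_label = demo["2017-01-02":"2017-01-03"]  # label slice: both ends included

print(first_two.shape)
print(by_label.shape)
```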

```python
# Select one row by its index label; the result is a Series indexed by column name
print(s.loc['2017-01-02'],s.loc['2017-01-02'].shape)

'''
A    0.823846
B    0.020843
C    0.365796
D    0.070497
Name: 2017-01-02 00:00:00, dtype: float64 (4,)
'''
```

Selecting rows and columns at the same time

**loc and iloc**

```python
# Slice by row and column labels; loc is label-based and its slices include both endpoints (use iloc to select by integer position)
print(s.loc['2017-01-02':'2017-01-04',['A','D']])
'''
                   A         D
2017-01-02  0.823846  0.070497
2017-01-03  0.746543  0.290980
2017-01-04  0.480208  0.443208
'''
```

```python
# Select by integer position: row positions, then column positions
s.iloc[[1,2,3],[0,3]]
'''
                   A         D
2017-01-02  0.823846  0.070497
2017-01-03  0.746543  0.290980
2017-01-04  0.480208  0.443208
'''
```
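
The two selections above return the same rows because a `loc` label slice includes its end label, while an `iloc` positional slice excludes its end position. A minimal sketch, assuming a hypothetical frame `demo` with fixed values:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 5 rows, 4 columns, values 0..19
demo = pd.DataFrame(
    np.arange(20).reshape(5, 4),
    index=pd.date_range("20170101", periods=5),
    columns=["A", "B", "C", "D"],
)

# Label-based: the end label 2017-01-04 is included
by_label = demo.loc["2017-01-02":"2017-01-04", ["A", "D"]]
# Position-based: rows 1, 2, 3 (position 4 is excluded)
by_pos = demo.iloc[1:4, [0, 3]]
print(by_label.equals(by_pos))

# loc also accepts a list of labels, not only a slice
picked = demo.loc[[pd.Timestamp("2017-01-01"), pd.Timestamp("2017-01-05")], "B"]
print(picked.tolist())
```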

### Viewing the Head and Tail
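
The remaining examples use a frame named `df2` that this page never defines. The sketch below is an assumed reconstruction, inferred only from the outputs printed in the following sections (column names, values, and a 0 to 3 integer index). The original construction may have differed:

```python
import pandas as pd

# Assumed reconstruction of df2, inferred from the printed outputs on this page
df2 = pd.DataFrame({
    "A": 1.0,                                  # scalar, broadcast to every row
    "B": pd.Timestamp("20170102"),
    "C": [11, 22, 32, 44],
    "D": 3,
    "E": ["test", "train", "test", "train"],
    "F": "foo",
})
print(df2)
```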

```python
print(df2.head(n=2),'\n',df2.tail(n=2))
```

### Viewing the Index, Columns, and NumPy Values

```python
print(df2.index)
print(df2.columns)
print(df2.values)

'''
Int64Index([0, 1, 2, 3], dtype='int64')
Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')
[[1.0 Timestamp('2017-01-02 00:00:00') 11 3 'test' 'foo']
 [1.0 Timestamp('2017-01-02 00:00:00') 22 3 'train' 'foo']
 [1.0 Timestamp('2017-01-02 00:00:00') 32 3 'test' 'foo']
 [1.0 Timestamp('2017-01-02 00:00:00') 44 3 'train' 'foo']]
'''
```
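
As a complement to the attributes above, the following sketch uses a hypothetical two-row frame `demo` to show what each accessor returns. It uses `to_numpy()`, which returns the same kind of array as `.values`. Note that mixing integer and float columns yields a single upcast array dtype:

```python
import pandas as pd

# Hypothetical frame with one float column and one integer column
demo = pd.DataFrame({"A": [1.0, 2.0], "B": [3, 4]}, index=["x", "y"])

row_labels = demo.index.tolist()    # row labels as a plain list
col_names = demo.columns.tolist()   # column names as a plain list
arr = demo.to_numpy()               # 2-D NumPy array of the values

print(row_labels, col_names)
print(arr.shape, arr.dtype)
```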

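### Quick Statistical Summary

`describe()` computes summary statistics for numeric columns: count, mean, standard deviation, minimum, quartiles, and maximum.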

```python
# B, E, and F are not numeric, so they are left out of this output
df2.describe()
'''
         A          C    D
count  4.0   4.000000  4.0
mean   1.0  27.250000  3.0
std    0.0  14.080128  0.0
min    1.0  11.000000  3.0
25%    1.0  19.250000  3.0
50%    1.0  27.000000  3.0
75%    1.0  35.000000  3.0
max    1.0  44.000000  3.0
'''
```
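
A short sketch of how the summary treats numeric and text columns, assuming a hypothetical frame `demo` that reuses the `C` and `E` values shown on this page. Passing `include="all"` asks `describe()` to summarize every column rather than only the numeric ones:

```python
import pandas as pd

# Hypothetical frame with one numeric and one text column
demo = pd.DataFrame({
    "C": [11, 22, 32, 44],
    "E": ["test", "train", "test", "train"],
})

numeric_only = demo.describe()             # default: numeric columns only
everything = demo.describe(include="all")  # adds count/unique/top/freq for text

print(numeric_only)
print(everything)
```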

### Transposing Data

```python
'''
     A          B   C  D      E    F
0  1.0 2017-01-02  11  3   test  foo
1  1.0 2017-01-02  22  3  train  foo
2  1.0 2017-01-02  32  3   test  foo
3  1.0 2017-01-02  44  3  train  foo
'''

df2.columns
'''
Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')
'''

df2.T.columns

'''
Int64Index([0, 1, 2, 3], dtype='int64')
'''

df2.T

'''
                     0                    1                    2                    3
A                    1                    1                    1                    1
B  2017-01-02 00:00:00  2017-01-02 00:00:00  2017-01-02 00:00:00  2017-01-02 00:00:00
C                   11                   22                   32                   44
D                    3                    3                    3                    3
E                 test                train                 test                train
F                  foo                  foo                  foo                  foo
'''
```

### Sorting

```python
'''
     A          B   C  D      E    F
0  1.0 2017-01-02  11  3   test  foo
1  1.0 2017-01-02  22  3  train  foo
2  1.0 2017-01-02  32  3   test  foo
3  1.0 2017-01-02  44  3  train  foo
'''

# Sort rows by column C in descending order. sort_index no longer accepts `by`; use sort_values.
# inplace defaults to False (returns a new frame); inplace=True modifies df2 itself.
df2.sort_values(by='C', ascending=False, inplace=True)

# df2 is now reordered
df2
'''
     A          B   C  D      E    F
3  1.0 2017-01-02  44  3  train  foo
2  1.0 2017-01-02  32  3   test  foo
1  1.0 2017-01-02  22  3  train  foo
0  1.0 2017-01-02  11  3   test  foo
'''
```
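
The difference between sorting by values and sorting by index labels can be sketched on a hypothetical frame `demo` whose index is deliberately out of order. Both calls return new frames here because `inplace` is left at its default:

```python
import pandas as pd

# Hypothetical frame with a shuffled integer index
demo = pd.DataFrame({"C": [22, 11, 44, 32]}, index=[2, 0, 3, 1])

by_value = demo.sort_values(by="C", ascending=False)  # order rows by column C
by_index = demo.sort_index(axis=0)                    # order rows by index label

print(by_value)
print(by_index)
```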

### Boolean Selection

```python
# Boolean Series: True for each row where the condition holds
mask = df2['C'] > 25
print(mask)
df2[mask]
'''
3     True
2     True
1    False
0    False
Name: C, dtype: bool

     A          B   C  D      E    F
3  1.0 2017-01-02  44  3  train  foo
2  1.0 2017-01-02  32  3   test  foo
'''
```
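
Boolean masks can also be combined. The sketch below assumes a hypothetical frame `demo` reusing the `C` and `E` values from this page. It uses `&` and `|` with parentheses around each condition, since Python's `and`/`or` do not work element-wise on a Series:

```python
import pandas as pd

# Hypothetical frame with the C and E values used on this page
demo = pd.DataFrame({
    "C": [11, 22, 32, 44],
    "E": ["test", "train", "test", "train"],
})

# Rows where both conditions hold
both = demo[(demo["C"] > 25) & (demo["E"] == "train")]
# Rows where at least one condition holds
either = demo[(demo["C"] < 20) | (demo["E"] == "train")]

print(both)
print(either)
```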

### isin

```python
# Keep rows whose E value appears anywhere in the list ['abc','train',666]
df2[df2['E'].isin(['abc','train',666])]
'''
     A          B   C  D      E    F
3  1.0 2017-01-02  44  3  train  foo
1  1.0 2017-01-02  22  3  train  foo
'''
```
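
Because `isin` returns a boolean Series, it can be inverted with `~` to select the rows that do not match. A minimal sketch, assuming a hypothetical frame `demo` with the same `E` values:

```python
import pandas as pd

# Hypothetical frame with the E values used on this page
demo = pd.DataFrame({"E": ["test", "train", "test", "train"]})

keep = demo["E"].isin(["abc", "train", 666])  # True where E is in the list
matched = demo[keep]     # rows whose E value is in the list
others = demo[~keep]     # rows whose E value is not in the list

print(matched)
print(others)
```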

## Changing Data Types

```python
'''
     A          B     C  D      E    F
0  1.0 2017-01-02  11.0  3   test  foo
1  1.0 2017-01-02  22.0  3  train  foo
2  1.0 2017-01-02  32.0  3   test  foo
3  1.0 2017-01-02  44.0  3  train  foo
'''
# np.int was removed from NumPy; use the built-in int (or np.int64) instead
df2['C'] = df2['C'].astype(int)

'''
     A          B   C  D      E    F
0  1.0 2017-01-02  11  3   test  foo
1  1.0 2017-01-02  22  3  train  foo
2  1.0 2017-01-02  32  3   test  foo
3  1.0 2017-01-02  44  3  train  foo
'''
```
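
As a further illustration, `astype` can convert one column at a time or several at once through a dict mapping column names to dtypes. This sketch assumes a hypothetical frame `demo` and explicit dtype strings:

```python
import pandas as pd

# Hypothetical frame: C holds whole numbers stored as floats
demo = pd.DataFrame({"C": [11.0, 22.0, 32.0, 44.0], "D": [3, 3, 3, 3]})

# Convert a single column and assign it back
demo["C"] = demo["C"].astype("int64")

# Convert selected columns via a dict; returns a new frame
converted = demo.astype({"D": "float64"})

print(demo.dtypes)
print(converted.dtypes)
```

A float column that contains `NaN` cannot be cast to a plain integer dtype this way, so missing values need to be handled first.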

