
[pandas] loc Basics and Descriptive Statistics

바틀비 2024. 1. 17. 20:00

In [2]:

import numpy as np
import pandas as pd
import seaborn as sns

In [3]:

sns.get_dataset_names() # returns only the names of the datasets available through seaborn

Out[3]:

['anagrams',
 'anscombe',
 'attention',
 'brain_networks',
 'car_crashes',
 'diamonds',
 'dots',
 'dowjones',
 'exercise',
 'flights',
 'fmri',
 'geyser',
 'glue',
 'healthexp',
 'iris',
 'mpg',
 'penguins',
 'planets',
 'seaice',
 'taxis',
 'tips',
 'titanic']

In [4]:

iris = sns.load_dataset('iris') # load a sample dataset
iris.head()

Out[4]:

	sepal_length	sepal_width	petal_length	petal_width	species
0	5.1	3.5	1.4	0.2	setosa
1	4.9	3.0	1.4	0.2	setosa
2	4.7	3.2	1.3	0.2	setosa
3	4.6	3.1	1.5	0.2	setosa
4	5.0	3.6	1.4	0.2	setosa

loc

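df.loc[row selector, column selector]
row selector: a boolean condition such as df[column name] > 7 or df[column name] == value, or a range of index labels.
column selector: a column name, a list of column names, or a range of column names.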
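As a minimal sketch of the general form, the cell below builds a small hypothetical frame (`df`, with made-up columns `a`, `b`, `c`, not one of the seaborn datasets) and applies one boolean row selector with each kind of column selector.

```python
import pandas as pd

# Hypothetical toy frame, used only for illustration
df = pd.DataFrame(
    {"a": [5, 8, 9], "b": [1.0, 2.0, 3.0], "c": ["x", "y", "z"]}
)

# Row selector: a boolean mask built from a column comparison
mask = df["a"] > 7

# Column selector: a single label, a list of labels, or a label slice
one_col = df.loc[mask, "b"]            # single label -> Series
some_cols = df.loc[mask, ["a", "c"]]   # list of labels -> DataFrame
col_range = df.loc[mask, "a":"b"]      # label slice -> DataFrame, both ends included
```

A single label returns a Series, while a list or a slice returns a DataFrame, which is why later cells write `['tip']` rather than `'tip'` when a table-shaped result is wanted.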

In [5]:

# Rows with index labels 0 through 4, all columns. loc is label-based, so the end label is included and no end+1 is needed.
iris.loc[0 : 4, :]

Out[5]:

	sepal_length	sepal_width	petal_length	petal_width	species
0	5.1	3.5	1.4	0.2	setosa
1	4.9	3.0	1.4	0.2	setosa
2	4.7	3.2	1.3	0.2	setosa
3	4.6	3.1	1.5	0.2	setosa
4	5.0	3.6	1.4	0.2	setosa
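To make the inclusive end concrete, here is a small sketch that compares `loc` with the position-based `iloc` on a hypothetical one-column frame. The frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..5
df = pd.DataFrame({"v": [10, 20, 30, 40, 50, 60]})

by_label = df.loc[0:4]       # labels 0, 1, 2, 3, 4 -> end label included
by_position = df.iloc[0:4]   # positions 0, 1, 2, 3 -> end position excluded

# The same rule applies when the index labels are not integers
lettered = pd.DataFrame({"v": [10, 20, 30, 40, 50, 60]}, index=list("abcdef"))
b_to_d = lettered.loc["b":"d"]   # rows b, c and d
```

With the default integer index the two look similar, so the difference is easy to miss: `loc[0:4]` returns five rows and `iloc[0:4]` returns four.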

In [6]:

# Rows 1 through 5, only the petal_length column
iris.loc[1 : 5, ['petal_length']]

Out[6]:

	petal_length
1	1.4
2	1.3
3	1.5
4	1.4
5	1.7

In [7]:

# Rows 1 through 3, columns from sepal_length through petal_length
iris.loc[1 : 3, 'sepal_length':'petal_length']

Out[7]:

	sepal_length	sepal_width	petal_length
1	4.9	3.0	1.4
2	4.7	3.2	1.3
3	4.6	3.1	1.5

In [8]:

# Rows 1 through 3, only the petal_length and sepal_length columns
# This differs from the range above: to pick specific columns, pass their names as a list, hence the [].
iris.loc[1 : 3, ['petal_length', 'sepal_length']]

Out[8]:

	petal_length	sepal_length
1	1.4	4.9
2	1.3	4.7
3	1.5	4.6
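The difference between a column range and a column list can be sketched on a two-row hypothetical frame that reuses the iris column names. A slice follows the frame's own column order, so writing the range backwards should select nothing, while a list returns exactly the columns named, in the order given.

```python
import pandas as pd

# Hypothetical two-row frame with the same column order as iris
df = pd.DataFrame({
    "sepal_length": [4.9, 4.7],
    "sepal_width": [3.0, 3.2],
    "petal_length": [1.4, 1.3],
    "petal_width": [0.2, 0.2],
})

# Slice in column order: every column from sepal_length through petal_length
forward = df.loc[:, "sepal_length":"petal_length"]

# Slice written backwards: the start comes after the end, so no columns match
backward = df.loc[:, "petal_length":"sepal_length"]

# List: only the named columns, in the order they are listed
picked = df.loc[:, ["petal_length", "sepal_length"]]
```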

In [9]:

tips = sns.load_dataset('tips')
print(tips.head())
print(tips.shape)

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
(244, 7)

In [10]:

tips.columns # list all column names

Out[10]:

Index(['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size'], dtype='object')

In [11]:

# Only rows where total_bill is at least 45; all columns.
tips.loc[tips['total_bill'] >= 45, :]

Out[11]:

	total_bill	tip	sex	smoker	day	time	size
59	48.27	6.73	Male	No	Sat	Dinner	4
156	48.17	5.00	Male	No	Sun	Dinner	6
170	50.81	10.00	Male	Yes	Sat	Dinner	3
182	45.35	3.50	Male	Yes	Sun	Dinner	3
212	48.33	9.00	Male	No	Sat	Dinner	4
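It may help to look at the condition on its own. In this sketch (a hypothetical four-row stand-in for tips), the comparison produces a boolean Series with one True/False per row, and `loc` keeps the rows marked True.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of tips
df = pd.DataFrame({
    "total_bill": [16.99, 48.27, 50.81, 10.34],
    "tip": [1.01, 6.73, 10.00, 1.66],
})

# The comparison alone yields a boolean Series aligned with the index
mask = df["total_bill"] >= 45

# True counts as 1, so summing the mask counts the matching rows
n_matches = int(mask.sum())

# loc keeps the rows where the mask is True and preserves their index labels
selected = df.loc[mask, :]
```

The original index labels survive the filter, which is why Out[11] shows labels such as 59 and 156 rather than 0, 1, 2.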

In [12]:

# Multiple conditions
# Rows where sex is Male and total_bill is at least 25; only the tip column
# Wrap each condition in ().
# Use & for "and".

tips.loc[
(tips['sex'] == 'Male') &
(tips['total_bill'] >= 25),
['tip']
].head()

Out[12]:

	tip
5	4.71
7	3.12
23	7.58
39	5.00
44	5.60

In [17]:

result = tips.loc[(tips['sex'] == 'Male') & (tips['total_bill'] >= 25), ['tip']].head()
result.reset_index(drop = True) # renumber the index from 0, discarding the old labels

Out[17]:

	tip
0	4.71
1	3.12
2	7.58
3	5.00
4	5.60
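A short sketch of what `drop` changes, using a hypothetical filtered frame: without `drop=True` the old labels are kept as a new column named `index`, and in both cases `reset_index` returns a new frame instead of modifying the original.

```python
import pandas as pd

# Hypothetical frame; filtering leaves gaps in the index (labels 1 and 3)
df = pd.DataFrame({"tip": [1.0, 4.7, 2.0, 7.6]})
filtered = df.loc[df["tip"] > 4, ["tip"]]

# Default: the old labels move into a column called 'index'
kept = filtered.reset_index()

# drop=True: the old labels are discarded
dropped = filtered.reset_index(drop=True)
```

Because the result is a new object, it has to be assigned (for example `result = result.reset_index(drop=True)`) for the change to persist.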

In [14]:

# Rows where sex is Male or total_bill is at least 25; only the tip column
# Use | for "or".
tips.loc[
(tips['sex'] == 'Male') |
(tips['total_bill'] >= 25),
['tip']
].head()

Out[14]:

	tip
1	1.66
2	3.50
3	3.31
5	4.71
6	2.00
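One way to keep combined conditions readable is to name each mask first. The sketch below uses a hypothetical four-row frame with the tips column names; the variable names `is_male` and `is_large` are illustrative. The parentheses around each comparison matter because `&` and `|` bind more tightly than `==` and `>=` in Python, and naming the masks avoids that issue altogether.

```python
import pandas as pd

# Hypothetical stand-in for tips
df = pd.DataFrame({
    "sex": ["Female", "Male", "Male", "Female"],
    "total_bill": [30.0, 12.0, 40.0, 9.0],
    "tip": [4.0, 2.0, 6.0, 1.5],
})

# Each condition becomes its own boolean Series
is_male = df["sex"] == "Male"
is_large = df["total_bill"] >= 25

both = df.loc[is_male & is_large, ["tip"]]         # and: row 2
either = df.loc[is_male | is_large, ["tip"]]       # or: rows 0, 1, 2
neither = df.loc[~(is_male | is_large), ["tip"]]   # not: row 3
```

The `~` operator negates a mask, which completes the set of element-wise logical operators alongside `&` and `|`.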

In [15]:

# Descriptive statistics

In [16]:

print(tips.loc[:, 'total_bill'].mean())
print(tips.loc[:, 'total_bill'].std())
print(tips.loc[:, 'total_bill'].median())
print(tips.loc[:, 'total_bill'].max())
print(tips.loc[:, 'total_bill'].min())

19.78594262295082
8.902411954856856
17.795
50.81
3.07
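The five separate calls can also be collapsed into one. As a sketch on a hypothetical four-value Series (not the real tips data), `agg` accepts a list of statistic names and `describe` returns a standard summary. Note that pandas computes `std` as the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Hypothetical values standing in for a numeric column
bills = pd.Series([10.0, 20.0, 30.0, 40.0], name="total_bill")

# Several statistics in one call; the result is a Series indexed by statistic name
summary = bills.agg(["mean", "std", "median", "max", "min"])

# describe() adds the count and the quartiles
described = bills.describe()

# Population standard deviation, for comparison with the default sample version
population_std = bills.std(ddof=0)
```

The same calls should work on `tips.loc[:, 'total_bill']`, and `describe()` on the whole frame summarizes every numeric column at once.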

License: Attribution, NonCommercial, NoDerivatives
