###LABEL ENCODING###

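from sklearn.preprocessing import LabelEncoder

# Items: refrigerator, microwave, computer, fan, mixer, mixer
items=['냉장고','전자레인지','컴퓨터','선풍기','믹서','믹서']

# fit() learns the sorted set of classes; transform() maps each item to its class index
encoder=LabelEncoder()
encoder.fit(items)
labels=encoder.transform(items)
print(labels)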

[0 3 4 2 1 1]

print(encoder.classes_)

['냉장고' '믹서' '선풍기' '전자레인지' '컴퓨터']
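
The integer labels above are just positions in `encoder.classes_`, so they can be mapped back to the original strings. The following is a minimal sketch of that round trip, assuming the same `items` list; the names `decoded` and `mapping` are illustrative and not part of the original post.

```python
from sklearn.preprocessing import LabelEncoder

items = ['냉장고', '전자레인지', '컴퓨터', '선풍기', '믹서', '믹서']

encoder = LabelEncoder()
labels = encoder.fit_transform(items)  # fit() and transform() in one call

# Map the integer labels back to the original strings
decoded = encoder.inverse_transform(labels)

# Build an explicit class -> label lookup from classes_ (illustrative helper)
mapping = {cls: idx for idx, cls in enumerate(encoder.classes_)}

print(list(decoded))
print(mapping)
```

Note that the integers carry no real ordering (a mixer is not "less than" a fan), which is one reason the one-hot approaches in the following sections are often preferred for model features.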

###ONE-HOT ENCODING###

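from sklearn.preprocessing import LabelEncoder, OneHotEncoder

items=['냉장고','전자레인지','컴퓨터','선풍기','믹서','믹서']

# Step 1: convert the strings to integer labels
encoder=LabelEncoder()
encoder.fit(items)
labels=encoder.transform(items)

# Step 2: OneHotEncoder expects 2-D input, so reshape to a single column
labels=labels.reshape(-1,1)

# Step 3: one-hot encode; transform() returns a sparse matrix, toarray() makes it dense
oh_encoder=OneHotEncoder()
oh_encoder.fit(labels)
oh_labels=oh_encoder.transform(labels)
oh_labels.toarray()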

array([[1., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.],
       [0., 0., 1., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 1., 0., 0., 0.]])
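
In recent scikit-learn versions, `OneHotEncoder` also accepts string categories directly, so the intermediate `LabelEncoder` step can be skipped. The sketch below assumes such a version is installed; `dense` and `categories` are illustrative names.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# The encoder expects 2-D input: one row per sample, one column per feature
items = np.array(['냉장고', '전자레인지', '컴퓨터', '선풍기', '믹서', '믹서']).reshape(-1, 1)

oh_encoder = OneHotEncoder()
dense = oh_encoder.fit_transform(items).toarray()  # sparse matrix -> dense array

# categories_ holds one array of sorted categories per input column
categories = oh_encoder.categories_[0]

print(categories)
print(dense)
```

The column order follows `categories_`, which matches the sorted class order seen with `LabelEncoder`, so the resulting matrix should be the same as the two-step version.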

###GET DUMMIES###

import pandas as pd
df=pd.DataFrame({'item':['냉장고','전자레인지','컴퓨터','선풍기','믹서','믹서']})
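df.head()  # preview the first five rows

# One-hot encode the string column directly. dtype=int yields the 0/1 values shown below;
# pandas 2.0 and later return True/False by default.
pd.get_dummies(df, dtype=int)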

	item_냉장고	item_믹서	item_선풍기	item_전자레인지	item_컴퓨터
0	1	0	0	0	0
1	0	0	0	1	0
2	0	0	0	0	1
3	0	0	1	0	0
4	0	1	0	0	0
5	0	1	0	0	0
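
Because each row of the dummy table has exactly one 1, the original category can be recovered from it. This is a small sketch of that check, assuming the same `df`; `dummies` and `recovered` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'item': ['냉장고', '전자레인지', '컴퓨터', '선풍기', '믹서', '믹서']})

# columns= selects which columns to encode; dtype=int keeps 0/1 instead of booleans
dummies = pd.get_dummies(df, columns=['item'], dtype=int)

# idxmax(axis=1) returns the name of the column holding the 1 in each row;
# stripping the 'item_' prefix gives back the original value
recovered = dummies.idxmax(axis=1).str.replace('item_', '', regex=False)

print(dummies)
print(recovered.tolist())
```

Unlike the scikit-learn encoders, `pd.get_dummies` keeps no fitted state, so new data with unseen or missing categories may produce a different set of columns.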

# Output of df.head() from the GET DUMMIES section (first five rows):
	item
0	냉장고
1	전자레인지
2	컴퓨터
3	선풍기
4	믹서
