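Linear Models for Classification (sklearn)
Abstract: This post shows how to use scikit-learn's linear models for classification problems, covering logistic regression and linear discriminant analysis.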
00 Installing the scikit-learn library
pip install scikit-learn
01 Loading the iris dataset from sklearn
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model
iris=datasets.load_iris()
# Randomly pick 120 of the 150 sample indices (without replacement) for training
dex1=np.random.choice(150,size=120,replace=False)
# The remaining 30 indices form the test set
dex2=[]
for i in range(150):
    if i not in dex1:
        dex2.append(i)
train_x=iris.data[dex1,:]
train_y=iris.target[dex1]
test_x=iris.data[dex2,:]
test_y=iris.target[dex2]
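The manual index split above can also be done with scikit-learn's own helper. The following is a minimal sketch of that alternative, not the code that produced the outputs in this post; the stratify argument and random_state=0 are assumptions added here so that the split is balanced across classes and reproducible.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# Hold out 30 of the 150 samples, matching the 120/30 split used in this post.
# stratify keeps the three classes balanced in both parts;
# random_state=0 is an arbitrary seed chosen here for reproducibility.
train_x, test_x, train_y, test_y = train_test_split(
    iris.data, iris.target, test_size=30, stratify=iris.target, random_state=0
)
print(train_x.shape, test_x.shape)
```

Because np.random.choice is called without a seed in the original code, the scores and coefficients shown in this post will differ from run to run; fixing a seed as above avoids that.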
02 Logistic regression
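# One-vs-rest logistic regression with the liblinear solver
# (note: the multi_class parameter is deprecated in recent scikit-learn releases)
regre=linear_model.LogisticRegression(multi_class='ovr',solver='liblinear')
regre.fit(train_x,train_y)
# Mean accuracy on the held-out test set
regre.score(test_x,test_y)
# One row of coefficients per class, one column per feature
regre.coef_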
Out[40]:
array([[ 0.39001649, 1.4110123 , -2.14837944, -0.97686956],
[ 0.54586613, -1.70617607, 0.38138451, -1.1176497 ],
[-1.63246048, -1.17046085, 2.28632906, 2.21395272]])
regre.intercept_
Out[41]: array([ 0.25529718, 0.900473 , -1.07104004])
regre.predict(test_x)
Out[42]:
array([0, 0, 0, 0, 0, 0, 1, 2, 2, 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2,
2, 2, 2, 2, 2, 2, 2, 2])
# Class probabilities for each test sample, and their logarithms
regre.predict_proba(test_x)
regre.predict_log_proba(test_x)
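The relationship between predict, predict_proba and predict_log_proba can be checked directly. The sketch below is an illustration under assumptions of its own: it fits on the full iris data rather than on the random split above, uses the default LogisticRegression settings, and raises max_iter to 1000 so that the solver converges.

```python
import numpy as np
from sklearn import datasets, linear_model

iris = datasets.load_iris()

# Default settings apart from max_iter, which is raised here (an assumption)
# so that the solver has enough iterations to converge.
clf = linear_model.LogisticRegression(max_iter=1000)
clf.fit(iris.data, iris.target)

proba = clf.predict_proba(iris.data)          # shape (n_samples, n_classes)
log_proba = clf.predict_log_proba(iris.data)  # elementwise log of proba

# Each row of probabilities sums to 1
rows_sum_to_one = np.allclose(proba.sum(axis=1), 1.0)
# The log-probabilities are the logarithm of the probabilities
logs_match = np.allclose(np.exp(log_proba), proba)
# predict returns the class with the largest probability
argmax_matches = np.array_equal(
    clf.classes_[proba.argmax(axis=1)], clf.predict(iris.data)
)
print(rows_sum_to_one, logs_match, argmax_matches)
```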
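# Multinomial (softmax) logistic regression with the lbfgs solver, capped at 102 iterations
regre=linear_model.LogisticRegression(multi_class='multinomial',solver='lbfgs',max_iter=102)
regre.fit(train_x,train_y)
regre.score(test_x,test_y)
# Number of iterations the solver actually ran
regre.n_iter_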
Out[43]: array([96])
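The n_iter_ value above stays below max_iter, which means the solver stopped on its own. A minimal sketch of what happens when the cap is too low is given below; the values 5 and 1000 and the use of the full iris data are assumptions chosen for illustration only.

```python
import warnings
from sklearn import datasets, linear_model
from sklearn.exceptions import ConvergenceWarning

iris = datasets.load_iris()

results = {}
for max_iter in (5, 1000):
    clf = linear_model.LogisticRegression(solver='lbfgs', max_iter=max_iter)
    # Record warnings so we can see whether the solver hit the iteration cap
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        clf.fit(iris.data, iris.target)
    hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    results[max_iter] = (int(clf.n_iter_.max()), hit_limit)
    print(max_iter, results[max_iter])
```

When a ConvergenceWarning appears, the usual options are to raise max_iter or to standardize the features before fitting.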
03 Linear discriminant analysis
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, discriminant_analysis
iris=datasets.load_iris()
# Draw a fresh random 120/30 train/test split
dex1=np.random.choice(150,size=120,replace=False)
dex2=[]
for i in range(150):
    if i not in dex1:
        dex2.append(i)
train_x=iris.data[dex1,:]
train_y=iris.target[dex1]
test_x=iris.data[dex2,:]
test_y=iris.target[dex2]
# Linear discriminant analysis with default settings
regre=discriminant_analysis.LinearDiscriminantAnalysis()
regre.fit(train_x,train_y)
# Mean accuracy on the held-out test set
regre.score(test_x,test_y)
Out[52]: 1.0
regre.coef_
Out[54]:
array([[ 6.34755316, 13.66153017, -16.63757493, -22.43052621],
[ -1.61851372, -4.47717549, 4.1348641 , 3.09718096],
[ -4.26521741, -8.34418599, 11.26989661, 17.27715514]])
regre.intercept_
Out[55]: array([-18.84873469, 0.66754016, -30.35506047])
regre.predict_proba(test_x)
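The coef_ and intercept_ arrays printed above define one linear score per class, and the predicted class is the one with the highest score. The sketch below checks this against decision_function and predict; fitting on the full iris data instead of the random split is an assumption made here to keep the example short.

```python
import numpy as np
from sklearn import datasets, discriminant_analysis

iris = datasets.load_iris()
lda = discriminant_analysis.LinearDiscriminantAnalysis()
lda.fit(iris.data, iris.target)

# One linear score per class: x @ coef_.T + intercept_
manual_scores = iris.data @ lda.coef_.T + lda.intercept_
scores_match = np.allclose(manual_scores, lda.decision_function(iris.data))

# The predicted class is the one with the highest score
pred_match = np.array_equal(
    lda.classes_[manual_scores.argmax(axis=1)], lda.predict(iris.data)
)
print(scores_match, pred_match)
```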
04 Summary
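01 Linear models can be used not only for regression but also for classification;
02 For LogisticRegression and LinearDiscriminantAnalysis, the number of attributes (variables, features) equals the number of entries in each row of coef_ (its column count). For a multiclass problem such as this one, the number of label (target) classes equals the number of rows of coef_ and the number of entries in intercept_; in the two-class case both estimators store only a single row;
03 LogisticRegression and LinearDiscriminantAnalysis return not only the predicted class but also, through predict_proba, the probability of each class for every sample;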
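The shape statements in point 02 can be verified with a short check. This is a sketch under assumptions of its own: both models are fitted on the full iris data, and max_iter=1000 is set for LogisticRegression so that it converges.

```python
from sklearn import datasets, discriminant_analysis, linear_model

iris = datasets.load_iris()

models = {
    # max_iter=1000 is an assumption made here so the solver converges
    "LogisticRegression": linear_model.LogisticRegression(max_iter=1000),
    "LinearDiscriminantAnalysis": discriminant_analysis.LinearDiscriminantAnalysis(),
}

shapes = {}
for name, model in models.items():
    model.fit(iris.data, iris.target)
    # coef_ is (n_classes, n_features); intercept_ is (n_classes,)
    shapes[name] = (model.coef_.shape, model.intercept_.shape)
    print(name, shapes[name])
```

With 4 features and 3 classes, both estimators should report a coef_ of shape (3, 4) and an intercept_ of shape (3,).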