Linear Models for Classification (sklearn)

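Abstract: This article demonstrates how to use the linear models in scikit-learn for classification problems, covering logistic regression and linear discriminant analysis.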

00 Installing the scikit-learn library

pip install scikit-learn

01 Loading the iris dataset from sklearn

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model

iris=datasets.load_iris()
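# Randomly pick 120 distinct sample indices for the training set
dex1=np.random.choice(150,size=120,replace=False)
# The remaining 30 indices (in ascending order) form the test set
dex2=np.setdiff1d(np.arange(150),dex1)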
train_x=iris.data[dex1,:]
train_y=iris.target[dex1]
test_x=iris.data[dex2,:]
test_y=iris.target[dex2]
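
The manual index split above can also be written with scikit-learn's own helper. The following is a minimal sketch, not the method used in this article: it assumes a fixed `random_state=0` for reproducibility and adds `stratify`, which keeps the three classes equally represented in the test set (the random split above does not guarantee that).

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# Hold out 30 of the 150 samples; stratify keeps class proportions equal.
# random_state=0 is an arbitrary choice made for this sketch.
train_x, test_x, train_y, test_y = train_test_split(
    iris.data, iris.target, test_size=30, stratify=iris.target, random_state=0
)

# Count how many test samples fall into each of the three classes
test_class_counts = np.bincount(test_y)
print(train_x.shape, test_x.shape, test_class_counts)
```

Without `stratify`, a random 120/30 split may leave one species under-represented in the test set, which makes the reported accuracy vary more from run to run.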

02 Logistic regression

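# One-vs-rest logistic regression fitted with the liblinear solver
# (note: the multi_class argument is deprecated in recent scikit-learn releases)
regre=linear_model.LogisticRegression(multi_class='ovr',solver='liblinear')
regre.fit(train_x,train_y)
# Mean accuracy on the held-out test set
regre.score(test_x,test_y)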


regre.coef_

Out[40]:
array([[ 0.39001649,  1.4110123 , -2.14837944, -0.97686956],
       [ 0.54586613, -1.70617607,  0.38138451, -1.1176497 ],
       [-1.63246048, -1.17046085,  2.28632906,  2.21395272]])

regre.intercept_
Out[41]: array([ 0.25529718,  0.900473  , -1.07104004])

regre.predict(test_x)
Out[42]:
array([0, 0, 0, 0, 0, 0, 1, 2, 2, 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2,
       2, 2, 2, 2, 2, 2, 2, 2])

# Per-class probabilities for each test sample, and their natural logarithms
regre.predict_proba(test_x)
regre.predict_log_proba(test_x)
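
To make the relationship between `predict`, `predict_proba` and `predict_log_proba` concrete, here is a small sketch. It is an illustration under assumptions: the model is fitted on the full iris data with default settings (only `max_iter=1000` is set), not on the random split used in this article, and the names `clf`, `proba` and `log_proba` are chosen here for the example.

```python
import numpy as np
from sklearn import datasets, linear_model

iris = datasets.load_iris()

# Default LogisticRegression; max_iter is raised so the solver has room to converge
clf = linear_model.LogisticRegression(max_iter=1000)
clf.fit(iris.data, iris.target)

proba = clf.predict_proba(iris.data)          # shape (150, 3), one column per class
log_proba = clf.predict_log_proba(iris.data)  # elementwise log of proba

# Each row is a probability distribution over the three classes
rows_sum_to_one = np.allclose(proba.sum(axis=1), 1.0)
# The log-probabilities are the logarithm of the probabilities
log_matches = np.allclose(log_proba, np.log(proba))
# The predicted label is the class with the highest probability
argmax_matches = np.array_equal(
    clf.classes_[proba.argmax(axis=1)], clf.predict(iris.data)
)
print(rows_sum_to_one, log_matches, argmax_matches)
```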

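# Multinomial (softmax) logistic regression with the lbfgs solver, capped at 102 iterations
regre=linear_model.LogisticRegression(multi_class='multinomial',solver='lbfgs',max_iter=102)
regre.fit(train_x,train_y)
regre.score(test_x,test_y)
# Number of iterations the solver actually ran
regre.n_iter_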
Out[43]: array([96])
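
The `n_iter_` value can be compared against `max_iter` to judge whether the solver stopped on its own or ran into the iteration cap. The sketch below is a rough heuristic under assumptions: it fits on the full iris data with `max_iter=200` (a value picked for this example), and the names `iterations_used` and `hit_limit` are hypothetical.

```python
from sklearn import datasets, linear_model

iris = datasets.load_iris()

max_iter = 200  # iteration cap chosen for this sketch
clf = linear_model.LogisticRegression(solver='lbfgs', max_iter=max_iter)
clf.fit(iris.data, iris.target)

# n_iter_ is an array; take the largest entry as the iteration count
iterations_used = int(clf.n_iter_.max())

# If the count equals the cap, the solver likely stopped before converging
# (scikit-learn also emits a ConvergenceWarning in that case)
hit_limit = iterations_used >= max_iter
print(iterations_used, hit_limit)
```

If `hit_limit` is true, raising `max_iter` or scaling the features are the usual remedies.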

03 Linear discriminant analysis

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, discriminant_analysis
iris=datasets.load_iris()
dex1=np.random.choice(150,size=120,replace=False)
dex2=[]
for i in range(150):
    if i not in dex1:
        dex2.append(i)

train_x=iris.data[dex1,:]
train_y=iris.target[dex1]
test_x=iris.data[dex2,:]
test_y=iris.target[dex2]

# Linear discriminant analysis with default settings
regre=discriminant_analysis.LinearDiscriminantAnalysis()
regre.fit(train_x,train_y)
# Mean accuracy on the held-out test set
regre.score(test_x,test_y)
Out[52]: 1.0

regre.coef_
Out[54]:
array([[  6.34755316,  13.66153017, -16.63757493, -22.43052621],
       [ -1.61851372,  -4.47717549,   4.1348641 ,   3.09718096],
       [ -4.26521741,  -8.34418599,  11.26989661,  17.27715514]])

regre.intercept_
Out[55]: array([-18.84873469,   0.66754016, -30.35506047])

regre.predict_proba(test_x)
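
The `coef_` and `intercept_` values shown above define one linear score per class, and the predicted label is the class with the highest score. The following sketch checks this by hand. It assumes a model fitted on the full iris data rather than the random split above, and the names `lda`, `scores` and `manual_pred` are chosen here for illustration.

```python
import numpy as np
from sklearn import datasets, discriminant_analysis

iris = datasets.load_iris()

lda = discriminant_analysis.LinearDiscriminantAnalysis()
lda.fit(iris.data, iris.target)

# One linear score per class: X @ coef_.T + intercept_, shape (150, 3)
scores = iris.data @ lda.coef_.T + lda.intercept_

# The predicted class is the one with the largest score
manual_pred = lda.classes_[scores.argmax(axis=1)]

# Class probabilities; each row sums to 1
proba = lda.predict_proba(iris.data)

print(np.array_equal(manual_pred, lda.predict(iris.data)))
```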

04 Summary

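01 Linear models can be used not only for regression but also for classification.

02 For LogisticRegression and LinearDiscriminantAnalysis, the number of attributes (variables, features) equals the number of entries in one row of coef_ (its column count), and the number of label (target) classes equals the number of rows of coef_, which is also the length of intercept_.

03 LogisticRegression and LinearDiscriminantAnalysis produce not only the predicted class but also the probability of each sample belonging to each class.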
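
The shape relationship described in point 02 can be checked directly. This is a minimal sketch fitted on the full iris data (4 features, 3 classes) with default settings apart from `max_iter=1000`; the variable names are chosen for this example. Note that it covers the three-class case: for a two-class problem, LogisticRegression stores a single row in coef_.

```python
from sklearn import datasets, linear_model, discriminant_analysis

iris = datasets.load_iris()
n_features = iris.data.shape[1]     # 4 measurements per flower
n_classes = len(set(iris.target))   # 3 species

models = [
    linear_model.LogisticRegression(max_iter=1000),
    discriminant_analysis.LinearDiscriminantAnalysis(),
]

for model in models:
    model.fit(iris.data, iris.target)
    # coef_ has one row per class and one column per feature;
    # intercept_ has one entry per class
    print(type(model).__name__, model.coef_.shape, model.intercept_.shape)
```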
