Iris
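This page walks through the feature-processing functionality in sklearn.
The Iris dataset was compiled by Fisher in 1936. It contains four features (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width), all positive floating-point values measured in centimeters. The target is the iris species: Iris Setosa, Iris Versicolour, or Iris Virginica.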
from sklearn.datasets import load_iris
# Load the dataset
iris = load_iris()
# Feature matrix
print(type(iris.data))
print(iris.data[:5])
# Target vector
print(type(iris.target))
print(iris.target)
<class 'numpy.ndarray'>
[[ 5.1 3.5 1.4 0.2]
[ 4.9 3. 1.4 0.2]
[ 4.7 3.2 1.3 0.2]
[ 4.6 3.1 1.5 0.2]
[ 5. 3.6 1.4 0.2]]
<class 'numpy.ndarray'>
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
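The code for standardizing the data with the StandardScaler class of the preprocessing library is as follows.
from sklearn.preprocessing import StandardScaler
# Standardize; the return value is the standardized data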
StandardScaler().fit_transform(iris.data)[:10]
array([[-0.90068117, 1.03205722, -1.3412724 , -1.31297673],
[-1.14301691, -0.1249576 , -1.3412724 , -1.31297673],
[-1.38535265, 0.33784833, -1.39813811, -1.31297673],
[-1.50652052, 0.10644536, -1.2844067 , -1.31297673],
[-1.02184904, 1.26346019, -1.3412724 , -1.31297673],
[-0.53717756, 1.95766909, -1.17067529, -1.05003079],
[-1.50652052, 0.80065426, -1.3412724 , -1.18150376],
[-1.02184904, 0.80065426, -1.2844067 , -1.31297673],
[-1.74885626, -0.35636057, -1.3412724 , -1.31297673],
[-1.14301691, 0.10644536, -1.2844067 , -1.4444497 ]])
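As a rough illustration of what standardization computes, the sketch below works through one column by hand: subtract the column mean, then divide by the population standard deviation (the divide-by-n form, which is what StandardScaler uses). The helper name `standardize_column` and the five-value toy column are assumptions made for this example, so the z-scores differ from the output above, which is computed over all 150 rows.

```python
from statistics import fmean, pstdev


def standardize_column(values):
    """Return z-scores: (value - column mean) / population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)  # population std (divides by n, not n - 1)
    return [(v - mean) / std for v in values]


# Toy column: the first five sepal lengths printed earlier (illustrative only).
column = [5.1, 4.9, 4.7, 4.6, 5.0]
z = standardize_column(column)
print(z)
```

After the transform the column has mean 0 and unit standard deviation, which is the property the scaler is meant to guarantee.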
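The code for rescaling the data to a fixed interval with the MinMaxScaler class of the preprocessing library is as follows.
from sklearn.preprocessing import MinMaxScaler
# Interval scaling; the return value is the data rescaled to the [0, 1] interval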
MinMaxScaler().fit_transform(iris.data)[:10]
array([[ 0.22222222, 0.625 , 0.06779661, 0.04166667],
[ 0.16666667, 0.41666667, 0.06779661, 0.04166667],
[ 0.11111111, 0.5 , 0.05084746, 0.04166667],
[ 0.08333333, 0.45833333, 0.08474576, 0.04166667],
[ 0.19444444, 0.66666667, 0.06779661, 0.04166667],
[ 0.30555556, 0.79166667, 0.11864407, 0.125 ],
[ 0.08333333, 0.58333333, 0.06779661, 0.08333333],
[ 0.19444444, 0.58333333, 0.08474576, 0.04166667],
[ 0.02777778, 0.375 , 0.06779661, 0.04166667],
[ 0.16666667, 0.45833333, 0.08474576, 0. ]])
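The interval scaling above can be checked by hand with the formula (x - min) / (max - min), applied per column. The sketch below is a minimal version under that assumption; `min_max_scale` is a hypothetical helper, and the toy column holds the smallest sepal length in the data (4.3), the first sample's value (5.1), and the largest (7.9).

```python
def min_max_scale(values):
    """Rescale a column to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    # A constant column (hi == lo) would divide by zero; this sketch does not handle it.
    return [(v - lo) / (hi - lo) for v in values]


column = [4.3, 5.1, 7.9]
scaled = min_max_scale(column)
print(scaled)
```

The middle value comes out at roughly 0.2222, which agrees with the first entry of the MinMaxScaler output.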
The code for normalizing the data with the Normalizer class of the preprocessing library is as follows.
from sklearn.preprocessing import Normalizer
# Normalize; the return value is the normalized data
Normalizer().fit_transform(iris.data)[:10]
array([[ 0.80377277, 0.55160877, 0.22064351, 0.0315205 ],
[ 0.82813287, 0.50702013, 0.23660939, 0.03380134],
[ 0.80533308, 0.54831188, 0.2227517 , 0.03426949],
[ 0.80003025, 0.53915082, 0.26087943, 0.03478392],
[ 0.790965 , 0.5694948 , 0.2214702 , 0.0316386 ],
[ 0.78417499, 0.5663486 , 0.2468699 , 0.05808704],
[ 0.78010936, 0.57660257, 0.23742459, 0.0508767 ],
[ 0.80218492, 0.54548574, 0.24065548, 0.0320874 ],
[ 0.80642366, 0.5315065 , 0.25658935, 0.03665562],
[ 0.81803119, 0.51752994, 0.25041771, 0.01669451]])
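Unlike the two scalers, Normalizer works row by row: with its default settings each sample is divided by its own L2 norm, so every row becomes a unit vector. A minimal sketch of that computation follows; the helper name `l2_normalize` is an assumption for illustration.

```python
import math


def l2_normalize(row):
    """Divide each entry of a sample by the sample's Euclidean (L2) norm."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]


first_sample = [5.1, 3.5, 1.4, 0.2]  # first row of iris.data
unit_row = l2_normalize(first_sample)
print(unit_row)
```

The first component is about 0.8038, matching the first row of the Normalizer output.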
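The code for binarizing the data with the Binarizer class of the preprocessing library is as follows.
from sklearn.preprocessing import Binarizer
# Binarize with the threshold set to 3; the return value is the binarized data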
Binarizer(threshold=3).fit_transform(iris.data)[:5]
array([[ 1., 1., 0., 0.],
[ 1., 0., 0., 0.],
[ 1., 1., 0., 0.],
[ 1., 1., 0., 0.],
[ 1., 1., 0., 0.]])
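Binarization maps values strictly greater than the threshold to 1 and everything else to 0, which is why a sepal width of exactly 3.0 becomes 0 in the second row above. The sketch below reproduces that rule; `binarize` is a hypothetical helper written for this example.

```python
def binarize(row, threshold):
    """Map values strictly above the threshold to 1.0 and all others to 0.0."""
    return [1.0 if v > threshold else 0.0 for v in row]


rows = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],  # 3.0 is not greater than 3, so it maps to 0.0
]
binary = [binarize(r, threshold=3) for r in rows]
print(binary)
```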
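Because all features in the Iris dataset are quantitative, its target values are used here to demonstrate one-hot (dummy) encoding, even though the target does not actually need it. The code for one-hot encoding with the OneHotEncoder class of the preprocessing library is as follows.
from sklearn.preprocessing import OneHotEncoder
# One-hot encode the target values of the Iris dataset; the return value is the encoded data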
OneHotEncoder().fit_transform(iris.target.reshape((-1,1)))
<150x3 sparse matrix of type '<class 'numpy.float64'>'
	with 150 stored elements in Compressed Sparse Row format>
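The encoder returns a sparse matrix, so the encoded values are not visible in the printed result; calling `.toarray()` on it yields a dense array. To make the encoding itself concrete, the sketch below builds the same kind of indicator rows in plain Python. The helper `one_hot` and the four sample labels are assumptions for illustration.

```python
def one_hot(labels):
    """Return one indicator row per label, with one column per distinct class."""
    classes = sorted(set(labels))
    column_of = {c: i for i, c in enumerate(classes)}
    return [
        [1.0 if column_of[label] == i else 0.0 for i in range(len(classes))]
        for label in labels
    ]


encoded = one_hot([0, 1, 2, 1])
print(encoded)
```

Each row contains exactly one 1.0, located in the column of that sample's class.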
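Because the Iris dataset has no missing values, one extra sample whose four features are all NaN is added to represent missing data. The code for imputing missing values with the Imputer class of the preprocessing library is as follows. (Imputer was removed in scikit-learn 0.22; on newer versions, sklearn.impute.SimpleImputer provides the same mean imputation.)
from numpy import vstack, array, nan
from sklearn.preprocessing import Imputer
# Missing-value imputation; the return value is the data with missing values filled in
# The missing_values parameter is the representation of a missing value; the default is NaN
# The strategy parameter is the fill method; the default is mean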
Imputer().fit_transform(vstack((array([nan, nan, nan, nan]), iris.data)))
array([[ 5.84333333, 3.054 , 3.75866667, 1.19866667],
[ 5.1 , 3.5 , 1.4 , 0.2 ],
[ 4.9 , 3. , 1.4 , 0.2 ],
[ 4.7 , 3.2 , 1.3 , 0.2 ],
[ 4.6 , 3.1 , 1.5 , 0.2 ],
[ 5. , 3.6 , 1.4 , 0.2 ],
[ 5.4 , 3.9 , 1.7 , 0.4 ],
[ 4.6 , 3.4 , 1.4 , 0.3 ],
[ 5. , 3.4 , 1.5 , 0.2 ],
[ 4.4 , 2.9 , 1.4 , 0.2 ],
[ 4.9 , 3.1 , 1.5 , 0.1 ],
[ 5.4 , 3.7 , 1.5 , 0.2 ],
[ 4.8 , 3.4 , 1.6 , 0.2 ],
[ 4.8 , 3. , 1.4 , 0.1 ],
[ 4.3 , 3. , 1.1 , 0.1 ],
[ 5.8 , 4. , 1.2 , 0.2 ],
[ 5.7 , 4.4 , 1.5 , 0.4 ],
[ 5.4 , 3.9 , 1.3 , 0.4 ],
[ 5.1 , 3.5 , 1.4 , 0.3 ],
[ 5.7 , 3.8 , 1.7 , 0.3 ],
[ 5.1 , 3.8 , 1.5 , 0.3 ],
[ 5.4 , 3.4 , 1.7 , 0.2 ],
[ 5.1 , 3.7 , 1.5 , 0.4 ],
[ 4.6 , 3.6 , 1. , 0.2 ],
[ 5.1 , 3.3 , 1.7 , 0.5 ],
[ 4.8 , 3.4 , 1.9 , 0.2 ],
[ 5. , 3. , 1.6 , 0.2 ],
[ 5. , 3.4 , 1.6 , 0.4 ],
[ 5.2 , 3.5 , 1.5 , 0.2 ],
[ 5.2 , 3.4 , 1.4 , 0.2 ],
[ 4.7 , 3.2 , 1.6 , 0.2 ],
[ 4.8 , 3.1 , 1.6 , 0.2 ],
[ 5.4 , 3.4 , 1.5 , 0.4 ],
[ 5.2 , 4.1 , 1.5 , 0.1 ],
[ 5.5 , 4.2 , 1.4 , 0.2 ],
[ 4.9 , 3.1 , 1.5 , 0.1 ],
[ 5. , 3.2 , 1.2 , 0.2 ],
[ 5.5 , 3.5 , 1.3 , 0.2 ],
[ 4.9 , 3.1 , 1.5 , 0.1 ],
[ 4.4 , 3. , 1.3 , 0.2 ],
[ 5.1 , 3.4 , 1.5 , 0.2 ],
[ 5. , 3.5 , 1.3 , 0.3 ],
[ 4.5 , 2.3 , 1.3 , 0.3 ],
[ 4.4 , 3.2 , 1.3 , 0.2 ],
[ 5. , 3.5 , 1.6 , 0.6 ],
[ 5.1 , 3.8 , 1.9 , 0.4 ],
[ 4.8 , 3. , 1.4 , 0.3 ],
[ 5.1 , 3.8 , 1.6 , 0.2 ],
[ 4.6 , 3.2 , 1.4 , 0.2 ],
[ 5.3 , 3.7 , 1.5 , 0.2 ],
[ 5. , 3.3 , 1.4 , 0.2 ],
[ 7. , 3.2 , 4.7 , 1.4 ],
[ 6.4 , 3.2 , 4.5 , 1.5 ],
[ 6.9 , 3.1 , 4.9 , 1.5 ],
[ 5.5 , 2.3 , 4. , 1.3 ],
[ 6.5 , 2.8 , 4.6 , 1.5 ],
[ 5.7 , 2.8 , 4.5 , 1.3 ],
[ 6.3 , 3.3 , 4.7 , 1.6 ],
[ 4.9 , 2.4 , 3.3 , 1. ],
[ 6.6 , 2.9 , 4.6 , 1.3 ],
[ 5.2 , 2.7 , 3.9 , 1.4 ],
[ 5. , 2. , 3.5 , 1. ],
[ 5.9 , 3. , 4.2 , 1.5 ],
[ 6. , 2.2 , 4. , 1. ],
[ 6.1 , 2.9 , 4.7 , 1.4 ],
[ 5.6 , 2.9 , 3.6 , 1.3 ],
[ 6.7 , 3.1 , 4.4 , 1.4 ],
[ 5.6 , 3. , 4.5 , 1.5 ],
[ 5.8 , 2.7 , 4.1 , 1. ],
[ 6.2 , 2.2 , 4.5 , 1.5 ],
[ 5.6 , 2.5 , 3.9 , 1.1 ],
[ 5.9 , 3.2 , 4.8 , 1.8 ],
[ 6.1 , 2.8 , 4. , 1.3 ],
[ 6.3 , 2.5 , 4.9 , 1.5 ],
[ 6.1 , 2.8 , 4.7 , 1.2 ],
[ 6.4 , 2.9 , 4.3 , 1.3 ],
[ 6.6 , 3. , 4.4 , 1.4 ],
[ 6.8 , 2.8 , 4.8 , 1.4 ],
[ 6.7 , 3. , 5. , 1.7 ],
[ 6. , 2.9 , 4.5 , 1.5 ],
[ 5.7 , 2.6 , 3.5 , 1. ],
[ 5.5 , 2.4 , 3.8 , 1.1 ],
[ 5.5 , 2.4 , 3.7 , 1. ],
[ 5.8 , 2.7 , 3.9 , 1.2 ],
[ 6. , 2.7 , 5.1 , 1.6 ],
[ 5.4 , 3. , 4.5 , 1.5 ],
[ 6. , 3.4 , 4.5 , 1.6 ],
[ 6.7 , 3.1 , 4.7 , 1.5 ],
[ 6.3 , 2.3 , 4.4 , 1.3 ],
[ 5.6 , 3. , 4.1 , 1.3 ],
[ 5.5 , 2.5 , 4. , 1.3 ],
[ 5.5 , 2.6 , 4.4 , 1.2 ],
[ 6.1 , 3. , 4.6 , 1.4 ],
[ 5.8 , 2.6 , 4. , 1.2 ],
[ 5. , 2.3 , 3.3 , 1. ],
[ 5.6 , 2.7 , 4.2 , 1.3 ],
[ 5.7 , 3. , 4.2 , 1.2 ],
[ 5.7 , 2.9 , 4.2 , 1.3 ],
[ 6.2 , 2.9 , 4.3 , 1.3 ],
[ 5.1 , 2.5 , 3. , 1.1 ],
[ 5.7 , 2.8 , 4.1 , 1.3 ],
[ 6.3 , 3.3 , 6. , 2.5 ],
[ 5.8 , 2.7 , 5.1 , 1.9 ],
[ 7.1 , 3. , 5.9 , 2.1 ],
[ 6.3 , 2.9 , 5.6 , 1.8 ],
[ 6.5 , 3. , 5.8 , 2.2 ],
[ 7.6 , 3. , 6.6 , 2.1 ],
[ 4.9 , 2.5 , 4.5 , 1.7 ],
[ 7.3 , 2.9 , 6.3 , 1.8 ],
[ 6.7 , 2.5 , 5.8 , 1.8 ],
[ 7.2 , 3.6 , 6.1 , 2.5 ],
[ 6.5 , 3.2 , 5.1 , 2. ],
[ 6.4 , 2.7 , 5.3 , 1.9 ],
[ 6.8 , 3. , 5.5 , 2.1 ],
[ 5.7 , 2.5 , 5. , 2. ],
[ 5.8 , 2.8 , 5.1 , 2.4 ],
[ 6.4 , 3.2 , 5.3 , 2.3 ],
[ 6.5 , 3. , 5.5 , 1.8 ],
[ 7.7 , 3.8 , 6.7 , 2.2 ],
[ 7.7 , 2.6 , 6.9 , 2.3 ],
[ 6. , 2.2 , 5. , 1.5 ],
[ 6.9 , 3.2 , 5.7 , 2.3 ],
[ 5.6 , 2.8 , 4.9 , 2. ],
[ 7.7 , 2.8 , 6.7 , 2. ],
[ 6.3 , 2.7 , 4.9 , 1.8 ],
[ 6.7 , 3.3 , 5.7 , 2.1 ],
[ 7.2 , 3.2 , 6. , 1.8 ],
[ 6.2 , 2.8 , 4.8 , 1.8 ],
[ 6.1 , 3. , 4.9 , 1.8 ],
[ 6.4 , 2.8 , 5.6 , 2.1 ],
[ 7.2 , 3. , 5.8 , 1.6 ],
[ 7.4 , 2.8 , 6.1 , 1.9 ],
[ 7.9 , 3.8 , 6.4 , 2. ],
[ 6.4 , 2.8 , 5.6 , 2.2 ],
[ 6.3 , 2.8 , 5.1 , 1.5 ],
[ 6.1 , 2.6 , 5.6 , 1.4 ],
[ 7.7 , 3. , 6.1 , 2.3 ],
[ 6.3 , 3.4 , 5.6 , 2.4 ],
[ 6.4 , 3.1 , 5.5 , 1.8 ],
[ 6. , 3. , 4.8 , 1.8 ],
[ 6.9 , 3.1 , 5.4 , 2.1 ],
[ 6.7 , 3.1 , 5.6 , 2.4 ],
[ 6.9 , 3.1 , 5.1 , 2.3 ],
[ 5.8 , 2.7 , 5.1 , 1.9 ],
[ 6.8 , 3.2 , 5.9 , 2.3 ],
[ 6.7 , 3.3 , 5.7 , 2.5 ],
[ 6.7 , 3. , 5.2 , 2.3 ],
[ 6.3 , 2.5 , 5. , 1.9 ],
[ 6.5 , 3. , 5.2 , 2. ],
[ 6.2 , 3.4 , 5.4 , 2.3 ],
[ 5.9 , 3. , 5.1 , 1.8 ]])
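The first row of the result above is simply the per-column mean of the 150 real samples. The following sketch shows that mean-fill rule in plain Python on a tiny made-up matrix; `impute_column_means` is a hypothetical helper, not part of scikit-learn.

```python
import math


def impute_column_means(rows):
    """Replace each NaN with the mean of the non-missing values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [r[j] for r in rows if not math.isnan(r[j])]
        means.append(sum(present) / len(present))
    return [
        [means[j] if math.isnan(v) else v for j, v in enumerate(r)]
        for r in rows
    ]


nan = float("nan")
rows = [[nan, nan], [1.0, 4.0], [3.0, 8.0]]  # toy data, first row fully missing
filled = impute_column_means(rows)
print(filled)
```

Rows without missing entries pass through unchanged, as they do in the output above.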
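The code for applying a polynomial transformation to the data with the PolynomialFeatures class of the preprocessing library is as follows.
from sklearn.preprocessing import PolynomialFeatures
# Polynomial transformation
# The degree parameter is the polynomial degree; the default is 2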
PolynomialFeatures().fit_transform(iris.data)
array([[ 1. , 5.1 , 3.5 , ..., 1.96, 0.28, 0.04],
[ 1. , 4.9 , 3. , ..., 1.96, 0.28, 0.04],
[ 1. , 4.7 , 3.2 , ..., 1.69, 0.26, 0.04],
...,
[ 1. , 6.5 , 3. , ..., 27.04, 10.4 , 4. ],
[ 1. , 6.2 , 3.4 , ..., 29.16, 12.42, 5.29],
[ 1. , 5.9 , 3. , ..., 26.01, 9.18, 3.24]])
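With four input features and degree 2, the expansion has 15 columns: one constant, the four original features, and the ten products of feature pairs (squares included). The sketch below enumerates them for a single sample; `polynomial_features` is a hypothetical helper, and its column order is assumed to follow the same combinations-with-replacement enumeration.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(row, degree=2):
    """Return all monomials of the entries of row up to the given total degree."""
    features = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(len(row)), d):
            # prod() of an empty sequence is 1, which gives the constant column.
            features.append(prod(row[i] for i in combo))
    return features


expanded = polynomial_features([5.1, 3.5, 1.4, 0.2])  # first row of iris.data
print(len(expanded), expanded[-3:])
```

The last three values are the squared petal length, petal length times petal width, and the squared petal width, which correspond to the 1.96, 0.28 and 0.04 at the end of the first output row.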
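Transformations based on a single-argument function can all be handled in one uniform way. The code for applying a logarithmic transformation to the data with the FunctionTransformer class of the preprocessing library is as follows.
from numpy import log1p
from sklearn.preprocessing import FunctionTransformer
# Data transformation with a custom function, here the logarithm
# The first argument is the single-argument function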
FunctionTransformer(log1p).fit_transform(iris.data)[:10]
array([[ 1.80828877, 1.5040774 , 0.87546874, 0.18232156],
[ 1.77495235, 1.38629436, 0.87546874, 0.18232156],
[ 1.74046617, 1.43508453, 0.83290912, 0.18232156],
[ 1.7227666 , 1.41098697, 0.91629073, 0.18232156],
[ 1.79175947, 1.5260563 , 0.87546874, 0.18232156],
[ 1.85629799, 1.58923521, 0.99325177, 0.33647224],
[ 1.7227666 , 1.48160454, 0.87546874, 0.26236426],
[ 1.79175947, 1.48160454, 0.91629073, 0.18232156],
[ 1.68639895, 1.36097655, 0.87546874, 0.18232156],
[ 1.77495235, 1.41098697, 0.91629073, 0.09531018]])
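For reference, `log1p(x)` is ln(1 + x), applied to every entry independently. A short check on the first sample, using only the standard library, might look like this (the variable names are assumptions for illustration):

```python
import math

row = [5.1, 3.5, 1.4, 0.2]  # first row of iris.data
transformed = [math.log1p(v) for v in row]

# expm1 is the inverse of log1p, so the original values can be recovered.
recovered = [math.expm1(t) for t in transformed]
print(transformed)
```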
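The code for selecting features with the VarianceThreshold class of the feature_selection library is as follows.
from sklearn.feature_selection import VarianceThreshold
# Variance-based selection; the return value is the data after feature selection
# The threshold parameter is the variance threshold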
VarianceThreshold(threshold=3).fit_transform(iris.data)
array([[ 1.4],
[ 1.4],
[ 1.3],
[ 1.5],
[ 1.4],
[ 1.7],
[ 1.4],
[ 1.5],
[ 1.4],
[ 1.5],
[ 1.5],
[ 1.6],
[ 1.4],
[ 1.1],
[ 1.2],
[ 1.5],
[ 1.3],
[ 1.4],
[ 1.7],
[ 1.5],
[ 1.7],
[ 1.5],
[ 1. ],
[ 1.7],
[ 1.9],
[ 1.6],
[ 1.6],
[ 1.5],
[ 1.4],
[ 1.6],
[ 1.6],
[ 1.5],
[ 1.5],
[ 1.4],
[ 1.5],
[ 1.2],
[ 1.3],
[ 1.5],
[ 1.3],
[ 1.5],
[ 1.3],
[ 1.3],
[ 1.3],
[ 1.6],
[ 1.9],
[ 1.4],
[ 1.6],
[ 1.4],
[ 1.5],
[ 1.4],
[ 4.7],
[ 4.5],
[ 4.9],
[ 4. ],
[ 4.6],
[ 4.5],
[ 4.7],
[ 3.3],
[ 4.6],
[ 3.9],
[ 3.5],
[ 4.2],
[ 4. ],
[ 4.7],
[ 3.6],
[ 4.4],
[ 4.5],
[ 4.1],
[ 4.5],
[ 3.9],
[ 4.8],
[ 4. ],
[ 4.9],
[ 4.7],
[ 4.3],
[ 4.4],
[ 4.8],
[ 5. ],
[ 4.5],
[ 3.5],
[ 3.8],
[ 3.7],
[ 3.9],
[ 5.1],
[ 4.5],
[ 4.5],
[ 4.7],
[ 4.4],
[ 4.1],
[ 4. ],
[ 4.4],
[ 4.6],
[ 4. ],
[ 3.3],
[ 4.2],
[ 4.2],
[ 4.2],
[ 4.3],
[ 3. ],
[ 4.1],
[ 6. ],
[ 5.1],
[ 5.9],
[ 5.6],
[ 5.8],
[ 6.6],
[ 4.5],
[ 6.3],
[ 5.8],
[ 6.1],
[ 5.1],
[ 5.3],
[ 5.5],
[ 5. ],
[ 5.1],
[ 5.3],
[ 5.5],
[ 6.7],
[ 6.9],
[ 5. ],
[ 5.7],
[ 4.9],
[ 6.7],
[ 4.9],
[ 5.7],
[ 6. ],
[ 4.8],
[ 4.9],
[ 5.6],
[ 5.8],
[ 6.1],
[ 6.4],
[ 5.6],
[ 5.1],
[ 5.6],
[ 6.1],
[ 5.6],
[ 5.5],
[ 4.8],
[ 5.4],
[ 5.6],
[ 5.1],
[ 5.1],
[ 5.9],
[ 5.7],
[ 5.2],
[ 5. ],
[ 5.2],
[ 5.4],
[ 5.1]])
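Only one column survives above because petal length is the only feature whose variance exceeds 3. The sketch below shows the same filter on a small made-up matrix, assuming population variance and a strict greater-than comparison; `select_by_variance` is a hypothetical helper.

```python
from statistics import pvariance


def select_by_variance(rows, threshold):
    """Keep only the columns whose population variance exceeds the threshold."""
    columns = list(zip(*rows))
    keep = [j for j, col in enumerate(columns) if pvariance(col) > threshold]
    return [[r[j] for j in keep] for r in rows], keep


# Toy data: a constant column, a low-variance column, and a high-variance column.
rows = [[1.0, 0.0, 10.0], [1.0, 1.0, 20.0], [1.0, 0.0, 30.0]]
selected, kept = select_by_variance(rows, threshold=0.5)
print(kept, selected)
```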
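The code for selecting features with the SelectKBest class of the feature_selection library combined with the correlation coefficient is as follows.
from numpy import array
from sklearn.feature_selection import SelectKBest
from scipy.stats import pearsonr
# Select the K best features and return the data after feature selection
# The first argument is the scoring function. It takes the feature matrix and the target vector and returns a pair of arrays (scores, p-values), where entry i belongs to feature i. Here it computes the Pearson correlation coefficient.
# The k parameter is the number of features to select
def get_pearsonr(X, y):
    m = map(lambda x: pearsonr(x, y), X.T)
    res = array(list(m)).T
    return (res[0], res[1])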
SelectKBest(get_pearsonr, k=2).fit_transform(iris.data, iris.target)
# SelectKBest(lambda X, Y: array(list(map(lambda x: pearsonr(x, Y)[0], X.T))).T, k=2).fit_transform(iris.data, iris.target)
array([[ 1.4, 0.2],
[ 1.4, 0.2],
[ 1.3, 0.2],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.7, 0.4],
[ 1.4, 0.3],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.5, 0.1],
[ 1.5, 0.2],
[ 1.6, 0.2],
[ 1.4, 0.1],
[ 1.1, 0.1],
[ 1.2, 0.2],
[ 1.5, 0.4],
[ 1.3, 0.4],
[ 1.4, 0.3],
[ 1.7, 0.3],
[ 1.5, 0.3],
[ 1.7, 0.2],
[ 1.5, 0.4],
[ 1. , 0.2],
[ 1.7, 0.5],
[ 1.9, 0.2],
[ 1.6, 0.2],
[ 1.6, 0.4],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.6, 0.2],
[ 1.6, 0.2],
[ 1.5, 0.4],
[ 1.5, 0.1],
[ 1.4, 0.2],
[ 1.5, 0.1],
[ 1.2, 0.2],
[ 1.3, 0.2],
[ 1.5, 0.1],
[ 1.3, 0.2],
[ 1.5, 0.2],
[ 1.3, 0.3],
[ 1.3, 0.3],
[ 1.3, 0.2],
[ 1.6, 0.6],
[ 1.9, 0.4],
[ 1.4, 0.3],
[ 1.6, 0.2],
[ 1.4, 0.2],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 4.7, 1.4],
[ 4.5, 1.5],
[ 4.9, 1.5],
[ 4. , 1.3],
[ 4.6, 1.5],
[ 4.5, 1.3],
[ 4.7, 1.6],
[ 3.3, 1. ],
[ 4.6, 1.3],
[ 3.9, 1.4],
[ 3.5, 1. ],
[ 4.2, 1.5],
[ 4. , 1. ],
[ 4.7, 1.4],
[ 3.6, 1.3],
[ 4.4, 1.4],
[ 4.5, 1.5],
[ 4.1, 1. ],
[ 4.5, 1.5],
[ 3.9, 1.1],
[ 4.8, 1.8],
[ 4. , 1.3],
[ 4.9, 1.5],
[ 4.7, 1.2],
[ 4.3, 1.3],
[ 4.4, 1.4],
[ 4.8, 1.4],
[ 5. , 1.7],
[ 4.5, 1.5],
[ 3.5, 1. ],
[ 3.8, 1.1],
[ 3.7, 1. ],
[ 3.9, 1.2],
[ 5.1, 1.6],
[ 4.5, 1.5],
[ 4.5, 1.6],
[ 4.7, 1.5],
[ 4.4, 1.3],
[ 4.1, 1.3],
[ 4. , 1.3],
[ 4.4, 1.2],
[ 4.6, 1.4],
[ 4. , 1.2],
[ 3.3, 1. ],
[ 4.2, 1.3],
[ 4.2, 1.2],
[ 4.2, 1.3],
[ 4.3, 1.3],
[ 3. , 1.1],
[ 4.1, 1.3],
[ 6. , 2.5],
[ 5.1, 1.9],
[ 5.9, 2.1],
[ 5.6, 1.8],
[ 5.8, 2.2],
[ 6.6, 2.1],
[ 4.5, 1.7],
[ 6.3, 1.8],
[ 5.8, 1.8],
[ 6.1, 2.5],
[ 5.1, 2. ],
[ 5.3, 1.9],
[ 5.5, 2.1],
[ 5. , 2. ],
[ 5.1, 2.4],
[ 5.3, 2.3],
[ 5.5, 1.8],
[ 6.7, 2.2],
[ 6.9, 2.3],
[ 5. , 1.5],
[ 5.7, 2.3],
[ 4.9, 2. ],
[ 6.7, 2. ],
[ 4.9, 1.8],
[ 5.7, 2.1],
[ 6. , 1.8],
[ 4.8, 1.8],
[ 4.9, 1.8],
[ 5.6, 2.1],
[ 5.8, 1.6],
[ 6.1, 1.9],
[ 6.4, 2. ],
[ 5.6, 2.2],
[ 5.1, 1.5],
[ 5.6, 1.4],
[ 6.1, 2.3],
[ 5.6, 2.4],
[ 5.5, 1.8],
[ 4.8, 1.8],
[ 5.4, 2.1],
[ 5.6, 2.4],
[ 5.1, 2.3],
[ 5.1, 1.9],
[ 5.9, 2.3],
[ 5.7, 2.5],
[ 5.2, 2.3],
[ 5. , 1.9],
[ 5.2, 2. ],
[ 5.4, 2.3],
[ 5.1, 1.8]])
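One caveat worth sketching: SelectKBest keeps the features with the highest scores, so a feature with a strong negative correlation would rank low if the raw coefficient were used as the score. The variant below ranks by the absolute value instead. This is an illustrative alternative on made-up data, not the method used above, and `pearson_r` and `top_k_by_abs_correlation` are hypothetical helpers.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def top_k_by_abs_correlation(rows, target, k):
    """Return the sorted indices of the k columns with the largest |r|."""
    columns = list(zip(*rows))
    scores = [abs(pearson_r(col, target)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data: column 1 is strongly negatively correlated with the target.
rows = [[1.0, 9.0, 2.0], [2.0, 7.0, 2.5], [3.0, 4.0, 1.0], [4.0, 2.0, 3.0]]
target = [0, 0, 1, 1]
chosen = top_k_by_abs_correlation(rows, target, k=2)
print(chosen)
```

Ranking by the signed coefficient would drop column 1 here in favor of the weakly correlated column 2.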
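The code for selecting features with the SelectKBest class of the feature_selection library combined with the chi-squared test is as follows:
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2
# Select the K best features and return the data after feature selection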
SelectKBest(chi2, k=2).fit_transform(iris.data, iris.target)
array([[ 1.4, 0.2],
[ 1.4, 0.2],
[ 1.3, 0.2],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.7, 0.4],
[ 1.4, 0.3],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.5, 0.1],
[ 1.5, 0.2],
[ 1.6, 0.2],
[ 1.4, 0.1],
[ 1.1, 0.1],
[ 1.2, 0.2],
[ 1.5, 0.4],
[ 1.3, 0.4],
[ 1.4, 0.3],
[ 1.7, 0.3],
[ 1.5, 0.3],
[ 1.7, 0.2],
[ 1.5, 0.4],
[ 1. , 0.2],
[ 1.7, 0.5],
[ 1.9, 0.2],
[ 1.6, 0.2],
[ 1.6, 0.4],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.6, 0.2],
[ 1.6, 0.2],
[ 1.5, 0.4],
[ 1.5, 0.1],
[ 1.4, 0.2],
[ 1.5, 0.1],
[ 1.2, 0.2],
[ 1.3, 0.2],
[ 1.5, 0.1],
[ 1.3, 0.2],
[ 1.5, 0.2],
[ 1.3, 0.3],
[ 1.3, 0.3],
[ 1.3, 0.2],
[ 1.6, 0.6],
[ 1.9, 0.4],
[ 1.4, 0.3],
[ 1.6, 0.2],
[ 1.4, 0.2],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 4.7, 1.4],
[ 4.5, 1.5],
[ 4.9, 1.5],
[ 4. , 1.3],
[ 4.6, 1.5],
[ 4.5, 1.3],
[ 4.7, 1.6],
[ 3.3, 1. ],
[ 4.6, 1.3],
[ 3.9, 1.4],
[ 3.5, 1. ],
[ 4.2, 1.5],
[ 4. , 1. ],
[ 4.7, 1.4],
[ 3.6, 1.3],
[ 4.4, 1.4],
[ 4.5, 1.5],
[ 4.1, 1. ],
[ 4.5, 1.5],
[ 3.9, 1.1],
[ 4.8, 1.8],
[ 4. , 1.3],
[ 4.9, 1.5],
[ 4.7, 1.2],
[ 4.3, 1.3],
[ 4.4, 1.4],
[ 4.8, 1.4],
[ 5. , 1.7],
[ 4.5, 1.5],
[ 3.5, 1. ],
[ 3.8, 1.1],
[ 3.7, 1. ],
[ 3.9, 1.2],
[ 5.1, 1.6],
[ 4.5, 1.5],
[ 4.5, 1.6],
[ 4.7, 1.5],
[ 4.4, 1.3],
[ 4.1, 1.3],
[ 4. , 1.3],
[ 4.4, 1.2],
[ 4.6, 1.4],
[ 4. , 1.2],
[ 3.3, 1. ],
[ 4.2, 1.3],
[ 4.2, 1.2],
[ 4.2, 1.3],
[ 4.3, 1.3],
[ 3. , 1.1],
[ 4.1, 1.3],
[ 6. , 2.5],
[ 5.1, 1.9],
[ 5.9, 2.1],
[ 5.6, 1.8],
[ 5.8, 2.2],
[ 6.6, 2.1],
[ 4.5, 1.7],
[ 6.3, 1.8],
[ 5.8, 1.8],
[ 6.1, 2.5],
[ 5.1, 2. ],
[ 5.3, 1.9],
[ 5.5, 2.1],
[ 5. , 2. ],
[ 5.1, 2.4],
[ 5.3, 2.3],
[ 5.5, 1.8],
[ 6.7, 2.2],
[ 6.9, 2.3],
[ 5. , 1.5],
[ 5.7, 2.3],
[ 4.9, 2. ],
[ 6.7, 2. ],
[ 4.9, 1.8],
[ 5.7, 2.1],
[ 6. , 1.8],
[ 4.8, 1.8],
[ 4.9, 1.8],
[ 5.6, 2.1],
[ 5.8, 1.6],
[ 6.1, 1.9],
[ 6.4, 2. ],
[ 5.6, 2.2],
[ 5.1, 1.5],
[ 5.6, 1.4],
[ 6.1, 2.3],
[ 5.6, 2.4],
[ 5.5, 1.8],
[ 4.8, 1.8],
[ 5.4, 2.1],
[ 5.6, 2.4],
[ 5.1, 2.3],
[ 5.1, 1.9],
[ 5.9, 2.3],
[ 5.7, 2.5],
[ 5.2, 2.3],
[ 5. , 1.9],
[ 5.2, 2. ],
[ 5.4, 2.3],
[ 5.1, 1.8]])
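The code for selecting features with the SelectKBest class of the feature_selection library combined with the maximal information coefficient (MIC) is as follows:
from numpy import array
from sklearn.feature_selection import SelectKBest
from minepy import MINE
# MINE does not offer a functional interface, so wrap it in a mic function
def mic(x, y):
    m = MINE()
    m.compute_score(x, y)
    return m.mic()
# Select the K best features and return the data after feature selection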
SelectKBest(lambda X, Y: array(list(map(lambda x: mic(x, Y), X.T))).T, k=2).fit_transform(iris.data, iris.target)
array([[ 1.4, 0.2],
[ 1.4, 0.2],
[ 1.3, 0.2],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.7, 0.4],
[ 1.4, 0.3],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.5, 0.1],
[ 1.5, 0.2],
[ 1.6, 0.2],
[ 1.4, 0.1],
[ 1.1, 0.1],
[ 1.2, 0.2],
[ 1.5, 0.4],
[ 1.3, 0.4],
[ 1.4, 0.3],
[ 1.7, 0.3],
[ 1.5, 0.3],
[ 1.7, 0.2],
[ 1.5, 0.4],
[ 1. , 0.2],
[ 1.7, 0.5],
[ 1.9, 0.2],
[ 1.6, 0.2],
[ 1.6, 0.4],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.6, 0.2],
[ 1.6, 0.2],
[ 1.5, 0.4],
[ 1.5, 0.1],
[ 1.4, 0.2],
[ 1.5, 0.1],
[ 1.2, 0.2],
[ 1.3, 0.2],
[ 1.5, 0.1],
[ 1.3, 0.2],
[ 1.5, 0.2],
[ 1.3, 0.3],
[ 1.3, 0.3],
[ 1.3, 0.2],
[ 1.6, 0.6],
[ 1.9, 0.4],
[ 1.4, 0.3],
[ 1.6, 0.2],
[ 1.4, 0.2],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 4.7, 1.4],
[ 4.5, 1.5],
[ 4.9, 1.5],
[ 4. , 1.3],
[ 4.6, 1.5],
[ 4.5, 1.3],
[ 4.7, 1.6],
[ 3.3, 1. ],
[ 4.6, 1.3],
[ 3.9, 1.4],
[ 3.5, 1. ],
[ 4.2, 1.5],
[ 4. , 1. ],
[ 4.7, 1.4],
[ 3.6, 1.3],
[ 4.4, 1.4],
[ 4.5, 1.5],
[ 4.1, 1. ],
[ 4.5, 1.5],
[ 3.9, 1.1],
[ 4.8, 1.8],
[ 4. , 1.3],
[ 4.9, 1.5],
[ 4.7, 1.2],
[ 4.3, 1.3],
[ 4.4, 1.4],
[ 4.8, 1.4],
[ 5. , 1.7],
[ 4.5, 1.5],
[ 3.5, 1. ],
[ 3.8, 1.1],
[ 3.7, 1. ],
[ 3.9, 1.2],
[ 5.1, 1.6],
[ 4.5, 1.5],
[ 4.5, 1.6],
[ 4.7, 1.5],
[ 4.4, 1.3],
[ 4.1, 1.3],
[ 4. , 1.3],
[ 4.4, 1.2],
[ 4.6, 1.4],
[ 4. , 1.2],
[ 3.3, 1. ],
[ 4.2, 1.3],
[ 4.2, 1.2],
[ 4.2, 1.3],
[ 4.3, 1.3],
[ 3. , 1.1],
[ 4.1, 1.3],
[ 6. , 2.5],
[ 5.1, 1.9],
[ 5.9, 2.1],
[ 5.6, 1.8],
[ 5.8, 2.2],
[ 6.6, 2.1],
[ 4.5, 1.7],
[ 6.3, 1.8],
[ 5.8, 1.8],
[ 6.1, 2.5],
[ 5.1, 2. ],
[ 5.3, 1.9],
[ 5.5, 2.1],
[ 5. , 2. ],
[ 5.1, 2.4],
[ 5.3, 2.3],
[ 5.5, 1.8],
[ 6.7, 2.2],
[ 6.9, 2.3],
[ 5. , 1.5],
[ 5.7, 2.3],
[ 4.9, 2. ],
[ 6.7, 2. ],
[ 4.9, 1.8],
[ 5.7, 2.1],
[ 6. , 1.8],
[ 4.8, 1.8],
[ 4.9, 1.8],
[ 5.6, 2.1],
[ 5.8, 1.6],
[ 6.1, 1.9],
[ 6.4, 2. ],
[ 5.6, 2.2],
[ 5.1, 1.5],
[ 5.6, 1.4],
[ 6.1, 2.3],
[ 5.6, 2.4],
[ 5.5, 1.8],
[ 4.8, 1.8],
[ 5.4, 2.1],
[ 5.6, 2.4],
[ 5.1, 2.3],
[ 5.1, 1.9],
[ 5.9, 2.3],
[ 5.7, 2.5],
[ 5.2, 2.3],
[ 5. , 1.9],
[ 5.2, 2. ],
[ 5.4, 2.3],
[ 5.1, 1.8]])
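The code for selecting features with the RFE class of the feature_selection library is as follows.
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
# Recursive feature elimination; the return value is the data after feature selection
# The estimator parameter is the base model
# The n_features_to_select parameter is the number of features to select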
RFE(estimator=LogisticRegression(), n_features_to_select=2).fit_transform(iris.data, iris.target)
array([[ 3.5, 0.2],
[ 3. , 0.2],
[ 3.2, 0.2],
[ 3.1, 0.2],
[ 3.6, 0.2],
[ 3.9, 0.4],
[ 3.4, 0.3],
[ 3.4, 0.2],
[ 2.9, 0.2],
[ 3.1, 0.1],
[ 3.7, 0.2],
[ 3.4, 0.2],
[ 3. , 0.1],
[ 3. , 0.1],
[ 4. , 0.2],
[ 4.4, 0.4],
[ 3.9, 0.4],
[ 3.5, 0.3],
[ 3.8, 0.3],
[ 3.8, 0.3],
[ 3.4, 0.2],
[ 3.7, 0.4],
[ 3.6, 0.2],
[ 3.3, 0.5],
[ 3.4, 0.2],
[ 3. , 0.2],
[ 3.4, 0.4],
[ 3.5, 0.2],
[ 3.4, 0.2],
[ 3.2, 0.2],
[ 3.1, 0.2],
[ 3.4, 0.4],
[ 4.1, 0.1],
[ 4.2, 0.2],
[ 3.1, 0.1],
[ 3.2, 0.2],
[ 3.5, 0.2],
[ 3.1, 0.1],
[ 3. , 0.2],
[ 3.4, 0.2],
[ 3.5, 0.3],
[ 2.3, 0.3],
[ 3.2, 0.2],
[ 3.5, 0.6],
[ 3.8, 0.4],
[ 3. , 0.3],
[ 3.8, 0.2],
[ 3.2, 0.2],
[ 3.7, 0.2],
[ 3.3, 0.2],
[ 3.2, 1.4],
[ 3.2, 1.5],
[ 3.1, 1.5],
[ 2.3, 1.3],
[ 2.8, 1.5],
[ 2.8, 1.3],
[ 3.3, 1.6],
[ 2.4, 1. ],
[ 2.9, 1.3],
[ 2.7, 1.4],
[ 2. , 1. ],
[ 3. , 1.5],
[ 2.2, 1. ],
[ 2.9, 1.4],
[ 2.9, 1.3],
[ 3.1, 1.4],
[ 3. , 1.5],
[ 2.7, 1. ],
[ 2.2, 1.5],
[ 2.5, 1.1],
[ 3.2, 1.8],
[ 2.8, 1.3],
[ 2.5, 1.5],
[ 2.8, 1.2],
[ 2.9, 1.3],
[ 3. , 1.4],
[ 2.8, 1.4],
[ 3. , 1.7],
[ 2.9, 1.5],
[ 2.6, 1. ],
[ 2.4, 1.1],
[ 2.4, 1. ],
[ 2.7, 1.2],
[ 2.7, 1.6],
[ 3. , 1.5],
[ 3.4, 1.6],
[ 3.1, 1.5],
[ 2.3, 1.3],
[ 3. , 1.3],
[ 2.5, 1.3],
[ 2.6, 1.2],
[ 3. , 1.4],
[ 2.6, 1.2],
[ 2.3, 1. ],
[ 2.7, 1.3],
[ 3. , 1.2],
[ 2.9, 1.3],
[ 2.9, 1.3],
[ 2.5, 1.1],
[ 2.8, 1.3],
[ 3.3, 2.5],
[ 2.7, 1.9],
[ 3. , 2.1],
[ 2.9, 1.8],
[ 3. , 2.2],
[ 3. , 2.1],
[ 2.5, 1.7],
[ 2.9, 1.8],
[ 2.5, 1.8],
[ 3.6, 2.5],
[ 3.2, 2. ],
[ 2.7, 1.9],
[ 3. , 2.1],
[ 2.5, 2. ],
[ 2.8, 2.4],
[ 3.2, 2.3],
[ 3. , 1.8],
[ 3.8, 2.2],
[ 2.6, 2.3],
[ 2.2, 1.5],
[ 3.2, 2.3],
[ 2.8, 2. ],
[ 2.8, 2. ],
[ 2.7, 1.8],
[ 3.3, 2.1],
[ 3.2, 1.8],
[ 2.8, 1.8],
[ 3. , 1.8],
[ 2.8, 2.1],
[ 3. , 1.6],
[ 2.8, 1.9],
[ 3.8, 2. ],
[ 2.8, 2.2],
[ 2.8, 1.5],
[ 2.6, 1.4],
[ 3. , 2.3],
[ 3.4, 2.4],
[ 3.1, 1.8],
[ 3. , 1.8],
[ 3.1, 2.1],
[ 3.1, 2.4],
[ 3.1, 2.3],
[ 2.7, 1.9],
[ 3.2, 2.3],
[ 3.3, 2.5],
[ 3. , 2.3],
[ 2.5, 1.9],
[ 3. , 2. ],
[ 3.4, 2.3],
[ 3. , 1.8]])
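The code for selecting features with the SelectFromModel class of the feature_selection library combined with an L1-penalized logistic regression model is as follows:
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
# Feature selection with an L1-penalized logistic regression as the base model
# solver="liblinear" is set explicitly because it supports the L1 penalty; it was the default solver in older scikit-learn versions
SelectFromModel(LogisticRegression(penalty="l1", C=0.1, solver="liblinear")).fit_transform(iris.data, iris.target)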
array([[ 5.1, 3.5, 1.4],
[ 4.9, 3. , 1.4],
[ 4.7, 3.2, 1.3],
[ 4.6, 3.1, 1.5],
[ 5. , 3.6, 1.4],
[ 5.4, 3.9, 1.7],
[ 4.6, 3.4, 1.4],
[ 5. , 3.4, 1.5],
[ 4.4, 2.9, 1.4],
[ 4.9, 3.1, 1.5],
[ 5.4, 3.7, 1.5],
[ 4.8, 3.4, 1.6],
[ 4.8, 3. , 1.4],
[ 4.3, 3. , 1.1],
[ 5.8, 4. , 1.2],
[ 5.7, 4.4, 1.5],
[ 5.4, 3.9, 1.3],
[ 5.1, 3.5, 1.4],
[ 5.7, 3.8, 1.7],
[ 5.1, 3.8, 1.5],
[ 5.4, 3.4, 1.7],
[ 5.1, 3.7, 1.5],
[ 4.6, 3.6, 1. ],
[ 5.1, 3.3, 1.7],
[ 4.8, 3.4, 1.9],
[ 5. , 3. , 1.6],
[ 5. , 3.4, 1.6],
[ 5.2, 3.5, 1.5],
[ 5.2, 3.4, 1.4],
[ 4.7, 3.2, 1.6],
[ 4.8, 3.1, 1.6],
[ 5.4, 3.4, 1.5],
[ 5.2, 4.1, 1.5],
[ 5.5, 4.2, 1.4],
[ 4.9, 3.1, 1.5],
[ 5. , 3.2, 1.2],
[ 5.5, 3.5, 1.3],
[ 4.9, 3.1, 1.5],
[ 4.4, 3. , 1.3],
[ 5.1, 3.4, 1.5],
[ 5. , 3.5, 1.3],
[ 4.5, 2.3, 1.3],
[ 4.4, 3.2, 1.3],
[ 5. , 3.5, 1.6],
[ 5.1, 3.8, 1.9],
[ 4.8, 3. , 1.4],
[ 5.1, 3.8, 1.6],
[ 4.6, 3.2, 1.4],
[ 5.3, 3.7, 1.5],
[ 5. , 3.3, 1.4],
[ 7. , 3.2, 4.7],
[ 6.4, 3.2, 4.5],
[ 6.9, 3.1, 4.9],
[ 5.5, 2.3, 4. ],
[ 6.5, 2.8, 4.6],
[ 5.7, 2.8, 4.5],
[ 6.3, 3.3, 4.7],
[ 4.9, 2.4, 3.3],
[ 6.6, 2.9, 4.6],
[ 5.2, 2.7, 3.9],
[ 5. , 2. , 3.5],
[ 5.9, 3. , 4.2],
[ 6. , 2.2, 4. ],
[ 6.1, 2.9, 4.7],
[ 5.6, 2.9, 3.6],
[ 6.7, 3.1, 4.4],
[ 5.6, 3. , 4.5],
[ 5.8, 2.7, 4.1],
[ 6.2, 2.2, 4.5],
[ 5.6, 2.5, 3.9],
[ 5.9, 3.2, 4.8],
[ 6.1, 2.8, 4. ],
[ 6.3, 2.5, 4.9],
[ 6.1, 2.8, 4.7],
[ 6.4, 2.9, 4.3],
[ 6.6, 3. , 4.4],
[ 6.8, 2.8, 4.8],
[ 6.7, 3. , 5. ],
[ 6. , 2.9, 4.5],
[ 5.7, 2.6, 3.5],
[ 5.5, 2.4, 3.8],
[ 5.5, 2.4, 3.7],
[ 5.8, 2.7, 3.9],
[ 6. , 2.7, 5.1],
[ 5.4, 3. , 4.5],
[ 6. , 3.4, 4.5],
[ 6.7, 3.1, 4.7],
[ 6.3, 2.3, 4.4],
[ 5.6, 3. , 4.1],
[ 5.5, 2.5, 4. ],
[ 5.5, 2.6, 4.4],
[ 6.1, 3. , 4.6],
[ 5.8, 2.6, 4. ],
[ 5. , 2.3, 3.3],
[ 5.6, 2.7, 4.2],
[ 5.7, 3. , 4.2],
[ 5.7, 2.9, 4.2],
[ 6.2, 2.9, 4.3],
[ 5.1, 2.5, 3. ],
[ 5.7, 2.8, 4.1],
[ 6.3, 3.3, 6. ],
[ 5.8, 2.7, 5.1],
[ 7.1, 3. , 5.9],
[ 6.3, 2.9, 5.6],
[ 6.5, 3. , 5.8],
[ 7.6, 3. , 6.6],
[ 4.9, 2.5, 4.5],
[ 7.3, 2.9, 6.3],
[ 6.7, 2.5, 5.8],
[ 7.2, 3.6, 6.1],
[ 6.5, 3.2, 5.1],
[ 6.4, 2.7, 5.3],
[ 6.8, 3. , 5.5],
[ 5.7, 2.5, 5. ],
[ 5.8, 2.8, 5.1],
[ 6.4, 3.2, 5.3],
[ 6.5, 3. , 5.5],
[ 7.7, 3.8, 6.7],
[ 7.7, 2.6, 6.9],
[ 6. , 2.2, 5. ],
[ 6.9, 3.2, 5.7],
[ 5.6, 2.8, 4.9],
[ 7.7, 2.8, 6.7],
[ 6.3, 2.7, 4.9],
[ 6.7, 3.3, 5.7],
[ 7.2, 3.2, 6. ],
[ 6.2, 2.8, 4.8],
[ 6.1, 3. , 4.9],
[ 6.4, 2.8, 5.6],
[ 7.2, 3. , 5.8],
[ 7.4, 2.8, 6.1],
[ 7.9, 3.8, 6.4],
[ 6.4, 2.8, 5.6],
[ 6.3, 2.8, 5.1],
[ 6.1, 2.6, 5.6],
[ 7.7, 3. , 6.1],
[ 6.3, 3.4, 5.6],
[ 6.4, 3.1, 5.5],
[ 6. , 3. , 4.8],
[ 6.9, 3.1, 5.4],
[ 6.7, 3.1, 5.6],
[ 6.9, 3.1, 5.1],
[ 5.8, 2.7, 5.1],
[ 6.8, 3.2, 5.9],
[ 6.7, 3.3, 5.7],
[ 6.7, 3. , 5.2],
[ 6.3, 2.5, 5. ],
[ 6.5, 3. , 5.2],
[ 6.2, 3.4, 5.4],
[ 5.9, 3. , 5.1]])
In fact, dimensionality reduction with an L1 penalty works by keeping only one of several features that are equally correlated with the target, so a feature that was not selected is not necessarily unimportant. The L2 penalty can therefore be combined with it to improve the result. Concretely: if a feature has a nonzero weight under L1, collect the features whose L2 weights are close to its L2 weight and whose L1 weights are 0 into one group, then split the L1 weight evenly across the features in that group. This requires building a new logistic regression model:
from sklearn.linear_model import LogisticRegression
class LR(LogisticRegression):
    def __init__(self, threshold=0.01, dual=False, tol=1e-4, C=1.0,
                 fit_intercept=True, intercept_scaling=1, class_weight=None,
                 random_state=None, solver='liblinear', max_iter=100,
                 multi_class='ovr', verbose=0, warm_start=False, n_jobs=1):
        # Threshold below which two weights are considered close
        self.threshold = threshold
        LogisticRegression.__init__(self, penalty='l1', dual=dual, tol=tol, C=C,
                 fit_intercept=fit_intercept, intercept_scaling=intercept_scaling, class_weight=class_weight,
                 random_state=random_state, solver=solver, max_iter=max_iter,
                 multi_class=multi_class, verbose=verbose, warm_start=warm_start, n_jobs=n_jobs)
        # Create an L2 logistic regression with the same parameters
        self.l2 = LogisticRegression(penalty='l2', dual=dual, tol=tol, C=C, fit_intercept=fit_intercept, intercept_scaling=intercept_scaling, class_weight = class_weight, random_state=random_state, solver=solver, max_iter=max_iter, multi_class=multi_class, verbose=verbose, warm_start=warm_start, n_jobs=n_jobs)
    def fit(self, X, y, sample_weight=None):
        # Train the L1 logistic regression
        super(LR, self).fit(X, y, sample_weight=sample_weight)
        self.coef_old_ = self.coef_.copy()
        # Train the L2 logistic regression
        self.l2.fit(X, y, sample_weight=sample_weight)
        cntOfRow, cntOfCol = self.coef_.shape
        # The number of rows in the coefficient matrix equals the number of target classes
        for i in range(cntOfRow):
            for j in range(cntOfCol):
                coef = self.coef_[i][j]
                # The L1 coefficient is nonzero
                if coef != 0:
                    idx = [j]
                    # The corresponding coefficient in the L2 logistic regression
                    coef1 = self.l2.coef_[i][j]
                    for k in range(cntOfCol):
                        coef2 = self.l2.coef_[i][k]
                        # The L2 coefficients differ by less than the threshold, and feature k has an L1 weight of 0
                        if abs(coef1-coef2) < self.threshold and j != k and self.coef_[i][k] == 0:
                            idx.append(k)
                    # Split the L1 coefficient evenly across this group of features
                    mean = coef / len(idx)
                    self.coef_[i][idx] = mean
        return self
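The code for selecting features with the SelectFromModel class of the feature_selection library combined with the logistic regression model that uses both L1 and L2 penalties is as follows:
from sklearn.feature_selection import SelectFromModel
# Feature selection with the L1 and L2 penalized logistic regression as the base model
# The threshold parameter is the threshold on the difference between weight coefficients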
SelectFromModel(LR(threshold=0.5, C=0.1)).fit_transform(iris.data, iris.target)
array([[ 5.1, 3.5, 1.4, 0.2],
[ 4.9, 3. , 1.4, 0.2],
[ 4.7, 3.2, 1.3, 0.2],
[ 4.6, 3.1, 1.5, 0.2],
[ 5. , 3.6, 1.4, 0.2],
[ 5.4, 3.9, 1.7, 0.4],
[ 4.6, 3.4, 1.4, 0.3],
[ 5. , 3.4, 1.5, 0.2],
[ 4.4, 2.9, 1.4, 0.2],
[ 4.9, 3.1, 1.5, 0.1],
[ 5.4, 3.7, 1.5, 0.2],
[ 4.8, 3.4, 1.6, 0.2],
[ 4.8, 3. , 1.4, 0.1],
[ 4.3, 3. , 1.1, 0.1],
[ 5.8, 4. , 1.2, 0.2],
[ 5.7, 4.4, 1.5, 0.4],
[ 5.4, 3.9, 1.3, 0.4],
[ 5.1, 3.5, 1.4, 0.3],
[ 5.7, 3.8, 1.7, 0.3],
[ 5.1, 3.8, 1.5, 0.3],
[ 5.4, 3.4, 1.7, 0.2],
[ 5.1, 3.7, 1.5, 0.4],
[ 4.6, 3.6, 1. , 0.2],
[ 5.1, 3.3, 1.7, 0.5],
[ 4.8, 3.4, 1.9, 0.2],
[ 5. , 3. , 1.6, 0.2],
[ 5. , 3.4, 1.6, 0.4],
[ 5.2, 3.5, 1.5, 0.2],
[ 5.2, 3.4, 1.4, 0.2],
[ 4.7, 3.2, 1.6, 0.2],
[ 4.8, 3.1, 1.6, 0.2],
[ 5.4, 3.4, 1.5, 0.4],
[ 5.2, 4.1, 1.5, 0.1],
[ 5.5, 4.2, 1.4, 0.2],
[ 4.9, 3.1, 1.5, 0.1],
[ 5. , 3.2, 1.2, 0.2],
[ 5.5, 3.5, 1.3, 0.2],
[ 4.9, 3.1, 1.5, 0.1],
[ 4.4, 3. , 1.3, 0.2],
[ 5.1, 3.4, 1.5, 0.2],
[ 5. , 3.5, 1.3, 0.3],
[ 4.5, 2.3, 1.3, 0.3],
[ 4.4, 3.2, 1.3, 0.2],
[ 5. , 3.5, 1.6, 0.6],
[ 5.1, 3.8, 1.9, 0.4],
[ 4.8, 3. , 1.4, 0.3],
[ 5.1, 3.8, 1.6, 0.2],
[ 4.6, 3.2, 1.4, 0.2],
[ 5.3, 3.7, 1.5, 0.2],
[ 5. , 3.3, 1.4, 0.2],
[ 7. , 3.2, 4.7, 1.4],
[ 6.4, 3.2, 4.5, 1.5],
[ 6.9, 3.1, 4.9, 1.5],
[ 5.5, 2.3, 4. , 1.3],
[ 6.5, 2.8, 4.6, 1.5],
[ 5.7, 2.8, 4.5, 1.3],
[ 6.3, 3.3, 4.7, 1.6],
[ 4.9, 2.4, 3.3, 1. ],
[ 6.6, 2.9, 4.6, 1.3],
[ 5.2, 2.7, 3.9, 1.4],
[ 5. , 2. , 3.5, 1. ],
[ 5.9, 3. , 4.2, 1.5],
[ 6. , 2.2, 4. , 1. ],
[ 6.1, 2.9, 4.7, 1.4],
[ 5.6, 2.9, 3.6, 1.3],
[ 6.7, 3.1, 4.4, 1.4],
[ 5.6, 3. , 4.5, 1.5],
[ 5.8, 2.7, 4.1, 1. ],
[ 6.2, 2.2, 4.5, 1.5],
[ 5.6, 2.5, 3.9, 1.1],
[ 5.9, 3.2, 4.8, 1.8],
[ 6.1, 2.8, 4. , 1.3],
[ 6.3, 2.5, 4.9, 1.5],
[ 6.1, 2.8, 4.7, 1.2],
[ 6.4, 2.9, 4.3, 1.3],
[ 6.6, 3. , 4.4, 1.4],
[ 6.8, 2.8, 4.8, 1.4],
[ 6.7, 3. , 5. , 1.7],
[ 6. , 2.9, 4.5, 1.5],
[ 5.7, 2.6, 3.5, 1. ],
[ 5.5, 2.4, 3.8, 1.1],
[ 5.5, 2.4, 3.7, 1. ],
[ 5.8, 2.7, 3.9, 1.2],
[ 6. , 2.7, 5.1, 1.6],
[ 5.4, 3. , 4.5, 1.5],
[ 6. , 3.4, 4.5, 1.6],
[ 6.7, 3.1, 4.7, 1.5],
[ 6.3, 2.3, 4.4, 1.3],
[ 5.6, 3. , 4.1, 1.3],
[ 5.5, 2.5, 4. , 1.3],
[ 5.5, 2.6, 4.4, 1.2],
[ 6.1, 3. , 4.6, 1.4],
[ 5.8, 2.6, 4. , 1.2],
[ 5. , 2.3, 3.3, 1. ],
[ 5.6, 2.7, 4.2, 1.3],
[ 5.7, 3. , 4.2, 1.2],
[ 5.7, 2.9, 4.2, 1.3],
[ 6.2, 2.9, 4.3, 1.3],
[ 5.1, 2.5, 3. , 1.1],
[ 5.7, 2.8, 4.1, 1.3],
[ 6.3, 3.3, 6. , 2.5],
[ 5.8, 2.7, 5.1, 1.9],
[ 7.1, 3. , 5.9, 2.1],
[ 6.3, 2.9, 5.6, 1.8],
[ 6.5, 3. , 5.8, 2.2],
[ 7.6, 3. , 6.6, 2.1],
[ 4.9, 2.5, 4.5, 1.7],
[ 7.3, 2.9, 6.3, 1.8],
[ 6.7, 2.5, 5.8, 1.8],
[ 7.2, 3.6, 6.1, 2.5],
[ 6.5, 3.2, 5.1, 2. ],
[ 6.4, 2.7, 5.3, 1.9],
[ 6.8, 3. , 5.5, 2.1],
[ 5.7, 2.5, 5. , 2. ],
[ 5.8, 2.8, 5.1, 2.4],
[ 6.4, 3.2, 5.3, 2.3],
[ 6.5, 3. , 5.5, 1.8],
[ 7.7, 3.8, 6.7, 2.2],
[ 7.7, 2.6, 6.9, 2.3],
[ 6. , 2.2, 5. , 1.5],
[ 6.9, 3.2, 5.7, 2.3],
[ 5.6, 2.8, 4.9, 2. ],
[ 7.7, 2.8, 6.7, 2. ],
[ 6.3, 2.7, 4.9, 1.8],
[ 6.7, 3.3, 5.7, 2.1],
[ 7.2, 3.2, 6. , 1.8],
[ 6.2, 2.8, 4.8, 1.8],
[ 6.1, 3. , 4.9, 1.8],
[ 6.4, 2.8, 5.6, 2.1],
[ 7.2, 3. , 5.8, 1.6],
[ 7.4, 2.8, 6.1, 1.9],
[ 7.9, 3.8, 6.4, 2. ],
[ 6.4, 2.8, 5.6, 2.2],
[ 6.3, 2.8, 5.1, 1.5],
[ 6.1, 2.6, 5.6, 1.4],
[ 7.7, 3. , 6.1, 2.3],
[ 6.3, 3.4, 5.6, 2.4],
[ 6.4, 3.1, 5.5, 1.8],
[ 6. , 3. , 4.8, 1.8],
[ 6.9, 3.1, 5.4, 2.1],
[ 6.7, 3.1, 5.6, 2.4],
[ 6.9, 3.1, 5.1, 2.3],
[ 5.8, 2.7, 5.1, 1.9],
[ 6.8, 3.2, 5.9, 2.3],
[ 6.7, 3.3, 5.7, 2.5],
[ 6.7, 3. , 5.2, 2.3],
[ 6.3, 2.5, 5. , 1.9],
[ 6.5, 3. , 5.2, 2. ],
[ 6.2, 3.4, 5.4, 2.3],
[ 5.9, 3. , 5.1, 1.8]])
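Among tree models, GBDT can also serve as the base model for feature selection. The code for selecting features with the SelectFromModel class of the feature_selection library combined with a GBDT model is as follows:
from sklearn.feature_selection import SelectFromModel
from sklearn.ensemble import GradientBoostingClassifier
# Feature selection with GBDT as the base model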
SelectFromModel(GradientBoostingClassifier()).fit_transform(iris.data, iris.target)
array([[ 1.4, 0.2],
[ 1.4, 0.2],
[ 1.3, 0.2],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.7, 0.4],
[ 1.4, 0.3],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.5, 0.1],
[ 1.5, 0.2],
[ 1.6, 0.2],
[ 1.4, 0.1],
[ 1.1, 0.1],
[ 1.2, 0.2],
[ 1.5, 0.4],
[ 1.3, 0.4],
[ 1.4, 0.3],
[ 1.7, 0.3],
[ 1.5, 0.3],
[ 1.7, 0.2],
[ 1.5, 0.4],
[ 1. , 0.2],
[ 1.7, 0.5],
[ 1.9, 0.2],
[ 1.6, 0.2],
[ 1.6, 0.4],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 1.6, 0.2],
[ 1.6, 0.2],
[ 1.5, 0.4],
[ 1.5, 0.1],
[ 1.4, 0.2],
[ 1.5, 0.1],
[ 1.2, 0.2],
[ 1.3, 0.2],
[ 1.5, 0.1],
[ 1.3, 0.2],
[ 1.5, 0.2],
[ 1.3, 0.3],
[ 1.3, 0.3],
[ 1.3, 0.2],
[ 1.6, 0.6],
[ 1.9, 0.4],
[ 1.4, 0.3],
[ 1.6, 0.2],
[ 1.4, 0.2],
[ 1.5, 0.2],
[ 1.4, 0.2],
[ 4.7, 1.4],
[ 4.5, 1.5],
[ 4.9, 1.5],
[ 4. , 1.3],
[ 4.6, 1.5],
[ 4.5, 1.3],
[ 4.7, 1.6],
[ 3.3, 1. ],
[ 4.6, 1.3],
[ 3.9, 1.4],
[ 3.5, 1. ],
[ 4.2, 1.5],
[ 4. , 1. ],
[ 4.7, 1.4],
[ 3.6, 1.3],
[ 4.4, 1.4],
[ 4.5, 1.5],
[ 4.1, 1. ],
[ 4.5, 1.5],
[ 3.9, 1.1],
[ 4.8, 1.8],
[ 4. , 1.3],
[ 4.9, 1.5],
[ 4.7, 1.2],
[ 4.3, 1.3],
[ 4.4, 1.4],
[ 4.8, 1.4],
[ 5. , 1.7],
[ 4.5, 1.5],
[ 3.5, 1. ],
[ 3.8, 1.1],
[ 3.7, 1. ],
[ 3.9, 1.2],
[ 5.1, 1.6],
[ 4.5, 1.5],
[ 4.5, 1.6],
[ 4.7, 1.5],
[ 4.4, 1.3],
[ 4.1, 1.3],
[ 4. , 1.3],
[ 4.4, 1.2],
[ 4.6, 1.4],
[ 4. , 1.2],
[ 3.3, 1. ],
[ 4.2, 1.3],
[ 4.2, 1.2],
[ 4.2, 1.3],
[ 4.3, 1.3],
[ 3. , 1.1],
[ 4.1, 1.3],
[ 6. , 2.5],
[ 5.1, 1.9],
[ 5.9, 2.1],
[ 5.6, 1.8],
[ 5.8, 2.2],
[ 6.6, 2.1],
[ 4.5, 1.7],
[ 6.3, 1.8],
[ 5.8, 1.8],
[ 6.1, 2.5],
[ 5.1, 2. ],
[ 5.3, 1.9],
[ 5.5, 2.1],
[ 5. , 2. ],
[ 5.1, 2.4],
[ 5.3, 2.3],
[ 5.5, 1.8],
[ 6.7, 2.2],
[ 6.9, 2.3],
[ 5. , 1.5],
[ 5.7, 2.3],
[ 4.9, 2. ],
[ 6.7, 2. ],
[ 4.9, 1.8],
[ 5.7, 2.1],
[ 6. , 1.8],
[ 4.8, 1.8],
[ 4.9, 1.8],
[ 5.6, 2.1],
[ 5.8, 1.6],
[ 6.1, 1.9],
[ 6.4, 2. ],
[ 5.6, 2.2],
[ 5.1, 1.5],
[ 5.6, 1.4],
[ 6.1, 2.3],
[ 5.6, 2.4],
[ 5.5, 1.8],
[ 4.8, 1.8],
[ 5.4, 2.1],
[ 5.6, 2.4],
[ 5.1, 2.3],
[ 5.1, 1.9],
[ 5.9, 2.3],
[ 5.7, 2.5],
[ 5.2, 2.3],
[ 5. , 1.9],
[ 5.2, 2. ],
[ 5.4, 2.3],
[ 5.1, 1.8]])
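The code for reducing the dimensionality of the data with the PCA class of the decomposition library is as follows:
from sklearn.decomposition import PCA
# Principal component analysis; the return value is the dimensionality-reduced data
# The n_components parameter is the number of principal components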
PCA(n_components=2).fit_transform(iris.data)
array([[-2.68420713, 0.32660731],
[-2.71539062, -0.16955685],
[-2.88981954, -0.13734561],
[-2.7464372 , -0.31112432],
[-2.72859298, 0.33392456],
[-2.27989736, 0.74778271],
[-2.82089068, -0.08210451],
[-2.62648199, 0.17040535],
[-2.88795857, -0.57079803],
[-2.67384469, -0.1066917 ],
[-2.50652679, 0.65193501],
[-2.61314272, 0.02152063],
[-2.78743398, -0.22774019],
[-3.22520045, -0.50327991],
[-2.64354322, 1.1861949 ],
[-2.38386932, 1.34475434],
[-2.6225262 , 0.81808967],
[-2.64832273, 0.31913667],
[-2.19907796, 0.87924409],
[-2.58734619, 0.52047364],
[-2.3105317 , 0.39786782],
[-2.54323491, 0.44003175],
[-3.21585769, 0.14161557],
[-2.30312854, 0.10552268],
[-2.35617109, -0.03120959],
[-2.50791723, -0.13905634],
[-2.469056 , 0.13788731],
[-2.56239095, 0.37468456],
[-2.63982127, 0.31929007],
[-2.63284791, -0.19007583],
[-2.58846205, -0.19739308],
[-2.41007734, 0.41808001],
[-2.64763667, 0.81998263],
[-2.59715948, 1.10002193],
[-2.67384469, -0.1066917 ],
[-2.86699985, 0.0771931 ],
[-2.62522846, 0.60680001],
[-2.67384469, -0.1066917 ],
[-2.98184266, -0.48025005],
[-2.59032303, 0.23605934],
[-2.77013891, 0.27105942],
[-2.85221108, -0.93286537],
[-2.99829644, -0.33430757],
[-2.4055141 , 0.19591726],
[-2.20883295, 0.44269603],
[-2.71566519, -0.24268148],
[-2.53757337, 0.51036755],
[-2.8403213 , -0.22057634],
[-2.54268576, 0.58628103],
[-2.70391231, 0.11501085],
[ 1.28479459, 0.68543919],
[ 0.93241075, 0.31919809],
[ 1.46406132, 0.50418983],
[ 0.18096721, -0.82560394],
[ 1.08713449, 0.07539039],
[ 0.64043675, -0.41732348],
[ 1.09522371, 0.28389121],
[-0.75146714, -1.00110751],
[ 1.04329778, 0.22895691],
[-0.01019007, -0.72057487],
[-0.5110862 , -1.26249195],
[ 0.51109806, -0.10228411],
[ 0.26233576, -0.5478933 ],
[ 0.98404455, -0.12436042],
[-0.174864 , -0.25181557],
[ 0.92757294, 0.46823621],
[ 0.65959279, -0.35197629],
[ 0.23454059, -0.33192183],
[ 0.94236171, -0.54182226],
[ 0.0432464 , -0.58148945],
[ 1.11624072, -0.08421401],
[ 0.35678657, -0.06682383],
[ 1.29646885, -0.32756152],
[ 0.92050265, -0.18239036],
[ 0.71400821, 0.15037915],
[ 0.89964086, 0.32961098],
[ 1.33104142, 0.24466952],
[ 1.55739627, 0.26739258],
[ 0.81245555, -0.16233157],
[-0.30733476, -0.36508661],
[-0.07034289, -0.70253793],
[-0.19188449, -0.67749054],
[ 0.13499495, -0.31170964],
[ 1.37873698, -0.42120514],
[ 0.58727485, -0.48328427],
[ 0.8072055 , 0.19505396],
[ 1.22042897, 0.40803534],
[ 0.81286779, -0.370679 ],
[ 0.24519516, -0.26672804],
[ 0.16451343, -0.67966147],
[ 0.46303099, -0.66952655],
[ 0.89016045, -0.03381244],
[ 0.22887905, -0.40225762],
[-0.70708128, -1.00842476],
[ 0.35553304, -0.50321849],
[ 0.33112695, -0.21118014],
[ 0.37523823, -0.29162202],
[ 0.64169028, 0.01907118],
[-0.90846333, -0.75156873],
[ 0.29780791, -0.34701652],
[ 2.53172698, -0.01184224],
[ 1.41407223, -0.57492506],
[ 2.61648461, 0.34193529],
[ 1.97081495, -0.18112569],
[ 2.34975798, -0.04188255],
[ 3.39687992, 0.54716805],
[ 0.51938325, -1.19135169],
[ 2.9320051 , 0.35237701],
[ 2.31967279, -0.24554817],
[ 2.91813423, 0.78038063],
[ 1.66193495, 0.2420384 ],
[ 1.80234045, -0.21615461],
[ 2.16537886, 0.21528028],
[ 1.34459422, -0.77641543],
[ 1.5852673 , -0.53930705],
[ 1.90474358, 0.11881899],
[ 1.94924878, 0.04073026],
[ 3.48876538, 1.17154454],
[ 3.79468686, 0.25326557],
[ 1.29832982, -0.76101394],
[ 2.42816726, 0.37678197],
[ 1.19809737, -0.60557896],
[ 3.49926548, 0.45677347],
[ 1.38766825, -0.20403099],
[ 2.27585365, 0.33338653],
[ 2.61419383, 0.55836695],
[ 1.25762518, -0.179137 ],
[ 1.29066965, -0.11642525],
[ 2.12285398, -0.21085488],
[ 2.3875644 , 0.46251925],
[ 2.84096093, 0.37274259],
[ 3.2323429 , 1.37052404],
[ 2.15873837, -0.21832553],
[ 1.4431026 , -0.14380129],
[ 1.77964011, -0.50146479],
[ 3.07652162, 0.68576444],
[ 2.14498686, 0.13890661],
[ 1.90486293, 0.04804751],
[ 1.16885347, -0.1645025 ],
[ 2.10765373, 0.37148225],
[ 2.31430339, 0.18260885],
[ 1.92245088, 0.40927118],
[ 1.41407223, -0.57492506],
[ 2.56332271, 0.2759745 ],
[ 2.41939122, 0.30350394],
[ 1.94401705, 0.18741522],
[ 1.52566363, -0.37502085],
[ 1.76404594, 0.07851919],
[ 1.90162908, 0.11587675],
[ 1.38966613, -0.28288671]])
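To judge how much information the two retained components preserve, the fitted PCA object can be inspected. The snippet below is a minimal sketch rather than part of the original walkthrough; the variable names `pca` and `reduced` are introduced here for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Reload the data so the snippet runs on its own
iris = load_iris()

# Keep the fitted estimator so its attributes can be inspected afterwards
pca = PCA(n_components=2)
reduced = pca.fit_transform(iris.data)

# One row per sample, one column per principal component
print(reduced.shape)

# Fraction of the total variance captured by each retained component
print(pca.explained_variance_ratio_)

# Combined fraction of variance kept after reducing 4 features to 2
print(pca.explained_variance_ratio_.sum())
```

On this dataset the first component carries most of the variance, which is why the first column of the output above separates the three species far more clearly than the second.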
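The following code uses the LinearDiscriminantAnalysis class (imported here as LDA) from the discriminant_analysis module to reduce the dimensionality of the data. Unlike PCA, LDA is supervised, so the target vector is passed in as well.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
# Linear discriminant analysis; returns the data after dimensionality reduction
# The n_components parameter is the number of dimensions after reduction
LDA(n_components=2).fit_transform(iris.data, iris.target)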
array([[ 8.0849532 , 0.32845422],
[ 7.1471629 , -0.75547326],
[ 7.51137789, -0.23807832],
[ 6.83767561, -0.64288476],
[ 8.15781367, 0.54063935],
[ 7.72363087, 1.48232345],
[ 7.23514662, 0.3771537 ],
[ 7.62974497, 0.01667246],
[ 6.58274132, -0.98737424],
[ 7.36884116, -0.91362729],
[ 8.42181434, 0.67622968],
[ 7.24739721, -0.08292417],
[ 7.35062105, -1.0393597 ],
[ 7.59646896, -0.77671553],
[ 9.86936588, 1.61486093],
[ 9.18033614, 2.75558626],
[ 8.59760709, 1.85442217],
[ 7.7995682 , 0.60905468],
[ 8.1000091 , 0.99610981],
[ 8.04543611, 1.16244332],
[ 7.52046427, -0.156233 ],
[ 7.60526378, 1.22757267],
[ 8.70408249, 0.89959416],
[ 6.26374139, 0.46023935],
[ 6.59191505, -0.36199821],
[ 6.79210164, -0.93823664],
[ 6.84048091, 0.4848487 ],
[ 7.948386 , 0.23871551],
[ 8.01209273, 0.11626909],
[ 6.85589572, -0.51715236],
[ 6.78303525, -0.72933749],
[ 7.38668238, 0.59101728],
[ 9.16249492, 1.25094169],
[ 9.49617185, 1.84989586],
[ 7.36884116, -0.91362729],
[ 7.9756525 , -0.13519572],
[ 8.63115466, 0.4346228 ],
[ 7.36884116, -0.91362729],
[ 6.95602269, -0.67887846],
[ 7.71167183, 0.01995843],
[ 7.9361354 , 0.69879338],
[ 5.6690533 , -1.90328976],
[ 7.26559733, -0.24793625],
[ 6.42449823, 1.26152073],
[ 6.88607488, 1.07094506],
[ 6.77985104, -0.47815878],
[ 8.11232705, 0.78881818],
[ 7.21095698, -0.33438897],
[ 8.33988749, 0.6729437 ],
[ 7.69345171, -0.10577397],
[-1.45772244, 0.04186554],
[-1.79768044, 0.48879951],
[-2.41680973, -0.08234044],
[-2.26486771, -1.57609174],
[-2.55339693, -0.46282362],
[-2.41954768, -0.95728766],
[-2.44719309, 0.79553574],
[-0.2160281 , -1.57096512],
[-1.74591275, -0.80526746],
[-1.95838993, -0.35044011],
[-1.19023864, -2.61561292],
[-1.86140718, 0.32050146],
[-1.15386577, -2.61693435],
[-2.65942607, -0.63412155],
[-0.38024071, 0.09211958],
[-1.20280815, 0.09561055],
[-2.7626699 , 0.03156949],
[-0.76227692, -1.63917546],
[-3.50940735, -1.6724835 ],
[-1.08410216, -1.6100398 ],
[-3.71895188, 1.03509697],
[-0.99937 , -0.47902036],
[-3.83709476, -1.39488292],
[-2.24344339, -1.41079358],
[-1.25428429, -0.53276537],
[-1.43952232, -0.12314653],
[-2.45921948, -0.91961551],
[-3.52471481, 0.16379275],
[-2.58974981, -0.17075771],
[ 0.31197324, -1.29978446],
[-1.10232227, -1.7357722 ],
[-0.59844322, -1.92334798],
[-0.89605882, -0.89192518],
[-4.49567379, -0.87924754],
[-2.9265236 , 0.02499754],
[-2.10119821, 1.18719828],
[-2.14367532, 0.09713697],
[-2.48342912, -1.92190266],
[-1.31792367, -0.15753271],
[-1.95529307, -1.14514953],
[-2.38909697, -1.5823776 ],
[-2.28614469, -0.32562577],
[-1.26934019, -1.20042096],
[-0.28888857, -1.78315025],
[-2.00077969, -0.8969707 ],
[-1.16910587, -0.52787187],
[-1.6092782 , -0.46274252],
[-1.41813799, -0.53933732],
[ 0.47271009, -0.78924756],
[-1.54557146, -0.58518894],
[-7.85608083, 2.11161905],
[-5.5156825 , -0.04401811],
[-6.30499392, 0.46211638],
[-5.60355888, -0.34236987],
[-6.86344597, 0.81602566],
[-7.42481805, -0.1726265 ],
[-4.68086447, -0.50758694],
[-6.31374875, -0.96068288],
[-6.33198886, -1.37715975],
[-6.87287126, 2.69458147],
[-4.45364294, 1.33693971],
[-5.4611095 , -0.21035161],
[-5.67679825, 0.82435717],
[-5.97407494, -0.10462115],
[-6.78782019, 1.5744553 ],
[-5.82871291, 1.98940576],
[-5.0664238 , -0.02730214],
[-6.60847169, 1.7420041 ],
[-9.18829265, -0.74909806],
[-4.76573133, -2.14417884],
[-6.29305487, 1.63373692],
[-5.37314577, 0.63153087],
[-7.58557489, -0.97390788],
[-4.38367513, -0.12213933],
[-5.73135125, 1.28143515],
[-5.27583147, -0.0384815 ],
[-4.0923206 , 0.18307048],
[-4.08316687, 0.51770204],
[-6.53257435, 0.28724638],
[-4.577648 , -0.84457527],
[-6.23500611, -0.70621819],
[-5.21836582, 1.46644917],
[-6.81795935, 0.56784684],
[-3.80972091, -0.93451896],
[-5.09023453, -2.11775698],
[-6.82119092, 0.85698379],
[-6.54193229, 2.41858841],
[-4.99356333, 0.18488299],
[-3.94659967, 0.60744074],
[-5.22159002, 1.13613893],
[-6.67858684, 1.785319 ],
[-5.13687786, 1.97641389],
[-5.5156825 , -0.04401811],
[-6.81196984, 1.44440158],
[-6.87289126, 2.40383699],
[-5.67401294, 1.66134615],
[-5.19712883, -0.36550576],
[-4.98171163, 0.81297282],
[-5.90148603, 2.32075134],
[-4.68400868, 0.32508073]])
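As a rough check on the LDA projection, the sketch below keeps the fitted estimator and looks at its output shape and its accuracy on the training data. It is an illustration under stated assumptions, not part of the original walkthrough: the names `lda`, `projected`, and `train_accuracy` are introduced here, and training accuracy is only an optimistic indicator because no data is held out.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Reload the data so the snippet runs on its own
iris = load_iris()

# n_components may be at most min(n_classes - 1, n_features);
# Iris has 3 classes and 4 features, so 2 is the maximum here
lda = LinearDiscriminantAnalysis(n_components=2)
projected = lda.fit_transform(iris.data, iris.target)

# One row per sample, one column per discriminant axis
print(projected.shape)

# The same fitted object also works as a classifier;
# this is accuracy on the training data, not a held-out estimate
train_accuracy = lda.score(iris.data, iris.target)
print(train_accuracy)
```

Because LDA uses the class labels when choosing its axes, the first column of the output above separates the three species more widely than the first PCA component does.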