ML with sklearn: a detailed guide to the confusion_matrix and make_scorer functions in sklearn.metrics, with example applications
Table of Contents
The sklearn.metrics.confusion_matrix function
The sklearn.metrics.make_scorer() function
Recommended articles
ML: Evaluation Metrics for Classification Problems (ER / Confusion Matrix / P-R-F1 / ROC-AUC / RP / mAP): Introduction, Usage, Code Implementation, and Example Applications
CNN Performance Metrics: Common Metrics for Convolutional Neural Networks (IoU / AP / mAP, Confusion Matrix): Introduction and Usage
Commonly used functions and parameters in sklearn.metrics
The sklearn.metrics.confusion_matrix function
Function explanation
Return value: a confusion matrix C whose entry in row i, column j is the number of samples whose true label is class i and whose predicted label is class j.
In the binary case, rows are true labels and columns are predicted labels:

|            | Predicted 0 | Predicted 1 |
|------------|-------------|-------------|
| **True 0** | TN          | FP          |
| **True 1** | FN          | TP          |
Source: `confusion_matrix` is defined in `sklearn.metrics._classification` (decorated with `@_deprecate_positional_args`).
Examples from the docstring:

```python
>>> from sklearn.metrics import confusion_matrix
>>> y_true = [2, 0, 2, 2, 0, 1]
>>> y_pred = [0, 0, 2, 2, 0, 2]
>>> confusion_matrix(y_true, y_pred)
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])

>>> y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
>>> y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
>>> confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
```

In the binary case, we can extract true positives, etc. as follows:

```python
>>> tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0]).ravel()
>>> (tn, fp, fn, tp)
(0, 2, 1, 1)
```
""" y_type, y_true, y_pred = _check_targets(y_true, y_pred) if y_type not in ("binary", "multiclass"): raise ValueError("%s is not supported" % y_type) if labels is None: labels = unique_labels(y_true, y_pred) else: labels = np.asarray(labels) n_labels = labels.size if n_labels == 0: raise ValueError("'labels' should contains at least one label.") elif y_true.size == 0: return np.zeros((n_labels, n_labels), dtype=np.int) elif np.all([l not in y_true for l in labels]): raise ValueError("At least one label specified must be in y_true") if sample_weight is None: sample_weight = np.ones(y_true.shape[0], dtype=np.int64) else: sample_weight = np.asarray(sample_weight) check_consistent_length(y_true, y_pred, sample_weight) if normalize not in ['true', 'pred', 'all', None]: raise ValueError("normalize must be one of {'true', 'pred', " "'all', None}") n_labels = labels.size label_to_ind = {y:x for x, y in enumerate(labels)} # convert yt, yp into index y_pred = np.array([label_to_ind.get(x, n_labels + 1) for x in y_pred]) y_true = np.array([label_to_ind.get(x, n_labels + 1) for x in y_true]) # intersect y_pred, y_true with labels, eliminate items not in labels ind = np.logical_and(y_pred < n_labels, y_true < n_labels) y_pred = y_pred[ind] y_true = y_true[ind] # also eliminate weights of eliminated items sample_weight = sample_weight[ind] # Choose the accumulator dtype to always have high precision if sample_weight.dtype.kind in {'i', 'u', 'b'}: dtype = np.int64 else: dtype = np.float64 cm = coo_matrix((sample_weight, (y_true, y_pred)), shape=(n_labels, n_labels), dtype=dtype).toarray() with np.errstate(all='ignore'): if normalize == 'true': cm = cm / cm.sum(axis=1, keepdims=True) elif normalize == 'pred': cm = cm / cm.sum(axis=0, keepdims=True) elif normalize == 'all': cm = cm / cm.sum() cm = np.nan_to_num(cm) return cm |
The sklearn.metrics.make_scorer() function
Function explanation
def make_scorer(score_func, *, greater_is_better=True, needs_proba=False, needs_threshold=False, **kwargs)

Make a scorer from a performance metric or loss function. This factory function wraps scoring functions for use with :class:`~sklearn.model_selection.GridSearchCV` and :func:`~sklearn.model_selection.cross_val_score`. It takes a score function, such as :func:`~sklearn.metrics.accuracy_score`, :func:`~sklearn.metrics.mean_squared_error`, :func:`~sklearn.metrics.adjusted_rand_index` or :func:`~sklearn.metrics.average_precision`, and returns a callable that scores an estimator's output. The signature of the call is `(estimator, X, y)`, where `estimator` is the model to be evaluated, `X` is the data and `y` is the ground-truth labeling (or `None` in the case of unsupervised models). Read more in the :ref:`User Guide <scoring>`.
Parameters
- score_func : callable. Score function (or loss function) with signature `score_func(y, y_pred, **kwargs)`.
- greater_is_better : bool, default=True. Whether `score_func` is a score function (default), meaning high is good, or a loss function, meaning low is good. In the latter case, the scorer object will sign-flip the outcome of `score_func`.
- needs_proba : bool, default=False. Whether `score_func` requires `predict_proba` to get probability estimates out of a classifier. If True, for binary `y_true`, the score function is supposed to accept a 1D `y_pred` (i.e., the probability of the positive class, shape `(n_samples,)`).
- needs_threshold : bool, default=False. Whether `score_func` takes a continuous decision certainty. This only works for binary classification using estimators that have either a `decision_function` or `predict_proba` method. If True, for binary `y_true`, the score function is supposed to accept a 1D `y_pred` (i.e., the probability of the positive class or the decision function, shape `(n_samples,)`). For example, `average_precision` or the area under the ROC curve cannot be computed using discrete predictions alone.
- **kwargs : additional arguments. Additional parameters to be passed to `score_func`.

Returns
- scorer : callable. A callable object that returns a scalar score; greater is better.
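The `greater_is_better` sign flip can be seen by calling the scorer directly with the `(estimator, X, y)` signature described above; a minimal sketch (the toy data and model are illustrative):

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_squared_error

# Toy data with a perfect linear relationship (illustrative only).
X = [[0], [1], [2], [3]]
y = [0, 1, 2, 3]
model = LinearRegression().fit(X, y)

# mean_squared_error is a loss (lower is better), so greater_is_better=False
# makes the scorer return its negation: the scorer's output is higher-is-better.
neg_mse = make_scorer(mean_squared_error, greater_is_better=False)

# A scorer is called as (estimator, X, y).
print(neg_mse(model, X, y))  # ~ -0.0 for this perfect fit
```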
Example application
Using make_scorer together with a log_transfer wrapper
```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.metrics import make_scorer, mean_absolute_error, r2_score

def log_transfer(func):
    # Wrap a metric so it is evaluated on log-transformed targets;
    # np.nan_to_num guards against -inf/nan from non-positive predictions.
    def wrapper(y, y_hat):
        result = func(np.log(y), np.nan_to_num(np.log(y_hat)))
        return result
    return wrapper

# LiR_Model, X_train and y_train are defined earlier in the workflow.
cv_scores = cross_val_score(LiR_Model, X=X_train, y=y_train, verbose=1, cv=5,
                            scoring=make_scorer(log_transfer(r2_score)))
```
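A related self-contained sketch: since `mean_absolute_error` is a loss, wrapping it additionally requires `greater_is_better=False`. The `LinearRegression` model and synthetic data below are illustrative stand-ins for `LiR_Model`, `X_train` and `y_train`:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_absolute_error
from sklearn.model_selection import cross_val_score

def log_transfer(func):
    # Evaluate the metric on log-transformed targets; np.nan_to_num
    # guards against -inf/nan when a prediction is non-positive.
    def wrapper(y, y_hat):
        return func(np.log(y), np.nan_to_num(np.log(y_hat)))
    return wrapper

# Synthetic regression data with strictly positive targets so np.log(y)
# is well defined.
X, y = make_regression(n_samples=100, n_features=3, noise=0.1, random_state=0)
y = np.exp(y / np.abs(y).max())

# MAE is a loss, so the scorer sign-flips it: values are <= 0,
# and closer to 0 is better.
scorer = make_scorer(log_transfer(mean_absolute_error), greater_is_better=False)
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring=scorer)
print(scores)  # five non-positive values
```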