Chapter 5: Linear Regression, Study Notes (Part 2)
Time: 2023-09-27 14:25:50
5-5 Metrics for Evaluating Linear Regression: MSE, RMSE, MAE (05-Regression-Metrics-MSE-vs-MAE)
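The objective minimized during fitting is actually computed on the training dataset; to evaluate the model, the same kind of error is measured on the test dataset.
The metric should not depend on the number of samples, so the summed squared error is divided by the sample count m, which gives MSE.
MSE is in squared units of the target, which is sometimes inconvenient, so its square root (RMSE) is used to bring the metric back to the target's own units.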
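x = x[y < 50.0]
y = y[y < 50.0]
Question noted while studying: why are both x and y indexed with y < 50.0? The expression y < 50.0 is a boolean mask with one entry per sample. Applying the same mask to both arrays drops the same samples from each, so x and y stay aligned. x has to be filtered first, because the second line overwrites y.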
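A minimal sketch of the masking step above, using made-up toy arrays (the values are hypothetical, not from the course data). It shows that one boolean mask selects matching rows from both arrays:

```python
import numpy as np

# Hypothetical toy data: four samples, one feature value each.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([10.0, 50.0, 30.0, 50.0])

mask = y < 50.0      # one boolean per sample: [True, False, True, False]
x_kept = x[mask]     # feature values of the samples that pass the filter
y_kept = y[mask]     # the matching targets, in the same order

print(x_kept)  # [1. 3.]
print(y_kept)  # [10. 30.]
```

Storing the mask in a variable first also removes the ordering concern, since neither array is overwritten before the mask has been used.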
SimpleLinearRegression.py
import numpy as np


class SimpleLinearRegression:

    def __init__(self):
        """Initialize the Simple Linear Regression model."""
        self.a_ = None  # slope, set by fit
        self.b_ = None  # intercept, set by fit

    def fit(self, x_train, y_train):
        """Train the model on the training data x_train, y_train."""
        assert x_train.ndim == 1, \
            "Simple Linear Regressor can only solve single feature training data."
        assert len(x_train) == len(y_train), \
            "the size of x_train must be equal to the size of y_train"

        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)

        # Closed-form least-squares solution, written as dot products
        self.a_ = (x_train - x_mean).dot(y_train - y_mean) / (x_train - x_mean).dot(x_train - x_mean)
        self.b_ = y_mean - self.a_ * x_mean

        return self

    def predict(self, x_predict):
        """Given the data x_predict, return the vector of predicted values."""
        assert x_predict.ndim == 1, \
            "Simple Linear Regressor can only solve single feature training data."
        assert self.a_ is not None and self.b_ is not None, \
            "must fit before predict!"

        return np.array([self._predict(x) for x in x_predict])

    def _predict(self, x_single):
        """Given a single value x_single, return its predicted value."""
        return self.a_ * x_single + self.b_

    def __repr__(self):
        return "SimpleLinearRegression()"
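As a sanity check on the closed-form expressions used in fit, the following sketch recomputes the slope and intercept on a small made-up dataset (hypothetical values) and compares them with np.polyfit, which solves the same least-squares problem for a degree-1 polynomial:

```python
import numpy as np

# Hypothetical toy data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 3.0, 2.0, 3.0, 5.0])

x_mean, y_mean = x.mean(), y.mean()

# Same formulas as in SimpleLinearRegression.fit
a = (x - x_mean).dot(y - y_mean) / (x - x_mean).dot(x - x_mean)
b = y_mean - a * x_mean

# Reference fit: polyfit returns coefficients from highest degree to lowest
a_ref, b_ref = np.polyfit(x, y, 1)

# Vectorized prediction; gives the same values as looping over _predict
y_hat = a * x + b
```

On this data the slope comes out at about 0.8 and the intercept at about 0.4, matching the reference fit up to floating-point error.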
Wrapping up our own evaluation functions
metrics.py
import numpy as np
from math import sqrt


def accuracy_score(y_true, y_predict):
    """Compute the accuracy between y_true and y_predict."""
    assert len(y_true) == len(y_predict), \
        "the size of y_true must be equal to the size of y_predict"

    return np.sum(y_true == y_predict) / len(y_true)


def mean_squared_error(y_true, y_predict):
    """Compute the MSE between y_true and y_predict."""
    assert len(y_true) == len(y_predict), \
        "the size of y_true must be equal to the size of y_predict"

    return np.sum((y_true - y_predict)**2) / len(y_true)


def root_mean_squared_error(y_true, y_predict):
    """Compute the RMSE between y_true and y_predict."""
    return sqrt(mean_squared_error(y_true, y_predict))


def mean_absolute_error(y_true, y_predict):
    """Compute the MAE between y_true and y_predict."""
    assert len(y_true) == len(y_predict), \
        "the size of y_true must be equal to the size of y_predict"

    return np.sum(np.absolute(y_true - y_predict)) / len(y_true)
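RMSE tends to amplify large errors, whereas MAE does not. Because the errors are squared before averaging, the largest error dominates RMSE, so driving RMSE down mostly means shrinking the largest error.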
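A small sketch of this difference, using two made-up prediction vectors (hypothetical values) that have the same total absolute error but distribute it differently. The helper names mae and rmse are local to this example:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true = np.zeros(4)
even = np.array([2.0, 2.0, 2.0, 2.0])    # error spread evenly over all samples
spiky = np.array([0.0, 0.0, 0.0, 8.0])   # same total error, one large miss

print(mae(y_true, even), mae(y_true, spiky))    # 2.0 2.0
print(rmse(y_true, even), rmse(y_true, spiky))  # 2.0 4.0
```

MAE cannot tell the two cases apart, while RMSE doubles when the error is concentrated in a single sample.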
5-6 The Best Metric for Evaluating Linear Regression: R Squared
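Classification accuracy always lies between 0 and 1, which makes models comparable across problems. RMSE and MAE have no such fixed scale: their values depend on the units of the target.
$R^2 = 1 - \frac{\sum_i (\hat{y}^{(i)} - y^{(i)})^2}{\sum_i (\bar{y} - y^{(i)})^2}$, where the numerator uses prediction minus true value and the denominator uses the mean $\bar{y}$ minus true value.
What does it mean, and why is it a good metric? The denominator is the error of a baseline model that always predicts the mean, so R Squared measures how much of the baseline's error the model removes: 1 is a perfect fit, 0 is no better than the baseline, and a negative value is worse than the baseline.
This interpretation assumes that the data has some degree of linear relationship.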
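The following sketch evaluates R Squared directly from its definition on made-up targets (hypothetical values) for three sets of predictions: a perfect one, the mean baseline, and one that is worse than the baseline. The function name r2 is local to this example. It also checks the equivalent form 1 - MSE / Var(y), obtained by dividing numerator and denominator by the sample count:

```python
import numpy as np

def r2(y_true, y_pred):
    """R Squared from the definition: 1 - SS_residual / SS_total."""
    ss_res = np.sum((y_pred - y_true) ** 2)         # model error
    ss_tot = np.sum((y_true.mean() - y_true) ** 2)  # mean-baseline error
    return 1 - ss_res / ss_tot

y_true = np.array([1.0, 3.0, 2.0, 3.0, 5.0])

perfect = y_true.copy()                         # predicts every value exactly
baseline = np.full_like(y_true, y_true.mean())  # always predicts the mean
worse = baseline + 2.0                          # constant prediction, off by 2

r2_perfect = r2(y_true, perfect)    # 1.0
r2_baseline = r2(y_true, baseline)  # 0.0
r2_worse = r2(y_true, worse)        # negative

# Equivalent form: 1 - MSE / variance of y_true
r2_alt = 1 - np.mean((y_true - worse) ** 2) / np.var(y_true)
```

np.var uses the population variance (dividing by m) by default, which is what makes the two forms agree.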
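When two scripts live in the same directory (package), one can import the other as aaa.bbb; dropping the package name aaa gives the relative form .bbb. The relative form only works when the file is imported as part of a package, not when it is run directly as a script.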
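To make the import rule concrete, here is a self-contained sketch that builds a throwaway package in a temporary directory and imports through it. The names aaa and bbb follow the note above; ccc and helper are hypothetical names introduced only for this illustration:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a package directory aaa/ containing bbb.py and ccc.py
    pkg = Path(tmp) / "aaa"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "bbb.py").write_text("def helper():\n    return 'from bbb'\n")
    # ccc.py imports its sibling with the relative form .bbb
    (pkg / "ccc.py").write_text("from .bbb import helper\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Importing ccc as part of the package makes the relative import resolve
        ccc = importlib.import_module("aaa.ccc")
        result = ccc.helper()
    finally:
        sys.path.remove(tmp)

print(result)  # from bbb
```

Running ccc.py directly with the interpreter would instead raise an ImportError, because a file run as a script has no parent package for the leading dot to refer to.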
SimpleLinearRegression.py
import numpy as np
from .metrics import r2_score


class SimpleLinearRegression:

    def __init__(self):
        """Initialize the Simple Linear Regression model."""
        self.a_ = None  # slope, set by fit
        self.b_ = None  # intercept, set by fit

    def fit(self, x_train, y_train):
        """Train the model on the training data x_train, y_train."""
        assert x_train.ndim == 1, \
            "Simple Linear Regressor can only solve single feature training data."
        assert len(x_train) == len(y_train), \
            "the size of x_train must be equal to the size of y_train"

        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)

        # Closed-form least-squares solution, written as dot products
        self.a_ = (x_train - x_mean).dot(y_train - y_mean) / (x_train - x_mean).dot(x_train - x_mean)
        self.b_ = y_mean - self.a_ * x_mean

        return self

    def predict(self, x_predict):
        """Given the data x_predict, return the vector of predicted values."""
        assert x_predict.ndim == 1, \
            "Simple Linear Regressor can only solve single feature training data."
        assert self.a_ is not None and self.b_ is not None, \
            "must fit before predict!"

        return np.array([self._predict(x) for x in x_predict])

    def _predict(self, x_single):
        """Given a single value x_single, return its predicted value."""
        return self.a_ * x_single + self.b_

    def score(self, x_test, y_test):
        """Evaluate the current model on the test data x_test, y_test using R Squared."""
        y_predict = self.predict(x_test)
        return r2_score(y_test, y_predict)

    def __repr__(self):
        return "SimpleLinearRegression()"
metrics.py
import numpy as np
from math import sqrt


def accuracy_score(y_true, y_predict):
    """Compute the accuracy between y_true and y_predict."""
    assert len(y_true) == len(y_predict), \
        "the size of y_true must be equal to the size of y_predict"

    return np.sum(y_true == y_predict) / len(y_true)


def mean_squared_error(y_true, y_predict):
    """Compute the MSE between y_true and y_predict."""
    assert len(y_true) == len(y_predict), \
        "the size of y_true must be equal to the size of y_predict"

    return np.sum((y_true - y_predict)**2) / len(y_true)


def root_mean_squared_error(y_true, y_predict):
    """Compute the RMSE between y_true and y_predict."""
    return sqrt(mean_squared_error(y_true, y_predict))


def mean_absolute_error(y_true, y_predict):
    """Compute the MAE between y_true and y_predict."""
    assert len(y_true) == len(y_predict), \
        "the size of y_true must be equal to the size of y_predict"

    return np.sum(np.absolute(y_true - y_predict)) / len(y_true)


def r2_score(y_true, y_predict):
    """Compute the R Squared between y_true and y_predict."""
    # Equivalent to 1 - SS_residual / SS_total; np.var divides by the sample count
    return 1 - mean_squared_error(y_true, y_predict) / np.var(y_true)
In scikit-learn, the score method of LinearRegression returns the R Squared value (r2_score):
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
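A short check of this behavior, assuming scikit-learn is installed. The data is synthetic (a noisy line generated with a fixed seed, not taken from the notes); the sketch compares LinearRegression.score with sklearn.metrics.r2_score applied to the model's predictions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data: y = 3x + 4 plus Gaussian noise (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))   # scikit-learn expects a 2D feature matrix
y = 3.0 * X[:, 0] + 4.0 + rng.normal(0, 1, size=50)

reg = LinearRegression().fit(X, y)

score_value = reg.score(X, y)            # R Squared via the estimator
r2_value = r2_score(y, reg.predict(X))   # R Squared via the metrics module

print(np.isclose(score_value, r2_value))  # True
```

Note that, unlike the single-feature class above, scikit-learn takes X as a 2D array with one column per feature.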