Hands-On AI: Linear Regression


%matplotlib inline
import random
import torch
from d2l import torch as d2l

Implementing Linear Regression from Scratch

torch.cuda.is_available(),torch.__version__
(True, '1.10.0')

1. Construct the dataset

def synthetic_data(w, b, num_examples):
    """Generate y = Xw + b + noise."""
    X = torch.normal(0, 1, (num_examples, len(w)))
    # y = torch.matmul(X, w) + b  # matmul also works; it handles the shape conversion automatically
    y = torch.mv(X, w) + b
    y += torch.normal(0, 0.01, y.shape)
    return X, y.reshape((-1, 1))
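
In equation form, synthetic_data draws each example from a linear model with additive Gaussian noise:

$$ y = \mathbf{X}\mathbf{w} + b + \epsilon, \qquad \epsilon \sim \mathcal{N}(0,\ 0.01^2) $$

The true parameters w = [2, -3.4] and b = 4.2 are fixed next, and 1000 examples are generated.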

true_w = torch.tensor([2, -3.4])
true_b = 4.2
features, labels = synthetic_data(true_w, true_b, 1000)
print('features:', features[0], '\nlabel:', labels[0])
features: tensor([0.3420, 0.0065]) 
label: tensor([4.8720])
d2l.set_figsize()
d2l.plt.scatter(features[:,1].numpy(),labels.numpy(), 1)
<matplotlib.collections.PathCollection at 0x1b64f008208>


[Figure: scatter plot of the second feature features[:, 1] against labels (output_6_1.svg)]

2. Define the data-reading function

# Define a data_iter function that takes a batch size, a feature matrix, and a label vector, and yields minibatches of size batch_size
def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)
    
    for i in range(0, num_examples, batch_size):
        batch_indices = indices[i:min(i+batch_size,num_examples)]
        yield features[batch_indices], labels[batch_indices]
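
A small implementation note: batch_indices stays a Python list above, which PyTorch accepts for row selection. The d2l reference wraps it in torch.tensor first; either form selects the same rows:

batch_indices = torch.tensor(indices[i:min(i + batch_size, num_examples)])  # equivalent alternative
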
batch_size = 10

for X,y in data_iter(batch_size, features, labels):
    print(X,y)
    break
tensor([[-1.0328, -1.6753],
        [ 0.0146, -0.1582],
        [ 1.3496, -0.2746],
        [ 0.7416,  1.1490],
        [-0.6307, -1.3160],
        [-0.4223,  0.2959],
        [ 0.1045,  0.0867],
        [-0.2621,  1.2941],
        [-0.1359,  0.5364],
        [-0.9781, -0.6348]]) tensor([[ 7.8388],
        [ 4.7718],
        [ 7.8217],
        [ 1.7736],
        [ 7.4124],
        [ 2.3772],
        [ 4.1344],
        [-0.7325],
        [ 2.0993],
        [ 4.3911]])

3. Initialize the model parameters

w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)  # weights: small random initial values
b = torch.zeros(1, requires_grad=True)                      # bias: initialized to zero

4. Define the model

def linreg(X, w, b):
    ''' Linear regression model '''
    return torch.matmul(X, w) + b
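
A quick shape check (illustrative, not in the original notebook): X of shape (n, 2) times w of shape (2, 1) gives (n, 1), and the scalar b broadcasts across all rows.

linreg(features[:3], w, b).shape   # torch.Size([3, 1])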

5. Define the loss function and optimization algorithm

def squared_loss(y_hat, y, batch_size):
    '''
    Squared loss, averaged over the minibatch.
    y_hat: predictions; y: true labels; batch_size: minibatch size
    '''
    return (y_hat - y.reshape(y_hat.shape))**2 / 2 / batch_size

def sgd(params, lr):
    ''' Minibatch stochastic gradient descent '''
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad
            param.grad.zero_()
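
Putting the two pieces together, every minibatch B performs the standard minibatch SGD update

$$ (\mathbf{w}, b) \leftarrow (\mathbf{w}, b) - \frac{\eta}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \nabla_{(\mathbf{w},b)} \tfrac{1}{2}\big(\hat{y}^{(i)} - y^{(i)}\big)^2 $$

Note that the 1/|B| averaging factor is folded into squared_loss here, whereas the d2l reference divides by batch_size inside sgd; the resulting update is the same.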

6. Training

lr = 0.03
num_epochs = 3
net = linreg
loss = squared_loss
for epoch in range(num_epochs):
    for X,y in data_iter(batch_size, features,labels):
        l = loss(net(X,w,b),y,batch_size)
        l.sum().backward()  # sum first so that backward() is called on a scalar rather than a vector/matrix
        sgd([w,b],lr)
    with torch.no_grad():
        train_l = loss(net(features,w,b),labels,batch_size)
        print(f'epoch {epoch + 1}, loss {float(train_l.mean()):f}')
epoch 1, loss 0.000022
epoch 2, loss 0.000005
epoch 3, loss 0.000005

7. Model evaluation

print(f"estimation error of w: {true_w - w.reshape(true_w.shape)}")
estimation error of w: tensor([-0.0001,  0.0005], grad_fn=<SubBackward0>)
print(f'estimation error of b: {true_b - float(b)}')
estimation error of b: -0.0005381584167478692

Concise Implementation of Linear Regression

import numpy as np
import torch
from torch.utils import data
from d2l import torch as d2l

1. Construct the data

true_w = torch.tensor([2, -3.4])
true_b = 4.2
features, labels = d2l.synthetic_data(true_w, true_b, 1000)

2. Read the data

def load_array(data_arrays, batch_size, is_train=True):
    ''' Construct a PyTorch data iterator '''
    dataset = data.TensorDataset(*data_arrays)
    return data.DataLoader(dataset, batch_size, shuffle=is_train)

batch_size = 10
data_iter = load_array((features,labels), batch_size)
next(iter(data_iter))
[tensor([[-0.5843,  0.8177],
         [ 2.0447,  0.0471],
         [-0.4842,  0.4462],
         [-0.7347, -0.1442],
         [-0.1306, -1.0834],
         [-0.9230, -0.5122],
         [ 0.5734,  0.4148],
         [-0.8744, -0.8803],
         [ 0.2996, -1.8365],
         [-1.7872, -0.6060]]),
 tensor([[ 0.2532],
         [ 8.1311],
         [ 1.7064],
         [ 3.2185],
         [ 7.6259],
         [ 4.0917],
         [ 3.9358],
         [ 5.4591],
         [11.0403],
         [ 2.6870]])]

3. Define the model

from torch import nn

net = nn.Linear(2,1)
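
For comparison, the d2l text wraps the layer in nn.Sequential; the plain nn.Linear used here behaves the same, only parameter access differs (net[0].weight instead of net.weight). Shown for reference only, the rest of this section keeps the plain layer:

# net = nn.Sequential(nn.Linear(2, 1))   # d2l-style alternative; parameters are then net[0].weight, net[0].bias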

4. Initialize model parameters (optional, since PyTorch initializes them by default)

net.weight
Parameter containing:
tensor([[-0.0771,  0.0570]], requires_grad=True)
net.bias
Parameter containing:
tensor([-0.3867], requires_grad=True)
net.weight.data.normal_(0,0.01)
net.bias.data.fill_(0)
tensor([0.])
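
An equivalent way to write this initialization (a minimal sketch using torch.nn.init, not in the original post):

nn.init.normal_(net.weight, mean=0.0, std=0.01)  # same effect as net.weight.data.normal_(0, 0.01)
nn.init.zeros_(net.bias)                         # same effect as net.bias.data.fill_(0)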

5. Loss function and optimization algorithm

list(net.parameters())
[Parameter containing:
 tensor([[-0.0771,  0.0570]], requires_grad=True),
 Parameter containing:
 tensor([-0.3867], requires_grad=True)]
loss = nn.MSELoss()

trainer = torch.optim.SGD(net.parameters(), lr=0.03)
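
Note that nn.MSELoss with its default reduction='mean' returns the mean of the squared errors, so compared with the from-scratch squared_loss (summed over a batch) it is larger by a factor of 2, since that version carries an extra factor of 1/2. A tiny hand-computed illustration:

y_hat = torch.tensor([[1.0], [2.0]])
y = torch.tensor([[0.5], [2.5]])
nn.MSELoss()(y_hat, y)   # tensor(0.2500): mean of (0.5**2, 0.5**2)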

6. Training

num_epochs = 3

for epoch in range(num_epochs):
    for X,y in data_iter:
        l = loss(net(X), y)
        trainer.zero_grad()
        l.backward()
        trainer.step()
    l = loss(net(features), labels)
    print(f"epoch {epoch + 1}, loss {l:.6f}")
epoch 1, loss 0.000093
epoch 2, loss 0.000093
epoch 3, loss 0.000094

7. Evaluate the model

w = net.weight.data
print('estimation error of w:', true_w - w.reshape(true_w.shape))
b = net.bias.data
print('estimation error of b:', true_b - b)
estimation error of w: tensor([ 0.0001, -0.0009])
estimation error of b: tensor([1.0014e-05])
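
As a final sanity check (a minimal sketch, not part of the original post), the trained layer can be used directly for prediction; with the true parameters w = [2, -3.4] and b = 4.2, an input of [1.0, 2.0] should map to roughly 2*1.0 - 3.4*2.0 + 4.2 = -0.6:

with torch.no_grad():
    print(net(torch.tensor([[1.0, 2.0]])))   # expected: close to tensor([[-0.6000]])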