A Hands-On Introduction to PyTorch

Summary of the Most Common Naming Conventions

[Figure not available: table of naming conventions]

Commonly Used Libraries

[Figure not available: table of commonly used libraries]

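File Organization

Do not put all layers and models in a single file. The best practice is to keep the final networks in their own file (networks.py) and to keep layers, loss functions and other operations in separate files (layers.py, losses.py, ops.py). The final model (made up of one or more networks) should live in a file named after the model (for example, yolov3.py or DCGAN.py) that imports the individual modules. The main program and the separate training and test scripts should then only need to import the Python file that carries the model's name.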

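PyTorch Development Style and Tips

It is best to break a network down into smaller, reusable pieces. An nn.Module network is made up of operations and other nn.Module building blocks. Loss functions are nn.Module subclasses as well, so they can be integrated directly into a network.
A class that inherits from nn.Module must define a "forward" method, which implements the forward pass through its layers and operations.
An nn.Module processes input data through "self.net(input)". This invokes the object's "__call__()" method, which passes the input on to the module's "forward" method.

output = self.net(input)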
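
As a minimal sketch of these three points, the hypothetical module below (the name TinyRegressor and its layer sizes are assumptions made for illustration) bundles a sub-network and a loss function, defines forward, and is invoked by calling the object rather than its forward method:

```python
import torch
import torch.nn as nn


class TinyRegressor(nn.Module):
    """Hypothetical module that bundles a sub-network and its loss."""

    def __init__(self, in_features=4):
        super().__init__()
        self.net = nn.Linear(in_features, 1)   # a reusable building block
        self.criterion = nn.MSELoss()          # loss functions are nn.Module objects too

    def forward(self, x, target):
        # self.net(x) goes through nn.Module.__call__, which runs the forward
        # method of the sub-module.
        prediction = self.net(x)
        return prediction, self.criterion(prediction, target)


model = TinyRegressor()
x = torch.randn(8, 4)
target = torch.randn(8, 1)
prediction, loss = model(x, target)   # call the object, not model.forward(...)
print(prediction.shape, loss.item())
```

Keeping the loss inside the module is only one possible design; it is equally common to keep the criterion in the training script.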

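Understanding Tensors

A Tensor can be thought of as a multi-dimensional array, similar to a NumPy ndarray, except that a Tensor can also be placed on a GPU for acceleration.
Converting between Tensors and NumPy arrays is easy and fast (on the CPU the two can share memory). If an operation is not supported on Tensors, convert to a NumPy array, process it there, and convert the result back to a Tensor.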
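
The round trip described above might look like the following sketch. The use of np.cumsum as the "NumPy-only" step is purely illustrative (torch has its own cumsum); it stands in for whatever operation is missing on the tensor side.

```python
import numpy as np
import torch

a = np.arange(6, dtype=np.float32).reshape(2, 3)

t = torch.from_numpy(a)   # no copy: the CPU tensor shares memory with `a`
t.mul_(2)                 # an in-place change is therefore visible through `a`

b = t.numpy()             # back to NumPy, still sharing the same memory
c = np.cumsum(b, axis=1)  # stand-in for an operation done on the NumPy side
t2 = torch.from_numpy(c)  # and back to a Tensor

# Move to the GPU only if one is available; this step does copy the data.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t2.to(device)
print(t_dev)
```

Note that memory sharing only applies to CPU tensors; a tensor on the GPU has to be moved back with .cpu() before .numpy() can be called.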

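Understanding Variable

Variable was the core of the autograd automatic-differentiation module in PyTorch versions before 0.4. It wrapped a Tensor and supported almost all tensor operations. Since PyTorch 0.4, Variable and Tensor have been merged: a Tensor created with requires_grad=True plays the same role, and wrapping it in a Variable is no longer necessary. The three main attributes are:

data: the Tensor held by the Variable.
grad: the gradient with respect to data, with the same shape as data. Before 0.4, grad was itself a Variable rather than a Tensor; today it is a Tensor.
grad_fn: points to the Function object that is used during backpropagation to compute the gradients with respect to the inputs.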
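
A small sketch of these attributes on a modern PyTorch (0.4 or later, an assumption of this example), where a plain Tensor with requires_grad=True takes the place of a Variable:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)  # leaf tensor tracked by autograd
y = (x ** 2).sum()                                     # y = x1^2 + x2^2 + x3^2

y.backward()          # backpropagate from the scalar y

print(x.data)         # the raw values, detached from the graph
print(x.grad)         # dy/dx = 2 * x, same shape as x
print(y.grad_fn)      # the Function that produced y (a sum backward node)
print(x.grad_fn)      # None: x was created by the user, not by an operation
```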

nn.Module in Detail

# coding=utf-8
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

"""
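torch.nn is a modular interface designed specifically for neural networks. It is built on top of autograd and is used to define and run networks.
nn.Module is the central class in nn: it holds the definitions of a network's layers and its forward method.
Rule of thumb: if a layer has learnable parameters, define it with nn; otherwise nn.functional is enough.
To define your own network:
    Subclass nn.Module and implement the forward method.
    Layers with learnable parameters are usually created in the constructor __init__().
    Layers without learnable parameters (such as ReLU) may be created in the constructor or left out of it (and applied in forward through nn.functional instead).
    Once forward is defined in an nn.Module subclass, the backward pass is derived automatically by autograd.
    forward may use any function that tensors support, as well as ordinary Python such as if, for, print and log.
    (Before PyTorch 0.4 the objects flowing through the graph were Variables; today they are plain Tensors.)
    Note: the model below expects mini-batch input. Even a single image has to be given
    the shape N x C x H x W:
    
    input_image = torch.FloatTensor(1, 28, 28)
    input_image = Variable(input_image)
    input_image = input_image.unsqueeze(0)   # 1 x 1 x 28 x 28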
    
"""
 
 
class LeNet(nn.Module):
    def __init__(self):
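        # A subclass of nn.Module must call the parent constructor in its own constructor.
        # self refers to the instance, not to the class itself.
        super(LeNet, self).__init__()   # equivalent to nn.Module.__init__(self)
        # super() returns a proxy used to call methods of the parent (super) class.
        # nn.Conv2d(...) returns a Conv2d object, whose class implements forward.
        # Calling self.conv1(input) therefore runs that forward method.
        self.conv1 = nn.Conv2d(1, 6, (5, 5))   # output shape: (N, C_out, H_out, W_out)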
        self.conv2 = nn.Conv2d(6, 16, (5, 5))
        self.fc1 = nn.Linear(256, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        # nn.Linear(in_features, out_features): number of input and output units
 
    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))  # F.max_pool2d returns a tensor
        x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
        # torch.nn.functional.max_pool2d(input, kernel_size, stride=None, padding=0, dilation=1, ceil_mode=False, return_indices=False)
        x = x.view(x.size(0), -1)
        # Flatten each sample into a vector. After the convolutional layers x is a 4-D tensor
        # of shape (batch_size, channels, height, width); x.size(0) is the batch size.
        # view() works like reshape: the first argument fixes the number of rows and -1 lets
        # PyTorch infer the number of columns from the total number of elements.
        # For example, 12 elements are viewed as 2 x 6 with batch size 2, or as 4 x 3 with batch size 4.
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)   # no ReLU on the output layer; return the raw class scores
        # torch.nn.functional.relu(input, inplace=False) also returns a tensor
        return x
 
 
def output_name_and_params(net):
    for name, parameters in net.named_parameters():
        print('name: {}, param: {}'.format(name, parameters))
# Prints every trainable parameter by name, including any inherited from parent classes
 
if __name__ == '__main__':
    net = LeNet()
    print('net: {}'.format(net))   # str.format() provides flexible string formatting
    # "{1} {0} {1}".format("hello", "world")  # positional indices
    # The statement above evaluates to 'world hello world'
    params = net.parameters()   # generator object
    print('params: {}'.format(params))
    output_name_and_params(net)
 
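    input_image = torch.FloatTensor(10, 1, 28, 28)   # uninitialized batch of 10 images
    
    # Unlike TensorFlow, PyTorch before 0.4 required a model's input to be a Variable, and
    # Variables (not Tensors) flowed through the graph, as each step in forward shows.
    # Since 0.4 the Variable wrapper simply returns a Tensor, which can also be passed directly.
    input_image = Variable(input_image)
    # Analogy from the Variable era: the Variable is the basket and the tensor is the egg;
    # the egg goes into the basket so that it can be carried (a Variable is built from a tensor).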
    output = net(input_image)
    print('output: {}'.format(output))
    print('output.size: {}'.format(output.size()))
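
The flattening step in forward can be checked in isolation. The sketch below reproduces the 12-element example from the comments and then the shape that reaches fc1 in the LeNet above (16 channels of 4 x 4 after the second pooling layer, for 28 x 28 inputs):

```python
import torch

x = torch.arange(12.0)            # 12 elements in total

two_rows = x.view(2, -1)          # -1 is inferred as 12 / 2 = 6
four_rows = x.view(4, -1)         # -1 is inferred as 12 / 4 = 3
print(two_rows.shape, four_rows.shape)

# Same idea on a feature map: keep the batch dimension, flatten the rest.
feature_map = torch.zeros(10, 16, 4, 4)
flat = feature_map.view(feature_map.size(0), -1)
print(flat.shape)                 # 16 * 4 * 4 = 256 features per sample
```

This is why fc1 is declared as nn.Linear(256, 120): a different input resolution would change the flattened size and require a different first argument.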

A Simple Network in PyTorch

class ConvBlock(nn.Module):
    def __init__(self):
        super(ConvBlock, self).__init__()
        block = [nn.Conv2d(...)]
        block += [nn.ReLU()]
        block += [nn.BatchNorm2d(...)]
        self.block = nn.Sequential(*block)
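# Simple recurring building blocks (such as ConvBlock) share the same pattern (convolution, activation, normalization) and are wrapped in their own nn.Module so that they can be reused.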

    def forward(self, x):
        return self.block(x)

class SimpleNetwork(nn.Module):
    def __init__(self, num_resnet_blocks=6):
        super(SimpleNetwork, self).__init__()
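        # here we add the individual layers
        layers = [ConvBlock(...)]
        for i in range(num_resnet_blocks):
            layers += [ResnetBlock(...)]   # ResnetBlock is defined in the next section
        self.net = nn.Sequential(*layers)
# Build a list of the required layers, then combine them into one model with "nn.Sequential()". The "*" operator unpacks the list into separate arguments.
# The "*" is required; without it PyTorch raises: TypeError: list is not a Module subclass

    def forward(self, x):
        return self.net(x)
# In the forward pass we simply run the input through the model.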
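
The listing above leaves the constructor arguments elided. The following runnable sketch fills them with hypothetical values (channel counts, a 3 x 3 kernel, and plain ConvBlocks in place of residual blocks are all assumptions of this example) to show the list-then-Sequential pattern end to end:

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Convolution -> activation -> normalization, as in the listing above."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        block = [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)]
        block += [nn.ReLU()]
        block += [nn.BatchNorm2d(out_channels)]
        self.block = nn.Sequential(*block)

    def forward(self, x):
        return self.block(x)


class SimpleNetwork(nn.Module):
    def __init__(self, channels=8, num_blocks=3):
        super().__init__()
        layers = [ConvBlock(3, channels)]             # stem: RGB -> channels
        for _ in range(num_blocks):
            layers += [ConvBlock(channels, channels)]  # repeated building block
        self.net = nn.Sequential(*layers)              # unpack the list with *

    def forward(self, x):
        return self.net(x)


net = SimpleNetwork()
out = net(torch.randn(2, 3, 16, 16))
print(out.shape)   # padding=1 with a 3x3 kernel keeps the spatial size
```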

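A Simple Residual Network in PyTorch

The skip connection of a ResNet block is implemented directly in the forward pass; PyTorch allows dynamic operations during the forward pass.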

class ResnetBlock(nn.Module):
    def __init__(self, dim, padding_type, norm_layer, use_dropout, use_bias):
        super(ResnetBlock, self).__init__()
        self.conv_block = self.build_conv_block(...)

    def build_conv_block(self, ...):
        conv_block = []

        conv_block += [nn.Conv2d(...),
                       norm_layer(...),
                       nn.ReLU()]
        if use_dropout:
            conv_block += [nn.Dropout(...)]

        conv_block += [nn.Conv2d(...),
                       norm_layer(...)]

        return nn.Sequential(*conv_block)

    def forward(self, x):
        out = x + self.conv_block(x)
        return out
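
Since the listing above elides its arguments, here is a self-contained variant for experimentation. The class name ResidualBlock, the fixed BatchNorm2d normalization, the 3 x 3 kernels and the dropout rate are assumptions of this sketch, not values taken from the original block:

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Hypothetical concrete version of the residual block pattern."""

    def __init__(self, dim, use_dropout=False):
        super().__init__()
        layers = [nn.Conv2d(dim, dim, kernel_size=3, padding=1, bias=False),
                  nn.BatchNorm2d(dim),
                  nn.ReLU()]
        if use_dropout:
            layers += [nn.Dropout(0.5)]
        layers += [nn.Conv2d(dim, dim, kernel_size=3, padding=1, bias=False),
                   nn.BatchNorm2d(dim)]
        self.conv_block = nn.Sequential(*layers)

    def forward(self, x):
        # The skip connection is ordinary Python: add the input to the branch output.
        return x + self.conv_block(x)


block = ResidualBlock(dim=4, use_dropout=True)
x = torch.randn(2, 4, 8, 8)
out = block(x)
print(out.shape)   # the branch must preserve the shape so that the addition is valid
```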


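A Network with Multiple Outputs in PyTorch

For a network with several outputs (for example, a perceptual loss built on a pretrained VGG network), we use the following pattern:

from torchvision import models

class Vgg19(torch.nn.Module):
  def __init__(self, requires_grad=False):
    super(Vgg19, self).__init__()
    vgg_pretrained_features = models.vgg19(pretrained=True).features
    # Use the pretrained model provided by the "torchvision" package
    # (newer torchvision releases prefer the weights= argument over pretrained=True)
    self.slice1 = torch.nn.Sequential()
    self.slice2 = torch.nn.Sequential()
    self.slice3 = torch.nn.Sequential()
    # Split the network into three modules, each made of layers from the pretrained model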

    for x in range(7):
        self.slice1.add_module(str(x), vgg_pretrained_features[x])
    for x in range(7, 21):
        self.slice2.add_module(str(x), vgg_pretrained_features[x])
    for x in range(21, 30):
        self.slice3.add_module(str(x), vgg_pretrained_features[x])
    if not requires_grad:
        for param in self.parameters():
            param.requires_grad = False
            # Setting "requires_grad = False" freezes the network weights

  def forward(self, x):
    h_relu1 = self.slice1(x)
    h_relu2 = self.slice2(h_relu1)        
    h_relu3 = self.slice3(h_relu2)        
    out = [h_relu1, h_relu2, h_relu3]
    return out
    # Returns a list with the outputs of the three modules
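
To try the same pattern without downloading pretrained weights, the sketch below slices a small, randomly initialized backbone instead of VGG19. The backbone, the class name MultiScaleFeatures and the slice boundaries are hypothetical stand-ins chosen for illustration:

```python
import torch
import torch.nn as nn


class MultiScaleFeatures(nn.Module):
    """Hypothetical stand-in for Vgg19: three slices, three outputs."""

    def __init__(self, requires_grad=False):
        super().__init__()
        backbone = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Slicing an nn.Sequential returns a new nn.Sequential over the same layers.
        self.slice1 = backbone[:2]
        self.slice2 = backbone[2:5]
        self.slice3 = backbone[5:]
        if not requires_grad:
            for param in self.parameters():
                param.requires_grad = False   # freeze the feature extractor

    def forward(self, x):
        h1 = self.slice1(x)
        h2 = self.slice2(h1)
        h3 = self.slice3(h2)
        return [h1, h2, h3]


features = MultiScaleFeatures()
outputs = features(torch.randn(1, 3, 32, 32))
print([tuple(o.shape) for o in outputs])
```

A perceptual loss would typically compare these intermediate outputs for a generated image and a target image, layer by layer.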

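Custom Loss Functions

Although PyTorch already ships with many standard loss functions, you will sometimes need to write your own. To do so, create a separate "losses.py" file and implement the custom loss by subclassing "nn.Module":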

class CustomLoss(torch.nn.Module):

    def __init__(self):
        super(CustomLoss,self).__init__()

    def forward(self,x,y):
        loss = torch.mean((x - y)**2)
        return loss
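
A short usage sketch for a loss defined this way. The input values are made up for illustration; because the loss is built from differentiable tensor operations, autograd provides the backward pass without any extra code:

```python
import torch
import torch.nn as nn


class CustomLoss(nn.Module):
    """Mean squared error written by hand, as in the listing above."""

    def forward(self, x, y):
        return torch.mean((x - y) ** 2)


criterion = CustomLoss()
pred = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
target = torch.tensor([1.0, 0.0, 3.0])

loss = criterion(pred, target)   # ((0)^2 + (2)^2 + (0)^2) / 3
loss.backward()                  # d(loss)/d(pred) = 2 * (pred - target) / 3

print(loss.item())
print(pred.grad)
```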

The Dataset and DataLoader Classes

# coding=utf-8
import matplotlib as mpl
mpl.use('tkagg')   # backend: 'agg' when debugging, 'tkagg' when running
import matplotlib.pyplot as plt
 
import os
import pandas as pd 
# pandas is a data-analysis library built on top of NumPy
import torch
 
"""
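torch.utils.data.Dataset is an abstract class that represents a dataset.
Your own dataset should subclass ``Dataset`` and override the following methods:
    1. __len__      so that ``len(dataset)`` returns the size of the dataset
    2. __getitem__  so that dataset[i] returns the i-th sample (zero-based)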
"""
from torch.utils.data import Dataset
 
 
"""
DataLoader in torch.utils.data adds the following on top of a Dataset object:
    1. reading data in batches
    2. shuffling the data
    3. loading data in parallel with multiprocessing
    
    The collate_fn argument of DataLoader controls how a list of samples is merged
    into a mini-batch. The default collate_fn works well in most cases.
"""
from torch.utils.data import DataLoader
from torchvision import transforms, utils
from skimage import io, transform
import numpy as np
 
 
def just_see_face_dataset():
    """
    Take a first look at the data.
    :return: 
    """
    landmarks_frame = pd.read_csv('./faces/face_landmarks.csv')
    n = 65
    img_name = landmarks_frame.iloc[n, 0]
    landmarks = landmarks_frame.iloc[n, 1:].to_numpy()    # as_matrix() was removed in pandas 1.0
    landmarks = landmarks.astype('float').reshape(-1, 2)
    print('img_name: {}'.format(img_name))
    print('landmarks shape: {}'.format(landmarks.shape))
    print('first 4 landmarks: {}'.format(landmarks[:4]))
 
    plt.figure()
    show_landmarks(io.imread(os.path.join('faces', img_name)), landmarks)
    plt.show()
 
 
def show_landmarks(image, landmarks):
    """
    Show an image together with its landmarks.
    :param image:
    :param landmarks:
    :return:
    """
    plt.imshow(image)
    plt.scatter(landmarks[:, 0], landmarks[:, 1], s=10, marker='.', c='r')
    plt.pause(0.001)
 
 
class FaceLandmarksDataset(Dataset):
    def __init__(self, csv_file, root_dir, transform=None):
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform
 
    def __len__(self):
        """
        Must be overridden when subclassing Dataset.
        Returns the size of the dataset.
        :return:
        """
        return len(self.landmarks_frame)
 
    def __getitem__(self, idx):
        """
        Must be overridden when subclassing Dataset.
        Returns the idx-th image together with its landmarks.
        :param idx:
        :return:
        """
        img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
        image = io.imread(img_name)
        landmarks = self.landmarks_frame.iloc[idx, 1:].to_numpy()   # as_matrix() was removed in pandas 1.0
        landmarks = landmarks.astype('float').reshape(-1, 2)
        sample = {'image': image, 'landmarks': landmarks}
 
        if self.transform:
            sample = self.transform(sample)
 
        return sample
 
 
def t_dataset():
    """
    Exercise the FaceLandmarksDataset class.
    :return: 
    """
    # Instantiate FaceLandmarksDataset
    face_dataset = FaceLandmarksDataset(csv_file='./faces/face_landmarks.csv', root_dir='./faces')
    fig = plt.figure()
    length_dataset = len(face_dataset)
 
    for i in range(length_dataset):
        # Note: a Dataset object can be indexed directly with [i]
        sample = face_dataset[i]
        print(i, sample['image'].shape, sample['landmarks'].shape)
 
        ax = plt.subplot(1, 4, i + 1)
        plt.tight_layout()
        ax.set_title('sample #{}'.format(i))
        ax.axis('off')
        show_landmarks(sample['image'], sample['landmarks'])
        if i == 3:
            plt.show()
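            break
 
 
"""Transform operations"""
class Rescale(object):
    """Rescale the image in a sample to the given size.
    Args:
        output_size (tuple or int): Desired output size. If a tuple, the output
        matches output_size. If an int, the shorter image edge is matched to
        output_size and the aspect ratio is preserved.
    """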
 
    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        self.output_size = output_size
 
    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
 
        h, w = image.shape[:2]
        if isinstance(self.output_size, int):
            if h > w:
                new_h, new_w = self.output_size * h / w, self.output_size
            else:
                new_h, new_w = self.output_size, self.output_size * w / h
        else:
            new_h, new_w = self.output_size
 
        new_h, new_w = int(new_h), int(new_w)
 
        img = transform.resize(image, (new_h, new_w))
 
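        # For landmarks, w and h are swapped relative to the image shape, because
        # x and y correspond to image axes 1 and 0 respectively
        landmarks = landmarks * [new_w / w, new_h / h]
 
        # The return value is again a sample dict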
        return {'image': img, 'landmarks': landmarks}
 
 
class RandomCrop(object):
    """Randomly crop the image in a sample.
    Args:
        output_size (tuple or int): Desired output size. If an int, a square crop is made.
    """
 
    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        if isinstance(output_size, int):
            self.output_size = (output_size, output_size)
        else:
            assert len(output_size) == 2
            self.output_size = output_size
 
    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
 
        h, w = image.shape[:2]
        new_h, new_w = self.output_size
 
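        # +1 because the upper bound of randint is exclusive; this also avoids a
        # ValueError when the image already has the target size
        top = np.random.randint(0, h - new_h + 1)
        left = np.random.randint(0, w - new_w + 1)
 
        image = image[top: top + new_h,
                      left: left + new_w]
 
        landmarks = landmarks - [left, top]
 
        # The return value is again a sample dict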
        return {'image': image, 'landmarks': landmarks}
 
 
class ToTensor(object):
    """
    Convert the ndarrays in a sample to Tensors.
    """
    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
 
        # Swap axes: NumPy images are H x W x C, torch expects C x H x W
        image = image.transpose((2, 0, 1))
        return {'image': torch.from_numpy(image), 'landmarks': torch.from_numpy(landmarks)}
 
 
def use_transoform(one_sample):
    """
    Show how to use transforms by composing several of them.
    :return: 
    """
    # transforms.Compose chains the two transforms and applies them to the sample in order
    composed = transforms.Compose([Rescale(256), RandomCrop(224)])
    transformed_sample = composed(one_sample)
    plt.figure()
    show_landmarks(transformed_sample['image'], transformed_sample['landmarks'])
    plt.show()
 
 
def union_all_knowledge():
    """
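    Iterate over the whole dataset.
        Each time a sample is requested, (1) the image is read from file and (2) the transforms above are applied to it on the fly. Because RandomCrop is random, this also augments the data.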
    :return: 
    """
    transformed_dataset = FaceLandmarksDataset(csv_file='./faces/face_landmarks.csv', root_dir='./faces',
                                               transform=transforms.Compose([
                                                   Rescale(256),
                                                   RandomCrop(225),
                                                   ToTensor()]))
    for i in range(len(transformed_dataset)):
        sample = transformed_dataset[i]
        print(i, sample['image'].size(), sample['landmarks'].size())
        if i == 3:
            break
 
 
def t_dataloader():
    transformed_dataset = FaceLandmarksDataset(csv_file='./faces/face_landmarks.csv', root_dir='./faces',
                                               transform=transforms.Compose([
                                                   Rescale(256),
                                                   RandomCrop(225),
                                                   ToTensor()]))
    dataloader = DataLoader(transformed_dataset, batch_size=4, shuffle=True, num_workers=2)
 
    # Iterate over the DataLoader to read the data batch by batch
    for i_batch, sample_batched in enumerate(dataloader):
        image_batch, landmarks_batch = sample_batched['image'], sample_batched['landmarks']
        print('i_batch: {}, image_batch.size(): {}, landmarks_batch.size(): {}'.format(
            i_batch, image_batch.size(), landmarks_batch.size()))
 
 
if __name__ == '__main__':
    # just_see_face_dataset()
 
    # t_dataset()
 
    # face_dataset = FaceLandmarksDataset(csv_file='./faces/face_landmarks.csv', root_dir='./faces')
    # one_sample = face_dataset[0]
    # use_transoform(one_sample)
 
    # union_all_knowledge()
 
    t_dataloader()
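
The script above needs the faces dataset on disk. To see the Dataset and DataLoader contract in isolation, the following sketch uses a hypothetical in-memory dataset (SquaresDataset is made up for this example) that also returns dict samples, which the default collate_fn merges key by key:

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i squared)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n                      # enables len(dataset)

    def __getitem__(self, idx):
        x = torch.tensor([float(idx)])     # enables dataset[idx]
        return {'x': x, 'y': x ** 2}


dataset = SquaresDataset(10)
# num_workers=0 loads in the main process, which keeps the example portable.
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)

batch_shapes = []
for i_batch, batch in enumerate(loader):
    batch_shapes.append(tuple(batch['x'].shape))
    print(i_batch, batch['x'].shape, batch['y'].shape)
```

With 10 samples and a batch size of 4, the last batch is smaller than the others; pass drop_last=True to DataLoader if every batch must have the same size.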

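The Best Code Structure for Training a Model

For a well-structured training script, we rely on the following two patterns:

Use BackgroundGenerator from prefetch_generator to load the next batch in the background.

Use tqdm to monitor training progress and display the compute efficiency, which helps to find bottlenecks in the data-loading pipeline.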
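
The "compute efficiency" figure is the share of each iteration spent computing rather than waiting for data. The sketch below illustrates only that measurement, using the standard library: slow_loader and train_step are hypothetical stand-ins that sleep instead of loading or training, and neither tqdm nor BackgroundGenerator is involved.

```python
import time


def slow_loader(num_batches, load_seconds):
    """Hypothetical data source that takes load_seconds to produce each batch."""
    for i in range(num_batches):
        time.sleep(load_seconds)
        yield i


def train_step(batch, compute_seconds):
    """Hypothetical stand-in for the forward pass, backward pass and optimizer step."""
    time.sleep(compute_seconds)


def measure_efficiency(num_batches=5, load_seconds=0.01, compute_seconds=0.03):
    prepare_total = 0.0   # time spent waiting for data
    process_total = 0.0   # time spent computing
    start = time.perf_counter()
    for batch in slow_loader(num_batches, load_seconds):
        loaded = time.perf_counter()
        prepare_total += loaded - start
        train_step(batch, compute_seconds)
        start = time.perf_counter()
        process_total += start - loaded
    return process_total / (process_total + prepare_total)


efficiency = measure_efficiency()
print("Compute efficiency: {:.2f}".format(efficiency))
```

A value close to 1 means the data pipeline keeps up with the computation; a low value points at data loading as the bottleneck, which is where prefetching in the background helps.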

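# import statements
import argparse
import os
import time

import numpy as np
import torch
import torch.nn as nn
from torch.utils import data
from torchvision import datasets, transforms
from tqdm import tqdm
from prefetch_generator import BackgroundGenerator
from tensorboardX import SummaryWriter
...

# set flags / seeds
torch.backends.cudnn.benchmark = True
# cudnn.benchmark = True lets cuDNN auto-tune its convolution algorithms. It usually speeds up training when input sizes stay fixed, at the cost of some non-determinism.
np.random.seed(1)
torch.manual_seed(1)
torch.cuda.manual_seed(1)
...
# Using the same seed on every run makes the generated random numbers repeat; without a seed they differ from run to run.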

# Start with main code
if __name__ == '__main__':
    # argparse for additional flags for experiment
    """
    argparse is Python's standard module for parsing command-line arguments and options.
    Steps:
    1: import argparse                       # import the module
    2: parser = argparse.ArgumentParser()    # create a parser object
    3: parser.add_argument()                 # register each argument or option of interest, one call per argument
    4: parser.parse_args()                   # parse the command line; the values are then available as attributes
    """
    parser = argparse.ArgumentParser(description="Train a network for ...")
    ...
    opt = parser.parse_args() 

    # add code for datasets (we always use train and validation/ test set)
    data_transforms = transforms.Compose([
        transforms.Resize((opt.img_size, opt.img_size)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])

    train_dataset = datasets.ImageFolder(
        root=os.path.join(opt.path_to_data, "train"),
        transform=data_transforms)
    train_data_loader = data.DataLoader(train_dataset, ...)

    test_dataset = datasets.ImageFolder(
        root=os.path.join(opt.path_to_data, "test"),
        transform=data_transforms)
    test_data_loader = data.DataLoader(test_dataset, ...)
    ...

    # instantiate network (which has been imported from *networks.py*)
    net = MyNetwork(...)
    ...

    # create losses (criterion in pytorch)
    criterion_L1 = torch.nn.L1Loss()
    ...

    # if running on GPU and we want to use cuda move model there
    use_cuda = torch.cuda.is_available()
    if use_cuda:
        net = net.cuda()
        ...

    # create optimizers
    optim = torch.optim.Adam(net.parameters(), lr=opt.lr)
    ...

    # load checkpoint if needed/ wanted
    start_n_iter = 0
    start_epoch = 0
    if opt.resume:
        ckpt = load_checkpoint(opt.path_to_checkpoint) # custom method for loading last checkpoint
        net.load_state_dict(ckpt['net'])
        start_epoch = ckpt['epoch']
        start_n_iter = ckpt['n_iter']
        optim.load_state_dict(ckpt['optim'])
        print("last checkpoint restored")
        ...

    # if we want to run experiment on multiple GPUs we move the models there
    net = torch.nn.DataParallel(net)
    ...

    # typically we use tensorboardX to keep track of experiments
    writer = SummaryWriter(...)

    # now we start the main loop
    n_iter = start_n_iter
    for epoch in range(start_epoch, opt.epochs):
        # set models to train mode
        net.train()
        ...

        # use prefetch_generator and tqdm for iterating through data
        pbar = tqdm(enumerate(BackgroundGenerator(train_data_loader, ...)),
                    total=len(train_data_loader))
        start_time = time.time()

        # for loop going through dataset
        for i, data in pbar:
            # data preparation
            img, label = data
            if use_cuda:
                img = img.cuda()
                label = label.cuda()
            ...

            # It's very good practice to keep track of preparation time and computation time using tqdm to find any issues in your dataloader
            prepare_time = time.time() - start_time

            # forward and backward pass
            optim.zero_grad()
            ...
            loss.backward()
            optim.step()
            ...

            # update tensorboardX
            writer.add_scalar(..., n_iter)
            ...

            # compute computation time and *compute_efficiency*
            process_time = time.time() - start_time - prepare_time
            pbar.set_description("Compute efficiency: {:.2f}, epoch: {}/{}:".format(
                process_time/(process_time+prepare_time), epoch, opt.epochs))
            start_time = time.time()

        # maybe do a test pass every x epochs
        if epoch % x == x-1:
            # bring models to evaluation mode
            net.eval()
            ...
            #do some tests
            pbar = tqdm(enumerate(BackgroundGenerator(test_data_loader, ...)),
                    total=len(test_data_loader)) 
            for i, data in pbar:
                ...

            # save checkpoint if needed
            ...
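
The template calls a custom load_checkpoint method and leaves saving as an exercise. One possible pair of helpers is sketched below; the function bodies and the file name last.pth are assumptions of this example, while the dictionary keys ('net', 'optim', 'epoch', 'n_iter') follow the ones the template reads back.

```python
import os
import tempfile

import torch
import torch.nn as nn


def save_checkpoint(path, net, optim, epoch, n_iter):
    """Hypothetical helper: bundle everything needed to resume training."""
    torch.save({
        "net": net.state_dict(),
        "optim": optim.state_dict(),
        "epoch": epoch,
        "n_iter": n_iter,
    }, path)


def load_checkpoint(path):
    """Hypothetical helper: read back the dictionary written by save_checkpoint."""
    return torch.load(path, map_location="cpu")


net = nn.Linear(2, 1)
optim = torch.optim.Adam(net.parameters(), lr=1e-3)

with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt_path = os.path.join(tmp_dir, "last.pth")
    save_checkpoint(ckpt_path, net, optim, epoch=3, n_iter=120)
    ckpt = load_checkpoint(ckpt_path)

# Resume into freshly constructed objects, as the template does.
restored = nn.Linear(2, 1)
restored.load_state_dict(ckpt["net"])
restored_optim = torch.optim.Adam(restored.parameters(), lr=1e-3)
restored_optim.load_state_dict(ckpt["optim"])
start_epoch, start_n_iter = ckpt["epoch"], ckpt["n_iter"]
print(start_epoch, start_n_iter)
```

If the network is wrapped in DataParallel before saving, its state_dict keys gain a "module." prefix, so it is simplest to save and load the unwrapped network, as the template does by loading before wrapping.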

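Things to Keep in Mind When Using PyTorch

Avoid NumPy code in the "forward" method of an "nn.Module". NumPy runs on the CPU, so it is slower than the equivalent torch code on a GPU, and values that pass through NumPy are no longer tracked by autograd. Because torch was designed along the same lines as NumPy, most NumPy functions already have a PyTorch counterpart.
Keep the "DataLoader" separate from the main code. The data-loading workflow should be independent of the main training code. PyTorch uses background worker processes to load data efficiently without disturbing the main training process.
Use command-line arguments. They are a convenient way to set run parameters such as batch size and learning rate. A simple way to keep track of the arguments of an experiment is to write out the Namespace returned by "parse_args" (use vars(opt) if a dict is needed):
# saves arguments to config.txt file
opt = parser.parse_args()
with open("config.txt", "w") as f:
    f.write(opt.__str__())
Where possible, use ".detach()" to free tensors from the computation graph. To implement automatic differentiation, PyTorch tracks every operation that involves tensors requiring gradients. Use ".detach()" to prevent unnecessary operations from being recorded.
Use ".item()" to print scalar tensors. You can print variables directly, but it is recommended to use "variable.detach()" or "variable.item()". In early PyTorch versions (< 0.4) you had to use ".data" to access the tensor held by a variable.
Use the call method on an "nn.Module" instead of calling "forward" directly. The two are not exactly the same: calling the module also runs any registered hooks. Source: https://github.com/IgorSusmelj/pytorch-styleguide
output = self.net.forward(input)
# they are not equal!
output = self.net(input)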
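
One observable difference between the two calls is hook handling. In the sketch below (the linear layer and the recording hook are illustrative choices), a forward hook fires when the module is called but not when forward is invoked directly, even though both return the same values:

```python
import torch
import torch.nn as nn

net = nn.Linear(3, 2)
calls = []

# Record the output shape every time the hook fires.
handle = net.register_forward_hook(
    lambda module, inputs, output: calls.append(tuple(output.shape)))

x = torch.randn(4, 3)
out_call = net(x)             # goes through __call__: the hook runs
out_forward = net.forward(x)  # bypasses __call__: the hook is skipped

handle.remove()
print(calls)                  # only one entry, from net(x)
```

Tools that rely on hooks, such as feature extractors or profilers, therefore miss any computation that is triggered through forward directly.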
