Why does PyTorch nn.Module.cuda() move only parameters and buffers to the GPU, and not the module's member tensors?
Problem description
nn.Module.cuda() moves all model parameters and buffers to the GPU.
But why doesn't it also move a tensor that is stored as a plain member of the model?
import torch

class ToyModule(torch.nn.Module):
    def __init__(self) -> None:
        super(ToyModule, self).__init__()
        self.layer = torch.nn.Linear(2, 2)
        # plain tensor attribute: not registered with the module
        self.expected_moved_cuda_tensor = torch.tensor([0, 2, 3])

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return self.layer(input)

>>> toy_module = ToyModule()
>>> toy_module.cuda()
>>> next(toy_module.layer.parameters()).device
device(type='cuda', index=0)
For the model member tensor, however, the device stays unchanged:
>>> toy_module.expected_moved_cuda_tensor.device
device(type='cpu')
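If you define a tensor inside the module, it needs to be registered as either a parameter or a buffer so that the module is aware of it. Methods such as .cuda() only visit the module's registered parameters, buffers, and submodules; a tensor stored as an ordinary attribute is never touched.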
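One way to see what the module is "aware of" is to list its registered names. The sketch below runs on the CPU; the class name PlainAttr and the attribute name plain are made up for illustration and are not part of the question.

```python
import torch

class PlainAttr(torch.nn.Module):  # hypothetical name, for illustration only
    def __init__(self) -> None:
        super().__init__()
        self.layer = torch.nn.Linear(2, 2)
        # ordinary attribute: stored on the object, but not registered
        self.plain = torch.tensor([0, 2, 3])

module = PlainAttr()

# Collect every name the module tracks as a parameter or a buffer.
registered = [name for name, _ in module.named_parameters()]
registered += [name for name, _ in module.named_buffers()]

print(registered)                       # ['layer.weight', 'layer.bias']
print("plain" in module.state_dict())   # False
```

Because plain shows up in neither list, conversion methods such as .cuda() have no way to find it.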
Parameters are tensors that are to be trained and will be returned by model.parameters(). They are easy to register: all you need to do is wrap the tensor in the nn.Parameter type and assign it as an attribute, and it will be registered automatically. Note that a parameter requires gradients by default, so only floating-point (or complex) tensors can be wrapped this way.
class ToyModule(torch.nn.Module):
    def __init__(self) -> None:
        super(ToyModule, self).__init__()
        self.layer = torch.nn.Linear(2, 2)
        # registering expected_moved_cuda_tensor as a trainable parameter
        self.expected_moved_cuda_tensor = torch.nn.Parameter(torch.tensor([0., 2., 3.]))

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return self.layer(input)
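The data type restriction on parameters can be checked directly. This is a small sketch, assuming a reasonably recent PyTorch release; the variable names are illustrative only. It also shows that an integer tensor is accepted once gradients are switched off, although a buffer is usually the more natural fit for such data.

```python
import torch

# An int64 tensor cannot require gradients, and nn.Parameter defaults to
# requires_grad=True, so this construction is expected to fail.
try:
    torch.nn.Parameter(torch.tensor([0, 2, 3]))
    int_param_ok = True
except RuntimeError:
    int_param_ok = False

# A floating-point tensor works as a trainable parameter.
float_param = torch.nn.Parameter(torch.tensor([0., 2., 3.]))

# An integer tensor is accepted if it is explicitly marked as non-trainable.
frozen_int = torch.nn.Parameter(torch.tensor([0, 2, 3]), requires_grad=False)

print(int_param_ok)              # False
print(float_param.requires_grad) # True
print(frozen_int.dtype)          # torch.int64
```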
Buffers are tensors that are registered in the module, so methods like .cuda() affect them, but they are not returned by model.parameters(). Buffers are not restricted to a particular data type.
class ToyModule(torch.nn.Module):
    def __init__(self) -> None:
        super(ToyModule, self).__init__()
        self.layer = torch.nn.Linear(2, 2)
        # registering expected_moved_cuda_tensor as a buffer
        # Note: this creates a new member variable named expected_moved_cuda_tensor
        self.register_buffer('expected_moved_cuda_tensor', torch.tensor([0, 2, 3]))

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return self.layer(input)
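To make the difference between buffers and parameters concrete, the following sketch lists where a buffer does and does not appear. The class name WithBuffers and the buffer names saved and scratch are assumptions for illustration. The persistent=False argument of register_buffer is not mentioned in the answer above; it is shown here only because it controls whether a buffer is saved in state_dict().

```python
import torch

class WithBuffers(torch.nn.Module):  # hypothetical name, for illustration only
    def __init__(self) -> None:
        super().__init__()
        # integer buffer, saved in state_dict() by default
        self.register_buffer("saved", torch.tensor([0, 2, 3]))
        # non-persistent buffer: still moved by .cuda()/.to(), but not saved
        self.register_buffer("scratch", torch.zeros(3), persistent=False)

m = WithBuffers()
param_names = [name for name, _ in m.named_parameters()]
buffer_names = [name for name, _ in m.named_buffers()]
state_keys = list(m.state_dict().keys())

print(param_names)   # []
print(buffer_names)  # ['saved', 'scratch']
print(state_keys)    # ['saved']
```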
In both of the above cases, the following code behaves the same:
>>> toy_module = ToyModule()
>>> toy_module.cuda()
>>> next(toy_module.layer.parameters()).device
device(type='cuda', index=0)
>>> toy_module.expected_moved_cuda_tensor.device
device(type='cuda', index=0)
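The same registration rule applies to the other conversion methods, which makes it possible to check the behaviour on a machine without a GPU. The sketch below uses .double() as a stand-in for .cuda(); the class name Mixed and its attribute names are hypothetical, and the default dtype is assumed to be float32.

```python
import torch

class Mixed(torch.nn.Module):  # hypothetical name, for illustration only
    def __init__(self) -> None:
        super().__init__()
        self.layer = torch.nn.Linear(2, 2)
        self.register_buffer("registered", torch.tensor([0., 2., 3.]))
        self.plain = torch.tensor([0., 2., 3.])  # ordinary attribute

# .double() casts registered floating-point parameters and buffers only.
m = Mixed().double()

print(m.layer.weight.dtype)  # torch.float64
print(m.registered.dtype)    # torch.float64
print(m.plain.dtype)         # torch.float32 (untouched)
```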
