This series is not a general PyTorch introduction or a set of detailed tutorials. Instead, it is a practical introduction to some common basics needed for implementing hand-written modules.
This is the first section of the series. We introduce some tensor basics, including tensor attributes, tensor creation, and a few other things. The coverage is practical, not exhaustive.
1. Tensor attributes
We introduce 5 key attributes of a tensor `a` here:
1.1 a.shape
`a.shape`: Returns the shape of `a`. The return type is `torch.Size`. Example:

```python
a = torch.randn(10, 20)  # create a 10x20 tensor
a.shape                  # torch.Size([10, 20])
```
The `torch.Size` object supports some tricks:

```python
# unpack
h, w = a.shape

# unpack in function calls
b = torch.zeros(*a.shape)
```
1.2 a.ndim
`a.ndim`: Returns the number of dimensions of `a`. It equals `len(a.shape)`. It also has a function version, `a.ndimension()`.

```python
a.ndim          # 2 for a 10x20 tensor
a.ndimension()  # same as a.ndim
```
1.3 a.device
`a.device`: Returns the device on which `a` is located.

```python
a.device  # e.g., device(type='cpu')
```
Convert to CUDA by using `a = a.to('cuda:0')`. Convert back to CPU by using `a = a.to('cpu')` or `a = a.cpu()`.
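A device-safe sketch of this pattern (it falls back to CPU when no GPU is present; the device name `cuda:0` assumes a machine with at least one GPU):

```python
import torch

a = torch.randn(2, 3)
# use the first GPU if one is available, otherwise stay on CPU
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
a = a.to(device)
# move back to CPU; a.to('cpu') and a.cpu() are equivalent
a = a.cpu()
print(a.device)  # cpu
```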
1.4 a.dtype
`a.dtype`: Returns the data type of `a`. The dtype is very important in PyTorch! Usually it is `torch.float32` or `torch.int64`. Some dtype conversion methods:
```python
a = torch.tensor([1, 2, 3])
a.float()            # to float32
a.long()             # to int64
a.int()              # to int32
a.to(torch.float32)  # Also, we can use .to() as well
```
1.5 a.numel
`a.numel()`: Returns the number of elements in `a`. It is often used to count the number of parameters in a model.

```python
a = torch.randn(10, 20)
a.numel()  # 200
```
For example, counting the parameters of a torchvision model:

```python
import torchvision

model = torchvision.models.resnet18()
num_params = sum(p.numel() for p in model.parameters())
```
2. Tensor creation
PyTorch tensors play a key role in writing deep learning programs. Usually, tensors come from two sources: data and auxiliary variables (e.g., masks).
2.1 From data
Data tensors are usually converted from other packages, such as `numpy`. We have several methods to convert an array `arr` to a `torch.Tensor`:
- `torch.tensor(arr)`: Returns a deep copy of `arr`, i.e., the storage is independent of `arr`. (Very memory- and time-consuming; not recommended for most cases.)
- `torch.from_numpy(arr)`: Returns a shallow-copy tensor, i.e., the storage is shared with `arr`.
- `torch.as_tensor(arr, dtype=..., device=...)`: If `dtype` and `device` are the same as those of `arr`, it behaves like `torch.from_numpy()` (shallow copy); otherwise, it acts like `torch.tensor()` (deep copy). So this function is recommended.
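A small sketch to verify the copy semantics described above: mutating the source array changes the shallow-copy tensor but not the deep-copy one.

```python
import numpy as np
import torch

arr = np.zeros(3)
t1 = torch.tensor(arr)      # deep copy: independent storage
t2 = torch.from_numpy(arr)  # shallow copy: storage shared with arr

arr[0] = 5.0         # mutate the original array
print(t1[0].item())  # 0.0 -- t1 did not change
print(t2[0].item())  # 5.0 -- t2 sees the change
```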
2.2 Special tensors
For special tensors, PyTorch provides some common creation methods:
- Linear: We have `torch.linspace` and `torch.arange`. They are easy to understand; please see the docs for linspace and arange.
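A quick comparison of the two (note that `arange` takes a step, while `linspace` takes a count of points):

```python
import torch

print(torch.arange(0, 10, 2))   # tensor([0, 2, 4, 6, 8])
print(torch.linspace(0, 1, 5))  # tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
```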
- Random:

```python
torch.randn(1, 2)  # normal distribution, shape 1x2
torch.rand(1, 2)   # uniform distribution on [0, 1), shape 1x2
```
These functions also support passing a `torch.Size` or a sequence as the size parameter.

```python
a = torch.randn(10, 10)   # a is in shape 10x10
b = torch.randn(a.shape)  # passing a torch.Size also works
```
- Special tensors:

```python
torch.zeros(10, 10)  # all-zero tensor, shape 10x10
torch.ones(10, 10)   # all-one tensor, shape 10x10
```
xxx_like()
PyTorch has a series of functions of the form `xxx_like()`, such as `ones_like()`, `zeros_like()`, and `randn_like()`. These functions generate the corresponding tensor with the same `dtype`, `device`, and `layout` as the input tensor.
`torch.rand_like(input)` is equivalent to `torch.rand(input.size(), dtype=input.dtype, layout=input.layout, device=input.device)`.
An example:

```python
arr = torch.tensor([1, 2, 3], dtype=torch.float64)
b = torch.zeros_like(arr)  # shape (3,), dtype torch.float64
```