A tensor is a container that can hold N-dimensional data.

Neural networks love numbers. In fact, that’s all they understand. GPUs are great at handling numbers, and they can operate on many of them in parallel. A key idea in machine learning is therefore to group numbers together into a tensor that can be handed over to the GPU.

Arrays and tensors

  • An array is a one-dimensional data structure, and a tensor that has a single dimension is called a rank 1 tensor.
  • A matrix is a two-dimensional data structure, and a tensor that has two dimensions is called a rank 2 tensor.
  • A stack of matrices can be thought of as a three-dimensional data structure, and a tensor that has three dimensions is called a rank 3 tensor.

Enough text, let’s look at some code.

We will start with tensors from PyTorch and look at the available operations. Towards the end of the article we will implement a Tensor class from scratch.

PyTorch

Here is a quick copy pasta on PyTorch.

PyTorch is a fully featured framework for building deep learning models, which is a type of machine learning that’s commonly used in applications like image recognition and language processing. Written in Python, it’s relatively easy for most machine learning developers to learn and use.

Moving on. Let’s get the bread board ready.

import torch

Visit the PyTorch docs for installation instructions.

Creating a tensor

You can create a tensor by passing a Python list to the torch.tensor() function.

torch.tensor([1, 2, 3])

The shape of the input data defines the shape of the tensor.

print(torch.tensor([1, 2, 3]).shape)
print(torch.tensor([[1, 2, 3]]).shape)
print(torch.tensor([[1, 2, 3], [1, 2, 3]]).shape)

# Output
# torch.Size([3])
# torch.Size([1, 3])
# torch.Size([2, 3])
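
To tie shapes back to the ranks described earlier, here is a small sketch. It assumes the .ndim attribute as the way to read a tensor's rank, and the variable names are only for illustration.

```python
import torch

# Rank is the number of dimensions, reported by .ndim (same as len(t.shape)).
vector = torch.tensor([1, 2, 3])                            # rank 1
matrix = torch.tensor([[1, 2, 3], [4, 5, 6]])               # rank 2
stack = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # rank 3

print(vector.ndim, vector.shape)
print(matrix.ndim, matrix.shape)
print(stack.ndim, stack.shape)

# Output
# 1 torch.Size([3])
# 2 torch.Size([2, 3])
# 3 torch.Size([2, 2, 2])
```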

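A more common use case is to create a tensor of a given shape. Two commonly used PyTorch functions for this are:

  • torch.randn(), which fills the tensor with samples from a standard normal distribution
  • torch.zeros(), which fills the tensor with zeros

print(torch.randn(1, 3))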
print(torch.zeros(1, 3))

# Output
# tensor([[ 1.4361,  1.5225, -0.1090]])
# tensor([[0., 0., 0.]])
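
Since torch.randn() gives different values on every run, a seed helps when you want repeatable experiments. The sketch below also shows one way to create a tensor directly on the GPU when one is present. The fallback to "cpu" is an assumption made so the snippet runs on any machine.

```python
import torch

# Seeding makes torch.randn() reproducible from run to run.
torch.manual_seed(0)
a = torch.randn(2, 3)
torch.manual_seed(0)
b = torch.randn(2, 3)
print(torch.equal(a, b))  # True: same seed, same values

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
z = torch.zeros(2, 3, device=device)
print(z.shape, z.dtype, z.device.type)
```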

Scalar operations

We can perform arithmetic operations between a tensor and a number. This works because of broadcasting and the magic of dunder methods we saw in the post on LangChain.

x = torch.tensor([1, 2, 3])
print(x * 3)
print(x + 2)

# Output
# tensor([3, 6, 9])
# tensor([3, 4, 5])
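
To make the two ingredients a bit more concrete, here is a rough sketch. It calls the dunder methods by hand purely for illustration (you would not normally do that), and it uses torch.full_like() to mimic what broadcasting does conceptually.

```python
import torch

x = torch.tensor([1, 2, 3])

# x * 3 is dispatched to the tensor's __mul__ dunder method.
print(torch.equal(x * 3, x.__mul__(3)))   # True

# 3 * x works too, through the reflected method __rmul__.
print(torch.equal(3 * x, x * 3))          # True

# Broadcasting: the scalar behaves as if it were expanded to x's shape.
print(torch.equal(x + 2, x + torch.full_like(x, 2)))  # True
```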

Element-wise operations

We can also perform element-wise operations between tensors of the same shape.

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
print(x + y)
print(x * y)

# Output
# tensor([5, 7, 9])
# tensor([ 4, 10, 18])
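
Shapes do not have to be identical as long as they are compatible under broadcasting. The following sketch, with made-up values, shows one shape pair that works and one that does not.

```python
import torch

x = torch.tensor([1, 2, 3])
m = torch.tensor([[10, 20, 30], [40, 50, 60]])

# Shapes (2, 3) and (3,) are compatible: x is broadcast across both rows.
print(m + x)
# tensor([[11, 22, 33],
#         [41, 52, 63]])

# Shapes (3,) and (2,) are not compatible, so PyTorch raises an error.
try:
    x + torch.tensor([1, 2])
except RuntimeError as err:
    print("shape mismatch:", type(err).__name__)
```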

Matrix operations

For matrix operations like matrix multiplication, PyTorch provides the torch.matmul() function.

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[2, 0], [1, 2]])
print(torch.matmul(a, b))

# Output
# tensor([[ 4,  4],
#         [10,  8]])
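
It is easy to mix up matrix multiplication with element-wise multiplication, so here is a short comparison using the same a and b. The last line uses throwaway zero tensors just to show the shape rule.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[2, 0], [1, 2]])

# The @ operator is shorthand for torch.matmul().
print(torch.equal(a @ b, torch.matmul(a, b)))  # True

# By contrast, * multiplies matching elements and is not a matrix product.
print(a * b)
# tensor([[2, 0],
#         [3, 8]])

# Inner dimensions must agree: (2, 3) @ (3, 4) gives a (2, 4) result.
print((torch.zeros(2, 3) @ torch.zeros(3, 4)).shape)
# torch.Size([2, 4])
```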

Reshaping tensors

In PyTorch, we can reshape tensors using various methods. Let’s explore some of them.

➡️ Reshape

The reshape() method, also available as the torch.reshape() function, allows us to change the shape of a tensor to another compatible shape. The total number of elements must be the same in both shapes.

x = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(torch.reshape(x, (3, 2)))

# Output
# tensor([[1, 2],
#         [3, 4],
#         [5, 6]])
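
Two things worth trying yourself are letting PyTorch infer one dimension with -1 and seeing what happens when the element counts do not match. A minimal sketch:

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]])

# -1 asks PyTorch to infer that dimension from the element count.
print(x.reshape(3, -1).shape)  # torch.Size([3, 2])
print(x.reshape(-1))           # tensor([1, 2, 3, 4, 5, 6])

# 6 elements cannot fill a (4, 2) shape, so this fails.
try:
    x.reshape(4, 2)
except RuntimeError:
    print("incompatible shape")
```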

➡️ Using [:, None]

Indexing with [:, None] keeps the existing dimension and adds a new dimension of size 1 after it. This is useful when we want to convert a 1D tensor into a 2D column tensor. It is similar to adding a new axis in NumPy.

x = torch.tensor([1, 2, 3, 4, 5])
print(x[:, None])

# Output:
# tensor([[1],
#         [2],
#         [3],
#         [4],
#         [5]])
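
The position of None decides where the new dimension goes. As a quick sketch, the snippet below compares the column form with the row form and checks the column against a plain reshape.

```python
import torch

x = torch.tensor([1, 2, 3, 4, 5])

col = x[:, None]   # shape (5, 1): a column
row = x[None, :]   # shape (1, 5): a row
print(col.shape, row.shape)
# torch.Size([5, 1]) torch.Size([1, 5])

# The column form holds the same values as reshaping to (5, 1).
print(torch.equal(col, x.reshape(5, 1)))  # True
```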

➡️ Squeeze

We can use squeeze to reduce the number of dimensions in a tensor. Called without arguments, it removes every dimension of size 1.

x = torch.tensor([[[1, 2, 3]]])
print(x.squeeze())

# Output:
# tensor([1, 2, 3])
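
Removing every size-1 dimension is sometimes more than you want. squeeze() also accepts a dimension index, as this small sketch shows:

```python
import torch

x = torch.tensor([[[1, 2, 3]]])   # shape (1, 1, 3)

# No argument: every size-1 dimension is removed.
print(x.squeeze().shape)    # torch.Size([3])

# With an index: only that dimension is removed, if its size is 1.
print(x.squeeze(0).shape)   # torch.Size([1, 3])

# Dimension 2 has size 3, so the tensor is returned unchanged.
print(x.squeeze(2).shape)   # torch.Size([1, 1, 3])
```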

➡️ Unsqueeze

We can use unsqueeze to add a new dimension. This is useful when you want to add a dimension of size 1 at a specific position.

x = torch.tensor([1, 2, 3])

# Adds a dimension of size 1 at position 0
print(x.unsqueeze(0))  

# Output:
# tensor([[1, 2, 3]])

You can also use negative indexing to unsqueeze at the end:

print(x.unsqueeze(-1))  # Adds a dimension of size 1 at the end

# Output:
# tensor([[1],
#         [2],
#         [3]])
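
One common reason to reach for unsqueeze is adding a leading batch dimension before handing a single sample to a model. The sketch below only checks shapes and the round trip back; the name batch is just illustrative.

```python
import torch

x = torch.tensor([1, 2, 3])

# Add a leading dimension of size 1, e.g. to form a batch of one sample.
batch = x.unsqueeze(0)
print(batch.shape)          # torch.Size([1, 3])

# squeeze(0) undoes it and gives back the original tensor.
print(torch.equal(batch.squeeze(0), x))  # True
```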

The operations we have seen above should cover most of the things you run into. You can try out this Kaggle notebook to experiment with these operations yourself.