PyTorch is an open-source machine learning library for Python. It provides a simple and intuitive programming environment for developing Deep Learning models.
The use of PyTorch is becoming increasingly important for developers because it has proven to be a very effective tool for artificial intelligence and machine learning development. It offers easy integration with other applications and tools, flexible data processing and strong GPU acceleration. Because of its ease of use and high performance, it has become one of the most widely used libraries in deep learning model development. PyTorch’s features can be summarised as follows:
- Easy to learn and use: PyTorch uses a simple and intuitive programming environment that allows developers to quickly start developing deep learning models.
- Flexible Data Processing: PyTorch provides flexible data processing that allows developers to process data in the formats that best suit their applications.
- GPU Acceleration: PyTorch harnesses the power of modern GPUs for fast processing of data and development of complex deep learning models.
- Integrability: PyTorch integrates easily with other applications and tools, allowing developers to integrate their deep learning models into existing workflows.
- Active Community: PyTorch has an active community of developers who are constantly developing new features and enhancements, which constantly improves the performance and flexibility of the library.
- Dynamic Computational Graph: PyTorch provides a dynamic computational graph that allows developers to change and improve their models as they run.
- Transfer Learning Support: PyTorch provides easy support for transfer learning, allowing developers to use and improve existing deep learning models for new applications.
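The dynamic computational graph can be illustrated with a minimal autograd sketch (a generic example, not taken from the original text): operations are recorded as they execute, and gradients can be computed immediately afterwards.

```python
import torch

# The graph is built on the fly as operations execute
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x   # recorded dynamically, no static graph compilation
y.backward()         # backpropagate through the recorded graph
print(x.grad)        # dy/dx = 2x + 3 = 7 at x = 2
```

Because the graph is rebuilt on every forward pass, ordinary Python control flow (loops, conditionals) can change the model's structure from one run to the next.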
In 2023, PyTorch is the gold standard in machine learning and one of the two big frameworks in the field, alongside TensorFlow.
From Facebook to number 1
PyTorch was developed by Facebook and first released in 2017. Its development began as a replacement for Torch, the machine learning framework used at the time. The original Torch code was written in C and CUDA, with an SDK based on LuaJIT serving as the interface. In practice, however, there were some hurdles in using Torch, especially when integrating it with other existing tools and libraries.
To address these issues, Facebook’s developers decided to write a new framework that built on the best features of Torch while taking advantage of Python. The result was PyTorch, an open-source machine learning library that provides a simple programming environment and relatively flexible data processing. Hot Stuff! 🤓
Jupyter Notebooks
Before jumping in with a PyTorch example, we need to understand what PyTorch is best used with, i.e. on which platform or machine the deep learning model is written, modelled, trained and evaluated. Of course, one could simply install PyTorch on one's computer and use existing Python-enabled IDEs such as JetBrains' PyCharm or DataSpell, or even Microsoft's VS Code.
What all these options have in common is their integrated support for Jupyter Notebooks. For those who read our TechUps regularly but have had little contact with Data Science or Machine Learning so far, the term “Jupyter Notebook” might be new. At this point, it should be said that Jupyter Notebooks are the heart and basic tool of any Data Scientist or Data Engineer.
Jupyter Notebook is a web-based, interactive environment for creating documents that contain live code, descriptive prose, visualisations and other multimedia content. This allows the Data Scientist to write code, run it and see the results immediately. This makes it easy to use and keeps iteration cycles short. In addition, Jupyter Notebook, precisely because it is web-based, enables collaboration by allowing users to share documents and work on them together.
Although Jupyter supports a variety of programming languages, the lingua franca is Python. Also common are languages such as Julia, GNU Octave or R. Thus, Jupyter Notebook is the favoured environment to run Machine Learning with PyTorch.
As a web-based environment, Jupyter is based on a client-server architecture. You can install it on your local machine or embed it in a well-designed cloud architecture. For this TechUp, we’ll make it a little easier and use a free cloud-based platform, namely Google Colab.
Jupyter in the Cloud with Google Colab
Google Colab is a free online platform that allows users to run Jupyter notebooks in the cloud. It is an easy-to-use platform that provides users with easy access to all popular machine learning libraries and frameworks. Colab also offers GPU support that allows users to quickly and efficiently build and train deep-learning models.
To create deep learning models with Colab, you first need to create a new notebook and install the required libraries and frameworks. Colab comes with many libraries and frameworks pre-installed, so there is usually no need to install them manually. You can then write code, import data, and create and train models.
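A typical first cell in a Colab notebook checks whether the GPU runtime is actually active (a sketch; the GPU is enabled via Colab's Runtime → Change runtime type menu):

```python
import torch

# "cuda" when a GPU runtime is selected in Colab, otherwise "cpu"
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
```

Models and tensors can then be moved to that device with `.to(device)` so the same notebook runs with or without GPU acceleration.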
PyTorch Kickstart
Here is a simple example of Deep Learning with PyTorch.
1. Importing Dependencies: First, we import the PyTorch modules we need.

```python
import torch
import torch.nn as nn
import torch.optim as optim
```

2. Model Definition: Next, we define our model using PyTorch. Here we use a simple model with a single linear layer.

```python
class LinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)

model = LinearRegression()
```

3. Prediction: Finally, we can use our model to make predictions on new data.

```python
x_test = torch.tensor([[4.0, 4.0]])
y_test = model(x_test)
print(y_test)
```
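In practice, a training step belongs between model definition and prediction, otherwise the model predicts with random initial weights. Here is a minimal sketch with made-up training data for y = x1 + x2; the data, loss function and optimiser choices are our own assumptions, not part of the original example.

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)  # stands in for the LinearRegression model above

# Hypothetical training data for the target function y = x1 + x2
x_train = torch.tensor([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y_train = torch.tensor([[2.0], [4.0], [6.0]])

loss_fn = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.1)

for _ in range(500):
    optimizer.zero_grad()                     # reset the gradients
    loss = loss_fn(model(x_train), y_train)   # forward pass and loss
    loss.backward()                           # backward pass
    optimizer.step()                          # update the parameters
```

After training, `model(torch.tensor([[4.0, 4.0]]))` returns a value near 8, rather than the arbitrary output of an untrained model.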
What happens here can be summarised in the following steps:

- First, the required libraries are imported, including `torch`, `torch.nn`, `torch.optim`, `MNIST` and `ToTensor`.
- The training and test datasets are loaded and stored in `train_data` and `test_data`. `ToTensor` is used to convert the images into tensors.
- The model `MNISTModel` is defined. It consists of three linear layers with ReLU activations between them. The first layer has 784 inputs (28 × 28 images), the second layer has 64 neurons, and the third layer has 32 neurons and returns an output of 10 classes.
- The model is initialised, and the Adam optimiser and the cross-entropy loss function are defined.
- The function `train` is defined to train the model on the training dataset. `model.train()` is called to put the model into training mode. The data and targets are loaded by `train_loader`. The optimiser's gradients are reset to zero (`optimizer.zero_grad()`), forward propagation is performed (`output = model(data)`), the loss is calculated (`loss = loss_fn(output, target)`) and backward propagation is performed (`loss.backward()`). Finally, the optimiser takes a step (`optimizer.step()`).
- The function `test` is defined to test the model on the test dataset. `model.eval()` is called to put the model into evaluation mode. The test loss and accuracy are calculated by comparing the output of the model to the target (`pred.eq(target.view_as(pred)).sum().item()`).
- `train_loader` and `test_loader` are defined to load the data in batches.
- The model is trained for 10 epochs. In each epoch, the model is tested on the test dataset and the test loss and accuracy are output. The `train` and `test` functions are used to train and test the model on the training and test datasets.
Conclusion
Engaging with PyTorch, Jupyter Notebooks and Deep Learning is extremely future-proof. PyTorch is a powerful library for Deep Learning and is used by many well-known companies and research institutions. It plays a crucial role in the machine learning ecosystem and is expected to continue to play a significant role in the future.
Jupyter Notebooks are also an essential tool in the machine learning space and are used by many researchers and data scientists to document, share and make their work reproducible. They provide an effective way to combine code, text and visual representations into a single document.
Deep Learning is a rapidly growing discipline that has applications in numerous industries and use cases. The demand for professionals with Deep Learning skills is high and is expected to increase in the future. Therefore, exposure to PyTorch, Jupyter Notebooks and Deep Learning will continue to be extremely relevant and future-proof.
Resources and further links
What is torch.nn really? - PyTorch Tutorials 1.13.1+cu117 documentation
MNIST handwritten digit database, Yann LeCun, Corinna Cortes and Chris Burges
Top 15 Machine Learning Libraries in 2023
Top Machine Learning Trends for 2023
Torch - Meta Research