Intro to Google Colab - A place where magic happens
Kickstart your machine learning or data analytics journey without setting up a local development environment
As per the official documentation:
Colab, or "Colaboratory", allows you to write and execute Python in your browser, with
- Zero configuration required
- Access to GPUs free of charge
- Easy sharing
Colab is a hosted, Jupyter-based Python notebook service that gives free access to computing resources, including GPUs and TPUs. Sharing with collaborators is quick and straightforward, with no downloads and no local Python environment to create.
Numerous scientific and machine learning packages, including NumPy, SciPy, pandas, TensorFlow, and PyTorch, are pre-installed in the Colab environment.
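A quick way to see which of these packages a given runtime actually ships is to query the installed distributions. The snippet below is a minimal sketch using the standard library's importlib.metadata. The helper name installed_versions is hypothetical, and the distribution names (PyTorch is published as "torch") are assumptions about how the packages are named on PyPI.

```python
from importlib import metadata

# Distribution names as published on PyPI; PyTorch is distributed as "torch".
PACKAGES = ["numpy", "scipy", "pandas", "tensorflow", "torch"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so missing ones are easy to spot.
for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

The versions printed will differ between runtimes, so treat the output as a snapshot of the current session rather than a fixed list.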
Getting Started
Getting started with Google Colab is about as straightforward as it gets for a developer: all you need is a Google account. There are multiple ways to open Colab; I prefer going through Google Drive, which makes it easy to share notebooks with collaborators.
Installing the Google Colab App
Log in to any of your Google Drive accounts.
Click on New --> More --> Connect more apps --> Search for "Colaboratory" in the Google Workspace Marketplace
Select Google Colaboratory --> Click "Install" --> Approve the permission to install the app --> Select the Google account for which it should be installed.
Voila, Google Colaboratory is installed. We are ready to go!
Creating the First Notebook via Google Colaboratory
Go to New --> More --> Click on Google Colaboratory
This opens an interactive, IDE-like editor. We will just type a print statement and then press Ctrl + Enter to execute the currently selected cell. (If you don't know the shortcuts, open the Runtime menu and select the required option.)
On running, it connects to a Python 3 runtime on GCE (Google Compute Engine) to execute the statement provided. That's it, we have successfully executed our first notebook within Google Colaboratory. The notebook is saved automatically too.
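As a minimal sketch of what that first cell could contain, the snippet below prints a greeting and the Python version of the connected runtime. The greeting helper and its message are purely illustrative and not taken from the original notebook.

```python
import platform


def greeting(name="Colab"):
    """Build the message to print; the default name is only an illustration."""
    return f"Hello, {name}!"


print(greeting())
# Show which Python version the connected runtime is using.
print("Python", platform.python_version())
```

Pressing Ctrl + Enter with this cell selected runs it and shows the output directly below the cell.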
As mentioned earlier, a lot of scientific and machine learning modules and packages are available out of the box within Google Colab. We will try out the built-in pandas module.
Trying pandas for Fun
Being a football fan, and an avid Liverpool fan specifically, I downloaded a sample CSV file with data for the 2018-2019 English Premier League season from FootyStats to load with pandas.
We will upload the CSV file to Google Drive, next to the notebook file, so that the notebook can read it. To read a CSV uploaded to your drive, mount the drive and grant the required permission using the commands below
from google.colab import drive
drive.mount('/content/drive')
This mounts the drive; once it is mounted successfully, it shows up in the file browser in the left-side menu.
Right-click the file, copy the path to the CSV, load the file into a pandas DataFrame, and execute the cell. Voila, we are able to load a DataFrame using pandas.
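The loading step can be sketched roughly as follows. This is an illustration under stated assumptions, not the original notebook code: the file name epl_2018_2019.csv is hypothetical, the helpers drive_path and load_csv are made up for this example, and the "MyDrive" folder is assumed to be where the Drive root appears under the /content/drive mount point.

```python
from pathlib import PurePosixPath

# Assumption: the drive is mounted at /content/drive (as in the commands above)
# and files from the Drive root appear under the "MyDrive" folder.
MOUNT_POINT = PurePosixPath("/content/drive")


def drive_path(*parts):
    """Build the path of a file inside the mounted drive."""
    return str(MOUNT_POINT.joinpath("MyDrive", *parts))


def load_csv(path):
    """Read a CSV file into a pandas DataFrame."""
    import pandas as pd  # imported here so the path helper works without pandas

    return pd.read_csv(path)


# Hypothetical file name; replace it with the path copied from the file browser.
csv_path = drive_path("epl_2018_2019.csv")
print(csv_path)

# In a notebook, once the drive is mounted:
# df = load_csv(csv_path)
# df.head()
```

Copying the path from the file browser avoids typos; building it in code, as here, is mainly useful when the same notebook reads several files from one folder.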
Installing Third-Party Python Modules
Recently I have been working on text-to-text generation AI and exploring different models. One that I came across was Parrot Paraphraser.
Before trying to fine-tune the model further, I decided to give it a try, but I didn't want to set up an environment, so I used Google Colab.
As mentioned, it's a third-party module, but installing it in Google Colab is as simple as using pip, the Python package installer. We install it on Google Colab by executing the command below in a code cell
!pip install git+https://github.com/PrithivirajDamodaran/Parrot_Paraphraser.git
Import the Parrot module, which pulls in all the required models and other dependencies.
Then, we just pass the input phrase to the Parrot model and get new paraphrases back from the ML model. Voila, so easy to get started.
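A rough sketch of those two steps is shown below. The Parrot constructor arguments (model_tag, use_gpu), the model tag, and the augment(input_phrase=...) call are assumptions based on the project's README and may differ between versions; the helpers build_parrot and paraphrase_all are hypothetical names introduced only for this example.

```python
def build_parrot(use_gpu=False):
    """Create the paraphraser; the model is downloaded on first use."""
    # Imported lazily so the helper below can be used without the package.
    from parrot import Parrot

    # Assumption: model tag as listed in the project's README.
    return Parrot(model_tag="prithivida/parrot_paraphraser_on_T5", use_gpu=use_gpu)


def paraphrase_all(paraphraser, phrases):
    """Map each input phrase to the list of paraphrases returned for it."""
    results = {}
    for phrase in phrases:
        candidates = paraphraser.augment(input_phrase=phrase)
        # No candidates may come back for some phrases; store an empty list then.
        results[phrase] = list(candidates) if candidates else []
    return results


# In a notebook, after the pip install above:
# parrot = build_parrot()
# for phrase, options in paraphrase_all(parrot, ["What is the capital of France?"]).items():
#     print(phrase, options)
```

Keeping the model construction separate from the loop means the model is loaded once per session and reused for every phrase.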
The notebooks created for this blog post are available here
Alternatives to Google Colab
- Azure Notebooks
- Kaggle
- Amazon SageMaker
- IBM Data Platform Notebooks
- Jupyter Notebooks
Useful Links
Thank you for reading. If you have made it this far, please like the article; it will encourage me to write more like it. Do share your suggestions; I appreciate your honest feedback!