Load Face Mask Data on Colab

This article is the first part of the face mask detection project, which is closely related to face recognition. Here we load the data needed to build the model. Training a neural network on a personal computer requires a lot of processing power, so I recommend using Google Colab. So, let’s start our project.

Download Data

The face mask dataset is available on Kaggle. Loading data from Kaggle is easy once you have your own API token. To create one, go to your Kaggle profile and look for the option to create an API token. This downloads a JSON file (kaggle.json). Save it in a safe place, because it is what gives you access to Kaggle data. The same token is also useful for other datasets, such as face recognition datasets.
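
Before uploading the token, it can help to confirm that the file is intact. The snippet below is a minimal sketch, not part of the original project: the helper name check_kaggle_token is hypothetical, and it only assumes that the token is a JSON object with "username" and "key" fields.

```python
import json
import tempfile
from pathlib import Path


def check_kaggle_token(path):
    """Return True if the file looks like a Kaggle API token (hypothetical helper)."""
    token_path = Path(path)
    if not token_path.is_file():
        return False
    try:
        token = json.loads(token_path.read_text())
    except json.JSONDecodeError:
        return False
    # Assumption: a valid token is a JSON object with "username" and "key".
    return isinstance(token, dict) and {"username", "key"}.issubset(token)


# Demo with a throwaway file; in Colab you would pass "kaggle.json" instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "kaggle.json"
    demo_path.write_text(json.dumps({"username": "demo_user", "key": "demo_key"}))
    token_ok = check_kaggle_token(demo_path)
    missing_ok = check_kaggle_token(Path(tmp) / "missing.json")

print(token_ok, missing_ok)
```

If the check fails, creating a fresh token from your Kaggle profile is usually the quickest fix.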

Once your API token is ready, load the data with the following lines of code. When you run them, you will see a Choose File option. Upload your API JSON file there and wait until the data has downloaded.

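from google.colab import files

# Opens the "Choose File" widget; upload kaggle.json here
files.upload()

# Put the token where the Kaggle CLI expects it and restrict its permissions
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle datasets download -d wobotintelligence/face-mask-detection-dataset
!rm kaggle.json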

The downloaded file is in zip format, so it needs to be unzipped. Use the following command. Unzipping may take 5 to 10 seconds to complete.

!unzip face-mask-detection-dataset.zip

In the Files section there is a sample_data folder provided by Google. If you want to remove it and keep only the face mask data, run the following command. This step is optional.

!rm -rf sample_data

On the left side of Colab, you can see the file icon. Go there and look through the whole dataset. Now that we have the dataset, let’s move on to the next steps.
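
The dataset can also be inspected from code instead of the file browser. The following is a rough sketch under my own assumptions: summarize_extensions is a hypothetical helper, and the demo builds a tiny throwaway folder so that it runs anywhere. In Colab you could point it at the extracted dataset folder instead.

```python
import os
import tempfile
from collections import Counter


def summarize_extensions(root):
    """Count files under root, grouped by lowercase file extension."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            counts[ext] += 1
    return counts


# Demo on a small throwaway folder that mimics an images/annotations layout.
with tempfile.TemporaryDirectory() as tmp:
    demo_files = [
        ("images", "0001.jpg"),
        ("images", "0002.png"),
        ("annotations", "0001.jpg.json"),
    ]
    for folder, name in demo_files:
        os.makedirs(os.path.join(tmp, folder), exist_ok=True)
        open(os.path.join(tmp, folder, name), "w").close()
    demo_counts = summarize_extensions(tmp)

print(dict(demo_counts))
```

A quick count like this makes it easy to notice if the unzip step was interrupted.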

Import Libraries

Next, we import the following libraries for building the model.

import numpy as np
import matplotlib.pyplot as plt
import os
import json
import pprint as pp
import pandas as pd
import cv2
import random

I assume you are familiar with these libraries. Matplotlib is used to visualize the data, pandas to work with data frames, pprint to print JSON data in a readable form, and cv2 (OpenCV) to read and write images.

Load Data

In the dataset folder, the images and their annotations are stored in different places, so we need to bring them together. Let’s first load the annotation files with the following code. They are in JSON format and contain the most important information, such as the file name and class type.

BASE_DIR = 'Medical mask/Medical mask/Medical Mask'
IMAGE_DIR = os.path.join(BASE_DIR, 'images')
ANNOTATIONS_DIR = os.path.join(BASE_DIR, 'annotations')
for file in os.listdir(ANNOTATIONS_DIR):
  with open(os.path.join(ANNOTATIONS_DIR, file), 'r') as f:
    data = json.load(f)
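
Note that the loop above keeps only the last annotation in data, because the variable is overwritten on every pass. If you want to keep all of them, one option is to collect them in a dictionary. This is only a sketch: load_annotations is a hypothetical helper, and the "FileName" and "classname" fields in the demo files are placeholders, not a confirmed description of the dataset's JSON layout.

```python
import json
import os
import tempfile


def load_annotations(annotations_dir):
    """Read every .json file in annotations_dir into a dict keyed by file name."""
    annotations = {}
    for file_name in sorted(os.listdir(annotations_dir)):
        if not file_name.endswith(".json"):
            continue  # skip anything that is not an annotation file
        with open(os.path.join(annotations_dir, file_name), "r") as f:
            annotations[file_name] = json.load(f)
    return annotations


# Demo with two made-up annotation files (field names are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "0001.jpg.json": {"FileName": "0001.jpg", "classname": "face_with_mask"},
        "0002.jpg.json": {"FileName": "0002.jpg", "classname": "face_no_mask"},
    }
    for name, content in samples.items():
        with open(os.path.join(tmp, name), "w") as f:
            json.dump(content, f)
    demo_annotations = load_annotations(tmp)

print(len(demo_annotations))
```

In the notebook you would call load_annotations(ANNOTATIONS_DIR) and pretty-print a single entry with pp.pprint to see its structure.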

The dataset folder also contains a train.csv file that maps each image to its class label. Load this file to identify the images and their corresponding labels.

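
The loading code for this step is not shown here, so the following is only a sketch of one way to read such a file. It uses the standard csv module to stay dependency-free (pandas' pd.read_csv would work as well). The column names name, x1, x2, y1, y2 and classname are assumptions, and read_label_rows is a hypothetical helper.

```python
import csv
import os
import tempfile


def read_label_rows(csv_path):
    """Return (header, rows) from a CSV file, keeping every value as a string."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)                 # first line holds the column names
        rows = [row for row in reader if row]  # skip empty lines
    return header, rows


# Demo with a tiny stand-in for train.csv (column names are assumed).
with tempfile.TemporaryDirectory() as tmp:
    demo_csv = os.path.join(tmp, "train.csv")
    with open(demo_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "x1", "x2", "y1", "y2", "classname"])
        writer.writerow(["0001.jpg", "10", "20", "110", "120", "face_with_mask"])
        writer.writerow(["0002.jpg", "5", "8", "60", "70", "face_no_mask"])
    header, rows = read_label_rows(demo_csv)

print(header)
print(rows[0])
```

Each row keeps the image name first and the box coordinates next, which matches how the later snippets index into face[0] through face[4].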

After that, separate the data and labels into two groups: faces with a mask and faces without a mask.
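
The code for this step is also not shown, so here is a hedged sketch of what the separation could look like. The class names "face_with_mask" and "face_no_mask", the row layout (image name, four coordinates, class name) and the helper split_by_class are all assumptions. Only the list names faces_with_mask and faces_without_mask come from the code used later in this article.

```python
WITH_MASK = "face_with_mask"    # assumed class name
WITHOUT_MASK = "face_no_mask"   # assumed class name


def split_by_class(rows, class_index=-1):
    """Split label rows into (with mask, without mask); other classes are ignored."""
    with_mask = [row for row in rows if row[class_index] == WITH_MASK]
    without_mask = [row for row in rows if row[class_index] == WITHOUT_MASK]
    return with_mask, without_mask


# Demo rows: image name, four box coordinates, class name (all made up).
demo_rows = [
    ["0001.jpg", "10", "20", "110", "120", "face_with_mask"],
    ["0002.jpg", "5", "8", "60", "70", "face_no_mask"],
    ["0003.jpg", "0", "0", "50", "50", "hat"],
    ["0004.jpg", "15", "25", "90", "95", "face_with_mask"],
]
faces_with_mask, faces_without_mask = split_by_class(demo_rows)

print(len(faces_with_mask), len(faces_without_mask))
```

Rows with any other class name are simply dropped, which keeps the problem binary.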


Using Matplotlib, plot sample images from the two classes with the following lines of code. The same approach is also used in the face recognition project.

fig, ax = plt.subplots(2, 5, figsize=(20, 20))

for i, face in enumerate(faces_with_mask[:5]):
  image = cv2.cvtColor(cv2.imread(os.path.join(IMAGE_DIR, face[0])), cv2.COLOR_BGR2RGB)
  start = int(face[1]), int(face[2])
  end = int(face[3]), int(face[4])
  image = image[start[1]:end[1], start[0]:end[0]]
  image = cv2.resize(image, (100, 100))
  ax[0, i].imshow(image)

for i, face in enumerate(faces_without_mask[:5]):
  image = cv2.cvtColor(cv2.imread(os.path.join(IMAGE_DIR, face[0])), cv2.COLOR_BGR2RGB)
  start = int(face[1]), int(face[2])
  end = int(face[3]), int(face[4])
  image = image[start[1]:end[1], start[0]:end[0]]
  image = cv2.resize(image, (100, 100))
  ax[1, i].imshow(image)

Here we load each image and its corresponding class into lists. The classes are represented in binary form: 0 for a face without a mask and 1 for a face with a mask. We have a total of 5749 samples and their labels.

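X = []
y = []

# enumerate assigns label 0 to faces_without_mask and label 1 to faces_with_mask
for i, cat in enumerate((faces_without_mask, faces_with_mask)):
  for f in cat:
    image = cv2.cvtColor(cv2.imread(os.path.join(IMAGE_DIR, f[0])), cv2.COLOR_BGR2RGB)
    start = int(f[1]), int(f[2])
    end = int(f[3]), int(f[4])
    image = image[start[1]:end[1], start[0]:end[0]]
    image = cv2.resize(image, (100, 100))
    X.append(image)  # cropped and resized face
    y.append(i)      # class label for this face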

After the data is loaded successfully, split it into training and test sets using the train_test_split function from sklearn's model_selection module.

import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.applications.vgg16 import preprocess_input

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)

print(len(X_train), len(X_test))
X_train = np.array(X_train, dtype='float32')
y_train = np.array(y_train)
X_test = np.array(X_test, dtype='float32')
y_test = np.array(y_test)
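
To illustrate what a 90/10 split does, here is a small pure-Python stand-in. It is not how train_test_split is implemented, and shuffle_split is a hypothetical helper written only for this example. It shuffles the indices with a fixed seed, so the same split comes back on every run and each item stays paired with its label.

```python
import random


def shuffle_split(items, labels, test_size=0.1, seed=42):
    """Shuffle items and labels together, then cut off a test portion."""
    if len(items) != len(labels):
        raise ValueError("items and labels must have the same length")
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # fixed seed makes the split repeatable
    n_test = max(1, round(len(items) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_tr = [items[i] for i in train_idx]
    X_te = [items[i] for i in test_idx]
    y_tr = [labels[i] for i in train_idx]
    y_te = [labels[i] for i in test_idx]
    return X_tr, X_te, y_tr, y_te


# Demo: 20 fake samples whose label is simply "is the sample odd?"
items = list(range(20))
labels = [i % 2 for i in items]
X_tr, X_te, y_tr, y_te = shuffle_split(items, labels, test_size=0.1, seed=42)

print(len(X_tr), len(X_te))
```

With scikit-learn itself, passing random_state to train_test_split gives the same kind of repeatability, and stratify=y keeps the class ratio the same in both sets.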


This article is the first part of the face mask detection project. If you have any questions regarding this project, please feel free to comment below or contact us. The whole project file is available in the last article of this series. I hope you gained some knowledge from this article. If you are looking for a free Python course, here is a link you can visit.
