Computer Vision

Q1. Object Detection System

Question:

Write a Python function using OpenCV and a pre-trained deep learning model to detect objects in an image. The function should take an image path as input, use a MobileNet SSD (Single Shot Detector) model pre-trained on the COCO dataset, and return the image with bounding boxes around detected objects, including labels and confidence scores.

Solution:

import cv2
import numpy as np

def detect_objects(image_path):
    # Load the pre-trained MobileNet SSD model (Caffe format) from disk
    net = cv2.dnn.readNetFromCaffe('MobileNetSSD_deploy.prototxt.txt', 'MobileNetSSD_deploy.caffemodel')
    # Class labels in the index order the model outputs them (index 0 is background)
    classes = ["background", "aeroplane", "bicycle", "bird", "boat",
               "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
               "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
               "sofa", "train", "tvmonitor"]

    # Load the image; cv2.imread returns None when the file cannot be read
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError("Could not read image: {}".format(image_path))
    (h, w) = image.shape[:2]

    # Build a 300x300 input blob, normalizing pixels as (pixel - 127.5) * 0.007843
    blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 0.007843, (300, 300), 127.5)

    # Pass the blob through the network and obtain the detections
    net.setInput(blob)
    detections = net.forward()

    # Loop over the detections
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])

        # Filter out weak detections by ensuring the confidence is greater than a minimum threshold
        if confidence > 0.2:
            # Extract the index of the class label from the detections
            idx = int(detections[0, 0, i, 1])

            # Scale the normalized box to pixel coordinates (plain ints for the drawing calls)
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = [int(v) for v in box]

            # Draw the box and a "label: confidence%" caption
            text = "{}: {:.2f}%".format(classes[idx], confidence * 100)
            cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)
            # Put the caption above the box unless it would fall off the top edge
            y = startY - 15 if startY - 15 > 15 else startY + 15
            cv2.putText(image, text, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Show the output image until a key is pressed
    cv2.imshow("Output", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    # Return the annotated image, as the question requires
    return image

# Example usage:
annotated = detect_objects('path_to_image.jpg')

Explanation of the Solution:

Load Pre-trained Model and Labels:
- net = cv2.dnn.readNetFromCaffe(...): Loads the SSD model trained with the Caffe framework from disk.
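- classes: The labels the model can output, in index order: a background class followed by the 20 PASCAL VOC object categories. The class index in each detection is looked up in this list.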
Image Preprocessing:
- image = cv2.imread(image_path): Loads an image from the specified path.
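- blob = cv2.dnn.blobFromImage(...): Converts the image into a 4-D blob for the network. The image is resized to 300x300, the mean value 127.5 is subtracted from every pixel, and the result is multiplied by 0.007843 (about 1/127.5), so pixel values end up in roughly the range [-1, 1].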
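The arithmetic behind the blob can be reproduced with NumPy alone. The sketch below is illustrative only: manual_blob is a hypothetical helper, not an OpenCV function, and it assumes the image has already been resized to 300x300. It subtracts the mean, applies the scale factor, and reorders the array from height-width-channel to the batch-channel-height-width layout the network expects.

```python
import numpy as np

def manual_blob(resized_image, scale=0.007843, mean=127.5):
    """Hypothetical helper mirroring the arithmetic of cv2.dnn.blobFromImage."""
    # Work in float32 so the subtraction does not wrap around on uint8 data
    pixels = resized_image.astype(np.float32)
    # Subtract the mean first, then scale: values land in roughly [-1, 1]
    normalized = (pixels - mean) * scale
    # Reorder HWC -> CHW and add a leading batch dimension: (1, 3, H, W)
    return normalized.transpose(2, 0, 1)[np.newaxis, ...]

# Synthetic all-white 300x300 image standing in for a resized photo
fake_image = np.full((300, 300, 3), 255, dtype=np.uint8)
blob = manual_blob(fake_image)
print(blob.shape)
```

A white pixel (255) maps to a value just under 1 and a black pixel (0) to just above -1, which is the input range this model expects.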
Object Detection:
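- detections = net.forward(): Passes the blob through the network, which returns an array of shape (1, 1, N, 7). Each of the N rows holds an image index, the class index, the confidence score, and the bounding box corners (x1, y1, x2, y2) normalized to the range 0 to 1.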
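To make the detection layout concrete, here is a minimal sketch that filters a synthetic detections array with a boolean mask instead of a loop. The values in fake_detections are made up for illustration, and filter_detections is a hypothetical helper, not part of OpenCV.

```python
import numpy as np

def filter_detections(detections, min_confidence=0.2):
    """Hypothetical helper: keep rows whose confidence exceeds the threshold."""
    # Collapse (1, 1, N, 7) to (N, 7): one row per candidate detection
    rows = detections.reshape(-1, 7)
    # Column 2 holds the confidence score
    kept = rows[rows[:, 2] > min_confidence]
    # Column 1 is the class index, columns 3..6 the normalized box corners
    return kept[:, 1].astype(int), kept[:, 2], kept[:, 3:7]

# Synthetic network output with three candidate detections (made-up values)
fake_detections = np.array([[[[0, 15, 0.91, 0.10, 0.20, 0.50, 0.90],
                              [0, 7, 0.05, 0.30, 0.30, 0.60, 0.60],
                              [0, 12, 0.47, 0.55, 0.40, 0.95, 0.85]]]],
                           dtype=np.float32)

class_ids, scores, boxes = filter_detections(fake_detections)
print(class_ids, scores, boxes.shape)
```

The second row is dropped because its confidence (0.05) falls below the threshold; the remaining class indices would then be looked up in the classes list.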
Process Each Detection:
- Loop through each detection and discard any whose confidence does not exceed the 0.2 threshold.
- For each valid detection, calculate the bounding box coordinates and draw a rectangle and label on the image to display the result.
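The coordinate step can be sketched in plain Python. The helpers below are hypothetical names introduced for illustration. The clamping is an addition that the solution itself does not perform; it assumes the network may occasionally predict corners slightly outside the 0 to 1 range, which would otherwise place part of the box off the image.

```python
def to_pixel_box(normalized_box, width, height):
    """Hypothetical helper: scale a normalized box to pixels and clamp it."""
    x1, y1, x2, y2 = normalized_box
    # Multiply by the image size, truncate to int, and keep inside the image
    start_x = min(max(int(x1 * width), 0), width - 1)
    start_y = min(max(int(y1 * height), 0), height - 1)
    end_x = min(max(int(x2 * width), 0), width - 1)
    end_y = min(max(int(y2 * height), 0), height - 1)
    return start_x, start_y, end_x, end_y

def label_y(start_y, offset=15):
    """Put the label above the box unless it would fall off the top edge."""
    return start_y - offset if start_y - offset > offset else start_y + offset

# A box whose bottom edge extends past the image (y2 = 1.25) on a 640x480 image
pixel_box = to_pixel_box((0.25, 0.5, 0.75, 1.25), 640, 480)
print(pixel_box, label_y(pixel_box[1]))
```

The label position follows the same rule as the solution: 15 pixels above the box when there is room, otherwise 15 pixels below its top edge.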
Display Result:
- cv2.imshow(...): Displays the image with bounding boxes and labels.
- cv2.waitKey(0): Blocks until a key is pressed; cv2.destroyAllWindows() then closes the window.

This solution uses MobileNet SSD, a lightweight detector that trades some accuracy for speed. That balance makes it a practical choice for real-time object detection on hardware with limited computational resources, such as mobile devices.
