You finally don’t have to train your convolutional neural networks for weeks. Or even months.

If you have been following this series, you have seen us build multiple CNNs for image classification tasks. In this article, we are not going to build the network ourselves. Instead, we are going to use a technique called transfer learning for our image classification tasks. This is not to say that our model will not contain a convolutional neural network; we are just not going to build it. :)

Why? You may ask.

There are so many applications that require us to build our model from scratch in order to use them for a specific task. A…


In this article, we will explore one of the CNN architectures, AlexNet, and apply a modified version of it to build a classifier that differentiates between cats and dogs.

Reasons it is a modified version:

  • Here we are building a binary classifier, which distinguishes between two categories. AlexNet classifies 1000 categories, so its final layer is a dense layer with 1000 units, whereas the final layer of our binary classifier has only 1 unit.
  • We cannot match the number of filters learnt in each convolutional layer as it requires a…

Previously, we talked about a few image segmentation algorithms and how we can find object boundaries in an image. Let’s take a look at an algorithm that tracks objects whose appearance is defined by color histograms, although it is not limited to color histograms.

In this section, we will dive a little into the basic math part of the algorithm.

Before we talk about tracking objects, let’s discuss the mean-shift algorithm. It is an algorithm that iteratively shifts a data point to the average of the data points in its neighbourhood…
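To make the idea concrete, here is a minimal sketch of a single point being shifted towards the local mean. It assumes 2D points, a flat (uniform) kernel and a fixed search radius; the function name `mean_shift_point` and its parameters are made up for illustration and are not taken from the article.

```python
import math

def mean_shift_point(point, data, radius, max_iter=100, tol=1e-6):
    """Shift `point` to the mean of its neighbours until it stops moving.

    Assumptions (not from the article): 2D points, a flat kernel,
    and a fixed neighbourhood radius.
    """
    x, y = point
    for _ in range(max_iter):
        # Neighbourhood: every data point within `radius` of the current position.
        neighbours = [(px, py) for px, py in data
                      if (px - x) ** 2 + (py - y) ** 2 <= radius ** 2]
        if not neighbours:
            break  # nothing nearby, so there is no mean to move to
        mean_x = sum(px for px, _ in neighbours) / len(neighbours)
        mean_y = sum(py for _, py in neighbours) / len(neighbours)
        shift = math.hypot(mean_x - x, mean_y - y)
        x, y = mean_x, mean_y
        if shift < tol:
            break  # converged: the point already sits on the local mean
    return x, y

# Two small clusters; a point starting near the first one drifts to its centre.
data = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
mode = mean_shift_point((2, 2), data, radius=3)
```

With a flat kernel every neighbour counts equally; a weighted kernel such as a Gaussian would pull the point more strongly towards nearby samples.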

One of the most computationally efficient feature detection algorithms out there, it is well suited to real-time video processing. Let’s jump straight into the algorithm to see how it works:

  1. We first take a pixel p in the image (consider this point an interest point, which we might discard later). Let its intensity be Ip.
  2. We then draw a circle of 16 pixels around it, called the Bresenham circle of radius 3, which is the circle in the picture above.
  3. Then we select an appropriate threshold t.
  4. We determine if the pixel p is a corner if there exists…
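The steps above can be sketched in a few lines of Python. Since the excerpt is cut off at step 4, the corner criterion used here is an assumption: it is the usual segment test for this kind of detector, where p counts as a corner if at least n contiguous pixels on the circle are all brighter than Ip + t or all darker than Ip − t. The value n = 12 and the names `CIRCLE` and `is_corner` are also illustrative choices, not the article's.

```python
# Offsets (dx, dy) of the 16 pixels on a Bresenham circle of radius 3,
# listed in order around the circle.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def longest_circular_run(flags):
    """Length of the longest run of True values, treating the list as a ring."""
    if all(flags):
        return len(flags)
    best = run = 0
    for flag in flags + flags:  # doubling the list handles wrap-around
        run = run + 1 if flag else 0
        best = max(best, run)
    return best

def is_corner(img, x, y, t, n=12):
    """Segment test at pixel (x, y) of `img`, a 2D list indexed img[row][col].

    Assumes (x, y) is at least 3 pixels away from every image border.
    """
    ip = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [value > ip + t for value in ring]
    darker = [value < ip - t for value in ring]
    return longest_circular_run(brighter) >= n or longest_circular_run(darker) >= n

# A single bright pixel on a dark background: every circle pixel is darker.
spot = [[0] * 7 for _ in range(7)]
spot[3][3] = 100

# A straight vertical edge: only 7 contiguous circle pixels are darker.
edge = [[100 if col <= 3 else 0 for col in range(7)] for _ in range(7)]
```

Calling `is_corner(spot, 3, 3, t=20)` accepts the isolated pixel, while `is_corner(edge, 3, 3, t=20)` rejects the point on the straight edge, which is the behaviour a corner detector should have.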

In previous stories, we looked at how to identify features in an image. But there is a problem: scale. At a different scale, features that were detected before might no longer be there. A simple analogy: looking at your phone with a 6.7-inch screen, you can see pretty much everything on the screen. But as soon as companies shrink the screen size, you won’t be able to see everything; you either pinch the screen to zoom in on text and images or you lean in closer. Take this to a wider level, if you look at your phone from the outside…

Chris Harris and Mike Stephens came up with a powerful algorithm for detecting corners, a particularly useful feature in images. To define a corner, let’s talk in terms of image processing techniques. A corner can be considered a junction of two edges, where an edge is a sudden change in pixel brightness or intensity, that is, a large gradient. Corners are important because they are considered feature points that are invariant to translation, rotation and illumination, but sadly not to scale. We’ll come to that in later posts.

To make things simple, this algorithm basically…
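As a rough illustration of the "junction of two edges" idea, the sketch below computes the commonly quoted Harris response R = det(M) − k·trace(M)², where M sums the products of the image gradients over a small window. The excerpt is truncated before it describes the algorithm, so the details here are assumptions: central-difference gradients, an unweighted 3×3 window, k = 0.04, and the function name `harris_response`.

```python
def harris_response(img, x, y, k=0.04):
    """Harris response at pixel (x, y) of `img`, a 2D list indexed img[row][col].

    Assumes (x, y) is at least 2 pixels away from every image border.
    """
    sxx = syy = sxy = 0.0
    # Accumulate gradient products over an unweighted 3x3 window.
    for wy in range(y - 1, y + 2):
        for wx in range(x - 1, x + 2):
            ix = (img[wy][wx + 1] - img[wy][wx - 1]) / 2.0  # horizontal gradient
            iy = (img[wy + 1][wx] - img[wy - 1][wx]) / 2.0  # vertical gradient
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# A bright square occupying the lower-right part of a dark 12x12 image.
img = [[100 if row >= 6 and col >= 6 else 0 for col in range(12)]
       for row in range(12)]

corner_score = harris_response(img, 6, 6)  # corner of the square
edge_score = harris_response(img, 6, 9)    # on the square's left edge
flat_score = harris_response(img, 2, 2)    # uniform background
```

The sign of the score separates the three cases: it is positive where the intensity changes in both directions (a corner), negative where it changes in only one direction (an edge), and zero in a flat region.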

Feature detection and matching, as the name suggests, is a computer vision technique for detecting features and matching them across images to identify regions of interest. So, what then is a feature?

A feature can be considered a specific structure in an image that helps identify a region of interest. For example, an edge, a corner or a point that stands out could be a feature of an image. …

In earlier posts, we looked at detecting objects by segmenting them in an image. Now the question is: how do you detect and recognize human faces? And why would we ever need to do that? It turns out that face detection and recognition is not that different from the object detection technique we learnt in the last post. But to answer the second question, let’s first understand the ‘how’ part.

For many years, the Haar cascade algorithm proved to be very effective in detecting objects and is actually one of the early use cases of machine…

Oftentimes in image processing, we convert the image to grayscale before running any operations on it. Even after this conversion, we do not lose that much information after all! Watershed is one of the transformations that performs object segmentation on grayscale images, and it works best in this case. Let’s try to understand it.

In any grayscale image, there are areas where the intensity is high and areas where it is low. We can treat the high-intensity areas as peaks and the low-intensity areas as valleys. Think of an image as a topography…


Mechatronic Engineering Student || Indie Game Developer
