03_Colors_Thresholding

command
v0.0.0-...-9f3f7ca Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 13, 2019 License: Apache-2.0 Imports: 4 Imported by: 0

README

Colorspaces and Thresholding

In the previous chapter we had a quick look at digital image airthmetics and how it can help us select various regions of the image and do something with it like overlay another image on it, blur it etc.

We have scratched upon a concept of color spaces and image thresholding, but didn't dedicate much time to it. This chapter will take a closer a look at both of these topics.

Colorspaces

Digital images are usually composed of several "layers" of colors often referred to as color channels (even a grayscale image can have alpha channel, turning the number of channels of the grayscale image to 2). To make things more complicated there are different channel "types" a.k.a. color spaces, each of which encodes color channels in different way. Each color mode has its cons and pros whicih we won't deal with here. Instead we will have a look at how to convert images between different color spaces and color modes using gocv.

Most developers are familiar with RGB color mode, which splits the image into three color channels: Red (R), Green (G) and Blue (B). By default gocv reads a colored image in BGR color mode. Converting between different color modes is easy using gocv.CvtColor() function. Equally, the same function can be used for over 100 color conversion operations. Let's have a look at simple example of convertin a colored image into a grayscale image:

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"gocv.io/x/gocv"
)

func main() {
	messiPath := filepath.Join("messi.jpg")
	// read images
	messi := gocv.IMRead(messiPath, gocv.IMReadColor)
	if messi.Empty() {
		fmt.Printf("Failed to read image: %s\n", messiPath)
		os.Exit(1)
	}
	// grayscale messi image
	grayMessi := gocv.NewMat()
	gocv.CvtColor(messi, &grayMessi, gocv.ColorBGRToGray)
	// write image to filesystem
	outPath := filepath.Join("gray_messi.jpeg")
	if ok := gocv.IMWrite(outPath, grayMessi); !ok {
		fmt.Printf("Failed to write image: %s\n")
		os.Exit(1)
	}
}

We can see the colored and grayscale image below:

Messi Grayscale Messi

For completeness, note that we could have actually read image of Messi as grayscale, not needing the conversion, that is, if we knew in advance that we would never actually need the original colored image later on like this (note gocv.IMReadGrayScale parameter passed to gocv.IMRead function):

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"gocv.io/x/gocv"
)

func main() {
	messiPath := filepath.Join("messi.jpg")
	// read images
	grayMessi := gocv.IMRead(messiPath, IMReadGrayScale)
	if messi.Empty() {
		fmt.Printf("Failed to read image: %s\n", messiPath)
		os.Exit(1)
	}
}

One thing which used to bite me when working with colored images was when I converted BGR image into RGB and then after having done some work with it I'd write the RGB image to the filesystem forgetting to convert it back to the original BGR color scheme which gocv expects. The result would look like this:

RGB Messi

Dont forget to convert RGB images back to BGR images before you write them to filesystem

HSV Colorspace

One color scheme you will encounter a lot when dealing with image processing, especially when it comes down to processing video streams, is HSV: Hue Saturation Value. It stores color information in three channels, just like RGB, but one channel is devoted to brightness (Value), and the other two convey colour information. HSV is much better than RGB under varying light conditions and is especially useful in lossy video compression, where loss of color information is less noticeable to the human eye.

Thresholding

As I mentioned at the beginning of this chapter we talked a bit about image thresholding in the previous chapter, in particular about a specific type of thresholding: binary image thresholding. Thresholding is usually done on grayscale image to get a bi-level (binary) image out of a grayscale image or for removing a noise, that is, filtering out pixels with too small or too large values.

There are several different types of thresholding available in opencv. Let's have a look at some of the options. We will use the familiar Wikimedia logo we used in the previous chapter:

Wikimedia Commons

In the previous chapter we have shown how a binary thresholding can help us create an image mask which can then be "applied" on another image of the same size in order to mask out some pixels. We used the gocv.ThresholdBinary to get a binary thresholded image. We could easily get the inverse mask by using gocv.ThresholdBinaryInvflag. Let's have a look how would the appropriate images of logo look like when using different thresholding types:

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"gocv.io/x/gocv"
)

func main() {
	// logo image
	logoPath := filepath.Join("commons.png")
	logo := gocv.IMRead(logoPath, gocv.IMReadColor)
	if logo.Empty() {
		fmt.Printf("Failed to read image: %s\n", logoPath)
		os.Exit(1)
	}
	// turn the logo to gray image
	grayLogo := gocv.NewMat()
	gocv.CvtColor(logo, &grayLogo, gocv.ColorBGRToGray)
	// binary threshold
	binLogo := gocv.NewMat()
	gocv.Threshold(grayLogo, &binLogo, 10.0, 255.0, gocv.ThresholdBinary)
	// inverse binary logo
	invBinLogo := gocv.NewMat()
	gocv.Threshold(grayLogo, &invBinLogo, 10.0, 255.0, gocv.ThresholdBinaryInv)
	// write logo to filesystem
	if ok := gocv.IMWrite("bin_logo.jpeg", binLogo); !ok {
		fmt.Printf("Failed to write image\n")
		os.Exit(1)
	}
	if ok := gocv.IMWrite("inv_bin_logo.jpeg", invBinLogo); !ok {
		fmt.Printf("Failed to write image\n")
		os.Exit(1)
	}
}

The resulting images illustrate both thresholding types and how truly inverse they are to one another:

Binary logo Inverse binary logo
Adaptive Thresholding

So far we had a look at fixed thresholding techniques where we explicitly specified the threshold value. The problem with this approach is, if we don't know the right value in advance we end up testing various threshold values like there is no tomorrow. Futhermore, the value which we eventually settle on after endless tests might not work well in different lighting conditions. This is where adaptive thresholding can help us out.

Adaptive thresholding calculates different thresholds for different regions of the same image. So we end up with several threshold values which gives us better results for images with regions of varying illumination. We will demonstrate different thresholding techniques on an image of sudoku game:

Sudoku

We will apply both adaptive mean and adaptive gaussian thresholding on the sudoku image and compare the results:

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"gocv.io/x/gocv"
)

func main() {
	// sudoku image
	sudokuPath := filepath.Join("sudoku.jpg")
	sudoku := gocv.IMRead(sudokuPath, gocv.IMReadGrayScale)
	if sudoku.Empty() {
		fmt.Printf("Failed to read image: %s\n", sudokuPath)
		os.Exit(1)
	}
	// adaptive mean
	adptMean := gocv.NewMat()
	gocv.AdaptiveThreshold(sudoku, &adptMean, 255.0, gocv.AdaptiveThresholdMean, gocv.ThresholdBinary, 5, 4.0)
	// adaptive gaussian
	adptGauss := gocv.NewMat()
	gocv.AdaptiveThreshold(sudoku, &adptGauss, 255.0, gocv.AdaptiveThresholdGaussian, gocv.ThresholdBinary, 5, 4.0)
	// write images to filesystem
	outAdptMean := "sudoku_adaptive_mean.jpeg"
	if ok := gocv.IMWrite(outAdptMean, adptMean); !ok {
		fmt.Printf("Failed to write image: %s\n", outAdptMean)
		os.Exit(1)
	}
	outAdptGauss := "sudoku_adaptive_gauss.jpeg"
	if ok := gocv.IMWrite(outAdptGauss, adptGauss); !ok {
		fmt.Printf("Failed to write image: %s\n", outAdptGauss)
		os.Exit(1)
	}
}

As you can see on the resulting images below, Gaussian thresholding performed slightly better: the image that passed through it is less noisy than the one thresholded by adaptive mean threshold. Gaussian threshold uses gaussian blur filter which helps with removing noise before the image gets processed. This is a widely used technique in digital image processing:

adaptive mean gaussian adaptive
Advanced thresholding

There is one more advanced thresholding technique available in opencv: Otsu Binarization. It's used in combination with fixed thresholding, but provides more advanced features.

Otsu Binarization is handy when you are not sure about threshold value and your image is bimodal i.e. it has two histogram peaks (two "dominant" colors). In this case, finding the right threshold value can be a nightmare as you are essentially dancing between two colors. Otsu Binarization thresholding helps with this. When using Otsu on bimodal image, you normally pass threshold value as 0 and simply add the gocv.ThresholdBinary and gocv.ThresholdOtsu flags together:

gocv.Threshold(img, &img, 0.0, 255.0, gocv.ThresholdBinary+gocv.ThresholdOtsu)
Tracking and extraction objects, manipulting object scenes

Before we wrap up this chapter let's talk about something slightly more elaborate which is akin to how visual effects in movies are done. Usually, visual effects in movies are shot on a blue or green background which is then replaced in post production with another background. Similarly, you can also extract an object of certain color and replace it with another object or simply just follow it around i.e. track it.

You can achieve object/background extraction by using gocv.InRange() function which is a range thresholding function that allows you to pick lower and upper bound threshold values for pixel intensities. We will illustrate the usage of this function by cutting out pizza from the image below which has pizza sitting on a blue background:

pizza blue background

Here is what we are going to do:

  • we will search for all the blue pixels
  • we will create a mask which will mask all pizza pixels with black pixels
  • we will use the mask to cut out the pizza out of the image

In order to use gocv.InRange() function we need to pick lower and upper pixel values. We will once again use the HSV color mode. The tricky part, though, is to figure out what should the blue color upper and lower bounds be in HSV color space? In RGB colorspace this would be easy: [255, 0, 0] is the absolute blue color -- we would pick the range values empirically around 255. In HSV colorspace things are not as simple. What we will do is we will find HSV values for the absolute blue color in HSV color space and then pick the lower and upper bounds like this: [H-10, 100,100] and [H+10, 255, 255]. We can get these pixel values using following program:

package main

import (
	"fmt"

	"gocv.io/x/gocv"
)

func main() {
	blue := gocv.NewMatFromScalar(gocv.NewScalar(255.0, 0.0, 0.0, 0.0), gocv.MatTypeCV8UC3)
	gocv.CvtColor(blue, &blue, gocv.ColorBGRToHSV)
	hsv := gocv.Split(blue)
	fmt.Printf("H: %d S: %d V: %d\n", hsv[0].GetUCharAt(0, 0), hsv[1].GetUCharAt(0, 0), hsv[2].GetUCharAt(0, 0))
}

The resulting HSV triplette will look like this:

H: 120 S: 255 V: 255

This means that our upper and lower bound values for blue background would be: [110, 100, 100] and [130, 255, 255].

Let's get to coding. The code to cut out the pizza from the image shown above used to be a bit elaborate due to how gocv API used to work, but luckily a PR opened by me some time ago got merged it so the required code is a bit simpler. Here it is in full:

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"gocv.io/x/gocv"
)

func main() {
	// read image
	pizzaPath := filepath.Join("pizza.png")
	pizza := gocv.IMRead(pizzaPath, gocv.IMReadColor)
	if pizza.Empty() {
		fmt.Printf("Failed to read image: %s\n", pizzaPath)
		os.Exit(1)
	}
	// Convert BGR to HSV image (dont modify the original)
	hsvPizza := gocv.NewMat()
	gocv.CvtColor(pizza, &hsvPizza, gocv.ColorBGRToHSV)
	pizzaRows, pizzaCols := hsvPizza.Rows(), hsvPizza.Cols()
	// define HSV color upper and lower bound ranges
	lowerMask := gocv.NewMatWithSizeFromScalar(gocv.NewScalar(110.0, 50.0, 50.0, 0.0), pizzaRows, pizzaCols, gocv.MatTypeCV8UC3)
	upperMask := gocv.NewMatWithSizeFromScalar(gocv.NewScalar(130.0, 255.0, 255.0, 0.0), pizzaRows, pizzaCols, gocv.MatTypeCV8UC3)
	// global mask
	mask := gocv.NewMat()
	gocv.InRange(hsvPizza, lowerMask, upperMask, &mask)
	// cut out pizza mask
	pizzaMask := gocv.NewMat()
	gocv.Merge([]gocv.Mat{mask, mask, mask}, &pizzaMask)
	// cut out the pizza and convert back to BGR
	gocv.BitwiseAnd(hsvPizza, pizzaMask, &hsvPizza)
	gocv.CvtColor(hsvPizza, &hsvPizza, gocv.ColorHSVToBGR)
	// write image to filesystem
	outPizza := "no_pizza.jpeg"
	if ok := gocv.IMWrite(outPizza, hsvPizza); !ok {
		fmt.Printf("Failed to write image: %s\n", outPizza)
		os.Exit(1)
	}
	// write pizza mask to filesystem
	outPizzaMask := "no_pizza_mask.jpeg"
	if ok := gocv.IMWrite(outPizzaMask, mask); !ok {
		fmt.Printf("Failed to write image: %s\n", outPizza)
		os.Exit(1)
	}
}

Note that when allocating the gocv.Scalars to define some pixel values, you need to make sure these are 3 channel types as HSV is a 3 channel color space; you can use gocv.MatTypeCV8UC3 type for this.

This is what the resulting image looks like once we cut out the pizza from it:

no pizza

For completeness let's also display what the mask looks like once it passess through gocv.InRange() thresholding. Note that all the blue pixels from the original image are now white i.e. the intensity value these pixels now have is 255 and all the other pixels -- the pizza ones -- have pixel intensity set to 0:

no pizza mask

This wraps up this chapter on colorspaces and image thresholding.

Documentation

The Go Gopher

There is no documentation for this package.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL