lu-vision-dock: Basic Edge Detection in Python

http://alwaysmovefast.com/2007/12/05/basic-edge-detection-in-python/

Text from the link above:

Detecting edges in images is being actively researched for many different applications. The most notable of these applications is computer vision. The reason I began studying edge detection algorithms, aside from them being really cool, is that I can use edge detection as part of my toolbox for optical character recognition.

So, how do we define edges in a given image? Edges are really just areas where the pixel intensities contrast sharply: basically, where a bunch of light pixels touch a bunch of dark pixels. There are a couple of different families of methods for detecting edges: gradient and Laplacian. I’m going to cover a basic gradient edge detection technique here and will cover Laplacian techniques in future posts.

Gradient edge detection approximates the first derivative of the image, looking for minimum and maximum intensities in the magnitude of the gradient. Edge pixels can be located by choosing a threshold and testing whether the gradient magnitude at each pixel is greater than that threshold.

The gradient of the image function I is given by the vector:

∇I = [∂I / ∂x, ∂I / ∂y]

To approximate the first derivative of the image, we use convolution masks. The method I’m going to present is the Prewitt method. It uses two masks to approximate ∂I / ∂x and ∂I / ∂y, giving us a gradient of the image’s pixels. ∂I / ∂x and ∂I / ∂y detect vertical and horizontal edges, respectively. The masks that define ∂I / ∂x and ∂I / ∂y for the Prewitt operator are:

∂I / ∂x:
[-1, 0, 1]
[-1, 0, 1]
[-1, 0, 1]

∂I / ∂y:
[1, 1, 1]
[0, 0, 0]
[-1, -1, -1]
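
As a quick sanity check on what these masks do, here is a minimal sketch that applies each one to a single 3x3 neighbourhood. The names XMASK, YMASK and apply_mask, and the pixel values in the patch, are made up for illustration and are not part of the code later in this post.

```python
# Prewitt masks as nested lists, indexed [row][column].
XMASK = [[-1, 0, 1],
         [-1, 0, 1],
         [-1, 0, 1]]
YMASK = [[ 1,  1,  1],
         [ 0,  0,  0],
         [-1, -1, -1]]

def apply_mask(patch, mask):
    """Sum of element-wise products of a 3x3 patch and a 3x3 mask."""
    return sum(patch[r][c] * mask[r][c] for r in range(3) for c in range(3))

# Hypothetical patch: dark on the left, light on the right (a vertical edge).
vertical_edge = [[10, 10, 200],
                 [10, 10, 200],
                 [10, 10, 200]]

gx = apply_mask(vertical_edge, XMASK)  # each row gives 200 - 10 = 190, so 570
gy = apply_mask(vertical_edge, YMASK)  # top and bottom rows cancel, so 0
print(gx, gy)
```

The x mask responds strongly to the vertical edge while the y mask gives nothing, which matches the "vertical and horizontal edges, respectively" description above.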

The outputs of convolving the image with these two masks, Gx and Gy, are then combined to get the magnitude of the gradient. The magnitude of the gradient is given by:

|G| = sqrt(Gx² + Gy²)

To avoid the square root, we approximate the magnitude of the gradient with:

|G| ≈ |Gx| + |Gy|
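
To see how far the approximation drifts from the exact value, here is a small sketch comparing the two formulas. The function names magnitude_exact and magnitude_approx are hypothetical helpers for this comparison only.

```python
import math

def magnitude_exact(gx, gy):
    """Euclidean gradient magnitude: sqrt(Gx^2 + Gy^2)."""
    return math.sqrt(gx * gx + gy * gy)

def magnitude_approx(gx, gy):
    """Cheaper approximation: |Gx| + |Gy|."""
    return abs(gx) + abs(gy)

print(magnitude_exact(3, 4), magnitude_approx(3, 4))      # 5.0 versus 7
print(magnitude_exact(570, 0), magnitude_approx(570, 0))  # identical when one component is 0
```

The approximation never underestimates the exact magnitude, and it overestimates it by at most a factor of sqrt(2), which happens when |Gx| and |Gy| are equal.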

After getting the magnitude of the gradient, we compare it against our threshold. All the methods I’ve seen use a threshold of 255: when the magnitude of the gradient reaches or exceeds 255, we’ve found an edge. We cap the magnitude at 255 and mark such a pixel in the output image as 0, which is black. This is done implicitly by setting the pixel value to 255 - magnitude, so a magnitude of 255 gives a black pixel and a magnitude of 0 gives 255 - 0, which is white. After capping, the magnitudes can be any value between 0 and 255, inclusive, so weaker edges show up as shades of grey.
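
The cap-and-invert step can be sketched in isolation as follows. The helper name edge_pixel is an assumption made for this example and does not appear in the code further down.

```python
def edge_pixel(magnitude):
    """Map a gradient magnitude to an output grey level (0 = black, 255 = white)."""
    capped = max(0, min(magnitude, 255))  # keep the magnitude within 0..255
    return 255 - capped                   # strong edges dark, flat areas light

print(edge_pixel(570))  # 0   (strong edge, black)
print(edge_pixel(100))  # 155 (weak edge, grey)
print(edge_pixel(0))    # 255 (no edge, white)
```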

The Prewitt masks in Python are given by the function get_prewitt_masks():

# Uses dicts keyed by (row, column) tuples to simulate 2-d arrays for the masks.
def get_prewitt_masks():
    xmask = {}
    ymask = {}

    xmask[(0,0)] = -1
    xmask[(0,1)] = 0
    xmask[(0,2)] = 1
    xmask[(1,0)] = -1
    xmask[(1,1)] = 0
    xmask[(1,2)] = 1
    xmask[(2,0)] = -1
    xmask[(2,1)] = 0
    xmask[(2,2)] = 1

    ymask[(0,0)] = 1
    ymask[(0,1)] = 1
    ymask[(0,2)] = 1
    ymask[(1,0)] = 0
    ymask[(1,1)] = 0
    ymask[(1,2)] = 0
    ymask[(2,0)] = -1
    ymask[(2,1)] = -1
    ymask[(2,2)] = -1
    return (xmask, ymask)
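
If typing out eighteen assignments feels tedious, the same (row, column)-keyed dictionaries can be built from nested lists. This is only an alternative sketch; mask_from_rows is a hypothetical helper, not something from the original code.

```python
def mask_from_rows(rows):
    """Turn a list of rows into a dict keyed by (row, column) tuples."""
    return {(r, c): value
            for r, row in enumerate(rows)
            for c, value in enumerate(row)}

xmask = mask_from_rows([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]])
ymask = mask_from_rows([[ 1,  1,  1],
                        [ 0,  0,  0],
                        [-1, -1, -1]])

print(xmask[(0, 2)], ymask[(2, 1)])
```

Writing the masks as rows keeps them visually identical to the matrices shown earlier, which makes typos easier to spot.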

Now on to the meat of the entire operation. The prewitt() function takes a 1-d array of pixels and the width and height of the input image. It returns a greyscale edge map image.

from PIL import Image

def prewitt(pixels, width, height):
    xmask, ymask = get_prewitt_masks()

    # create a new greyscale image for the output
    outimg = Image.new('L', (width, height))
    outpixels = list(outimg.getdata())

    for y in range(height):
        for x in range(width):
            sumX, sumY, magnitude = 0, 0, 0
            # border pixels have no full 3x3 neighbourhood, so leave them at 0
            if y == 0 or y == height-1: magnitude = 0
            elif x == 0 or x == width-1: magnitude = 0
            else:
                for i in range(-1, 2):
                    for j in range(-1, 2):
                        # convolve the image pixels with the Prewitt mask, approximating ∂I / ∂x
                        # (the masks are keyed by (row, column), so the row offset j comes first)
                        sumX += (pixels[x+i+(y+j)*width]) * xmask[j+1, i+1]

                for i in range(-1, 2):
                    for j in range(-1, 2):
                        # convolve the image pixels with the Prewitt mask, approximating ∂I / ∂y
                        sumY += (pixels[x+i+(y+j)*width]) * ymask[j+1, i+1]

                # approximate the magnitude of the gradient
                magnitude = abs(sumX) + abs(sumY)

            if magnitude > 255: magnitude = 255
            if magnitude < 0: magnitude = 0

            # 255 - magnitude: strong edges come out black, flat areas white
            outpixels[x+y*width] = 255 - magnitude

    outimg.putdata(outpixels)
    return outimg
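
To check the arithmetic without any imaging library, here is a stripped-down sketch of the same loop working on plain lists. The function name prewitt_values and the 4x3 test image are assumptions for illustration. It writes the mask weights directly as offsets instead of looking them up in the mask dictionaries.

```python
def prewitt_values(pixels, width, height):
    """Return output grey levels (255 - capped magnitude) as a flat list."""
    out = [255] * (width * height)  # border pixels keep magnitude 0, i.e. white
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            sum_x = sum_y = 0
            for j in (-1, 0, 1):        # row offset
                for i in (-1, 0, 1):    # column offset
                    p = pixels[(x + i) + (y + j) * width]
                    sum_x += p * i      # x mask weight equals the column offset
                    sum_y += p * -j     # y mask weight is +1 above, -1 below
            magnitude = min(abs(sum_x) + abs(sum_y), 255)
            out[x + y * width] = 255 - magnitude
    return out

# Hypothetical 4x3 image: two black columns followed by two white columns.
image = [0, 0, 255, 255,
         0, 0, 255, 255,
         0, 0, 255, 255]

result = prewitt_values(image, 4, 3)
print(result[4:8])  # middle row: [255, 0, 0, 255]
```

Only the two interior pixels that straddle the black-to-white boundary come out black; every border pixel stays white because its magnitude is left at 0.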

You can store all of this code in one file and pass the input and output image filenames as command-line arguments when you run it. To do so, add this code to the Python file containing the edge detection code from earlier:

import sys
if __name__ == '__main__':
    img = Image.open(sys.argv[1])
    # only operates on greyscale images
    if img.mode != 'L': img = img.convert('L')
    pixels = list(img.getdata())
    w, h = img.size
    outimg = prewitt(pixels, w, h)
    outimg.save(sys.argv[2])

I called my file prewitt.py, so with all that code in the same file, you can call it from the command line:

$ python prewitt.py input_image.gif output_image.gif

Note that it will work for pretty much any image type you give it, because the script converts the input to greyscale before processing.

Tuesday, 21 April 2009