http://tutorialaday.com/web-design-workflow-complete-overview/
http://tutorialaday.com/
Friday, October 2, 2009
Tuesday, September 8, 2009
PyS60: computer vision on mobile phones
http://www.danilocesar.com/blog/2006/12/03/smartphones-aonde-podemos-parar/
http://snippets.dzone.com/posts/show/636
http://wiki.opensource.nokia.com/projects/PyS60
http://developer.symbian.org/wiki/index.php/Main_Page
Friday, July 10, 2009
Nice blog on stereo-vision
I've found a nice blog on stereo-vision:
http://siddhantahuja.wordpress.com/category/stereo-vision/
Pretty cool examples!
Thursday, July 9, 2009
Metadata reading from JPG
I've found working code for reading JPG metadata.
It's quite handy for understanding the EXIF structure and how it gets parsed.
Here is the code, from:
http://topo.math.u-psud.fr/~bousch/exifdump.py
#!/usr/local/bin/python
#
# Exif information decoder
# Written by Thierry Bousch
# Public Domain
#
# $Id: exifdump.py,v 1.14 2004/01/30 02:11:14 bousch Exp $
#
# Since I don't have a copy of the Exif standard, I got most of the
# information from the TIFF/EP draft International Standard:
#
# ISO/DIS 12234-2
# Photography - Electronic still picture cameras - Removable Memory
# Part 2: Image data format - TIFF/EP
#
# You can (and should) get a copy of this document from PIMA's web site;
# their URL is: http://www.pima.net/it10a.htm
# You'll also find there some documentation on DCF (Digital Camera Format)
# which is based on Exif.
#
# Another must-read is the TIFF 6.0 specification, which can be
# obtained from Adobe's FTP site, at
# ftp://ftp.adobe.com/pub/adobe/devrelations/devtechnotes/pdffiles/tiff6.pdf
#
# Many thanks to Doraneko for filling the
# holes in the Exif tag list.
import sys
EXIF_TAGS = {
0x100: "ImageWidth",
0x101: "ImageLength",
0x102: "BitsPerSample",
0x103: "Compression",
0x106: "PhotometricInterpretation",
0x10A: "FillOrder",
0x10D: "DocumentName",
0x10E: "ImageDescription",
0x10F: "Make",
0x110: "Model",
0x111: "StripOffsets",
0x112: "Orientation",
0x115: "SamplesPerPixel",
0x116: "RowsPerStrip",
0x117: "StripByteCounts",
0x11A: "XResolution",
0x11B: "YResolution",
0x11C: "PlanarConfiguration",
0x128: "ResolutionUnit",
0x12D: "TransferFunction",
0x131: "Software",
0x132: "DateTime",
0x13B: "Artist",
0x13E: "WhitePoint",
0x13F: "PrimaryChromaticities",
0x156: "TransferRange",
0x200: "JPEGProc",
0x201: "JPEGInterchangeFormat",
0x202: "JPEGInterchangeFormatLength",
0x211: "YCbCrCoefficients",
0x212: "YCbCrSubSampling",
0x213: "YCbCrPositioning",
0x214: "ReferenceBlackWhite",
0x828F: "BatteryLevel",
0x8298: "Copyright",
0x829A: "ExposureTime",
0x829D: "FNumber",
0x83BB: "IPTC/NAA",
0x8769: "ExifIFDPointer",
0x8773: "InterColorProfile",
0x8822: "ExposureProgram",
0x8824: "SpectralSensitivity",
0x8825: "GPSInfoIFDPointer",
0x8827: "ISOSpeedRatings",
0x8828: "OECF",
0x9000: "ExifVersion",
0x9003: "DateTimeOriginal",
0x9004: "DateTimeDigitized",
0x9101: "ComponentsConfiguration",
0x9102: "CompressedBitsPerPixel",
0x9201: "ShutterSpeedValue",
0x9202: "ApertureValue",
0x9203: "BrightnessValue",
0x9204: "ExposureBiasValue",
0x9205: "MaxApertureValue",
0x9206: "SubjectDistance",
0x9207: "MeteringMode",
0x9208: "LightSource",
0x9209: "Flash",
0x920A: "FocalLength",
0x9214: "SubjectArea",
0x927C: "MakerNote",
0x9286: "UserComment",
0x9290: "SubSecTime",
0x9291: "SubSecTimeOriginal",
0x9292: "SubSecTimeDigitized",
0xA000: "FlashPixVersion",
0xA001: "ColorSpace",
0xA002: "PixelXDimension",
0xA003: "PixelYDimension",
0xA004: "RelatedSoundFile",
0xA005: "InteroperabilityIFDPointer",
0xA20B: "FlashEnergy", # 0x920B in TIFF/EP
0xA20C: "SpatialFrequencyResponse", # 0x920C - -
0xA20E: "FocalPlaneXResolution", # 0x920E - -
0xA20F: "FocalPlaneYResolution", # 0x920F - -
0xA210: "FocalPlaneResolutionUnit", # 0x9210 - -
0xA214: "SubjectLocation", # 0x9214 - -
0xA215: "ExposureIndex", # 0x9215 - -
0xA217: "SensingMethod", # 0x9217 - -
0xA300: "FileSource",
0xA301: "SceneType",
0xA302: "CFAPattern", # 0x828E in TIFF/EP
0xA401: "CustomRendered",
0xA402: "ExposureMode",
0xA403: "WhiteBalance",
0xA404: "DigitalZoomRatio",
0xA405: "FocalLengthIn35mmFilm",
0xA406: "SceneCaptureType",
0xA407: "GainControl",
0xA408: "Contrast",
0xA409: "Saturation",
0xA40A: "Sharpness",
0xA40B: "DeviceSettingDescription",
0xA40C: "SubjectDistanceRange",
0xA420: "ImageUniqueID",
}
INTR_TAGS = {
0x1: "InteroperabilityIndex",
0x2: "InteroperabilityVersion",
0x1000: "RelatedImageFileFormat",
0x1001: "RelatedImageWidth",
0x1002: "RelatedImageLength",
}
GPS_TAGS = {
0x0: "GPSVersionID",
0x1: "GPSLatitudeRef",
0x2: "GPSLatitude",
0x3: "GPSLongitudeRef",
0x4: "GPSLongitude",
0x5: "GPSAltitudeRef",
0x6: "GPSAltitude",
0x7: "GPSTimeStamp",
0x8: "GPSSatellites",
0x9: "GPSStatus",
0xA: "GPSMeasureMode",
0xB: "GPSDOP",
0xC: "GPSSpeedRef",
0xD: "GPSSpeed",
0xE: "GPSTrackRef",
0xF: "GPSTrack",
0x10: "GPSImgDirectionRef",
0x11: "GPSImgDirection",
0x12: "GPSMapDatum",
0x13: "GPSDestLatitudeRef",
0x14: "GPSDestLatitude",
0x15: "GPSDestLongitudeRef",
0x16: "GPSDestLongitude",
0x17: "GPSDestBearingRef",
0x18: "GPSDestBearing",
0x19: "GPSDestDistanceRef",
0x1A: "GPSDestDistance",
0x1B: "GPSProcessingMethod",
0x1C: "GPSAreaInformation",
0x1D: "GPSDateStamp",
0x1E: "GPSDifferential"
}
def s2n_motorola(str):
    x = 0
    for c in str:
        x = (x << 8) | ord(c)
    return x

def s2n_intel(str):
    x = 0
    y = 0
    for c in str:
        x = x | (ord(c) << y)
        y = y + 8
    return x

class Fraction:
    def __init__(self, num, den):
        self.num = num
        self.den = den
    def __repr__(self):
        # String representation
        return '%d/%d' % (self.num, self.den)

class TIFF_file:
    def __init__(self, data):
        self.data = data
        self.endian = data[0]

    def s2n(self, offset, length, signed=0):
        slice = self.data[offset:offset+length]
        if self.endian == 'I':
            val = s2n_intel(slice)
        else:
            val = s2n_motorola(slice)
        # Sign extension ?
        if signed:
            msb = 1 << (8*length - 1)
            if val & msb:
                val = val - (msb << 1)
        return val

    def first_IFD(self):
        return self.s2n(4, 4)

    def next_IFD(self, ifd):
        entries = self.s2n(ifd, 2)
        return self.s2n(ifd + 2 + 12 * entries, 4)

    def list_IFDs(self):
        i = self.first_IFD()
        a = []
        while i:
            a.append(i)
            i = self.next_IFD(i)
        return a

    def dump_IFD(self, ifd):
        entries = self.s2n(ifd, 2)
        a = []
        for i in range(entries):
            entry = ifd + 2 + 12*i
            tag = self.s2n(entry, 2)
            type = self.s2n(entry+2, 2)
            if not 1 <= type <= 10:
                continue # not handled
            typelen = [ 1, 1, 2, 4, 8, 1, 1, 2, 4, 8 ] [type-1]
            count = self.s2n(entry+4, 4)
            offset = entry+8
            if count*typelen > 4:
                offset = self.s2n(offset, 4)
            if type == 2:
                # Special case: nul-terminated ASCII string
                values = self.data[offset:offset+count-1]
            else:
                values = []
                signed = (type == 6 or type >= 8)
                for j in range(count):
                    if type % 5:
                        # Not a fraction
                        value_j = self.s2n(offset, typelen, signed)
                    else:
                        # The type is either 5 or 10
                        value_j = Fraction(self.s2n(offset, 4, signed),
                                           self.s2n(offset+4, 4, signed))
                    values.append(value_j)
                    offset = offset + typelen
            # Now "values" is either a string or an array
            a.append((tag,type,values))
        return a

def print_IFD(fields, dict=EXIF_TAGS):
    for (tag,type,values) in fields:
        try:
            stag = dict[tag]
        except:
            stag = '0x%04X' % tag
        stype = ['B',  # BYTE
                 'A',  # ASCII
                 'S',  # SHORT
                 'L',  # LONG
                 'R',  # RATIONAL
                 'SB', # SBYTE
                 'U',  # UNDEFINED
                 'SS', # SSHORT
                 'SL', # SLONG
                 'SR', # SRATIONAL
                 ] [type-1]
        print ' %s(%s)=%s' % (stag,stype,repr(values))

def process_file(file):
    data = file.read(12)
    if data[0:4] <> '\377\330\377\341' or data[6:10] <> 'Exif':
        print ' Not an Exif file'
        return 1
    length = ord(data[4])*256 + ord(data[5])
    print ' Exif header length: %d bytes,' % length,
    data = file.read(length-8)
    if 0:
        # that's for debugging only
        open('exif.header','wb').write(data)
    print {'I':'Intel', 'M':'Motorola'}[data[0]], 'format'
    T = TIFF_file(data)
    L = T.list_IFDs()
    for i in range(len(L)):
        print ' IFD %d' % i,
        if i == 0: print '(main image)',
        if i == 1: print '(thumbnail)',
        print 'at offset %d:' % L[i]
        IFD = T.dump_IFD(L[i])
        print_IFD(IFD)
        exif_off = gps_off = 0
        for tag,type,values in IFD:
            if tag == 0x8769:
                exif_off = values[0]
            if tag == 0x8825:
                gps_off = values[0]
        if exif_off:
            print ' Exif SubIFD at offset %d:' % exif_off
            IFD = T.dump_IFD(exif_off)
            print_IFD(IFD)
            # Recent digital cameras have a little subdirectory
            # here, pointed to by tag 0xA005. Apparently, it's the
            # "Interoperability IFD", defined in Exif 2.1 and DCF.
            intr_off = 0
            for tag,type,values in IFD:
                if tag == 0xA005:
                    intr_off = values[0]
            if intr_off:
                print ' Exif Interoperability SubSubIFD at offset %d:' % intr_off
                IFD = T.dump_IFD(intr_off)
                print_IFD(IFD, dict=INTR_TAGS)
        if gps_off:
            print ' GPS SubIFD at offset %d:' % gps_off
            IFD = T.dump_IFD(gps_off)
            print_IFD(IFD, dict=GPS_TAGS)
    return 0

def main():
    if len(sys.argv) < 2:
        sys.stderr.write('Usage: %s files...\n' % sys.argv[0])
        sys.exit(2)
    for filename in sys.argv[1:]:
        print filename+':'
        try:
            file = open(filename, 'rb')
            process_file(file)
        except IOError:
            print ' Cannot open file'
    sys.exit(0)

if __name__ == '__main__':
    main()
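A quick usage sketch (the file name here is a placeholder; note that process_file expects the Exif APP1 segment to come right after the JPEG SOI marker):

# Hypothetical usage: 'photo.jpg' stands for any Exif-bearing JPEG.
f = open('photo.jpg', 'rb')
process_file(f)   # prints the IFDs, the Exif SubIFD and, if present, the GPS SubIFD
f.close()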
Wednesday, July 8, 2009
Some code for writing some metadata within a png with PIL
Some functions I've written, adapted from code by Nick Galbreath at http://mail.python.org/pipermail/image-sig/2007-August/004575.html .
#convert an image to png format (requires ImageMagick's "convert" on the PATH)
import os

def convert_to_png(imagename, destPath):
    filename = os.path.join(os.path.abspath(destPath),
                            os.path.basename(imagename)[0:-4] + ".png")
    if os.path.basename(imagename)[-3:] != "png":
        cmmd = "convert " + imagename + " " + filename
        os.system(cmmd)
    return filename

#save the metadata dictionary into the source image, written out under a new file name
def pngsave(sourceImage, metadataDict, filename):
    from PIL import PngImagePlugin
    #reserved = sourceImage.info
    meta = PngImagePlugin.PngInfo()
    # copy the metadata dict into the PNG's text chunks
    for k, v in metadataDict.iteritems():
        #if k in reserved: continue
        meta.add_text(k, str(v), 0)
    # and save
    sourceImage.save(filename, "PNG", pnginfo=meta)
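A minimal round-trip sketch (the file names and keys are made up for illustration): PIL hands the text chunks back through Image.info, so you can verify what pngsave wrote:

# Hypothetical check: 'in.png'/'out.png' and the keys are placeholders.
from PIL import Image

img = Image.open('in.png')
pngsave(img, {'author': 'me', 'sigma': 1.5}, 'out.png')
print Image.open('out.png').info   # the text chunks come back as a plain dict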
I should write a similar fix for JPG metadata.
Saturday, June 27, 2009
Some other useful links
http://www.cs.cmu.edu/afs/cs/project/cil/ftp/html/txtv-source.html
and
http://www.efg2.com/Lab/Library/ImageProcessing/Algorithms.htm
They're useful links to code, documentation, sample data, and examples.
Wednesday, May 27, 2009
Gaussian Blur: C++ implementation
http://www.librow.com/articles/article-9
At the link above you can find C++ code implementing a Gaussian blur filter.
Saturday, May 16, 2009
Gaussian in Python
A Python implementation of the standard-normal Gaussian density:
import math

def gauss(x):
    gaussa = 2 * math.pi
    gaussb = math.sqrt(gaussa)
    gaussc = 1 / gaussb
    gaussd = math.exp(-0.5 * x ** 2)
    gausse = gaussc * gaussd
    return gausse

print gauss(0.0)
from http://www.experts-exchange.com/Programming/Languages/Scripting/Python/Q_23657938.html
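For convenience, the same density can be evaluated over whole arrays with NumPy (a small sketch, not from the linked thread):

import numpy

def gauss_np(x, sigma=1.0):
    # N(0, sigma^2) density, elementwise over x
    x = numpy.asarray(x, dtype=float)
    return numpy.exp(-0.5 * (x / sigma) ** 2) / (sigma * numpy.sqrt(2 * numpy.pi))

print gauss_np([-1.0, 0.0, 1.0])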
Thursday, May 14, 2009
Wednesday, May 13, 2009
Another Gaussian filter in Python
http://rcjp.wordpress.com/2008/04/02/gaussian-pil-image-filter/
Here you are: another Gaussian filter for Python!
Gaussian filtering in Python
http://www.connellybarnes.com/code/python/filterfft
Here you can find some code for Gaussian filtering; it's useful for image processing.
"""
2D image filter of numpy arrays, via FFT.
Connelly Barnes, public domain 2007.
>>> filter([[1,2],[3,4]], [[0,1,0],[1,1,1],[0,1,0]]) # Filter greyscale image
array([[ 8., 11.],
[ 14., 17.]])
>>> A = [[[255,0,0],[0,255,0]],[[0,0,255],[255,0,0]]] # Filter RGB image
>>> filter(A, gaussian()) # Sigma: 0.5 (default)
array([[[ 206.4667464 , 24.2666268 , 24.2666268 ],
[ 48.5332536 , 203.57409357, 2.89265282]],
[[ 48.5332536 , 2.89265282, 203.57409357],
[ 206.4667464 , 24.2666268 , 24.2666268 ]]])
>>> K = gaussian(3) # Sigma: 3
>>> K = gaussian(3, 6) # Sigma: 3 with 6x6 support
(A 6x6 numpy array)
>>> # Load image via PIL, filter, and show result
>>> I = numpy.asarray(Image.open('test.jpg'))
>>> I = filter(I, gaussian(4, (10,10)))
>>> Image.fromarray(numpy.uint8(I)).show()
"""
import numpy
__version__ = '1.0.0'
def filter(I, K, cache=None):
    """
    Filter image I with kernel K.
    Image color values outside I are set equal to the nearest border color on I.
    To filter many images of the same size with the same kernel more efficiently, use:
    >>> cache = []
    >>> filter(I1, K, cache)
    >>> filter(I2, K, cache)
    ...
    An even width filter is aligned by centering the filter window around each given
    output pixel and then rounding down the window extents in the x and y directions.
    """
    def roundup_pow2(x):
        y = 1
        while y < x:
            y *= 2
        return y

    I = numpy.asarray(I)
    K = numpy.asarray(K)

    if len(I.shape) == 3:
        s = I.shape[0:2]
        L = []
        ans = numpy.concatenate([filter(I[:,:,i], K, L).reshape(s+(1,))
                                 for i in range(I.shape[2])], 2)
        return ans

    if len(K.shape) != 2:
        raise ValueError('kernel is not a 2D array')
    if len(I.shape) != 2:
        raise ValueError('image is not a 2D or 3D array')

    s = (roundup_pow2(K.shape[0] + I.shape[0] - 1),
         roundup_pow2(K.shape[1] + I.shape[1] - 1))
    Ipad = numpy.zeros(s)
    Ipad[0:I.shape[0], 0:I.shape[1]] = I

    if cache is not None and len(cache) != 0:
        (Kpad,) = cache
    else:
        Kpad = numpy.zeros(s)
        Kpad[0:K.shape[0], 0:K.shape[1]] = numpy.flipud(numpy.fliplr(K))
        Kpad = numpy.fft.rfft2(Kpad)
        if cache is not None:
            cache[:] = [Kpad]

    Ipad[I.shape[0]:I.shape[0]+(K.shape[0]-1)//2,:I.shape[1]] = I[I.shape[0]-1,:]
    Ipad[:I.shape[0],I.shape[1]:I.shape[1]+(K.shape[1]-1)//2] = I[:,I.shape[1]-1].reshape((I.shape[0],1))

    xoff = K.shape[0]-(K.shape[0]-1)//2-1
    yoff = K.shape[1]-(K.shape[1]-1)//2-1
    Ipad[Ipad.shape[0]-xoff:,:I.shape[1]] = I[0,:]
    Ipad[:I.shape[0],Ipad.shape[1]-yoff:] = I[:,0].reshape((I.shape[0],1))

    Ipad[I.shape[0]:I.shape[0]+(K.shape[0]-1)//2,I.shape[1]:I.shape[1]+(K.shape[1]-1)//2] = I[-1,-1]
    Ipad[Ipad.shape[0]-xoff:,I.shape[1]:I.shape[1]+(K.shape[1]-1)//2] = I[0,-1]
    Ipad[I.shape[0]:I.shape[0]+(K.shape[0]-1)//2,Ipad.shape[1]-yoff:] = I[-1,0]
    Ipad[Ipad.shape[0]-xoff:,Ipad.shape[1]-yoff:] = I[0,0]

    ans = numpy.fft.irfft2(numpy.fft.rfft2(Ipad) * Kpad, Ipad.shape)

    off = ((K.shape[0]-1)//2, (K.shape[1]-1)//2)
    ans = ans[off[0]:off[0]+I.shape[0],off[1]:off[1]+I.shape[1]]

    return ans

def gaussian(sigma=0.5, shape=None):
    """
    Gaussian kernel numpy array with given sigma and shape.
    The shape argument defaults to ceil(6*sigma).
    """
    sigma = max(abs(sigma), 1e-10)
    if shape is None:
        shape = max(int(6*sigma+0.5), 1)
    if not isinstance(shape, tuple):
        shape = (shape, shape)
    x = numpy.arange(-(shape[0]-1)/2.0, (shape[0]-1)/2.0+1e-8)
    y = numpy.arange(-(shape[1]-1)/2.0, (shape[1]-1)/2.0+1e-8)
    Kx = numpy.exp(-x**2/(2*sigma**2))
    Ky = numpy.exp(-y**2/(2*sigma**2))
    ans = numpy.outer(Kx, Ky) / (2.0*numpy.pi*sigma**2)
    return ans/sum(sum(ans))

def test():
    print 'Testing:'
    def arrayeq(A, B):
        return sum(sum(abs(A-B)))<1e-7
    A = [[1,2],[3,4]]
    assert arrayeq(filter(A, [[1]]), A)
    assert arrayeq(filter(A, [[0,1,0],[1,1,1],[0,1,0]]),
                   [[8,11],[14,17]])
    assert arrayeq(filter(A, [[1,1,1,1,1]]),
                   [[7,8],[17,18]])
    assert arrayeq(filter(A, [[1,1,1,1,1]]*5),
                   [[55,60],[65,70]])
    assert arrayeq(filter([[1]], [[0,1,0],[1,1,1],[0,1,0]]),
                   [[5]])
    assert arrayeq(filter([[2]], [[3]]), [[6]])
    B = [[1,2,3],[4,5,6],[7,8,9]]
    assert arrayeq(filter(B, [[0,1,0],[1,1,1],[0,1,0]]),
                   [[9,13,17],[21,25,29],[33,37,41]])
    assert arrayeq(filter(A, A), [[10,16],[24,30]])
    print ' filter: OK'
    assert arrayeq(gaussian(0), [[1]])
    assert arrayeq(gaussian(0.001), [[1]])
    assert arrayeq(gaussian(-0.5), gaussian(0.5))
    assert arrayeq(gaussian(0.5), [[ 0.01134374, 0.08381951, 0.01134374],
                                   [ 0.08381951, 0.61934703, 0.08381951],
                                   [ 0.01134374, 0.08381951, 0.01134374]])
    assert abs(sum(sum(gaussian(3)))-1)<1e-7
    assert abs(sum(sum(gaussian(3,(3,3))))-1)<1e-7
    assert abs(sum(sum(gaussian(3,(1,30))))-1)<1e-7
    assert gaussian(3,(3,3)).shape == (3,3)
    assert gaussian(3,(1,30)).shape == (1,30)
    assert gaussian(3,(30,1)).shape == (30,1)
    assert arrayeq(gaussian(100), gaussian(-100))
    print ' gaussian: OK'

if __name__ == '__main__':
    test()
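The cache argument in filter() pays off when many same-size images share one kernel, since the kernel's padded FFT is computed only once; a small sketch with synthetic images:

import numpy

K = gaussian(2.0)            # one kernel...
cache = []                   # ...whose padded FFT is computed just once
I1 = numpy.random.rand(64, 64)
I2 = numpy.random.rand(64, 64)
J1 = filter(I1, K, cache)    # first call fills the cache
J2 = filter(I2, K, cache)    # second call reuses the kernel FFT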
Notes on CV algorithms
http://ssip2003.info.uvt.ro/lectures/chetverikov/
Here you can find some interesting materials from a Hungarian university's Computer Vision classes.
Sunday, May 10, 2009
Notes on the Lectures: DIS ROMA
http://www.dis.uniroma1.it/~visiope/Note.htm
At the link above, notes on various interesting topics.
Structure and motion functions in MATLAB
http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/TORR1/index.html
At the link above, some useful functions.
MATLAB and Octave Functions for Computer Vision and Image Processing
http://www.csse.uwa.edu.au/~pk/Research/MatlabFns/
On the link above you can find P. Kovesi's website with lots of image processing functions and samples.
Monday, May 4, 2009
Computer Vision Educational Library
http://cved.org/index.php
At the link above, plenty of documentation.
NumPy for MATLAB users: getting in touch with NumPy!
http://www.scipy.org/NumPy_for_Matlab_Users
At the link above, the official documentation comparing MATLAB and NumPy: bloody interesting and useful!
Labels:
api,
documentation,
matlab,
numpy,
python
From MATLAB to Python!
http://vnoel.wordpress.com/2008/05/03/bye-matlab-hello-python-thanks-sage/
At the link above I've found an interesting article about migrating from closed-source MATLAB to very, very open-source Python... thank God Guido made that snake!
Multiple View Geometry in Computer Vision
http://www.robots.ox.ac.uk/~vgg/hzbook/
Some samples from the book "Multiple View Geometry in Computer Vision".
Wednesday, April 29, 2009
Monday, April 27, 2009
SVD python implementation
http://stitchpanorama.sourceforge.net/Python/svd.py
# Almost exact translation of the ALGOL SVD algorithm published in
# Numer. Math. 14, 403-420 (1970) by G. H. Golub and C. Reinsch
#
# Copyright (c) 2005 by Thomas R. Metcalf, helicity314-stitchyahoo com
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# Pure Python SVD algorithm.
# Input: 2-D list (m by n) with m >= n
# Output: U,W V so that A = U*W*VT
# Note this program returns V not VT (=transpose(V))
# On error, a ValueError is raised.
#
# Here is the test case (first example) from Golub and Reinsch
#
# a = [[22.,10., 2., 3., 7.],
# [14., 7.,10., 0., 8.],
# [-1.,13.,-1.,-11., 3.],
# [-3.,-2.,13., -2., 4.],
# [ 9., 8., 1., -2., 4.],
# [ 9., 1.,-7., 5.,-1.],
# [ 2.,-6., 6., 5., 1.],
# [ 4., 5., 0., -2., 2.]]
#
# import svd
# import math
# u,w,vt = svd.svd(a)
# print w
#
# [35.327043465311384, 1.2982256062667619e-15,
# 19.999999999999996, 19.595917942265423, 0.0]
#
# the correct answer is (the order may vary)
#
# print (math.sqrt(1248.),20.,math.sqrt(384.),0.,0.)
#
# (35.327043465311391, 20.0, 19.595917942265423, 0.0, 0.0)
#
# transpose and matrix multiplication functions are also included
# to facilitate the solution of linear systems.
#
# Version 1.0 2005 May 01
import copy
import math
def svd(a):
    '''Compute the singular value decomposition of array.'''

    # Golub and Reinsch state that eps should not be smaller than the
    # machine precision, ie the smallest number
    # for which 1+e>1. tol should be beta/e where beta is the smallest
    # positive number representable in the computer.
    eps = 1.e-15  # assumes double precision
    tol = 1.e-64/eps
    assert 1.0+eps > 1.0 # if this fails, make eps bigger
    assert tol > 0.0     # if this fails, make tol bigger
    itmax = 50
    u = copy.deepcopy(a)
    m = len(a)
    n = len(a[0])
    #if __debug__: print 'a is ',m,' by ',n

    if m < n:
        if __debug__: print 'Error: m is less than n'
        raise ValueError,'SVD Error: m is less than n.'

    e = [0.0]*n  # allocate arrays
    q = [0.0]*n
    v = []
    for k in range(n): v.append([0.0]*n)

    # Householder's reduction to bidiagonal form

    g = 0.0
    x = 0.0

    for i in range(n):
        e[i] = g
        s = 0.0
        l = i+1
        for j in range(i,m): s += (u[j][i]*u[j][i])
        if s <= tol:
            g = 0.0
        else:
            f = u[i][i]
            if f < 0.0:
                g = math.sqrt(s)
            else:
                g = -math.sqrt(s)
            h = f*g-s
            u[i][i] = f-g
            for j in range(l,n):
                s = 0.0
                for k in range(i,m): s += (u[k][i]*u[k][j])
                f = s/h
                for k in range(i,m): u[k][j] = u[k][j] + (f*u[k][i])
        q[i] = g
        s = 0.0
        for j in range(l,n): s = s + (u[i][j]*u[i][j])
        if s <= tol:
            g = 0.0
        else:
            f = u[i][i+1]
            if f < 0.0:
                g = math.sqrt(s)
            else:
                g = -math.sqrt(s)
            h = f*g - s
            u[i][i+1] = f-g
            for j in range(l,n): e[j] = u[i][j]/h
            for j in range(l,m):
                s=0.0
                for k in range(l,n): s = s+(u[j][k]*u[i][k])
                for k in range(l,n): u[j][k] = u[j][k]+(s*e[k])
        y = abs(q[i])+abs(e[i])
        if y > x: x = y
    # accumulation of right hand transformations
    for i in range(n-1,-1,-1):
        if g != 0.0:
            h = g*u[i][i+1]
            for j in range(l,n): v[j][i] = u[i][j]/h
            for j in range(l,n):
                s=0.0
                for k in range(l,n): s += (u[i][k]*v[k][j])
                for k in range(l,n): v[k][j] += (s*v[k][i])
        for j in range(l,n):
            v[i][j] = 0.0
            v[j][i] = 0.0
        v[i][i] = 1.0
        g = e[i]
        l = i
    #accumulation of left hand transformations
    for i in range(n-1,-1,-1):
        l = i+1
        g = q[i]
        for j in range(l,n): u[i][j] = 0.0
        if g != 0.0:
            h = u[i][i]*g
            for j in range(l,n):
                s=0.0
                for k in range(l,m): s += (u[k][i]*u[k][j])
                f = s/h
                for k in range(i,m): u[k][j] += (f*u[k][i])
            for j in range(i,m): u[j][i] = u[j][i]/g
        else:
            for j in range(i,m): u[j][i] = 0.0
        u[i][i] += 1.0
    #diagonalization of the bidiagonal form
    eps = eps*x
    for k in range(n-1,-1,-1):
        for iteration in range(itmax):
            # test f splitting
            for l in range(k,-1,-1):
                goto_test_f_convergence = False
                if abs(e[l]) <= eps:
                    # goto test f convergence
                    goto_test_f_convergence = True
                    break  # break out of l loop
                if abs(q[l-1]) <= eps:
                    # goto cancellation
                    break  # break out of l loop
            if not goto_test_f_convergence:
                #cancellation of e[l] if l>0
                c = 0.0
                s = 1.0
                l1 = l-1
                for i in range(l,k+1):
                    f = s*e[i]
                    e[i] = c*e[i]
                    if abs(f) <= eps:
                        #goto test f convergence
                        break
                    g = q[i]
                    h = pythag(f,g)
                    q[i] = h
                    c = g/h
                    s = -f/h
                    for j in range(m):
                        y = u[j][l1]
                        z = u[j][i]
                        u[j][l1] = y*c+z*s
                        u[j][i] = -y*s+z*c
            # test f convergence
            z = q[k]
            if l == k:
                # convergence
                if z < 0.0:
                    # q[k] is made non-negative
                    q[k] = -z
                    for j in range(n):
                        v[j][k] = -v[j][k]
                break  # break out of iteration loop and move on to next k
            if iteration >= itmax-1:
                if __debug__: print 'Error: no convergence.'
                # should this move on the the next k or exit with error??
                #raise ValueError,'SVD Error: No convergence.' # exit the program with error
                break  # break out of iteration loop and move on to next k
            # shift from bottom 2x2 minor
            x = q[l]
            y = q[k-1]
            g = e[k-1]
            h = e[k]
            f = ((y-z)*(y+z)+(g-h)*(g+h))/(2.0*h*y)
            g = pythag(f,1.0)
            if f < 0:
                f = ((x-z)*(x+z)+h*(y/(f-g)-h))/x
            else:
                f = ((x-z)*(x+z)+h*(y/(f+g)-h))/x
            # next QR transformation
            c = 1.0
            s = 1.0
            for i in range(l+1,k+1):
                g = e[i]
                y = q[i]
                h = s*g
                g = c*g
                z = pythag(f,h)
                e[i-1] = z
                c = f/z
                s = h/z
                f = x*c+g*s
                g = -x*s+g*c
                h = y*s
                y = y*c
                for j in range(n):
                    x = v[j][i-1]
                    z = v[j][i]
                    v[j][i-1] = x*c+z*s
                    v[j][i] = -x*s+z*c
                z = pythag(f,h)
                q[i-1] = z
                c = f/z
                s = h/z
                f = c*g+s*y
                x = -s*g+c*y
                for j in range(m):
                    y = u[j][i-1]
                    z = u[j][i]
                    u[j][i-1] = y*c+z*s
                    u[j][i] = -y*s+z*c
            e[l] = 0.0
            e[k] = f
            q[k] = x
            # goto test f splitting

    vt = transpose(v)
    return (u,q,vt)

def pythag(a,b):
    absa = abs(a)
    absb = abs(b)
    if absa > absb: return absa*math.sqrt(1.0+(absb/absa)**2)
    else:
        if absb == 0.0: return 0.0
        else: return absb*math.sqrt(1.0+(absa/absb)**2)

def transpose(a):
    '''Compute the transpose of a matrix.'''
    m = len(a)
    n = len(a[0])
    at = []
    for i in range(n): at.append([0.0]*m)
    for i in range(m):
        for j in range(n):
            at[j][i]=a[i][j]
    return at

def matrixmultiply(a,b):
    '''Multiply two matrices.
    a must be two dimensional
    b can be one or two dimensional.'''

    am = len(a)
    bm = len(b)
    an = len(a[0])
    try:
        bn = len(b[0])
    except TypeError:
        bn = 1
    if an != bm:
        raise ValueError, 'matrixmultiply error: array sizes do not match.'
    cm = am
    cn = bn
    if bn == 1:
        c = [0.0]*cm
    else:
        c = []
        for k in range(cm): c.append([0.0]*cn)
    for i in range(cm):
        for j in range(cn):
            for k in range(an):
                if bn == 1:
                    c[i] += a[i][k]*b[k]
                else:
                    c[i][j] += a[i][k]*b[k][j]
    return c
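As a quick sanity check, the factors can be multiplied back together with the helper functions above, using the Golub and Reinsch test matrix from the header comments (a small sketch; the reconstruction error should be near machine precision):

a = [[22.,10., 2., 3., 7.],
     [14., 7.,10., 0., 8.],
     [-1.,13.,-1.,-11., 3.],
     [-3.,-2.,13., -2., 4.],
     [ 9., 8., 1., -2., 4.],
     [ 9., 1.,-7., 5.,-1.],
     [ 2.,-6., 6., 5., 1.],
     [ 4., 5., 0., -2., 2.]]
u, w, vt = svd(a)
W = [[0.0]*len(w) for i in range(len(w))]   # diag(w) as a full matrix
for i in range(len(w)): W[i][i] = w[i]
back = matrixmultiply(matrixmultiply(u, W), vt)
print max(abs(back[i][j] - a[i][j]) for i in range(len(a)) for j in range(len(w)))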
## GNU GENERAL PUBLIC LICENSE
## Version 2, June 1991
## Copyright (C) 1989, 1991 Free Software Foundation, Inc.
## 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
## Everyone is permitted to copy and distribute verbatim copies
## of this license document, but changing it is not allowed.
## Preamble
## The licenses for most software are designed to take away your
## freedom to share and change it. By contrast, the GNU General Public
## License is intended to guarantee your freedom to share and change free
## software--to make sure the software is free for all its users. This
## General Public License applies to most of the Free Software
## Foundation's software and to any other program whose authors commit to
## using it. (Some other Free Software Foundation software is covered by
## the GNU Library General Public License instead.) You can apply it to
## your programs, too.
## When we speak of free software, we are referring to freedom, not
## price. Our General Public Licenses are designed to make sure that you
## have the freedom to distribute copies of free software (and charge for
## this service if you wish), that you receive source code or can get it
## if you want it, that you can change the software or use pieces of it
## in new free programs; and that you know you can do these things.
## To protect your rights, we need to make restrictions that forbid
## anyone to deny you these rights or to ask you to surrender the rights.
## These restrictions translate to certain responsibilities for you if you
## distribute copies of the software, or if you modify it.
## For example, if you distribute copies of such a program, whether
## gratis or for a fee, you must give the recipients all the rights that
## you have. You must make sure that they, too, receive or can get the
## source code. And you must show them these terms so they know their
## rights.
## We protect your rights with two steps: (1) copyright the software, and
## (2) offer you this license which gives you legal permission to copy,
## distribute and/or modify the software.
## Also, for each author's protection and ours, we want to make certain
## that everyone understands that there is no warranty for this free
## software. If the software is modified by someone else and passed on, we
## want its recipients to know that what they have is not the original, so
## that any problems introduced by others will not reflect on the original
## authors' reputations.
## Finally, any free program is threatened constantly by software
## patents. We wish to avoid the danger that redistributors of a free
## program will individually obtain patent licenses, in effect making the
## program proprietary. To prevent this, we have made it clear that any
## patent must be licensed for everyone's free use or not licensed at all.
## The precise terms and conditions for copying, distribution and
## modification follow.
##
## GNU GENERAL PUBLIC LICENSE
## TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
## 0. This License applies to any program or other work which contains
## a notice placed by the copyright holder saying it may be distributed
## under the terms of this General Public License. The "Program", below,
## refers to any such program or work, and a "work based on the Program"
## means either the Program or any derivative work under copyright law:
## that is to say, a work containing the Program or a portion of it,
## either verbatim or with modifications and/or translated into another
## language. (Hereinafter, translation is included without limitation in
## the term "modification".) Each licensee is addressed as "you".
## Activities other than copying, distribution and modification are not
## covered by this License; they are outside its scope. The act of
## running the Program is not restricted, and the output from the Program
## is covered only if its contents constitute a work based on the
## Program (independent of having been made by running the Program).
## Whether that is true depends on what the Program does.
## 1. You may copy and distribute verbatim copies of the Program's
## source code as you receive it, in any medium, provided that you
## conspicuously and appropriately publish on each copy an appropriate
## copyright notice and disclaimer of warranty; keep intact all the
## notices that refer to this License and to the absence of any warranty;
## and give any other recipients of the Program a copy of this License
## along with the Program.
## You may charge a fee for the physical act of transferring a copy, and
## you may at your option offer warranty protection in exchange for a fee.
## 2. You may modify your copy or copies of the Program or any portion
## of it, thus forming a work based on the Program, and copy and
## distribute such modifications or work under the terms of Section 1
## above, provided that you also meet all of these conditions:
## a) You must cause the modified files to carry prominent notices
## stating that you changed the files and the date of any change.
## b) You must cause any work that you distribute or publish, that in
## whole or in part contains or is derived from the Program or any
## part thereof, to be licensed as a whole at no charge to all third
## parties under the terms of this License.
## c) If the modified program normally reads commands interactively
## when run, you must cause it, when started running for such
## interactive use in the most ordinary way, to print or display an
## announcement including an appropriate copyright notice and a
## notice that there is no warranty (or else, saying that you provide
## a warranty) and that users may redistribute the program under
## these conditions, and telling the user how to view a copy of this
## License. (Exception: if the Program itself is interactive but
## does not normally print such an announcement, your work based on
## the Program is not required to print an announcement.)
##
## These requirements apply to the modified work as a whole. If
## identifiable sections of that work are not derived from the Program,
## and can be reasonably considered independent and separate works in
## themselves, then this License, and its terms, do not apply to those
## sections when you distribute them as separate works. But when you
## distribute the same sections as part of a whole which is a work based
## on the Program, the distribution of the whole must be on the terms of
## this License, whose permissions for other licensees extend to the
## entire whole, and thus to each and every part regardless of who wrote it.
## Thus, it is not the intent of this section to claim rights or contest
## your rights to work written entirely by you; rather, the intent is to
## exercise the right to control the distribution of derivative or
## collective works based on the Program.
## In addition, mere aggregation of another work not based on the Program
## with the Program (or with a work based on the Program) on a volume of
## a storage or distribution medium does not bring the other work under
## the scope of this License.
## 3. You may copy and distribute the Program (or a work based on it,
## under Section 2) in object code or executable form under the terms of
## Sections 1 and 2 above provided that you also do one of the following:
## a) Accompany it with the complete corresponding machine-readable
## source code, which must be distributed under the terms of Sections
## 1 and 2 above on a medium customarily used for software interchange; or,
## b) Accompany it with a written offer, valid for at least three
## years, to give any third party, for a charge no more than your
## cost of physically performing source distribution, a complete
## machine-readable copy of the corresponding source code, to be
## distributed under the terms of Sections 1 and 2 above on a medium
## customarily used for software interchange; or,
## c) Accompany it with the information you received as to the offer
## to distribute corresponding source code. (This alternative is
## allowed only for noncommercial distribution and only if you
## received the program in object code or executable form with such
## an offer, in accord with Subsection b above.)
## The source code for a work means the preferred form of the work for
## making modifications to it. For an executable work, complete source
## code means all the source code for all modules it contains, plus any
## associated interface definition files, plus the scripts used to
## control compilation and installation of the executable. However, as a
## special exception, the source code distributed need not include
## anything that is normally distributed (in either source or binary
## form) with the major components (compiler, kernel, and so on) of the
## operating system on which the executable runs, unless that component
## itself accompanies the executable.
## If distribution of executable or object code is made by offering
## access to copy from a designated place, then offering equivalent
## access to copy the source code from the same place counts as
## distribution of the source code, even though third parties are not
## compelled to copy the source along with the object code.
##
## 4. You may not copy, modify, sublicense, or distribute the Program
## except as expressly provided under this License. Any attempt
## otherwise to copy, modify, sublicense or distribute the Program is
## void, and will automatically terminate your rights under this License.
## However, parties who have received copies, or rights, from you under
## this License will not have their licenses terminated so long as such
## parties remain in full compliance.
## 5. You are not required to accept this License, since you have not
## signed it. However, nothing else grants you permission to modify or
## distribute the Program or its derivative works. These actions are
## prohibited by law if you do not accept this License. Therefore, by
## modifying or distributing the Program (or any work based on the
## Program), you indicate your acceptance of this License to do so, and
## all its terms and conditions for copying, distributing or modifying
## the Program or works based on it.
## 6. Each time you redistribute the Program (or any work based on the
## Program), the recipient automatically receives a license from the
## original licensor to copy, distribute or modify the Program subject to
## these terms and conditions. You may not impose any further
## restrictions on the recipients' exercise of the rights granted herein.
## You are not responsible for enforcing compliance by third parties to
## this License.
## 7. If, as a consequence of a court judgment or allegation of patent
## infringement or for any other reason (not limited to patent issues),
## conditions are imposed on you (whether by court order, agreement or
## otherwise) that contradict the conditions of this License, they do not
## excuse you from the conditions of this License. If you cannot
## distribute so as to satisfy simultaneously your obligations under this
## License and any other pertinent obligations, then as a consequence you
## may not distribute the Program at all. For example, if a patent
## license would not permit royalty-free redistribution of the Program by
## all those who receive copies directly or indirectly through you, then
## the only way you could satisfy both it and this License would be to
## refrain entirely from distribution of the Program.
## If any portion of this section is held invalid or unenforceable under
## any particular circumstance, the balance of the section is intended to
## apply and the section as a whole is intended to apply in other
## circumstances.
## It is not the purpose of this section to induce you to infringe any
## patents or other property right claims or to contest validity of any
## such claims; this section has the sole purpose of protecting the
## integrity of the free software distribution system, which is
## implemented by public license practices. Many people have made
## generous contributions to the wide range of software distributed
## through that system in reliance on consistent application of that
## system; it is up to the author/donor to decide if he or she is willing
## to distribute software through any other system and a licensee cannot
## impose that choice.
## This section is intended to make thoroughly clear what is believed to
## be a consequence of the rest of this License.
##
## 8. If the distribution and/or use of the Program is restricted in
## certain countries either by patents or by copyrighted interfaces, the
## original copyright holder who places the Program under this License
## may add an explicit geographical distribution limitation excluding
## those countries, so that distribution is permitted only in or among
## countries not thus excluded. In such case, this License incorporates
## the limitation as if written in the body of this License.
## 9. The Free Software Foundation may publish revised and/or new versions
## of the General Public License from time to time. Such new versions will
## be similar in spirit to the present version, but may differ in detail to
## address new problems or concerns.
## Each version is given a distinguishing version number. If the Program
## specifies a version number of this License which applies to it and "any
## later version", you have the option of following the terms and conditions
## either of that version or of any later version published by the Free
## Software Foundation. If the Program does not specify a version number of
## this License, you may choose any version ever published by the Free Software
## Foundation.
## 10. If you wish to incorporate parts of the Program into other free
## programs whose distribution conditions are different, write to the author
## to ask for permission. For software which is copyrighted by the Free
## Software Foundation, write to the Free Software Foundation; we sometimes
## make exceptions for this. Our decision will be guided by the two goals
## of preserving the free status of all derivatives of our free software and
## of promoting the sharing and reuse of software generally.
## NO WARRANTY
## 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
## FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
## OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
## PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
## OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
## MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
## TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
## PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
## REPAIR OR CORRECTION.
## 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
## WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
## REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
## INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
## OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
## TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
## YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
## PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
## POSSIBILITY OF SUCH DAMAGES.
## END OF TERMS AND CONDITIONS
##
SVD algorithm in Numpy
http://www.scipy.org/doc/numpy_api_docs/numpy.linalg.linalg.html#svd
NumPy routine implementing the SVD operation, useful for triangulating points: svd(a, full_matrices=1, compute_uv=1)
Singular Value Decomposition.
u,s,vh = svd(a)
If a is an M x N array, then the svd produces a factoring of the array
into two unitary (orthogonal) 2-d arrays u (MxM) and vh (NxN) and a
min(M,N)-length array of singular values such that
a == dot(u,dot(S,vh))
where S is an MxN array of zeros whose main diagonal is s.
if compute_uv == 0, then return only the singular values
if full_matrices == 0, then only part of either u or vh is
returned so that it is MxN
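A minimal sketch of what that factoring means in practice (the matrix values are arbitrary):

import numpy

a = numpy.array([[1., 2., 3.],
                 [4., 5., 6.]])
u, s, vh = numpy.linalg.svd(a)      # u: 2x2, s: length 2, vh: 3x3
S = numpy.zeros(a.shape)            # embed s on the main diagonal of an MxN matrix
S[:len(s), :len(s)] = numpy.diag(s)
print numpy.allclose(a, numpy.dot(u, numpy.dot(S, vh)))   # True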
Labels:
api,
documentation,
numpy,
python,
triangulation
Computation of a 3D structure: some sample code
http://cs.gmu.edu/~kosecka/examples-code/compute3DStructure.m
Some sample MATLAB code for triangulating points at the link above.
% [X,lambda] = compute3DStructure(p, q, R, T)
% p - first image coordinates
% q - second image coordinates
% R, T - displacement between the views
% X - 3D coordinates of the points with respect to 1st and 2nd view
% lambda - scales with respect to the first view (lambda x = X)
% The basic linear triangulation algorithm
% for recovering the depth of a point given the projection onto
% two images and the relative pose of the cameras,
% as described in Chapter 5, "An introduction to 3-D Vision"
% by Y. Ma, S. Soatto, J. Kosecka, S. Sastry (MASKS)
%
% Code distributed free for non-commercial use
% Copyright (c) MASKS, 2003
%
% Last modified 5/5/2003
function [XP, lambda] = compute3DStructure(p, q, R, T);
nc = size(q, 2);
% linear triangulation method
M = [];
for i=1:nc
A = [ 0 -1 p(2,i) 0;
-1 0 p(1,i) 0;
(-R(2,:) + q(2,i)*R(3,:)) -T(2) + q(2,i)*T(3);
(-R(1,:) + q(1,i)*R(3,:)) -T(1) + q(1,i)*T(3)];
[ua,sa,va] = svd(A);
X(:,i) = va(:,4);
end
XP(:,:,1) = [X(1,:)./X(4,:);X(2,:)./X(4,:);X(3,:)./X(4,:); X(4,:)./X(4,:)];
lambda = XP(3,:,1);
XP(:,:,2) = [R, T; 0 0 0 1]*XP(:,:,1);
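For reference, the same linear triangulation step can be written with NumPy; this is a sketch under the conventions above (p, q as normalized homogeneous image coordinates, R, T the displacement between the views), not a verified port of the MASKS code:

import numpy

def triangulate_point(p, q, R, T):
    # Build the 4x4 system whose least-squares solution (the last right
    # singular vector of A) is the homogeneous 3D point X.
    P1 = numpy.hstack([numpy.eye(3), numpy.zeros((3, 1))])   # [I | 0]
    P2 = numpy.hstack([R, numpy.reshape(T, (3, 1))])         # [R | T]
    A = numpy.vstack([p[0]*P1[2] - P1[0],
                      p[1]*P1[2] - P1[1],
                      q[0]*P2[2] - P2[0],
                      q[1]*P2[2] - P2[1]])
    U, S, Vh = numpy.linalg.svd(A)
    X = Vh[-1]
    return X / X[3]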
Two view Demo: an example
http://vision.ucla.edu//MASKS/twoview_reconstruction/TwoViewDemo.m
At the link above, some sample MATLAB code performing triangulation operations.
% two view example of calibrated structure from motion
% recovery
close all; clear;
im0 = rgb2gray(imread('boxes1.bmp','bmp'));
im1 = rgb2gray(imread('boxes8.bmp', 'bmp'));
[ydim, xdim] = size(im0);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% intrinsic parameter matrix
Khat = [653.91 0 320.52; 0 649.72 220.40; 0 0 1];
%%%%%%%%%%%%%%%%%%%
% undo distortion
im0 = distortion(im0,0.03);
im1 = distortion(im1,0.03);
%%%%%%%%%%%%%%%%%%%%%
load boxpoints % 13 points seen in two frames
NPOINTS = size(xim(:,:,1),2);
figure(1); imagesc(im0); colormap gray; hold on; axis equal; axis off;
plot(xim(1,:,1), xim(2,:,1),'w.');
figure(2); imagesc(im1); colormap gray; hold on; axis equal; axis off;
plot(xim(1,:,2), xim(2,:,2),'w.');
FRAMES = 2;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% compute retinal coordinates
for i = 1:FRAMES
xn(:,:,i) = inv(Khat)*xim(:,:,i);
end
%%%%%%%%%%%%%%%%%%%%
% initialize motion
[T0, R0] = dessential(xn(:,:,1), xn(:,:,2));
lambda = zeros(1,NPOINTS);
XE = zeros(3,NPOINTS,FRAMES);
[X,lambda]=compute3DStructure(xn(:,:,1),xn(:,:,2), R0, T0);
figure(FRAMES + 1);
XE(:,:,1) = X(:,:,1);
XE(:,:,2) = X(:,:,2);
plot3(XE(1,:,1),XE(2,:,1),XE(3,:,1),'r.'); hold on;
% pause
[l3d, inc] = boxseqlines(XE(1:3,:,1));
lnum = size(l3d,2);
for i=1:lnum
line([l3d(1,i) l3d(4,i)], [l3d(2,i) l3d(5,i)], [l3d(3,i) l3d(6,i)]);
end;
xlabel('x'); ylabel('y'); zlabel('z'); % axis equal;
title('Euclidean reconstruction'); view(220,20); axis equal; box on;
hold on;
pause;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% compute residuals and reproject them
xres0=Khat*[X(1,:,1)./X(3,:,1); X(2,:,1)./X(3,:,1); ones(1,NPOINTS)];
xres1=Khat*[X(1,:,2)./X(3,:,2); X(2,:,2)./X(3,:,2); ones(1,NPOINTS)];
figure(1); hold on; plot(xres0(1,:), xres0(2,:),'y.');
figure(FRAMES); hold on; plot(xres1(1,:), xres1(2,:),'y.');
pause;
xd0 = xim(1:2,:,1) - xres0(1:2,:);
xd1 = xim(1:2,:,2) - xres1(1:2,:);
res = [xd0'; xd1'];
f = [norm(res'*res)/2];
% pause
Saturday, April 25, 2009
Face detection with openCV python bindings
http://eclecti.cc/olpc/face-detection-on-the-olpc-xo
http://blog.jozilla.net/2008/06/27/fun-with-python-opencv-and-face-detection/
On the link above you can find the following code:
#!/usr/bin/python
# Face Detection using OpenCV. Based on sample code by Roman Stanchak
# Nirav Patel http://eclecti.cc 5/20/2008
import sys, os
from opencv.cv import *
from opencv.highgui import *
def detectObject(image):
    grayscale = cvCreateImage(cvSize(640, 480), 8, 1)
    cvCvtColor(image, grayscale, CV_BGR2GRAY)
    storage = cvCreateMemStorage(0)
    cvClearMemStorage(storage)
    cvEqualizeHist(grayscale, grayscale)
    cascade = cvLoadHaarClassifierCascade('haarcascade_frontalface_alt.xml',
                                          cvSize(1,1))
    faces = cvHaarDetectObjects(grayscale, cascade, storage, 1.2, 2,
                                CV_HAAR_DO_CANNY_PRUNING, cvSize(100,100))
    if faces:
        for i in faces:
            cvRectangle(image, cvPoint( int(i.x), int(i.y)),
                        cvPoint(int(i.x+i.width), int(i.y+i.height)),
                        CV_RGB(0,255,0), 3, 8, 0)

def displayObject(image):
    cvNamedWindow("face", 1)
    cvShowImage("face", image)
    cvWaitKey(0)
    cvDestroyWindow("face")

def main():
    # Uses xawtv. Gstreamer can be used instead, but I found it much slower
    os.system("v4lctl snap jpeg 640x480 /tmp/face.jpg")
    image = cvLoadImage("/tmp/face.jpg")
    detectObject(image)
    displayObject(image)
    cvSaveImage("/tmp/face.jpg", image)

if __name__ == "__main__":
    main()
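If xawtv is not installed, the frame grab can likely be done directly through the old highgui bindings instead of shelling out; a sketch (whether this SWIG build exposes the OpenCV 1.x C names exactly like this is an assumption):

# Sketch: grab one frame with highgui instead of v4lctl (names assumed from the OpenCV 1.x C API).
from opencv.highgui import cvCreateCameraCapture, cvQueryFrame

def grab_frame(index=0):
    capture = cvCreateCameraCapture(index)
    return cvQueryFrame(capture)   # an IplImage owned by the capture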
Experimental functionalities in OpenCV
http://www710.univ-lyon1.fr/~bouakaz/OpenCV-0.9.5/docs/ref/OpenCVRef_Experimental.htm#decl_cvFindStereoCorrespondence
At the link above, some information about OpenCV's experimental functionality for stereo matching and object recognition.
OpenCV: Effort for 3D reconstruction
http://opencv.willowgarage.com/wiki/EffortFor3dReconstruction
At the link above, some documentation about the OpenCV API's routines for 3D reconstruction.
Thursday, April 23, 2009
Camera calibration and point reconstruction using DLT in python
http://www.mail-archive.com/floatcanvas@mithis.com/msg00513.html
Here you have some code for camera calibration and point reconstruction using DLT.
I've found it at the link above.
Here is an implementation of camera calibration and point
reconstruction using direct linear transformation (DLT) in python. I
intend to use that in my application with FloatCanvas.
Following Larry Meyn's suggestion, the code uses NumPy and SVD to solve
the problem.
The code works for both 2 and 3 dimensional camera calibration and for
any number of views (cameras).
At the end of the code I show some simple examples (just run the code)
and I tried to comment it as much as possible. I know the code works
but probably it is not very pythonic and efficient. So, I'd appreciate
any comments and corrections to the code.
In relation to FC (actually to FC2), I will integrate this code in FC2
to find the world coordinates of points in an image after the camera
calibration step.
For a 2D case with only one view (camera), this transformation from
image projection to world (2D) coordinates (x, y) is just a 3x3
transformation (M) applied to the image coordinates (u, v) followed by
a scaling (see lines 153-158), so it would be easy to make this simple
use of DLT native in FC2.
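As a quick illustration of that single-view 2D case (a sketch; H stands for whatever 3x3 matrix the calibration produced):

import numpy as N

def image_to_world_2d(H, u, v):
    # apply the inverse 3x3 DLT matrix to the image point, then rescale
    # the homogeneous result
    xyw = N.dot(N.linalg.inv(H), [u, v, 1.0])
    return xyw[0:2] / xyw[2]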
Maybe FC2 could include the whole DLT calibration in its library; to
ensure it is correctly implemented and easy to use (I agree with Bob
Cunningham), or at least we could add it as an example in FC.
#file DLTx.py version .1
'''
Camera calibration and point reconstruction based on direct linear transformation (DLT).
The fundamental problem here is to find a mathematical relationship between the
coordinates of a 3D point and its projection onto the image plane. The DLT
(a linear approximation to this problem) is derived from modelling the object
and its projection on the image plane as a pinhole camera situation.
In simplistic terms, using the pinhole camera model, similar triangles give the
following relation between the image coordinates (u,v) and the 3D
point (X,Y,Z):
[ u ] [ L1 L2 L3 L4 ] [ X ]
[ v ] = [ L5 L6 L7 L8 ] [ Y ]
[ 1 ] [ L9 L10 L11 L12 ] [ Z ]
[ 1 ]
The matrix L is known as the camera matrix or camera projection matrix. For a
2D point (X,Y), the last column of the matrix doesn't exist. In fact, the L12
term (or L9 for 2D DLT) is not independent of the other parameters, so
there are only 11 (or 8 for 2D DLT) independent parameters in the DLT to be
determined.
DLT is typically used in two steps: 1. camera calibration and 2. object (point)
reconstruction.
The camera calibration step consists of digitizing points with known coordinates
in the real space.
At least 4 points are necessary for the calibration of a plane (2D DLT) and at
least 6 points for the calibration of a volume (3D DLT). For the 2D DLT, at least
one view of the object (points) must be entered. For the 3D DLT, at least 2
different views of the object (points) must be entered.
These coordinates (from the object and image(s)) are input to the DLTcalib
algorithm which estimates the camera parameters (8 for 2D DLT and 11 for 3D DLT).
With these camera parameters and with the camera(s) at the same position as in the
calibration step, we can now reconstruct the real position of any point inside
the calibrated space (area for 2D DLT and volume for the 3D DLT) from the point
position(s) viewed by the same fixed camera(s).
This code can perform 2D or 3D DLT with any number of views (cameras).
For 3D DLT, at least two views (cameras) are necessary.
There are more accurate (but more complex) algorithms for camera calibration that
also consider lens distortion. For example, OpenCV and Tsai's software have been
ported to Python. However, DLT is classic, simple, and effective (fast) for
most applications.
About DLT, see: http://kwon3d.com/theory/dlt/dlt.html
This code is based on different implementations and teaching material on DLT
found in the internet.
'''
import numpy as N
def DLTcalib(nd, xyz, uv):
    '''
    Camera calibration by DLT using known object points and their image points.
    This code performs 2D or 3D DLT camera calibration with any number of views (cameras).
    For 3D DLT, at least two views (cameras) are necessary.
    Inputs:
    nd is the number of dimensions of the object space: 3 for 3D DLT and 2 for 2D DLT.
    xyz are the coordinates in the object 3D or 2D space of the calibration points.
    uv are the coordinates in the image 2D space of these calibration points.
    The coordinates (x,y,z and u,v) are given as columns and the different points as rows.
    For the 2D DLT (object planar space), only the first 2 columns (x and y) are used.
    There must be at least 6 calibration points for the 3D DLT and 4 for the 2D DLT.
    Outputs:
    L: array of the 8 or 11 parameters of the calibration matrix
    err: error of the DLT (mean residual of the DLT transformation in units of camera coordinates).
    '''

    #Convert all variables to numpy array:
    xyz = N.asarray(xyz)
    uv = N.asarray(uv)
    #number of points:
    np = xyz.shape[0]
    #Check the parameters:
    if uv.shape[0] != np:
        raise ValueError, 'xyz (%d points) and uv (%d points) have different number of points.' %(np, uv.shape[0])
    if (nd == 2 and xyz.shape[1] != 2) or (nd == 3 and xyz.shape[1] != 3):
        raise ValueError, 'Incorrect number of coordinates (%d) for %dD DLT (it should be %d).' %(xyz.shape[1],nd,nd)
    if (nd == 3 and np < 6) or (nd == 2 and np < 4):
        raise ValueError, '%dD DLT requires at least %d calibration points. Only %d points were entered.' %(nd, 2*nd, np)

    #Normalize the data to improve the DLT quality (DLT depends on the coordinate
    #system): centroid to the origin and mean unit distance in each direction.
    Txyz, xyzn = Normalization(nd, xyz)
    Tuv, uvn = Normalization(2, uv)

    A = []
    if nd == 2: #2D DLT
        for i in range(np):
            x,y = xyzn[i,0], xyzn[i,1]
            u,v = uvn[i,0], uvn[i,1]
            A.append( [x, y, 1, 0, 0, 0, -u*x, -u*y, -u] )
            A.append( [0, 0, 0, x, y, 1, -v*x, -v*y, -v] )
    elif nd == 3: #3D DLT
        for i in range(np):
            x,y,z = xyzn[i,0], xyzn[i,1], xyzn[i,2]
            u,v = uvn[i,0], uvn[i,1]
            A.append( [x, y, z, 1, 0, 0, 0, 0, -u*x, -u*y, -u*z, -u] )
            A.append( [0, 0, 0, 0, x, y, z, 1, -v*x, -v*y, -v*z, -v] )

    #Convert A to array and find the 11 (or 8 for 2D DLT) parameters:
    A = N.asarray(A)
    U, S, Vh = N.linalg.svd(A)
    #The parameters are in the last line of Vh; normalize them:
    L = Vh[-1,:] / Vh[-1,-1]
    #Camera projection matrix:
    H = L.reshape(3,nd+1)
    #Denormalization:
    H = N.dot( N.dot( N.linalg.pinv(Tuv), H ), Txyz )
    H = H / H[-1,-1]
    L = H.flatten(0)
    #Mean error of the DLT (mean residual of the DLT transformation in
    #units of camera coordinates):
    uv2 = N.dot( H, N.concatenate( (xyz.T, N.ones((1,xyz.shape[0]))) ) )
    uv2 = uv2/uv2[2,:]
    #mean distance:
    err = N.sqrt( N.mean(N.sum( (uv2[0:2,:].T - uv)**2, 1 )) )

    return L, err

def DLTrecon(nd, nc, Ls, uvs):
    '''
    Reconstruction of an object point based on the DLT parameters.
    Inputs:
    nd: number of dimensions of the object space (2 or 3).
    nc: number of cameras (views) used.
    Ls: camera calibration parameters of each camera (output of DLTcalib),
    given as rows for the different cameras.
    uvs: image coordinates of the point, one row per view.
    Output:
    xyz: point coordinates in space.
    '''

    #Convert Ls to array:
    Ls = N.asarray(Ls)
    #Check the parameters:
    if Ls.ndim == 1 and nc != 1:
        raise ValueError, 'Number of views (%d) and number of sets of camera calibration parameters (1) are different.' %(nc)
    if Ls.ndim > 1 and nc != Ls.shape[0]:
        raise ValueError, 'Number of views (%d) and number of sets of camera calibration parameters (%d) are different.' %(nc, Ls.shape[0])
    if nd == 3 and Ls.ndim == 1:
        raise ValueError, 'At least two sets of camera calibration parameters are needed for 3D point reconstruction.'

    if nc == 1: #2D and 1 camera (view), the simplest (and fastest) case
        #One could calculate inv(H) and input that to the code to speed up things if needed.
        #(If there is only 1 camera, this transformation is all Floatcanvas2 might need)
        Hinv = N.linalg.inv( Ls.reshape(3,3) )
        #Point coordinates in space:
        xyz = N.dot( Hinv, [uvs[0],uvs[1],1] )
        xyz = xyz[0:2]/xyz[2]
    else:
        M = []
        for i in range(nc):
            L = Ls[i,:]
            u,v = uvs[i][0], uvs[i][1] #this indexing works for both list and numpy array
            if nd == 2:
                M.append( [L[0]-u*L[6], L[1]-u*L[7], L[2]-u*L[8]] )
                M.append( [L[3]-v*L[6], L[4]-v*L[7], L[5]-v*L[8]] )
            elif nd == 3:
                M.append( [L[0]-u*L[8], L[1]-u*L[9], L[2]-u*L[10], L[3]-u*L[11]] )
                M.append( [L[4]-v*L[8], L[5]-v*L[9], L[6]-v*L[10], L[7]-v*L[11]] )
        #Find the xyz coordinates:
        U, S, Vh = N.linalg.svd(N.asarray(M))
        #Point coordinates in space:
        xyz = Vh[-1,0:-1] / Vh[-1,-1]
    return xyz
def Normalization(nd, x):
    '''
    Normalization of coordinates (centroid to the origin and mean distance of sqrt(2 or 3)).
    Inputs:
    nd: number of dimensions (2 for 2D; 3 for 3D)
    x: the data to be normalized (directions at different columns and points at rows)
    Outputs:
    Tr: the transformation matrix (translation plus scaling)
    x: the transformed data
    '''

    x = N.asarray(x)
    m, s = N.mean(x,0), N.std(x)
    if nd == 2:
        Tr = N.array([[s, 0, m[0]], [0, s, m[1]], [0, 0, 1]])
    else:
        Tr = N.array([[s, 0, 0, m[0]], [0, s, 0, m[1]], [0, 0, s, m[2]], [0, 0, 0, 1]])
    Tr = N.linalg.inv(Tr)
    x = N.dot( Tr, N.concatenate( (x.T, N.ones((1,x.shape[0]))) ) )
    x = x[0:nd,:].T
    return Tr, x
def test():
#Tests of DLTx
print ''
print 'Test of camera calibration and point reconstruction based on direct linear transformation (DLT).'
print '3D (x, y, z) coordinates (in cm) of the corners of a cube (the measurement error is at least 0.2 cm):'
xyz = [[0,0,0], [0,12.3,0], [14.5,12.3,0], [14.5,0,0], [0,0,14.5], [0,12.3,14.5], [14.5,12.3,14.5], [14.5,0,14.5]]
print N.asarray(xyz)
print '2D (u, v) coordinates (in pixels) of 4 different views of the cube:'
uv1 = [[1302,1147],[1110,976],[1411,863],[1618,1012],[1324,812],[1127,658],[1433,564],[1645,704]]
uv2 = [[1094,1187],[1130,956],[1514,968],[1532,1187],[1076,854],[1109,647],[1514,659],[1523,860]]
uv3 = [[1073,866],[1319,761],[1580,896],[1352,1016],[1064,545],[1304,449],[1568,557],[1313,668]]
uv4 = [[1205,1511],[1193,1142],[1601,1121],[1631,1487],[1157,1550],[1139,1124],[1628,1100],[1661,1520]]
print 'uv1:'
print N.asarray(uv1)
print 'uv2:'
print N.asarray(uv2)
print 'uv3:'
print N.asarray(uv3)
print 'uv4:'
print N.asarray(uv4)
print ''
print 'Use 4 views to perform a 3D calibration of the camera with 8 points of the cube:'
nd=3
nc=4
L1, err1 = DLTcalib(nd, xyz, uv1)
print 'Camera calibration parameters based on view #1:'
print L1
print 'Error of the calibration of view #1 (in pixels):'
print err1
L2, err2 = DLTcalib(nd, xyz, uv2)
print 'Camera calibration parameters based on view #2:'
print L2
print 'Error of the calibration of view #2 (in pixels):'
print err2
L3, err3 = DLTcalib(nd, xyz, uv3)
print 'Camera calibration parameters based on view #3:'
print L3
print 'Error of the calibration of view #3 (in pixels):'
print err3
L4, err4 = DLTcalib(nd, xyz, uv4)
print 'Camera calibration parameters based on view #4:'
print L4
print 'Error of the calibration of view #4 (in pixels):'
print err4
xyz1234 = N.zeros((len(xyz),3))
L1234 = [L1,L2,L3,L4]
for i in range(len(uv1)):
xyz1234[i,:] = DLTrecon( nd, nc, L1234, [uv1[i],uv2[i],uv3[i],uv4[i]] )
print 'Reconstruction of the same 8 points based on 4 views and the camera calibration parameters:'
print xyz1234
print 'Mean error of the point reconstruction using the DLT (error in cm):'
print N.mean(N.sqrt(N.sum((N.array(xyz1234)-N.array(xyz))**2,1)))
print ''
print 'Test of the 2D DLT'
print '2D (x, y) coordinates (in cm) of the corners of a square (the measurement error is at least 0.2 cm):'
xy = [[0,0], [0,12.3], [14.5,12.3], [14.5,0]]
print N.asarray(xy)
print '2D (u, v) coordinates (in pixels) of 2 different views of the square:'
uv1 = [[1302,1147],[1110,976],[1411,863],[1618,1012]]
uv2 = [[1094,1187],[1130,956],[1514,968],[1532,1187]]
print 'uv1:'
print N.asarray(uv1)
print 'uv2:'
print N.asarray(uv2)
print ''
print 'Use 2 views to perform a 2D calibration of the camera with 4 points of the square:'
nd=2
nc=2
L1, err1 = DLTcalib(nd, xy, uv1)
print 'Camera calibration parameters based on view #1:'
print L1
print 'Error of the calibration of view #1 (in pixels):'
print err1
L2, err2 = DLTcalib(nd, xy, uv2)
print 'Camera calibration parameters based on view #2:'
print L2
print 'Error of the calibration of view #2 (in pixels):'
print err2
xy12 = N.zeros((len(xy),2))
L12 = [L1,L2]
for i in range(len(uv1)):
xy12[i,:] = DLTrecon( nd, nc, L12, [uv1[i],uv2[i]] )
print 'Reconstruction of the same 4 points based on 2 views and the camera calibration parameters:'
print xy12
print 'Mean error of the point reconstruction using the DLT (error in cm):'
print N.mean(N.sqrt(N.sum((N.array(xy12)-N.array(xy))**2,1)))
print ''
print 'Use only one view to perform a 2D calibration of the camera with 4 points of the square:'
nd=2
nc=1
L1, err1 = DLTcalib(nd, xy, uv1)
print 'Camera calibration parameters based on view #1:'
print L1
print 'Error of the calibration of view #1 (in pixels):'
print err1
xy1 = N.zeros((len(xy),2))
for i in range(len(uv1)):
xy1[i,:] = DLTrecon( nd, nc, L1, uv1[i] )
print 'Reconstruction of the same 4 points based on one view and the camera calibration parameters:'
print xy1
print 'Mean error of the point reconstruction using the DLT (error in cm):'
print N.mean(N.sqrt(N.sum((N.array(xy1)-N.array(xy))**2,1)))
test()
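A small usage note, not part of the original script: once DLTcalib has produced the parameters L for a view, projecting any object point into that view is just an application of the 3x4 camera matrix. A minimal sketch, assuming it runs inside test() (or an interactive session) where L1, xyz and uv1 are already defined:
# project one cube corner with the calibrated parameters of view #1
# (L1 and uv1 are the variables computed in test() above)
H1 = N.asarray(L1).reshape(3, 4)
uvw = N.dot(H1, [14.5, 12.3, 14.5, 1.0])  # homogeneous image coordinates
print uvw[0]/uvw[2], uvw[1]/uvw[2]        # should be close to uv1[6] = [1433, 564]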
vtk: triangulation and texturizing
http://www.kino3d.com/forum/viewtopic.php?t=4705&postdays=0&postorder=asc&start=60&sid=56ed646e2abdf2c07ee06da807f2cace
An interesting link where you can find some VTK (Python) code for triangulation and meshing operations.
Image to Array with numpy.asarray()
http://docs.scipy.org/doc/numpy/reference/index.html
At the link above you can find the reference manual for NumPy.
Here is some sample code from http://stackoverflow.com/questions/524930/numpy-pil-adding-an-image:
from PIL import Image
from numpy import *
im1 = Image.open('/Users/rem7/Desktop/_1.jpg')
im2 = Image.open('/Users/rem7/Desktop/_2.jpg')
im1arr = asarray(im1)
im2arr = asarray(im2)
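Going the other way (and in the spirit of the linked question, which is about adding two images), newer PIL versions can also build an image directly from a numpy array. A minimal sketch, assuming PIL 1.1.6+ and the two arrays above; the output path is illustrative:
# blend the two images and convert the result back to a PIL image
# (Image.fromarray is available from PIL 1.1.6 on)
im3arr = (im1arr.astype(float) + im2arr.astype(float)) / 2
im3 = Image.fromarray(im3arr.astype(uint8))
im3.save('/Users/rem7/Desktop/_blend.jpg')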
Image to Array, array to Image
http://mail.python.org/pipermail/python-list/2000-August/046892.html
These are the code and comments from the link above:
It is straightforward to convert Numeric arrays into PIL images:
#--------------------
from Numeric import *
from PIL import Image
# construct an arbitrary Numeric matrix
dims = (256,256)
arr = zeros(dims,Int8)
for i in xrange(dims[0]):
arr[i,i] = 255
# and convert it to an Image
img = Image.fromstring('L',dims,arr.tostring())
img.save('foo.gif')
#--------------------
or to convert from a PIL image back to a Numeric array:
#--------------------
from Numeric import *
from PIL import Image
newImg = Image.open('foo.gif')
newArr = fromstring(newImg.tostring(),Int8)
newArr = reshape(newArr,newImg.size)
#-------------------
If you are constructing images from Numeric arrays, don't forget that
the Image values should run from 0-255, so you'll need to scale your
data in the Numeric array before creating the image.
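A minimal sketch of that scaling step, written against numpy and PIL (the modern counterparts of Numeric, used elsewhere on this blog) since the Numeric API is long gone; the array contents and file name are illustrative:
import numpy
from PIL import Image

data = numpy.random.rand(256, 256) * 1000.0   # float data with an arbitrary range
# rescale to the 0-255 range expected by 8-bit ('L') images:
lo, hi = data.min(), data.max()
scaled = ((data - lo) * 255.0 / (hi - lo)).astype(numpy.uint8)
# note: PIL sizes are (width, height), numpy shapes are (rows, cols)
img = Image.fromstring('L', scaled.shape[::-1], scaled.tostring())
img.save('scaled.png')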
Wednesday, April 22, 2009
Python bindings to CGAL by INRIA
http://ralyx.inria.fr/2006/Raweb/geometrica/uid51.html
The goal of the CGAL-Python project is to provide Python bindings for the CGAL library. CGAL is the Computational Geometry Algorithms Library, a C++ library of generic, efficient and robust geometric algorithms.
You can find the code at http://cgal-python.gforge.inria.fr/.
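For a taste of what the bindings look like, here is a tiny Delaunay triangulation sketch; the module and class names follow the project's examples but should be treated as assumptions, since the API has changed between releases:
# build a 2D Delaunay triangulation of five points
# (module paths are assumptions based on cgal-python examples)
from CGAL.Kernel import Point_2
from CGAL.Triangulations_2 import Delaunay_triangulation_2

dt = Delaunay_triangulation_2()
for x, y in [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]:
    dt.insert(Point_2(x, y))
print dt.number_of_vertices()   # should print 5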
SIFT Python Implementation
http://jesolem.blogspot.com/2009/02/sift-python-implementation.html
This is the text and the code from the link above:
I'd like to share a Python interface I wrote for David Lowe's Scale Invariant Feature Transform (SIFT) implementation. David, the inventor of SIFT, has for several years generously shared binaries with a Matlab interface on his website. Inspired by the Matlab files for reading keypoint descriptor files and for matching between images, I decided to write a Python version. Here it is: sift.py.
The file itself should be self-explanatory, especially together with the documentation that comes with Lowe's zip-file. Anyway, here's an example:
>>> from PIL import Image
>>> from numpy import *
>>> import sift
>>> sift.process_image('basmati.pgm', 'basmati.key')
>>> l1,d1 = sift.read_features_from_file('basmati.key')
>>> im = array(Image.open('basmati.pgm'))
>>> sift.plot_features(im,l1)
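The matching side (the counterpart of Lowe's match.m) is not shown in the example above. Here is a minimal sketch of ratio-based descriptor matching with numpy, using the descriptor arrays that read_features_from_file returns; the function name and threshold are illustrative, not part of sift.py:
from numpy import *

def match_descriptors(desc1, desc2, ratio=0.6):
    """ for each descriptor in desc1, return the index of its best
    match in desc2, or -1 if the ratio test rejects it """
    # normalize descriptors to unit length
    desc1 = desc1 / sqrt(sum(desc1**2, axis=1))[:, newaxis]
    desc2 = desc2 / sqrt(sum(desc2**2, axis=1))[:, newaxis]
    matches = []
    for d in desc1:
        # angle between unit vectors = arccos of their dot product
        angles = arccos(clip(dot(desc2, d), -1, 1))
        order = argsort(angles)
        # accept only if clearly better than the second-best candidate
        if angles[order[0]] < ratio * angles[order[1]]:
            matches.append(order[0])
        else:
            matches.append(-1)
    return matches
matches = match_descriptors(d1, d2) would then give, for each feature in the first image, the index of its match in the second image (or -1).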
VLFeat is a MATLAB-compatible implementation of common computer vision algorithms
http://www.vlfeat.org/
VLFeat is a MATLAB-compatible implementation of common computer vision algorithms such as SIFT, MSER, k-means, hierarchical k-means, and agglomerative information bottleneck.
MSER for Matlab
http://www.vlfeat.org/~vedaldi/code/mser.html
The MSER algorithm finds correspondences between image features. It's for Matlab, but quite handy to rewrite in other languages.
Pattern/Image Matching
http://www.daniweb.com/forums/thread166744.html#
This is the text and the code from the link above:
Hello all,
I've written a piece of code that lets me generate or read a grid where all cells have a value. The grid stands for a simple 2D visualisation of an area where types of housing are to be built. The values of a cell stand for different housing types. Using a flood-fill algorithm the program detects 'clusters' of cells with the same value. A cluster of cells hereby stands for one building of the type given by the cell value of this cluster.
This is the code so far:
# Create a Python spatial grid.
from math import sqrt
import csv
class Grid(object):
markcnt = 0
"""Grid class with a width, height, length"""
def __init__(self, width, height):
self.grid = []
self.width = width
self.height = height
self.length = width * height
for x in range(self.width):
col = []
for y in range(self.height):
col.append(Cell(x, y, self.grid))
self.grid.append(col)
class Area(object):
area = 0
def getWidth(self):
return self.width
def getHeight(self):
return self.height
def getLength(self):
return self.length
def __len__(self):
return self.length
def setCellValue(self, x, y, i):
self.grid[x][y].value = i
def getCellValue(self, x, y):
return self.grid[x][y].value
def getCell(self, x, y):
return self.grid[x][y]
def __getitem__(self, (x, y)):
return self.grid[x][y]
def index(self, value):
return self.grid.index(value)
def fillBlock(self, left, right, top, bottom, value):
# You can choose if you want to iterate over a range of left - exclusive right
# or over a range of left - inclusive right. Same goes for top - exclusive bottom
# or top - inclusive bottom
# For now it's exclusive
for x in range(left, right):
for y in range(top, bottom):
self[x, y].value = value
"""
def floodFill(self, x, y, landUse):
visited = set()
self.rec_floodFill(x, y, landUse, visited)
def rec_floodFill(self, x, y, landUse, visited):
if (x < 0 or y < 0 or x >= self.width or y >= self.height):
return
cell = self.grid[x][y]
if (x, y) in visited:
return
visited.add((x,y))
if (cell.value != landUse):
return
print visited
self.rec_floodFill(x-1, y, landUse, visited)
self.rec_floodFill(x+1, y, landUse, visited)
self.rec_floodFill(x, y-1, landUse, visited)
self.rec_floodFill(x, y+1, landUse, visited)
"""
def firstFreeCell(self):
for y in range(self.height):
for x in range(self.width):
if self.grid[x][y].clusterId == -1:
return self.grid[x][y]
return None
def reset(self):
for y in range(self.height):
for x in range(self.width):
self.grid[x][y].clusterId = -1
def floodFill(self, x, y, clusterId, landUse):
if (x < 0 or y < 0 or x >= self.width or y >= self.height):
return
cell = self.grid[x][y]
area = 0
if (cell.clusterId != -1 or cell.value != landUse):
return
cell.setClusterId(clusterId)
area += 1
self.floodFill(x-1, y, clusterId, landUse)
self.floodFill(x+1, y, clusterId, landUse)
self.floodFill(x, y-1, clusterId, landUse)
self.floodFill(x, y+1, clusterId, landUse)
def analyze(self):
freeCell = self.firstFreeCell()
clusterId = 0
while freeCell != None:
self.floodFill(freeCell.x, freeCell.y, clusterId, freeCell.value)
freeCell = self.firstFreeCell()
clusterId += 1
def printClusterId(self, clusterId):
for y in range(self.height):
for x in range(self.width):
if self.grid[x][y].clusterId == clusterId:
print 'clusterId:',clusterId,'Cell-coordinates:','(',self.grid[x][y].x,',',self.grid[x][y].y,')'
"""
def printVisited(self, visited):
for y in range(self.height):
for x in range(self.width):
if self.grid[x][y].visited == True:
print self.grid[x][y].value,
print
"""
def printVisited(self, visited):
for y in range(self.height):
for x in range(self.width):
if self[x, y].visited:
print self[x, y].value,
else:
print "x",
print
def printGrid(self):
for y in range(self.height):
for x in range(self.width):
print self[x, y].value,
print
def load(cls, filename):
print "Loaded csv file"
loadGrid = []
reader = csv.reader(open(filename), delimiter=';')
for line in reader:
loadGrid.append(line)
width = len(loadGrid[0])
height = len(loadGrid)
grid = Grid(width, height)
for x in range(width):
for y in range(height):
grid.getCell(x, y).value = loadGrid[y][x] # csv rows are y, columns are x
return grid
load = classmethod(load)
class Cell(object):
def __init__(self, x, y, grid):
self.x = x
self.y = y
self.grid = grid
self.value = 0
#self.visited = False
self.clusterId = -1
def inspect(self):
"#"
def getClusterId(self):
return self.clusterId
def setClusterId(self, clusterId):
self.clusterId = clusterId
"""
class ParcelClusterAnalyzer(object):
def __init__(self, prevLandUseCount, landUseCount, parcel, prevAreaList, areaList, prevLandUseClusters, landUseClusters, prevClusterBoundingBox, clusterBoundingBox):
self.prevLandUseCount = {}
self.landUseCount= {}
self.parcel = parcel
self.prevAreaList = {}
self.areaList = {}
self.prevLandUseClusters = {}
self.landUseClusters = {}
self.prevClusterBoundingBox = {}
self.clusterBoundingBox = {}
class Area(object):
area = 0
def ParcelClusterAnalyzer(self):
pass
def ParcelClusterAnalyzer(self, parcel):
self.parcel = parcel
"""
my_grid = Grid(7, 7)
my_grid.printGrid()
print '-'*20
my_grid[1, 1].value = 1
my_grid[2, 0].value = 1
my_grid[2, 1].value = 1
my_grid[2, 2].value = 1
my_grid[3, 4].value = 2
my_grid[3, 5].value = 2
my_grid[4, 4].value = 2
my_grid[4, 5].value = 2
my_grid[5, 4].value = 2
my_grid[5, 5].value = 2
my_grid.printGrid()
print '-'*20
my_grid.firstFreeCell()
#my_grid.floodFill(x, y, landUse, area, boundingBox)
print my_grid[1, 1].clusterId
print '-'*20
my_grid.analyze()
print my_grid[0, 0].clusterId
print my_grid[1, 1].clusterId
print my_grid[2, 0].clusterId
print my_grid[2, 1].clusterId
print my_grid[2, 2].clusterId
print my_grid[3, 4].clusterId
print my_grid[3, 5].clusterId
print my_grid[4, 4].clusterId
print my_grid[4, 5].clusterId
print my_grid[5, 4].clusterId
print my_grid[5, 5].clusterId
my_grid.printClusterId(1)
my_grid.printClusterId(2)
"""
The outcome, with the help of floodFill, should give the following kind of information:
class Cluster:
pass
cluster = Cluster()
cluster.cells = []
cluster.cells.append(mygrid[2, 0])
cluster.left = 1
cluster.right = 2
cluster.top = 0
cluster.bottom = 2
cluster.area = 4
cluster_list = [
cluster
]
"""
The next idea is that I create a library of 2D templates, which are simple 2D representations of the floorplans of the 3D models of buildings. What I have to program is an algorithm that compares the 2D form of the cluster with all the available 2D forms of the templates where the housing type is the same. Say I have a cluster representing detached housing with an L-shape like:
1 1 1 1 1
1 1
1 1
I need to find the template, from the detached-housing subset of the library, whose shape best matches this L-shaped cluster.
I'm wondering what the best method is for this kind of matching task. I could compare each cell of the cluster with the value of the template at that place and calculate a score based on the matching cells. Or I could do some edge detection and compare the edge shapes of cluster and template. Or I could go for real image analysis, where both cluster and template are visualised as two-color images (0 and 1 values have different colors), and program some kind of image-matching algorithm that, for each pixel of the template image, evaluates the value of the cluster pixel and tries to find a match that way.
I know it's a bit of a long and difficult problem to explain, but maybe someone has experience with these kinds of problems or algorithms and can share some insight. I searched the internet for terms such as image analysis, pattern matching, binary image analysis and so on... but mostly I find difficult mathematical papers that go too deep into the problem for what I need for my program.
Actually, very interesting.
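As a starting point for the first idea in the thread (scoring a template by comparing cells one by one), here is a minimal sketch with an illustrative helper and toy shapes, not taken from the original code:
def match_score(cluster, template):
    """ fraction of cells where cluster and template agree;
    both are equal-sized lists of rows of 0/1 values """
    total = agree = 0
    for crow, trow in zip(cluster, template):
        for c, t in zip(crow, trow):
            total += 1
            if c == t:
                agree += 1
    return float(agree) / total

l_shape  = [[1,1,1,1,1],
            [1,0,0,0,1],
            [1,0,0,0,1]]
template = [[1,1,1,1,1],
            [1,0,0,0,0],
            [1,0,0,0,0]]
print match_score(l_shape, template)   # 13 of 15 cells agree: ~0.87
Running this over every template in the subset for a given housing type and keeping the highest score would give the closest match.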
Voodoo Camera Tracker: A tool for the integration of virtual and real scenes
http://www.digilab.uni-hannover.de/docs/manual.html
At the link above, a camera tracker application by the University of Hannover.
Labels:
calibration,
corner,
correspondences,
documentation
Photo Application by INRIA Alpes
http://lear.inrialpes.fr/src/yorg/doc/index.html
At the link above you can find the documentation and a link to the source code of a nice and very interesting application, implemented at INRIA Alpes (one of the most active research centers in computer vision).
The application is implemented in Python and there are some examples of its capabilities.
Labels:
corner,
correspondences,
documentation,
matching,
python
Computer Vision Demonstration Website
Link to the Computer Vision Demonstration Website.
An interesting website explaining some computer vision concepts with live examples and demos.
Tuesday, April 21, 2009
Camera Calibration using OpenCV
http://wiki.vislab.usyd.edu.au/moin.cgi/Walking_Dead/Camera_Calibration/Using_OpenCV_to_Calibrate_a_camera
Basic concepts for camera calibration with OpenCV.
Python OpenCV bindings
http://wwwx.cs.unc.edu/~gb/wp/blog/2007/02/04/python-opencv-wrapper-using-ctypes/
At the link above: an OpenCV Python wrapper using ctypes.
Harris Corner Detector in Python
http://jesolem.blogspot.com/2009/01/harris-corner-detector-in-python.html
This is the text from the link above:
Thought I'd share a simple Python implementation of the Harris corner detector. I have seen people looking for a python implementation for a range of applications so I'm hoping someone finds this useful.
The Harris (or Harris & Stephens) corner detection algorithm is one of the simplest corner indicators available. The general idea is to locate points where the surrounding neighborhood shows edges in more than one direction; these are then corners or interest points. The algorithm is explained here. In short, a matrix W is created from the outer product of the image gradient; this matrix is averaged over a region, and a corner response function is then defined as the ratio of the determinant to the trace of W.
Let's see what this looks like in code. First we need to be able to do convolutions of 2D signals. For this NumPy is not enough and we need to use the signal module in SciPy. Create a file filtertools.py and add the following functions needed to create Gaussian derivative kernels and apply them to the image.
from scipy import *
from scipy import signal
def gauss_derivative_kernels(size, sizey=None):
""" returns x and y derivatives of a 2D
gauss kernel array for convolutions """
size = int(size)
if not sizey:
sizey = size
else:
sizey = int(sizey)
y, x = mgrid[-size:size+1, -sizey:sizey+1]
#x and y derivatives of a 2D gaussian with standard dev half of size
# (ignore scale factor)
gx = - x * exp(-(x**2/float((0.5*size)**2)+y**2/float((0.5*sizey)**2)))
gy = - y * exp(-(x**2/float((0.5*size)**2)+y**2/float((0.5*sizey)**2)))
return gx,gy
def gauss_derivatives(im, n, ny=None):
""" returns x and y derivatives of an image using gaussian
derivative filters of size n. The optional argument
ny allows for a different size in the y direction."""
gx,gy = gauss_derivative_kernels(n, sizey=ny)
imx = signal.convolve(im,gx, mode='same')
imy = signal.convolve(im,gy, mode='same')
return imx,imy
The point of using Gaussian derivative filters is that this computes a
smoothing of the image, to a scale defined by the size of the filter,
and the derivatives at the same time. The derivatives are less noisy
than if computed with a simple difference filter on the original image.
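One helper is missing from this excerpt: compute_harris_response below also calls filtertools.gauss_kernel, which never appears in the post. A plain (non-derivative) Gaussian kernel in the same style as the derivative kernels above, added to filtertools.py, could fill the gap:
def gauss_kernel(size, sizey=None):
    """ returns a normalized 2D gauss kernel array for convolutions """
    size = int(size)
    if not sizey:
        sizey = size
    else:
        sizey = int(sizey)
    x, y = mgrid[-size:size+1, -sizey:sizey+1]
    g = exp(-(x**2/float(size) + y**2/float(sizey)))
    return g / g.sum()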
First add the corner response function to a file harris.py which will make
use of the Gaussian derivatives above.
from scipy import *
from scipy import signal
import filtertools

def compute_harris_response(image):
""" compute the Harris corner detector response function
for each pixel in the image"""
#derivatives
imx,imy = filtertools.gauss_derivatives(image, 3)
#kernel for blurring
gauss = filtertools.gauss_kernel(3)
#compute components of the structure tensor
Wxx = signal.convolve(imx*imx,gauss, mode='same')
Wxy = signal.convolve(imx*imy,gauss, mode='same')
Wyy = signal.convolve(imy*imy,gauss, mode='same')
#determinant and trace
Wdet = Wxx*Wyy - Wxy**2
Wtr = Wxx + Wyy
return Wdet / Wtr
This gives an image with each pixel containing the value of the Harris
response function. Now it is just a matter of picking out the
information needed from this image. Picking all values above a
threshold with the additional constraint that corners must be separated
with a minimum distance is an approach that often gives good results.
To do this, take all candidate pixels, sort them in descending order of
corner response values and mark off regions too close to positions
already marked as corners. Add this function to harris.py.
def get_harris_points(harrisim, min_distance=10, threshold=0.1):
""" return corners from a Harris response image
min_distance is the minimum nbr of pixels separating
corners and image boundary"""
#find top corner candidates above a threshold
corner_threshold = max(harrisim.ravel()) * threshold
harrisim_t = (harrisim > corner_threshold) * 1
#get coordinates of candidates
candidates = harrisim_t.nonzero()
coords = [ (candidates[0][c],candidates[1][c]) for c in range(len(candidates[0]))]
#...and their values
candidate_values = [harrisim[c[0]][c[1]] for c in coords]
#sort candidates
index = argsort(candidate_values)
#store allowed point locations in array
allowed_locations = zeros(harrisim.shape)
allowed_locations[min_distance:-min_distance,min_distance:-min_distance] = 1
#select the best points taking min_distance into account
filtered_coords = []
for i in index:
if allowed_locations[coords[i][0]][coords[i][1]] == 1:
filtered_coords.append(coords[i])
allowed_locations[(coords[i][0]-min_distance):(coords[i][0]+min_distance),(coords[i][1]-min_distance):(coords[i][1]+min_distance)] = 0
return filtered_coords
Now you have all you need to detect corner points in images. To make it easier to show the corner points in the image you can add a plotting function using matplotlib (PyLab) as follows.
def plot_harris_points(image, filtered_coords):
""" plots corners found in image"""
from pylab import *
figure()
gray()
imshow(image)
plot([p[1] for p in filtered_coords],[p[0] for p in filtered_coords],'*')
axis('off')
show()
Try running the following commands on an example image:
>>> from PIL import Image
>>> from numpy import *
>>> import harris
>>> im = array(Image.open('empire.jpg').convert("L"))
>>> harrisim = harris.compute_harris_response(im)
>>> filtered_coords = harris.get_harris_points(harrisim,6)
>>> harris.plot_harris_points(im, filtered_coords)
The image is opened and converted to grayscale. Then the response function is computed and points selected based on the response values. Finally, the points are plotted overlaid on the original image. This should give you a plot like this.
[Figure: an example of corner detection with the Harris corner detector.]
For an overview of different approaches to corner detection, including improvements on the Harris detector and further developments, see e.g. http://en.wikipedia.org/wiki/Corner_detection.