ELROND #1: Feature Detection in Computer Vision

This is the first post about my final-year Electronic Engineering project. It introduces the topic of feature detection and explains what my project is about.
Feature detection is an important problem in computer vision. Computer vision is the study of extracting information from the real world and using it as input to computer programs. A simple example of computer vision is how Gollum was created for the Lord of the Rings films…

The creation of Gollum used motion capture from real world camera shots to digitally create the character.

Feature detection is the process of using computer algorithms to find “perceptually interesting” locations in an image. Early feature detection worked by detecting edges within an image, but there were several problems with this approach.
Edge Detection
^ from Wikipedia
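To give a flavour of what detecting edges involves, here is a minimal sketch in plain Python. It assumes the Sobel operator, which is one classic edge detector and not necessarily the one used for the image above. The function name `sobel_magnitude` and the tiny test image are made up for illustration.

```python
import math

def sobel_magnitude(image):
    """Gradient magnitude of a 2D grayscale grid (border pixels left at 0)."""
    # Sobel kernels: kx responds to horizontal change, ky to vertical change.
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += kx[j][i] * pixel
                    gy += ky[j][i] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out

# Hypothetical 5x6 image: dark on the left, bright on the right.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
print(edges[2])  # large values only around the dark-to-bright boundary
```

The response is zero in the flat regions and large only where the brightness changes, which is exactly what makes that column of pixels an "edge".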
One of the biggest uses of feature detection is a process known as Image Registration, where a computer can find similar images and determine that they are actually of the same building/object/face/landscape/tree/etc. And this is the basis for my project.
everyday object recognition
^ originally from http://ils.intel-research.net/ (now removed)
The premise of my project, titled Scalable Landmark Recognition on Mobile Devices, is to allow anyone with a mobile phone to take a photograph of a landmark and be told what it is, where it is and some interesting facts about it. For example, you could take a picture of the Eiffel Tower and the phone app would recognise the landmark and return information about when it was built, how tall it is, current ticket prices, opening times etc.

^ from http://mastersofmedia.hum.uva.nl/2011/10/10/app-review-google-goggles/
This will be a downloadable mobile app for Android phones, and it will use feature detection to find similar images in its database and return information about the closest match. The app will require internet access to communicate with a server that will hold the database and run the query.
The project was dubbed “ELROND” by Anne, who I lived with in second year, because it’s close to SLROMD (which is what the actual abbreviation would be) and it also has Lord of the Rings references, which I’m all for!
So far I’ve got most of the feature detection working, and it should be quick enough to identify a match from thousands of images in under a second… but only time will tell. I’ll explain more about how my current solution works in a later post, but for now here’s a screenshot of it working on my computer…

[Screenshot of the current solution running]

  • On the top-right you can see the query image; this would be the image taken on the phone.
  • Below it is the image matched to it from the database.
  • The white lines connect each feature in the query image to its matching feature in the database image (using a process called FLANN matching).
  • The green box shows where the query image would fit onto the database image if you were to stitch them together in a panorama (the transformation that maps one image onto the other is called a homography).
  • The top-middle window (with the red circles) shows the features found in the query image. The features are found using the SURF algorithm.
  • The left of the image shows the code output: the progress of the database search and how the query image matched to the other images in the database.
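
The matching and the green box can be illustrated with a small sketch in plain Python. This is not my project code: it swaps SURF descriptors for made-up 2D points, swaps FLANN's approximate search for a brute-force nearest-neighbour search with a ratio test, and uses a hand-picked homography matrix instead of estimating one from the matches. The names `match_features` and `project` are hypothetical.

```python
import math

def match_features(query, database, ratio=0.75):
    """Brute-force nearest-neighbour matching with a ratio test.

    Returns (query_index, database_index) pairs. A match is kept only if
    the best candidate is clearly closer than the second-best one.
    """
    matches = []
    for qi, q in enumerate(query):
        dists = sorted((math.dist(q, d), di) for di, d in enumerate(database))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))
    return matches

def project(H, point):
    """Map a point through a 3x3 homography H (homogeneous coordinates)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

# Made-up descriptors (real SURF descriptors have 64 or 128 dimensions).
query_desc = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
database_desc = [(0.1, 0.9), (0.9, 0.1), (5.0, 5.0)]
matches = match_features(query_desc, database_desc)
print(matches)  # the third query feature is ambiguous, so it is dropped

# Hand-picked homography: scale by 2, then shift by (10, 20).
H = [[2, 0, 10], [0, 2, 20], [0, 0, 1]]
# Corners of a hypothetical 100x50 query image, projected onto the database
# image; joining these four points gives the equivalent of the green box.
corners = [(0, 0), (100, 0), (100, 50), (0, 50)]
box = [project(H, c) for c in corners]
print(box)
```

In a real pipeline the homography would be estimated from the matched feature positions, and an approximate search like FLANN is what keeps matching fast once the database grows to thousands of images.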