The goal of the competition is to extract the textual content from document images captured by mobile phones. The images are taken under varying conditions to provide challenging input.
We released a sample dataset on the 1st of February 2015. The sample dataset contains 3630 mobile-phone-captured images from 20 documents, along with their ground truth. The participants were free to use it for training, testing or any other purpose related to the competition.
The test dataset, containing 8470 images, was released on the 23rd of March 2015. The participants had one week to run their methods on the test dataset and submit their results along with a description of their method.
SmartDoc – dataset capture
Sample mobile-captured document images