SmartDoc 2015 – Challenge 2 Results – SmartDoc : Series of ICDAR Competition on smartphone-captured document image analysis

We are very proud to announce the final results of Challenge 2. Congratulations to all participants, and thank you for your kind interest in the competition!

For detailed results, please see the ICDAR 2015 proceedings.

Results summary

Rank	Name	Score (%)
1	CCC	99.93
2	LRDE	95.85
3	Digiform	95.33
4	A2iA	93.84
5	CartPerk	91.19

Methods overview

A2iA

C. Kermorvant et al. from Teklia & A2iA, France

This method preprocesses the images with the team's second method submitted to Challenge 1, dewarps the images, and extracts the text lines using projection profiles. An LSTM recurrent neural network is then trained to recognize the binary text lines.
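To illustrate the projection-profile step, here is a minimal sketch in plain Python. It is not the A2iA implementation: the function names, the `min_ink` parameter and the toy `page` image are assumptions made for this example. The idea is to count ink pixels per row of a binary image and to cut the page wherever the count drops below a threshold.

```python
def horizontal_profile(binary):
    """Count ink pixels (value 1) in every row of a binary image."""
    return [sum(row) for row in binary]


def split_lines(binary, min_ink=1):
    """Return (top, bottom) row ranges of text lines, bottom exclusive.

    min_ink is a hypothetical tuning parameter: rows with fewer ink
    pixels are treated as inter-line gaps.
    """
    lines, start = [], None
    for y, ink in enumerate(horizontal_profile(binary)):
        if ink >= min_ink and start is None:
            start = y                     # a text line begins on this row
        elif ink < min_ink and start is not None:
            lines.append((start, y))      # the line ended on the previous row
            start = None
    if start is not None:                 # a line touching the bottom edge
        lines.append((start, len(binary)))
    return lines


# Toy 5x4 page: two text lines separated by one blank row.
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(split_lines(page))  # [(1, 3), (4, 5)]
```

This only works well once the page has been dewarped, since skewed or curved lines smear the profile and hide the gaps.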

CartPerk

D. Kumar from CartPerk Technologies, India

This method uses the blue background to extract and dewarp the document. The local contrast is computed, and the resulting image is then binarized with a local threshold in a 64×64 window. Finally, Tesseract processes the binary image.
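A windowed local threshold can be sketched as follows. This is an illustration only, under stated assumptions: it skips the local-contrast step, thresholds each tile at its own mean intensity (the source does not say which local statistic CartPerk used), and the `offset` parameter is hypothetical. The default `window=64` matches the 64×64 window mentioned above.

```python
def binarize_local(gray, window=64, offset=0):
    """Threshold each window x window tile at its own mean intensity.

    Pixels darker than (tile mean - offset) become 1 (ink), others 0.
    The per-tile mean is an assumption made for this sketch.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, window):
        for left in range(0, width, window):
            rows = range(top, min(top + window, height))
            cols = range(left, min(left + window, width))
            tile = [gray[y][x] for y in rows for x in cols]
            threshold = sum(tile) / len(tile) - offset
            for y in rows:
                for x in cols:
                    out[y][x] = 1 if gray[y][x] < threshold else 0
    return out


# Toy image: the right half is in shadow, so a single global threshold
# would mark all of it as ink. A 2x2 window keeps the two halves apart.
gray = [
    [200, 50, 90, 20],
    [210, 220, 100, 95],
]
print(binarize_local(gray, window=2))  # [[0, 1, 0, 1], [0, 0, 0, 0]]
```

The point of thresholding per window is robustness to uneven lighting, which is common in smartphone captures.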

CCC

Mohammad Reza Soheili, Mohammad Reza Yousefi, Ehsanollah Kabir and Didier Stricker, from the German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany, and Tarbiat Modares University, Tehran, Iran

This method uses the background color to detect and dewarp the document. The image is then binarized to extract lines, words and subwords. Those are then clustered incrementally across the entire corpus. A 1D LSTM is trained on both sharp and blurry gray-scale text lines for recognizing subwords. Clusters of subwords are labeled by majority voting.
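The majority-voting step can be illustrated with a short sketch. It assumes that every subword image already has a cluster id and an individual recognizer prediction; the function name and the sample data are hypothetical and not taken from the CCC system.

```python
from collections import Counter


def label_clusters(cluster_ids, predictions):
    """Give each cluster the most frequent prediction among its members."""
    votes = {}
    for cluster_id, predicted in zip(cluster_ids, predictions):
        votes.setdefault(cluster_id, Counter())[predicted] += 1
    # most_common(1) returns the top (label, count) pair; on a tie, the
    # label that was counted first wins.
    return {cid: counter.most_common(1)[0][0] for cid, counter in votes.items()}


# Three images fell into cluster 0; one was misread as "inq".
cluster_ids = [0, 0, 0, 1, 1]
predictions = ["ing", "ing", "inq", "the", "the"]
print(label_clusters(cluster_ids, predictions))  # {0: 'ing', 1: 'the'}
```

Voting lets the sharp, correctly recognized members of a cluster correct the blurry ones, provided the clusters themselves are pure.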

Digiform

G. Kragoz from Kocaeli University, Turkey

This method applies a strong blur followed by a Canny filter to detect the corners of a document. The image is then dewarped and remapped to 300 dpi. The image is binarized with an adaptive threshold. FineReader performs the OCR step.
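Once four corners are known, dewarping is commonly done with a projective transform (homography). The sketch below is an assumption about how such a step could look, not a description of the Digiform code: the corner coordinates are made up, and the target size of 2480×3508 pixels assumes an A4 page at 300 dpi, which the source does not state.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Eight coefficients mapping four src corners onto four dst corners."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def warp_point(h, x, y):
    """Apply the homography to one point of the photographed image."""
    w = h[6] * x + h[7] * y + 1
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical detected corners: top-left, top-right, bottom-right, bottom-left.
corners = [(120, 80), (900, 140), (860, 1300), (60, 1180)]
# Assumed target: an A4 page at 300 dpi (2480 x 3508 pixels).
target = [(0, 0), (2480, 0), (2480, 3508), (0, 3508)]
H = homography(corners, target)
print(warp_point(H, 120, 80))  # close to (0, 0), up to rounding error
```

A full dewarp would apply the inverse mapping to every pixel of the output image and interpolate; the sketch only shows how the transform is obtained from the corners.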

LRDE

E. Carlinet and T. Géraud from EPITA’s LRDE, France

This method uses the corners of the largest centered component to dewarp the document. The document is then binarized based on a morphological thick gradient and a morphological Laplacian. FineReader performs the OCR step.
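The two morphological operators named here have standard definitions: the thick gradient is dilation minus erosion, and the morphological Laplacian is dilation plus erosion minus twice the image. The sketch below computes both with a 3×3 square structuring element, which is an assumption; how LRDE combines them into a binarization is not described in this summary and is not reproduced.

```python
def _neighbourhood(img, y, x):
    """Values in the 3x3 window around (y, x), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]


def _apply(img, reducer):
    """Replace every pixel by reducer() of its 3x3 neighbourhood."""
    return [[reducer(_neighbourhood(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def thick_gradient(img):
    """Dilation minus erosion: large wherever intensity changes."""
    dil, ero = _apply(img, max), _apply(img, min)
    return [[d - e for d, e in zip(dr, er)] for dr, er in zip(dil, ero)]


def morph_laplacian(img):
    """Dilation plus erosion minus twice the image: changes sign across an edge."""
    dil, ero = _apply(img, max), _apply(img, min)
    return [[d + e - 2 * f for d, e, f in zip(dr, er, fr)]
            for dr, er, fr in zip(dil, ero, img)]


# One-row toy image with a step edge between the third and fourth pixels.
img = [[10, 10, 10, 50, 50]]
print(thick_gradient(img))   # [[0, 0, 40, 40, 0]]
print(morph_laplacian(img))  # [[0, 0, 40, -40, 0]]
```

In the toy output, the gradient marks where the edge is, and the sign of the Laplacian tells which side of the edge a pixel lies on (positive on the dark side, negative on the bright side).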