Results report

This page provides a list of results, communicated by their authors, on the performance of their methods on the SmartDoc tasks. To add a reference to your work to this page, please fill in our online form.

Results on SmartDoc 2015 – challenge 1

Results are reported using the mean Jaccard index, as a percentage. 100% is the best possible score.
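As an illustration of the metric, here is a minimal sketch of the Jaccard index (intersection area divided by union area) between a detected quadrilateral and a ground-truth quadrilateral, averaged over frames. It assumes both quadrilaterals are convex and given as lists of (x, y) corners; the function names are hypothetical, and the official evaluation tool may apply additional normalization steps that are not reproduced here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _signed_area(poly: Sequence[Point]) -> float:
    """Signed area (shoelace formula); positive for counter-clockwise order."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def polygon_area(poly: Sequence[Point]) -> float:
    """Absolute area of a simple polygon (0 for an empty polygon)."""
    return abs(_signed_area(poly))


def clip_convex(subject: Sequence[Point], clip: Sequence[Point]) -> List[Point]:
    """Sutherland-Hodgman clipping of `subject` by the convex polygon `clip`."""
    clip = list(clip)
    if _signed_area(clip) < 0:
        clip.reverse()  # the inside test below expects counter-clockwise order
    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break  # nothing left: the polygons do not overlap
        a, b = clip[i], clip[(i + 1) % len(clip)]

        def inside(p: Point) -> bool:
            # True when p lies on the left of (or on) the directed edge a -> b.
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

        def intersect(p: Point, q: Point) -> Point:
            # Intersection of segment p-q with the infinite line through a and b.
            dx1, dy1 = q[0] - p[0], q[1] - p[1]
            dx2, dy2 = b[0] - a[0], b[1] - a[1]
            t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
            return (p[0] + t * dx1, p[1] + t * dy1)

        points, output = output, []
        for j in range(len(points)):
            p, q = points[j - 1], points[j]
            if inside(q):
                if not inside(p):
                    output.append(intersect(p, q))
                output.append(q)
            elif inside(p):
                output.append(intersect(p, q))
    return output


def jaccard_index(detected: Sequence[Point], truth: Sequence[Point]) -> float:
    """Intersection over union of two convex polygons, in [0, 1]."""
    inter = polygon_area(clip_convex(detected, truth))
    union = polygon_area(detected) + polygon_area(truth) - inter
    return inter / union if union > 0 else 0.0


def mean_jaccard_percent(pairs: Sequence[Tuple[Sequence[Point], Sequence[Point]]]) -> float:
    """Mean Jaccard index over (detected, truth) pairs, as a percentage."""
    return 100.0 * sum(jaccard_index(d, t) for d, t in pairs) / len(pairs)
```

For example, two 2×2 squares shifted by one unit along x overlap on an area of 2 and cover a union of 6, so their Jaccard index is 1/3.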

Reference Date Results Notes
A2iA: C. Kermorvant, A. Semenov, S. Sashov and V. Anisimov from A2iA St. Petersburg and Paris and Teklia Paris. March 2015 (original competition) 80.90% Their method starts with a Canny edge detector in the RGB space followed by an interpolation of the detected contours by Bezier curves. Quadrangles are selected depending on their squareness. If these steps fail to detect a valid quadrangle, a set of similar steps is applied to a denoised binary version of the input image.
ISPL-CVML: S. Heo, H.I. Koo and N.I. Cho from Seoul National University and Ajou University. March 2015 (original competition) 96.58% Their method starts by applying the Line Segment Detector (LSD) to down-sampled images. Document boundaries are then generated by selecting two horizontal and two vertical segments that minimize a cost function exploiting color and edge features. The final document boundaries are refined in the original high-resolution image.
LRDE: E. Carlinet and T. Géraud from EPITA Research and Development Laboratory. March 2015 (original competition) 97.16% Their method relies on a hierarchical representation of the image named Tree of Shapes. In each frame of the video, an energy on the tree is computed in order to select the shape that looks the most like a paper sheet. The energy involves two terms measuring how well the shape fits a quadrilateral and whether it has sub-contents such as lines or images. Two Trees of Shapes are computed on the L and b* components of the frame (converted to the L*a*b* space). Shapes having the highest energies in both trees are retained as candidate objects, and the location of the detection in the previous frame is used to finally select the right shape among the candidate components.
NetEase: P. Li, Y. Niw and X. Li from NetEase. March 2015 (original competition) 88.20% Their method starts by extracting line segments with the LSD method. These segments are then grouped, and quadrangles are formed by selecting two horizontal and two vertical segment groups. The final quadrangle is selected based on its aspect ratio, area and inner angles.
RPPDI-UPE: B.L.D. Bezerra, L. Leal and A. Junior from University of Pernambuco and Document Solutions. March 2015 (original competition) 74.08% Their method starts by using the HSV color space and filtering the hue channel in order to make the document pages stand out from the background. Morphological operations followed by a Canny edge detector and a Hough transform yield a set of candidate polygons. These polygons are then filtered according to their shape and position.
SEECS-NUST: S.A. Siddiqui, F. Asad, A.H. Khan and F. Shafait from School of Electrical Engineering and Computer Science and National University of Science and Technology. March 2015 (original competition) 73.93% Their method applies Canny edge detection to the gray-level image to get a first estimate of the document position. A subsequent analysis of the different color channels determines which channel has the highest contrast between document and background, and a probabilistic Hough transform is then used to obtain an accurate document segmentation.
SmartEngines: A. Zhukovsky, D. Nikolaev, V. Arlazarov, V. Postnikov, D. Polevoy, N. Skoryukina, T. Chernov, J. Shemiakina, A. Mukovozov, I. Konovalenko and M. Povolotsky from Moscow Institute for Physics and Technologies, National University of Science and Technology, Institute for Systems Analysis of Russian Academy of Sciences and Institute for Information Transmission Problems of Russian Academy of Sciences. March 2015 (original competition) 95.48% Their method starts with a segment extraction step by means of the LSD algorithm, followed by the construction of a graph of segments. Quadrangles are selected on this graph after applying several size and angle filters. The final candidate quadrangle is selected by fitting a motion model with a Kalman filter powered by an inter-frame matching strategy of local descriptors based on SURF and BRIEF.
RPPDI-UPE-V2: LEAL, L. R. S.; BEZERRA, B. L. D. Smartphone Camera Document Detection Via Geodesic Object Proposals. In: 2016 IEEE LA-CCI (Latin American Conference on Computational Intelligence), 2016, Cartagena de Indias. Proceedings of 2016 IEEE LA-CCI, 2016. Electronic ISBN: 978-1-5090-5105-2. DOI: 10.1109/LA-CCI.2016.7885735. Nov 2016 (after main competition) The method is based on “Geodesic Object Proposals” for the document detection task.

Results on SmartDoc 2015 – challenge 2

Results are reported using the average character accuracy, as a percentage. 100% is the best possible score.
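As an illustration, character accuracy is commonly derived from the edit distance between the OCR output and the ground-truth text. The sketch below assumes the definition (n − errors) / n, where n is the number of ground-truth characters and errors is the Levenshtein distance; this definition and the function names are assumptions made for illustration, and the official evaluation tool may differ in details such as text normalization.

```python
from typing import Sequence, Tuple


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Character accuracy of one document, as a percentage clamped at 0."""
    if not reference:
        return 100.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 100.0 * (len(reference) - errors) / len(reference))


def average_character_accuracy(pairs: Sequence[Tuple[str, str]]) -> float:
    """Mean character accuracy over (reference, hypothesis) pairs, as a percentage."""
    return sum(character_accuracy(ref, hyp) for ref, hyp in pairs) / len(pairs)
```

For example, recognizing "smartd0c" instead of "smartdoc" costs one substitution out of eight characters, which gives an accuracy of 87.5%.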

Reference Date Results Notes
A2iA: C. Kermorvant et al. from Teklia & A2iA, France. March 2015 (original competition) 93.84% This method preprocesses the images using the authors' method submitted to challenge 1, dewarps the images and extracts the text lines using projection profiles. Then, an LSTM recurrent neural network is trained to recognize the binary text lines.
CartPerk: D. Kumar from CartPerk Technologies, India. March 2015 (original competition) 91.19% This method uses the blue background to extract and dewarp the document. The local contrast is computed, and the resulting image is then binarized with a local threshold in a 64×64 window. Finally, Tesseract processes the binary image.
CCC: M. Soheili et al. from DFKI, Germany and T. Modares University, Iran. March 2015 (original competition) 99.93% This method uses the background color to detect and dewarp the document. The image is then binarized to extract lines, words and subwords. Those are then clustered incrementally across the whole corpus. A 1D LSTM is trained on both sharp and blurry gray-scale text lines for recognizing subwords. Clusters of subwords are labeled by majority voting.
Digiform: G. Kragoz from Kocaeli University, Turkey. March 2015 (original competition) 95.33% This method applies a strong blur followed by a Canny filter to detect the corners of a document. The image is then dewarped and remapped to 300 dpi. The image is binarized with an adaptive threshold. FineReader performs the OCR step.
LRDE: E. Carlinet and T. Géraud from EPITA’s LRDE, France. March 2015 (original competition) 95.85% This method uses the corners of the largest centered component to dewarp the document. The document is then binarized based on a morphological thick gradient and a morphological Laplacian. FineReader performs the OCR step.
ABBYY’s FineReader. March 2015 (original competition) 87.61% We report here the results of the publicly available version of ABBYY’s FineReader in March 2015, used without any preprocessing of the images.

Results on SmartDoc 2015 – QA

Reference Date Results Notes
 Waiting for reports.