Smartphones are replacing personal scanners. They are portable, connected, powerful and affordable. They are on their way to becoming the new entry point for business processing applications such as document archival, ID scanning, and check digitization, to name a few. In order to keep our workflows streamlined, we need to make these new capture devices as reliable as batch scanners.

We believe an efficient capture process should be able to:

  1. detect and segment the relevant document object during the preview phase;
  2. assess the quality of the capture conditions and help the user improve them;
  3. optionally, trigger the capture at the perfect moment;
  4. and produce a high-quality, controlled output based on the high-resolution captured image.

This competition focuses on the first step of this process: efficiently detecting and segmenting document regions, as illustrated by the following capture, which shows the ideal output for the preview phase of an acquisition session.

Ideal document object detection (ground truth, red frame)

For this challenge, the input is a set of video clips, each containing a document from a predefined set, and the output should be an XML file giving, for each frame of the video, the coordinates of the quadrilateral in which the document can be found.
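
To make the expected output more concrete, here is a minimal sketch of how per-frame quadrilaterals could be serialized to XML. The text above does not specify the schema, so the element and attribute names used here (`segmentation_results`, `frame`, `point`, `index`, `rejected`, and the corner names) are assumptions chosen for illustration, as is the helper `build_segmentation_xml`. A real submission would need to follow the format defined by the organizers.

```python
import xml.etree.ElementTree as ET

# Hypothetical corner labels; the official naming and ordering are not given here.
CORNER_NAMES = ("tl", "bl", "br", "tr")


def build_segmentation_xml(quads_per_frame):
    """Serialize per-frame quadrilaterals to an XML string.

    quads_per_frame maps a frame index to a sequence of four (x, y)
    corners, or to None when no document was detected in that frame.
    The schema is an illustrative assumption, not the official one.
    """
    root = ET.Element("segmentation_results")
    for index in sorted(quads_per_frame):
        frame = ET.SubElement(root, "frame", index=str(index))
        quad = quads_per_frame[index]
        if quad is None:
            # Mark frames without a detection instead of omitting them.
            frame.set("rejected", "true")
            continue
        frame.set("rejected", "false")
        for name, (x, y) in zip(CORNER_NAMES, quad):
            ET.SubElement(frame, "point", name=name,
                          x=f"{x:.2f}", y=f"{y:.2f}")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = {
        1: [(10, 20), (10, 200), (300, 200), (300, 20)],
        2: None,  # no document found in this frame
    }
    print(build_segmentation_xml(example))
```

Emitting one `frame` element per video frame, including frames with no detection, keeps the output aligned with the input clip, so each frame of the video has a corresponding entry.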