This page specifies the format of the competition dataset (input) and of the expected results (output). Complying with these requirements will enable the organizers to evaluate participants’ results quickly and reliably.
Output samples are available for download at the bottom of this page.
Feel free to contact us for further information regarding these formats.
Task
Participants will have to process files in the test set, and produce for each of those files an associated result file.
Input structure
The test set is organized as follows:
test/
├── background01
│ ├── datasheet001.avi
│ ├── datasheet002.avi
│ ├── datasheet003.avi
│ ├── datasheet004.avi
│ ├── datasheet005.avi
│ ├── letter001.avi
│ ├── letter002.avi
│ ├── letter003.avi
│ ├── letter004.avi
│ ├── letter005.avi
│ ├── magazine001.avi
│ ├── magazine002.avi
│ ├── magazine003.avi
│ ├── magazine004.avi
│ ├── magazine005.avi
│ ├── paper001.avi
│ ├── paper002.avi
│ ├── paper003.avi
│ ├── paper004.avi
│ ├── paper005.avi
│ ├── patent001.avi
│ ├── patent002.avi
│ ├── patent003.avi
│ ├── patent004.avi
│ ├── patent005.avi
│ ├── tax001.avi
│ ├── tax002.avi
│ ├── tax003.avi
│ ├── tax004.avi
│ └── tax005.avi
├── background02
│ ├── datasheet001.avi
│ ...
│ └── tax005.avi
├── background03
│ ...
├── background04
│ ...
└── background05
...
└── tax005.avi
It contains 5 directories and 150 video files, i.e. 30 per directory (AVI container, XVID codec, no audio).
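Before processing, it can be worth checking that the local copy of the test set is complete. The following is a minimal sketch, assuming the layout listed above; the function name `list_test_videos` and the `test_root` argument are illustrative and not part of the competition tooling.

```python
from pathlib import Path


def list_test_videos(test_root):
    """Return the videos of the test set, sorted by directory then file name."""
    videos = sorted(Path(test_root).glob("background*/*.avi"))
    backgrounds = {video.parent.name for video in videos}
    # The page states 5 directories and 150 video files in total.
    if len(backgrounds) != 5 or len(videos) != 150:
        raise ValueError(
            f"found {len(videos)} videos in {len(backgrounds)} directories, "
            "expected 150 videos in 5 directories"
        )
    return videos
```

Calling `list_test_videos("test")` then yields the paths to iterate over, one result file being expected for each.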
Expected output structure
The output must mirror the structure of the test set, i.e.:
${participantid}-${methodid}/
├── background01
│ ├── datasheet001.segresult.xml
│ ├── datasheet002.segresult.xml
│ ├── datasheet003.segresult.xml
│ ├── datasheet004.segresult.xml
│ ├── datasheet005.segresult.xml
│ ├── letter001.segresult.xml
│ ├── letter002.segresult.xml
│ ├── letter003.segresult.xml
│ ├── letter004.segresult.xml
│ ├── letter005.segresult.xml
│ ├── magazine001.segresult.xml
│ ├── magazine002.segresult.xml
│ ├── magazine003.segresult.xml
│ ├── magazine004.segresult.xml
│ ├── magazine005.segresult.xml
│ ├── paper001.segresult.xml
│ ├── paper002.segresult.xml
│ ├── paper003.segresult.xml
│ ├── paper004.segresult.xml
│ ├── paper005.segresult.xml
│ ├── patent001.segresult.xml
│ ├── patent002.segresult.xml
│ ├── patent003.segresult.xml
│ ├── patent004.segresult.xml
│ ├── patent005.segresult.xml
│ ├── tax001.segresult.xml
│ ├── tax002.segresult.xml
│ ├── tax003.segresult.xml
│ ├── tax004.segresult.xml
│ └── tax005.segresult.xml
├── background02
│ ├── datasheet001.segresult.xml
│ ...
│ └── tax005.segresult.xml
├── background03
│ ...
├── background04
│ ...
└── background05
...
└── tax005.segresult.xml
Where:
- ${participantid} is an ASCII string without spaces referring to the participant
- ${methodid} is an ASCII string without spaces referring to the method used (multiple methods may be proposed).
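The mapping from an input video to its result file can be sketched as follows. This is only an illustration under the naming rules above; `result_path` and its parameters are hypothetical names, and the identifier check simply enforces "ASCII, no whitespace".

```python
from pathlib import Path


def result_path(video_path, test_root, participant_id, method_id, out_parent="."):
    """Return the result-file path mirroring video_path's position under test_root."""
    for label in (participant_id, method_id):
        # Identifiers must be non-empty ASCII strings without whitespace.
        if not label or not label.isascii() or any(c.isspace() for c in label):
            raise ValueError(f"invalid identifier: {label!r}")
    relative = Path(video_path).relative_to(test_root)
    # datasheet001.avi -> datasheet001.segresult.xml
    name = relative.stem + ".segresult.xml"
    return Path(out_parent) / f"{participant_id}-{method_id}" / relative.parent / name
```

For example, `result_path("test/background01/datasheet001.avi", "test", "acme", "m1")` gives `acme-m1/background01/datasheet001.segresult.xml`.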
Output packaging
Those files, and only those files, must be added to a Zip archive and sent to the organizers according to the procedure indicated on the competition website. The content of the Zip file must have the same structure as the output files.
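One possible way to build the archive with the Python standard library is sketched below. It assumes that the top-level `${participantid}-${methodid}/` directory is kept inside the Zip file; the function name `package_results` is illustrative, and the submission procedure itself remains the one described on the competition website.

```python
import zipfile
from pathlib import Path


def package_results(result_dir, archive_path):
    """Zip every *.segresult.xml under result_dir and return the number of files added."""
    result_dir = Path(result_dir).resolve()
    # Only result files are collected; anything else in the directory is ignored.
    files = sorted(result_dir.rglob("*.segresult.xml"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store names relative to the parent directory so that the archive
            # starts with the ${participantid}-${methodid}/ folder (assumption).
            archive.write(path, path.relative_to(result_dir.parent).as_posix())
    return len(files)
```

A complete submission should report 150 files, one per input video.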
XML file format specification
Result files must be in XML format and comply with the following specifications. Real output samples are available for download at the bottom of this page. All tags and attributes are mandatory, except when explicitly stated.
1/ First line (XML header) must be:
<?xml version='1.0' encoding='utf-8'?>
2/ Second line (format tag and generation time) must be:
<seg_result version="0.2" generated="$timestamp">
where:
- $timestamp is a string representing the date and time at which the file was generated. Its format should comply with one of the standard ISO 8601 formats. Using Python, you can generate such a string with datetime.datetime.now().isoformat().
3/ Third line (software description) must be:
<software_used name="$software_name" version="$software_version"/>
where:
- $software_name is a UTF-8 string describing the software
- $software_version is a UTF-8 string denoting the software version
Those attributes must be present and should not be left blank.
4/ Fourth line (source file) must be:
<source_sample_file>$path</source_sample_file>
where:
- $path is the path to the input file. It should be relative to the dataset root (for example background01/datasheet001.avi).
5/ Fifth line (frame results start) must be:
<segmentation_results>
6/ Following lines (frame results blocks)
- The following lines must consist of blocks giving the coordinates of the object found in each frame, or indicating a reject if the object cannot be found.
- One block should be generated for each frame.
If the object cannot be found in the current frame, the block must be:
<frame index="$frame_index" rejected="true"/>
where:
- $frame_index is the index of the frame, starting at 1.
If the object is found in the frame, the block must be:
<frame index="$frame_index" rejected="false">
<point name="bl" x="$blx" y="$bly"/>
<point name="tl" x="$tlx" y="$tly"/>
<point name="tr" x="$trx" y="$try"/>
<point name="br" x="$brx" y="$bry"/>
</frame>
where:
- $frame_index is the index of the frame, starting at 1.
- $blx and $bly are the coordinates of the bottom left point of the object
- $tlx and $tly are the coordinates of the top left point of the object
- $trx and $try are the coordinates of the top right point of the object
- $brx and $bry are the coordinates of the bottom right point of the object
The coordinates can be floating point numbers, provided the decimal separator used is the dot (‘.’).
Coordinates are expressed in the frame (image) coordinate system:
- the origin (0,0) is at the upper left corner of the frame
- x values increase toward the right of the image
- y values increase toward the bottom of the image
- this is consistent with the OpenCV and NumPy image matrix coordinate system.
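One practical consequence: an image matrix is indexed as [row, column], so the column index is x and the row index is y. The sketch below illustrates the conversion and the dot decimal separator; `point_attributes` is a hypothetical helper, and the two-decimal precision is an arbitrary choice, not a requirement of this page.

```python
def point_attributes(row, col):
    """Convert a (row, col) matrix position to the x/y attribute strings of a <point>."""
    # In an image matrix, the column index is x and the row index is y.
    x, y = col, row
    # The "f" format is not locale-aware, so the decimal separator is always a dot.
    return {"x": f"{x:.2f}", "y": f"{y:.2f}"}
```

For instance, a corner detected at row 347.25 and column 910 becomes `x="910.00" y="347.25"`.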
7/ Second-to-last line (frame results end) must be:
</segmentation_results>
8/ Last line (document end) must be:
</seg_result>
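Steps 1/ to 8/ can be assembled as in the sketch below. It is an illustration rather than a reference implementation: `build_result_xml`, its parameters and the `frames` data layout (None for a rejected frame, otherwise a dict of corner coordinates) are assumptions made for this example.

```python
import datetime
from xml.sax.saxutils import escape, quoteattr

CORNERS = ("bl", "tl", "tr", "br")


def build_result_xml(source_path, frames, software_name, software_version):
    """Build the text of a result file.

    frames holds one entry per frame, in order: None for a rejected frame,
    or a dict mapping "bl", "tl", "tr", "br" to (x, y) tuples.
    """
    generated = datetime.datetime.now().isoformat()
    lines = [
        "<?xml version='1.0' encoding='utf-8'?>",
        f'<seg_result version="0.2" generated="{generated}">',
        # quoteattr() escapes the value and adds the surrounding quotes.
        f"<software_used name={quoteattr(software_name)} "
        f"version={quoteattr(software_version)}/>",
        f"<source_sample_file>{escape(source_path)}</source_sample_file>",
        "<segmentation_results>",
    ]
    for index, corners in enumerate(frames, start=1):  # frame indices start at 1
        if corners is None:
            lines.append(f'<frame index="{index}" rejected="true"/>')
            continue
        lines.append(f'<frame index="{index}" rejected="false">')
        for name in CORNERS:
            x, y = corners[name]
            # Python formats numbers with a dot as decimal separator.
            lines.append(f'<point name="{name}" x="{x}" y="{y}"/>')
        lines.append("</frame>")
    lines += ["</segmentation_results>", "</seg_result>"]
    return "\n".join(lines) + "\n"
```

The returned text can then be written with `open(path, "w", encoding="utf-8")` so that the file encoding matches the XML header.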
Sample output XML file (abbreviated)
<?xml version='1.0' encoding='utf-8'?>
<seg_result version="0.2" generated="2014-07-24T15:18:01.287068">
<software_used name="My program (c) CVC/ULR 2014" version="0.2"/>
<source_sample_file>background03/datasheet003.avi</source_sample_file>
<segmentation_results>
<frame index="1" rejected="true"/>
<frame index="2" rejected="true"/>
<frame index="3" rejected="true"/>
<frame index="4" rejected="true"/>
<frame index="5" rejected="true"/>
<frame index="6" rejected="true"/>
<frame index="7" rejected="true"/>
<frame index="8" rejected="true"/>
<frame index="9" rejected="true"/>
<frame index="10" rejected="true"/>
<frame index="11" rejected="false">
<point name="bl" x="970" y="770"/>
<point name="tl" x="910" y="347"/>
<point name="tr" x="1242" y="333"/>
<point name="br" x="1367" y="743"/>
</frame>
...
<frame index="219" rejected="false">
<point name="bl" x="852" y="744"/>
<point name="tl" x="880" y="331"/>
<point name="tr" x="1212" y="352"/>
<point name="br" x="1258" y="766"/>
</frame>
<frame index="220" rejected="false">
<point name="bl" x="852" y="744"/>
<point name="tl" x="884" y="328"/>
<point name="tr" x="1217" y="352"/>
<point name="br" x="1258" y="766"/>
</frame>
</segmentation_results>
</seg_result>
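A result file can be sanity-checked before submission by parsing it back. The sketch below is not the organizers' evaluation tool; `check_result` is a hypothetical helper that only tests the rules stated on this page, and it expects a complete file (the abbreviated sample above skips frames, so its indices would be reported).

```python
import xml.etree.ElementTree as ET


def check_result(xml_bytes):
    """Return a list of problems found in a result document (empty if none)."""
    problems = []
    root = ET.fromstring(xml_bytes)
    if root.tag != "seg_result":
        problems.append("root element is not <seg_result>")
    software = root.find("software_used")
    if software is None or not software.get("name") or not software.get("version"):
        problems.append("missing or blank <software_used> attributes")
    if not (root.findtext("source_sample_file") or "").strip():
        problems.append("missing <source_sample_file>")
    results = root.find("segmentation_results")
    if results is None:
        problems.append("missing <segmentation_results>")
        return problems
    for expected, frame in enumerate(results.findall("frame"), start=1):
        if frame.get("index") != str(expected):
            problems.append(f"frame {expected}: unexpected index {frame.get('index')!r}")
        if frame.get("rejected") == "true":
            continue
        points = {p.get("name"): p for p in frame.findall("point")}
        for name in ("bl", "tl", "tr", "br"):
            point = points.get(name)
            try:
                # float() rejects a comma used as decimal separator.
                float(point.get("x"))
                float(point.get("y"))
            except (AttributeError, TypeError, ValueError):
                problems.append(f"frame {expected}: bad or missing point {name!r}")
    return problems
```

An empty list means that no problem was detected by these checks; it does not guarantee that the file will be accepted.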