Data
To participate, first register via the "Join" button. Once your
registration is approved, the links for downloading the data will be
available on the "Dataset" page.
The dataset comprises 68 clinical PET-CT cases. Each case consists of two
PET scans (a fast acquisition and its enhanced version) and their
corresponding CT scan.
Note that the CT scans are not mandatory for the detection task, but they
may provide important information.
The scans were acquired on a Philips Vereos PET-CT scanner, with an
acquisition time of 30 seconds per bed position (pbp).
Annotations of suspicious "hot spots" were marked in a semi-manual
process for each scan.
Annotations are provided for only 50 cases (the train set), while the
rest, 11 cases, are used for evaluation.
Note that both PET scans (fast acquisition and enhanced version) share
the same annotations, since the two scans are perfectly aligned.
All scans are provided in anonymized DICOM format, with all the
necessary tags retained.
The data is arranged in the following way:
├── 30s/
│ ├── 1/
│ | | ── [1.dcm, 2.dcm...]
│ ├── 2/
│ | | ── [1.dcm, 2.dcm...]
├── 30s_denoised/
│ ├── 1/
│ | | ── [1.dcm, 2.dcm...]
│ ├── 2/
│ | | ── [1.dcm, 2.dcm...]
├── CT/
│ ├── 1/
│ | | ── [1.dcm, 2.dcm...]
│ ├── 2/
│ | | ── [1.dcm, 2.dcm...]
├──
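As a rough illustration of walking the folder layout above, the following sketch collects the DICOM file paths of each case for one series folder. The function name `list_case_slices` is hypothetical, and the sketch assumes that case folders and file names are plain integers, as in the listing. It only gathers paths; reading pixel data would require a DICOM library, and slice order should be confirmed from the DICOM tags rather than inferred from file names.

```python
from pathlib import Path


def list_case_slices(root, series="30s"):
    """Map each case ID to its DICOM files, sorted by numeric file name.

    Assumes the layout <root>/<series>/<case_id>/<n>.dcm shown above.
    """
    cases = {}
    series_dir = Path(root) / series
    for case_dir in series_dir.iterdir():
        # Skip anything that is not a numbered case folder.
        if not case_dir.is_dir() or not case_dir.name.isdigit():
            continue
        # Sort numerically so that 10.dcm follows 9.dcm rather than 1.dcm.
        files = sorted(case_dir.glob("*.dcm"), key=lambda p: int(p.stem))
        cases[int(case_dir.name)] = files
    return cases
```

Calling `list_case_slices(root, "30s_denoised")` or `list_case_slices(root, "CT")` would cover the other two series in the same way.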
Annotations are provided in a single multi-sheet Excel file. Each sheet
corresponds to a specific case, e.g., sheet "1" contains the
annotations given for case 1.
Both the 3-D bounding boxes and the segmentation masks of each "hot
spot" are provided.
Note that the 3-D bounding boxes were derived from their corresponding
segmentation masks; the bounding boxes alone may be sufficient for
training.
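To make the relationship between a mask and its box concrete, here is a minimal sketch that computes the tightest box around a set of mask voxels. The function name `bounding_box` is hypothetical. The output follows the order of the annotation rows (x1, x2, y1, y2, z1, z2), and treating the upper bounds as inclusive is an assumption; the text does not say how the provided boxes treat their end coordinates.

```python
def bounding_box(voxels):
    """Return (x1, x2, y1, y2, z1, z2) enclosing a list of (x, y, z) voxels.

    Upper bounds are inclusive here, which is an assumption.
    """
    if not voxels:
        raise ValueError("mask is empty")
    # Transpose the list of triples into three coordinate sequences.
    xs, ys, zs = zip(*voxels)
    return (min(xs), max(xs), min(ys), max(ys), min(zs), max(zs))
```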
Within each sheet, a single "hot spot" is defined by 7 rows and is named
B<#> (e.g., B01). The rows have the following format:
1. B<#> [a1,a2,a3...,aN]
2. B<#>-x1 X1
3. B<#>-x2 X2
4. B<#>-y1 Y1
5. B<#>-y2 Y2
6. B<#>-z1 Z1
7. B<#>-z2 Z2
where [a1, a2, a3, ..., aN] are the linear indices of the 3-D mask
locations of that "hot spot", and X1, X2, Y1, Y2, Z1, Z2 are the
coordinates defining its 3-D bounding box.
Each sheet also contains two additional rows:
- cols COLS
- rows ROWS
where COLS and ROWS are the dimensions of the image in the x-y plane,
i.e., COLS is the number of columns in each slice of the scan and ROWS
is the number of rows in each slice of the scan.
These two rows, which define the dimensions of the image in the x-y
plane, are needed to convert the linear mask indices to voxel
locations.
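The sheet layout described above could be grouped into per-hot-spot records roughly as follows. This sketch takes the rows of one sheet as (label, value) pairs, however they were read from the Excel file. The function name `parse_sheet`, the two-column layout, and the assumption that the mask cell holds either a list or a bracketed string of integers are all illustrative guesses, not a description of the actual file.

```python
import re

# Matches "B01" (mask row) or "B01-x1" ... "B01-z2" (bounding-box rows).
HOTSPOT_LABEL = re.compile(r"^(B\d+)(?:-([xyz][12]))?$")


def parse_sheet(rows):
    """Group (label, value) rows of one sheet into hot spots and image size."""
    hot_spots = {}
    size = {}
    for label, value in rows:
        label = str(label).strip()
        if label in ("cols", "rows"):
            size[label] = int(value)
            continue
        match = HOTSPOT_LABEL.match(label)
        if match is None:
            continue  # ignore rows that do not follow the described format
        name, coord = match.groups()
        spot = hot_spots.setdefault(name, {"mask": [], "box": {}})
        if coord is None:
            # Mask row: accept a list or a string such as "[12, 13, 14]".
            if isinstance(value, str):
                value = [int(v) for v in re.findall(r"\d+", value)]
            spot["mask"] = list(value)
        else:
            spot["box"][coord] = int(value)
    return hot_spots, size
```

Keying the result by hot-spot name keeps the mask and its box together, which is convenient because the names are only meaningful within a single case.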
For example, given a linear index a1, its voxel location (x, y, z) is:
z = a1 / (ROWS * COLS)
y = (a1 % (ROWS * COLS)) / ROWS
x = (a1 % (ROWS * COLS)) % ROWS
where % is the modulus operator and / denotes integer division (the
remainder is discarded).
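The conversion formulas above translate directly into a few lines of code. The sketch below is a minimal version, with the hypothetical name `index_to_voxel`, and it assumes the linear indices are zero-based. If the indices turn out to be one-based, subtract 1 before converting. The auxiliary script on the "Dataset" page remains the authoritative reference.

```python
def index_to_voxel(index, rows, cols):
    """Convert a linear mask index to (x, y, z) using the formulas above.

    Assumes zero-based indices; `rows` and `cols` are the ROWS and COLS
    values from the annotation sheet.
    """
    # z = index / (ROWS * COLS), remainder is the offset within the slice.
    z, in_slice = divmod(index, rows * cols)
    # y = in_slice / ROWS, x = in_slice % ROWS.
    y, x = divmod(in_slice, rows)
    return x, y, z


# Example with a small 4 x 3 slice: index 31 = 2 * 12 + 1 * 4 + 3.
print(index_to_voxel(31, rows=4, cols=3))  # (3, 1, 2)
```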
An auxiliary script for converting the linear indices to a 3-D binary mask is available on the "Dataset" page.
*Note that the "hot spot" numbers are arbitrary and are not consistent between cases.
Submission
Each participant must detect all the suspicious "hot spots" in each
scan.
Each detection of a "hot spot" should be provided as a 3-D bounding box
in the format (x1, y1, z1, x2, y2, z2), where x1 is the left-most
horizontal coordinate, y1 is the upper-most vertical coordinate, z1 is
the first z coordinate (first slice), x2 is the right-most horizontal
coordinate, y2 is the bottom-most vertical coordinate, and z2 is the
last z coordinate (last slice). All detections should be provided in a
single CSV file with the following columns:
[case_id, x1, y1, z1, x2, y2, z2, score],
where the "case_id" column contains the case number (1, 2, etc.), the
x1, y1, z1, x2, y2, z2 columns are as described above, and the "score"
column is the confidence that the proposed bounding box contains a "hot
spot".
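A submission file with the columns listed above could be written along these lines. The function name `write_submission` and the file name in the usage note are hypothetical, and including a header row is an assumption, since the text only names the columns. Note that the column order here (x1, y1, z1, x2, y2, z2) differs from the order of the annotation rows.

```python
import csv

COLUMNS = ["case_id", "x1", "y1", "z1", "x2", "y2", "z2", "score"]


def write_submission(path, detections):
    """Write detections (dicts keyed by COLUMNS) to a CSV file.

    A header row is written first; whether the organizers expect one is
    an assumption.
    """
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(detections)
```

For instance, `write_submission("submission.csv", [{"case_id": 1, "x1": 10, "y1": 20, "z1": 5, "x2": 30, "y2": 40, "z2": 9, "score": 0.87}])` would produce one header line and one detection line.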
Evaluation
Submissions will be evaluated using the F1-score [1].
The F1-score, averaged over all the "hot spots" in the 13 test cases,
will be used to rank the participants' algorithms on the leaderboard.
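For reference, the standard F1-score can be computed from detection counts as in the sketch below. The function name `f1_score` is illustrative. How a detection is matched to an annotated "hot spot" (for example, by an overlap threshold) and how the "score" column enters the evaluation are not specified in the text, so the counts are simply taken as inputs here.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2 * TP / (2 * TP + FP + FN); returns 0.0 when undefined."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No detections and no annotations: treated as 0.0 by convention here.
        return 0.0
    return 2 * true_positives / denominator
```

This is equivalent to the harmonic mean of precision, TP / (TP + FP), and recall, TP / (TP + FN).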
References
[1] C. J. van Rijsbergen, Information Retrieval, London: Butterworths, 1979.