Data

To participate, first register via the "Join" button. Once your registration is approved, the links for
downloading the data will be available on the "Dataset" page.

The dataset comprises 68 clinical PET-CT cases. Each case consists of two PET scans (fast acquisition and enhanced version) and their corresponding CT scan.
Note that the CT scans are not mandatory for the detection task, but they may provide important information.
The scans were acquired on a Philips Vereos PET-CT scanner, with an acquisition time of 30 seconds per bed position (pbp).
Suspicious "hot spot" annotations were marked in a semi-manual process for each scan.
Annotations are provided for only 50 cases (train set); the remaining 11 cases are used for evaluation.
Note that both PET scans (fast acquisition and enhanced version) have the same annotations, as both scans are perfectly aligned.
All scans are provided in anonymized DICOM format, with all the necessary tags retained.

The data is arranged in the following way:
├── 30s/
│   ├── 1/
│    |    | ── [1.dcm, 2.dcm...]
│   ├── 2/
│    |    | ── [1.dcm, 2.dcm...]
├── 30s_denoised/
│   ├── 1/
│    |    | ── [1.dcm, 2.dcm...]
│   ├── 2/
│    |    | ── [1.dcm, 2.dcm...]
├── CT/
│   ├── 1/
│    |    | ── [1.dcm, 2.dcm...]
│   ├── 2/
│    |    | ── [1.dcm, 2.dcm...]
├──
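
As a minimal sketch of walking this layout, the Python snippet below collects the slice files of every case in one series folder. The function name `list_case_slices` and its parameters are hypothetical. It also assumes that case folders and slice files have purely numeric names, and that file-name order matches slice order; the DICOM tags are the more reliable source for slice ordering.

```python
from pathlib import Path


def list_case_slices(root, series="30s"):
    """Map each case ID to its slice files, ordered by numeric file name.

    Assumes the layout <root>/<series>/<case_id>/<n>.dcm shown above.
    """
    cases = {}
    series_dir = Path(root) / series
    for case_dir in series_dir.iterdir():
        # Skip anything that is not a numerically named case folder.
        if not case_dir.is_dir() or not case_dir.name.isdigit():
            continue
        # Sort numerically so that 10.dcm comes after 2.dcm, not before it.
        slices = sorted(case_dir.glob("*.dcm"), key=lambda p: int(p.stem))
        cases[int(case_dir.name)] = slices
    return cases
```

Calling `list_case_slices(root, "30s_denoised")` or `list_case_slices(root, "CT")` would cover the other two series in the same way.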

Annotations are provided in a single multi-sheet Excel file. Each sheet corresponds to a specific case, e.g., sheet "1" contains the annotations for case 1.
To provide further information, both the 3-D bounding box and the segmentation mask of each "hot spot" are provided.
Note that the 3-D bounding boxes were derived from their corresponding segmentation masks, and may be sufficient for training on their own.

Each sheet has the annotations in the following format:

A single "hot spot" is defined by 7 rows, and has the name B\<#> (e.g., B01), with the following format:
1. B\<#>             [a1,a2,a3...,aN]
2. B\<#>-x1                X1    
3. B\<#>-x2                X2    
4. B\<#>-y1                Y1    
5. B\<#>-y2                Y2    
6. B\<#>-z1                Z1    
7. B\<#>-z2                Z2    

where [a1, a2, a3, ..., aN] are the linear indices of the 3-D mask locations of the specific "hot spot", and X1, X2, Y1, Y2, Z1, Z2 are the coordinates defining the 3-D bounding box.
Each sheet also has two additional rows:

  1. cols      COLS
  2. rows    ROWS

where COLS and ROWS are the dimensions of the image in the x-y plane, i.e., COLS is the number of columns in each slice of the scan and ROWS is the number of rows in each slice of the scan.
The additional two rows defining the dimension of the image in the x-y plane are necessary for converting the linear mask indices to voxel locations.
For example, assume that a1 is a linear index; its specific voxel location (i.e., x, y, z) is:
z = a1 / (ROWS * COLS)
y = (a1 % (ROWS * COLS)) / ROWS
x = (a1 % (ROWS * COLS)) % ROWS
where / is integer division (the fractional part is discarded) and % is the modulus operator.
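
The conversion above can be sketched in Python as follows. The function name `linear_index_to_voxel` is hypothetical, and the sketch assumes zero-based linear indices and integer division; it follows the three formulas literally.

```python
def linear_index_to_voxel(a, rows, cols):
    """Convert a linear mask index to an (x, y, z) voxel location.

    Follows the formulas above literally; zero-based indexing is an assumption.
    """
    slice_size = rows * cols      # number of voxels in one x-y slice
    z = a // slice_size           # slice index
    in_slice = a % slice_size     # offset of the voxel within its slice
    y = in_slice // rows
    x = in_slice % rows
    return x, y, z
```

For instance, with ROWS = 4 and COLS = 5, the linear index 47 maps to (x, y, z) = (3, 1, 2).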

An auxiliary script for converting the linear indices to a 3-D binary mask is available on the "Dataset" page.

*Note that the "hot spot" numbers are arbitrary and are not consistent between cases.
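
The per-sheet row layout described above could be grouped into one record per "hot spot" roughly as follows. This is only a sketch: `parse_sheet_rows` is a hypothetical helper, and it assumes the sheet has already been loaded into a list of (label, value) pairs with the mask indices available as a sequence of integers. How the indices are actually stored in the Excel cells is not specified here.

```python
def parse_sheet_rows(rows):
    """Group (label, value) pairs from one sheet into hot spots and image size.

    `rows` is a hypothetical, already-loaded list of (label, value) pairs,
    e.g. [("B01", [5, 6]), ("B01-x1", 1), ..., ("cols", 200), ("rows", 200)].
    """
    hot_spots = {}
    size = {}
    for label, value in rows:
        if label in ("cols", "rows"):
            # Image dimensions in the x-y plane.
            size[label] = int(value)
        elif "-" in label:
            # Bounding-box row such as "B01-x1".
            name, key = label.split("-", 1)
            hot_spots.setdefault(name, {})[key] = int(value)
        else:
            # Mask row such as "B01": the linear indices of the mask voxels.
            hot_spots.setdefault(label, {})["mask"] = list(value)
    return hot_spots, size
```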

Submission

Each participant must detect all suspicious "hot spots" in each scan.
Each detection of a specific "hot spot" should be provided as a 3-D bounding box in the format (x1, y1, z1, x2, y2, z2), where x1 is the left-most horizontal coordinate, y1 is the upper-most vertical coordinate, z1 is the first z coordinate (first slice), x2 is the right-most horizontal coordinate, y2 is the bottom-most vertical coordinate, and z2 is the last z coordinate (last slice).
All detections should be provided in a single CSV file with the following columns:
[case_id, x1, y1, z1, x2, y2, z2, score],
where the "case_id" column contains the specific case number (1, 2, etc.), the x1, y1, z1, x2, y2, z2 columns are as described above, and the "score" column is the confidence that the proposed bounding box contains a "hot spot".
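
A minimal sketch of writing such a file with Python's standard `csv` module is shown below. The function name `write_detections` and the example detection values are illustrative only, and the presence of a header row is an assumption, since the text above does not say whether one is expected.

```python
import csv
import io

COLUMNS = ["case_id", "x1", "y1", "z1", "x2", "y2", "z2", "score"]


def write_detections(detections, stream):
    """Write detections as CSV rows in the column order listed above.

    `detections` is an iterable of (case_id, x1, y1, z1, x2, y2, z2, score).
    """
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(COLUMNS)  # header row (assumed to be expected)
    for det in detections:
        if len(det) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} values, got {len(det)}")
        writer.writerow(det)


# Illustrative values only: one detection in case 1 with confidence 0.87.
buffer = io.StringIO()
write_detections([(1, 10, 20, 5, 30, 40, 9, 0.87)], buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` as `stream`.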

Evaluation

Submissions will be evaluated using the F1-score [1].
The F1-score will be averaged over all the "hot spots" across the 13 test cases and used to rank the participants' algorithms on the leaderboard.
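
For reference, the F1-score is the harmonic mean of precision and recall. The sketch below computes it from counts of true positives, false positives, and false negatives. The criterion for matching a detection to an annotated "hot spot" is not specified here, so the counts are taken as given, and the function name `f1_score` is hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    # With no annotated hot spots and no detections, F1 is undefined;
    # returning 0.0 here is an arbitrary choice for this sketch.
    return 2 * tp / denom if denom else 0.0
```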

References

[1] C. J. van Rijsbergen, Information Retrieval. London: Butterworths, 1979.