File Structure

In every scenario except the test scenarios, we provide several rooms. Rooms that share a prefix were segmented from the same area, so assembling them reconstructs the complete area.

The point cloud for each room is stored as a binary PLY file with eight columns of data: ‘x’, ‘y’, ‘z’, ‘red’, ‘green’, ‘blue’, ‘sem’, ‘ins’. The ‘x’, ‘y’, and ‘z’ coordinates are 32-bit floating-point numbers, the ‘red’, ‘green’, and ‘blue’ values range from 0 to 255, and ‘sem’ and ‘ins’ are stored as uint16. ‘sem’ is the semantic label of the point, and ‘ins’ is its instance label. Points with a semantic label of 0 (i.e., points that are not annotated) should also have an ‘ins’ label of 0, whereas the ‘ins’ labels of all annotated points are greater than or equal to 1. In other words, if a scene contains no unannotated points, its instance numbers start from 1; otherwise, they start from 0.
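The layout described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not an official loader: it assumes that the points are stored in the first element declared in the PLY header and that the property names in the header match the column names listed above. The function names `read_room` and `check_labels` are hypothetical. The second function checks the labeling convention, namely that ‘ins’ is 0 exactly when ‘sem’ is 0.

```python
import struct

# PLY scalar type names (both naming styles) mapped to struct format codes.
PLY_TO_STRUCT = {
    "char": "b", "int8": "b", "uchar": "B", "uint8": "B",
    "short": "h", "int16": "h", "ushort": "H", "uint16": "H",
    "int": "i", "int32": "i", "uint": "I", "uint32": "I",
    "float": "f", "float32": "f", "double": "d", "float64": "d",
}


def read_room(path):
    """Read a binary PLY file into a list of per-point dicts."""
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        endian, count, names, codes, elements_seen = "<", 0, [], [], 0
        while True:
            line = f.readline()
            if not line:
                raise ValueError("header ended without end_header")
            tokens = line.decode("ascii", "replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break
            if tokens[0] == "format":
                if tokens[1] == "ascii":
                    raise ValueError("expected a binary PLY file")
                endian = "<" if tokens[1] == "binary_little_endian" else ">"
            elif tokens[0] == "element":
                elements_seen += 1
                if elements_seen == 1:
                    # Assumption: the first element holds the points.
                    count = int(tokens[2])
            elif tokens[0] == "property" and elements_seen == 1:
                if tokens[1] == "list":
                    raise ValueError("list properties are not handled here")
                names.append(tokens[2])
                codes.append(PLY_TO_STRUCT[tokens[1]])
        # An explicit byte-order prefix disables padding between fields.
        record = struct.Struct(endian + "".join(codes))
        data = f.read(record.size * count)
    return [dict(zip(names, row)) for row in record.iter_unpack(data)]


def check_labels(points):
    """Return the indices of points that break the sem/ins convention."""
    bad = []
    for i, point in enumerate(points):
        # 'ins' must be 0 exactly when 'sem' is 0 (unannotated point).
        if (point["sem"] == 0) != (point["ins"] == 0):
            bad.append(i)
    return bad
```

Calling `check_labels(read_room(path))` on a room file returns an empty list when every point follows the convention. For rooms with many points, an array library would be considerably faster than per-point dicts; the pure-Python version is used here only to keep the sketch dependency-free.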

The table below shows the correspondence between semantic names and semantic labels.

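| Semantic name | Semantic label |
| --- | --- |
| Unknown | 0 |
| Houseplant | 3 |
| Tree | 4 |
| Person | 5 |
| Floor | 9 |
| Stair | 10 |
| Ceiling | 11 |
| Pipe | 12 |
| Wall | 13 |
| Pillar | 14 |
| Window | 15 |
| Curtain | 16 |
| Door | 17 |
| Table | 18 |
| Chair | 19 |
| Sofa | 20 |
| Blackboard | 21 |
| Monitor | 22 |
| Bookshelf | 23 |
| Wardrobe | 24 |
| Bed | 25 |
| Reflection noise | 26 |
| Ghost | 27 |
| Light | 29 |
| Tabletop others | 30 |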
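
Because the label numbers in the table are not contiguous (for example, 1, 2, and 28 are not listed), a dictionary is a more natural lookup structure than a list. The sketch below transcribes the table and counts points per class; the names `SEMANTIC_NAMES` and `count_points_per_class` are hypothetical, and reporting labels that are absent from the table as "Unlisted" is an assumption made only for illustration.

```python
from collections import Counter

# Transcribed from the semantic label table above.
SEMANTIC_NAMES = {
    0: "Unknown", 3: "Houseplant", 4: "Tree", 5: "Person", 9: "Floor",
    10: "Stair", 11: "Ceiling", 12: "Pipe", 13: "Wall", 14: "Pillar",
    15: "Window", 16: "Curtain", 17: "Door", 18: "Table", 19: "Chair",
    20: "Sofa", 21: "Blackboard", 22: "Monitor", 23: "Bookshelf",
    24: "Wardrobe", 25: "Bed", 26: "Reflection noise", 27: "Ghost",
    29: "Light", 30: "Tabletop others",
}


def count_points_per_class(sem_labels):
    """Map each semantic name to the number of points carrying its label."""
    counts = Counter(sem_labels)
    return {
        # Labels missing from the table are reported rather than dropped.
        SEMANTIC_NAMES.get(label, f"Unlisted ({label})"): n
        for label, n in sorted(counts.items())
    }
```

The function accepts any iterable of ‘sem’ values, such as the ‘sem’ column read from a room file.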


We also provide PCAT, a point cloud annotation tool for annotating point clouds quickly and conveniently. Our dataset was annotated with this tool. More detailed information about it is available in the GitHub repository below.

Point Cloud Annotation Tool


Because the labels of our test set are not public, we use the Codabench platform to evaluate predicted results. More detailed information is available at the link below.

semantic segmentation task