# Photogrammetry and georegistration tools for Parrot drone videos

This is a set of Python scripts and C++ programs used to construct a depth validation set from a Lidar-generated point cloud.
For a brief recap of what it does, see section [How it works](#how-it-works).

## Table of contents

* [Software Dependencies](#software-dependencies)
* [Hardware Dependencies](#hardware-dependencies)
* [How it works](#how-it-works)
* [Step by step guide](#usage)
* [Special case : adding new images to an existing constructed dataset](#special-case-adding-new-images-to-an-existing-dataset)
* [Using the constructed dataset for evaluation](#evaluation)
* [Detailed method with the manoir example](#detailed-method-with-the-manoir-example)
* [TODO](#todo)


## Software Dependencies

*Note*: A Dockerfile is provided to build a Docker image that satisfies all the software dependencies. You can build it with:

```
docker build . -t my_image
```

These are the tools used. Make sure to install them before running the scripts.

 - [CUDA](https://developer.nvidia.com/cuda-downloads) (version: 10+)
 - [OpenCV](https://opencv.org/) (version: 4.0.0+)
 - [ETH3D Dataset-pipeline](https://github.com/ETH3D/dataset-pipeline) (version: master)
 - [PyTorch](https://pytorch.org/) (version: 1.7.0+)
 - [COLMAP](https://colmap.github.io/) (version: master)
 - [PDrAW from AnafiSDK](https://developer.parrot.com/docs/pdraw/) (version: master)

Apart from CUDA, which you need to install yourself, you can use the helper script `install_dependencies.sh` to install them on Ubuntu 20.04.

For PDrAW, there should be a `native-wrapper.sh` file that you need to keep track of. It's usually in `groundsdk/out/pdraw-linux/staging/native-wrapper.sh` (see [here](https://developer.parrot.com/docs/pdraw/installation.html)).

For COLMAP, you will need a vocab tree for feature matching. You can download one at https://demuc.de/colmap/#download . In our tests, we used the 256K version.

## Hardware dependencies

To recreate the results of the study, you will need the following hardware:
 - Parrot Anafi
 - DJI Matrice 600
 - Velodyne Puck VLP16

Note that for our study, we provided the Anafi drone (\~700€), and the point cloud was created by a private company (\~3500€ for the whole scan process).


## How it works

Here are the key steps of the dataset creation :
See [Detailed method with the manoir example](#detailed-method-with-the-manoir-example) for a concrete example with options used.

1. Data acquisition on a particular scene
    - Make a photogrammetry flight plan with any drone. You can use e.g. the Anafi with the Pix4D capture app (it's free). It is important that the pictures have GPS info in their EXIF metadata.
    - Make some natural flights in the same scene, using either a Bebop2 or an Anafi to be able to use the PDrAW tool. In theory, it is possible to adapt the current scripts to any IMU-GPS-powered camera.
    - Make a Lidar scan of this very scene, and clean the resulting 3D point cloud: this is a crucial part, as Lidar data will be assumed perfect for the rest of the workflow. You also need to note the projection system used (e.g. `EPSG 2154`) for georegistration. The file will a priori be a `.las` file with float64 values.

2. Convert the `.las` float64 point cloud into a `.ply` float32
    - As 3D values are global, x and y will be huge. You need to make the cloud 0-centered by subtracting its centroid from it.
    - The centroid needs to be logged somewhere for future frame registration
    - This step is done by the script `las2ply.py`

3. Extract optimal frames from video for a thorough photogrammetry that will use a mix of Pix4D flight plan pictures and video still frames.
    - The total number of frames must not be too high, to prevent the reconstruction from lasting too long on a single desktop (we recommend between 500 and 1000 images).
    - At the same time, extract if possible information on camera parameters to identify which video sequences share the same parameters (e.g. 4K videos vs 720p videos, or different zoom levels).
    - This step is done by the script `videos_to_colmap.py` (See Step by step guide, Step 5)

4. Georeference your images.
    - For each frame with a *GPS* position, convert it to *XYZ* coordinates in the projection system used by the Lidar point cloud (here, EPSG:2154 was used).
    - Subtract from these coordinates the centroid that was logged when converting the LAS file to PLY.
    - Log the image filename and centered *XYZ* position in a file for georegistration of the reconstructed point cloud.
    - This step is also done by the script `videos_to_colmap.py` (See Step by step guide, Step 5)

4. Generate sky maps of your drone pictures to help the photogrammetry filter out noise during matching (Step 6)
    - Use a Neural Network to segment the drone picture and generate masks so that the black areas will be ignored
    - This is done with the script `generate_sky_masks.py`

3. Perform a photogrammetry on your pictures (See Step by step guide, Steps 6 - 7 - 8)
    - The recommended tool is COLMAP because further tools will use its output format
    - You should get a sparse 3D model, exhaustive enough to : 
        - Be matched with the Lidar Point cloud
        - Localize other video frames in the reconstruction

4. Shift and scale the reconstructed point cloud to match the Lidar point cloud
    - See [here](https://colmap.github.io/faq.html#geo-registration) for point cloud georegistration with colmap (See Step by step guide, Step 9)

5. Video Localization:
    - Continue the photogrammetry with video frames at a low fps (we took 1fps). We do this in order to keep the whole mapping at a linear time (See Step by step guide, Step 10.3)
    - Merge all the resulting models into one full model with thorough photogrammetry frames and all the 1fps frames (See Step by step guide, Step 10.)
    - Finish registering the remaining frames. For RAM reasons, every video is divided into chunks, so that a registered sequence is never more than e.g. 4000 frames (See Step by step guide, Step 10.)
    - Filter the final model at full framerate: remove points with absurd angular and translational acceleration. Interpolate the resulting discarded points (but keep track of them). This is done in the script `filter_colmap_model.py` (See Step by step guide, Step 10.)

6. Densify the resulting point cloud with COLMAP (see [here](https://colmap.github.io/tutorial.html#dense-reconstruction))
    - Export a PLY file along with the VIS file with `colmap stereo_fusion`

7. Match the Lidar Point cloud and full reconstruction point cloud together with ICP.
    - The georegistration of the reconstructed point cloud should be sufficient to get a good starting point.
    - From experience, the best method here is to use CloudCompare, but you can also use ETH3D or PCL to do it.
    - The resulting transformation matrix should be stored in a TXT file, the same way CloudCompare proposes to do it.

8. Construct a PLY+VIS file pair based on lidar scan
    - The PLY file is basically every lidar point
    - The VIS file stores the frames from which each point is visible: we reuse the PLY+VIS pair from the densification step, and assume a lidar point has the same visibility as the closest point in the densified reconstructed point cloud.

9. Run the Delaunay mesher tool from COLMAP to construct an occlusion mesh

10. Construct the Splats with ETH3D

11. Construct the ground truth Depth with ETH3D

12. Visualize and Convert the resulting dataset to match the format of a more well known dataset, like KITTI.
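
The centering and georeferencing steps above rely on the same logged centroid. The snippet below is a minimal sketch in plain Python, not the actual `las2ply.py` or `videos_to_colmap.py` (their internals are not shown here). It illustrates why the cloud is centered before being stored as float32, and how the logged centroid is reused for camera positions. The helper names and all coordinates are made up for illustration.

```python
import struct


def to_float32(value):
    """Round-trip a Python float (float64) through float32 storage."""
    return struct.unpack("f", struct.pack("f", value))[0]


def centroid(points):
    """Mean of a list of (x, y, z) tuples, computed in float64."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def center(points, c):
    """Subtract the centroid c from every point."""
    return [tuple(p[i] - c[i] for i in range(3)) for p in points]


# Hypothetical Lidar points in a projected system (metres), e.g. EPSG:2154.
lidar = [
    (651234.123, 6861234.456, 35.10),
    (651236.623, 6861230.956, 36.40),
    (651230.373, 6861237.206, 34.80),
]

c = centroid(lidar)  # must be logged for the later georeferencing step
centered = center(lidar, c)

# float32 keeps about 7 significant digits: raw northings lose precision,
# centered ones do not (at this scale).
raw_error = max(abs(to_float32(p[1]) - p[1]) for p in lidar)
centered_error = max(abs(to_float32(p[1]) - p[1]) for p in centered)

# A camera position already converted to the same projected system
# is shifted by the *same* logged centroid.
camera_xyz = (651233.0, 6861233.0, 80.0)
camera_centered = tuple(camera_xyz[i] - c[i] for i in range(3))
```

With these sample values, the float32 rounding error on raw northings is in the range of tens of centimetres, while it is negligible once the cloud is centered.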


## Usage

It is expected that you read the COLMAP tutorial in order to understand key COLMAP concepts and data structure : https://colmap.github.io/tutorial.html

### Working directories

You will be working with 3 different folders :
 - Input directory : Where you store your acquired data : Lidar point clouds, photogrammetry pictures and videos
 - Workspace : Data from input directory will be copied into this directory so that COLMAP can run several photogrammetry tasks
 - Output folder: Where the processed data will be stored. This will be the biggest directory.

#### Input directory

Structure your input folder so that it looks like this:

```
Input
├── Pictures
│   ├── folder1
│   │   ├── subfolder1
│   │   │   ├── 01.jpg
│   │   │   └── ...
│   │   ├── subfolder2
│   │   └── ..
│   └── folder2
├── Videos
│   └── no_groundtruth
└── Lidar
```

- `Pictures` contains the photogrammetry pictures. They can be in JPEG or RAW format. Each subfolder implies a shared camera calibration. If two pictures are taken with the same camera but with different zoom levels, they need to be in separate folders.
- `Videos` contains the videos you took, which will be converted into frames and then used in the photogrammetry process.
    * `no_groundtruth` contains the videos you want to use for the whole model reconstruction (the thorough photogrammetry step), but you don't want the whole video to be localized. Thus, no depth ground truth will be generated for these videos.
    * Other videos will be used for photogrammetry AND for localization, and depth and odometry will be produced for every frame of the video.
- `Lidar` contains the lidar models. They can be `LAS` files or `PLY` files. In the case your point clouds are geo-referenced, you need to know what projection system was used in order to compute picture position with respect to the clouds using GPS coordinates.
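
As an illustration of the layout above, the input folder can be scanned with a few lines of `pathlib`. This is only a sketch, not the pipeline's actual loader: the function name `scan_input` and the returned dictionary keys are hypothetical, and the extension sets simply mirror the defaults documented for `--vid_ext` and `--pic_ext`.

```python
from pathlib import Path

# Assumed defaults, taken from the documented `--vid_ext` and `--pic_ext` options.
VIDEO_EXT = {".mp4", ".MP4"}
PICTURE_EXT = {".jpg", ".JPG", ".png", ".PNG"}


def scan_input(input_folder):
    """Sort the files of an input folder into the roles described above."""
    root = Path(input_folder)
    videos_dir = root / "Videos"
    result = {"pictures": [], "videos_to_localize": [], "videos_no_groundtruth": []}

    for path in sorted((root / "Pictures").rglob("*")):
        if path.suffix in PICTURE_EXT:
            result["pictures"].append(path)

    for path in sorted(videos_dir.rglob("*")):
        if path.suffix not in VIDEO_EXT:
            continue
        # Videos under `no_groundtruth` only feed the thorough photogrammetry;
        # no depth ground truth is generated for them.
        if "no_groundtruth" in path.relative_to(videos_dir).parts:
            result["videos_no_groundtruth"].append(path)
        else:
            result["videos_to_localize"].append(path)
    return result
```

Calling `scan_input("/media/user/data/input_dataset/")` would return the three lists of paths, which makes it easy to check that every video landed in the intended category before launching a long run.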

#### Workspace directory

Your workspace (which is different from the input folder!) will be used throughout your whole dataset creation pipeline. It should be stored on an SSD and is organized this way:

```
Workspace
├── Lidar
├── COLMAP_img_root
│   ├── Individual_pictures
│   │   ├── folder1
│   │   ├── folder2
│   │   └── ...
│   └── Videos
│       ├── resolution1
│       │   ├── video1
│       │   └── ...
│       ├── resolution2
│       └── ...
├── Masks
│   └── Same structure as 'COLMAP_img_root' folder
├── Thorough
│   ├── 0
│   ├── 1
│   ├── ...
│   ├── georef
│   ├── georef_full
│   └── dense
├── Video_reconstructions
│   ├── Same structure as 'COLMAP_img_root/Videos' folder
│   ├── resolution1
│   │   └── video1
│   │       ├── lowfps
│   │       ├── chunk_0
│   │       ├── chunk_1
│   │       ├── ...
│   │       └── final
│   └── ...
├── scan_thorough.db
├── lidar.mlp
└── aligned.mlp
```

- `Lidar` will store the point clouds converted and centered from LAS point clouds
- `COLMAP_img_root` is the folder that COLMAP picture paths are relative to. In every COLMAP command, it is indicated with the option `--image_path`. That is also where we store the pictures COLMAP will use.
    * `Individual pictures` contains the pictures that were not extracted from a video, e.g. from a photogrammetry flight plan. Subfolders can be used. You need to ensure that pictures in the same subfolder share the exact same camera calibration: image size, focal length, distortion.
    * `Videos` contains pictures extracted from videos. In each video folder, in addition to the video frames, a `metadata.csv` file is stored in order to keep metadata, such as frame timestamp, image size, image calibration (if available, which is the case for Anafi videos), image position from GPS, and COLMAP database id. Note that we don't need to store all the video frames in this folder at the same time. Most of the time, we only need a subset when mapping the whole reconstruction. When localizing the video frames, all the frames are in the directory, but you can remove them (apart from the subset) as soon as you are finished with localizing the video, and copy the frames to the output folder (see below).
- `Masks` is a mirror of `COLMAP_img_root`. For each image, there is a black and white picture used to discard parts of the image during feature point computation. See https://colmap.github.io/faq.html#mask-image-regions
- `Thorough` is the output of the first photogrammetry, the one that will try to reconstruct the whole model with a subset of the images we acquired. It will then be used to localize the remaining images, as well as to localize the Lidar point cloud with respect to the images.
    * `0`, ... , `N` are the folders containing sparse models reconstructed by COLMAP with the function `colmap mapper`. See https://colmap.github.io/tutorial.html#sparse-reconstruction
    * `georef` is the folder containing the geo-referenced model. When GPS is available, we use it to have a first guess to register the reconstructed model with the Lidar point cloud. Note that this also applies the right scale to the model. See https://colmap.github.io/faq.html#geo-registration
    * `georef_full` is the folder containing the geo-referenced model, augmented with additional frames from video localization.
    * `dense` is the folder containing the dense model. See https://colmap.github.io/tutorial.html#dense-reconstruction
- `Video_reconstructions` is the folder where we store the reconstruction for each video. It is based on the `georef` model. We then add a subset of the video frames (usually 1 fps) with the `colmap mapper`, augment the `georef_full` model with it (see https://colmap.github.io/faq.html#merge-disconnected-models), and localize the rest of the frames. Note that once all frames are localized, you no longer use the reconstruction until ground truth creation, so you can store it in the output folder as soon as you are finished with your video.
    * `lowfps` contains the model with the subset of frames
    * `chunk_0` to `chunk_N` contain the localized frames (but they do not contribute to the reconstruction). We cut the video into several chunks if the video is too long (and thus needs too much RAM)
    * `final` contains a COLMAP model with only the frames of the very video we want to localize. Everything other than image localization (3D points, 2D feature points for each image, irrelevant camera models) is removed from the model.
- `scan_thorough.db` is the database file used by COLMAP for the thorough reconstruction. Note that it can become quite big, which is why it does not contain features for all the video frames but only for a subset.
- `lidar.mlp` is the MeshLabProject containing all the PLY files from Lidar scans. The MeshLab project stores multiple PLY files and transformation information (in a 4x4 matrix) in an XML-like file.
- `aligned.mlp` is the MeshLabProject after the Lidar point clouds (i.e. `lidar.mlp`) have been registered with respect to the photogrammetry point cloud. The transformations are the ones to apply to the PLY files in order to be aligned with the COLMAP output.
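
Since `Masks` mirrors `COLMAP_img_root`, the mask of an image can be located from its relative path. The sketch below assumes the naming convention described in the COLMAP FAQ linked above, where the mask of `abc/012.jpg` is expected at `abc/012.jpg.png` under the mask root. The helper name `mask_path_for` is hypothetical and not part of this repository.

```python
from pathlib import PurePosixPath


def mask_path_for(image_path, image_root, mask_root):
    """Return where the mask of `image_path` is expected under `mask_root`.

    Assumes the COLMAP convention: same relative path, with `.png` appended
    to the full image file name.
    """
    relative = PurePosixPath(image_path).relative_to(image_root)
    return PurePosixPath(mask_root) / relative.parent / (relative.name + ".png")


example = mask_path_for(
    "Workspace/COLMAP_img_root/Videos/resolution1/video1/0001.jpg",
    "Workspace/COLMAP_img_root",
    "Workspace/Masks",
)
print(example)  # Workspace/Masks/Videos/resolution1/video1/0001.jpg.png
```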

#### Output directories

Your output directory is divided into 2 parts:
- Raw output directory, which will contain the full size images, with the depth maps generated by ETH3D
- Converted output directory, which will contain the resized images, with depths stored in numpy files, along with videos for visualization purpose.


```
Raw_output_directory
├── calibration
├── ground_truth_depth
│   ├── video1
│   ├── video2
│   └── ...
├── COLMAP_img_root
│   └── Same structure as 'Workspace/COLMAP_img_root' folder
├── models
│   └── Same structure as 'Workspace/Video_reconstructions' folder
├── occlusion_depth
│   ├── video1
│   ├── video2
│   └── ...
└── points
```

- `calibration` Byproduct of ETH3D `GroundTruthCreator`: where the COLMAP model used for the current video is stored. Overwritten for every video. All COLMAP models are already stored in the folder `models`.
- `points` Another byproduct of ETH3D `GroundTruthCreator`: Lidar point cloud where points not seen by more than 2 images are discarded. Note that, as in `calibration`, the file is overwritten for every video.
- `ground_truth_depth` Folder where ETH3D `GroundTruthCreator` stores raw depth maps. Note that the directory tree is not preserved: every video folder stem is stored in the root of the folder.
- `occlusion_depth` Same as `ground_truth_depth` but for occlusion depth maps instead of depth maps. Note that outputting this occlusion depth is optional and only serves visualization purposes for the converted dataset.
- `COLMAP_img_root` is a clone of `Workspace/COLMAP_img_root`. Contrary to the one in the workspace, it stores ALL the video frames, so it can be very heavy.
- `models` is a clone of `Workspace/Video_reconstructions`, it contains all the models that were used for video localization.

```
Converted_output_directory
├── dataset
│   ├── resolution1
│   │   ├── video1
│   │   │    ├── poses.txt
│   │   │    ├── intrinsics.txt
│   │   │    ├── interpolated.txt
│   │   │    ├── not_registered.txt
│   │   │    ├── camera.yaml
│   │   │    ├── 0001.jpg
│   │   │    ├── 0001_intrinsics.txt
│   │   │    ├── 0001_camera.yaml
│   │   │    ├── 0001.npy
│   │   │    ├── 0002.jpg
│   │   │    └── ...
│   │   └── ...
│   ├── resolution2
│   ├── ...
│   └── individual_pictures
│       ├── folder1
│       ├── folder2
│       └── ...
└── visualization
    ├── resolution1
    │   ├── video1.mp4
    │   ├── video2.mp4
    │   └── ...
    ├── resolution2
    ├── ...
    └── individual_pictures
        ├── folder1
        │   ├── 0001.png
        │   └── ...
        ├── folder2
        └── ...
```


- `dataset` contains the information that will be used for the actual evaluation. For every video, we have a folder containing :
    * `poses.txt` for frame odometry. It is a list of lines of 12 float numbers, representing the first 3 rows of the 4x4 transformation matrix (the last row is always `0 0 0 1`). The first matrix is always identity. This format is the same as KITTI odometry. Non-localized frames have a transformation matrix of `NaN` values.
    * `interpolated.txt` is a list of all the frame paths that have an interpolated odometry. For these frames, depth is invalid, but odometry can still be used to test depth algorithms that need frame odometry.
    * `not_registered.txt` is a list of all the frame paths that are not localized. They thus cannot be used for depth evaluation nor for odometry.
    * `intrinsics.txt` *(only if all the frames share the same camera)* contains the 3x3 intrinsics matrix in a txt file (does not contain distortion parameters)
    * `camera.yaml` *(only if all the frames share the same camera)* contains the camera parameters in COLMAP format, with distortions.
    * For every frame:
        * `{name}.jpg`
        * `{name}.npy`
        * `{name}_intrinsics.txt` and `{name}_camera.yaml`: only if frames don't share the same camera. Same format as `intrinsics.txt` and `camera.yaml` above.
    * Each folder stem in the individual pictures directory is treated as a video
- `visualization` contains images for visualization. Each video has a corresponding mp4 file. Individual pictures have a corresponding image file.
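
To make the `poses.txt` description concrete, here is a minimal parsing sketch in plain Python. It assumes whitespace-separated values in row-major order, as in the KITTI odometry format; the function name `parse_poses` and the sample values are made up for illustration and are not taken from the repository.

```python
import math


def parse_poses(text):
    """Parse lines of 12 floats into 4x4 matrices (nested lists).

    Returns one entry per frame: a 4x4 matrix, or None when the frame
    is not localized (its values are NaN).
    """
    poses = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        if len(values) != 12:
            raise ValueError(f"expected 12 values per line, got {len(values)}")
        if any(math.isnan(v) for v in values):
            poses.append(None)
            continue
        rows = [values[0:4], values[4:8], values[8:12]]
        rows.append([0.0, 0.0, 0.0, 1.0])  # implicit last row of the 4x4 matrix
        poses.append(rows)
    return poses


# Made-up sample: identity, a non-localized frame, then a translated frame.
sample = """\
1 0 0 0 0 1 0 0 0 0 1 0
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 0 0 0.5 0 1 0 0 0 0 1 2.0
"""
poses = parse_poses(sample)
localized = [i for i, pose in enumerate(poses) if pose is not None]
```

Here `localized` contains the indices of frames with a usable pose. For depth evaluation, frames listed in `interpolated.txt` and `not_registered.txt` should additionally be excluded.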

### How to read visualization

![h](images/example_viz.jpg)

The image is divided into 4 parts:

```
┌───┬───┐
│ A │ B │
├───┼───┤
│ C │ D │
└───┴───┘
```

 - `A` is the raw image.
 - `B` is the normalized depth map. It follows [OpenCV's Rainbow colormap](https://docs.opencv.org/master/d3/d50/group__imgproc__colormap.html#gga9a805d8262bcbe273f16be9ea2055a65af7f0add024009b0e43f8c83f3ca0b923). Note that OpenCV's rainbow is NOT the same as matplotlib's `gist_rainbow`.
 - `C` is `(A + B) / 2`. It helps check that depth and image are not shifted apart.
 - `D` is the normalized occlusion depth map. It follows Matplotlib's Bone colormap, which is the same as [OpenCV's Bone colormap](https://docs.opencv.org/master/d3/d50/group__imgproc__colormap.html#gga9a805d8262bcbe273f16be9ea2055a65a91d58e66f015ea030150bdc8545d3b41). See https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html

Note that when frame localization is interpolated, a 5-pixel-wide orange frame is visible at the edge of the picture.



### Running the full script

You can run the whole script with ```python main_pipeline.py```. If you don't have a lidar point cloud and want to use the COLMAP reconstructed cloud as ground truth, you can use ```python main_pipeline_no_lidar.py```, which is very similar, minus the point cloud cleaning and registration steps.

Example command :

```
python main_pipeline.py \
--input_folder /media/user/data/input_dataset/ \
--raw_output_folder /media/user/data/ground_truth_raw/ \
--converted_output_folder /media/user/data/ground_truth_converted/ \
--workspace ../workspace \
--total_frames 600 --SOR 10 5 \
--eth3d ../dataset-pipeline/build/ \
--nw ../AnafiSDK/out/pdraw-linux/staging/native-wrapper.sh \
--generic_model OPENCV \
--max_num_matches 25000 \
--match_method exhaustive \
--multiple_models --registration_method interactive \
--mesh_resolution 0.1 --splats --splat_threshold 0.05 \
--lowfps 1 --save_space --log out.log --system epsg:3949 -vv
```

#### Parameters breakdown

All the parameters for `main_pipeline.py` are defined in the file `cli_utils.py` and can be retrieved with `python main_pipeline.py -h`. You will find a summary below:

1. Main options
    * `--input_folder` : Input folder with LAS/PLY point clouds, videos, and images, as defined above.
    * `--workspace` : Path to the workspace where COLMAP operations will be done. It needs to be on an SSD. The size needed depends on video size, but should be at least 20 GB.
    * `--raw_output_folder` : Path to the output folder for raw depth maps. Must be very big, especially with 4K videos. For 4K 30fps video, count around 60 GB per minute of video.
    * `--converted_output_folder` : Path to the output folder for converted depth maps and visualization. Must be big, but usually smaller than the raw output because depth maps are still uncompressed, but downscaled.
    * `--show_steps` : If selected, will make a dry run just to list steps and their numbers.
    * `--skip_step` : Skip the selected steps. Can be useful when an operation is done manually.
    * `--begin_step` : Skip all steps before this step. Useful when the script failed at some point.
    * `--resume_work` : If selected, will try to skip videos already localized, and ground truth already generated.
    * `--inspect_dataset` : If selected, will open a window to inspect the dataset at key steps. See https://github.com/ETH3D/dataset-pipeline#dataset-inspection
    * `--save_space` : If selected, will try to save space in the workspace by only extracting needed frames and removing them as soon as they are no longer needed. Strongly advised.
    * `--vid_ext` : Video extensions to scrape from the input folder. By default will search for `mp4` and `MP4` files.
    * `--pic_ext` : Same as video extensions, but for images. By default will search for `jpg`, `JPG`, `png` and `PNG` files.
    * `--raw_ext` : Same as video extensions, but for RAW images. By default will search for `ARW`, `NEF` and `DNG` files.

2. Executable files
    * `--nw` : Native wrapper location. See https://developer.parrot.com/docs/pdraw/installation.html#run-pdraw
    * `--colmap` : COLMAP exec location. Usually just `colmap` if it has been installed system-wide.
    * `--ffmpeg` : ffmpeg exec location. Usually just `ffmpeg` if it has been installed system-wide.
    * `--eth3d` : ETH3D dataset pipeline exec files folder location. Usually at `dataset-pipeline/build/`.
    * `--pcl_util` : PCL util exec files. Usually at `pcl_util/build` (source in this repo)
    * `--log` : If set, will output stdout and stderr of these exec files to a log file, which can be read from another terminal with `tail`.

3. Lidar point cloud preparation
    * `--pointcloud_resolution` : If set, will subsample the Lidar point clouds at the chosen resolution.
    * `--SOR` : Statistical Outlier Removal parameters. This accepts 2 arguments: number of nearest neighbours and max relative distance to standard deviation. See https://pcl.readthedocs.io/projects/tutorials/en/latest/statistical_outlier.html
    * `--registration_method` : Method used for point cloud registration. Choose between "simple", "eth3d" and "interactive" ("simple" by default). See Manual step by step, step 11.

4. Video extractor
    * `--total_frames` : Total number of frames that will be used for the first thorough photogrammetry. By default 500, keep this number below 1000.
    * `--orientation_weight` : Weight applied to orientation during optimal sample. Higher means two pictures with same location but different orientation will be considered further apart.
    * `--resolution_weight` : Same as orientation, but with image size.
    * `--max_sequence_length` : COLMAP needs to load ALL the feature matches to register new frames. As such, some videos are too long to fit in RAM, and we need to divide the video into chunks that will be treated separately and then merged together. This parameter is the maximum number of frames for a chunk. The ideal value is around 500 frames for 1 GB of RAM, regardless of resolution.
    * `--num_neighbours` : Number of frames overlapping between chunks. This is for merge purposes.
    * `--system` : Coordinate system used for GPS. It should be the same as the one used by the LAS files.
    * `--lowfps`: Framerate at which videos will be scanned WITH reconstruction. 1fps by default.
    * `--include_lowfps_thorough` : If selected, will include video frames at lowfps for the thorough scan (longer). This can be useful when some videos are not GPS localized (e.g. handheld camera) and are still relevant for the thorough photogrammetry.

5. Photogrammetry
    * `--max_num_matches` : Max number of matches, lower it if you get GPU memory error.
    * `--vocab_tree` : Path to vocab tree, can be downloaded [here](https://demuc.de/colmap/#download)
    * `--multiple_models` : If selected, will let colmap mapper build multiple models. The biggest one will then be chosen.
    * `--more_sift_features` : If selected, will activate the COLMAP options `--SiftExtraction.domain_size_pooling` and `--SiftExtraction.estimate_affine_shape` during feature extraction. Be careful, this does not use GPU and is thus very slow. More info: https://colmap.github.io/faq.html#increase-number-of-matches-sparse-3d-points
Clément Pinard's avatar
Clément Pinard committed
390
    * `--add_new_videos` : If selected, will skip the mapping steps to directly register new video with respect to an already existing colmap model.
Clément Pinard's avatar
Clément Pinard committed
391
    * `--filter_models` : If selected, will filter video localization to smooth trajectory
Clément Pinard's avatar
Clément Pinard committed
392
393
394
395
396
397
398
399
400
401
402
    * `--stereo_min_depth` : Min depth for PatchMatch Stereo used during point cloud densification
    * `--stereo_max_depth` : Same as min depth but for max depth.

6. Occlusion Mesh
    * `--normals_method` : Method used for normal computation between radius and nearest neighbours.
    * `--normals_radius` : If radius method for normals, radius within which other points will be considered neighbours
    * `--normals_neighbours` : If nearest neighbours method chosen, number of neighbours to consider. Could be very close or very far points, but has a constant complexity.
    * `--mesh_resolution` : Mesh resolution for occlusion in meters. Higher means more coarse. (default 0.2, i.e. 20cm)
    * `--splats` : If selected, will create splats for points in the cloud that are far from the occlusion mesh.
    * `--splat_threshold` : Distance from occlusion mesh at which a splat will be created for a particular point (default, 10cm)
    * `--max_splate_size` : Splat size is defined by the mean distance from its neighbours. You can define a max splat size for isolated points, which would otherwise produce a very large, useless splat. If not set, will be `2.5*splat_threshold`.

7. Ground truth creation
    * `--eth3d_splat_radius` : Splat radius for occlusion mesh boundaries, i.e. the radius of the area (in meters) which will be defined as invalid because of occlusion uncertainty, see the `splat_radius` option for ETH3D. As a rule of thumb, it should be around your point cloud precision. (default 0.01, i.e. 1cm)

### Manual step by step

This will essentially do the same thing as the script, in order to let you change some steps at will.

1. Point cloud preparation

    ```
    python las2ply.py Input/Lidar/cloud.las \
    --output_folder Workspace/Lidar
    ```

    This will save a ply file along with a centroid file
     - `Workspace/Lidar/cloud.ply`
     - `Workspace/Lidar/cloud_centroid.txt` : this file will be used when converting GPS positions to `XYZ` coordinates (see the note below).

    *Note on centroid* : Contrary to LAS files, which use 64-bit floats, PLY is used with 32-bit floats, which cannot store large numbers with high precision. The centroid is there to make every geo-referenced position (which can be as high as 20,000 km), be it from a LAS file or from GPS coordinates, centered around zero and thus usable in float32. This file will thus be used when extracting GPS metadata from pictures or videos and converting the values to `XYZ` coordinates relative to your point cloud.

    Note that this also works for ply files, but most ply files are already centered around zero, so you can instead just copy the ply file and optionally write a centroid file with `0\n0\n0\n` inside.
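To make the precision argument concrete, here is a small sketch that rounds a large projected coordinate to float32 with and without subtracting a centroid. The coordinate and centroid values are made up for illustration; only the standard library is used.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


# Hypothetical projected coordinate in meters (e.g. a northing) and centroid.
northing = 6_543_210.123
centroid = 6_543_000.0

error_raw = abs(to_float32(northing) - northing)
error_centered = abs(to_float32(northing - centroid) - (northing - centroid))

print(f"float32 error without centroid: {error_raw:.6f} m")
print(f"float32 error with centroid:    {error_centered:.6f} m")
```

With values of this magnitude, the raw float32 error is around a decimeter, while the centered value keeps sub-millimeter precision.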

2. Point Cloud Cleaning
    For each ply file :

    ```
    ETHD3D/build/PointCloudCleaner \
    --in Workspace/Lidar/cloud.ply \
    --filter 5,10
    ```
    (local outliers removal, doesn't necessarily remove isolated points)

    or
    ```
    pcl_util/build/CloudSOR \
    --input Workspace/Lidar/cloud.ply \
    --output Workspace/Lidar/cloud_filtered.ply \
    --knn 5 --std 6
    ```

3. Meshlab Project creation
    This step will construct a MeshLabProject file. It stores multiple PLY files and transformation information (in a 4x4 matrix) in the same file. During creation, every point cloud is given the identity transformation.

    ```
    python meshlab_xml_writer.py create \
    --input_models Workspace/Lidar/cloud1_filtered.ply [.. Workspace/Lidar/cloudN_filtered.ply] \
    --output_meshlab Workspace/lidar.mlp
    ```

    Optionally, if we have multiple lidar scans, we can run a registration step with ETH3D.

    This will run an ICP on the different ply models and store the resulting transformations in the `lidar.mlp` file.

    ```
    ETHD3D/build/ICPScanAligner \
    -i Workspace/lidar.mlp \
    -o Workspace/lidar.mlp \
    --number_of_scales 5
    ```

4. Photogrammetry pictures preparation.

    This step can be skipped if there are only videos. It consists of the following substeps:

    1. Convert all your frames to either jpg or png and copy them to the folder `Workspace/COLMAP_img_root/individual_pictures`.
    Frames sharing the same subfolder MUST have the same camera calibration, i.e. be taken from the same camera, with the same lens and the same zoom level. It is advised to keep as few camera models as possible in order to improve COLMAP stability.

    2. (Optional) Generate sky masks in order to avoid using keypoints from clouds. See https://colmap.github.io/faq.html#mask-image-regions
    ```
    python generate_sky_masks.py \
    --img_dir Workspace/COLMAP_img_root \
    --colmap_img_root Workspace/COLMAP_img_root \
    --mask_root Workspace/Masks \
    --batch_size 8
    ```
     - This script will recursively find all image files (jpg or png) in the folder given to `--img_dir` and create black and white mask images to prevent COLMAP from finding feature points in the sky (where the clouds are moving).
     - `--colmap_img_root` and `--mask_root` are given so that the mask for the image file located at e.g. `{--colmap_img_root}/relative_path/image.jpg` will be saved under the name `{--mask_root}/relative_path/image.jpg.png`.
     - Note that COLMAP accepts a mix of images with a corresponding mask and images without one. As such, you can run the script on only a subfolder of `Workspace/COLMAP_img_root`. This is especially interesting if the dataset has indoor videos where the sky is never on screen, and thus running the script would only generate false positives.

    3. Run the feature extractor COLMAP command.
    ```
    colmap feature_extractor \
    --database_path Workspace/thorough_scan.db \
    --image_path Workspace/COLMAP_img_root \
    --ImageReader.mask_path Workspace/Masks \
    --ImageReader.camera_model RADIAL \
    --ImageReader.single_camera_per_folder 1
    ```
     - `--database_path` is the database file that COLMAP will use throughout the whole photogrammetry process.
     - `--ImageReader.camera_model` is one of the camera models compatible with COLMAP. See https://colmap.github.io/cameras.html
     - `--ImageReader.single_camera_per_folder` makes all images in the same folder share the same camera parameters, which greatly improves the photogrammetry stability. This is only relevant if your pictures were indeed taken with the same camera parameters (including zoom).

    We don't explicitly need to extract features before having video frames, but this will populate the `Workspace/thorough_scan.db` file with the photogrammetry pictures and their corresponding ids, which will be reserved for future versions of the file.
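The mask naming convention described in substep 2 (mask stored under the mask root at the same relative path, with `.png` appended) can be sketched as follows. The helper `mask_path_for` and the `cam1` subfolder are hypothetical names used only for this example.

```python
from pathlib import Path


def mask_path_for(image_path, colmap_img_root, mask_root):
    """Return where the mask of `image_path` is expected to be stored.

    The image path relative to `colmap_img_root` is re-rooted under
    `mask_root`, and `.png` is appended to the full file name.
    """
    relative = Path(image_path).relative_to(colmap_img_root)
    return Path(mask_root) / relative.parent / (relative.name + ".png")


mask = mask_path_for(
    "Workspace/COLMAP_img_root/individual_pictures/cam1/image.jpg",
    colmap_img_root="Workspace/COLMAP_img_root",
    mask_root="Workspace/Masks",
)
print(mask.as_posix())
```

Keeping the original extension inside the mask name (`image.jpg.png`) is what lets COLMAP pair each mask with its image.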

5. Video frames addition to COLMAP db file
    ```
    python video_to_colmap.py \
    --video_folder Input/Videos \
    --system epsg:2154 \
    --centroid_path Workspace/Lidar/cloud_centroid.txt \
    --colmap_img_root Workspace/COLMAP_img_root \
    --nw /path/to/anafi/native-wrapper.sh \
    --fps 1 \
    --total_frames 1000 \
    --max_sequence_length 4000 \
    --save_space \
    --thorough_db Workspace/thorough_scan.db
    ```
     - `--video_folder` is the folder where you stored all your videos as MP4 files. For other video file formats, you can use the option `--vid_ext`, e.g. `--vid_ext .avi` to also take the `avi` files. The format must be readable by ffmpeg. See https://ffmpeg.org/ffmpeg-formats.html#Demuxers
     - `--system` is the GPS coordinate system parameter (here `epsg:2154`). It is the one used in the LAS point cloud. The geo-localized frames will then be localized inside the point cloud, which will help register the COLMAP reconstructed points with the Lidar point cloud. See more info [here](https://en.wikipedia.org/wiki/Spatial_reference_system). It must be compatible with [Proj](https://proj.org).
     - `--nw` is the file we use to access PDraW tools, including `vmeta-extract` which will convert Anafi metadata into a csv file. See https://developer.parrot.com/docs/pdraw/userguide.html#use-the-pdraw-program
     - `--centroid_path` is the file that was created in the first step when LAS files were converted into PLY files.
     - `--fps` is the framerate at which videos will be subsampled when running the mapper at step 10.
     - `--save_space` indicates that not all frames will be extracted from the videos, as it can become quite heavy. Only the sampled frames and the frames at framerate `--fps` are extracted. When localizing the video (see step 10), you will have to extract the whole video, but until this step it is not needed.
     ----
    What this script does :
     - For each MP4 file found in the folder `--video_folder`:
         1. Create the folder `{--colmap_img_root}/Videos/{resolution}/{video_name}` where we will store files related to this video. For example, with the command above and a full HD video named `video.mp4`, the script will create the folder `Workspace/COLMAP_img_root/Videos/1920x1080/video`
         2. Extract metadata if possible (see note below), and save in the CSV file `metadata.csv` in the folder created. See second note below for expected generic metadata. GPS data is converted to XYZ using the coordinate system given in `--system`. For videos without metadata, at least frame number, frame name, framerate and frame size are stored.
         3. Use the metadata to select a spatially optimal subset of frames from the full video for a photogrammetry with `--total_frames` pictures (1000 frames in this example). 1000 pictures might be overkill for scenes with low complexity (e.g. only one building); you can easily reduce it to 500 if reconstruction is too slow.
          * The optimal subset is obtained by taking all video frames with XYZ positions and making a subset with [K-Means](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html). It is better than just taking frames at a low framerate, because camera motion is not necessarily homogeneous (especially for drones, which often hover at a fixed point) and we avoid unnecessarily sampling multiple almost identical frames.
          * If orientation is available as a quaternion QwQxQyQz, K-Means is applied on the 6D cloud `[X,Y,Z,Qx,Qy,Qz]`. That way, frames that share the same position but have different orientations can both be sampled.
          * The K-Means is weighted: a quality index (given by `width * height / framerate`) is also applied, in order to use more frames of better quality. This is used for example with Anafi videos, where high framerate videos tend to have lower quality for the same resolution.
          * If no GPS is available, but relative position is available (e.g. indoor flight data), each sequence will be optimally sampled without comparing it to other video sequences.
         4. Populate the database given in `--thorough_db` with these new entries with the right camera parameters (e.g. PINHOLE camera for Anafi videos; all videos taken with the same drone and the same zoom level will share the same camera parameters). If no metadata is available, we adopt the same policy as COLMAP during feature extraction when using option `--ImageReader.single_camera_per_folder` (see step 4).
         5. Divide video into chunks of size `--max_sequence_length` with corresponding list of filepath so that we don't deal with too large sequences during the video localization step (see step 10). Each chunk will have the list of frames stored in a file `full_chunk_N.txt` inside the video folder.
         6. Extract frames from the videos and save them to the folder `{--colmap_img_root}/Videos/{resolution}/{video_name}`. If `--save_space` is selected, only extract sampled frames (used for sparse reconstruction) and frames at framerate `--fps` (used for reconstruction densification, i.e. multi-view stereo)
         7. Create `1+n` txt files with lists of file paths and save them to the same folder as the frames
            * `lowfps.txt` : list of frames at a framerate of approximately `--fps`
            * `full_chunk_{n}.txt` : list of frames for each of `n` chunk into which the video was divided.
            The `georef.txt` is unused. It contains file paths and XYZ coordinates of sampled frames + video frames at low framerate. See below.
     - Create two txt files with list of file paths and save it at the root of `--colmap_img_root` :
        * `video_frames_for_thorough_scan.txt` : all images used in the first thorough photogrammetry. This will be used for feature extraction in order to not extract features from every frame. Indeed, the matching step in COLMAP will try to match EVERY frame in its database, and thus if we extract features from all frames, we will match every video frame (potentially thousands), which is not what we want because the reconstruction would be unnecessarily long.
        * `georef.txt` : all images with XYZ position from GPS, with system and minus centroid of Lidar file. This will be used at the georeferencing step. See https://colmap.github.io/faq.html#geo-registration. This will provide a first transformation estimation for the geo registered Lidar cloud. In the case XYZ data is available without GPS (e.g. indoor flight data), the video sequence with the largest radius covered is used as "valid GPS". That way, even if the geo-registered model is not a good transformation estimate for Lidar point cloud, we at least have a good scale estimate.
     - Create a COLMAP model (`images.bin`, `points3D.bin`, `cameras.bin`) at the root of `{--colmap_img_root}/Videos` without any 3D points, but with camera specifications and image localizations from metadata. It can be used for inspection purposes (you will need an X server). See https://colmap.github.io/gui.html
     ```
     colmap gui \
     --import_path Workspace/COLMAP_img_root/Videos \
     --database_path Workspace/thorough_scan.db \
     --image_path Workspace/COLMAP_img_root
     ```
 ----
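The spatial sampling described in point 3 above can be illustrated with a dependency-free sketch. The script itself relies on scikit-learn's K-Means (see the link above); the code below is only a plain Lloyd's iteration written for this example, it ignores the quality weighting, and the function name `kmeans_sample` is hypothetical.

```python
import math


def kmeans_sample(features, k, iterations=20):
    """Pick up to `k` representative frame indices with a plain Lloyd's k-means.

    `features` is a list of equal-length tuples, e.g. (x, y, z) or
    (x, y, z, w*qx, w*qy, w*qz) with an orientation weight `w`.
    """
    n = len(features)
    k = min(k, n)
    # Deterministic initialization: frames evenly spread along the sequence.
    centroids = [features[(i * n) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in features:
            nearest = min(range(k), key=lambda c: math.dist(point, centroids[c]))
            clusters[nearest].append(point)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    # Keep the real frame closest to each centroid.
    selected = {
        min(range(n), key=lambda i: math.dist(features[i], centroid))
        for centroid in centroids
    }
    return sorted(selected)


# Toy trajectory: the drone hovers for 8 frames, then flies 4 m along x.
positions = [(0.0, 0.0, 10.0)] * 8 + [(float(x), 0.0, 10.0) for x in range(1, 5)]
sampled = kmeans_sample(positions, k=3)
print(sampled)
```

On this toy trajectory, sampling at a fixed framerate would mostly return hovering frames, whereas the clustering keeps a single hovering frame and spends the remaining samples on the moving part.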
 *Notes*
 - This script is designed to circumvent the fact that COLMAP will try to compute the features of all the images in the database, will try to match all the feature vectors in the database, and will try to map a reconstruction with all the matches. As such, if we add all the video frames to the database, we won't be able to run the photogrammetry with only a subset of the video frames. In order to do so, we add all the frames to a fictitious database and note the image id numbers in the video metadata for later use. We then construct a small database with only the frames we want. Each time we want to add frames to this database, we add them with the id numbers that were generated for the fictitious database, with the right camera id. See https://colmap.github.io/tutorial.html#database-management
 - This script is initially intended to be used for Anafi videos, with metadata directly embedded in the video feed. However, if you have other videos with the same kind of metadata (GPS, timestamp, orientation ...), you can manually put them in a csv file named `[video_name]_metadata.csv`, alongside the video file `[video_name].mp4`. One row per frame, mandatory fields are :
    - `camera_model` : See https://colmap.github.io/cameras.html
    - `camera_params` : COLMAP format : tuples beginning with focal length(s) and then distortion params
    - `x`, `y`, `z` : Frames positions : if not known, put nan
    - `frame_quat_w`, `frame_quat_x`, `frame_quat_y`, `frame_quat_z` : Frame orientations : if not known, put nan
    - `location_valid` : Whether the `x,y,z` position should be trusted as absolute with respect to the point cloud or not. If `x,y,z` positions are known but only relative to each other, we can still leverage that data for the COLMAP optimal sample, and for later model rescaling after the thorough photogrammetry.
    - `time` : timestamp, in microseconds.

    An example of this metadata csv generation can be found in `convert_euroc.py`, which will convert the EuRoC dataset to videos with readable metadata.

    Finally, if no metadata is available for your video, because e.g. it is a handheld video, the script will consider your video as generic : it won't be used for thorough photogrammetry (unless the `--include_lowfps_thorough` option is chosen), but the script will try to localize it and find the camera intrinsics. Be warned that it is not compatible with variable zoom.
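As an illustration of the manual metadata file described in the notes above, the sketch below writes a few rows with the listed fields using the standard `csv` module. All values are made up, and the way `camera_params` is serialized here is an assumption; check `convert_euroc.py` for the format the scripts actually expect.

```python
import csv
import io

# Column names taken from the list above; all values are made up.
FIELDS = [
    "camera_model", "camera_params",
    "x", "y", "z",
    "frame_quat_w", "frame_quat_x", "frame_quat_y", "frame_quat_z",
    "location_valid", "time",
]

rows = [
    {
        "camera_model": "PINHOLE",
        # Assumed serialization of the COLMAP parameter tuple (fx, fy, cx, cy).
        "camera_params": "(1000.0, 1000.0, 960.0, 540.0)",
        "x": float("nan"), "y": float("nan"), "z": float("nan"),
        "frame_quat_w": float("nan"), "frame_quat_x": float("nan"),
        "frame_quat_y": float("nan"), "frame_quat_z": float("nan"),
        "location_valid": False,
        "time": frame_index * 33_333,  # microseconds, roughly 30 fps
    }
    for frame_index in range(3)
]

# In practice, write to "[video_name]_metadata.csv" next to the video instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```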



6. Feature extraction for video frames used for thorough photogrammetry
    ```
    python generate_sky_masks.py \
    --img_dir Workspace/COLMAP_img_root \
    --colmap_img_root Workspace/COLMAP_img_root \
    --mask_root Workspace/Masks \
    --batch_size 8
    ```

    (this is the same command as step 4)

    ```
    colmap feature_extractor \
    --database_path Workspace/thorough_scan.db \
    --image_path Workspace/COLMAP_img_root \
    --image_list_path Workspace/COLMAP_img_root/video_frames_for_thorough_scan.txt \
    --ImageReader.mask_path Workspace/Masks
    ```
     - `--image_list_path` is the path to the list of frames for the thorough scan that we created at the previous step. As such, should any picture other than the ones we need for photogrammetry exist somewhere in the folder `--image_path`, it will be ignored.
     - Contrary to step 4, instead of creating new database entries for the video frames, the feature extractor will here use the already existing database entries (that we have set during the videos to colmap step), making sure that the right camera id is used.
     - We recommend that you also make your own vocab_tree with image indexes; this will make the next matching steps faster. You can download a vocab_tree at https://demuc.de/colmap/#download : We took the 256K version in our tests.

    ```
    colmap vocab_tree_retriever \
    --database_path Workspace/thorough_scan.db \
    --vocab_tree_path /path/to/vocab_tree.bin \
    --output_index Workspace/indexed_vocab_tree.bin
    ```

7. Feature matching.
    For less than 1000 images, you can use exhaustive matching (this will take around 2 hours). If there are too many images, you can use either spatial matching or vocab tree matching.

    ```
    colmap exhaustive_matcher \
    --database_path Workspace/thorough_scan.db \
    --SiftMatching.guided_matching 1
    ```
    or
    ```
    colmap spatial_matcher \
    --database_path Workspace/thorough_scan.db \
    --SiftMatching.guided_matching 1 \
    --SequentialMatching.loop_detection 1 \
    --SequentialMatching.vocab_tree_path Workspace/indexed_vocab_tree.bin
    ```
    or
    ```
    colmap vocab_tree_matcher \
    --database_path Workspace/thorough_scan.db \
    --VocabTreeMatching.vocab_tree_path Workspace/indexed_vocab_tree.bin \
    --SiftMatching.guided_matching 1
    ```

    Note that `--SiftMatching.guided_matching` will take twice as much GPU memory, but will have more matches, and of higher quality.
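To see why exhaustive matching stops being practical beyond roughly a thousand images, note that it considers every pair of images, so the number of pairs grows quadratically. The short sketch below simply counts them; the image counts are arbitrary examples.

```python
def exhaustive_pairs(num_images):
    """Number of image pairs an exhaustive matcher has to consider."""
    return num_images * (num_images - 1) // 2


for n in (500, 1000, 5000):
    print(f"{n} images -> {exhaustive_pairs(n)} pairs")
```

Going from 1000 to 5000 images multiplies the number of pairs by about 25, which is why spatial or vocab tree matching is preferable for larger sets.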

8. Thorough mapping.

    ```
    colmap mapper \
    --database_path Workspace/thorough_scan.db \
    --output_path Workspace/Thorough \
    --image_path Workspace/COLMAP_img_root
    ```

    This will create one or more models, each located in a folder named `Workspace/Thorough/N`, `N` being a number starting from 0. Each model is stored in the form of 3 binary files (plus a `project.ini` file)
    ```
    Workspace
    └── Thorough
        └── N
            ├── cameras.bin
            ├── images.bin
            ├── points3D.bin
            └── project.ini
    ```

    COLMAP creates multiple models when there are multiple sets of images that don't overlap. Most of the time, there will be only 1 model (named `0`). Depending on the frame used for initialization, it can happen that the biggest model is not the first. Here we will assume that it is indeed the first (`0`), but you are expected to change that number if it is not the most complete model COLMAP could construct.
    You can inspect the different models with the COLMAP gui (You will need an X server). See https://colmap.github.io/gui.html

    ```
    colmap gui \
    --import_path Workspace/Thorough/N/ \
    --database_path Workspace/thorough_scan.db \
    --image_path Workspace/COLMAP_img_root
    ```

    You can finally run a last bundle adjustment using `colmap bundle_adjuster`, which makes use of [Ceres](http://ceres-solver.org/), supposedly better than the multicore bundle adjustment used in `colmap mapper` (albeit slower)

    ```
    colmap bundle_adjuster \
    --input_path /path/to/thorough/0 \
    --output_path /path/to/thorough/0
    ```

9. Georeferencing

    ```
    mkdir -p Workspace/Thorough/georef
    colmap model_aligner \
    --input_path Workspace/Thorough/0/ \
    --output_path Workspace/Thorough/georef \
    --ref_images_path Workspace/COLMAP_img_root/georef.txt \
    --robust_alignment_max_error 5
    ```

    See https://colmap.github.io/faq.html#geo-registration
    This model will be the reference model; every further model and frame localization will be done with respect to this one.
    Even if we could, we don't run point cloud registration right now, as the next steps will help us obtain a more complete point cloud.

    Make the first iteration of `georef_full` by simply copying the `georef` model into `georef_full`

    ```
    cp -r Workspace/Thorough/georef Workspace/Thorough/georef_full
    ```

10. Video Localization
       
       Tasks for each video.
       - `video_folder` is a placeholder for the folder `Workspace/COLMAP_img_root/Videos/{resolution}/{video_name}` where frames, frame lists and metadata are stored.
       - `video_workspace` is a placeholder for the folder `Workspace/Videos_reconstructions/{resolution}/{video_name}` where database files and models will be stored.
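The two placeholders above can be derived from the video resolution and name. A minimal sketch, where the helper `video_paths` is a hypothetical name and `Workspace` is the workspace root used throughout this guide:

```python
from pathlib import Path


def video_paths(resolution, video_name, workspace="Workspace"):
    """Build the two per-video folders used as placeholders in step 10."""
    workspace = Path(workspace)
    video_folder = workspace / "COLMAP_img_root" / "Videos" / resolution / video_name
    video_workspace = workspace / "Videos_reconstructions" / resolution / video_name
    return video_folder, video_workspace


video_folder, video_workspace = video_paths("1920x1080", "video")
print(video_folder.as_posix())
print(video_workspace.as_posix())
```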

    1. (Step 10.1) Make a first copy of `thorough_scan.db` in order to populate this database with the current video frames at lowfps. This will be used for mapping (see step 10.3)
        ```
        cp -r Workspace/thorough_scan.db video_workspace/lowfps.db
        ```

    2. (Step 10.2) If the `--save_space` option was used during step 5 when calling the script `video_to_colmap.py`, you now need to extract all the frames of the video to the same directory where the script exported the frame subset of this video.
        ```
        ffmpeg \
        -i Input/video.mp4 \
        -vsync 0 -qscale:v 2 \
        video_folder/{video_name}_%05d.jpg
        ```

    3. (Step 10.3) Continue mapping from georef with low fps images, use sequential matcher
        ```
        python generate_sky_masks.py \
        --img_dir video_folder \
        --colmap_img_root Workspace/COLMAP_img_root \
        --mask_root Workspace/Masks \
        --batch_size 8
        ```
        ```
        python add_video_to_db.py \
        --frame_list video_folder/lowfps.txt \
        --metadata video_folder/metadata.csv \
        --database video_workspace/lowfps.db
        ```
        This script adds the lowfps frames to the database and makes sure that the right camera id is given, thanks to the csv file given in `--metadata`.
        ```
        colmap feature_extractor \
        --database_path video_workspace/lowfps.db \
        --image_path Workspace/COLMAP_img_root \
        --image_list_path video_folder/lowfps.txt \
        --ImageReader.mask_path Workspace/Masks/
        ```
        ```
        colmap sequential_matcher \
        --database_path video_workspace/lowfps.db \
        --SequentialMatching.loop_detection 1 \
        --SequentialMatching.vocab_tree_path Workspace/indexed_vocab_tree.bin
        ```
        ```
        colmap mapper \
        --input_path Workspace/Thorough/georef \
        --output_path video_workspace/lowfps_model \
        --Mapper.fix_existing_images 1 \
        --database_path video_workspace/lowfps.db \
        --image_path Workspace/COLMAP_img_root
        ```
        `--Mapper.fix_existing_images` prevents the bundle adjuster from moving already existing images.

    4.  (Step 10.4) Re-georeference the model

        This is the tricky part: to ease convergence, the mapper normalizes the model, which loses the initial georeferencing.
        To avoid this problem, we merge the model back into the first one. The order of `input_path1` and `input_path2` matters, because the transformation is applied to `input_path2`.
        ```
        colmap model_merger \
        --input_path1 Workspace/Thorough/georef_full \
        --input_path2 video_workspace/lowfps_model \
        --output_path video_workspace/lowfps_model
        ```

    5. (Step 10.5) Add the mapped frames to the full model that will be used for Lidar registration
        ```
        colmap model_merger \
        --input_path1 Workspace/Thorough/georef_full \
        --input_path2 video_workspace/lowfps_model \
        --output_path Workspace/Thorough/georef_full
        ```
        Each video reconstruction will incrementally add more images to the `georef_full` model.

    6. (Step 10.6) Register the remaining frames of the videos, without mapping. This is done in chunks to avoid running out of RAM.

        Chunks are created during step 5, when calling the script `videos_to_colmap.py`. For each chunk `n`, make a copy of the `lowfps.db` database and repeat the operations above, with the mapping replaced by image registration.

        ```
        cp video_workspace/lowfps.db video_workspace/chunk_n.db
        ```

        ```
        python add_video_to_db.py \
        --frame_list video_folder/full_chunk_n.txt \
        --metadata video_folder/metadata.csv \
        --database video_workspace/chunk_n.db
        ```

        ```
        colmap feature_extractor \
        --database_path video_workspace/chunk_n.db \
        --image_path Workspace/COLMAP_img_root \
        --image_list_path video_folder/full_chunk_n.txt \
        --ImageReader.mask_path Workspace/Masks
        ```

        ```
        colmap sequential_matcher \
        --database_path video_workspace/chunk_n.db \
        --SequentialMatching.loop_detection 1 \
        --SequentialMatching.vocab_tree_path Workspace/indexed_vocab_tree.bin
        ```

        ```
        colmap image_registrator \
        --database_path video_workspace/chunk_n.db \
        --input_path video_workspace/lowfps_model \
        --output_path video_workspace/chunk_n_model
        ```

        (optional bundle adjustment)

        ```
        colmap bundle_adjuster \
        --input_path video_workspace/chunk_n_model \
        --output_path video_workspace/chunk_n_model \
        --BundleAdjustment.max_num_iterations 10
        ```
        If this is the first chunk, simply copy `video_workspace/chunk_n_model` to `video_workspace/full_video_model`.
        Otherwise:
        ```
        colmap model_merger \
        --input_path1 video_workspace/full_video_model \
        --input_path2 video_workspace/chunk_n_model \
        --output_path video_workspace/full_video_model
        ```

        At the end of this step, you should have a model with all the (localizable) frames of the videos, plus the frames that were used for the first thorough photogrammetry.

    7. (Step 10.7) Extract the frame position from the resulting model

        ```
        python extract_video_from_model.py \
        --input_model video_workspace/full_video_model \
        --output_model video_workspace/final_model \
        --metadata_path video_folder/metadata.csv \
        --output_format txt
        ```

    9. (Step 10.9) Save frames and models in `Raw_output_directory`

        ```
        cp -r video_workspace Raw_output_directory/models/{resolution}/{video_name}
        cp -r video_folder Raw_output_directory/COLMAP_img_root/Videos/{resolution}/{video_name}
        ```
        This will be used in the Groundtruth creation step.

    8. (Step 10.8) If needed, everything in `video_workspace` except `final_model` can be deleted, and every frame that is neither in `Workspace/COLMAP_img_root/video_frames_for_thorough_scan.txt` nor in `video_folder/lowfps.txt` can be deleted from `video_folder`.

    At the end of these per-video tasks, you should have a model at `/path/to/georef_full` with all the photogrammetry images plus the localization of the video frames at 1 fps, and, for each video, a TXT file with the frame positions with respect to the first geo-registered reconstruction.
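
The per-chunk commands of step 10.6 lend themselves to scripting. The sketch below only builds the argument lists for one chunk, in the order given above, so that they can be inspected before being run. `chunk_commands` is a hypothetical helper that is not part of this repository, and the frame list name `full_chunk_{n}.txt` is an assumption taken from the `add_video_to_db.py` call.

```python
from pathlib import PurePosixPath


def chunk_commands(n, video_folder="video_folder",
                   video_workspace="video_workspace", workspace="Workspace"):
    """Return the argument lists of step 10.6 for chunk `n` (hypothetical helper)."""
    folder = PurePosixPath(video_folder)
    vws = PurePosixPath(video_workspace)
    ws = PurePosixPath(workspace)
    db = str(vws / f"chunk_{n}.db")
    frames = str(folder / f"full_chunk_{n}.txt")  # assumed name of the chunk frame list
    return [
        # Start from the lowfps database, which already holds the known images
        ["cp", str(vws / "lowfps.db"), db],
        ["python", "add_video_to_db.py",
         "--frame_list", frames,
         "--metadata", str(folder / "metadata.csv"),
         "--database", db],
        ["colmap", "feature_extractor",
         "--database_path", db,
         "--image_path", str(ws / "COLMAP_img_root"),
         "--image_list_path", frames,
         "--ImageReader.mask_path", str(ws / "Masks")],
        ["colmap", "sequential_matcher",
         "--database_path", db,
         "--SequentialMatching.loop_detection", "1",
         "--SequentialMatching.vocab_tree_path", str(ws / "indexed_vocab_tree.bin")],
        # Registration only: the frames are localized in the existing model, no mapping
        ["colmap", "image_registrator",
         "--database_path", db,
         "--input_path", str(vws / "lowfps_model"),
         "--output_path", str(vws / f"chunk_{n}_model")],
    ]


for cmd in chunk_commands(0):
    print(" ".join(cmd))
```

Each list can then be passed to `subprocess.run(cmd, check=True)`. The optional bundle adjustment and the final `model_merger` call are left out of the sketch.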

11. Point cloud densification

    ```
    colmap image_undistorter \
    --image_path Workspace/COLMAP_img_root \
    --input_path Workspace/Thorough/georef_full \
    --output_path Workspace/Thorough/dense \
    --output_type COLMAP \
    --max_image_size 1000
    ```

    The `--max_image_size` option is optional, but recommended if you want to save space when dealing with 4K images. This command transforms the `georef_full` model, which contains the frames localized during the first photogrammetry as well as all the video frames at low framerate, into a model with only PINHOLE images. It also prepares the workspace for the patch match stereo step, which will try to compute a depth map for every frame with multi-view stereo. See https://colmap.github.io/tutorial.html#dense-reconstruction

    ```
    colmap patch_match_stereo \
    --workspace_path Workspace/Thorough/dense \
    --workspace_format COLMAP \
    --PatchMatchStereo.geom_consistency 1
    ```

    `--PatchMatchStereo.geom_consistency` is used to filter out invalid depth values. The process is twice as long, but the noise is greatly reduced, which is what we want for lidar registration.

    ```
    colmap stereo_fusion \
    --workspace_path Workspace/Thorough/dense \
    --workspace_format COLMAP \
    --input_type geometric \
    --output_path Workspace/Thorough/georef_dense.ply
    ```

    This will also create a `Workspace/Thorough/georef_dense.ply.vis` file, which lists, for each point, the frames from which it is visible.
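
To inspect the `.vis` file, a small reader can help. The sketch below assumes the binary layout written by COLMAP's fusion code at the time of writing: a little-endian `uint64` point count, then for each point a `uint32` count followed by that many `uint32` image indices. `read_vis` is a hypothetical helper, so check the layout against your COLMAP version before relying on it.

```python
import io
import struct


def read_vis(stream):
    """Return, for each point of a `.vis` stream, the list of image indices that see it."""
    (num_points,) = struct.unpack("<Q", stream.read(8))
    visibility = []
    for _ in range(num_points):
        (num_images,) = struct.unpack("<I", stream.read(4))
        indices = struct.unpack(f"<{num_images}I", stream.read(4 * num_images))
        visibility.append(list(indices))
    return visibility


# Round trip on a hand-made buffer: point 0 is seen by images 0 and 2, point 1 by image 1.
buffer = struct.pack("<Q", 2) + struct.pack("<3I", 2, 0, 2) + struct.pack("<2I", 1, 1)
print(read_vis(io.BytesIO(buffer)))  # [[0, 2], [1]]
```

On a real file, open it in binary mode, for example `with open("Workspace/Thorough/georef_dense.ply.vis", "rb") as f: read_vis(f)`.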

12. Point cloud registration

    Determine the transformation that maps `Workspace/lidar.mlp` onto `Workspace/Thorough/georef_dense.ply`, so that we get the pose of the cameras with respect to the lidar.

     - *Option 1* : construct a meshlab project similar to `Workspace/lidar.mlp` with `Workspace/Thorough/georef_dense.ply` as first mesh and run ETH3D's registration tool
        ```
        python meshlab_xml_writer.py add \
        --input_models Workspace/Thorough/georef_dense.ply \
        --start_index 0 \
        --input_meshlab Workspace/lidar.mlp \
        --output_meshlab Workspace/aligned.mlp
        ```
        ```
        ETH3D/build/ICPScanAligner \
        -i Workspace/aligned.mlp \
        -o Workspace/aligned.mlp \
        --number_of_scales 5
        ```

        The second matrix in `Workspace/aligned.mlp` will be the transformation matrix from `Workspace/lidar.mlp` to `Workspace/Thorough/georef_dense.ply`.

        **Important note** : This operation does not handle scale adjustments. Theoretically, if the video frames are GPS localized, it should not be a problem, but it can become one with very large models, where a small scale error is responsible for large local displacement errors.

     - *Option 2* : construct a PLY file from lidar scans and register the reconstructed cloud with respect to the lidar, with PCL or CloudCompare.
        We register in this direction (and not from lidar to reconstruction) because it is usually easier with classic ICP to register the cloud that has fewer points.

        Convert meshlab project to PLY with normals :

        ```
        ETH3D/build/NormalEstimator \
        -i Workspace/lidar.mlp \
        -o Workspace/Lidar/lidar_with_normals.ply
        ```

        And then:

         - Use PCL

            ```
            pcl_util/build/CloudRegistrator \
            --georef Workspace/Thorough/georef_dense.ply \
            --lidar Workspace/Lidar/lidar_with_normals.ply \
            --output_matrix Workspace/matrix_thorough.txt
            ```

         - Or use CloudCompare
            https://www.cloudcompare.org/doc/wiki/index.php?title=Alignment_and_Registration
            Best results were obtained with these consecutive steps :
             - Crop the `Workspace/Thorough/georef_dense.ply` cloud: it usually has very far outliers, and without cropping the Octomap will be very inefficient. See [Cross section](https://www.cloudcompare.org/doc/wiki/index.php?title=Cross_Section).
             - Apply noise filtering on the cropped cloud. See [Noise filter](https://www.cloudcompare.org/doc/wiki/index.php?title=Noise_filter).
             - (Optional, you may skip it if e.g. the frames are GPS localized) Manually apply a rough registration with point pair picking. See [Align](https://www.cloudcompare.org/doc/wiki/index.php?title=Align).
             - Apply fine registration, with a final overlap of 50%, scale adjustment, and "Enable farthest point removal" checked. See [ICP](https://www.cloudcompare.org/doc/wiki/index.php?title=ICP)
             - Save the resulting registration matrix in `Workspace/matrix_thorough.txt`

            For the fine registration part, as said earlier, the aligned cloud is the reconstruction and the reference cloud is the lidar

    Finally, apply the registration matrix to `/path/to/lidar.mlp` to get `/path/to/registered.mlp`.
    Note that `Workspace/matrix_thorough.txt` stores the inverse of the matrix we want, so it has to be inverted before being applied, hence the `--inverse` option.

    ```
    python meshlab_xml_writer.py transform \
    --input_meshlab Workspace/lidar.mlp \
    --output_meshlab Workspace/aligned.mlp \
    --transform Workspace/matrix_thorough.txt \
    --inverse
    ```
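
For reference, inverting a registration matrix does not require a general matrix inverse. The sketch below assumes the matrix is a 4x4 similarity transform `[[s*R, t], [0, 1]]` (rotation `R`, uniform scale `s`, translation `t`), which is what an ICP with scale adjustment produces. `invert_similarity` is a hypothetical helper working on nested lists, it is not the code behind `--inverse`, and it does not read `matrix_thorough.txt` since the file layout is not described here.

```python
def invert_similarity(m):
    """Invert a 4x4 similarity transform [[s*R, t], [0, 1]] given as nested lists."""
    a = [row[:3] for row in m[:3]]   # upper-left 3x3 block, equal to s*R
    t = [row[3] for row in m[:3]]    # translation column
    s2 = sum(x * x for x in a[0])    # each row of s*R has squared norm s^2
    # (s*R)^-1 = R^T / s = (s*R)^T / s^2
    inv_a = [[a[j][i] / s2 for j in range(3)] for i in range(3)]
    # The inverse translation is -(s*R)^-1 * t
    inv_t = [-sum(inv_a[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [inv_a[i] + [inv_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# Example: 90 degree rotation around z, scale 2, translation (1, 2, 3)
matrix = [[0.0, -2.0, 0.0, 1.0],
          [2.0, 0.0, 0.0, 2.0],
          [0.0, 0.0, 2.0, 3.0],
          [0.0, 0.0, 0.0, 1.0]]
for row in invert_similarity(matrix):
    print(row)
```

Multiplying `matrix` by the result gives the identity, which is a quick sanity check to run on your own registration matrix.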

13. Occlusion Mesh generation

    Use COLMAP's Delaunay mesher to generate a mesh from PLY + VIS.
    Normally, COLMAP expects the cloud it generated when running the `stereo_fusion` step, but we use the lidar point cloud instead.

    Get a PLY file for the registered lidar point cloud
    
    ```
    ETH3D/build/NormalEstimator \
    -i Workspace/aligned.mlp \
    -o Workspace/Lidar/lidar_with_normals.ply
    ```

    ```
    pcl_util/build/CreateVisFile \
    --georef_dense Workspace/Thorough/georef_dense.ply \
    --lidar Workspace/Lidar/lidar_with_normals.ply \
    --output_cloud Workspace/Thorough/dense/fused.ply \
    --resolution 0.2
    ```

    It is important to place the resulting point cloud at the root of the COLMAP MVS workspace `/path/to/dense` that was used to generate `/path/to/georef_dense.ply`, and to name it `fused.ply`, because this file name is hard-coded in COLMAP.
    The file `/path/to/fused.ply.vis` will also be generated.
    The `--resolution` option is used to reduce the computational load of the next step.

    ```
    colmap delaunay_mesher \
    --input_type dense \
    --input_path Workspace/Thorough/dense \
    --output_path Workspace/Lidar/occlusion_mesh.ply
    ```

    Generate splats for lidar points outside of occlusion mesh close range. See https://github.com/ETH3D/dataset-pipeline#splat-creation

    ```
    ETH3D/build/SplatCreator \
    --point_normal_cloud_path Workspace/Lidar/lidar_with_normals.ply \
    --mesh_path Workspace/Lidar/occlusion_mesh.ply \
    --output_path Workspace/Lidar/splats.ply \
    --distance_threshold 0.1 \
    --max_splat_size 0.25
    ```

    For every point in `--point_normal_cloud_path`, the distance to the mesh given in `--mesh_path` is computed. If it is higher than `--distance_threshold`, a square oriented by the cloud normal is created. The square size is the minimum between the distance of the point from its 4 nearest neighbours and `--max_splat_size`. The `--max_splat_size` option prevents potentially isolated points from getting an enormous splat.
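
The splat sizing rule can be illustrated with a few lines of Python. This is a minimal sketch and not ETH3D's implementation: `splat_size` is a hypothetical helper, the neighbour search is brute force, and taking the mean of the 4 nearest-neighbour distances is an assumption, since the text above does not say how the 4 distances are combined.

```python
import math


def splat_size(point, other_points, mesh_distance,
               distance_threshold=0.1, max_splat_size=0.25, k=4):
    """Return the splat side length for `point`, or None if no splat is needed."""
    if mesh_distance <= distance_threshold:
        return None  # the occlusion mesh already covers this point
    # Distances to the k nearest neighbours (brute force, fine for a toy example)
    nearest = sorted(math.dist(point, q) for q in other_points)[:k]
    if not nearest:
        return max_splat_size
    # Assumption: combine the k distances with their mean, then clamp the result
    return min(sum(nearest) / len(nearest), max_splat_size)


dense = [(0.1, 0, 0), (-0.1, 0, 0), (0, 0.1, 0), (0, -0.1, 0), (5, 5, 5)]
print(splat_size((0, 0, 0), dense, mesh_distance=0.5))          # about 0.1
print(splat_size((0, 0, 0), [(10, 0, 0)], mesh_distance=0.5))   # clamped to 0.25
print(splat_size((0, 0, 0), dense, mesh_distance=0.05))         # None
```

The clamp is what keeps an isolated point, whose nearest neighbours are metres away, from producing a splat that would hide large parts of the scene.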


13. Video localization filtering and interpolation.
    
    `video_folder` and `video_workspace` are placeholders that follow a different convention from step 10, because they are now located in `Raw_output_directory`.
    `video_folder` is the placeholder for `Raw_output_directory/COLMAP_img_root/Videos/{resolution}/{video_name}`
    `video_workspace` is the placeholder for `Raw_output_directory/models/{resolution}/{video_name}`

    Filter the image sequence to exclude frames with an absurd acceleration and interpolate them instead. We keep track of the interpolated frames, which will not be used for depth validation but can be used by depth estimation algorithms that need the odometry of previous frames.
    
    For each video :

    ```
    cp video_workspace/final_model/images.txt video_workspace/final_model/images_raw.txt
    python filter_colmap_model.py \
    --input_images_colmap video_workspace/final_model/images_raw.txt \
    --output_images_colmap video_workspace/images_filtered.txt \
    --metadata video_folder/metadata.csv \
    --interpolated_frames_list video_workspace/interpolated_frames.txt
    ```
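
The idea behind this filtering can be sketched as follows. This is not the code of `filter_colmap_model.py`, whose exact criterion is not described here: the hypothetical `filter_positions` below flags frames whose finite-difference acceleration exceeds a threshold and linearly interpolates their position from the nearest unflagged frames. Orientations are ignored, and the threshold value is an arbitrary assumption.

```python
import math


def filter_positions(times, positions, max_acceleration):
    """Return (filtered positions, sorted indices of the interpolated frames)."""
    n = len(times)
    bad = set()
    for i in range(1, n - 1):
        dt0, dt1 = times[i] - times[i - 1], times[i + 1] - times[i]
        v0 = [(b - a) / dt0 for a, b in zip(positions[i - 1], positions[i])]
        v1 = [(b - a) / dt1 for a, b in zip(positions[i], positions[i + 1])]
        # Acceleration as the velocity change over the mean time step
        if math.dist(v0, v1) / (0.5 * (dt0 + dt1)) > max_acceleration:
            bad.add(i)

    filtered = [tuple(float(c) for c in p) for p in positions]
    for i in sorted(bad):
        # Nearest unflagged neighbours; the first and last frames are never flagged
        prev_i = max(j for j in range(i) if j not in bad)
        next_i = min(j for j in range(i + 1, n) if j not in bad)
        w = (times[i] - times[prev_i]) / (times[next_i] - times[prev_i])
        filtered[i] = tuple(a + w * (b - a)
                            for a, b in zip(positions[prev_i], positions[next_i]))
    return filtered, sorted(bad)


# Constant speed along x, with frame 3 wrongly localized 5 m too far
times = [0, 1, 2, 3, 4, 5, 6]
positions = [(float(t), 0.0, 0.0) for t in times]
positions[3] = (8.0, 0.0, 0.0)
filtered, interpolated = filter_positions(times, positions, max_acceleration=1.0)
print(interpolated)  # [2, 3, 4]
print(filtered[3])   # (3.0, 0.0, 0.0)
```

Note that with this simple criterion an isolated outlier also raises the acceleration measured at its two neighbours, so they are interpolated as well.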

14. Raw Groundtruth generation
    
    `video_folder` and `video_workspace` are placeholders following the same convention as in step 13.

    For each video :

    ```