Commit c164d84d authored by Clément Pinard's avatar Clément Pinard

update README

parent 0246e0cf
@@ -126,47 +126,47 @@ This will essentially do the same thing as the script, in order to let you change
1. Point cloud preparation
```
python las2ply.py /path/to/cloud.las \
--output_ply /path/to/cloud_lidar.ply \
--output_txt /path/to/centroid.txt
```
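For reference, `centroid.txt` stores the centroid of the Lidar cloud, which is later subtracted from GPS-derived positions to keep coordinates small. A hypothetical content (three coordinates in the Lidar file's coordinate system; values invented for illustration, exact formatting may differ):
```
657365.1 6860285.4 85.6
```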
2. Cleaning
```
ETHD3D/build/PointCloudCleaner \
--in /path/to/cloud_lidar.ply \
--filter <5,10>
```
(local outlier removal, doesn't remove isolated points)
or
```
pcl_util/build/CloudSOR \
--input /path/to/cloud_lidar.ply \
--output /path/to/cloud_lidar_filtered.ply \
--knn 5 --std 6
```
3. Video frame addition to COLMAP db file
```
python video_to_colmap.py \
--video_folder /path/to/videos \
--system epsg:2154 \
--centroid_path /path/to/centroid.txt \
--output_folder /path/to/pictures/videos \
--nw /path/to/anafi/native-wrapper.sh \
--fps 1 \
--total_frames 1000 \
--save_space \
--thorough_db /path/to/scan.db
```
The video to colmap step will populate the scan db with new entries with the right camera parameters, and select a spatially optimal subset of frames from the full video for a thorough photogrammetry with 1000 pictures.
It will also create several txt files with lists of file paths :
- `video_frames_for_thorough_scan.txt` : all images used in the first thorough photogrammetry
- `georef.txt` : all images with a GPS position, and their XYZ equivalent, expressed in the chosen coordinate system and offset by the centroid of the Lidar file
@@ -174,79 +174,284 @@ It will also create several txt files with list of file paths :
And finally, it will divide long videos into chunks with corresponding lists of file paths, so that we don't have to deal with too large sequences (the limit here is 4000 frames). A possible layout of the resulting folder is sketched below.
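For illustration, here is a hypothetical layout of one video's output folder after this step (frame names are assumptions; only `metadata.csv`, `lowfps.txt` and the `full_n.txt` chunk lists are referenced later in this walkthrough):
```
└── dir
    ├── 00001.jpg ... (extracted frame subset)
    ├── metadata.csv (per-frame metadata, with GPS and XYZ positions)
    ├── lowfps.txt (1 fps frame list used for video localization)
    └── full_1.txt, full_2.txt, ... (chunk lists, at most 4000 frames each)
```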
4. First COLMAP step : feature extraction
```
python generate_sky_masks.py \
--img_dir /path/to/images \
--colmap_img_root /path/to/images \
--maskroot /path/to/images_mask \
--batch_size 8
```
```
colmap feature_extractor \
--database_path /path/to/scan.db \
--image_path /path/to/images \
--image_list_path /path/to/images/video_frames_for_thorough_scan.txt \
--ImageReader.mask_path /path/to/images_mask/ \
--ImageReader.camera_model RADIAL \
--ImageReader.single_camera_per_folder 1
```
We also recommend that you make your own vocab tree with image indexes; this will make the next matching steps faster.
```
colmap vocab_tree_retriever \
--database_path /path/to/scan.db \
--vocab_tree_path /path/to/vocab_tree \
--output_index /path/to/indexed_vocab_tree
```
5. Second COLMAP step : matching. For fewer than 1000 images, you can use exhaustive matching (this will take around 2 hours). If there are too many images, you can use either spatial matching or vocab tree matching; a small wrapper that picks a matcher automatically is sketched after the three alternatives.
```
colmap exhaustive_matcher \
--database_path scan.db \
--SiftMatching.guided_matching 1
```
or
```
colmap spatial_matcher \
--database_path scan.db \
--SiftMatching.guided_matching 1
```
or
```
colmap vocab_tree_matcher \
--database_path scan.db \
--VocabTreeMatching.vocab_tree_path /path/to/indexed_vocab_tree \
--SiftMatching.guided_matching 1
```
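If you script this step, here is a minimal shell sketch (threshold and paths assumed as above) that uses exhaustive matching for small image sets and falls back to vocab tree matching otherwise:
```
N=$(wc -l < /path/to/images/video_frames_for_thorough_scan.txt)
if [ "$N" -le 1000 ]; then
    colmap exhaustive_matcher \
        --database_path scan.db \
        --SiftMatching.guided_matching 1
else
    colmap vocab_tree_matcher \
        --database_path scan.db \
        --VocabTreeMatching.vocab_tree_path /path/to/indexed_vocab_tree \
        --SiftMatching.guided_matching 1
fi
```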
6. Third COLMAP step : thorough mapping.
```
mkdir -p /path/to/thorough/
colmap mapper \
--database_path scan.db \
--image_path images \
--output_path /path/to/thorough/ \
--Mapper.multiple_models 0
```
This will create a model in the folder `/path/to/thorough/0`, in the form of the following files
```
└── thorough
└── 0
├── cameras.bin
├── images.bin
├── points3D.bin
└── project.ini
```
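To sanity-check the reconstruction before going further, you can print model statistics (registered images, 3D points, reprojection error) with COLMAP's model analyzer; the path below assumes the model above:
```
colmap model_analyzer \
--path /path/to/thorough/0
```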
You can also add a last bundle adjustment using Ceres, supposedly better than the multicore one used in the mapper (albeit slower)
```
colmap bundle_adjuster \
--input_path /path/to/thorough/0 \
--output_path /path/to/thorough/0
```
7. Fourth COLMAP step : [georeferencing](https://colmap.github.io/faq.html#geo-registration)
```
mkdir -p /path/to/geo_registered_model
colmap model_aligner \
--input_path /path/to/thorough/0/ \
--output_path /path/to/geo_registered_model \
--ref_images_path /path/to/images/georef.txt \
--robust_alignment_max_error 5
```
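For reference, `model_aligner` expects one image per line in the reference file, in the form `image_name X Y Z`. A hypothetical excerpt of `georef.txt` (names and centroid-offset coordinates invented for illustration):
```
videos/dir/00001.jpg 12.35 -48.61 85.27
videos/dir/00002.jpg 13.02 -47.95 85.31
```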
This model will be the reference model : every further model and frame localization will be done with respect to it.
Even though we could, we don't run point cloud registration right now, as the next steps will help us get a more complete point cloud.
8. Video Localization
All these substeps will populate the db file, which is then used for matching, so you need to make a copy of it for each video.
1. Extract all the frames of the video to the same directory where the `video_to_colmap.py` script exported the frame subset of this video.
```
ffmpeg \
-i /path/to/video.mp4 \
-vsync 0 -qscale:v 2 \
/path/to/images/videos/dir/%05d.jpg
```
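A quick sanity check (assuming frames were extracted as jpg): the number of extracted frames should match the number of rows in the video's `metadata.csv`, minus its header:
```
ls /path/to/images/videos/dir/*.jpg | wc -l
tail -n +2 /path/to/images/videos/dir/metadata.csv | wc -l
```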
2. Continue mapping with the low fps images, using the sequential matcher
```
python generate_sky_masks.py \
--img_dir /path/to/images/videos/dir \
--colmap_img_root /path/to/images \
--maskroot /path/to/images_mask \
--batch_size 8
```
```
python add_video_to_db.py \
--frame_list /path/to/images/videos/dir/lowfps.txt \
--metadata /path/to/images/videos/dir/metadata.csv \
--database /path/to/video_scan.db
```
```
colmap feature_extractor \
--database_path /path/to/video_scan.db \
--image_path /path/to/images \
--image_list_path /path/to/images/videos/dir/lowfps.txt \
--ImageReader.mask_path /path/to/images_mask/
```
```
colmap sequential_matcher \
--database_path /path/to/video_scan.db \
--SequentialMatching.loop_detection 1 \
--SequentialMatching.vocab_tree_path /path/to/indexed_vocab_tree
```
```
colmap mapper \
--input_path /path/to/geo_registered_model \
--output_path /path/to/lowfps_model \
--Mapper.fix_existing_images 1
```
3. Re-georeference the model
This is a tricky part : to ease convergence, the mapper normalizes the model, losing the initial georeferencing.
To avoid this problem, we merge the model back into the first one. The order between `input1` and `input2` is important!
```
colmap model_merger \
--input1 /path/to/geo_registered_model \
--input2 /path/to/lowfps_model \
--output /path/to/lowfps_model
```
4. Add the mapped frames to the full model that will be used for Lidar registration
```
colmap model_merger \
--input1 /path/to/geo_registered_model \
--input2 /path/to/lowfps_model \
--output /path/to/georef_full
```
For the next videos, replace `input1` with `/path/to/georef_full`, which will incrementally add more and more images to the model. A sketch of this accumulation loop is given below.
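A minimal shell sketch of this accumulation, assuming hypothetical per-video low-fps models at `/path/to/lowfps_model_<video>`:
```
in1=/path/to/geo_registered_model    # the first merge starts from the reference model
for v in video1 video2 video3; do    # hypothetical video names
    colmap model_merger \
        --input1 "$in1" \
        --input2 /path/to/lowfps_model_$v \
        --output /path/to/georef_full
    in1=/path/to/georef_full         # later merges accumulate into georef_full
done
```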
5. Register the remaining frames of the videos, without mapping. This is done in chunks in order to avoid RAM problems.
For each chunk `n`, make a copy of the scan database and apply the same operations as above, minus the mapping, which is replaced with image registration.
```
cp /path/to/video_scan.db /path/to/video_scan_chunk_n.db
```
```
python add_video_to_db.py \
--frame_list /path/to/images/videos/dir/full_n.txt \
--metadata /path/to/images/videos/dir/metadata.csv \
--database /path/to/video_scan_chunk_n.db
```
```
colmap feature_extractor \
--database_path /path/to/video_scan_chunk_n.db \
--image_path /path/to/images \
--image_list_path /path/to/images/videos/dir/full_n.txt \
--ImageReader.mask_path /path/to/images_mask/
```
```
colmap sequential_matcher \
--database_path /path/to/video_scan_chunk_n.db \
--SequentialMatching.loop_detection 1 \
--SequentialMatching.vocab_tree_path /path/to/indexed_vocab_tree
```
```
colmap image_registrator \
--database_path /path/to/video_scan_chunk_n.db \
--input_path /path/to/lowfps_model \
--output_path /path/to/chunk_n_model
```
(optional bundle adjustment)
```
colmap bundle_adjuster \
--input_path /path/to/chunk_n_model \
--output_path /path/to/chunk_n_model \
--BundleAdjustment.max_num_iterations 10
```
If this is the first chunk, simply copy `/path/to/chunk_n_model` to `/path/to/full_video_model`.
Otherwise (see the sketch after this command):
```
colmap model_merger \
--input1 /path/to/full_video_model \
--input2 /path/to/chunk_n_model \
--output /path/to/full_video_model
```
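Putting the per-chunk logic together, a minimal shell sketch (the number of chunks is assumed; each `chunk_n_model` comes from the image registration above):
```
for n in 1 2 3; do
    if [ "$n" -eq 1 ]; then
        cp -r /path/to/chunk_1_model /path/to/full_video_model
    else
        colmap model_merger \
            --input1 /path/to/full_video_model \
            --input2 /path/to/chunk_${n}_model \
            --output /path/to/full_video_model
    fi
done
```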
At the end of this step, you should have a model with all the (localizable) frames of the videos, plus the other frames that were used for the first thorough photogrammetry
6. Extract the frame positions from the resulting model
```
python extract_video_from_model.py \
--input_model /path/to/full_video_model \
--output_model /path/to/final_model \
--metadata_path /path/to/images/videos/dir/metadata.csv \
--output_format txt
```
7. Filter the image sequence to exclude frames with an absurd acceleration, and interpolate their positions instead
```
python filter_colmap_model.py \
--input_images_colmap /path/to/full_video_model/images.txt \
--output_images_colmap /path/to/full_video_model/images.txt \
--metadata /path/to/images/videos/dir/metadata.csv \
--interpolate
```
At the end of these per-video tasks, you should have a model at `/path/to/georef_full` with all photogrammetry images plus the localization of video frames at 1 fps, and for each video a txt file with positions with respect to the first geo-registered reconstruction.
```
python generate_sky_masks.py \
--img_dir /path/to/pictures/videos \
--colmap_img_root /path/to/images \
--maskroot /path/to/images_mask \
--batch_size 8
```
9. Point cloud densification
```
colmap feature_extractor \
--database_path /path/to/scan.db \
--image_path /path/to/images \
--image_list_path /path/to/images/video_frames_for_thorough_scan.txt \
--ImageReader.mask_path /path/to/images_mask/ \
--ImageReader.camera_model RADIAL \
--ImageReader.single_camera_per_folder 1
```
```
colmap image_undistorter \
--image_path /path/to/images \
--input_path /path/to/georef_full \
--output_path /path/to/dense \
--output_type COLMAP \
--max_image_size 1000
```
The `max_image_size` option is optional, but recommended if you want to save space when dealing with 4K images.
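For orientation, `image_undistorter` creates a standard self-contained COLMAP dense workspace, roughly laid out as:
```
└── dense
    ├── images
    ├── sparse
    └── stereo
        ├── depth_maps
        ├── normal_maps
        └── ...
```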
```
colmap patch_match_stereo \
--workspace_path /path/to/dense \
--workspace_format COLMAP \
--PatchMatchStereo.geom_consistency 1
```
```
colmap stereo_fusion \
--workspace_path /path/to/dense \
--workspace_format COLMAP \
--input_type geometric \
--output_path /path/to/georef_dense.ply
```
This will also create a `/path/to/georef_dense.ply.vis` file, which describes, for each point, the frames it is visible from.
10. Point cloud registration
```diff
@@ -175,7 +175,7 @@ def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir, dataset
     cameras = []
     for i in metadata["image_path"]:
-        img_path = images_root_folder / i
+        img_path = images_root_folder / Path(i).relpath("Videos")
         imgs.append(img_path)
         fname = img_path.basename()
```
```diff
@@ -150,7 +150,7 @@ def prepare_video_workspace(video_name, video_frames_folder,
     output["viz_folder"] = converted_output_folder / "video" / relative_path_folder
     video_env["output_env"] = output
     video_env["already_localized"] = env["resume_work"] and output["model_folder"].isdir()
-    video_env["GT_already_done"] = env["resume_work"] and (raw_output_folder / "groundtruth_depth" / video_name.namebase).isdir()
+    video_env["GT_already_done"] = env["resume_work"] and (raw_output_folder / "ground_truth_depth" / video_name.namebase).isdir()
     return video_env
```
```diff
@@ -292,6 +292,7 @@ def main():
     i += 1
     if i not in args.skip_step:
         print_step(i, "Registration of photogrammetric reconstruction with respect to Lidar Point Cloud")
+        eth3d.compute_normals(env["with_normals_path"], env["lidar_mlp"], neighbor_radius=args.normal_radius)
         if args.registration_method == "simple":
             pcl_util.register_reconstruction(georef=env["georefrecon_ply"],
                                              lidar=env["with_normals_path"],
```
```diff
@@ -307,11 +308,12 @@ def main():
             np.savetxt(env["matrix_path"], matrix)
         elif args.registration_method == "interactive":
-            input("Get transformation matrix and paste it in the file {}. When done, press ENTER".format(env["matrix_path"]))
+            input("Get transformation matrix between {0} and {1} so that we should apply it to the reconstructed point cloud to have the lidar point cloud, "
+                  "and paste it in the file {2}. When done, press ENTER".format(env["with_normals_path"], env["georefrecon_ply"], env["matrix_path"]))
         if env["matrix_path"].isfile():
             env["global_registration_matrix"] = np.linalg.inv(np.fromfile(env["matrix_path"], sep=" ").reshape(4, 4))
         else:
-            print("Error, no registration matrix can be found")
+            print("Error, no registration matrix can be found, identity will be used")
+            env["global_registration_matrix"] = np.eye(4)
     if args.inspect_dataset:
```
```diff
@@ -325,7 +327,6 @@ def main():
     i += 1
     if i not in args.skip_step:
         print_step(i, "Occlusion Mesh computing")
-        eth3d.compute_normals(env["with_normals_path"], env["lidar_mlp"], neighbor_radius=args.normal_radius)
         pcl_util.create_vis_file(env["georefrecon_ply"], env["with_normals_path"], env["matrix_path"],
                                  output=env["with_normals_path"], resolution=args.mesh_resolution)
         colmap.delaunay_mesh(env["occlusion_ply"], input_ply=env["with_normals_path"])
```
```diff
@@ -191,7 +191,7 @@ def localize_video(video_name, video_frames_folder, thorough_db, metadata, lowfps
     clean_workspace()

-def generate_GT(video_name, output_folder, images_root_folder, video_frames_folder,
+def generate_GT(video_name, raw_output_folder, images_root_folder, video_frames_folder,
                 viz_folder, kitti_format_folder, metadata, interpolated_frames_list,
                 final_model, global_registration_matrix, video_fps,
                 eth3d, colmap,
```
```diff
@@ -235,8 +235,8 @@ def generate_GT(video_name, output_folder, images_root_folder, video_frames_fold
     i_pv = 1
     print_step(i_pv, "Creating Ground truth data with ETH3D")
-    # eth3d.create_ground_truth(final_lidar, final_model, output_folder,
-    #                           final_occlusions, final_splats)
+    eth3d.create_ground_truth(final_lidar, final_model, raw_output_folder,
+                              final_occlusions, final_splats)
     viz_folder.makedirs_p()
     kitti_format_folder.makedirs_p()
```
```diff
@@ -244,9 +244,9 @@ def generate_GT(video_name, output_folder, images_root_folder, video_frames_fold
     print_step(i_pv, "Convert to KITTI format and create video with GT vizualisation")
     cd.convert_dataset(final_model,
-                       output_folder / "ground_truth_depth" / video_name.namebase,
+                       raw_output_folder / "ground_truth_depth" / video_name.namebase,
                        images_root_folder,
-                       output_folder / "occlusion_depth" / video_name.namebase,
+                       raw_output_folder / "occlusion_depth" / video_name.namebase,
                        kitti_format_folder, viz_folder,
                        metadata, interpolated_frames_list,
                        video=True, fps=video_fps, downscale=4, threads=8, **env)
```