Commit af25e9b1 authored by Clément Pinard

Fix typos, add options

parent 98c3efba
......@@ -173,6 +173,7 @@ All the parameters for `main_pipeline.py` are defined in the file `cli_utils.py`
* `--multiple_models` : If selected, will let the COLMAP mapper build multiple models. The biggest one will then be chosen
* `--more_sift_features` : If selected, will activate the COLMAP options `--SiftExtraction.domain_size_pooling` and `--SiftExtraction.estimate_affine_shape` during feature extraction. Be careful, this does not use the GPU and is thus very slow. More info: https://colmap.github.io/faq.html#increase-number-of-matches-sparse-3d-points
* `--add_new_videos` : If selected, will skip the mapping steps to directly register new videos with respect to an already existing COLMAP model.
* `--filter_models` : If selected, will filter the video localization to smooth the trajectory
* `--stereo_min_depth` : Min depth for PatchMatch Stereo, used during point cloud densification (see the sketch after this list)
* `--stereo_max_depth` : Same as min depth, but for max depth.
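As an illustration only, here is a minimal sketch of how these two bounds could be forwarded to COLMAP's dense stereo step. The exact wiring inside `main_pipeline.py` is not shown in this diff, and `run_patch_match_stereo` is a hypothetical helper; the assumption is that the values map to COLMAP's `PatchMatchStereo.depth_min` / `PatchMatchStereo.depth_max` options.

```
import subprocess

def run_patch_match_stereo(workspace_path, stereo_min_depth=0.1, stereo_max_depth=100):
    # Assumption: --stereo_min_depth / --stereo_max_depth map to COLMAP's
    # PatchMatchStereo.depth_min / PatchMatchStereo.depth_max options.
    subprocess.check_call([
        "colmap", "patch_match_stereo",
        "--workspace_path", str(workspace_path),
        "--PatchMatchStereo.depth_min", str(stereo_min_depth),
        "--PatchMatchStereo.depth_max", str(stereo_max_depth),
    ])
```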
......@@ -297,7 +298,7 @@ This will essentially do the same thing as the script, in order to let you chang
colmap feature_extractor \
--database_path /path/to/scan.db \
--image_path /path/to/images \
--image_list_path /path/toimages/video_frames_for_thorough_scan.txt
--image_list_path /path/to/images/video_frames_for_thorough_scan.txt \
--ImageReader.mask_path /path/to/images_mask/
```
......@@ -365,7 +366,7 @@ This will essentially do the same thing as the script, in order to let you chang
colmap model_aligner \
--input_path /path/to/thorough/0/ \
--output_path /path/to/geo_registered_model \
--ref_images_path /path/toimages/georef.txt
--ref_images_path /path/to/images/georef.txt \
--robust_alignment_max_error 5
```
......@@ -388,16 +389,16 @@ This will essentially do the same thing as the script, in order to let you chang
```
python generate_sky_masks.py \
--img_dir /path/toimages/videos/dir \
--img_dir /path/to/images/videos/dir \
--colmap_img_root /path/to/images \
--mask_root /path/to/images_mask \
--maskroot /path/to/images_mask \
--batch_size 8
```
```
python add_video_to_db.py \
--frame_list /path/toimages/videos/dir/lowfps.txt \
--metadata /path/toimages/videos/dir/metadata.csv\
--frame_list /path/to/images/videos/dir/lowfps.txt \
--metadata /path/to/images/videos/dir/metadata.csv \
--database /path/to/video_scan.db
```
......@@ -405,7 +406,7 @@ This will essentially do the same thing as the script, in order to let you chang
colmap feature_extractor \
--database_path /path/to/video_scan.db \
--image_path /path/to/images \
--image_list_path /path/toimages/videos/dir/lowfps.txt
--image_list_path /path/to/images/videos/dir/lowfps.txt \
--ImageReader.mask_path /path/to/images_mask/
```
......@@ -448,7 +449,8 @@ This will essentially do the same thing as the script, in order to let you chang
For the next videos, replace `input1` with `/path/to/georef_full`, which will incrementally add more and more images to the model.
5. Register the remaining frames of the videos, without mapping. This is done by chunks in order to avoid RAM problems. Chunks are created during step 5, when calling script `videos_to_colmap.py`. For each chunk `N`, make a copy of the scan database and do the same operations as above, minus the mapping, replaced with image registration.
5. Register the remaining frames of the videos, without mapping. This is done in chunks in order to avoid RAM problems.
Chunks are created during step 5, when calling the script `videos_to_colmap.py`. For each chunk `N`, make a copy of the scan database and perform the same operations as above, minus the mapping, which is replaced with image registration.
```
cp /path/to/video_scan.db /path/to/video_scan_chunk_n.db
......@@ -456,8 +458,8 @@ This will essentially do the same thing as the script, in order to let you chang
```
python add_video_to_db.py \
--frame_list /path/toimages/videos/dir/full_chunk_n.txt \
--metadata /path/toimages/videos/dir/metadata.csv\
--frame_list /path/to/images/videos/dir/full_chunk_n.txt \
--metadata /path/to/images/videos/dir/metadata.csv \
--database /path/to/video_scan_chunk_n.db
```
......@@ -465,7 +467,7 @@ This will essentially do the same thing as the script, in order to let you chang
colmap feature_extractor \
--database_path /path/to/video_scan_chunk_n.db \
--image_path /path/to/images \
--image_list_path /path/toimages/videos/dir/full_n.txt
--image_list_path /path/to/images/videos/dir/full_n.txt \
--ImageReader.mask_path /path/to/images_mask/
```
......@@ -510,7 +512,7 @@ This will essentially do the same thing as the script, in order to let you chang
python extract_video_from_model.py \
--input_model /path/to/full_video_model \
--output_model /path/to/final_model \
--metadata_path /path/toimages/video/dir/metadata.csv
--metadata_path /path/to/images/video/dir/metadata.csv \
--output_format txt
```
......@@ -519,8 +521,8 @@ This will essentially do the same thing as the script, in order to let you chang
python filter_colmap_model.py \
--input_images_colmap /path/to/final_model/images.txt \
--output_images_colmap /path/to/final_model/images.txt \
--metadata /path/toimages/video/dir/metadata.csv \
--interpolated_frames_list /path/toimages/video/dir/interpolated_frames.txt
--metadata /path/to/images/video/dir/metadata.csv \
--interpolated_frames_list /path/to/images/video/dir/interpolated_frames.txt
```
At the end of these per-video tasks, you should have a model at `/path/to/georef_full` containing all the photogrammetry images plus the localization of video frames at 1 fps, and, for each video, a TXT file with positions with respect to the first geo-registered reconstruction.
......@@ -678,7 +680,22 @@ This will essentially do the same thing as the script, in order to let you chang
--compress_depth_maps 1
```
This will create for each video a folder `/path/to/raw_GT/groundtruth_depth/<video name>/` with compressed files with depth information. Option `--write_occlusion_depth` will make the folder `/path/to/raw_GT/` much heavier but is optional. It is used for inspection purpose.
This will create for each video a folder `/path/to/raw_GT/ground_truth_depth/<video name>/` containing files with depth information. Option `--write_occlusion_depth` will make the folder `/path/to/raw_GT/` much heavier but is optional; it is used for inspection purposes. Option `--compress_depth_maps` will try to compress the depth maps with the GZip algorithm. Without compression, each file is named `[frame_name.jpg]` (even though it is not a JPEG file); with compression, it is named `[frame_name.jpg].gz`. Note that for non-sparse depth maps (especially occlusion depth maps), the GZip compression is not very effective.
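For reference, here is a minimal Python sketch for loading one of these depth maps; `load_depth_map` is a hypothetical helper and the frame dimensions and path are placeholders. It mirrors the raw float32 layout that `convert_dataset.py` expects, with the gzip branch only relevant when `--compress_depth_maps` was used.

```
import gzip
import numpy as np

def load_depth_map(path, height, width, compressed=True):
    # Depth maps are raw float32 buffers, optionally gzip-compressed ([frame_name.jpg].gz).
    opener = gzip.open if compressed else open
    with opener(path, "rb") as f:
        return np.frombuffer(f.read(), np.float32).reshape(height, width)
```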
Alternatively, you can do a sanity check before creating the depth maps by running the dataset inspector.
See https://github.com/ETH3D/dataset-pipeline#dataset-inspection
- Note that you don't need the option `--multi_res_point_cloud_directory_path`
- Also note that this will load every image of your video, so for long videos it can be very RAM demanding
```
ETH3D/build/DatasetInspector \
--scan_alignment_path /path/to/registered.mlp \
--image_base_path /path/to/images \
--state_path /path/to/final_model \
--occlusion_mesh_path /path/to/occlusion_mesh.ply \
--occlusion_splats_path /path/to/splats.ply \
--max_occlusion_depth 200
```
15. Dataset conversion
......@@ -686,13 +703,13 @@ This will essentially do the same thing as the script, in order to let you chang
```
python convert_dataset.py \
--depth_dir /path/to/raw_GT/groundtruth_depth/<video name>/ \
--images_root_folder /path/toimages/ \
--depth_dir /path/to/raw_GT/ground_truth_depth/<video name>/ \
--images_root_folder /path/to/images/ \
--occ_dir /path/to/raw_GT/occlusion_depth/<video name>/ \
--metadata_path /path/toimages/videos/dir/metadata.csv \
--metadata_path /path/to/images/videos/dir/metadata.csv \
--dataset_output_dir /path/to/dataset/ \
--video_output_dir /path/to/vizualisation/ \
--interpolated_frames_list /path/toimages/video/dir/interpolated_frames.txt \
--video_output_dir /path/to/visualization/ \
--interpolated_frames_list /path/to/images/video/dir/interpolated_frames.txt \
--final_model /path/to/final_model/ \
--video \
--downscale 4 \
......
......@@ -63,6 +63,9 @@ def set_argparser():
'This is for RAM purpose, as loading feature matches of thousands of frames can take up GBs of RAM')
ve_parser.add_argument('--include_lowfps_thorough', action='store_true',
help="if selected, will include videos frames at lowfps for thorough scan (longer)")
ve_parser.add_argument('--generic_model', default='OPENCV',
help='COLMAP model for generic videos. Same zoom level assumed throughout the whole video. '
'See https://colmap.github.io/cameras.html')
exec_parser = parser.add_argument_group("Executable files")
exec_parser.add_argument('--log', default=None, type=Path)
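As a side note on the `--generic_model` option above: a hedged sketch of where such a camera model name typically ends up, namely COLMAP's `ImageReader.camera_model` option during feature extraction of the video frames. The actual pipeline wiring is outside this diff, and `extract_video_features` is a hypothetical helper.

```
import subprocess

def extract_video_features(database_path, image_root, image_list, camera_model="OPENCV"):
    # Assumption: one generic camera model (same zoom level) for the whole video.
    subprocess.check_call([
        "colmap", "feature_extractor",
        "--database_path", str(database_path),
        "--image_path", str(image_root),
        "--image_list_path", str(image_list),
        "--ImageReader.camera_model", camera_model,  # see https://colmap.github.io/cameras.html
    ])
```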
......@@ -81,18 +84,37 @@ def set_argparser():
pm_parser.add_argument('--triangulate', action="store_true")
pm_parser.add_argument('--multiple_models', action='store_true', help='If selected, will let colmap mapper do multiple models. '
'The biggest one will then be chosen')
pm_parser.add_argument('--more_sift_features', action="store_true")
pm_parser.add_argument('--more_sift_features', action="store_true",
help="If selected, will activate the COLMAP options ` SiftExtraction.domain_size_pooling` "
" and `--SiftExtraction.estimate_affine_shape` during feature extraction. Be careful, "
"this does not use GPU and is thus very slow. More info : "
"https://colmap.github.io/faq.html#increase-number-of-matches-sparse-3d-points")
pm_parser.add_argument('--filter_models', action="store_true",
help="If selected, will filter video localization to smooth trajectory")
pm_parser.add_argument('--stereo_min_depth', type=float, default=0.1, help="Min depth for PatchMatch Stereo")
pm_parser.add_argument('--stereo_max_depth', type=float, default=100, help="Max depth for PatchMatch Stereo")
om_parser = parser.add_argument_group("Occlusion Mesh")
om_parser.add_argument('--normals_method', default="radius", choices=["radius", "neighbours"])
om_parser.add_argument('--normals_radius', default=0.2, type=float)
om_parser.add_argument('--normals_neighbours', default=8, type=int)
om_parser.add_argument('--mesh_resolution', default=0.2, type=float)
om_parser.add_argument('--splats', action='store_true')
om_parser.add_argument('--splat_threshold', default=0.1, type=float)
om_parser.add_argument('--max_occlusion_depth', default=250, type=float)
om_parser.add_argument('--normals_method', default="radius", choices=["radius", "neighbours"],
help='Method used for normal computation between radius and nearest neighbours')
om_parser.add_argument('--normals_radius', default=0.2, type=float,
help='If radius method for normals, radius within which other points will be considered neighbours')
om_parser.add_argument('--normals_neighbours', default=8, type=int,
help='If nearest neighbours method chosen, number of neighbours to consider. '
'Could be very close or very far points, but has a constant complexity')
om_parser.add_argument('--mesh_resolution', default=0.2, type=float,
help='Mesh resolution for occlusion, in meters. Higher means coarser.')
om_parser.add_argument('--splats', action='store_true',
help='If selected, will create splats for points in the cloud that are far from the occlusion mesh')
om_parser.add_argument('--splat_threshold', default=0.1, type=float,
help='Distance from occlusion mesh at which a splat will be created for a particular point (in meters)')
om_parser.add_argument('--max_splat_size', default=None, type=float,
help='Splat size is defined by mean distance from its neighbours. You can define a max splat size for '
'isolated points which otherwise would make a very large useless splat. '
'If not set, will be `2.5*splat_threshold`.')
gt_parser = parser.add_argument_group("Ground Truth Creator")
gt_parser.add_argument('--max_occlusion_depth', default=250, type=float,
help='Max depth for occlusion. Everything further will not be considered, i.e. treated as at infinity')
return parser
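Since `--max_splat_size` documents a derived default, here is a short sketch, hypothetical and not part of this commit, of how that default could be resolved after parsing:

```
# Apply the documented fallback of max_splat_size = 2.5 * splat_threshold.
args = set_argparser().parse_args()
if args.max_splat_size is None:
    args.max_splat_size = 2.5 * args.splat_threshold
```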
......@@ -120,8 +142,7 @@ per_vid_steps_1 = ["Full video extraction",
"Localizing remaining frames",
"Re-Alignment of triangulated points with Lidar point cloud"]
per_vid_steps_2 = ["Creating Ground truth data",
"Create video with GT vizualisation",
"Convert to KITTI format"]
"Create video with GT visualization and Convert to KITTI format"]
def print_workflow():
......
......@@ -7,7 +7,7 @@ import matplotlib.pyplot as plt
from scipy.spatial.transform import Rotation
parser = ArgumentParser(description='Take all the drone videos of a folder and put the frame '
'location in a COLMAP file for vizualisation',
'location in a COLMAP file for visualization',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--input_images_1', metavar='FILE', type=Path)
......
......@@ -16,7 +16,7 @@ import pandas as pd
def save_intrinsics(cameras, images, output_dir, downscale=1):
def construct_intrinsics(cam):
assert('PINHOLE' in cam.model)
# assert('PINHOLE' in cam.model)
if 'SIMPLE' in cam.model:
fx, cx, cy, *_ = cam.params
fy = fx
......@@ -105,7 +105,7 @@ def apply_cmap_and_resize(depth, colormap, downscale):
def process_one_frame(img_path, depth_path, occ_path,
dataset_output_dir, video_output_dir, downscale, interpolated,
visualization=False, viz_width=1920):
visualization=False, viz_width=1920, compressed=True):
img = imread(img_path)
if len(img.shape) == 3:
h, w, _ = img.shape
......@@ -123,7 +123,7 @@ def process_one_frame(img_path, depth_path, occ_path,
# Img goes to upper left corner of visualization
output_img[:viz_height//2, :viz_width//2] = viz_img
if depth_path is not None:
with gzip.open(depth_path, "rb") as f:
with gzip.open(depth_path, "rb") if compressed else open(depth_path, "rb") as f:
depth = np.frombuffer(f.read(), np.float32).reshape(h, w)
output_depth_name = dataset_output_dir / img_path.basename() + '.npy'
downscaled_depth, viz = apply_cmap_and_resize(depth, 'rainbow', downscale)
......@@ -139,7 +139,7 @@ def process_one_frame(img_path, depth_path, occ_path,
output_img[:viz_height//2, viz_width//2:]//2
if occ_path is not None and visualization:
with gzip.open(occ_path, "rb") as f:
with gzip.open(occ_path, "rb") if compressed else open(occ_path, "rb") as f:
occ = np.frombuffer(f.read(), np.float32).reshape(h, w)
_, occ_viz = apply_cmap_and_resize(occ, 'bone', downscale)
occ_viz_rescaled = resize(occ_viz, (viz_height//2, viz_width//2))
......@@ -173,11 +173,15 @@ parser.add_argument('--video', action='store_true',
help='If selected, will generate a video from visualization images')
parser.add_argument('--downscale', type=int, default=1, help='How much ground truth depth is downscaled in order to save space')
parser.add_argument('--threads', '-j', type=int, default=8, help='Number of parallel workers used for conversion')
parser.add_argument('--compressed', action='store_true',
help='Indicates if GroundTruthCreator was used with option `--compress_depth_maps`')
parser.add_argument('--verbose', '-v', action='count', default=0)
def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir,
dataset_output_dir, video_output_dir, metadata_path, interpolated_frames_path,
ffmpeg, threads=8, downscale=None, width=None, visualization=False, video=False, **env):
dataset_output_dir, video_output_dir, metadata_path, interpolated_frames,
ffmpeg, threads=8, downscale=None, compressed=True,
width=None, visualization=False, video=False, verbose=0, **env):
dataset_output_dir.makedirs_p()
video_output_dir.makedirs_p()
if video:
......@@ -194,11 +198,6 @@ def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir,
save_intrinsics(cameras, images, dataset_output_dir, downscale)
save_positions(images, dataset_output_dir)
if interpolated_frames_path is None:
interpolated_frames = []
else:
with open(interpolated_frames_path, "r") as f:
interpolated_frames = [line[:-1] for line in f.readlines()]
image_df = pd.DataFrame.from_dict(images, orient="index").set_index("id")
image_df = image_df.reindex(metadata.index)
......@@ -207,29 +206,39 @@ def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir,
interpolated = []
imgs = []
cameras = []
not_registered = 0
for i in metadata["image_path"]:
img_path = images_root_folder / i
imgs.append(img_path)
fname = img_path.basename()
depth_path = depth_dir / fname + ".gz"
depth_path = depth_dir / fname
occ_path = occ_dir / fname
if compressed:
depth_path += ".gz"
occ_path += ".gz"
if depth_path.isfile():
if occ_path.isfile():
occ_maps.append(occ_path)
else:
occ_maps.append(None)
depth_maps.append(depth_path)
if i in interpolated_frames:
if verbose > 2:
print("Image {} was interpolated".format(fname))
interpolated.append(True)
else:
interpolated.append(False)
else:
print("Image {} was not registered".format(fname))
if verbose > 2:
print("Image {} was not registered".format(fname))
not_registered += 1
depth_maps.append(None)
if i in interpolated_frames:
interpolated.append(True)
print("Image {} was interpolated".format(fname))
else:
interpolated.append(False)
occ_path = occ_dir / fname + ".gz"
if occ_path.isfile():
occ_maps.append(occ_path)
else:
occ_maps.append(None)
interpolated.append(False)
print('{}/{} Frames not registered ({:.2f}%)'.format(not_registered, len(metadata), 100*not_registered/len(metadata)))
print('{}/{} Frames interpolated ({:.2f}%)'.format(sum(interpolated), len(metadata), 100*sum(interpolated)/len(metadata)))
if threads == 1:
for i, d, o, n in tqdm(zip(imgs, depth_maps, occ_maps, interpolated), total=len(imgs)):
process_one_frame(i, d, o, dataset_output_dir, video_output_dir, downscale, n, visualization, viz_width=1920)
......@@ -256,5 +265,10 @@ def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir,
if __name__ == '__main__':
args = parser.parse_args()
env = vars(args)
if args.interpolated_frames_path is None:
env["interpolated_frames"] = []
else:
with open(args.interpolated_frames_path, "r") as f:
env["interpolated_frames"] = [line[:-1] for line in f.readlines()]
env["ffmpeg"] = FFMpeg()
convert_dataset(**env)
......@@ -12,11 +12,11 @@ parser = ArgumentParser(description='Filter COLMAP model of a single video by di
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--input_images_colmap', metavar='FILE', type=Path, required=True,
help='Input COLMAP images.bin file to filter.')
help='Input COLMAP images.bin or images.txt file to filter.')
parser.add_argument('--metadata', metavar='FILE', type=Path, required=True,
help='Metadata CSV file of filtered video')
parser.add_argument('--output_images_colmap', metavar='FILE', type=Path, required=True,
help='Output images.bin file with filtere frame localizations')
help='Output images.bin or images.txt file with filtered frame localizations')
parser.add_argument('--interpolated_frames_list', type=Path, required=True,
help='Output list containing interpolated frames in order to discard them from ground-truth validation')
parser.add_argument('--filter_degree', default=3, type=int,
......@@ -279,7 +279,12 @@ def filter_colmap_model(input_images_colmap, output_images_colmap, metadata_path
camera_id=row["camera_id"],
name=row["image_path"],
xys=[], point3D_ids=[])
rm.write_images_text(smoothed_images_dict, output_images_colmap)
if output_images_colmap.ext == ".txt":
rm.write_images_text(smoothed_images_dict, output_images_colmap)
elif output_images_colmap.ext == ".bin":
rm.write_images_bin(smoothed_images_dict, output_images_colmap)
else:
print('Unknown output extension: {}'.format(output_images_colmap.ext))
return interpolated_frames
......
......@@ -139,7 +139,7 @@ def main():
colmap.export_model(output=env["georef_recon"] / "georef_sparse.ply",
input=env["georef_recon"])
georef_mlp = env["georef_recon"]/"georef_recon.mlp"
mxw.create_project(georef_mlp, [env["georefrecon_ply"]])
mxw.create_project(georef_mlp, [env["georef_recon"] / "georef_sparse.ply"])
colmap.export_model(output=env["georef_recon"],
input=env["georef_recon"],
output_type="TXT")
......@@ -173,10 +173,17 @@ def main():
'''Note: We use the inverse matrix here because, in general, it is easier to register the reconstructed model into the lidar one,
as the reconstruction has fewer points, but in the end we need the matrix to apply to the lidar points so that they are aligned
with the camera positions (i.e. the inverse)'''
return np.linalg.inv(np.fromfile(env["matrix_path"], sep=" ").reshape(4, 4))
return np.linalg.inv(np.fromfile(path, sep=" ").reshape(4, 4))
else:
print("Error, no registration matrix can be found, identity will be used")
return np.eye(4)
print("Error, no registration matrix can be found")
print("Ensure that your registration matrix was saved under the name {}".format(path))
decision = None
while decision not in ["y", "n", ""]:
decision = input("retry ? [Y/n]")
if decision.lower() in ["y", ""]:
return get_matrix(path)
elif decision.lower() == "n":
return np.eye(4)
i += 1
if i not in args.skip_step:
print_step(i, "Registration of photogrammetric reconstruction with respect to Lidar Point Cloud")
......@@ -241,11 +248,21 @@ def main():
if args.inspect_dataset:
# First inspection : Check registration of the Lidar pointcloud wrt to COLMAP model but without the occlusion mesh
# Second inspection : Check the occlusion mesh and the splats
colmap.export_model(output=env["georef_full_recon"] / "georef_sparse.ply",
input=env["georef_full_recon"])
georef_mlp = env["georef_recon"]/"georef_recon.mlp"
mxw.create_project(georef_mlp, [env["georefrecon_ply"]])
colmap.export_model(output=env["georef_full_recon"],
input=env["georef_full_recon"],
output_type="TXT")
eth3d.inspect_dataset(scan_meshlab=georef_mlp,
colmap_model=env["georef_full_recon"],
image_path=env["image_path"])
eth3d.inspect_dataset(scan_meshlab=env["aligned_mlp"],
colmap_model=env["georef_recon"],
colmap_model=env["georef_full_recon"],
image_path=env["image_path"])
eth3d.inspect_dataset(scan_meshlab=env["aligned_mlp"],
colmap_model=env["georef_recon"],
colmap_model=env["georef_full_recon"],
image_path=env["image_path"],
occlusions=env["occlusion_ply"],
splats=env["splats_ply"])
......@@ -258,6 +275,7 @@ def main():
generate_GT(video_name=v, GT_already_done=video_env["GT_already_done"],
video_index=j+1,
step_index=i,
num_videos=len(env["videos_to_localize"]),
metadata=video_env["metadata"],
**video_env["output_env"], **env)
......
......@@ -50,6 +50,18 @@ def get_mesh(input_mlp, index):
return transform, filepath
def get_meshes(input_mlp):
with open(input_mlp, "r") as f:
to_read = etree.parse(f)
meshgroup = to_read.getroot()[0]
meshes = []
for mesh in meshgroup:
transform = np.fromstring(mesh[0].text, sep=" ").reshape(4, 4)
filepath = mesh.get("label")
meshes.append((transform, filepath))
return meshes
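A brief hypothetical usage of `get_meshes`, with the project file name as a placeholder:

```
# Iterate over the meshes declared in a MeshLab project file.
for transform, filepath in get_meshes("/path/to/registered.mlp"):
    print(filepath, transform[:3, 3])  # mesh label and its translation component
```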
def add_meshes_to_project(input_mlp, output_mlp, model_paths, labels=None, transforms=None, start_index=-1):
if labels is not None:
assert(len(model_paths) == len(labels))
......
......@@ -63,7 +63,7 @@ int main (int argc, char** argv)
}
LOG(INFO) << "point clouds loaded";
LOG(INFO) << "Subsampling to have a mean distance between points of " << resolution << " m";
LOG(INFO) << "Subsampling Lidar point cloud to have a mean distance between points of " << resolution << " m";
lidar = filter<pcl::PointNormal>(lidar, resolution);
LOG(INFO) << "Loading georef_dense vis file...";
......
......@@ -77,7 +77,7 @@ def prepare_video_workspace(video_name, video_frames_folder,
output["interpolated_frames_list"] = output["model_folder"] / "interpolated_frames.txt"
output["final_model"] = output["model_folder"] / "final"
output["kitti_format_folder"] = converted_output_folder / "KITTI" / relative_path_folder
output["viz_folder"] = converted_output_folder / "vizualisation" / relative_path_folder
output["viz_folder"] = converted_output_folder / "visualization" / relative_path_folder
video_env["output_env"] = output
video_env["already_localized"] = env["resume_work"] and output["model_folder"].isdir()
video_env["GT_already_done"] = env["resume_work"] and (raw_output_folder / "ground_truth_depth" / video_name.stem).isdir()
......
......@@ -9,7 +9,7 @@ import gzip
from pebble import ProcessPool
from tqdm import tqdm
parser = ArgumentParser(description='create a vizualisation from ground truth created',
parser = ArgumentParser(description='create a visualization from ground truth created',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--img_dir', metavar='DIR', type=Path)
......
......@@ -59,7 +59,7 @@ def localize_video(video_name, video_frames_folder, thorough_db, metadata, lowfp
chunk_image_list_paths, chunk_dbs,
colmap_models_root, full_model, lowfps_model, chunk_models, final_model,
output_env, eth3d, colmap, ffmpeg, pcl_util,
step_index=None, video_index=None, num_videos=None, already_localized=False, filter_model=True,
step_index=None, video_index=None, num_videos=None, already_localized=False,
save_space=False, triangulate=False, **env):
def print_step_pv(step_number, step_name):
......@@ -173,13 +173,7 @@ def localize_video(video_name, video_frames_folder, thorough_db, metadata, lowfp
output_matrix=matrix_name, output_cloud=env["lidar_ply"],
max_distance=10)
if filter_model:
i_pv += 1
print_step_pv(i_pv, "Filtering model to have continuous localization")
(final_model / "images.txt").rename(final_model / "images_raw.txt")
interpolated_frames = filter_colmap_model(input_images_colmap=final_model / "images_raw.txt",
output_images_colmap=final_model / "images.txt",
metadata_path=metadata, **env)
(final_model / "images.txt").rename(final_model / "images_raw.txt")
output_env["video_frames_folder"].makedirs_p()
video_frames_folder.merge_tree(output_env["video_frames_folder"])
......@@ -187,10 +181,6 @@ def localize_video(video_name, video_frames_folder, thorough_db, metadata, lowfp
output_env["model_folder"].makedirs_p()
colmap_models_root.merge_tree(output_env["model_folder"])
if filter_model:
with open(output_env["interpolated_frames_list"], "w") as f:
f.write("\n".join(interpolated_frames) + "\n")
clean_workspace()
......@@ -198,14 +188,42 @@ def generate_GT(video_name, raw_output_folder, images_root_folder, video_frames_
viz_folder, kitti_format_folder, metadata, interpolated_frames_list,
final_model, aligned_mlp, global_registration_matrix,
occlusion_ply, splats_ply,
eth3d, colmap,
video_index=None, num_videos=None, GT_already_done=False,
eth3d, colmap, filter_models=True,
step_index=None, video_index=None, num_videos=None, GT_already_done=False,
save_space=False, inspect_dataset=False, **env):
def print_step_pv(step_number, step_name):
if step_index is not None and video_index is not None and num_videos is not None:
progress = "{}/{}".format(video_index, num_videos)
substep = "{}.{}".format(step_index, video_index)
else:
progress = ""
substep = ""
print_step("{}.{}".format(substep, step_number),
"[Video {}, {}] \n {}".format(video_name.basename(),
progress,
step_name))
if GT_already_done:
return
if not final_model.isdir():
print("Video not localized, rerun the script without skipping former step")
return
print("Creating GT on video {} [{}/{}]".format(video_name.basename(), video_index, num_videos))
i_pv = 1
if filter_models:
print_step_pv(i_pv, "Filtering model to have continuous localization")
interpolated_frames = filter_colmap_model(input_images_colmap=final_model / "images_raw.txt",
output_images_colmap=final_model / "images.txt",
metadata_path=metadata, **env)
with open(interpolated_frames_list, "w") as f:
f.write("\n".join(interpolated_frames) + "\n")
i_pv += 1
else:
(final_model / "images_raw.txt").copy(final_model / "images.txt")
interpolated_frames = []
model_length = len(read_images_text(final_model / "images.txt"))
if model_length < 2:
return
......@@ -243,9 +261,7 @@ def generate_GT(video_name, raw_output_folder, images_root_folder, video_frames_
eth3d.inspect_dataset(final_mlp, final_model,
final_occlusions, final_splats)
print("Creating GT on video {} [{}/{}]".format(video_name.basename(), video_index, num_videos))
i_pv = 1
print_step(i_pv, "Creating Ground truth data with ETH3D")
print_step_pv(i_pv, "Creating Ground truth data with ETH3D")
eth3d.create_ground_truth(final_mlp, final_model, raw_output_folder,
final_occlusions, final_splats)
......@@ -253,15 +269,15 @@ def generate_GT(video_name, raw_output_folder, images_root_folder, video_frames_
kitti_format_folder.makedirs_p()
i_pv += 1
print_step(i_pv, "Convert to KITTI format and create video with GT vizualisation")
print_step_pv(i_pv, "Convert to KITTI format and create video with GT visualization")
cd.convert_dataset(final_model,
raw_output_folder / "ground_truth_depth" / video_name.stem,
images_root_folder,
raw_output_folder / "occlusion_depth" / video_name.stem,
kitti_format_folder, viz_folder,
metadata, interpolated_frames_list,
vizualisation=True, video=True, downscale=4, threads=8, **env)
metadata, interpolated_frames,
visualization=True, video=True, downscale=4, threads=8, **env)
interpolated_frames_list.copy(kitti_format_folder)
if save_space:
(raw_output_folder / "occlusion_depth" / video_name.stem).rmtree_p()
......
......@@ -13,7 +13,7 @@ from tqdm import tqdm
import tempfile
parser = ArgumentParser(description='Take all the drone videos of a folder and put the frame '
'location in a COLMAP file for vizualisation',
'location in a COLMAP file for visualization',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--video_folder', metavar='DIR',
......@@ -24,7 +24,7 @@ parser.add_argument('--centroid_path', default=None, help="path to centroid gene
parser.add_argument('--colmap_img_root', metavar='DIR', type=Path,
help="folder that will be used as \"image_path\" parameter when using COLMAP", required=True)
parser.add_argument('--output_format', metavar='EXT', default="bin", choices=["bin", "txt"],
help='format of the COLMAP file that will be outputed, used for vizualisation only')
help='format of the COLMAP file that will be output, used for visualization only')
parser.add_argument('--vid_ext', nargs='+', default=[".mp4", ".MP4"],