Commit 60e7c9b4 authored by Clément Pinard

Update documentation

parent ac58f29d
......@@ -3,6 +3,15 @@
This is a set of Python scripts and C++ programs used to construct a depth validation set with a Lidar-generated point cloud.
For a brief recap of what it does, see section [How it works](#how-it-works)
## Table of contents
* [Software Dependencies](#software-dependencies)
* [Hardware Dependencies](#hardware-dependencies)
* [How it works](#how-it-works)
* [Step by step guide](#usage)
* [Detailed method with the manoir example](#detailed-method-with-the-manoir-example)
## Software Dependencies
These are the tools used; make sure to install them before running the scripts.
......@@ -33,6 +42,7 @@ Note that for our study, we provided the Anafi drone (\~700€), and the point c
# How it works
Here are the key steps of the dataset creation :
See [Detailed method with the manoir example](#detailed-method-with-the-manoir-example) for a concrete example with options used.
1. Data acquisition on a particular scene
    - Make a photogrammetry flight plan with any drone. You can use e.g. the Anafi with the Pix4D capture app (it's free). It is important that pictures have GPS info in the EXIF data
......@@ -47,13 +57,13 @@ Here are the key steps of the dataset creation :
3. Extract optimal frames from video for a thorough photogrammetry that will use a mix of pix4D flight plan pictures and video still frames.
    - The total number of frames must not be too high, to prevent the reconstruction from taking too long on a single desktop (we recommend between 500 and 1000 images)
    - At the same time, if possible, extract information on camera parameters to identify which video sequences share the same parameters (e.g. 4K videos vs 720p videos, or different levels of zoom)
- This step is done by the script `video_to_colmap.py`
- This step is done by the script `videos_to_colmap.py`
4. Georeference your images.
    - For each frame with a *GPS* position, convert it to *XYZ* coordinates in the projection system used by the Lidar point cloud (here, EPSG:2154 was used)
    - Subtract from these coordinates the centroid that was logged when converting the LAS file to PLY.
- Log image filename and centered *XYZ* position in a file for georegistration of the reconstruction point cloud
- This step is also done by the script `video_to_colmap.py`
- This step is also done by the script `videos_to_colmap.py`
4. Generate sky maps of your drone pictures to help the photogrammetry filter out noise during matching
    - Use a neural network to segment the drone pictures and generate masks so that the black areas will be ignored
......@@ -95,12 +105,7 @@ Here are the key steps of the dataset creation :
12. Visualize and Convert the resulting dataset to match the format of a more well known dataset, like KITTI.
## Detailed method with the "Manoir" example
### Scene presentation
### Data acquisition
## Usage
### Running the full script
......@@ -110,7 +115,7 @@ Structure your input folder so that it looks like this:
├── Pictures
│ ├── anafi
│ │ ├── raw
│ │ └── rectilinear
│ └── apn
├── Videos
│ ├── 4K30
......@@ -118,9 +123,72 @@ Structure your input folder so that it looks like this:
└── Lidar
```
You can run the whole script with ```python main_pipeline.py```
#### Parameters brakdown
You can run the whole script with ```python main_pipeline.py```. If you don't have a lidar point cloud and want to use the COLMAP reconstructed cloud as ground truth, you can use ```python main_pipeline_no_lidar.py```, which will be very similar, minus the point cloud cleaning and registration steps.
#### Parameters breakdown
All the parameters for `main_pipeline.py` are defined in the file `cli_utils.py`. You will find a summary below:
1. Main options
* `--input_folder` : Input Folder with LAS/PLY point clouds, videos, and images, defined above
 * `--workspace` : Path to the workspace where COLMAP operations will be done. It needs to be on an SSD; the required size depends on video size, but should be at least 20 GB.
 * `--raw_output_folder` : Path to output folder for raw depth maps. Must be very big, especially with 4K videos: for a 4K 30fps video, count ~60 GB per minute of video.
 * `--converted_output_folder` : Path to output folder for converted depth maps and visualization. Must be big, but usually smaller than the raw output because depth maps are still uncompressed, though downscaled.
* `--show_steps` : If selected, will make a dry run just to list steps and their numbers.
 * `--skip_step` : Skip the selected steps. Can be useful if an operation has already been done manually.
 * `--begin_step` : Skip all steps before this step. Useful when the script failed at some point.
 * `--resume_work` : If selected, will try to skip videos already localized and ground truth already generated.
* `--inspect_dataset` : If selected, will open a window to inspect the dataset at key steps. See https://github.com/ETH3D/dataset-pipeline#dataset-inspection
* `--save_space` : If selected, will try to save space in workspace by only extracting needed frames and removing them as soon as they are no longer needed. Strongly advised.
* `--vid_ext` : Video extensions to scrape from input folder. By default will search for `mp4` and `MP4` files
 * `--pic_ext` : Same as video extensions, but for images. By default will search for `jpg`, `JPG`, `png` and `PNG` files.
 * `--raw_ext` : Same as video extensions, but for RAW images. By default will search for `ARW`, `NEF` and `DNG` files.
2. Executable files
* `--nw` : Native wrapper location. See https://developer.parrot.com/docs/pdraw/installation.html#run-pdraw
 * `--colmap` : Colmap exec location. Usually just `colmap` if it has been installed system-wide.
* `--ffmpeg` : ffmpeg exec location. Usually just `ffmpeg` if it has been installed system-wide.
* `--eth3d` : ETH3D dataset pipeline exec files folder location. Usually at `dataset-pipeline/build/`.
* `--pcl_util` : PCL util exec files. Usually at `pcl_util/build` (source in this repo)
 * `--log` : If set, will output stdout and stderr of these exec files to a log file, which can be read from another terminal with `tail`.
3. Lidar point cloud preparation
* `--pointcloud_resolution` : If set, will subsample the Lidar point clouds at the chosen resolution.
 * `--SOR` : Statistical Outlier Removal parameters. This accepts 2 arguments: number of nearest neighbours and max relative distance to the standard deviation (a rough sketch of this filter is given right after this parameter list). See https://pcl.readthedocs.io/projects/tutorials/en/latest/statistical_outlier.html
 * `--registration_method` : Method used for point cloud registration, choose between "simple", "eth3d" and "interactive" ("simple" by default). See Manual step by step: step 11.
4. Video extractor
* `--total_frames` : Total number of frames that will be used for the first thorough photogrammetry. By default 500, keep this number below 1000.
 * `--orientation_weight` : Weight applied to orientation during optimal frame sampling. Higher means two pictures with the same location but different orientations will be considered further apart.
* `--resolution_weight` : Same as orientation, but with image size.
 * `--max_sequence_length` : COLMAP needs to load ALL the feature matches to register new frames. As such, some videos are too long to fit in RAM, and we need to divide the video into chunks that will be treated separately and then merged together. This parameter is the maximum number of frames for a chunk. The ideal value is around 500 frames per GB of RAM, regardless of resolution.
 * `--num_neighbours` : Number of frames shared between consecutive chunks. This overlap is used when merging them back together.
 * `--system` : Coordinate system used for GPS, should be the same as the one used in the LAS files.
 * `--lowfps` : Framerate at which videos will be scanned WITH reconstruction. 1 fps by default.
 * `--include_lowfps_thorough` : If selected, will include video frames at lowfps for the thorough scan (longer). This can be useful when some videos are not GPS localized (e.g. a handheld camera) but are still relevant for the thorough photogrammetry.
5. Photogrammetry
* `--max_num_matches` : Max number of matches, lower it if you get GPU memory error.
 * `--vocab_tree` : Path to vocab tree, can be downloaded [here](https://demuc.de/colmap/#download)
* `--multiple_models` : If selected, will let colmap mapper do multiple models. The biggest one will then be chosen
 * `--more_sift_features` : If selected, will activate the COLMAP options `--SiftExtraction.domain_size_pooling` and `--SiftExtraction.estimate_affine_shape` during feature extraction. Be careful, this does not use the GPU and is thus very slow. More info: https://colmap.github.io/faq.html#increase-number-of-matches-sparse-3d-points
 * `--add_new_videos` : (flag defined without a help string; see `pm_parser` in `cli_utils.py`)
* `--stereo_min_depth` : Min depth for PatchMatch Stereo used during point cloud densification
* `--stereo_max_depth` : Same as min depth but for max depth.
6. Occlusion Mesh
 * `--normals_method` : Method used for normal computation: choose between "radius" and "nearest neighbours".
 * `--normals_radius` : If the radius method is chosen for normals, radius within which other points will be considered neighbours.
 * `--normals_neighbours` : If the nearest neighbours method is chosen, number of neighbours to consider. These could be very close or very far points, but the complexity is constant.
 * `--mesh_resolution` : Mesh resolution for occlusion in meters. Higher means coarser (default 0.2, i.e. 20cm).
* `--splats` : If selected, will create splats for points in the cloud that are far from the occlusion mesh.
 * `--splat_threshold` : Distance from the occlusion mesh above which a splat will be created for a particular point (default: 10cm)
 * `--max_splate_size` : Splat size is defined by the mean distance from a point to its neighbours. You can define a max splat size for isolated points, which would otherwise make a very large, useless splat. If not set, will be `2.5*splat_threshold`.
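
For reference, the `--SOR` filter mentioned above follows the classic Statistical Outlier Removal scheme (as in PCL). A minimal NumPy/SciPy sketch of the idea (illustrative only, not the actual implementation used by the pipeline):

```
import numpy as np
from scipy.spatial import cKDTree

def statistical_outlier_mask(points, knn=10, std_ratio=6.0):
    """Return a boolean mask of inlier points, following PCL's SOR idea.

    points : (N, 3) array; knn / std_ratio : the two values passed to --SOR.
    """
    tree = cKDTree(points)
    # distances to the knn nearest neighbours (first hit is the point itself)
    dists, _ = tree.query(points, k=knn + 1)
    mean_dists = dists[:, 1:].mean(axis=1)
    threshold = mean_dists.mean() + std_ratio * mean_dists.std()
    return mean_dists < threshold

# usage : cloud = cloud[statistical_outlier_mask(cloud, knn=10, std_ratio=6.0)]
```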
### Manual step by step
......@@ -130,18 +198,21 @@ This will essentially do the same thing as the script, in order to let you chang
```
python las2ply.py /path/to/cloud.las \
--output_ply /path/to/cloud_lidar.ply
--output_txt /path/to/centroid.txt
--output_folder /path/to/output_folder
```
This will save a ply file along with a centroid file
- `/path/to/output_folder/cloud.ply`
- `/path/to/output_folder/centroid.txt`
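
The centroid logic boils down to the following sketch (illustrative only; the actual LAS reading and PLY writing are handled by `las2ply.py`). Subtracting the centroid is presumably done so that the remaining coordinates stay small enough for single-precision storage:

```
import numpy as np

# points : (N, 3) array of XYZ coordinates read from the LAS file (stand-in values here)
points = np.array([[650000.1, 6860000.2, 35.0],
                   [650001.3, 6860002.8, 36.5]])

centroid = points.mean(axis=0)
centered = points - centroid          # what ends up in cloud.ply
np.savetxt("centroid.txt", centroid)  # logged so the offset can be re-applied later
```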
2. Point Cloud Cleaning
For each ply file :
```
ETHD3D/build/PointCloudCleaner \
--in /path/to/cloud_lidar.ply \
--filter <5,10>
--in /path/to/output_folder/cloud.ply \
--filter 5,10
```
(local outlier removal; doesn't necessarily remove isolated points)
or
......@@ -159,7 +230,7 @@ This will essentially do the same thing as the script, in order to let you chang
--output_meshlab /path/to/lidar.mlp
```
Optionally, if we have multiple lidar scans (which is not the case here), we can run a registration step with ETH3D
Optionally, if we have multiple lidar scans, we can run a registration step with ETH3D
```
ETHD3D/build/ICPScanAligner \
......@@ -187,16 +258,16 @@ This will essentially do the same thing as the script, in order to let you chang
--ImageReader.single_camera_per_folder 1 \
```
We don't need to extract features before having video frames, but this will populate the `/path/to/scan.db` file with the photogrammetry pictures and corresponding id that will be reserved for future version of the file. Besides, it automatically set a camera per folder too.
We don't explicitly need to extract features before having video frames, but this will populate the `/path/to/scan.db` file with the photogrammetry pictures and the corresponding ids that will be reserved for future versions of the file. Besides, it automatically sets one camera per folder too.
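
If you want to check what has been reserved in the database at this point, a quick sqlite query works (assuming the standard COLMAP schema, where the `images` table has `image_id`, `name` and `camera_id` columns):

```
import sqlite3

with sqlite3.connect("/path/to/scan.db") as db:
    rows = db.execute(
        "SELECT image_id, name, camera_id FROM images ORDER BY image_id")
    for image_id, name, camera_id in rows:
        print(image_id, name, camera_id)
```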
5. Video frame addition to COLMAP db file
```
python video_to_colmap \
python video_to_colmap.py \
--video_folder /path/to/videos \
--system epsg:2154 \
--centroid_path /path/to/centroid.txt \
--output_folder /path/to/pictures/videos \
--colmap_img_root /path/to/images \
--nw /path/to/anafi/native-wrapper.sh \
--fps 1 \
--total_frames 1000 \
......@@ -204,11 +275,13 @@ This will essentially do the same thing as the script, in order to let you chang
--thorough_db /path/to/scan.db
```
The video to colmap step will populate the scan db with new entries with the right camera parameters. And select a spatially optimal subset of frames from the full video for a photogrammetry with 1000 pictures.
The video to colmap step will populate the scan db with new entries with the right camera parameters, and select a spatially optimal subset of frames from the full video for a photogrammetry with 1000 pictures.
It will also create several txt files with lists of file paths:
- `video_frames_for_thorough_scan.txt` : all images used in the first thorough photogrammetry
 - `georef.txt` : all images with a GPS position, and their XYZ equivalent in the chosen coordinate system, minus the centroid of the Lidar file.
The system parameter (here epsg:2154) is the one used in the point cloud. The geolocalized frames will then be localized inside the point cloud, which will help register the COLMAP reconstructed points with the Lidar point cloud. See more info [here](https://en.wikipedia.org/wiki/Spatial_reference_system). It must be compatible with [Proj](https://proj.org).
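
As an illustration, the GPS to centered XYZ conversion could be written along these lines with pyproj (a sketch only, assuming `centroid.txt` simply holds the three XYZ values saved by `las2ply.py`):

```
import numpy as np
from pyproj import Proj

proj = Proj("epsg:2154")                        # same system as the Lidar cloud
centroid = np.loadtxt("/path/to/centroid.txt")  # assumed to contain three values

def gps_to_centered_xyz(lat, lon, alt):
    # a Proj instance maps (longitude, latitude) to projected (x, y); altitude is kept
    x, y = proj(lon, lat)
    return np.array([x, y, alt]) - centroid
```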
And finally, it will divide long videos into chunks with corresponding lists of file paths, so that we don't deal with too large sequences (the limit here is 4000 frames).
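
The chunk splitting could look like the following sketch (not the exact code of the script; chunk length and overlap correspond to `--max_sequence_length` and `--num_neighbours`):

```
def split_into_chunks(frames, max_length=4000, num_neighbours=10):
    """Split a frame list into chunks of at most max_length frames,
    consecutive chunks sharing num_neighbours frames for later merging."""
    chunks = []
    start = 0
    while start < len(frames):
        chunks.append(frames[start:start + max_length])
        if start + max_length >= len(frames):
            break
        start += max_length - num_neighbours
    return chunks
```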
......@@ -267,20 +340,22 @@ This will essentially do the same thing as the script, in order to let you chang
```
mkdir -p /path/to/thorough/
colmap mapper --Mapper.multiple_models 0 --database_path scan.db --output_path /path/to/thorough/ --image_path images
colmap mapper --database_path scan.db --output_path /path/to/thorough/ --image_path images
```
This will create a model file in the folder `output/sparse` (or `output/sparse/0`), in the form of 3 files
This will create multiple models located in folders named `output/sparse/N`, `N` being a number starting from 0. Each model will be in the form of 3 files:
```
└── thorough
└── 0
└── N
├── cameras.bin
├── images.bin
├── points3D.bin
└── project.ini
```
You can also add a last bundle adjustment using Ceres, supposedly better than the multicore used in mapper (albeit slower)
COLMAP creates multiple models when the dataset has multiple sets of images that don't overlap. Most of the time, there will be only 1 model (named `0`). Depending on the frame used for initialization, it can happen that the biggest model is not the first one. Here we will assume that it is indeed the first (`0`), but you are expected to change that number if it is not the most complete model COLMAP could construct.
You can finally add a last bundle adjustment using Ceres, supposedly better than the multicore one used in mapper (albeit slower):
```
colmap bundle_adjuster \
......@@ -305,7 +380,7 @@ This will essentially do the same thing as the script, in order to let you chang
10. Video Localization
All these substeps will populate the db file, which is then used for matching, so you need to make a copy for each video.
1. Extract all the frames of the video to same directory the `video_to_colmap.py` script exported the frame subset of this video.
    1. Extract all the frames of the video to the same directory where the `videos_to_colmap.py` script exported the frame subset of this video.
```
ffmpeg \
......@@ -443,7 +518,7 @@ This will essentially do the same thing as the script, in order to let you chang
--output_format txt
```
7. Filter the image sequence to exclude frame with an absurd acceleration and interpolate them instead
7. Filter the image sequence to exclude frames with an absurd acceleration and interpolate their poses instead. We keep track of interpolated frames, which will not be used for depth validation but can be used for depth estimation algorithms that need the odometry of previous frames.
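
The criterion could look roughly like the sketch below (a simplified, hypothetical version, not the actual implementation of `filter_colmap_model.py`):

```
import numpy as np

def absurd_acceleration_mask(positions, framerate, max_acceleration=10.0):
    """Flag frames whose estimated acceleration is implausibly high.

    positions : (N, 3) camera positions ordered by time (from images.txt)
    max_acceleration : hypothetical threshold in m/s^2
    """
    dt = 1.0 / framerate
    velocity = np.gradient(positions, dt, axis=0)
    acceleration = np.gradient(velocity, dt, axis=0)
    return np.linalg.norm(acceleration, axis=1) > max_acceleration
```

Flagged frames get their pose replaced by interpolation from neighbouring valid frames and are listed in the interpolated frames file. The actual filtering is run with: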
```
python filter_colmap_model.py \
--input_images_colmap /path/to/final_model/images.txt \
......@@ -506,7 +581,7 @@ This will essentially do the same thing as the script, in order to let you chang
The second matrix in `/path/to/register.mlp` will be the matrix transform from `/path/to/lidar.mlp` to `/path/to/georef_dense.ply`
Importante note : This operation doesn't work for scale adjustments which can be a problem with very large models.
Important note: This operation doesn't work for scale adjustments. Theoretically, if the video frames are GPS localized, this should not be a problem, but it can become one with very large models, where a small scale error will be responsible for large displacement errors locally.
Option 2 : construct a PLY file from lidar scans and register the reconstructed cloud with respect to the lidar, with PCL or CloudCompare. We do it this way (and not from lidar to reconstructed) because it is usually easier to register the cloud with fewer points with classic ICP.
```
......@@ -526,10 +601,11 @@ This will essentially do the same thing as the script, in order to let you chang
Or use CloudCompare : https://www.cloudcompare.org/doc/wiki/index.php?title=Alignment_and_Registration
Best results were obtained with these consecutive steps:
- Crop the /path/georef_dense.ply cloud, otherwise the Octomap will be very inefficient, and the cloud usually has very far outliers
- Apply noise filtering on cropped cloud
- Apply fine registration, with final overlap of 50%, scale adjustment, and Enable farthest point removal
- Save inverse of resulting registration
- Crop the /path/georef_dense.ply cloud, otherwise the Octomap will be very inefficient, and the cloud usually has very far outliers. See [Cross section](https://www.cloudcompare.org/doc/wiki/index.php?title=Cross_Section).
 - Apply noise filtering on the cropped cloud. See [Noise filter](https://www.cloudcompare.org/doc/wiki/index.php?title=Noise_filter).
 - (Optional, especially if the frames are GPS localized) Manually apply a rough registration with point pair picking. See [Align](https://www.cloudcompare.org/doc/wiki/index.php?title=Align).
 - Apply fine registration, with a final overlap of 50%, scale adjustment, and farthest point removal enabled. See [ICP](https://www.cloudcompare.org/doc/wiki/index.php?title=ICP).
- Save resulting registration matrix
For the fine registration part, as said earlier, the aligned cloud is the reconstruction and the reference cloud is the lidar.
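
To keep the transform direction straight: with the reconstruction as the aligned cloud and the lidar as the reference, the saved matrix maps reconstruction coordinates to lidar coordinates, and its inverse maps lidar coordinates to the reconstruction frame, which is presumably what the `--inverse` flag of the command below accounts for. A small NumPy sketch, assuming the matrix is stored as a plain-text 4x4:

```
import numpy as np

M = np.loadtxt("/path/to/registration_matrix.txt")  # reconstruction -> lidar
assert M.shape == (4, 4)

M_inv = np.linalg.inv(M)                            # lidar -> reconstruction

lidar_point = np.array([10.0, 2.0, 5.0, 1.0])       # homogeneous coordinates
point_in_reconstruction_frame = (M_inv @ lidar_point)[:3]
```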
......@@ -540,6 +616,7 @@ This will essentially do the same thing as the script, in order to let you chang
--input_meshlab /path/to/lidar.mlp \
--output_meshlab /path/to/registered.mlp \
--transform /path/to/registration_matrix.txt
--inverse
```
13. Occlusion Mesh generation
......@@ -581,6 +658,7 @@ This will essentially do the same thing as the script, in order to let you chang
--mesh_path /path/to/occlusion_mesh.ply \
--output_path /path/to/splats.ply
--distance_threshold 0.1
--max_splat_size 0.25
```
The ideal distance threshold is what is considered the close range of the occlusion mesh, i.e. the distance from the mesh beyond which a splat (a little square surface) will be created for a point.
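
A sketch of that decision rule, mirroring the `--splat_threshold` / `--max_splat_size` description above (hypothetical helper, not the actual pcl_util code):

```
def splat_size_for_point(dist_to_mesh, mean_neighbour_dist,
                         splat_threshold=0.1, max_splat_size=0.25):
    """Return the splat size for a point, or None if no splat is needed."""
    if dist_to_mesh <= splat_threshold:
        return None  # close enough to the occlusion mesh, the mesh covers it
    # splat size follows local point spacing, capped for isolated points
    return min(mean_neighbour_dist, max_splat_size)
```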
......@@ -623,4 +701,10 @@ This will essentially do the same thing as the script, in order to let you chang
--video \
--downscale 4 \
--threads 8
```
\ No newline at end of file
```
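
Once converted, the per-frame files can be loaded directly; a minimal sketch with hypothetical file names (the converted depth maps are saved as `.npy` arrays next to the downscaled frames by `convert_dataset.py`, the exact layout may differ):

```
import numpy as np
from imageio import imread

img = imread("/path/to/dataset/video_name/0042.png")     # downscaled frame
depth = np.load("/path/to/dataset/video_name/0042.npy")  # float32 depth map, same resolution

print(img.shape, depth.shape)
```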
## Detailed method with the "Manoir" example
### Scene presentation
### Data acquisition
\ No newline at end of file
......@@ -5,12 +5,13 @@ import pandas as pd
import numpy as np
from sqlite3 import IntegrityError
from tqdm import tqdm
parser = ArgumentParser(description='Create vizualisation for specified video',
parser = ArgumentParser(description='Add video frames to a colmap db file. '
'Needs original colmap db file to link the frames to the right cameras',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--frame_list', metavar='PATH',
help='path to list with relative path to images', type=Path, default=None)
parser.add_argument('--metadata', metavar='PATH',
parser.add_argument('--metadata', metavar='PATH', required=True,
help='path to metadata csv file', type=Path)
parser.add_argument('--database', metavar='DB', required=True,
help='path to colmap database file, to get the image ids right')
......
......@@ -6,7 +6,7 @@ import pandas as pd
import numpy as np
from pyproj import Proj
parser = ArgumentParser(description='Create vizualisation for specified video',
parser = ArgumentParser(description='Add GPS localized video to colmap model (Note : Localization is not precise enough)',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--video_list', metavar='PATH',
......
......@@ -16,36 +16,55 @@ def set_argparser():
main_parser.add_argument('--converted_output_folder', metavar='PATH', default=Path("."),
help='path to output folder : must be big !', type=Path)
main_parser.add_argument('--skip_step', metavar="N", nargs="*", default=[], type=int)
main_parser.add_argument('--skip_step', metavar="N", nargs="*", default=[], type=int,
help='Skip selected steps')
main_parser.add_argument('--begin_step', metavar="N", type=int, default=None)
main_parser.add_argument('--show_steps', action="store_true")
main_parser.add_argument('-v', '--verbose', action="count", default=0)
main_parser.add_argument('--vid_ext', nargs='+', default=[".mp4", ".MP4"])
main_parser.add_argument('--pic_ext', nargs='+', default=[".jpg", ".JPG", ".png", ".PNG"])
main_parser.add_argument('--raw_ext', nargs='+', default=[".ARW", ".NEF", ".DNG"])
main_parser.add_argument('--resume_work', action="store_true")
main_parser.add_argument('--inspect_dataset', action="store_true")
main_parser.add_argument('--registration_method', choices=["simple", "eth3d", "interactive"], default="simple")
main_parser.add_argument('--vid_ext', nargs='+', default=[".mp4", ".MP4"],
help='Video extensions to scrape from input folder')
main_parser.add_argument('--pic_ext', nargs='+', default=[".jpg", ".JPG", ".png", ".PNG"],
help='Image extensions to scrape from input folder')
main_parser.add_argument('--raw_ext', nargs='+', default=[".ARW", ".NEF", ".DNG"],
help='Raw Image extensions to scrape from input folder')
main_parser.add_argument('--resume_work', action="store_true",
help='If selected, will try to skip videos already localized, and ground truth already generated')
main_parser.add_argument('--inspect_dataset', action="store_true",
help='If selected, will open a window to inspect the dataset. '
'See https://github.com/ETH3D/dataset-pipeline#dataset-inspection')
main_parser.add_argument('--registration_method', choices=["simple", "eth3d", "interactive"], default="simple",
help='Method used for point cloud registration. See README, Manual step by step : step 11')
pcp_parser = parser.add_argument_group("PointCLoud preparation")
pcp_parser.add_argument("--pointcloud_resolution", default=0.1, type=float)
pcp_parser.add_argument("--SOR", default=[10, 6], nargs=2, type=float)
pcp_parser.add_argument("--pointcloud_resolution", default=None, type=float,
help='If set, will subsample the Lidar point clouds at the chosen resolution')
pcp_parser.add_argument("--SOR", default=[10, 6], nargs=2, type=float,
help="Statistical Outlier Removal parameters : Number of nearest neighbours, max relative distance to standard deviation")
ve_parser = parser.add_argument_group("Video extractor")
ve_parser.add_argument('--total_frames', default=500, type=int)
ve_parser.add_argument('--orientation_weight', default=1, type=float)
ve_parser.add_argument('--resolution_weight', default=1, type=float)
ve_parser.add_argument('--num_neighbours', default=10, type=int)
ve_parser.add_argument('--system', default="epsg:2154")
ve_parser.add_argument('--lowfps', default=1, type=int)
ve_parser.add_argument('--max_sequence_length', default=4000, type=int)
ve_parser.add_argument('--orientation_weight', default=1, type=float,
help="Weight applied to orientation during optimal sampling. "
"Higher means two pictures with same location but different "
"orientation will be considered farther apart")
ve_parser.add_argument('--resolution_weight', default=1, type=float,
help="same as orientation, but with image size")
ve_parser.add_argument('--num_neighbours', default=10, type=int,
help='Number of frame shared between subsequent chunks')
ve_parser.add_argument('--system', default="epsg:2154",
help='coordinate system used for GPS, should be the same as the one used in the LAS files')
ve_parser.add_argument('--lowfps', default=1, type=int,
help="framerate at which videos will be scanned WITH reconstruction")
ve_parser.add_argument('--max_sequence_length', default=4000, type=int,
help='Number max of frames for a chunk. '
'This is for RAM purpose, as loading feature matches of thousands of frames can take up GBs of RAM')
ve_parser.add_argument('--include_lowfps_thorough', action='store_true',
help="if selected, will include video frames at lowfps for thorough scan (longer)")
exec_parser = parser.add_argument_group("Executable files")
exec_parser.add_argument('--log', default=None, type=Path)
exec_parser.add_argument('--nw', default="native-wrapper.sh", type=Path,
help="native-wrapper.sh file location")
help="native-wrapper.sh file location (see Anafi SDK documentation)")
exec_parser.add_argument("--colmap", default="colmap", type=Path,
help="colmap exec file location")
exec_parser.add_argument("--eth3d", default="../dataset-pipeline/build",
......
from argparse import ArgumentParser, ArgumentDefaultsHelpFormatter
from path import Path
from imageio import imread, imwrite
from skimage.transform import rescale
from skimage.transform import rescale, resize
from skimage.measure import block_reduce
from colmap_util import read_model as rm
import numpy as np
......@@ -104,19 +104,24 @@ def apply_cmap_and_resize(depth, colormap, downscale):
def process_one_frame(img_path, depth_path, occ_path,
dataset_output_dir, video_output_dir, downscale, interpolated):
dataset_output_dir, video_output_dir, downscale, interpolated,
visualization=False, viz_width=1920):
img = imread(img_path)
if len(img.shape) == 3:
h, w, _ = img.shape
elif len(img.shape) == 2:
h, w = img.shape
img = img.reshape(h, w, 1)
assert((h/downscale).is_integer() and (w/downscale).is_integer())
output_img = np.zeros((2*(h//downscale), 2*(w//downscale), 3), dtype=np.uint8)
assert(viz_width % 2 == 0)
viz_height = int(viz_width * h / (2*w)) * 2
output_img = np.zeros((viz_height, viz_width, 3), dtype=np.uint8)
rescaled_img = rescale(img, 1/downscale, multichannel=True)*255
# Img goes to upper left corner of vizualisation
output_img[:h//downscale, :w//downscale] = rescaled_img
imwrite(dataset_output_dir / img_path.basename(), rescaled_img.astype(np.uint8))
if visualization:
viz_img = resize(img, (viz_height//2, viz_width//2))*255
# Img goes to upper left corner of visualization
output_img[:viz_height//2, :viz_width//2] = viz_img
if depth_path is not None:
with gzip.open(depth_path, "rb") as f:
depth = np.frombuffer(f.read(), np.float32).reshape(h, w)
......@@ -124,47 +129,69 @@ def process_one_frame(img_path, depth_path, occ_path,
downscaled_depth, viz = apply_cmap_and_resize(depth, 'rainbow', downscale)
if not interpolated:
np.save(output_depth_name, downscaled_depth)
# Depth colormap goes to upper right corner
output_img[:h//downscale, w//downscale:] = viz
# Mix Depth / image goest to lower left corner
output_img[h//downscale:, :w//downscale] = \
output_img[:h//downscale, :w//downscale]//2 + \
output_img[:h//downscale, w//downscale:]//2
if occ_path is not None:
if visualization:
viz_rescaled = resize(viz, (viz_height//2, viz_width//2))
# Depth colormap goes to upper right corner
output_img[:viz_height//2, viz_width//2:] = viz_rescaled
# Mix Depth / image goes to lower left corner
output_img[viz_height//2:, :viz_width//2] = \
output_img[:viz_height//2, :viz_width//2]//2 + \
output_img[:viz_height//2, viz_width//2:]//2
if occ_path is not None and visualization:
with gzip.open(occ_path, "rb") as f:
occ = np.frombuffer(f.read(), np.float32).reshape(h, w)
_, occ_viz = apply_cmap_and_resize(occ, 'bone', downscale)
# Occlusion depthmap vizualisation goes to lower right corner
output_img[h//downscale:, w//downscale:] = occ_viz
occ_viz_rescaled = resize(occ_viz, (viz_height//2, viz_width//2))
# Occlusion depthmap visualization goes to lower right corner
output_img[viz_height//2:, viz_width//2:] = occ_viz_rescaled
if interpolated:
output_img[:5] = output_img[-5:] = output_img[:, :5] = output_img[:, -5:] = [255, 128, 0]
imwrite(video_output_dir/img_path.stem + '.png', output_img)
if visualization:
imwrite(video_output_dir/img_path.stem + '.png', output_img)
parser = ArgumentParser(description='create a vizualisation from ground truth created',
parser = ArgumentParser(description='Convert dataset to KITTI format, optionally create a visualization video',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--depth_dir', metavar='DIR', type=Path)
parser.add_argument('--images_root_folder', metavar='DIR', type=Path)
parser.add_argument('--occ_dir', metavar='DIR', type=Path)
parser.add_argument('--metadata_path', type=Path)
parser.add_argument('--dataset_output_dir', metavar='DIR', default=None, type=Path)
parser.add_argument('--depth_dir', metavar='DIR', type=Path, required=True,
help='folder where depth maps generated by ETH3D are stored. Usually ends with "ground_truth_depth/<video name>"')
parser.add_argument('--images_root_folder', metavar='DIR', type=Path, required=True,
help='folder where video frames are stored')
parser.add_argument('--occ_dir', metavar='DIR', type=Path,
help='folder where occlusion depth maps generated by ETH3D are stored. Usually ends with "occlusion_depth/<video name>"')
parser.add_argument('--metadata_path', type=Path, required=True,
help='path to metadata CSV file generated during video_to_colmap.py')
parser.add_argument('--dataset_output_dir', metavar='DIR', default=None, type=Path, required=True)
parser.add_argument('--video_output_dir', metavar='DIR', default=None, type=Path)
parser.add_argument('--interpolated_frames_path', metavar='TXT', type=Path)
parser.add_argument('--final_model', metavar='DIR', type=Path)
parser.add_argument('--video', action='store_true')
parser.add_argument('--downscale', type=int, default=1)
parser.add_argument('--threads', type=int, default=8)
parser.add_argument('--visualize', action='store_true',
help='If selected, will generate images with depth colorized for visualization purpose')
parser.add_argument('--video', action='store_true',
help='If selected, will generate a video from visualization images')
parser.add_argument('--downscale', type=int, default=1, help='How much ground truth depth is downscaled in order to save space')
parser.add_argument('--threads', '-j', type=int, default=8, help='Number of parallel processes used to convert frames')
def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir,
dataset_output_dir, video_output_dir, metadata_path, interpolated_frames_path,
downscale, ffmpeg, threads=8, video=False, **env):
ffmpeg, threads=8, downscale=None, width=None, visualization=False, video=False, **env):
dataset_output_dir.makedirs_p()
video_output_dir.makedirs_p()
if video:
visualization = True
cameras, images, _ = rm.read_model(final_model, '.txt')
metadata = pd.read_csv(metadata_path).set_index("db_id", drop=False).sort_values("time")
framerate = metadata["framerate"].values[0]
if downscale is None:
assert(width is not None)
input_width = metadata["width"].values[0]
downscale = width / input_width
save_intrinsics(cameras, images, dataset_output_dir, downscale)
save_positions(images, dataset_output_dir)
if interpolated_frames_path is None:
......@@ -173,8 +200,6 @@ def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir,
with open(interpolated_frames_path, "r") as f:
interpolated_frames = [line[:-1] for line in f.readlines()]
metadata = pd.read_csv(metadata_path).set_index("db_id", drop=False).sort_values("time")
framerate = metadata["framerate"].values[0]
image_df = pd.DataFrame.from_dict(images, orient="index").set_index("id")
image_df = image_df.reindex(metadata.index)
depth_maps = []
......@@ -207,12 +232,13 @@ def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir,
occ_maps.append(None)
if threads == 1:
for i, d, o, n in tqdm(zip(imgs, depth_maps, occ_maps, interpolated), total=len(imgs)):
process_one_frame(i, d, o, dataset_output_dir, video_output_dir, downscale, n)
process_one_frame(i, d, o, dataset_output_dir, video_output_dir, downscale, n, visualization, viz_width=1920)
else:
with ProcessPool(max_workers=threads) as pool:
tasks = pool.map(process_one_frame, imgs, depth_maps, occ_maps,
[dataset_output_dir]*len(imgs), [video_output_dir]*len(imgs),
[downscale]*len(imgs), interpolated)
[downscale]*len(imgs), interpolated,
[visualization]*len(imgs), [1920]*len(imgs))
try:
for _ in tqdm(tasks.result(), total=len(imgs)):
pass
......@@ -221,9 +247,10 @@ def convert_dataset(final_model, depth_dir, images_root_folder, occ_dir,
raise e
if video:
video_path = str(video_output_dir/'{}_groundtruth_viz.mp4'.format(video_output_dir.stem))
video_path = str(video_output_dir.parent/'{}_groundtruth_viz.mp4'.format(video_output_dir.stem))
glob_pattern = str(video_output_dir/'*.png')
ffmpeg.create_video(video_path, glob_pattern, framerate)
video_output_dir.rmtree_p()
if __name__ == '__main__':
......
......@@ -59,7 +59,7 @@ def edges2triangles(edges):
return vertices, faces
parser = ArgumentParser(description='convert a dxf file with only edges to a faced mesh, only counting triangles',
parser = ArgumentParser(description='Convert a dxf file with only edges to a faced mesh, only counting triangles',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--dxf', default="manoir.dxf",
......@@ -67,7 +67,7 @@ parser.add_argument('--dxf', default="manoir.dxf",
parser.add_argument('--centroid_path', default="centroid.txt",
help='txt containing the centroid computed with las2ply.py')
parser.add_argument('--output', default=None,
help="output file name. By default, will be dxf path with \".dxf\" replace with \"ply\"")
help="output file name. By default, will be dxf path with \".dxf\" replaced with \"ply\"")
def main():
......
......@@ -8,7 +8,7 @@ from tqdm import tqdm
from pyntcloud import PyntCloud
from itertools import islice
parser = ArgumentParser(description='create a vizualisation from ground truth created',
parser = ArgumentParser(description='Convert EuroC dataset to COLMAP',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--root', metavar='DIR', type=Path)
......
......@@ -6,10 +6,14 @@ import pandas as pd
parser = ArgumentParser(description='create a new colmap model with only the frames of selected video',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--input_model', metavar='DIR', type=Path)
parser.add_argument('--output_model', metavar='DIR', default=None, type=Path)
parser.add_argument('--input_model', metavar='DIR', type=Path, required=True,
help='folder where the cameras.bin and images.bin are located')
parser.add_argument('--output_model', metavar='DIR', type=Path, required=True,
help='Output folder where the modified COLMAP model will be saved')
parser.add_argument('--output_format', choices=['.txt', '.bin'], default='.txt')
parser.add_argument('--metadata_path', metavar="CSV", type=Path)
parser.add_argument('--metadata_path', metavar="CSV", type=Path, required=True,
help='Path to metadata CSV file of the desired video. '
'Usually in /pictures/Videos/<size>/<video_name>/metadata.csv')
def extract_video(input, output, video_metadata_path, output_format='.bin'):
......
......@@ -7,15 +7,27 @@ import matplotlib.pyplot as plt