Commit 0246e0cf authored by Clément Pinard

Update README (WIP)

parent 4549fdf7
# Photogrammetry and georegistration tools for Parrot drone videos
This is a set of python scripts and c++ programs used to construct a depth validation set with a Lidar generated point cloud.
For a brief recap of what it does, see the section [How it works](#how-it-works)
## Software Dependencies
These are the tools used; make sure to install them before running the scripts.
- [CUDA](https://developer.nvidia.com/cuda-downloads)
- [OpenCV](https://opencv.org/)
- [ETH3D Dataset-pipeline](https://github.com/ETH3D/dataset-pipeline)
- [Pytorch](https://pytorch.org/)
- [COLMAP](https://colmap.github.io/)
- [PDrAW from AnafiSDK](https://developer.parrot.com/docs/pdraw/)
Apart from CUDA, which you need to install yourself, you can use the helper script `install_dependencies.sh` to install them on Ubuntu 20.04.
For PDrAW, there should be a `native-wrapper.sh` file that you need to keep track of. It's usually in `groundsdk/out/pdraw-linux/staging/native-wrapper.sh` (see [here](https://developer.parrot.com/docs/pdraw/installation.html))
## Hardware dependencies
To recreate the results of the study, you will need the following hardware:
- Parrot Anafi
- DJI Matrice 600
- Velodyne Puck VLP16
Note that for our study, we provided the Anafi drone (\~700€), and the point cloud was created by a private company (\~3500€ for the whole scan process)
# How it works
Here are the key steps of the dataset creation :
1. Data acquisition on a particular scene
- Make a photogrammetry flight plan with any drone, e.g. the Anafi with the Pix4D capture app (it's free). It is important that the pictures have GPS info in the EXIF
- Make some natural flights in the same scene, using either a Bebop2 or an Anafi to be able to use the PDrAW tool. In theory, it is possible to adapt the current scripts to any IMU-GPS-powered camera.
- Make a Lidar scan of this very scene, and clean the resulting 3D point cloud: this is a crucial part, as the Lidar data will be assumed perfect for the rest of the workflow. You also need to note the projection system used (e.g. `EPSG 2154`) for georegistration. The file will a priori be a `.las` file with float64 values.
2. Convert the `.las` float64 point cloud into a `.ply` float32
- As 3D values are global, x and y will be huge. You need to make the cloud 0-centered by subtracting its centroid from it.
- The centroid needs to be logged somewhere for future frame registration
- This step is done by the script `las2ply.py`
3. Extract optimal frames from the videos for a thorough photogrammetry that will use a mix of Pix4D flight plan pictures and video still frames.
- The total number of frames must not be too high, to prevent the reconstruction from lasting too long on a single desktop (we recommend between 500 and 1000 images)
- At the same time, extract, if possible, information on camera parameters to identify which video sequences share the same parameters (e.g. 4K videos vs 720p videos, or different levels of zoom)
- This step is done by the script `video_to_colmap.py`
4. Georeference your images.
- For each frame with a *GPS* position, convert it to *XYZ* coordinates in the projection system used by the Lidar point cloud (here, EPSG:2154 was used)
- Subtract from these coordinates the centroid that was logged when converting the LAS file to PLY.
- Log the image filename and the centered *XYZ* position in a file for georegistration of the reconstructed point cloud
- This step is also done by the script `video_to_colmap.py` (see the sketch below)
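As an illustration of this conversion, here is a minimal sketch, assuming the `pyproj` library and WGS84 latitude/longitude/altitude GPS data (paths, file names and the `frames` list are hypothetical; the actual script may differ):
```
import numpy as np
from pyproj import Transformer

# WGS84 GPS coordinates (EPSG:4326) to the Lidar projection system (EPSG:2154 here).
gps_to_lambert = Transformer.from_crs("EPSG:4326", "EPSG:2154", always_xy=True)

# Centroid logged when converting the LAS file to PLY (hypothetical path).
centroid = np.loadtxt("centroid.txt")

def gps_to_centered_xyz(lat, lon, alt):
    # EPSG:2154 is a 2D projection; the altitude is kept as-is here.
    x, y = gps_to_lambert.transform(lon, lat)
    return np.array([x, y, alt]) - centroid

# Hypothetical list of (filename, latitude, longitude, altitude) read from EXIF data.
frames = [("video_frame_0001.jpg", 48.8584, 2.2945, 35.0)]

with open("georef.txt", "w") as f:
    for name, lat, lon, alt in frames:
        x, y, z = gps_to_centered_xyz(lat, lon, alt)
        f.write("{} {} {} {}\n".format(name, x, y, z))
```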
5. Generate sky maps of your drone pictures to help the photogrammetry filter out noise during matching
- Use a neural network to segment the drone pictures and generate masks so that the black areas will be ignored
- This is done with the script `generate_sky_masks.py`
6. Perform a photogrammetry on your pictures
- The recommended tool is COLMAP because further tools will use its output format
- You should get a sparse 3D model, exhaustive enough to :
- Be matched with the Lidar Point cloud
- Localize other video frames in the reconstruction
7. Change the reconstructed point cloud with shift and scale to match the Lidar point cloud
- See [here](https://colmap.github.io/faq.html#geo-registration) for point cloud georegistration with colmap, using the image positions logged at step 4
- Save the reconstructed point cloud to a ply file
8. For each video : continue the photogrammetry with video frames at a low fps (we took 1 fps)
- We do this in order to keep the whole mapping time linear
- Merge all the resulting models into one full model with the thorough photogrammetry frames and all the 1 fps frames
- Finish registering the remaining frames. For RAM reasons, every video is divided into chunks, so that a registered sequence is never more than 4000 frames.
- Filter the final model at full framerate : remove points with absurd angular and translational acceleration. Interpolate the resulting discarded points (but keep track of them). This is done by the script `filter_colmap_model.py` (a simplified sketch is shown below)
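As a minimal sketch of the kind of outlier filtering `filter_colmap_model.py` performs on the translational part (the real script also handles angular motion; thresholds and names here are hypothetical):
```
import numpy as np

def flag_position_outliers(positions, fps, max_acceleration=10.0):
    # positions: (N, 3) array of camera positions (one per frame), fps: frame rate.
    # A frame is flagged when its translational acceleration exceeds max_acceleration (m/s^2).
    dt = 1.0 / fps
    velocity = np.gradient(positions, dt, axis=0)
    acceleration = np.gradient(velocity, dt, axis=0)
    return np.linalg.norm(acceleration, axis=1) > max_acceleration

def interpolate_discarded(positions, outliers):
    # Linearly interpolate discarded positions from the surrounding valid frames,
    # while keeping track of which frames were interpolated (the outliers mask).
    valid, idx = ~outliers, np.arange(len(positions))
    fixed = positions.copy()
    for axis in range(3):
        fixed[outliers, axis] = np.interp(idx[outliers], idx[valid], positions[valid, axis])
    return fixed

# Example: a smooth trajectory with one absurd jump at frame 5.
positions = np.cumsum(np.full((10, 3), 0.1), axis=0)
positions[5] += 5.0
outliers = flag_position_outliers(positions, fps=30)
smoothed = interpolate_discarded(positions, outliers)
```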
9. Densify the resulting point cloud with COLMAP (see [here](https://colmap.github.io/tutorial.html#dense-reconstruction))
- Export a PLY file along with the VIS file with `colmap stereo_fusion`
10. Match the Lidar point cloud and the full reconstruction point cloud together with ICP.
- The georegistration of the reconstructed point cloud should be sufficient to get a good starting point.
- In our experience, the best method here is to use CloudCompare, but you can also use ETH3D or PCL to do it (see the illustration below)
- The resulting transformation matrix should be stored in a TXT file, the same way CloudCompare proposes to do it
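The recommended workflow uses CloudCompare, but as an illustration of the operation, here is a minimal ICP sketch with the Open3D library (an assumption for this example, not a dependency of the repository; paths and the correspondence distance are hypothetical):
```
import numpy as np
import open3d as o3d

# Geo-registered dense reconstruction and centered Lidar cloud (hypothetical paths).
reconstruction = o3d.io.read_point_cloud("georef_dense.ply")
lidar = o3d.io.read_point_cloud("lidar_centered.ply")

# The georegistration already provides a good starting point, so start from identity.
result = o3d.pipelines.registration.registration_icp(
    reconstruction, lidar,
    max_correspondence_distance=1.0,  # in metres, depends on the scene scale
    init=np.eye(4),
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())

# Store the 4x4 transformation as a plain text file, one row per line, like CloudCompare does.
np.savetxt("registration_matrix.txt", result.transformation)
```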
11. Construct a PLY+VIS file pair based on the lidar scan
- The PLY file is basically every lidar point
- The VIS file stores the frames from which each point is visible : we reuse the PLY+VIS from step 9, and assume a lidar point has the same visibility as the closest point in the densified reconstructed cloud.
12. Run the Delaunay mesher tool from COLMAP to construct an occlusion mesh
13. Construct the splats with ETH3D
14. Construct the ground truth depth with ETH3D
15. Visualize and convert the resulting dataset to match the format of a better-known dataset, like KITTI.
## Detailed method with the "Manoir" example
### Data acquisition
### Step-by-step commands
### Running the full script
Structure your input folder so that it looks like this:
```
├── Pictures
│   ├── anafi
│   │   ├── raw
│   │   └── rectilinear
│   └── apn
├── Videos
│   ├── 4K30
│   └── 720p120
└── Lidar
```
You can run the whole script with `python main_pipeline.py`
#### Parameters breakdown
### Manual step by step
This will essentially do the same thing as the script, in order to let you change some steps at will.
1. Point cloud preparation
```
python las2ply.py /path/to/cloud.las \
--output_ply /path/to/cloud_lidar.ply \
--output_txt /path/to/centroid.txt
```
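For reference, the core of this conversion is just a centroid subtraction followed by a float32 export; here is a minimal sketch assuming the `laspy` (2.x) and `plyfile` libraries (the actual `las2ply.py` may differ):
```
import numpy as np
import laspy
from plyfile import PlyData, PlyElement

las = laspy.read("/path/to/cloud.las")
points = np.stack([las.x, las.y, las.z], axis=1)  # float64, global coordinates

# Center the cloud so that float32 precision is sufficient, and log the centroid.
centroid = points.mean(axis=0)
np.savetxt("/path/to/centroid.txt", centroid)
centered = (points - centroid).astype(np.float32)

vertex = np.array([tuple(p) for p in centered],
                  dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
PlyData([PlyElement.describe(vertex, 'vertex')]).write("/path/to/cloud_lidar.ply")
```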
2. Cleaning
```
ETH3D/build/PointCloudCleaner \
--in /path/to/cloud_lidar.ply \
--filter <5,10>
```
(local outlier removal, doesn't remove isolated points)
or
```
pcl_util/build/CloudSOR \
--input /path/to/cloud_lidar.ply \
--output /path/to/cloud_lidar_filtered.ply \
--knn 5 --std 6
```
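Both commands remove local outliers based on the distances to their nearest neighbours and discard points that are abnormally far from the rest of the cloud. As an illustration only, the same kind of filtering with the Open3D library (an assumption, not a dependency of this pipeline):
```
import open3d as o3d

cloud = o3d.io.read_point_cloud("/path/to/cloud_lidar.ply")
# Keep points whose mean distance to their 5 nearest neighbours stays within 6 standard
# deviations of the global mean, mirroring the --knn 5 --std 6 options above.
filtered, kept_indices = cloud.remove_statistical_outlier(nb_neighbors=5, std_ratio=6.0)
o3d.io.write_point_cloud("/path/to/cloud_lidar_filtered.ply", filtered)
```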
3. Video frame addition to COLMAP db file
```
python video_to_colmap.py \
--video_folder /path/to/videos \
--system epsg:2154 \
--centroid_path /path/to/centroid.txt \
--output_folder /path/to/pictures/videos \
--nw /path/to/anafi/native-wrapper.sh \
--fps 1 \
--total_frames 1000 \
--save_space \
--thorough_db /path/to/scan.db
```
The video to colmap step will populate the scan db with new entries with the right camera parameters, and select a spatially optimal subset of frames from the full video for a photogrammetry with 1000 pictures.
It will also create several txt files with lists of file paths :
- `video_frames_for_thorough_scan.txt` : all images used in the first thorough photogrammetry
- `georef.txt` : all images with a GPS position, and the XYZ equivalent in the Lidar projection system, minus the centroid of the Lidar file.
Finally, it will divide long videos into chunks with corresponding lists of file paths, so that we don't deal with too large sequences (the limit here is 4000 frames).
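The exact frame selection is implemented in `video_to_colmap.py`; as a rough illustration of the idea (pick frames that are well spread over the flown trajectory), here is a simple farthest-point-sampling sketch over the GPS-derived positions (names and values are hypothetical):
```
import numpy as np

def select_spread_frames(positions, num_frames):
    # Greedy farthest point sampling: repeatedly pick the candidate frame that is
    # furthest from all already-selected frames.
    selected = [0]
    distances = np.linalg.norm(positions - positions[0], axis=1)
    for _ in range(1, min(num_frames, len(positions))):
        next_idx = int(np.argmax(distances))
        selected.append(next_idx)
        # Distance of every candidate to its closest already-selected frame.
        distances = np.minimum(distances, np.linalg.norm(positions - positions[next_idx], axis=1))
    return sorted(selected)

# Example: keep 1000 frames out of a 5000-frame trajectory.
positions = np.random.rand(5000, 3) * 100  # placeholder XYZ positions
kept = select_spread_frames(positions, 1000)
```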
4. Generate sky masks, to help the photogrammetry filter out noise during matching
```
python generate_sky_masks.py \
--img_dir /path/to/pictures/videos \
--colmap_img_root /path/to/images \
--mask_root /path/to/images_mask \
--batch_size 8
```
5. First COLMAP step : feature extraction
```
colmap feature_extractor \
--database_path /path/to/scan.db \
--image_path /path/to/images \
--image_list_path /path/to/images/video_frames_for_thorough_scan.txt \
--ImageReader.mask_path /path/to/images_mask/ \
--ImageReader.camera_model RADIAL \
--ImageReader.single_camera_per_folder 1
```
We also recommend that you make your own vocab_tree with image indexes; this will make the next matching steps faster.
```
colmap vocab_tree_retriever \
--database_path /path/to/scan.db \
--vocab_tree_path /path/to/vocab_tree \
--output_index /path/to/indexed_vocab_tree
```
6. Second COLMAP step : matching. For less than 1000 images, you can use exhaustive matching (this will take around 2 hours). If there are too many images, you can use either spatial matching or vocab tree matching.
```
colmap exhaustive_matcher \
--database_path scan.db \
--SiftMatching.guided_matching 1
```
or
```
colmap spatial_matcher \
--database_path scan.db \
--SiftMatching.guided_matching 1
```
or
```
colmap vocab_tree_matcher \
--database_path scan.db \
--VocabTreeMatching.vocab_tree_path /path/to/indexed_vocab_tree
```
7. Third COLMAP step : thorough mapping.
```
mkdir -p /path/to/thorough
colmap mapper \
--database_path scan.db \
--image_path /path/to/images \
--output_path /path/to/thorough/ \
--Mapper.multiple_models 0
```
This will create a model in the folder `/path/to/thorough` (or `/path/to/thorough/0`), in the form of 3 files :
```
└── thorough
    └── 0
        ├── cameras.bin
        ├── images.bin
        ├── points3D.bin
        └── project.ini
```
8. Fourth COLMAP step : [georeferencing](https://colmap.github.io/faq.html#geo-registration)
```
colmap model_aligner \
--input_path /path/to/thorough/0/ \
--output_path /path/to/geo-registered-model \
--ref_images_path /path/to/images/georef.txt \
--robust_alignment_max_error 5
```
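For reference, each line of `georef.txt` contains an image name followed by its X, Y and Z coordinates, which is the format expected by `colmap model_aligner` for `--ref_images_path` (see the geo-registration FAQ linked above).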
#!/bin/bash
cd pcl_util \
&& mkdir build \
&& cd build \
&& cmake .. \
&& make -j8
parser = ArgumentParser(description='sky mask generator using ENet trained on cityscapes',
formatter_class=ArgumentDefaultsHelpFormatter)
parser.add_argument('--img_dir', metavar='DIR', default="workspace/Pictures",
help='path to image folder root', type=Path)
parser.add_argument('--colmap_img_root', metavar='DIR', default="workspace/Pictures", type=Path,
help='image_path you will give to colmap when extracting feature')
parser.add_argument('--mask_root', metavar='DIR', default="workspace/Masks",
help='where to store the generated_masks', type=Path)
parser.add_argument("--batch_size", "-b", type=int, default=8)
if __name__ == '__main__':
    args = parser.parse_args()
    network = prepare_network()
    if args.img_dir[-1] == "/":
        args.img_dir = Path(args.img_dir[:-1])
    args.mask_root.makedirs_p()
    file_exts = ['jpg', 'JPG']
    process_folder(args.img_dir, args.colmap_img_root, args.mask_root, file_exts, True, args.batch_size)
#!/bin/bash
sudo apt update
sudo apt install -y git \
repo \
cmake \
build-essential \
pkg-config \
libboost-all-dev \
libeigen3-dev \
libsuitesparse-dev \
libfreeimage-dev \
libgoogle-glog-dev \
libgflags-dev \
libglew-dev \
qtbase5-dev \
libqt5opengl5-dev \
libcgal-dev \
libvtk6-dev \
libflann-dev \
cmake libgmp-dev \
libgoogle-glog-dev \
qt5-default \
libqwt-qt5-dev \
libpcl-dev \
libproj-dev \
libcgal-qt5-dev \
libatlas-base-dev \
libsuitesparse-dev \
zlib1g-dev \
libglfw3-dev \
libsdl2-dev \
rsync
git clone https://github.com/laurentkneip/opengv.git
cd opengv \
&& mkdir build \
&& cd build \
&& cmake .. \
&& make -j8
sudo make install
cd ../../
git clone https://github.com/opencv/opencv.git
cd opencv \
&& git checkout 4.1.2 \
&& mkdir build \
&& cd build \
&& cmake -D WITH_CUDA=OFF .. \
&& make -j8
sudo make install
cd ../../
git clone https://github.com/ETH3D/dataset-pipeline
cd dataset-pipeline \
&& mkdir -p build \
&& cd build \
&& cmake .. \
&& make -j8 \
&& cd ../../
git clone https://ceres-solver.googlesource.com/ceres-solver
cd ceres-solver \
&& git checkout $(git describe --tags) \
&& mkdir build \
&& cd build \
&& cmake .. -DBUILD_TESTING=OFF -DBUILD_EXAMPLES=OFF \
&& make -j8
sudo make install
cd ../../
git clone https://github.com/colmap/colmap.git
cd colmap \
&& git checkout dev \
&& mkdir build \
&& cd build \
&& cmake .. \
&& make -j8
sudo make install
mkdir -p groundsdk \
&& cd groundsdk \
&& repo init -u https://github.com/Parrot-Developers/groundsdk-manifest -m release.xml \
&& repo sync \
&& ./build.sh -p pdraw-linux -t build -j/1
#include <glog/logging.h>
#include <iostream>
#include <fstream>
#include <pcl/common/transforms.h>
#include <pcl/console/parse.h>
#include <pcl/point_types.h>
#include <pcl/filters/statistical_outlier_removal.h>
#include <pcl/features/normal_3d.h>
#include <pcl/io/ply_io.h>
#include <colmap/mvs/fusion.h>
#include <colmap/util/endian.h>
#include "pointcloud_subsampler.h"
Eigen::Matrix4f readMatrix(std::string filename)
{
int cols = 4, rows = 4;
Eigen::Matrix4f M;
std::ifstream infile;
infile.open(filename);
int row = 0;
while (! infile.eof() && row < 4)
{
std::string line;
std::getline(infile, line);
std::stringstream stream(line);
stream >> M(row, 0) >> M(row, 1) >> M(row, 2) >> M(row, 3);
row ++;
}
infile.close();
return M;
};
int main (int argc, char** argv)
{
FLAGS_logtostderr = 1;
google::InitGoogleLogging(argv[0]);
pcl::console::setVerbosityLevel(pcl::console::L_DEBUG);
// Parse arguments.
int dummy;
if (argc <= 1 ||
pcl::console::parse_argument(argc, argv, "-h", dummy) >= 0 ||
pcl::console::parse_argument(argc, argv, "--help", dummy) >= 0) {
LOG(INFO) << "Usage: " << argv[0] << " --georef_dense <file.ply> --lidar <file.ply> "
<< "--georef_matrix <file.txt> --resolution <float> (--output_cloud <file.ply>)";
return EXIT_FAILURE;
}
std::string georef_dense_path;
pcl::console::parse_argument(argc, argv, "--georef_dense", georef_dense_path);
std::string lidar_path;
pcl::console::parse_argument(argc, argv, "--lidar", lidar_path);
std::string georef_matrix_path;
pcl::console::parse_argument(argc, argv, "--georef_matrix", georef_matrix_path);
std::string output_cloud_path;
pcl::console::parse_argument(argc, argv, "--output_cloud", output_cloud_path);
float resolution = 0.2; //20cm resolution
pcl::console::parse_argument(argc, argv, "--resolution", resolution);
if (georef_matrix_path.empty() && output_cloud_path.empty()){
LOG(ERROR) << "No output path was given";
LOG(INFO) << "Usage: " << argv[0] << " --georef_dense <file.ply> --lidar <file.ply> "
<< "--georef_matrix <matrix.txt> --output_cloud <output.ply>";
return EXIT_FAILURE;
}
// Load point cloud with normals.
LOG(INFO) << "Loading point clouds ...";
pcl::PointCloud<pcl::PointNormal>::Ptr georef_dense(
new pcl::PointCloud<pcl::PointNormal>());
if (pcl::io::loadPLYFile(georef_dense_path, *georef_dense) < 0) {
return EXIT_FAILURE;
}
pcl::PointCloud<pcl::PointNormal>::Ptr lidar(
new pcl::PointCloud<pcl::PointNormal>());
if (pcl::io::loadPLYFile(lidar_path, *lidar) < 0) {
return EXIT_FAILURE;
}
LOG(INFO) << "point clouds loaded";
LOG(INFO) << "Subsampling to have a mean distance between points of " << resolution << " m";
lidar = filter<pcl::PointNormal>(lidar, resolution);
LOG(INFO) << "Loading transformation matrix ...";
Eigen::Matrix4f M = readMatrix(georef_matrix_path);
LOG(INFO) << "Matrix loaded";
LOG(INFO) << M;
pcl::transformPointCloudWithNormals (*lidar, *lidar, M.inverse());
LOG(INFO) << "Loading georef_dense vis file...";
const std::string input_vis_path = georef_dense_path + ".vis";
std::fstream input_vis_file(input_vis_path, std::ios::in | std::ios::binary);
CHECK(input_vis_file.is_open()) << input_vis_path;
const size_t vis_num_points = colmap::ReadBinaryLittleEndian<uint64_t>(&input_vis_file);
CHECK_EQ(vis_num_points, georef_dense->size());
std::vector<std::vector<int>> input_vis_points;
input_vis_points.reserve(georef_dense->size());
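// Read, for each dense reconstruction point, the list of image indices from which it is
// visible (COLMAP .vis format).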
for (auto it=georef_dense->begin(); it!=georef_dense->end(); it++) {
std::vector<int> image_idx;
int num_visible_images =
colmap::ReadBinaryLittleEndian<int>(&input_vis_file);
image_idx.reserve(num_visible_images);
for (uint32_t i = 0; i < num_visible_images; ++i) {
image_idx.push_back(colmap::ReadBinaryLittleEndian<uint32_t>(&input_vis_file));
}
input_vis_points.push_back(image_idx);
}
LOG(INFO) << "visible images ids ready to be transferred";
pcl::KdTree<pcl::PointNormal>::Ptr tree (new pcl::KdTreeFLANN<pcl::PointNormal>);
tree->setInputCloud(georef_dense);
std::vector<int> nn_indices (1);
std::vector<float> nn_dists (1);
std::vector<std::vector<int>> output_vis_points;
output_vis_points.reserve(lidar->size());
for(auto it = lidar->begin(); it != lidar->end(); it++){
tree->nearestKSearch(*it, 1, nn_indices, nn_dists);
std::vector<int> image_idx = input_vis_points.at(nn_indices[0]);
output_vis_points.push_back(image_idx);
}
if (!output_cloud_path.empty()) {
// Note : Instead of using pcl::savePLYFileBinary function,
// we use a PLYWriter so that we can set the use_camera parameter to false,
// to not add its element in the header, because COLMAP doesn't like
// PLY files with unknown headers.
pcl::PLYWriter writer;
const bool binary=true, use_camera=false;
writer.write<pcl::PointNormal>(output_cloud_path, *lidar, binary, use_camera);
const std::string output_vis_path = output_cloud_path + ".vis";
colmap::mvs::WritePointsVisibility(output_vis_path, output_vis_points);
}
return (0);
}
#include <glog/logging.h>
#include <iostream>
#include <fstream>
#include <pcl/common/transforms.h>
#include <pcl/console/parse.h>
#include <pcl/point_types.h>
#include <pcl/filters/statistical_outlier_removal.h>
#include <pcl/features/normal_3d.h>
#include <pcl/io/ply_io.h>
Eigen::Matrix4f readMatrix(std::string filename)
{
int cols = 4, rows = 4;
Eigen::Matrix4f M;
std::ifstream infile;
infile.open(filename);
int row = 0;
while (! infile.eof() && row < 4)
{
std::string line;
std::getline(infile, line);
std::stringstream stream(line);
stream >> M(row, 0) >> M(row, 1) >> M(row, 2) >> M(row, 3);
row ++;
}
infile.close();
return M;
};
int main (int argc, char** argv)
{
FLAGS_logtostderr = 1;
google::InitGoogleLogging(argv[0]);
pcl::console::setVerbosityLevel(pcl::console::L_DEBUG);
// Parse arguments.
int dummy;
if (argc <= 1 ||
pcl::console::parse_argument(argc, argv, "-h", dummy) >= 0 ||
pcl::console::parse_argument(argc, argv, "--help", dummy) >= 0) {
LOG(INFO) << "Usage: " << argv[0] << " --georef <file.ply> --lidar <file.ply> "
<< "--max_distance <int> --output_matrix <file.txt> (--output_cloud <file.ply>)";
return EXIT_FAILURE;
}
std::string georef_dense_path;
pcl::console::parse_argument(argc, argv, "--georef_dense", georef_dense_path);
std::string lidar_path;
pcl::console::parse_argument(argc, argv, "--lidar", lidar_path);
std::string georef_matrix_path;
pcl::console::parse_argument(argc, argv, "--georef_matrix", georef_matrix_path);
std::string output_cloud_path;
pcl::console::parse_argument(argc, argv, "--output_cloud", output_cloud_path);
if (georef_matrix_path.empty() && output_cloud_path