This is the repository for Self-Supervised Monocular Depth Estimation with Internal Feature Fusion (arXiv), BMVC 2021.
A new backbone for self-supervised depth estimation.
If you find this work useful, please consider citing it:
```
@inproceedings{zhou_diffnet,
  title={Self-Supervised Monocular Depth Estimation with Internal Feature Fusion},
  author={Zhou, Hang and Greenwood, David and Taylor, Sarah},
  booktitle={British Machine Vision Conference (BMVC)},
  year={2021}
}
```
[16-05-2022] Added Cityscapes training and testing based on Manydepth.
[22-01-2022] Uploaded a model diffnet_640x192 (slightly improved over the one in the original paper).
| Methods | abs rel | sq rel | RMSE | RMSE log | D1 | D2 | D3 |
| :------ | :-----: | :----: | :--: | :------: | :-: | :-: | :-: |
| 1024x320 | 0.097 | 0.722 | 4.345 | 0.174 | 0.907 | 0.967 | 0.984 |
| 1024x320_ms | 0.094 | 0.678 | 4.250 | 0.172 | 0.911 | 0.968 | 0.984 |
| 1024x320_ms_ttr | 0.079 | 0.640 | 3.934 | 0.159 | 0.932 | 0.971 | 0.984 |
| 640x192 | 0.102 | 0.753 | 4.459 | 0.179 | 0.897 | 0.965 | 0.983 |
| 640x192_ms | 0.101 | 0.749 | 4.445 | 0.179 | 0.898 | 0.965 | 0.983 |
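The columns are the standard KITTI depth metrics; D1, D2 and D3 are the δ < 1.25, δ < 1.25² and δ < 1.25³ accuracies. A minimal sketch of how these are usually computed (mirroring the monodepth2-style compute_errors; `gt` and `pred` are assumed to be matched, masked depth arrays):
```
import numpy as np

def compute_errors(gt, pred):
    """Standard KITTI depth metrics, as in the table above."""
    thresh = np.maximum(gt / pred, pred / gt)
    d1 = (thresh < 1.25).mean()          # D1
    d2 = (thresh < 1.25 ** 2).mean()     # D2
    d3 = (thresh < 1.25 ** 3).mean()     # D3
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, d1, d2, d3
```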
```
sh start2train.sh      # training
sh disp_evaluation.sh  # depth evaluation
sh test_sample.sh      # inference on a sample image
```
Thanks to the authors of these works:
- monodepth2
- HRNet
This is great work. I have a question about the "run-time FPS". In Table 3 of your paper, you report a run-time of 87 FPS. Under what conditions did you obtain this value? On an Nvidia RTX 3090 GPU it takes me at least 53 ms to process one 640x192 image.
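For reference, this is roughly how I time it (a minimal sketch; the stand-in Conv2d is a placeholder for the real encoder + decoder, and the warm-up plus torch.cuda.synchronize() calls are needed to avoid measuring one-off setup costs or unfinished GPU work):
```
import time
import torch

# Stand-in network; replace with the real DIFFNet encoder + decoder.
model = torch.nn.Conv2d(3, 1, 3, padding=1).to("cuda").eval()
x = torch.rand(1, 3, 192, 640, device="cuda")

with torch.no_grad():
    for _ in range(10):           # warm-up: first calls pay one-off setup costs
        model(x)
    torch.cuda.synchronize()      # make sure warm-up work has finished

    n_runs = 100
    start = time.time()
    for _ in range(n_runs):
        model(x)
    torch.cuda.synchronize()      # wait for queued GPU work before stopping the clock
    elapsed = time.time() - start

print(f"{n_runs / elapsed:.1f} FPS, {1000.0 * elapsed / n_runs:.2f} ms/image")
```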
Thanks for your work.
There are some details I want to ask you about. Here is my torch environment:
```
torch         1.7.1+cu110
torchaudio    0.7.2
torchsummary  1.5.1
torchvision   0.8.2+cu110
```
I found that when I set the initial learning rate to 1e-4 for the first 14 epochs and then 1e-5 for the last 5 epochs, my experimental results are very different from yours. Could this be caused by the different PyTorch versions, or is something wrong with my training process?
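For reference, this is how I implement that schedule (a minimal sketch with a placeholder model, not this repo's trainer):
```
import torch

model = torch.nn.Linear(10, 1)    # placeholder for the DIFFNet parameters
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# 1e-4 for epochs 0-13 (14 epochs), then 1e-5 for the last 5 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=14, gamma=0.1)

for epoch in range(19):
    # ... full training epoch goes here (forward, loss, backward) ...
    optimizer.step()              # placeholder step so the scheduler order is valid
    scheduler.step()
    print(epoch, optimizer.param_groups[0]["lr"])
```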
Can we still get that environment file, though?
Thanks for your work on DIFFNet! I want to evaluate my training results on my PC, but the file splits/eigen/gt_depths.npz is required and I can't find it in the repository. Could you please provide this file? Thanks!
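In case it helps: since DIFFNet follows monodepth2, the file can be regenerated with monodepth2's export_gt_depth.py. A condensed sketch of what that script does for the eigen split (the data_path and the helper imports assume a monodepth2-style checkout):
```
import os
import numpy as np

from utils import readlines                  # monodepth2 helper
from kitti_utils import generate_depth_map   # monodepth2 helper

data_path = "kitti_data"                     # assumed raw-KITTI location
split_folder = os.path.join("splits", "eigen")
lines = readlines(os.path.join(split_folder, "test_files.txt"))

gt_depths = []
for line in lines:
    folder, frame_id, _ = line.split()
    # Project the raw velodyne scan into camera 2 to get per-pixel GT depth.
    calib_dir = os.path.join(data_path, folder.split("/")[0])
    velo_file = os.path.join(data_path, folder, "velodyne_points", "data",
                             "{:010d}.bin".format(int(frame_id)))
    gt_depth = generate_depth_map(calib_dir, velo_file, 2, True)
    gt_depths.append(gt_depth.astype(np.float32))

# Depth maps have different shapes per image, hence the object array.
np.savez_compressed(os.path.join(split_folder, "gt_depths.npz"),
                    data=np.array(gt_depths, dtype=object))
```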
Hi. First, thank you for releasing your nice paper and source code.
Could you share checkpoints that were pretrained on Cityscapes and fine-tuned on KITTI (i.e., CS → K)?
I would like to check whether the DIFFNet model that I pretrained on Cityscapes is correct.
Thanks!
Hello,
Thank you for sharing your work. I want to use libtorch to deploy this network in C++, but when using torch::jit::trace() I get an error (running test_sample.py itself succeeds):
Because torch::jit::trace() cannot handle dictionary outputs, I changed the output of depth_decoder to a list. Also, test_sample.py contains the line "import hr_networks", but I could not find hr_networks, and I don't know whether this affects torch::jit::trace().
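For reference, here is roughly the wrapper I use before tracing (a minimal sketch; the ("disp", i) keys are assumed from the monodepth2-style decoder, and the placeholder modules just make it run standalone). torch.jit.trace only supports tensor/tuple outputs, so the dict must be converted to a fixed-order tuple; the traced module can then be loaded in C++ with torch::jit::load:
```
import torch

class TraceableDepthNet(torch.nn.Module):
    """Wraps encoder + decoder so the output is a tuple instead of a dict."""

    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, x):
        outputs = self.decoder(self.encoder(x))
        # Assumed monodepth2-style keys: one disparity map per decoder scale.
        return tuple(outputs[("disp", i)] for i in range(4))

# Placeholders so the sketch runs standalone; swap in the real DIFFNet modules.
encoder = torch.nn.Conv2d(3, 16, 3, padding=1)

class DummyDecoder(torch.nn.Module):
    def forward(self, feats):
        return {("disp", i): torch.sigmoid(feats[:, :1]) for i in range(4)}

wrapped = TraceableDepthNet(encoder, DummyDecoder()).eval()
traced = torch.jit.trace(wrapped, torch.rand(1, 3, 192, 640))
traced.save("diffnet_traced.pt")  # load in C++ with torch::jit::load
```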
Thank you very much!