A minimal PyTorch implementation of YOLOv3, with support for training, inference and evaluation.
YOLOv4 and YOLOv7 weights are also compatible with this implementation.
For normal training and evaluation we recommend installing the package from source using a poetry virtual environment.
```bash
git clone https://github.com/eriklindernoren/PyTorch-YOLOv3
cd PyTorch-YOLOv3/
pip3 install poetry --user
poetry install
```
You need to join the virtual environment by running `poetry shell` in this directory before running any of the following commands without the `poetry run` prefix.
Also have a look at the other installation method if you want to use the commands everywhere without opening a poetry shell.
```bash
./weights/download_weights.sh
```
```bash
./data/get_coco_dataset.sh
```
This installation method is recommended if you want to use this package as a dependency in another Python project.
This method only includes the code, is less isolated and may conflict with other packages.
Weights and the COCO dataset need to be downloaded as stated above.
See API for further information regarding the package's API.
It also enables the CLI tools `yolo-detect`, `yolo-train`, and `yolo-test` everywhere without any additional commands.
```bash
pip3 install pytorchyolo --user
```
Evaluates the model on the COCO test dataset. To download this dataset and the weights, see above.
```bash
poetry run yolo-test --weights weights/yolov3.weights
```
| Model                   | mAP (min. 50 IoU) |
| ----------------------- |:-----------------:|
| YOLOv3 608 (paper)      | 57.9              |
| YOLOv3 608 (this impl.) | 57.3              |
| YOLOv3 416 (paper)      | 55.3              |
| YOLOv3 416 (this impl.) | 55.5              |
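The "min. 50 IoU" in the column header refers to the overlap threshold used when matching predictions to ground truth: a detection counts as correct only if its box overlaps a ground-truth box of the same class with an intersection over union (IoU) of at least 0.5. The sketch below illustrates that check for two boxes in `(x1, y1, x2, y2)` form. The function names are hypothetical and this is not the evaluation code of this repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True if the predicted box overlaps the ground truth enough to count."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(is_match((0, 0, 10, 10), (0, 0, 10, 6)))
```

Averaging precision over recall levels and over classes, which turns these per-box matches into mAP, is left out here.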
Uses pretrained weights to make predictions on images. The table below shows the inference speed for input images scaled to 256x256. The ResNet backbone measurements are taken from the YOLOv3 paper. The Darknet-53 measurement marked "(this impl.)" shows the inference speed of this implementation on my 1080ti card.
| Backbone                | GPU      | FPS      |
| ----------------------- |:--------:|:--------:|
| ResNet-101              | Titan X  | 53       |
| ResNet-152              | Titan X  | 37       |
| Darknet-53 (paper)      | Titan X  | 76       |
| Darknet-53 (this impl.) | 1080ti   | 74       |
```bash
poetry run yolo-detect --images data/samples/
```
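For readers who want to produce an FPS figure of their own, here is a minimal timing sketch. It is not the method used for the table above. `measure_fps` and `ms_to_fps` are hypothetical helpers, and `run_once` stands in for whatever callable performs one inference. When timing on a GPU with PyTorch, call `torch.cuda.synchronize()` inside `run_once`, since CUDA kernels run asynchronously.

```python
import time


def measure_fps(run_once, n_runs=20, n_warmup=3):
    """Average calls per second of run_once, ignoring a few warm-up calls."""
    for _ in range(n_warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(n_runs):
        run_once()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed if elapsed > 0 else float("inf")


def ms_to_fps(milliseconds_per_image):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / milliseconds_per_image


if __name__ == "__main__":
    # A stand-in workload; replace with a real inference call.
    print(measure_fps(lambda: sum(range(1000))))
    print(ms_to_fps(50))
```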
For argument descriptions have a look at `poetry run yolo-train --help`
To train on COCO using a Darknet-53 backend pretrained on ImageNet run:
```bash
poetry run yolo-train --data config/coco.data --pretrained_weights weights/darknet53.conv.74
```
Track training progress in Tensorboard:
* Initialize training
* Run the command below
* Go to http://localhost:6006/
```bash
poetry run tensorboard --logdir='logs' --port=6006
```
Storing the logs on a slow drive may significantly slow down training.
You can adjust the log directory using `--logdir <path>` when running `tensorboard` and `yolo-train`.
Run the commands below to create a custom model definition, replacing `<num-classes>` with the number of classes in your dataset.
```bash
./config/create_custom_model.sh <num-classes>  # Will create custom model 'yolov3-custom.cfg'
```
Add class names to `data/custom/classes.names`. This file should have one row per class name.
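Because the file holds one class name per row, each name's zero-based row number can serve as its class index. A small sketch of that mapping follows. `parse_class_names` is a hypothetical helper, not part of this package, and the three example names are made up.

```python
def parse_class_names(text):
    """Return the class names in file order, skipping blank rows."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Example file contents; in practice, read data/custom/classes.names.
names = parse_class_names("cat\ndog\nbird\n")

# Zero-based row number of each class name.
index_of = {name: idx for idx, name in enumerate(names)}

if __name__ == "__main__":
    print(names)
    print(index_of)
```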
Move the images of your dataset to `data/custom/images/`.
Move your annotations to `data/custom/labels/`. The dataloader expects that the annotation file corresponding to the image `data/custom/images/train.jpg` has the path `data/custom/labels/train.txt`. Each row in the annotation file should define one bounding box, using the syntax `label_idx x_center y_center width height`. The coordinates should be scaled `[0, 1]`, and the `label_idx` should be zero-indexed and correspond to the row number of the class name in `data/custom/classes.names`.
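The annotation format can be illustrated with a short conversion sketch. It assumes your source annotations are pixel-coordinate corner boxes `(x1, y1, x2, y2)`, and that "scaled `[0, 1]`" means x values are divided by the image width and y values by the image height. Both helper names are hypothetical and not part of this package.

```python
from pathlib import PurePosixPath


def to_yolo_row(label_idx, x1, y1, x2, y2, img_w, img_h):
    """Format one pixel-space corner box as 'label_idx x_center y_center width height'."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{label_idx} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def label_path_for(image_path):
    """Map .../images/<name>.<ext> to .../labels/<name>.txt."""
    p = PurePosixPath(image_path)
    return str(p.parent.parent / "labels" / (p.stem + ".txt"))


if __name__ == "__main__":
    # A 200x200 px box of class 0 in a 400x500 px image.
    print(to_yolo_row(0, 100, 50, 300, 250, 400, 500))
    print(label_path_for("data/custom/images/train.jpg"))
```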
In `data/custom/train.txt` and `data/custom/valid.txt`, add paths to images that will be used as train and validation data respectively.
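One way to produce these two lists is a seeded random split. The sketch below is only an illustration: the 80/20 ratio, the seed, and the helper names are arbitrary assumptions, and the actual file writing is left as a comment.

```python
import random


def split_image_paths(image_paths, valid_fraction=0.2, seed=0):
    """Shuffle reproducibly and return (train_paths, valid_paths)."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    n_valid = int(len(paths) * valid_fraction)
    return paths[n_valid:], paths[:n_valid]


def to_list_file(paths):
    """Render paths as file contents, one path per line."""
    return "".join(f"{p}\n" for p in paths)


if __name__ == "__main__":
    images = [f"data/custom/images/{i}.jpg" for i in range(10)]
    train, valid = split_image_paths(images)
    # To write the files:
    #   open("data/custom/train.txt", "w").write(to_list_file(train))
    #   open("data/custom/valid.txt", "w").write(to_list_file(valid))
    print(to_list_file(valid))
```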
To train on the custom dataset run:
```bash
poetry run yolo-train --model config/yolov3-custom.cfg --data config/custom.data
```
Add `--pretrained_weights weights/darknet53.conv.74` to train using a backend pretrained on ImageNet.
You are able to import the modules of this repo in your own project if you install the pip package `pytorchyolo`.
An example prediction call from a simple OpenCV python script would look like this:
```python
import cv2
from pytorchyolo import detect, models
model = models.load_model(
"
img = cv2.imread("
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
boxes = detect.detect_image(model, img)
print(boxes)
```
For more advanced usage, look at the methods' docstrings.
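A possible post-processing step for the printed boxes is sketched below. It assumes each returned row has the form `[x1, y1, x2, y2, confidence, class]`; check the docstring of `detect.detect_image` before relying on that layout. `summarize_detections` is a hypothetical helper, and the sample rows and class names are made up, so the snippet runs without the package installed.

```python
def summarize_detections(rows, class_names, min_confidence=0.5):
    """Keep confident detections and attach a readable class name.

    Assumes each row is [x1, y1, x2, y2, confidence, class].
    """
    results = []
    for x1, y1, x2, y2, confidence, class_idx in rows:
        if confidence < min_confidence:
            continue
        name = class_names[int(class_idx)]
        results.append((name, round(float(confidence), 3), (x1, y1, x2, y2)))
    return results


if __name__ == "__main__":
    # Made-up rows standing in for the output of detect.detect_image.
    sample_rows = [
        [10, 20, 110, 220, 0.92, 0],
        [5, 5, 50, 50, 0.30, 1],
    ]
    print(summarize_detections(sample_rows, ["person", "dog"]))
```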
Joseph Redmon, Ali Farhadi
Abstract
We present some updates to YOLO! We made a bunch
of little design changes to make it better. We also trained
this new network that’s pretty swell. It’s a little bigger than
last time but more accurate. It’s still fast though, don’t
worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP,
as accurate as SSD but three times faster. When we look
at the old .5 IOU mAP detection metric YOLOv3 is quite
good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared
to 57.5 AP50 in 198 ms by RetinaNet, similar performance
but 3.8× faster. As always, all the code is online at
https://pjreddie.com/yolo/.
[Paper] [Project Webpage] [Authors' Implementation]
@article{yolov3,
title={YOLOv3: An Incremental Improvement},
author={Redmon, Joseph and Farhadi, Ali},
journal = {arXiv},
year={2018}
}
YOEO extends this repo with the ability to train an additional semantic segmentation decoder. The lightweight example model is mainly targeted towards embedded real-time applications.
I am testing on /data/sample using the weights trained on the COCO dataset, but the results obtained are somewhat different from those shown by the author on the homepage. Why is that?
I'm trying to train on my own dataset, BDD100K, using pretrained weights. Can you give me some advice? (All parameters are unchanged, e.g. the learning rate and the learning-rate decay step.) Below is the result after 12 epochs of training. I think the mAP is a little too low, but the values of each loss function shown in the diagram are very small. I am worried that if I continue without any hyperparameter changes, the model will overfit. I currently have training set to 250 epochs, with evaluation every 10 epochs and a model checkpoint saved every 50 epochs. The following image shows the losses after 10 epochs of training and the evaluation results on the validation set.
The code is elegant and concise, but the training performance on coco val2014 is poor. The mAP is only 0.00912 after 24 epochs when I train the model from scratch.
Thank you for the exhaustive resources that you put in here for the community to learn from!
I've been wondering what the difference is between the implementation of mAP for Pascal VOC in this repository and the one from https://github.com/longcw/faster_rcnn_pytorch.git
After spending some time 'debugging', I came to the conclusion that both should be identical. However, when comparing the results, I get a better mAP using the implementation from this repo (75.6) than using the other one (71.2).
I wonder if anyone has had a similar problem or dilemma and could share their experience?
I'm using the latest code to train on the COCO2017 dataset, but I only get a very low mAP. Using yolov3.weights, I get mAP=65 on the validation dataset and mAP=67 on the training dataset with test.py, which shows that my dataset setup is right. But when I train on the dataset starting from the same yolov3.weights, the mAP falls to nearly 0 after 1 or 2 epochs. Can you analyse why this happens? Even if I set the learning rate very low, such as 1e-6, the mAP still goes down with each training epoch.
Hello there,
After studying your code, I still have some doubts. One label has a chance to appear in each layer (13×13, 26×26, 52×52) and with many anchors, and I see that the loss is added for each layer. I don't know the reason behind this.
If there is a big object that is better handled by the 13×13 grid, may it also become a target in another layer?
Thanks, ZD
ML engineer at Apple. Excited about machine learning, basketball and building things.