Minimal PyTorch implementation of YOLOv3

eriklindernoren, updated 2023-03-23 15:40:04

PyTorch YOLO

A minimal PyTorch implementation of YOLOv3, with support for training, inference and evaluation.

YOLOv4 and YOLOv7 weights are also compatible with this implementation.

Installation

Installing from source

For normal training and evaluation we recommend installing the package from source using a poetry virtual environment.

```bash
git clone https://github.com/eriklindernoren/PyTorch-YOLOv3
cd PyTorch-YOLOv3/
pip3 install poetry --user
poetry install
```

Before running any of the following commands without the `poetry run` prefix, enter the virtual environment by running `poetry shell` in this directory. If you want to use the commands everywhere without opening a poetry shell, see the pip installation method below.

Download pretrained weights

```bash
./weights/download_weights.sh
```

Download COCO

```bash
./data/get_coco_dataset.sh
```

Install via pip

This installation method is recommended if you want to use this package as a dependency in another Python project. It only includes the code, is less isolated, and may conflict with other packages. Weights and the COCO dataset need to be downloaded as stated above. See the API section for further information on the package's API. This method also makes the CLI tools `yolo-detect`, `yolo-train`, and `yolo-test` available everywhere without any additional commands.

```bash
pip3 install pytorchyolo --user
```

Test

Evaluates the model on the COCO test dataset. To download this dataset and the weights, see above.

```bash
poetry run yolo-test --weights weights/yolov3.weights
```

| Model                   | mAP (min. IoU 0.5) |
| ----------------------- |:------------------:|
| YOLOv3 608 (paper)      | 57.9               |
| YOLOv3 608 (this impl.) | 57.3               |
| YOLOv3 416 (paper)      | 55.3               |
| YOLOv3 416 (this impl.) | 55.5               |
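The IoU threshold in the table header refers to intersection over union: a predicted box is only matched to a ground-truth box if their overlap ratio reaches 0.5. A minimal sketch of that overlap measure, assuming boxes in (x1, y1, x2, y2) corner format. The function name `box_iou` is illustrative and not part of this package, and the evaluation code in this repo may differ in detail.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (empty if the boxes are disjoint)
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes shifted by half their width overlap by one third,
# which is below the 0.5 threshold.
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)) >= 0.5)  # False
```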

Inference

Uses pretrained weights to make predictions on images. The table below shows the inference speed in frames per second (FPS) for input images scaled to 256x256. The ResNet backbone measurements are taken from the YOLOv3 paper. The Darknet-53 row marked "this impl." shows the speed of this implementation on my 1080ti card.

| Backbone                | GPU      | FPS      |
| ----------------------- |:--------:|:--------:|
| ResNet-101              | Titan X  | 53       |
| ResNet-152              | Titan X  | 37       |
| Darknet-53 (paper)      | Titan X  | 76       |
| Darknet-53 (this impl.) | 1080ti   | 74       |

```bash
poetry run yolo-detect --images data/samples/
```
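To compare your own hardware with the FPS figures above, a rough timing loop is usually enough. The sketch below is an assumption-laden illustration, not the procedure used to produce the table: `measure_fps` and `fps_to_ms` are hypothetical helpers, and `run_once` stands for any zero-argument callable that performs one forward pass.

```python
import time


def measure_fps(run_once, n_warmup=3, n_runs=20):
    """Estimate frames per second of a zero-argument callable."""
    # Warm-up runs are excluded so one-off setup costs do not skew the result
    for _ in range(n_warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(n_runs):
        run_once()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very fast callable
    return n_runs / max(elapsed, 1e-12)


def fps_to_ms(fps):
    """Convert a throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Hypothetical usage with the API shown later in this README:
#   fps = measure_fps(lambda: detect.detect_image(model, img))
# On a GPU, CUDA calls are asynchronous, so the callable should wait for
# the device (e.g. torch.cuda.synchronize()) before returning.
print(round(fps_to_ms(74), 1))  # 13.5
```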

Train

For argument descriptions, see `poetry run yolo-train --help`.

Example (COCO)

To train on COCO using a Darknet-53 backend pretrained on ImageNet run:

```bash
poetry run yolo-train --data config/coco.data --pretrained_weights weights/darknet53.conv.74
```

Tensorboard

Track training progress in Tensorboard:

* Initialize training
* Run the command below
* Go to http://localhost:6006/

```bash
poetry run tensorboard --logdir='logs' --port=6006
```

Storing the logs on a slow drive can slow down training significantly.

You can adjust the log directory using --logdir <path> when running tensorboard and yolo-train.

Train on Custom Dataset

Custom model

Run the command below to create a custom model definition, replacing <num-classes> with the number of classes in your dataset.

```bash
./config/create_custom_model.sh <num-classes>  # Will create custom model 'yolov3-custom.cfg'
```
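As background on why the class count matters for the model definition: in the standard YOLOv3 layout each detection layer predicts 3 anchor boxes per grid cell, and each box carries 4 coordinates, 1 objectness score, and one score per class. The convolution feeding each detection layer therefore needs 3 × (num_classes + 5) filters. A small sketch of that arithmetic follows; the helper name is hypothetical, and the generated cfg remains the reference for what the script actually writes.

```python
def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Filters in the convolution that feeds a YOLOv3 detection layer."""
    # Per anchor: 4 box coordinates + 1 objectness score + one score per class
    return anchors_per_scale * (num_classes + 5)


print(yolo_head_filters(80))  # COCO, 80 classes: 255
print(yolo_head_filters(1))   # single-class dataset: 18
```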

Classes

Add class names to data/custom/classes.names. This file should have one row per class name.

Image Folder

Move the images of your dataset to data/custom/images/.

Annotation Folder

Move your annotations to data/custom/labels/. The dataloader expects the annotation file for the image data/custom/images/train.jpg to be located at data/custom/labels/train.txt. Each row in the annotation file should define one bounding box, using the syntax label_idx x_center y_center width height. The coordinates should be scaled to [0, 1], and label_idx should be zero-indexed and correspond to the row number of the class name in data/custom/classes.names.
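If your annotations are stored as pixel-space corner boxes, they need converting to this row format. The sketch below assumes the usual convention that x values are divided by the image width and y values by the image height; `to_yolo_row` and `parse_yolo_row` are hypothetical helpers, not functions from this package.

```python
def to_yolo_row(label_idx, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space (x1, y1, x2, y2) box to one annotation row."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{label_idx} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def parse_yolo_row(row):
    """Parse and sanity-check one 'label_idx x_center y_center width height' row."""
    parts = row.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {row!r}")
    label_idx = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if not all(0.0 <= v <= 1.0 for v in coords):
        raise ValueError(f"coordinates must lie in [0, 1]: {row!r}")
    return label_idx, coords


# A 200x200 pixel box in a 400x500 image, class index 0
print(to_yolo_row(0, 100, 50, 300, 250, 400, 500))
# 0 0.500000 0.300000 0.500000 0.400000
```

Running `parse_yolo_row` over every line of every label file before training is a cheap way to catch unnormalized coordinates or missing fields.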

Define Train and Validation Sets

In data/custom/train.txt and data/custom/valid.txt, add paths to images that will be used as train and validation data respectively.
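These two files can be written by hand or generated. A minimal sketch of a random split, assuming one image path per line and an 80/20 ratio; `write_split`, the file extensions, and the ratio are illustrative choices rather than requirements of this repo.

```python
import random
from pathlib import Path


def write_split(image_dir, train_file, valid_file, valid_fraction=0.2, seed=0):
    """Shuffle the images in image_dir and write one path per line to each file."""
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
    )
    # A fixed seed keeps the split reproducible between runs
    random.Random(seed).shuffle(images)
    n_valid = int(len(images) * valid_fraction)
    valid, train = images[:n_valid], images[n_valid:]
    Path(train_file).write_text("".join(f"{p.as_posix()}\n" for p in train))
    Path(valid_file).write_text("".join(f"{p.as_posix()}\n" for p in valid))
    return len(train), len(valid)


# Hypothetical usage from the repository root:
#   write_split("data/custom/images", "data/custom/train.txt", "data/custom/valid.txt")
```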

Train

To train on the custom dataset run:

```bash
poetry run yolo-train --model config/yolov3-custom.cfg --data config/custom.data
```

Add --pretrained_weights weights/darknet53.conv.74 to train using a backend pretrained on ImageNet.

API

You are able to import the modules of this repo in your own project if you install the pip package pytorchyolo.

An example prediction call from a simple OpenCV python script would look like this:

```python
import cv2
from pytorchyolo import detect, models

# Load the YOLO model
model = models.load_model(
    "/yolov3.cfg",
    "/yolov3.weights")

# Load the image as a numpy array
img = cv2.imread("")

# Convert OpenCV bgr to rgb
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Runs the YOLO model on the image
boxes = detect.detect_image(model, img)

print(boxes)
# Output will be a numpy array in the following format:
# [[x1, y1, x2, y2, confidence, class]]
```

For more advanced usage, look at the methods' docstrings.
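Given the output format `[[x1, y1, x2, y2, confidence, class]]`, a typical next step is to drop weak detections and map class indices to names. The sketch below assumes `class_names` is a list read from a file such as data/custom/classes.names, one name per row; `filter_detections` and the 0.5 threshold are illustrative and not part of the pytorchyolo API.

```python
def filter_detections(boxes, class_names, min_confidence=0.5):
    """Keep confident detections and attach human-readable labels."""
    results = []
    # Works for a 2D numpy array as well as a plain list of rows
    for x1, y1, x2, y2, confidence, class_idx in boxes:
        if confidence < min_confidence:
            continue
        results.append({
            "label": class_names[int(class_idx)],
            "confidence": float(confidence),
            "box": (float(x1), float(y1), float(x2), float(y2)),
        })
    return results


# Example rows in the documented output format
example_boxes = [
    [10, 20, 110, 220, 0.9, 0],
    [0, 0, 5, 5, 0.2, 1],
]
for det in filter_detections(example_boxes, ["person", "bicycle"]):
    print(det["label"], det["confidence"])  # person 0.9
```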

Credit

YOLOv3: An Incremental Improvement

Joseph Redmon, Ali Farhadi

Abstract
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared to 57.5 AP50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at https://pjreddie.com/yolo/.

```
@article{yolov3,
  title={YOLOv3: An Incremental Improvement},
  author={Redmon, Joseph and Farhadi, Ali},
  journal = {arXiv},
  year={2018}
}
```

Other

YOEO — You Only Encode Once

YOEO extends this repo with the ability to train an additional semantic segmentation decoder. The lightweight example model is mainly targeted towards embedded real-time applications.

Issues

Testing data/samples/ with yolov3.weights gives some wrong bboxes that differ from this project's results

opened on 2023-03-29 04:20:36 by J-LINC

I am testing on data/samples/ using the weights trained on the COCO dataset, but the results are somewhat different from those shown by the author on the homepage. Why is that?

[Screenshots: 2023-03-29 12-18-00, 12-18-22, 12-18-38]

How should I adjust the learning rate or other settings when using a pretrained model to train on my dataset? Please give me some advice

opened on 2023-03-22 15:18:59 by J-LINC

I'm trying to train on my own dataset, BDD100K, using pretrained weights. Can you give me some advice? All parameters are unchanged, e.g. the learning rate and the learning rate decay steps. Below is the result of my 12-epoch training. I think the mAP is a little too low, but the values of each loss function shown in the diagram are very small. I am worried that if I continue without any hyperparameter changes, the model will overfit. I currently train for 250 epochs, evaluating every 10 epochs and saving the model every 50 epochs. The following images show the losses after 10 epochs of training and the evaluation results on the validation set.

[Screenshots: 2023-03-22 23-13-28, 23-16-25]

very low mAP on coco val2014 when training from scratch

opened on 2023-03-19 02:40:46 by fyw1999

The code is elegant and concise, but the training performance on coco val2014 is poor. The mAP is only 0.00912 after 24 epochs when I train the model from scratch.

Differences between the implementation in this repo and the faster_rcnn one

opened on 2022-10-05 14:38:22 by ksmdnl

Thank you for the exhaustive resources that you put in here for the community to learn from!

I've been wondering what the difference is between the implementation of mAP for Pascal VOC in this repository and the one in https://github.com/longcw/faster_rcnn_pytorch.git

After spending some time debugging, I came to the conclusion that both should be identical. However, when comparing the results, I get a better mAP using the implementation from this repo (75.6) than using the other one (71.2).

I wonder if anyone has had a similar problem or dilemma and could share their experience?

Can't train successfully with COCO2017 dataset

opened on 2022-09-19 07:50:55 by metalgear54

I'm using the latest code to train on the COCO2017 dataset, but I only get a very low mAP. Using yolov3.weights with test.py, I get mAP=65 on the validation dataset and mAP=67 on the training dataset, which shows that my dataset setup is right. But when I train on the dataset starting from the same yolov3.weights model, the mAP falls to nearly 0 after 1 or 2 epochs. Can you analyse why this happens? Even if I set the learning rate very low, such as 1e-6, the mAP still goes down with each training epoch.

Having some doubts for build_target

opened on 2022-06-08 05:29:50 by d5423197

Hello there,

After studying your code, I still have some doubts. It seems one label can be assigned as a target in each layer (13×13, 26×26, 52×52) and to many anchors, and the loss is then added for each layer. I don't know the reason behind this.

Suppose there is a big object that is better handled by the 13×13 grid. Can it also become a target in another layer?

Thanks, ZD

Erik Linder-Norén

ML engineer at Apple. Excited about machine learning, basketball and building things.
