yellowdolphin
3974d725b6
Fix warmup `accumulate` ( #3722 )
* gradient accumulation during warmup in train.py
Context:
`accumulate` is the number of batches/gradients accumulated before calling the next optimizer.step().
During warmup, it is ramped up from 1 to the final value nbs / batch_size.
Although I have not seen this in other libraries, I like the idea. During warmup, gradients are large, so overly large steps are more of an issue than gradient noise due to small steps.
The bug:
The condition for performing the optimizer step is wrong:
> if ni % accumulate == 0:
This produces irregular step sizes if `accumulate` is not constant. It becomes relevant when batch_size is small and `accumulate` changes many times during warmup.
This demo also shows the proposed solution, to use a ">=" condition instead:
https://colab.research.google.com/drive/1MA2z2eCXYB_BC5UZqgXueqL_y1Tz_XVq?usp=sharing
Further, I propose not forcing the number of warmup iterations to be >= 1000. If the user changes hyp['warmup_epochs'], this floor causes unexpected behavior. It would also make evolution unstable if this parameter were to be optimized.
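The difference between the two step conditions can be reproduced in a small simulation. This is only a sketch under stated assumptions: the integer ramp in `accumulate_at`, the helper names, and the numbers (100 warmup iterations, final `accumulate` of 16) are illustrative and not taken from train.py; `last_opt_step` follows the name used in the commit notes.

```python
def accumulate_at(ni, warmup_iters, final_accumulate):
    """Integer ramp of `accumulate` from 1 to its final value (illustrative)."""
    if ni >= warmup_iters:
        return final_accumulate
    return 1 + (final_accumulate - 1) * ni // warmup_iters


def step_indices(n_batches, warmup_iters, final_accumulate, fixed):
    """Return the batch indices ni at which optimizer.step() would be called."""
    steps = []
    last_opt_step = -1  # index of the most recent optimizer step
    for ni in range(n_batches):
        accumulate = accumulate_at(ni, warmup_iters, final_accumulate)
        if fixed:
            do_step = ni - last_opt_step >= accumulate  # proposed ">=" condition
        else:
            do_step = ni % accumulate == 0  # original modulo condition
        if do_step:
            steps.append(ni)
            last_opt_step = ni
    return steps


def gaps(steps):
    """Number of batches between consecutive optimizer steps."""
    return [b - a for a, b in zip(steps, steps[1:])]


# Example: batch_size = 4 with nbs = 64 gives a final accumulate of 16
buggy = gaps(step_indices(200, 100, 16, fixed=False))
fixed = gaps(step_indices(200, 100, 16, fixed=True))
print("modulo condition, max gap:", max(buggy))
print(">= condition, max gap:", max(fixed))
```

In this toy setup the modulo condition yields gaps that shrink and grow irregularly and at one point exceed the final `accumulate`, while the ">=" condition yields gaps that grow monotonically and never exceed it.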
* replace last_opt_step tracking by do_step(ni)
* add docstrings
* move down nw
* Update train.py
* revert math import move
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
3 years ago
Glenn Jocher
92d49fde35
Update seeds for single-GPU reproducibility ( #3789 )
For seed=0 on single-GPU.
3 years ago
Piotr Skalski
09246a5a33
fix/incorrect_fitness_import ( #3770 )
3 years ago
Glenn Jocher
f2d97ebb25
Remove DDP MultiHeadAttention fix ( #3768 )
3 years ago
Glenn Jocher
f79d7479da
Add optional dataset.yaml `path` attribute ( #3753 )
* Add optional dataset.yaml `path` attribute
@KalenMike
* pass locals to python scripts
* handle lists
* update coco128.yaml
* Capitalize first letter
* add test key
* finalize GlobalWheat2020.yaml
* finalize objects365.yaml
* finalize SKU-110K.yaml
* finalize SKU-110K.yaml
* finalize VisDrone.yaml
* NoneType fix
* update download comment
* voc to VOC
* update
* update VOC.yaml
* update VOC.yaml
* remove dashes
* delete get_voc.sh
* force coco and coco128 to ../datasets
* Capitalize Argoverse_HD.yaml
* Capitalize Objects365.yaml
* update Argoverse_HD.yaml
* coco segments fix
* VOC single-thread
* update Argoverse_HD.yaml
* update data_dict in test handling
* create root
3 years ago
Glenn Jocher
ae4261c774
Force non-zero hyp evolution weights `w` ( #3748 )
Fix for https://github.com/ultralytics/yolov5/issues/3741
3 years ago
Glenn Jocher
fdc22398fa
Create `data/hyps` directory ( #3747 )
3 years ago
Glenn Jocher
1f69d12591
Update 4 main ops for paths and .run() ( #3715 )
* Add yolov5/ to path
* rename functions to run()
* cleanup
* rename fix
* CI fix
* cleanup find models/export.py
3 years ago
Ayush Chaurasia
75c0ff43af
[x]W&B: Don't resume transfer learning runs ( #3604 )
* Allow config change
* Allow val change in wandb config
* Don't resume transfer learning runs
* Add entity in log dataset
3 years ago
Glenn Jocher
e8810a53e8
Update DDP backend `if dist.is_nccl_available()` ( #3705 )
3 years ago
Glenn Jocher
fbf41e0913
Add `train.run()` method ( #3700 )
* Update train.py explicit arguments
* Update train.py
* Add run method
3 years ago
Glenn Jocher
c1af67dcd4
Add torch DP warning ( #3698 )
3 years ago
Glenn Jocher
b3e2f4e08d
Eliminate `total_batch_size` variable ( #3697 )
* Eliminate `total_batch_size` variable
* cleanup
* Update train.py
3 years ago
Glenn Jocher
fad27c0046
Update DDP for `torch.distributed.run` with `gloo` backend ( #3680 )
* Update DDP for `torch.distributed.run`
* Add LOCAL_RANK
* remove opt.local_rank
* backend="gloo|nccl"
* print
* print
* debug
* debug
* os.getenv
* gloo
* gloo
* gloo
* cleanup
* fix getenv
* cleanup
* cleanup destroy
* try nccl
* return opt
* add --local_rank
* add timeout
* add init_method
* gloo
* move destroy
* move destroy
* move print(opt) under if RANK
* destroy only RANK 0
* move destroy inside train()
* restore destroy outside train()
* update print(opt)
* cleanup
* nccl
* gloo with 60 second timeout
* update namespace printing
3 years ago
lb-desupervised
bfb2276b1d
Slightly modify CLI execution ( #3687 )
* Slightly modify CLI execution
This simple change makes it easier to run the primary functions of this
repo (train/detect/test) from within Python. An object which represents
`opt` can be constructed and fed to the `main` function of each of these
modules, rather than having to call the lower level functions directly,
or run the module as a script.
* Update export.py
Add CLI parsing update for more convenient module usage within Python.
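The pattern described above, building an `opt` object in Python and passing it to `main`, might look roughly like the following. This is a hedged sketch, not the repository's code: `parse_opt`, the two flags, their defaults, and the body of `main` are placeholders chosen for illustration.

```python
import argparse


def parse_opt(argv=None):
    """Build `opt` from a list of CLI-style arguments (hypothetical flags)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='model.pt')
    parser.add_argument('--epochs', type=int, default=300)
    return parser.parse_args(argv)


def main(opt):
    """Stand-in for a module entry point; only reads attributes from `opt`."""
    return f"training {opt.weights} for {opt.epochs} epochs"


# From Python, any object exposing the expected attributes can be passed in
opt = argparse.Namespace(weights='custom.pt', epochs=3)
print(main(opt))

# CLI-style arguments still work when given as a list
print(main(parse_opt(['--epochs', '5'])))
```

Because `main` only reads attributes from `opt`, callers are free to construct it however they like, which is what makes the module usable without going through the command line.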
Co-authored-by: Lewis Belcher <lb@desupervised.io>
3 years ago
Glenn Jocher
2296f1546f
Update `WORLD_SIZE` and `RANK` retrieval ( #3670 )
3 years ago
Glenn Jocher
045d5d8629
Update TensorBoard ( #3669 )
3 years ago
Glenn Jocher
fa201f968e
Update `train(hyp, *args)` to accept `hyp` file or dict ( #3668 )
3 years ago
Glenn Jocher
6d6e2ca65f
Update train.py ( #3667 )
3 years ago
Wei Quan
4c5d9bff80
Fix incorrect end epoch comment ( #3612 )
3 years ago
Glenn Jocher
4984cf54be
train.py GPU memory fix ( #3590 )
* train.py GPU memory fix
* ema
* cuda
* cuda
* zeros input
* to device
* batch index 0
3 years ago
Glenn Jocher
4695ca8314
Refactoring cleanup ( #3565 )
* Refactoring cleanup
* Update test.py
3 years ago
Glenn Jocher
5948f20a3d
Update test.py profiling ( #3555 )
* Update test.py profiling
* half_precision to half
* inplace
3 years ago
Glenn Jocher
63157d214d
Remove `is_coco` argument from `test()` ( #3553 )
3 years ago
Glenn Jocher
958ab92dc1
Remove `opt` from `create_dataloader()` ( #3552 )
3 years ago
Glenn Jocher
ef0b5c9d29
On-demand `pycocotools` pip install ( #3547 )
3 years ago
Glenn Jocher
f3c3d2ce5d
Merge `develop` branch into `master` ( #3518 )
* update ci-testing.yml (#3322 )
* update ci-testing.yml
* update greetings.yml
* bring back os matrix
* Enable direct `--weights URL` definition (#3373 )
* Enable direct `--weights URL` definition
@KalenMike this PR will enable direct --weights URL definition. Example use case:
```
python train.py --weights https://storage.googleapis.com/bucket/dir/model.pt
```
* cleanup
* bug fixes
* weights = attempt_download(weights)
* Update experimental.py
* Update hubconf.py
* return bug fix
* comment mirror
* min_bytes
* Update tutorial.ipynb (#3368 )
add Open in Kaggle badge
* `cv2.imread(img, -1)` for IMREAD_UNCHANGED (#3379 )
* Update datasets.py
* comment
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
* COCO evolution fix (#3388 )
* COCO evolution fix
* cleanup
* update print
* print fix
* Create `is_pip()` function (#3391 )
Returns `True` if file is part of pip package. Useful for contextual behavior modification.
```python
from pathlib import Path


def is_pip():
    # Is file in a pip package?
    return 'site-packages' in Path(__file__).absolute().parts
```
* Revert "`cv2.imread(img, -1)` for IMREAD_UNCHANGED (#3379 )" (#3395 )
This reverts commit 21a9607e00.
* Update FLOPs description (#3422 )
* Update README.md
* Changing FLOPS to FLOPs.
Co-authored-by: BuildTools <unconfigured@null.spigotmc.org>
* Parse URL authentication (#3424 )
* Parse URL authentication
* urllib.parse.unquote()
* improved error handling
* improved error handling
* remove %3F
* update check_file()
* Add FLOPs title to table (#3453 )
* Suppress jit trace warning + graph once (#3454 )
* Suppress jit trace warning + graph once
Suppress harmless jit trace warning on TensorBoard add_graph call. Also fix multiple add_graph() calls bug, now only on batch 0.
* Update train.py
* Update MixUp augmentation `alpha=beta=32.0` (#3455 )
Per VOC empirical results https://github.com/ultralytics/yolov5/issues/3380#issuecomment-853001307 by @developer0hye
* Add `timeout()` class (#3460 )
* Add `timeout()` class
* rearrange order
* Faster HSV augmentation (#3462 )
remove datatype conversion process that can be skipped
* Add `check_git_status()` 5 second timeout (#3464 )
* Add check_git_status() 5 second timeout
This should prevent the SSH Git bug that we were discussing @KalenMike
* cleanup
* replace timeout with check_output built-in timeout
* Improved `check_requirements()` offline-handling (#3466 )
Improve robustness of `check_requirements()` function to offline environments (do not attempt pip installs when offline).
* Add `output_names` argument for ONNX export with dynamic axes (#3456 )
* Add output names & dynamic axes for onnx export
Add output_names and dynamic_axes names for all outputs in torch.onnx.export. The first four outputs of the model will have names output0, output1, output2, output3
* use first output only + cleanup
Co-authored-by: Samridha Shrestha <samridha.shrestha@g42.ai>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
* Revert FP16 `test.py` and `detect.py` inference to FP32 default (#3423 )
* fixed inference bug while using half precision
* replace --use-half with --half
* replace space and PEP8 in detect.py
* PEP8 detect.py
* update --half help comment
* Update test.py
* revert space
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
* Add additional links/resources to stale.yml message (#3467 )
* Update stale.yml
* cleanup
* Update stale.yml
* reformat
* Update stale.yml HUB URL (#3468 )
* Stale `github.actor` bug fix (#3483 )
* Explicit `model.eval()` call `if opt.train=False` (#3475 )
* call model.eval() when opt.train is False
call model.eval() when opt.train is False
* single-line if statement
* cleanup
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
* check_requirements() exclude `opencv-python` (#3495 )
Fix for 3rd party or contrib versions of installed OpenCV as in https://github.com/ultralytics/yolov5/issues/3494 .
* Earlier `assert` for cpu and half option (#3508 )
* early assert for cpu and half option
early assert for cpu and half option
* Modified comment
Modified comment
* Update tutorial.ipynb (#3510 )
* Reduce test.py results spacing (#3511 )
* Update README.md (#3512 )
* Update README.md
Minor modifications
* 850 width
* Update greetings.yml
revert greeting change as PRs will now merge to master.
Co-authored-by: Piotr Skalski <SkalskiP@users.noreply.github.com>
Co-authored-by: SkalskiP <piotr.skalski92@gmail.com>
Co-authored-by: Peretz Cohen <pizzaz93@users.noreply.github.com>
Co-authored-by: tudoulei <34886368+tudoulei@users.noreply.github.com>
Co-authored-by: chocosaj <chocosaj@users.noreply.github.com>
Co-authored-by: BuildTools <unconfigured@null.spigotmc.org>
Co-authored-by: Yonghye Kwon <developer.0hye@gmail.com>
Co-authored-by: Sam_S <SamSamhuns@users.noreply.github.com>
Co-authored-by: Samridha Shrestha <samridha.shrestha@g42.ai>
Co-authored-by: edificewang <609552430@qq.com>
3 years ago
Glenn Jocher
4aa2959101
Suppress jit trace warning + graph once ( #3454 )
* Suppress jit trace warning + graph once
Suppress harmless jit trace warning on TensorBoard add_graph call. Also fix multiple add_graph() calls bug, now only on batch 0.
* Update train.py
3 years ago
Glenn Jocher
4b52e19a61
COCO evolution fix ( #3388 )
* COCO evolution fix
* cleanup
* update print
* print fix
3 years ago
Glenn Jocher
ba6f3f974b
Enable direct `--weights URL` definition ( #3373 )
* Enable direct `--weights URL` definition
@KalenMike this PR will enable direct --weights URL definition. Example use case:
```
python train.py --weights https://storage.googleapis.com/bucket/dir/model.pt
```
* cleanup
* bug fixes
* weights = attempt_download(weights)
* Update experimental.py
* Update hubconf.py
* return bug fix
* comment mirror
* min_bytes
3 years ago
Glenn Jocher
aad99b63d6
TensorBoard DP/DDP graph fix ( #3325 )
3 years ago
Charles Frye
19100ba007
Improves docs and handling of entities and resuming by WandbLogger ( #3264 )
* adds latest tag to match wandb defaults
* adds entity handling, 'last' tag
* fixes bug causing finished runs to resume
* removes redundant "last" tag for wandb artifact
3 years ago
Glenn Jocher
dd7f0b7e05
Fix TypeError: 'PosixPath' object is not iterable ( #3285 )
3 years ago
Glenn Jocher
10d56d784e
Assert `--image-weights` not combined with DDP ( #3275 )
3 years ago
Glenn Jocher
b7cd1f540d
TensorBoard add_graph() fix ( #3236 )
3 years ago
Glenn Jocher
60fe54449d
Update train.py ( #3099 )
3 years ago
Glenn Jocher
a833ee2a46
Update check_requirements() exclude list ( #2974 )
3 years ago
Glenn Jocher
f7bc685c2c
Implement yaml.safe_load() ( #2876 )
* Implement yaml.safe_load()
* yaml.safe_dump()
3 years ago
Burhan
c949fc86d1
Detection cropping+saving feature addition for detect.py and PyTorch Hub ( #2827 )
* Update detect.py
* Update detect.py
* Update greetings.yml
* Update cropping
* cleanup
* Update increment_path()
* Update common.py
* Update detect.py
* Update detect.py
* Update detect.py
* Update common.py
* cleanup
* Update detect.py
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
3 years ago
Glenn Jocher
6dd1083bbb
Tensorboard model visualization bug fix ( #2758 )
This fix should allow for visualizing YOLOv5 model graphs correctly in Tensorboard by uncommenting line 335 in train.py:
```python
if tb_writer:
    tb_writer.add_graph(torch.jit.trace(model, imgs, strict=False), [])  # add model graph
```
The problem was that the detect() layer checks the input size to adapt the grid if required, and tracing does not seem to like this shape check (even if the shape is fine and no grid recomputation is required). The following will warn:
0cae7576a9/train.py (L335)
Solution is below. This is a YOLOv5s model displayed in TensorBoard. You can see the Detect() layer merging the 3 layers into a single output for example, and everything appears to work and visualize correctly.
```python
tb_writer.add_graph(torch.jit.trace(model, imgs, strict=False), [])
```
<img width="893" alt="Screenshot 2021-04-11 at 01 10 09" src="https://user-images.githubusercontent.com/26833433/114286928-349bd600-9a63-11eb-941f-7139ee6cd602.png">
3 years ago
Ding Yiwei
1148e2ea63
Add TransformerLayer, TransformerBlock, C3TR modules ( #2333 )
* yolotr
* transformer block
* Remove bias in Transformer
* Remove C3T
* Remove a deprecated class
* put the 2nd LayerNorm into the 2nd residual block
* move example model to models/hub, rename to -transformer
* Add module comments and TODOs
* Remove LN in Transformer
* Add comments for Transformer
* Solve the problem of MA with DDP
* cleanup
* cleanup find_unused_parameters
* PEP8 reformat
Co-authored-by: DingYiwei <846414640@qq.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
3 years ago
Phat Tran
9c803f2f7e
Add --label-smoothing eps argument to train.py (default 0.0) ( #2344 )
* Add label smoothing option
* Correct data type
* add_log
* Remove log
* Add log
* Update loss.py
remove comment (too verbose)
Co-authored-by: phattran <phat.tranhoang@cyberlogitec.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
3 years ago
Ayush Chaurasia
518c09578e
W&B resume ddp from run link fix ( #2579 )
* W&B resume ddp from run link fix
* Native DDP W&B support for training, resuming
3 years ago
Ayush Chaurasia
dc51e80b00
Fix: evolve with wandb ( #2634 )
3 years ago
Glenn Jocher
9f98201dd9
W&B DDP fix 2 ( #2587 )
Revert unintentional change to test batch sizes caused by PR https://github.com/ultralytics/yolov5/pull/2125
3 years ago
Glenn Jocher
e5b0200cd2
Update tensorboard>=2.4.1 ( #2576 )
* Update tensorboard>=2.4.1
Update tensorboard version to attempt to address https://github.com/ultralytics/yolov5/issues/2573 (tensorboard logging fail in Docker image).
* cleanup
3 years ago
Ayush Chaurasia
1bf9365280
W&B DDP fix ( #2574 )
3 years ago
Ayush Chaurasia
e8fc97aa38
Improved W&B integration ( #2125 )
* Init Commit
* new wandb integration
* Update
* Use data_dict in test
* Updates
* Update: scope of log_img
* Update: scope of log_img
* Update
* Update: Fix logging conditions
* Add tqdm bar, support for .txt dataset format
* Improve Result table Logger
* Add dataset creation in training script
* Change scope: self.wandb_run
* Add wandb-artifact:// natively
you can now use --resume with wandb run links
* Add support for logging dataset while training
* Cleanup
* Fix: Merge conflict
* Fix: CI tests
* Automatically use wandb config
* Fix: Resume
* Fix: CI
* Enhance: Using val_table
* More resume enhancement
* FIX : CI
* Add alias
* Get useful opt config data
* train.py cleanup
* Cleanup train.py
* more cleanup
* Cleanup| CI fix
* Reformat using PEP8
* FIX:CI
* rebase
* remove unnecessary changes
* remove unnecessary changes
* remove unnecessary changes
* remove unnecessary change from test.py
* FIX: resume from local checkpoint
* FIX:resume
* FIX:resume
* Reformat
* Performance improvement
* Fix local resume
* Fix local resume
* FIX:CI
* Fix: CI
* Improve image logging
* (:(:Redo CI tests:):)
* Remember epochs when resuming
* Remember epochs when resuming
* Update DDP location
Potential fix for #2405
* PEP8 reformat
* 0.25 confidence threshold
* reset train.py plots syntax to previous
* reset epochs completed syntax to previous
* reset space to previous
* remove brackets
* reset comment to previous
* Update: is_coco check, remove unused code
* Remove redundant print statement
* Remove wandb imports
* remove dsviz logger from test.py
* Remove redundant change from test.py
* remove redundant changes from train.py
* reformat and improvements
* Fix typo
* Add tqdm progress when scanning files, naming improvements
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
3 years ago
Glenn Jocher
08d4918d7f
labels.jpg class names ( #2454 )
* labels.png class names
* fontsize=10
3 years ago
Glenn Jocher
f01f3223d5
Integer printout ( #2450 )
* Integer printout
* test.py 'Labels'
* Update train.py
3 years ago