* Autofix duplicate labels
This PR changes duplicate-label handling: instead of reporting an error and ignoring the image-label pair, it now reports a warning and autofixes the pair.
This should fix a common issue for users and let everyone get started and train a model faster and more easily than before.
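A minimal standalone sketch of the idea (illustrative only, not the repository code); label rows are assumed to be `[class, x, y, w, h]`, and exact duplicate rows are dropped with a warning rather than rejecting the image:
```python
import numpy as np

# Illustrative labels for one image: [class, x, y, w, h], with one duplicate row
labels = np.array([[0, 0.5, 0.5, 0.2, 0.2],
                   [0, 0.5, 0.5, 0.2, 0.2],   # duplicate of the first row
                   [1, 0.3, 0.3, 0.1, 0.1]])

_, idx = np.unique(labels, axis=0, return_index=True)  # indices of unique rows
if len(idx) < len(labels):
    print(f'WARNING: {len(labels) - len(idx)} duplicate labels removed')  # warn instead of erroring
    labels = labels[np.sort(idx)]  # keep unique rows in their original order
```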
* sign fix
* Cleanup
* Increment cache version
* all to any fix
* Add train class filter feature to datasets.py
Allows training on a subset of the total classes if the `include_class` list is defined in datasets.py at L448:
```python
include_class = [] # filter labels to include only these classes (optional)
```
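A minimal sketch of how such a filter could be applied (names and layout are illustrative, not the repository implementation); label rows are assumed to be `[class, x, y, w, h]`:
```python
import numpy as np

include_class = [0, 2]  # keep only these class ids (empty list = keep everything)

# Illustrative label array for one image: [class, x, y, w, h]
labels = np.array([[0, 0.5, 0.5, 0.2, 0.2],
                   [1, 0.3, 0.3, 0.1, 0.1],
                   [2, 0.7, 0.7, 0.4, 0.4]])

if include_class:
    keep = np.isin(labels[:, 0], include_class)  # True for rows whose class is included
    labels = labels[keep]  # rows for excluded classes are dropped before training
```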
* segments fix
This PR should produce datasets sorted alphabetically by filename. The cache version is incremented to 0.5.
Note: this will force a one-time re-caching of existing datasets on first use.
Thank you for sharing this nice open-source code 👍
I applied a shuffle to the order of all 4 (or 9) images in mosaic augmentation.
Currently, the order of images in mosaic augmentation is not completely random.
Only the images after the first are randomly arranged, so shuffling all of them increases the diversity of the data composition.
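A minimal sketch of the change under assumed names (`index` for the current image, `dataset_indices` for the pool of dataset indices); the point is simply that the whole list is shuffled, not just the extra images:
```python
import random

# Build the image list for a 4-image mosaic: the current image plus 3 random companions
indices = [index] + random.choices(dataset_indices, k=3)
random.shuffle(indices)  # shuffle all 4 so the current image is not always the top-left tile
```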
The Auto-fix corrupt JPEGs PR introduced a bug whereby the f.seek() operation read all of the bytes in the image, leaving the PIL image with nothing to read during the .save() operation.
The fix was to re-open the image using PIL before saving.
* Autofix corrupt JPEGs
This PR automatically re-saves corrupt JPEGs and trains with the re-saved images. WARNING: this will overwrite the existing corrupt JPEGs in a dataset and replace them with valid JPEGs, though the filesize may increase and the image contents may not be exactly the same due to lossy JPEG compression. Results may vary by JPEG decoder and hardware.
The current behavior is to exclude corrupt JPEGs from training with a warning to the user, but many users have complained about large parts of their datasets being excluded from training.
* Clarify re-save reason
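A minimal sketch of the detect-and-resave idea, assuming the corruption check is a missing JPEG end-of-image marker (illustrative, not the exact repository code):
```python
from PIL import Image

def resave_if_corrupt_jpeg(path):
    # A valid JPEG ends with the EOI marker b'\xff\xd9'; if it is missing,
    # re-open the file with PIL and re-save it, which rewrites a valid JPEG
    # (lossy, so filesize and pixel values may change slightly).
    with open(path, 'rb') as f:
        f.seek(-2, 2)                       # seek to the last two bytes
        corrupt = f.read() != b'\xff\xd9'
    if corrupt:
        Image.open(path).save(path, format='JPEG', subsampling=0, quality=100)
        print(f'WARNING: corrupt JPEG restored and re-saved: {path}')
```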
* Add models/tf.py for TensorFlow and TFLite export
* Set auto=False for int8 calibration
* Update requirements.txt for TensorFlow and TFLite export
* Read anchors directly from PyTorch weights
* Add --tf-nms to append NMS in TensorFlow SavedModel and GraphDef export
* Remove check_anchor_order, check_file, set_logging from import
* Reformat code and optimize imports
* Autodownload model and check cfg
* update --source path, img-size to 320, single output
* Adjust representative_dataset
* Put representative dataset in tfl_int8 block
* detect.py TF inference
* weights to string
* weights to string
* cleanup tf.py
* Add --dynamic-batch-size
* Add xywh normalization to reduce calibration error
* Update requirements.txt
TensorFlow 2.3.1 -> 2.4.0 to avoid int8 quantization error
* Fix imports
Move C3 from models.experimental to models.common
* implement C3() and SiLU()
* Fix reshape dim to support dynamic batching
* Add epsilon argument in tf_BN, which is different between TF and PT
* Set stride to None if not using PyTorch, and do not warmup without PyTorch
* Add list support in check_img_size()
* Add list input support in detect.py
* sys.path.append('./') to run from yolov5/
* Add int8 quantization support for TensorFlow 2.5 (see the sketch after this commit list)
* Add get_coco128.sh
* Remove --no-tfl-detect in models/tf.py (Use tf-android-tfl-detect branch for EdgeTPU)
* Update requirements.txt
* Replace torch.load() with attempt_load()
* Update requirements.txt
* Add --tf-raw-resize to set half_pixel_centers=False
* Add --agnostic-nms for TF class-agnostic NMS
* Cleanup after merge
* Cleanup2 after merge
* Cleanup3 after merge
* Add tf.py docstring with credit and usage
* pb saved_model and tflite use only one model in detect.py
* Add use cases in docstring of tf.py
* Remove redundant `stride` definition
* Remove keras direct import
* Fix `check_requirements(('tensorflow>=2.4.1',))`
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
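As a rough illustration of the int8 calibration flow referenced in the commits above (representative dataset, int8 TFLite export), here is a minimal sketch using the standard TensorFlow Lite converter API; `keras_model` and `calibration_images` are assumed to exist, and this is not the repository's tf.py code:
```python
import tensorflow as tf  # TF >= 2.4 to avoid the int8 quantization error noted above

def representative_dataset():
    # Yield a handful of preprocessed images (e.g. from coco128) as float32 batches;
    # the converter uses them to calibrate int8 activation ranges.
    for img in calibration_images:                  # assumed iterable of HxWx3 uint8 arrays
        yield [img[None].astype('float32') / 255.0]

converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)  # assumed Keras model
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8           # fully-int8 inputs/outputs
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

with open('model-int8.tflite', 'wb') as f:
    f.write(tflite_model)
```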
* Add cache-on-disk and cache-directory to cache images on disk (see the sketch after this commit list)
* Fix load_image with cache_on_disk
* Add no_cache flag for load_image
* Revert the parts ('logging' and a newline) that do not need to be modified
* Add the assertion for shapes of cached images
* Add a suffix string for cached images
* Fix boundary-error of letterbox for load_mosaic
* Add prefix as cache-key of cache-on-disk
* Update cache-function on disk
* Add psutil in requirements.txt
* Update train.py
* Cleanup1
* Cleanup2
* Skip existing npy
* Include re-space
* Export return character fix
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
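A minimal sketch of the cache-on-disk idea from the commits above (illustrative names, not the repository implementation): decoded images are stored as .npy files in a cache directory so later epochs skip JPEG decoding, and existing .npy files are reused:
```python
from pathlib import Path
import cv2
import numpy as np

def load_image_cached(img_path, cache_dir):
    npy = Path(cache_dir) / (Path(img_path).stem + '.npy')
    if npy.exists():                      # skip existing npy (already cached)
        return np.load(str(npy))
    im = cv2.imread(str(img_path))        # decoded BGR HxWx3 array
    assert im is not None, f'Image not found: {img_path}'
    np.save(str(npy), im)                 # cache the decoded image on disk
    return im
```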
* Added the recording feature for multiple streams
Thanks for the very cool repo!!
I was trying to record multiple feeds at the same time, but the current version of the detector only had one video writer and one vid_path!
So the streams were not being saved; each was only initialized with a single frame, and the whole run was never recorded.
Fix:
I made lists of `vid_writer` and `vid_path`, and the index `i` from the loop over `pred` selects the writer that needs to do the work!
I hope this helps, thanks!
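A minimal sketch of the fix, with assumed names (`pred`, `fps`, `w`, `h`, `frame`, `num_streams`) standing in for values that come from the stream loader and the detection loop:
```python
import cv2

vid_path, vid_writer = [None] * num_streams, [None] * num_streams  # one slot per stream

for i, det in enumerate(pred):                 # i indexes the stream this detection came from
    save_path = f'stream_{i}.mp4'              # illustrative per-stream output path
    if vid_path[i] != save_path:               # new video for this stream slot
        vid_path[i] = save_path
        if isinstance(vid_writer[i], cv2.VideoWriter):
            vid_writer[i].release()            # release the previous writer for this slot
        vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
    vid_writer[i].write(frame)                 # write this stream's annotated frame
```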
* Cleanup list lengths
* batch size variable
* Update datasets.py
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
* Create short-circuit in augment_hsv when hyperparameters are zero
* implement faster opt-in
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
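A minimal sketch of the short-circuit, with a simplified HSV jitter standing in for the repository's implementation; the early `return` is the point of the change:
```python
import cv2
import numpy as np

def augment_hsv(im, hgain=0.5, sgain=0.5, vgain=0.5):
    # Short-circuit: when all three gains are zero the augmentation is a no-op,
    # so return before paying for the BGR->HSV->BGR round trip.
    if not (hgain or sgain or vgain):
        return
    r = np.random.uniform(-1, 1, 3) * [hgain, sgain, vgain] + 1       # random per-channel gains
    im_hsv = cv2.cvtColor(im, cv2.COLOR_BGR2HSV).astype(np.float32)
    im_hsv[..., 0] = (im_hsv[..., 0] * r[0]) % 180                    # hue wraps at 180 in OpenCV
    im_hsv[..., 1:] = np.clip(im_hsv[..., 1:] * r[1:], 0, 255)        # saturation and value
    cv2.cvtColor(im_hsv.astype(im.dtype), cv2.COLOR_HSV2BGR, dst=im)  # write back in place
```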
@KalenMike this is a PR to add image filenames and labels to our stats dictionary and to save the dictionary to JSON. The save location is next to the train labels.cache file. The single JSON contains all stats for the entire dataset.
Usage example:
```python
from utils.datasets import *
dataset_stats('coco128.yaml', verbose=True)
```