* update * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * reformat * Single-line argparser argument * Update README.md * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update README.md * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>modifyDataloader
@@ -475,7 +475,7 @@ def parse_opt(known=False): | |||
# Weights & Biases arguments | |||
parser.add_argument('--entity', default=None, help='W&B: Entity') | |||
parser.add_argument('--upload_dataset', action='store_true', help='W&B: Upload dataset as artifact table') | |||
parser.add_argument('--upload_dataset', nargs='?', const=True, default=False, help='W&B: Upload data, "val" option') | |||
parser.add_argument('--bbox_interval', type=int, default=-1, help='W&B: Set bounding-box image logging interval') | |||
parser.add_argument('--artifact_alias', type=str, default='latest', help='W&B: Version of dataset artifact to use') | |||
@@ -2,6 +2,7 @@ | |||
* [About Weights & Biases](#about-weights-&-biases) | |||
* [First-Time Setup](#first-time-setup) | |||
* [Viewing runs](#viewing-runs) | |||
* [Disabling wandb](#disabling-wandb) | |||
* [Advanced Usage: Dataset Versioning and Evaluation](#advanced-usage) | |||
* [Reports: Share your work with the world!](#reports) | |||
@@ -49,31 +50,36 @@ Run information streams from your environment to the W&B cloud console as you tr | |||
* Environment: OS and Python types, Git repository and state, **training command** | |||
<p align="center"><img width="900" alt="Weights & Biases dashboard" src="https://user-images.githubusercontent.com/26833433/135390767-c28b050f-8455-4004-adb0-3b730386e2b2.png"></p> | |||
</details> | |||
## Disabling wandb | |||
* training after running `wandb disabled` inside that directory creates no wandb run | |||
![Screenshot (84)](https://user-images.githubusercontent.com/15766192/143441777-c780bdd7-7cb4-4404-9559-b4316030a985.png) | |||
</details> | |||
* To enable wandb again, run `wandb online` | |||
![Screenshot (85)](https://user-images.githubusercontent.com/15766192/143441866-7191b2cb-22f0-4e0f-ae64-2dc47dc13078.png) | |||
## Advanced Usage | |||
You can leverage W&B artifacts and Tables integration to easily visualize and manage your datasets, models and training evaluations. Here are some quick examples to get you started. | |||
<details open> | |||
<h3>1. Visualize and Version Datasets</h3> | |||
Log, visualize, dynamically query, and understand your data with <a href='https://docs.wandb.ai/guides/data-vis/tables'>W&B Tables</a>. You can use the following command to log your dataset as a W&B Table. This will generate a <code>{dataset}_wandb.yaml</code> file which can be used to train from dataset artifact. | |||
<details> | |||
<h3> 1: Train and Log Evaluation simultaneousy </h3> | |||
This is an extension of the previous section, but it'll also training after uploading the dataset. <b> This also evaluation Table</b> | |||
Evaluation table compares your predictions and ground truths across the validation set for each epoch. It uses the references to the already uploaded datasets, | |||
so no images will be uploaded from your system more than once. | |||
<details open> | |||
<summary> <b>Usage</b> </summary> | |||
<b>Code</b> <code> $ python utils/logger/wandb/log_dataset.py --project ... --name ... --data .. </code> | |||
<b>Code</b> <code> $ python train.py --upload_data val</code> | |||
![Screenshot (64)](https://user-images.githubusercontent.com/15766192/128486078-d8433890-98a3-4d12-8986-b6c0e3fc64b9.png) | |||
![Screenshot from 2021-11-21 17-40-06](https://user-images.githubusercontent.com/15766192/142761183-c1696d8c-3f38-45ab-991a-bb0dfd98ae7d.png) | |||
</details> | |||
<h3> 2: Train and Log Evaluation simultaneousy </h3> | |||
This is an extension of the previous section, but it'll also training after uploading the dataset. <b> This also evaluation Table</b> | |||
Evaluation table compares your predictions and ground truths across the validation set for each epoch. It uses the references to the already uploaded datasets, | |||
so no images will be uploaded from your system more than once. | |||
<h3>2. Visualize and Version Datasets</h3> | |||
Log, visualize, dynamically query, and understand your data with <a href='https://docs.wandb.ai/guides/data-vis/tables'>W&B Tables</a>. You can use the following command to log your dataset as a W&B Table. This will generate a <code>{dataset}_wandb.yaml</code> file which can be used to train from dataset artifact. | |||
<details> | |||
<summary> <b>Usage</b> </summary> | |||
<b>Code</b> <code> $ python utils/logger/wandb/log_dataset.py --data .. --upload_data </code> | |||
<b>Code</b> <code> $ python utils/logger/wandb/log_dataset.py --project ... --name ... --data .. </code> | |||
![Screenshot (72)](https://user-images.githubusercontent.com/15766192/128979739-4cf63aeb-a76f-483f-8861-1c0100b938a5.png) | |||
![Screenshot (64)](https://user-images.githubusercontent.com/15766192/128486078-d8433890-98a3-4d12-8986-b6c0e3fc64b9.png) | |||
</details> | |||
<h3> 3: Train using dataset artifact </h3> | |||
@@ -81,7 +87,7 @@ You can leverage W&B artifacts and Tables integration to easily visualize and ma | |||
can be used to train a model directly from the dataset artifact. <b> This also logs evaluation </b> | |||
<details> | |||
<summary> <b>Usage</b> </summary> | |||
<b>Code</b> <code> $ python utils/logger/wandb/log_dataset.py --data {data}_wandb.yaml </code> | |||
<b>Code</b> <code> $ python train.py --data {data}_wandb.yaml </code> | |||
![Screenshot (72)](https://user-images.githubusercontent.com/15766192/128979739-4cf63aeb-a76f-483f-8861-1c0100b938a5.png) | |||
</details> | |||
@@ -123,7 +129,6 @@ Any run can be resumed using artifacts if the <code>--resume</code> argument sta | |||
</details> | |||
<h3> Reports </h3> | |||
W&B Reports can be created from your saved runs for sharing online. Once a report is created you will receive a link you can use to publically share your results. Here is an example report created from the COCO128 tutorial trainings of all four YOLOv5 models ([link](https://wandb.ai/glenn-jocher/yolov5_tutorial/reports/YOLOv5-COCO128-Tutorial-Results--VmlldzozMDI5OTY)). | |||
@@ -202,7 +202,6 @@ class WandbLogger(): | |||
config_path = self.log_dataset_artifact(opt.data, | |||
opt.single_cls, | |||
'YOLOv5' if opt.project == 'runs/train' else Path(opt.project).stem) | |||
LOGGER.info(f"Created dataset config file {config_path}") | |||
with open(config_path, errors='ignore') as f: | |||
wandb_data_dict = yaml.safe_load(f) | |||
return wandb_data_dict | |||
@@ -244,7 +243,9 @@ class WandbLogger(): | |||
if self.val_artifact is not None: | |||
self.result_artifact = wandb.Artifact("run_" + wandb.run.id + "_progress", "evaluation") | |||
self.result_table = wandb.Table(["epoch", "id", "ground truth", "prediction", "avg_confidence"]) | |||
columns = ["epoch", "id", "ground truth", "prediction"] | |||
columns.extend(self.data_dict['names']) | |||
self.result_table = wandb.Table(columns) | |||
self.val_table = self.val_artifact.get("val") | |||
if self.val_table_path_map is None: | |||
self.map_val_table_path() | |||
@@ -331,28 +332,41 @@ class WandbLogger(): | |||
returns: | |||
the new .yaml file with artifact links. it can be used to start training directly from artifacts | |||
""" | |||
upload_dataset = self.wandb_run.config.upload_dataset | |||
log_val_only = isinstance(upload_dataset, str) and upload_dataset == 'val' | |||
self.data_dict = check_dataset(data_file) # parse and check | |||
data = dict(self.data_dict) | |||
nc, names = (1, ['item']) if single_cls else (int(data['nc']), data['names']) | |||
names = {k: v for k, v in enumerate(names)} # to index dictionary | |||
self.train_artifact = self.create_dataset_table(LoadImagesAndLabels( | |||
data['train'], rect=True, batch_size=1), names, name='train') if data.get('train') else None | |||
# log train set | |||
if not log_val_only: | |||
self.train_artifact = self.create_dataset_table(LoadImagesAndLabels( | |||
data['train'], rect=True, batch_size=1), names, name='train') if data.get('train') else None | |||
if data.get('train'): | |||
data['train'] = WANDB_ARTIFACT_PREFIX + str(Path(project) / 'train') | |||
self.val_artifact = self.create_dataset_table(LoadImagesAndLabels( | |||
data['val'], rect=True, batch_size=1), names, name='val') if data.get('val') else None | |||
if data.get('train'): | |||
data['train'] = WANDB_ARTIFACT_PREFIX + str(Path(project) / 'train') | |||
if data.get('val'): | |||
data['val'] = WANDB_ARTIFACT_PREFIX + str(Path(project) / 'val') | |||
path = Path(data_file).stem | |||
path = (path if overwrite_config else path + '_wandb') + '.yaml' # updated data.yaml path | |||
data.pop('download', None) | |||
data.pop('path', None) | |||
with open(path, 'w') as f: | |||
yaml.safe_dump(data, f) | |||
path = Path(data_file) | |||
# create a _wandb.yaml file with artifacts links if both train and test set are logged | |||
if not log_val_only: | |||
path = (path.stem if overwrite_config else path.stem + '_wandb') + '.yaml' # updated data.yaml path | |||
path = Path('data') / path | |||
data.pop('download', None) | |||
data.pop('path', None) | |||
with open(path, 'w') as f: | |||
yaml.safe_dump(data, f) | |||
LOGGER.info(f"Created dataset config file {path}") | |||
if self.job_type == 'Training': # builds correct artifact pipeline graph | |||
if not log_val_only: | |||
self.wandb_run.log_artifact( | |||
self.train_artifact) # calling use_artifact downloads the dataset. NOT NEEDED! | |||
self.wandb_run.use_artifact(self.val_artifact) | |||
self.wandb_run.use_artifact(self.train_artifact) | |||
self.val_artifact.wait() | |||
self.val_table = self.val_artifact.get('val') | |||
self.map_val_table_path() | |||
@@ -371,7 +385,7 @@ class WandbLogger(): | |||
for i, data in enumerate(tqdm(self.val_table.data)): | |||
self.val_table_path_map[data[3]] = data[0] | |||
def create_dataset_table(self, dataset: LoadImagesAndLabels, class_to_id: Dict[int,str], name: str = 'dataset'): | |||
def create_dataset_table(self, dataset: LoadImagesAndLabels, class_to_id: Dict[int, str], name: str = 'dataset'): | |||
""" | |||
Create and return W&B artifact containing W&B Table of the dataset. | |||
@@ -424,23 +438,34 @@ class WandbLogger(): | |||
""" | |||
class_set = wandb.Classes([{'id': id, 'name': name} for id, name in names.items()]) | |||
box_data = [] | |||
total_conf = 0 | |||
avg_conf_per_class = [0] * len(self.data_dict['names']) | |||
pred_class_count = {} | |||
for *xyxy, conf, cls in predn.tolist(): | |||
if conf >= 0.25: | |||
cls = int(cls) | |||
box_data.append( | |||
{"position": {"minX": xyxy[0], "minY": xyxy[1], "maxX": xyxy[2], "maxY": xyxy[3]}, | |||
"class_id": int(cls), | |||
"class_id": cls, | |||
"box_caption": f"{names[cls]} {conf:.3f}", | |||
"scores": {"class_score": conf}, | |||
"domain": "pixel"}) | |||
total_conf += conf | |||
avg_conf_per_class[cls] += conf | |||
if cls in pred_class_count: | |||
pred_class_count[cls] += 1 | |||
else: | |||
pred_class_count[cls] = 1 | |||
for pred_class in pred_class_count.keys(): | |||
avg_conf_per_class[pred_class] = avg_conf_per_class[pred_class] / pred_class_count[pred_class] | |||
boxes = {"predictions": {"box_data": box_data, "class_labels": names}} # inference-space | |||
id = self.val_table_path_map[Path(path).name] | |||
self.result_table.add_data(self.current_epoch, | |||
id, | |||
self.val_table.data[id][1], | |||
wandb.Image(self.val_table.data[id][1], boxes=boxes, classes=class_set), | |||
total_conf / max(1, len(box_data)) | |||
*avg_conf_per_class | |||
) | |||
def val_one_image(self, pred, predn, path, names, im): | |||
@@ -490,7 +515,8 @@ class WandbLogger(): | |||
try: | |||
wandb.log(self.log_dict) | |||
except BaseException as e: | |||
LOGGER.info(f"An error occurred in wandb logger. The training will proceed without interruption. More info\n{e}") | |||
LOGGER.info( | |||
f"An error occurred in wandb logger. The training will proceed without interruption. More info\n{e}") | |||
self.wandb_run.finish() | |||
self.wandb_run = None | |||
@@ -502,7 +528,9 @@ class WandbLogger(): | |||
('best' if best_result else '')]) | |||
wandb.log({"evaluation": self.result_table}) | |||
self.result_table = wandb.Table(["epoch", "id", "ground truth", "prediction", "avg_confidence"]) | |||
columns = ["epoch", "id", "ground truth", "prediction"] | |||
columns.extend(self.data_dict['names']) | |||
self.result_table = wandb.Table(columns) | |||
self.result_artifact = wandb.Artifact("run_" + wandb.run.id + "_progress", "evaluation") | |||
def finish_run(self): |