
Commit fb1d4d2

Squashed commit of the following:
commit e9aac41 Author: Ross Wightman <[email protected]> Date: Sat Jan 14 22:53:56 2023 -0800 Correct mean/std for CLIP convnexts
commit 42bd8f7 Author: Ross Wightman <[email protected]> Date: Sat Jan 14 21:16:29 2023 -0800 Add convnext_base CLIP image tower weights for fine-tuning / features
commit 65aea97 Author: Ross Wightman <[email protected]> Date: Thu Jan 12 21:31:44 2023 -0800 Update tests.yml Attempt to work around flaky azure ubuntu mirrors
commit dd60c45 Merge: a2c14c2 e520553 Author: Ross Wightman <[email protected]> Date: Thu Jan 12 21:13:58 2023 -0800 Merge pull request huggingface#1633 from rwightman/freeze_norm_revisit Update batchnorm freezing to handle NormAct variants
commit e520553 Author: Ross Wightman <[email protected]> Date: Thu Jan 12 16:55:47 2023 -0800 Update batchnorm freezing to handle NormAct variants, Add GroupNorm1Act, update BatchNormAct2d tracing change from PyTorch
commit a2c14c2 Author: Ross Wightman <[email protected]> Date: Wed Jan 11 14:50:39 2023 -0800 Add tiny/small in12k pretrained and fine-tuned ConvNeXt models
commit 01aea8c Author: Ross Wightman <[email protected]> Date: Mon Jan 9 13:38:31 2023 -0800 Version 0.8.6dev0
commit 2e83bba Author: Ross Wightman <[email protected]> Date: Mon Jan 9 13:37:40 2023 -0800 Revert head norm changes to ConvNeXt as it broke some downstream use, alternate workaround for fcmae weights
commit 2c24cb9 Author: Ikko Eltociear Ashimine <[email protected]> Date: Tue Jan 10 01:26:57 2023 +0900 Fix typo in results/README.md occuring -> occurring
commit 1825b5e Author: Ross Wightman <[email protected]> Date: Sun Jan 8 22:42:24 2023 -0800 maxxvit type
commit 5078b28 Author: Ross Wightman <[email protected]> Date: Sun Jan 8 18:17:17 2023 -0800 More kwarg handling tweaks, maxvit_base_rw def added
commit c0d7388 Author: Ross Wightman <[email protected]> Date: Sat Jan 7 16:29:12 2023 -0800 Improving kwarg merging in more models
commit 94a9159 Author: Ross Wightman <[email protected]> Date: Fri Jan 6 21:39:25 2023 -0800 Update README.md
commit d2ef5a3 Author: Ross Wightman <[email protected]> Date: Fri Jan 6 21:38:40 2023 -0800 Update README.md
commit ae91530 Author: Ross Wightman <[email protected]> Date: Fri Jan 6 17:17:35 2023 -0800 Update version.py
commit 60ebb6c Author: Ross Wightman <[email protected]> Date: Fri Jan 6 14:35:26 2023 -0800 Re-order vit pretrained entries for more sensible default weights (no .tag specified)
commit e861b74 Author: Ross Wightman <[email protected]> Date: Fri Jan 6 12:01:43 2023 -0800 Pass through --model-kwargs (and --opt-kwargs for train) from command line through to model __init__. Update some models to improve arg overlay. Cleanup along the way.
commit add3fb8 Author: Ross Wightman <[email protected]> Date: Thu Jan 5 17:50:11 2023 -0800 Working on improved model card template for push_to_hf_hub
commit 13c7183 Author: Xa9aX ツ <[email protected]> Date: Fri Jan 6 10:33:08 2023 -0500 Update installation.mdx
commit eb83eb3 Author: Ross Wightman <[email protected]> Date: Thu Jan 5 17:27:13 2023 -0800 Rotate changelogs, add redirects to mkdocs -> equivalent HF docs pages
commit dd0bb32 Author: Ross Wightman <[email protected]> Date: Thu Jan 5 07:55:18 2023 -0800 Update version.py Ver 0.8.4dev0
commit 6e5553d Author: Ross Wightman <[email protected]> Date: Thu Jan 5 07:53:32 2023 -0800 Add ConvNeXt-V2 support (model additions and weights) (huggingface#1614) * Add ConvNeXt-V2 support (model additions and weights) * ConvNeXt-V2 weights on HF Hub, tweaking some tests * Update README, fixing convnextv2 tests
commit 3698e79 Author: nateraw <[email protected]> Date: Wed Jan 4 12:33:40 2023 -0500 :bug: fix github source links in hf docs
commit 9f5bba9 Author: Nathan Raw <[email protected]> Date: Tue Jan 3 17:13:53 2023 -0500 Structure Hugging Face Docs (huggingface#1575) * 🎨 structure docs * 🚧 wip docs * 📝 add installation doc * 📝 wip docs * 📝 wip docs * 📝 wip docs * 📝 wip docs * 📝 wip docs * 📝 add basic reference docs * 📝 remove augmentation from toctree * 👷 update pr doc builder to bugfix branch * 📝 wip docs * 🚧 wip * 👷 bump CI * 🚧 wip * 🚧 bump CI * 🚧 wip * 🚧 wip * 🚧 wip * 📝 add hf hub tutorial doc * 🔥 remove inference tut * 🚧 wip * 📝 wip docs * 📝 wip docs * 📝 update docs * 📝 move validation script doc up in order * 🎨 restructure to remove legacy docs * 📝 update index doc * 📝 update number of pretrained models * Update hfdocs/README.md * Update .github/workflows/build_pr_documentation.yml * Update build_pr_documentation.yml * bump * 📌 update gh action to use main branch * 🔥 remove comment
commit 960f5f9 Author: Ross Wightman <[email protected]> Date: Fri Dec 30 15:42:41 2022 -0800 Update results csv with latest val/test set runs
commit 6902c48 Author: Ross Wightman <[email protected]> Date: Thu Dec 29 16:32:26 2022 -0800 Fix ResNet based models to work w/ norm layers w/o affine params. Reformat long arg lists into vertical form.
commit d5aa17e Author: Ross Wightman <[email protected]> Date: Wed Dec 28 17:11:35 2022 -0800 Remove print from auto_augment
commit 7c846d9 Author: Ross Wightman <[email protected]> Date: Sat Dec 24 14:36:29 2022 -0800 Better vmap compat across recent torch versions
commit 1304589 Author: Ross Wightman <[email protected]> Date: Fri Dec 23 15:20:43 2022 -0800 Update README.md
commit d96538f Author: Ross Wightman <[email protected]> Date: Fri Dec 23 15:19:54 2022 -0800 Update README
commit 4e24f75 Merge: 18ec173 8ece53e Author: Ross Wightman <[email protected]> Date: Fri Dec 23 10:09:08 2022 -0800 Merge pull request huggingface#1593 from rwightman/multi-weight_effnet_convnext Update efficientnet.py and convnext.py to multi-weight, add new 12k pretrained weights
commit 8ece53e Author: Ross Wightman <[email protected]> Date: Thu Dec 22 21:43:04 2022 -0800 Switch BEiT to HF hub weights
commit d1bfa9a Author: Ross Wightman <[email protected]> Date: Thu Dec 22 21:34:13 2022 -0800 Support HF datasets and TFSD w/ a sub-path by fixing split, fix huggingface#1598 ... add class mapping support to HF datasets in case class label isn't in info.
commit 35fb00c Author: Ross Wightman <[email protected]> Date: Thu Dec 22 21:32:31 2022 -0800 Add flexivit to non-std tests list
commit e2fc43b Author: Ross Wightman <[email protected]> Date: Thu Dec 22 17:34:09 2022 -0800 Version 0.8.2dev0
commit 9a51e4e Author: Ross Wightman <[email protected]> Date: Thu Dec 22 17:19:45 2022 -0800 Add FlexiViT models and weights, refactoring, push more weights * push all vision_transformer*.py weights to HF hub * finalize more pretrained tags for pushed weights * refactor pos_embed files and module locations, move some pos embed modules to layers * tweak hf hub helpers to aid bulk uploading and updating
commit 656e177 Author: Ross Wightman <[email protected]> Date: Fri Dec 16 09:29:13 2022 -0800 Convert mobilenetv3 to multi-weight, tweak PretrainedCfg metadata
commit 6a01101 Author: Ross Wightman <[email protected]> Date: Wed Dec 14 20:33:23 2022 -0800 Update efficientnet.py and convnext.py to multi-weight, add ImageNet-12k pretrained EfficientNet-B5 and ConvNeXt-Nano.
1 parent: c5830cc


82 files changed: +11049 / -8193 lines

.github/workflows/build_documentation.yml

Lines changed: 1 addition & 0 deletions
@@ -16,5 +16,6 @@ jobs:
       package_name: timm
       repo_owner: rwightman
       path_to_docs: pytorch-image-models/hfdocs/source
+      version_tag_suffix: ""
     secrets:
       token: ${{ secrets.HUGGINGFACE_PUSH }}

.github/workflows/build_pr_documentation.yml

Lines changed: 1 addition & 0 deletions
@@ -17,3 +17,4 @@ jobs:
       package_name: timm
       repo_owner: rwightman
       path_to_docs: pytorch-image-models/hfdocs/source
+      version_tag_suffix: ""

.github/workflows/tests.yml

Lines changed: 2 additions & 1 deletion
@@ -40,9 +40,10 @@ jobs:
     - name: Install torch on ubuntu
       if: startsWith(matrix.os, 'ubuntu')
       run: |
-        pip install --no-cache-dir torch==${{ matrix.torch }}+cpu torchvision==${{ matrix.torchvision }}+cpu -f https://download.pytorch.org/whl/torch_stable.html
+        sudo sed -i 's/azure\.//' /etc/apt/sources.list
         sudo apt update
         sudo apt install -y google-perftools
+        pip install --no-cache-dir torch==${{ matrix.torch }}+cpu torchvision==${{ matrix.torchvision }}+cpu -f https://download.pytorch.org/whl/torch_stable.html
     - name: Install requirements
       run: |
         pip install -r requirements.txt

.gitignore

Lines changed: 10 additions & 0 deletions
@@ -106,6 +106,16 @@ output/
 *.tar
 *.pth
 *.pt
+*.torch
 *.gz
 Untitled.ipynb
 Testing notebook.ipynb
+
+# Root dir exclusions
+/*.csv
+/*.yaml
+/*.json
+/*.jpg
+/*.png
+/*.zip
+/*.tar.*

README.md

Lines changed: 32 additions & 46 deletions
@@ -21,12 +21,36 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 
 ## What's New
 
-### 🤗 Survey: Feedback Appreciated 🤗
-
-For a few months now, `timm` has been part of the Hugging Face ecosystem. Yearly, we survey users of our tools to see what we could do better, what we need to continue doing, or what we need to stop doing.
-
-If you have a couple of minutes and want to participate in shaping the future of the ecosystem, please share your thoughts:
-[**hf.co/oss-survey**](https://hf.co/oss-survey) 🙏
+* ❗Updates after Oct 10, 2022 are available in 0.8.x pre-releases (`pip install --pre timm`) or cloning main❗
+* Stable releases are 0.6.x and available by normal pip install or clone from [0.6.x](https://github.com/rwightman/pytorch-image-models/tree/0.6.x) branch.
+
+### Jan 11, 2023
+* Update ConvNeXt ImageNet-12k pretrain series w/ two new fine-tuned weights (and pre FT `.in12k` tags)
+  * `convnext_nano.in12k_ft_in1k` - 82.3 @ 224, 82.9 @ 288 (previously released)
+  * `convnext_tiny.in12k_ft_in1k` - 84.2 @ 224, 84.5 @ 288
+  * `convnext_small.in12k_ft_in1k` - 85.2 @ 224, 85.3 @ 288
+
+### Jan 6, 2023
+* Finally got around to adding `--model-kwargs` and `--opt-kwargs` to scripts to pass through rare args directly to model classes from cmd line
+  * `train.py /imagenet --model resnet50 --amp --model-kwargs output_stride=16 act_layer=silu`
+  * `train.py /imagenet --model vit_base_patch16_clip_224 --img-size 240 --amp --model-kwargs img_size=240 patch_size=12`
+* Cleanup some popular models to better support arg passthrough / merge with model configs, more to go.
+
+### Jan 5, 2023
+* ConvNeXt-V2 models and weights added to existing `convnext.py`
+  * Paper: [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](http://arxiv.org/abs/2301.00808)
+  * Reference impl: https://github.com/facebookresearch/ConvNeXt-V2 (NOTE: weights currently CC-BY-NC)
+
+### Dec 23, 2022 🎄☃
+* Add FlexiViT models and weights from https://github.com/google-research/big_vision (check out paper at https://arxiv.org/abs/2212.08013)
+  * NOTE currently resizing is static on model creation, on-the-fly dynamic / train patch size sampling is a WIP
+* Many more models updated to multi-weight and downloadable via HF hub now (convnext, efficientnet, mobilenet, vision_transformer*, beit)
+* More model pretrained tag and adjustments, some model names changed (working on deprecation translations, consider main branch DEV branch right now, use 0.6.x for stable use)
+* More ImageNet-12k (subset of 22k) pretrain models popping up:
+  * `efficientnet_b5.in12k_ft_in1k` - 85.9 @ 448x448
+  * `vit_medium_patch16_gap_384.in12k_ft_in1k` - 85.5 @ 384x384
+  * `vit_medium_patch16_gap_256.in12k_ft_in1k` - 84.5 @ 256x256
+  * `convnext_nano.in12k_ft_in1k` - 82.9 @ 288x288
 
 ### Dec 8, 2022
 * Add 'EVA l' to `vision_transformer.py`, MAE style ViT-L/14 MIM pretrain w/ EVA-CLIP targets, FT on ImageNet-1k (w/ ImageNet-22k intermediate for some)
@@ -325,46 +349,6 @@ More models, more fixes
 * TinyNet models added by [rsomani95](https://github.com/rsomani95)
 * LCNet added via MobileNetV3 architecture
 
-### Nov 22, 2021
-* A number of updated weights anew new model defs
-  * `eca_halonext26ts` - 79.5 @ 256
-  * `resnet50_gn` (new) - 80.1 @ 224, 81.3 @ 288
-  * `resnet50` - 80.7 @ 224, 80.9 @ 288 (trained at 176, not replacing current a1 weights as default since these don't scale as well to higher res, [weights](https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-rsb-weights/resnet50_a1h2_176-001a1197.pth))
-  * `resnext50_32x4d` - 81.1 @ 224, 82.0 @ 288
-  * `sebotnet33ts_256` (new) - 81.2 @ 224
-  * `lamhalobotnet50ts_256` - 81.5 @ 256
-  * `halonet50ts` - 81.7 @ 256
-  * `halo2botnet50ts_256` - 82.0 @ 256
-  * `resnet101` - 82.0 @ 224, 82.8 @ 288
-  * `resnetv2_101` (new) - 82.1 @ 224, 83.0 @ 288
-  * `resnet152` - 82.8 @ 224, 83.5 @ 288
-  * `regnetz_d8` (new) - 83.5 @ 256, 84.0 @ 320
-  * `regnetz_e8` (new) - 84.5 @ 256, 85.0 @ 320
-* `vit_base_patch8_224` (85.8 top-1) & `in21k` variant weights added thanks [Martins Bruveris](https://github.com/martinsbruveris)
-* Groundwork in for FX feature extraction thanks to [Alexander Soare](https://github.com/alexander-soare)
-  * models updated for tracing compatibility (almost full support with some distlled transformer exceptions)
-
-### Oct 19, 2021
-* ResNet strikes back (https://arxiv.org/abs/2110.00476) weights added, plus any extra training components used. Model weights and some more details here (https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-rsb-weights)
-* BCE loss and Repeated Augmentation support for RSB paper
-* 4 series of ResNet based attention model experiments being added (implemented across byobnet.py/byoanet.py). These include all sorts of attention, from channel attn like SE, ECA to 2D QKV self-attention layers such as Halo, Bottlneck, Lambda. Details here (https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-attn-weights)
-* Working implementations of the following 2D self-attention modules (likely to be differences from paper or eventual official impl):
-  * Halo (https://arxiv.org/abs/2103.12731)
-  * Bottleneck Transformer (https://arxiv.org/abs/2101.11605)
-  * LambdaNetworks (https://arxiv.org/abs/2102.08602)
-* A RegNetZ series of models with some attention experiments (being added to). These do not follow the paper (https://arxiv.org/abs/2103.06877) in any way other than block architecture, details of official models are not available. See more here (https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-attn-weights)
-* ConvMixer (https://openreview.net/forum?id=TVHS5Y4dNvM), CrossVit (https://arxiv.org/abs/2103.14899), and BeiT (https://arxiv.org/abs/2106.08254) architectures + weights added
-* freeze/unfreeze helpers by [Alexander Soare](https://github.com/alexander-soare)
-
-### Aug 18, 2021
-* Optimizer bonanza!
-  * Add LAMB and LARS optimizers, incl trust ratio clipping options. Tweaked to work properly in PyTorch XLA (tested on TPUs w/ `timm bits` [branch](https://github.com/rwightman/pytorch-image-models/tree/bits_and_tpu/timm/bits))
-  * Add MADGRAD from FB research w/ a few tweaks (decoupled decay option, step handling that works with PyTorch XLA)
-  * Some cleanup on all optimizers and factory. No more `.data`, a bit more consistency, unit tests for all!
-  * SGDP and AdamP still won't work with PyTorch XLA but others should (have yet to test Adabelief, Adafactor, Adahessian myself).
-* EfficientNet-V2 XL TF ported weights added, but they don't validate well in PyTorch (L is better). The pre-processing for the V2 TF training is a bit diff and the fine-tuned 21k -> 1k weights are very sensitive and less robust than the 1k weights.
-* Added PyTorch trained EfficientNet-V2 'Tiny' w/ GlobalContext attn weights. Only .1-.2 top-1 better than the SE so more of a curiosity for those interested.
-
 ## Introduction
 
 Py**T**orch **Im**age **M**odels (`timm`) is a collection of image models, layers, utilities, optimizers, schedulers, data-loaders / augmentations, and reference training / validation scripts that aim to pull together a wide variety of SOTA models with ability to reproduce ImageNet training results.
@@ -385,6 +369,7 @@ A full version of the list below with source links can be found in the [document
 * CoaT (Co-Scale Conv-Attentional Image Transformers) - https://arxiv.org/abs/2104.06399
 * CoAtNet (Convolution and Attention) - https://arxiv.org/abs/2106.04803
 * ConvNeXt - https://arxiv.org/abs/2201.03545
+* ConvNeXt-V2 - http://arxiv.org/abs/2301.00808
 * ConViT (Soft Convolutional Inductive Biases Vision Transformers)- https://arxiv.org/abs/2103.10697
 * CspNet (Cross-Stage Partial Networks) - https://arxiv.org/abs/1911.11929
 * DeiT - https://arxiv.org/abs/2012.12877
@@ -407,6 +392,7 @@ A full version of the list below with source links can be found in the [document
 * Single-Path NAS - https://arxiv.org/abs/1904.02877
 * TinyNet - https://arxiv.org/abs/2010.14819
 * EVA - https://arxiv.org/abs/2211.07636
+* FlexiViT - https://arxiv.org/abs/2212.08013
 * GCViT (Global Context Vision Transformer) - https://arxiv.org/abs/2206.09959
 * GhostNet - https://arxiv.org/abs/1911.11907
 * gMLP - https://arxiv.org/abs/2105.08050
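
As a quick orientation for the README changes above: the new "What's New" entries describe multi-weight models addressed by pretrained tag, plus kwargs passthrough to model constructors. Below is a minimal sketch of using one of the tags listed in the Jan 11, 2023 entry (assumes the 0.8.x pre-release via `pip install --pre timm` and network access to download weights; the `drop_path_rate` kwarg is just an illustration of an argument forwarded to the model __init__, mirroring what `--model-kwargs` does for the scripts):

    import torch
    import timm

    # Create one of the new ImageNet-12k fine-tuned ConvNeXt models by its pretrained tag
    # (tag taken from the Jan 11, 2023 changelog entry above); weights download from the HF hub.
    model = timm.create_model('convnext_tiny.in12k_ft_in1k', pretrained=True)
    model.eval()

    # Extra kwargs are forwarded to the model constructor, similar to the new --model-kwargs CLI option.
    model_dp = timm.create_model('convnext_tiny.in12k_ft_in1k', pretrained=False, drop_path_rate=0.1)

    with torch.no_grad():
        out = model(torch.randn(1, 3, 224, 224))
    print(out.shape)  # expected: torch.Size([1, 1000])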

benchmark.py

Lines changed: 17 additions & 9 deletions
@@ -22,7 +22,7 @@
 from timm.layers import set_fast_norm
 from timm.models import create_model, is_model, list_models
 from timm.optim import create_optimizer_v2
-from timm.utils import setup_default_logging, set_jit_fuser, decay_batch_step, check_batch_size_retry
+from timm.utils import setup_default_logging, set_jit_fuser, decay_batch_step, check_batch_size_retry, ParseKwargs
 
 has_apex = False
 try:
@@ -108,12 +108,15 @@
                     help='Enable gradient checkpointing through model blocks/stages')
 parser.add_argument('--amp', action='store_true', default=False,
                     help='use PyTorch Native AMP for mixed precision training. Overrides --precision arg.')
+parser.add_argument('--amp-dtype', default='float16', type=str,
+                    help='lower precision AMP dtype (default: float16). Overrides --precision arg if args.amp True.')
 parser.add_argument('--precision', default='float32', type=str,
                     help='Numeric precision. One of (amp, float32, float16, bfloat16, tf32)')
 parser.add_argument('--fuser', default='', type=str,
                     help="Select jit fuser. One of ('', 'te', 'old', 'nvfuser')")
 parser.add_argument('--fast-norm', default=False, action='store_true',
                     help='enable experimental fast-norm')
+parser.add_argument('--model-kwargs', nargs='*', default={}, action=ParseKwargs)
 
 # codegen (model compilation) options
 scripting_group = parser.add_mutually_exclusive_group()
@@ -124,7 +127,6 @@
 scripting_group.add_argument('--aot-autograd', default=False, action='store_true',
                              help="Enable AOT Autograd optimization.")
 
-
 # train optimizer parameters
 parser.add_argument('--opt', default='sgd', type=str, metavar='OPTIMIZER',
                     help='Optimizer (default: "sgd"')
@@ -168,19 +170,21 @@ def count_params(model: nn.Module):
 
 
 def resolve_precision(precision: str):
-    assert precision in ('amp', 'float16', 'bfloat16', 'float32')
-    use_amp = False
+    assert precision in ('amp', 'amp_bfloat16', 'float16', 'bfloat16', 'float32')
+    amp_dtype = None  # amp disabled
     model_dtype = torch.float32
     data_dtype = torch.float32
     if precision == 'amp':
-        use_amp = True
+        amp_dtype = torch.float16
+    elif precision == 'amp_bfloat16':
+        amp_dtype = torch.bfloat16
     elif precision == 'float16':
         model_dtype = torch.float16
         data_dtype = torch.float16
     elif precision == 'bfloat16':
         model_dtype = torch.bfloat16
         data_dtype = torch.bfloat16
-    return use_amp, model_dtype, data_dtype
+    return amp_dtype, model_dtype, data_dtype
 
 
 def profile_deepspeed(model, input_size=(3, 224, 224), batch_size=1, detailed=False):
@@ -228,9 +232,12 @@ def __init__(
         self.model_name = model_name
         self.detail = detail
         self.device = device
-        self.use_amp, self.model_dtype, self.data_dtype = resolve_precision(precision)
+        self.amp_dtype, self.model_dtype, self.data_dtype = resolve_precision(precision)
         self.channels_last = kwargs.pop('channels_last', False)
-        self.amp_autocast = partial(torch.cuda.amp.autocast, dtype=torch.float16) if self.use_amp else suppress
+        if self.amp_dtype is not None:
+            self.amp_autocast = partial(torch.cuda.amp.autocast, dtype=self.amp_dtype)
+        else:
+            self.amp_autocast = suppress
 
         if fuser:
             set_jit_fuser(fuser)
@@ -243,6 +250,7 @@ def __init__(
             drop_rate=kwargs.pop('drop', 0.),
             drop_path_rate=kwargs.pop('drop_path', None),
            drop_block_rate=kwargs.pop('drop_block', None),
+            **kwargs.pop('model_kwargs', {}),
         )
         self.model.to(
             device=self.device,
@@ -560,7 +568,7 @@ def _try_run(
 def benchmark(args):
     if args.amp:
         _logger.warning("Overriding precision to 'amp' since --amp flag set.")
-        args.precision = 'amp'
+        args.precision = 'amp' if args.amp_dtype == 'float16' else '_'.join(['amp', args.amp_dtype])
     _logger.info(f'Benchmarking in {args.precision} precision. '
                  f'{"NHWC" if args.channels_last else "NCHW"} layout. '
                  f'torchscript {"enabled" if args.torchscript else "disabled"}')
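
The `--model-kwargs` option above relies on `ParseKwargs` from `timm.utils` to turn space-separated `key=value` pairs into a dict, which is then splatted into `create_model()` via `**kwargs.pop('model_kwargs', {})`. The snippet below is an illustrative sketch of that parsing idea only; the class name `ParseKwargsSketch` and its use of `ast.literal_eval` are assumptions for the example, not the actual timm implementation:

    import argparse
    import ast

    # Illustrative sketch only: the real ParseKwargs action lives in timm.utils. This shows the
    # general idea of turning "--model-kwargs key=value ..." pairs into a dict that can be
    # splatted into create_model(**model_kwargs).
    class ParseKwargsSketch(argparse.Action):
        def __call__(self, parser, namespace, values, option_string=None):
            kwargs = {}
            for pair in values:
                key, _, value = pair.partition('=')
                try:
                    # literal_eval handles ints, floats, tuples, booleans; fall back to a raw string
                    kwargs[key] = ast.literal_eval(value)
                except (ValueError, SyntaxError):
                    kwargs[key] = value
            setattr(namespace, self.dest, kwargs)

    parser = argparse.ArgumentParser()
    parser.add_argument('--model-kwargs', nargs='*', default={}, action=ParseKwargsSketch)
    args = parser.parse_args(['--model-kwargs', 'output_stride=16', 'act_layer=silu'])
    print(args.model_kwargs)  # {'output_stride': 16, 'act_layer': 'silu'}

On the precision side, the same diff routes `--amp --amp-dtype bfloat16` to an `amp_bfloat16` precision string, which `resolve_precision()` maps to `torch.bfloat16` for the autocast context while leaving model and data dtypes at float32.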

docs/archived_changes.md

Lines changed: 40 additions & 0 deletions
@@ -1,5 +1,45 @@
 # Archived Changes
 
+### Nov 22, 2021
+* A number of updated weights anew new model defs
+  * `eca_halonext26ts` - 79.5 @ 256
+  * `resnet50_gn` (new) - 80.1 @ 224, 81.3 @ 288
+  * `resnet50` - 80.7 @ 224, 80.9 @ 288 (trained at 176, not replacing current a1 weights as default since these don't scale as well to higher res, [weights](https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-rsb-weights/resnet50_a1h2_176-001a1197.pth))
+  * `resnext50_32x4d` - 81.1 @ 224, 82.0 @ 288
+  * `sebotnet33ts_256` (new) - 81.2 @ 224
+  * `lamhalobotnet50ts_256` - 81.5 @ 256
+  * `halonet50ts` - 81.7 @ 256
+  * `halo2botnet50ts_256` - 82.0 @ 256
+  * `resnet101` - 82.0 @ 224, 82.8 @ 288
+  * `resnetv2_101` (new) - 82.1 @ 224, 83.0 @ 288
+  * `resnet152` - 82.8 @ 224, 83.5 @ 288
+  * `regnetz_d8` (new) - 83.5 @ 256, 84.0 @ 320
+  * `regnetz_e8` (new) - 84.5 @ 256, 85.0 @ 320
+* `vit_base_patch8_224` (85.8 top-1) & `in21k` variant weights added thanks [Martins Bruveris](https://github.com/martinsbruveris)
+* Groundwork in for FX feature extraction thanks to [Alexander Soare](https://github.com/alexander-soare)
+  * models updated for tracing compatibility (almost full support with some distlled transformer exceptions)
+
+### Oct 19, 2021
+* ResNet strikes back (https://arxiv.org/abs/2110.00476) weights added, plus any extra training components used. Model weights and some more details here (https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-rsb-weights)
+* BCE loss and Repeated Augmentation support for RSB paper
+* 4 series of ResNet based attention model experiments being added (implemented across byobnet.py/byoanet.py). These include all sorts of attention, from channel attn like SE, ECA to 2D QKV self-attention layers such as Halo, Bottlneck, Lambda. Details here (https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-attn-weights)
+* Working implementations of the following 2D self-attention modules (likely to be differences from paper or eventual official impl):
+  * Halo (https://arxiv.org/abs/2103.12731)
+  * Bottleneck Transformer (https://arxiv.org/abs/2101.11605)
+  * LambdaNetworks (https://arxiv.org/abs/2102.08602)
+* A RegNetZ series of models with some attention experiments (being added to). These do not follow the paper (https://arxiv.org/abs/2103.06877) in any way other than block architecture, details of official models are not available. See more here (https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-attn-weights)
+* ConvMixer (https://openreview.net/forum?id=TVHS5Y4dNvM), CrossVit (https://arxiv.org/abs/2103.14899), and BeiT (https://arxiv.org/abs/2106.08254) architectures + weights added
+* freeze/unfreeze helpers by [Alexander Soare](https://github.com/alexander-soare)
+
+### Aug 18, 2021
+* Optimizer bonanza!
+  * Add LAMB and LARS optimizers, incl trust ratio clipping options. Tweaked to work properly in PyTorch XLA (tested on TPUs w/ `timm bits` [branch](https://github.com/rwightman/pytorch-image-models/tree/bits_and_tpu/timm/bits))
+  * Add MADGRAD from FB research w/ a few tweaks (decoupled decay option, step handling that works with PyTorch XLA)
+  * Some cleanup on all optimizers and factory. No more `.data`, a bit more consistency, unit tests for all!
+  * SGDP and AdamP still won't work with PyTorch XLA but others should (have yet to test Adabelief, Adafactor, Adahessian myself).
+* EfficientNet-V2 XL TF ported weights added, but they don't validate well in PyTorch (L is better). The pre-processing for the V2 TF training is a bit diff and the fine-tuned 21k -> 1k weights are very sensitive and less robust than the 1k weights.
+* Added PyTorch trained EfficientNet-V2 'Tiny' w/ GlobalContext attn weights. Only .1-.2 top-1 better than the SE so more of a curiosity for those interested.
+
 ### July 12, 2021
 * Add XCiT models from [official facebook impl](https://github.com/facebookresearch/xcit). Contributed by [Alexander Soare](https://github.com/alexander-soare)
 