2022-04-13 16:36:24,659 - mmcls - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.6.9 (default, Dec 8 2021, 21:08:43) [GCC 8.4.0]
CUDA available: False
GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.10.0+cpu
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=0, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.11.1+cpu
OpenCV: 4.5.5
MMCV: 1.4.8
MMCV Compiler: n/a
MMCV CUDA Compiler: n/a
MMClassification: 0.22.0+47f8c48
------------------------------------------------------------
2022-04-13 16:36:24,660 - mmcls - INFO - Distributed training: False
2022-04-13 16:36:26,119 - mmcls - INFO - Config:
model = dict( type='ImageClassifier', backbone=dict( type='VisionTransformer', arch='b', img_size=224, patch_size=16, drop_rate=0.1, init_cfg=dict( type='Pretrained', checkpoint= 'https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth', prefix='backbone')), neck=None, head=dict( type='VisionTransformerClsHead', num_classes=1000, in_channels=768, loss=dict(type='CrossEntropyLoss', loss_weight=1.0)))
policy_imagenet = [[{ 'type': 'Posterize', 'bits': 4, 'prob': 0.4 }, { 'type': 'Rotate', 'angle': 30.0, 'prob': 0.6 }], [{ 'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6 }, { 'type': 'AutoContrast', 'prob': 0.6 }], [{ 'type': 'Equalize', 'prob': 0.8 }, { 'type': 'Equalize', 'prob': 0.6 }], [{ 'type': 'Posterize', 'bits': 5, 'prob': 0.6 }, { 'type': 'Posterize', 'bits': 5, 'prob': 0.6 }], [{ 'type': 'Equalize', 'prob': 0.4 }, { 'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2 }], [{ 'type': 'Equalize', 'prob': 0.4 }, { 'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8 }], [{ 'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6 }, { 'type': 'Equalize', 'prob': 0.6 }], [{ 'type': 'Posterize', 'bits': 6, 'prob': 0.8 }, { 'type': 'Equalize', 'prob': 1.0 }], [{ 'type': 'Rotate', 'angle': 10.0, 'prob': 0.2 }, { 'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6 }], [{ 'type': 'Equalize', 'prob': 0.6 }, { 'type': 'Posterize', 'bits': 5, 'prob': 0.4 }], [{ 'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8 }, { 'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4 }], [{ 'type': 'Rotate', 'angle': 30.0, 'prob': 0.4 }, { 'type': 'Equalize', 'prob': 0.6 }], [{ 'type': 'Equalize', 'prob': 0.0 }, { 'type': 'Equalize', 'prob': 0.8 }], [{ 'type': 'Invert', 'prob': 0.6 }, { 'type': 'Equalize', 'prob': 1.0 }], [{ 'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6 }, { 'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0 }], [{ 'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8 }, { 'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0 }], [{ 'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8 }, { 'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8 }], [{ 'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4 }, { 'type': 'Invert', 'prob': 0.6 }], [{ 'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal' }, { 'type': 'Equalize', 'prob': 1.0 }], [{ 'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4 }, { 'type': 'Equalize', 'prob': 0.6 }], [{ 'type': 'Equalize', 'prob': 0.4 }, { 'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2 }], [{ 'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6 }, { 'type': 'AutoContrast', 'prob': 0.6 }], [{ 'type': 'Invert', 'prob': 0.6 }, { 'type': 'Equalize', 'prob': 1.0 }], [{ 'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6 }, { 'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0 }], [{ 'type': 'Equalize', 'prob': 0.8 }, { 'type': 'Equalize', 'prob': 0.6 }]]
dataset_type = 'ImageNet'
img_norm_cfg = dict( mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], to_rgb=True)
train_pipeline = [ dict(type='LoadImageFromFile'), dict(type='RandomResizedCrop', size=224, backend='pillow'), dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'), dict( type='Normalize', mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='ToTensor', keys=['gt_label']), dict(type='ToHalf', keys=['img']), dict(type='Collect', keys=['img', 'gt_label']) ]
test_pipeline = [ dict(type='LoadImageFromFile'), dict(type='Resize', size=(224, -1), backend='pillow'), dict(type='CenterCrop', crop_size=224), dict( type='Normalize', mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='ToHalf', keys=['img']), dict(type='Collect', keys=['img']) ]
data = dict( samples_per_gpu=17, workers_per_gpu=16, train=dict( type='ImageNet', data_prefix='data/imagenet/train', pipeline=[ dict(type='LoadImageFromFile'), dict(type='RandomResizedCrop', size=224, backend='pillow'), dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'), dict( type='Normalize', mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='ToTensor', keys=['gt_label']), dict(type='ToHalf', keys=['img']), dict(type='Collect', keys=['img', 'gt_label']) ]), val=dict( type='ImageNet', data_prefix='data/imagenet/val', ann_file='data/imagenet/meta/val.txt', pipeline=[ dict(type='LoadImageFromFile'), dict(type='Resize', size=(224, -1), backend='pillow'), dict(type='CenterCrop', crop_size=224), dict( type='Normalize', mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='ToHalf', keys=['img']), dict(type='Collect', keys=['img']) ]), test=dict( type='ImageNet', data_prefix='data/imagenet/val', ann_file='data/imagenet/meta/val.txt', pipeline=[ dict(type='LoadImageFromFile'), dict(type='Resize', size=(224, -1), backend='pillow'), dict(type='CenterCrop', crop_size=224), dict( type='Normalize', mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='ToHalf', keys=['img']), dict(type='Collect', keys=['img']) ]), drop_last=True, train_dataloader=dict(mode='async'), val_dataloader=dict(samples_per_gpu=4, workers_per_gpu=1), test_dataloader=dict(samples_per_gpu=4, workers_per_gpu=1))
evaluation = dict(interval=1, metric='accuracy')
checkpoint_config = dict(interval=1000)
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
paramwise_cfg = dict( custom_keys=dict({ '.cls_token': dict(decay_mult=0.0), '.pos_embed': dict(decay_mult=0.0) }))
optimizer_config = dict()
optimizer = dict( type='SGD', lr=0.08, weight_decay=1e-05, momentum=0.9, paramwise_cfg=dict( custom_keys=dict({ '.cls_token': dict(decay_mult=0.0), '.pos_embed': dict(decay_mult=0.0) })))
lr_config = dict( policy='CosineAnnealing', min_lr=0, warmup='linear', warmup_iters=800, warmup_ratio=0.02)
ipu_model_cfg = dict( train_split_edges=[ dict(layer_to_call='backbone.patch_embed', ipu_id=0), dict(layer_to_call='backbone.layers.3', ipu_id=1), dict(layer_to_call='backbone.layers.6', ipu_id=2), dict(layer_to_call='backbone.layers.9', ipu_id=3) ], train_ckpt_nodes=[ 'backbone.layers.0', 'backbone.layers.1', 'backbone.layers.2', 'backbone.layers.3', 'backbone.layers.4', 'backbone.layers.5', 'backbone.layers.6', 'backbone.layers.7', 'backbone.layers.8', 'backbone.layers.9', 'backbone.layers.10', 'backbone.layers.11' ])
options_cfg = dict( randomSeed=42, partialsType='half', train_cfg=dict( executionStrategy='SameAsIpu', Training=dict(gradientAccumulation=32), availableMemoryProportion=[0.3, 0.3, 0.3, 0.3]), eval_cfg=dict(deviceIterations=1))
runner = dict( type='IterBasedRunner', ipu_model_cfg=dict( train_split_edges=[ dict(layer_to_call='backbone.patch_embed', ipu_id=0), dict(layer_to_call='backbone.layers.3', ipu_id=1), dict(layer_to_call='backbone.layers.6', ipu_id=2), dict(layer_to_call='backbone.layers.9', ipu_id=3) ], train_ckpt_nodes=[ 'backbone.layers.0', 'backbone.layers.1', 'backbone.layers.2', 'backbone.layers.3', 'backbone.layers.4', 'backbone.layers.5', 'backbone.layers.6', 'backbone.layers.7', 'backbone.layers.8', 'backbone.layers.9', 'backbone.layers.10', 'backbone.layers.11' ]), options_cfg=dict( randomSeed=42, partialsType='half', train_cfg=dict( executionStrategy='SameAsIpu', Training=dict(gradientAccumulation=32), availableMemoryProportion=[0.3, 0.3, 0.3, 0.3]), eval_cfg=dict(deviceIterations=1)), max_iters=5000)
fp16 = dict(loss_scale=256.0, velocity_accum_type='half', accum_type='half')
work_dir = './work_dirs/vit-base-p16_ft-4xb544_in1k-224_ipu'
gpu_ids = [0]
ipu_replicas = 4
2022-04-13 16:36:26,120 - mmcls - INFO - Set random seed to 212038955, deterministic: False
2022-04-13 16:36:27,208 - mmcls - INFO - initialize VisionTransformer with init_cfg {'type': 'Pretrained', 'checkpoint': 'https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth', 'prefix': 'backbone'}
2022-04-13 16:36:31,729 - mmcls - INFO - initialize VisionTransformerClsHead with init_cfg {'type': 'Constant', 'layer': 'Linear', 'val': 0}
Name of parameter - Initialization information
backbone.cls_token - torch.Size([1, 1, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.pos_embed - torch.Size([1, 197, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.patch_embed.projection.weight - torch.Size([768, 3, 16, 16]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.patch_embed.projection.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.0.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth
backbone.layers.1.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.1.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.1.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.1.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.attn.proj.bias - torch.Size([768]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.2.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.3.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth 
backbone.layers.4.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.4.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.ln2.weight - torch.Size([768]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.5.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.6.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.ln1.weight - torch.Size([768]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.7.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.ln2.weight - torch.Size([768]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.8.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.9.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.ln1.weight - torch.Size([768]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.ln2.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.10.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.attn.qkv.weight - torch.Size([2304, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.attn.qkv.bias - torch.Size([2304]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.attn.proj.weight - torch.Size([768, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.attn.proj.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.ln2.weight - torch.Size([768]): PretrainedInit: load from 
https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.ln2.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.ffn.layers.0.0.weight - torch.Size([3072, 768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.ffn.layers.0.0.bias - torch.Size([3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.ffn.layers.1.weight - torch.Size([768, 3072]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.layers.11.ffn.layers.1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.ln1.weight - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth backbone.ln1.bias - torch.Size([768]): PretrainedInit: load from https://download.openmmlab.com/mmclassification/v0/vit/pretrain/vit-base-p16_3rdparty_pt-64xb64_in1k-224_20210928-02284250.pth head.layers.head.weight - torch.Size([1000, 768]): ConstantInit: val=0, bias=0 head.layers.head.bias - torch.Size([1000]): ConstantInit: val=0, bias=0 2022-04-13 16:52:36,044 - mmcls - INFO - Start running, host: dihu@sgjur-pod002-3, work_dir: /localdata/cn-customer-engineering/hudi/mmclassification/work_dirs/vit-base-p16_ft-4xb544_in1k-224_ipu 2022-04-13 16:52:36,045 - mmcls - INFO - Hooks will be executed in the 
following order:
before_run:
(VERY_HIGH ) ipu_lr_hook_class
(NORMAL ) CheckpointHook
(VERY_LOW ) TextLoggerHook
--------------------
before_train_epoch:
(VERY_HIGH ) ipu_lr_hook_class
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
--------------------
before_train_iter:
(VERY_HIGH ) ipu_lr_hook_class
(LOW ) IterTimerHook
--------------------
after_train_iter:
(ABOVE_NORMAL) IPUFp16OptimizerHook
(NORMAL ) CheckpointHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
--------------------
after_train_epoch:
(NORMAL ) CheckpointHook
(VERY_LOW ) TextLoggerHook
--------------------
before_val_epoch:
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
--------------------
before_val_iter:
(LOW ) IterTimerHook
--------------------
after_val_iter:
(LOW ) IterTimerHook
--------------------
after_val_epoch:
(VERY_LOW ) TextLoggerHook
--------------------
after_run:
(VERY_LOW ) TextLoggerHook
--------------------
2022-04-13 16:52:36,046 - mmcls - INFO - workflow: [('train', 1)], max: 5000 iters
2022-04-13 16:52:36,047 - mmcls - INFO - Checkpoints will be saved to /localdata/cn-customer-engineering/hudi/mmclassification/work_dirs/vit-base-p16_ft-4xb544_in1k-224_ipu by HardDiskBackend.
2022-04-13 17:02:18,342 - mmcls - INFO - Iter [100/5000] lr: 1.129e-02, eta: 7:55:32, time: 5.823, data_time: 0.071, loss: 4.9751
2022-04-13 17:03:02,768 - mmcls - INFO - Iter [200/5000] lr: 2.102e-02, eta: 4:10:41, time: 0.444, data_time: 0.107, loss: 1.7558
2022-04-13 17:03:47,201 - mmcls - INFO - Iter [300/5000] lr: 3.063e-02, eta: 2:55:14, time: 0.444, data_time: 0.106, loss: 1.4765
2022-04-13 17:04:26,740 - mmcls - INFO - Iter [400/5000] lr: 4.007e-02, eta: 2:16:12, time: 0.395, data_time: 0.060, loss: 1.4278
2022-04-13 17:05:03,549 - mmcls - INFO - Iter [500/5000] lr: 4.927e-02, eta: 1:52:07, time: 0.368, data_time: 0.034, loss: 1.5029
2022-04-13 17:05:42,101 - mmcls - INFO - Iter [600/5000] lr: 5.819e-02, eta: 1:36:04, time: 0.386, data_time: 0.051, loss: 1.5051
2022-04-13 17:06:18,627 - mmcls - INFO - Iter [700/5000] lr: 6.678e-02, eta: 1:24:12, time: 0.365, data_time: 0.031, loss: 1.5077
2022-04-13 17:06:55,164 - mmcls - INFO - Iter [800/5000] lr: 7.497e-02, eta: 1:15:10, time: 0.365, data_time: 0.031, loss: 1.5128
2022-04-13 17:07:31,699 - mmcls - INFO - Iter [900/5000] lr: 7.379e-02, eta: 1:08:00, time: 0.365, data_time: 0.031, loss: 1.5276
2022-04-13 17:08:08,426 - mmcls - INFO - Saving checkpoint at 1000 iterations
2022-04-13 17:08:09,223 - mmcls - INFO - Iter [1000/5000] lr: 7.238e-02, eta: 1:02:12, time: 0.375, data_time: 0.033, loss: 1.4408
2022-04-13 17:08:45,764 - mmcls - INFO - Iter [1100/5000] lr: 7.084e-02, eta: 0:57:18, time: 0.365, data_time: 0.031, loss: 1.4149
2022-04-13 17:09:24,350 - mmcls - INFO - Iter [1200/5000] lr: 6.918e-02, eta: 0:53:12, time: 0.386, data_time: 0.051, loss: 1.3670
2022-04-13 17:10:01,339 - mmcls - INFO - Iter [1300/5000] lr: 6.740e-02, eta: 0:49:35, time: 0.370, data_time: 0.035, loss: 1.2952
2022-04-13 17:10:38,860 - mmcls - INFO - Iter [1400/5000] lr: 6.552e-02, eta: 0:46:24, time: 0.375, data_time: 0.041, loss: 1.2759
2022-04-13 17:11:16,655 - mmcls - INFO - Iter [1500/5000] lr: 6.353e-02, eta: 0:43:34, time: 0.378, data_time: 0.044, loss: 1.2387
2022-04-13 17:11:54,763 - mmcls - INFO - Iter [1600/5000] lr: 6.145e-02, eta: 0:41:02, time: 0.381, data_time: 0.047, loss: 1.2119
2022-04-13 17:12:32,938 - mmcls - INFO - Iter [1700/5000] lr: 5.929e-02, eta: 0:38:43, time: 0.382, data_time: 0.047, loss: 1.1715
2022-04-13 17:13:12,331 - mmcls - INFO - Iter [1800/5000] lr: 5.705e-02, eta: 0:36:37, time: 0.394, data_time: 0.059, loss: 1.1356
2022-04-13 17:13:49,791 - mmcls - INFO - Iter [1900/5000] lr: 5.475e-02, eta: 0:34:38, time: 0.375, data_time: 0.040, loss: 1.0733
2022-04-13 17:14:27,850 - mmcls - INFO - Saving checkpoint at 2000 iterations
2022-04-13 17:14:28,516 - mmcls - INFO - Iter [2000/5000] lr: 5.238e-02, eta: 0:32:48, time: 0.387, data_time: 0.046, loss: 1.0650
2022-04-13 17:15:06,641 - mmcls - INFO - Iter [2100/5000] lr: 4.997e-02, eta: 0:31:05, time: 0.381, data_time: 0.045, loss: 1.0690
2022-04-13 17:15:45,412 - mmcls - INFO - Iter [2200/5000] lr: 4.752e-02, eta: 0:29:28, time: 0.388, data_time: 0.052, loss: 1.0750
2022-04-13 17:16:25,403 - mmcls - INFO - Iter [2300/5000] lr: 4.504e-02, eta: 0:27:57, time: 0.400, data_time: 0.060, loss: 1.0298
2022-04-13 17:17:07,474 - mmcls - INFO - Iter [2400/5000] lr: 4.254e-02, eta: 0:26:34, time: 0.421, data_time: 0.081, loss: 0.9902
2022-04-13 17:17:44,876 - mmcls - INFO - Iter [2500/5000] lr: 4.003e-02, eta: 0:25:08, time: 0.374, data_time: 0.039, loss: 0.9331
2022-04-13 17:18:22,266 - mmcls - INFO - Iter [2600/5000] lr: 3.751e-02, eta: 0:23:47, time: 0.374, data_time: 0.039, loss: 0.9368
2022-04-13 17:19:03,130 - mmcls - INFO - Iter [2700/5000] lr: 3.501e-02, eta: 0:22:31, time: 0.409, data_time: 0.064, loss: 0.9308
2022-04-13 17:19:45,023 - mmcls - INFO - Iter [2800/5000] lr: 3.253e-02, eta: 0:21:19, time: 0.419, data_time: 0.076, loss: 0.8999
2022-04-13 17:20:24,164 - mmcls - INFO - Iter [2900/5000] lr: 3.008e-02, eta: 0:20:07, time: 0.391, data_time: 0.056, loss: 0.9042
2022-04-13 17:21:08,215 - mmcls - INFO - Saving checkpoint at 3000 iterations
2022-04-13 17:21:08,997 - mmcls - INFO - Iter [3000/5000] lr: 2.766e-02, eta: 0:19:01, time: 0.448, data_time: 0.097, loss: 0.8575
2022-04-13 17:21:46,340 - mmcls - INFO - Iter [3100/5000] lr: 2.530e-02, eta: 0:17:52, time: 0.373, data_time: 0.039, loss: 0.8599
2022-04-13 17:22:25,144 - mmcls - INFO - Iter [3200/5000] lr: 2.299e-02, eta: 0:16:46, time: 0.388, data_time: 0.053, loss: 0.8356
2022-04-13 17:23:05,904 - mmcls - INFO - Iter [3300/5000] lr: 2.075e-02, eta: 0:15:42, time: 0.408, data_time: 0.064, loss: 0.8158
2022-04-13 17:23:45,077 - mmcls - INFO - Iter [3400/5000] lr: 1.859e-02, eta: 0:14:39, time: 0.392, data_time: 0.058, loss: 0.7750
2022-04-13 17:24:31,449 - mmcls - INFO - Iter [3500/5000] lr: 1.651e-02, eta: 0:13:40, time: 0.464, data_time: 0.109, loss: 0.8225
2022-04-13 17:25:13,011 - mmcls - INFO - Iter [3600/5000] lr: 1.452e-02, eta: 0:12:41, time: 0.416, data_time: 0.080, loss: 0.7546
2022-04-13 17:25:53,874 - mmcls - INFO - Iter [3700/5000] lr: 1.264e-02, eta: 0:11:41, time: 0.409, data_time: 0.072, loss: 0.7133
2022-04-13 17:26:32,456 - mmcls - INFO - Iter [3800/5000] lr: 1.086e-02, eta: 0:10:43, time: 0.386, data_time: 0.051, loss: 0.7266
2022-04-13 17:27:11,448 - mmcls - INFO - Iter [3900/5000] lr: 9.195e-03, eta: 0:09:45, time: 0.390, data_time: 0.055, loss: 0.7177
2022-04-13 17:27:49,789 - mmcls - INFO - Saving checkpoint at 4000 iterations
2022-04-13 17:27:50,406 - mmcls - INFO - Iter [4000/5000] lr: 7.654e-03, eta: 0:08:48, time: 0.390, data_time: 0.049, loss: 0.7267
2022-04-13 17:28:32,828 - mmcls - INFO - Iter [4100/5000] lr: 6.240e-03, eta: 0:07:53, time: 0.424, data_time: 0.087, loss: 0.7084
2022-04-13 17:29:13,154 - mmcls - INFO - Iter [4200/5000] lr: 4.960e-03, eta: 0:06:58, time: 0.403, data_time: 0.069, loss: 0.7003
2022-04-13 17:29:51,701 - mmcls - INFO - Iter [4300/5000] lr: 3.818e-03, eta: 0:06:03, time: 0.385, data_time: 0.051, loss: 0.6764
2022-04-13 17:30:30,116 - mmcls - INFO - Iter [4400/5000] lr: 2.818e-03, eta: 0:05:10, time: 0.384, data_time: 0.050, loss: 0.6887
2022-04-13 17:31:08,479 - mmcls - INFO - Iter [4500/5000] lr: 1.966e-03, eta: 0:04:16, time: 0.384, data_time: 0.049, loss: 0.6532
2022-04-13 17:31:47,110 - mmcls - INFO - Iter [4600/5000] lr: 1.263e-03, eta: 0:03:24, time: 0.386, data_time: 0.052, loss: 0.6630
2022-04-13 17:32:25,315 - mmcls - INFO - Iter [4700/5000] lr: 7.132e-04, eta: 0:02:32, time: 0.382, data_time: 0.048, loss: 0.7071
2022-04-13 17:33:04,552 - mmcls - INFO - Iter [4800/5000] lr: 3.186e-04, eta: 0:01:41, time: 0.392, data_time: 0.058, loss: 0.6766
2022-04-13 17:33:44,367 - mmcls - INFO - Iter [4900/5000] lr: 8.052e-05, eta: 0:00:50, time: 0.398, data_time: 0.063, loss: 0.6759
2022-04-13 17:34:23,525 - mmcls - INFO - Saving checkpoint at 5000 iterations
2022-04-13 17:34:25,273 - mmcls - INFO - Iter [5000/5000] lr: 7.896e-09, eta: 0:00:00, time: 0.409, data_time: 0.056, loss: 0.6554