2023/01/23 20:29:21 - mmengine - INFO -
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.7.0 (default, Oct 9 2018, 10:31:47) [GCC 7.3.0]
    CUDA available: True
    numpy_random_seed: 340417316
    GPU 0,1,2,3,4,5,6,7: NVIDIA A100-SXM4-80GB
    CUDA_HOME: /mnt/cache/share/cuda-11.1
    NVCC: Cuda compilation tools, release 11.1, V11.1.74
    GCC: gcc (GCC) 5.4.0
    PyTorch: 1.9.0+cu111
    PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

    TorchVision: 0.10.0+cu111
    OpenCV: 4.6.0
    MMEngine: 0.4.0

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: None
    diff_rank_seed: False
    deterministic: False
    Distributed launcher: pytorch
    Distributed training: True
    GPU number: 32
------------------------------------------------------------

2023/01/23 20:29:21 - mmengine - INFO - Config:
model = dict(
    type='Recognizer3D',
    backbone=dict(type='MViT', arch='small', drop_path_rate=0.2),
    data_preprocessor=dict(
        type='ActionDataPreprocessor',
        mean=[114.75, 114.75, 114.75],
        std=[57.375, 57.375, 57.375],
        format_shape='NCTHW',
        blending=dict(
            type='RandomBatchAugment',
            augments=[
                dict(type='MixupBlending', alpha=0.8, num_classes=400),
                dict(type='CutmixBlending', alpha=1, num_classes=400)
            ])),
    cls_head=dict(
        type='MViTHead',
        in_channels=768,
        num_classes=400,
        label_smooth_eps=0.1,
        average_clips='prob'))
default_scope = 'mmaction'
default_hooks = dict(
    runtime_info=dict(type='RuntimeInfoHook'),
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=100, ignore_last=False),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(
        type='CheckpointHook',
        interval=1,
        save_best='auto',
        max_keep_ckpts=20),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    sync_buffers=dict(type='SyncBuffersHook'))
env_cfg = dict(
    cudnn_benchmark=False,
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
    dist_cfg=dict(backend='nccl'))
log_processor = dict(type='LogProcessor', window_size=20, by_epoch=True)
vis_backends = [
    dict(type='LocalVisBackend'),
    dict(type='TensorboardVisBackend')
]
visualizer = dict(
    type='ActionVisualizer',
    vis_backends=[
        dict(type='LocalVisBackend'),
        dict(type='TensorboardVisBackend')
    ])
log_level = 'INFO'
load_from = None
resume = False
dataset_type = 'VideoDataset'
data_root = 'data/kinetics400/videos_train'
data_root_val = 'data/kinetics400/videos_val'
ann_file_train = 'data/kinetics400/kinetics400_train_list_videos.txt'
ann_file_val = 'data/kinetics400/kinetics400_val_list_videos.txt'
ann_file_test = 'data/kinetics400/kinetics400_val_list_videos.txt'
file_client_args = dict(
    io_backend='petrel',
    path_mapping=dict({
        'data/kinetics400':
        's254:s3://openmmlab/datasets/action/Kinetics400'
    }))
train_pipeline = [
    dict(
        type='DecordInit',
        io_backend='petrel',
        path_mapping=dict({
            'data/kinetics400':
            's254:s3://openmmlab/datasets/action/Kinetics400'
        })),
    dict(type='SampleFrames', clip_len=16, frame_interval=4, num_clips=1),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(
        type='PytorchVideoWrapper',
        op='RandAugment',
        magnitude=7,
        num_layers=4),
    dict(type='RandomResizedCrop'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(type='RandomErasing', erase_prob=0.25, mode='rand'),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='PackActionInputs')
]
val_pipeline = [
    dict(
        type='DecordInit',
        io_backend='petrel',
        path_mapping=dict({
            'data/kinetics400':
            's254:s3://openmmlab/datasets/action/Kinetics400'
        })),
    dict(
        type='SampleFrames',
        clip_len=16,
        frame_interval=4,
        num_clips=1,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='PackActionInputs')
]
test_pipeline = [
    dict(
        type='DecordInit',
        io_backend='petrel',
        path_mapping=dict({
            'data/kinetics400':
            's254:s3://openmmlab/datasets/action/Kinetics400'
        })),
    dict(
        type='SampleFrames',
        clip_len=16,
        frame_interval=4,
        num_clips=5,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='PackActionInputs')
]
repeat_sample = 2
train_dataloader = dict(
    batch_size=8,
    num_workers=8,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    collate_fn=dict(type='repeat_pseudo_collate'),
    dataset=dict(
        type='RepeatAugDataset',
        num_repeats=2,
        sample_once=True,
        ann_file='data/kinetics400/kinetics400_train_list_videos.txt',
        data_prefix=dict(video='data/kinetics400/videos_train'),
        pipeline=[
            dict(
                type='DecordInit',
                io_backend='petrel',
                path_mapping=dict({
                    'data/kinetics400':
                    's254:s3://openmmlab/datasets/action/Kinetics400'
                })),
            dict(
                type='SampleFrames',
                clip_len=16,
                frame_interval=4,
                num_clips=1),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(
                type='PytorchVideoWrapper',
                op='RandAugment',
                magnitude=7,
                num_layers=4),
            dict(type='RandomResizedCrop'),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(type='Flip', flip_ratio=0.5),
            dict(type='RandomErasing', erase_prob=0.25, mode='rand'),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(type='PackActionInputs')
        ]))
val_dataloader = dict(
    batch_size=8,
    num_workers=8,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='VideoDataset',
        ann_file='data/kinetics400/kinetics400_val_list_videos.txt',
        data_prefix=dict(video='data/kinetics400/videos_val'),
        pipeline=[
            dict(
                type='DecordInit',
                io_backend='petrel',
                path_mapping=dict({
                    'data/kinetics400':
                    's254:s3://openmmlab/datasets/action/Kinetics400'
                })),
            dict(
                type='SampleFrames',
                clip_len=16,
                frame_interval=4,
                num_clips=1,
                test_mode=True),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(type='CenterCrop', crop_size=224),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(type='PackActionInputs')
        ],
        test_mode=True))
test_dataloader = dict(
    batch_size=1,
    num_workers=8,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='VideoDataset',
        ann_file='data/kinetics400/kinetics400_val_list_videos.txt',
        data_prefix=dict(video='data/kinetics400/videos_val'),
        pipeline=[
            dict(
                type='DecordInit',
                io_backend='petrel',
                path_mapping=dict({
                    'data/kinetics400':
                    's254:s3://openmmlab/datasets/action/Kinetics400'
                })),
            dict(
                type='SampleFrames',
                clip_len=16,
                frame_interval=4,
                num_clips=5,
                test_mode=True),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(type='CenterCrop', crop_size=224),
            dict(type='FormatShape', input_format='NCTHW'),
            dict(type='PackActionInputs')
        ],
        test_mode=True))
val_evaluator = dict(type='AccMetric')
test_evaluator = dict(type='AccMetric')
train_cfg = dict(
    type='EpochBasedTrainLoop', max_epochs=200, val_begin=1, val_interval=1)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
base_lr = 0.0016
optim_wrapper = dict(
    optimizer=dict(
        type='AdamW', lr=0.0016, betas=(0.9, 0.999), weight_decay=0.05),
    paramwise_cfg=dict(norm_decay_mult=0.0, bias_decay_mult=0.0),
    clip_grad=dict(max_norm=1, norm_type=2))
param_scheduler = [
    dict(
        type='LinearLR',
        start_factor=0.01,
        by_epoch=True,
        begin=0,
        end=30,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        T_max=200,
        eta_min=1.6e-05,
        by_epoch=True,
        begin=30,
        end=200,
        convert_to_iter_based=True)
]
auto_scale_lr = dict(enable=True, base_batch_size=256)
launcher = 'pytorch'
work_dir = 'work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e'
randomness = dict(seed=None, diff_rank_seed=False, deterministic=False)

2023/01/23 20:29:23 - mmengine - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) RuntimeInfoHook
(BELOW_NORMAL) LoggerHook
 --------------------
before_train:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(VERY_LOW ) CheckpointHook
 --------------------
before_train_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(NORMAL ) DistSamplerSeedHook
 --------------------
before_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
 --------------------
after_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook
 --------------------
after_train_epoch:
(NORMAL ) IterTimerHook
(NORMAL ) SyncBuffersHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook
 --------------------
before_val_epoch:
(NORMAL ) IterTimerHook
 --------------------
before_val_iter:
(NORMAL ) IterTimerHook
 --------------------
after_val_iter:
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_val_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(VERY_LOW ) CheckpointHook
 --------------------
before_test_epoch:
(NORMAL ) IterTimerHook
 --------------------
before_test_iter:
(NORMAL ) IterTimerHook
 --------------------
after_test_iter:
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_test_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_run:
(BELOW_NORMAL) LoggerHook
 --------------------
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.patch_embed.projection.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.norm1.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.norm1.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.attn.qkv.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.attn.proj.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.attn.norm_q.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.attn.norm_q.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.attn.norm_k.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.attn.norm_k.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.attn.norm_v.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.attn.norm_v.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.norm2.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.norm2.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.mlp.fc1.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.0.mlp.fc2.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.norm1.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.norm1.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.attn.qkv.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.attn.proj.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.attn.norm_q.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.attn.norm_q.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.attn.norm_k.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.attn.norm_k.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.attn.norm_v.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.attn.norm_v.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.norm2.weight:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.norm2.bias:weight_decay=0.0
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.mlp.fc1.bias:weight_decay=0.0
2023/01/23
20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.1.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.2.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.3.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - 
mmengine - INFO - paramwise_options -- backbone.blocks.4.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.4.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.5.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - 
mmengine - INFO - paramwise_options -- backbone.blocks.6.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.6.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.7.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.attn.norm_v.weight:weight_decay=0.0 2023/01/23 
20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.8.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.9.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.mlp.fc1.bias:weight_decay=0.0 
2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.10.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.11.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.12.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- 
backbone.blocks.13.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.13.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.14.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.norm1.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.norm1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.attn.qkv.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.attn.proj.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.attn.norm_q.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine 
- INFO - paramwise_options -- backbone.blocks.15.attn.norm_q.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.attn.norm_k.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.attn.norm_k.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.attn.norm_v.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.attn.norm_v.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.norm2.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.norm2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.mlp.fc1.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.blocks.15.mlp.fc2.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.norm3.weight:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- backbone.norm3.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - paramwise_options -- cls_head.fc_cls.bias:weight_decay=0.0 2023/01/23 20:29:24 - mmengine - INFO - LR is set based on batch size of 256 and the current batch size is 256. Scaling the original LR by 1.0. Name of parameter - Initialization information backbone.cls_token - torch.Size([1, 1, 96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.patch_embed.projection.weight - torch.Size([96, 3, 3, 7, 7]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.patch_embed.projection.bias - torch.Size([96]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.0.norm1.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.0.norm1.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.0.attn.rel_pos_h - torch.Size([111, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.0.attn.rel_pos_w - torch.Size([111, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.0.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.0.attn.qkv.weight - torch.Size([288, 96]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.0.attn.qkv.bias - torch.Size([288]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.0.attn.proj.weight - torch.Size([96, 96]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.0.attn.proj.bias - torch.Size([96]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.0.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.0.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.0.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.0.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.0.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.0.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 
backbone.blocks.0.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.0.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.0.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.0.norm2.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.0.norm2.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.0.mlp.fc1.weight - torch.Size([384, 96]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.0.mlp.fc1.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.0.mlp.fc2.weight - torch.Size([96, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.0.mlp.fc2.bias - torch.Size([96]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.norm1.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.1.norm1.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.1.attn.rel_pos_h - torch.Size([55, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.1.attn.rel_pos_w - torch.Size([55, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.1.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.1.attn.qkv.weight - torch.Size([576, 96]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.attn.qkv.bias - torch.Size([576]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.attn.proj.weight - torch.Size([192, 192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.attn.proj.bias - torch.Size([192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.1.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.1.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.1.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.1.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.1.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.1.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.1.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.1.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.1.norm2.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.1.norm2.bias - torch.Size([192]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.1.mlp.fc1.weight - torch.Size([768, 192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.mlp.fc1.bias - torch.Size([768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.mlp.fc2.weight - torch.Size([192, 
768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.mlp.fc2.bias - torch.Size([192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.proj.weight - torch.Size([192, 96]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.1.proj.bias - torch.Size([192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.2.norm1.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.2.norm1.bias - torch.Size([192]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.2.attn.rel_pos_h - torch.Size([55, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.2.attn.rel_pos_w - torch.Size([55, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.2.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.2.attn.qkv.weight - torch.Size([576, 192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.2.attn.qkv.bias - torch.Size([576]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.2.attn.proj.weight - torch.Size([192, 192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.2.attn.proj.bias - torch.Size([192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.2.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.2.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.2.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.2.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.2.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.2.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.2.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.2.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.2.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.2.norm2.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.2.norm2.bias - torch.Size([192]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.2.mlp.fc1.weight - torch.Size([768, 192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.2.mlp.fc1.bias - torch.Size([768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.2.mlp.fc2.weight - torch.Size([192, 768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.2.mlp.fc2.bias - torch.Size([192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.norm1.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.3.norm1.bias - torch.Size([192]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.3.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.3.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in 
MultiScaleAttention backbone.blocks.3.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.3.attn.qkv.weight - torch.Size([1152, 192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.3.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.3.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.3.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.3.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.3.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.3.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.3.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.3.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.3.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.3.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.3.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.proj.weight - torch.Size([384, 192]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.3.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.4.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.4.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.4.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.4.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.4.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.4.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.4.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.4.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.4.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 
backbone.blocks.4.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.4.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.4.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.4.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.4.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.4.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.4.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.4.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.4.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.4.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.4.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.4.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.4.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.4.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.4.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.5.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.5.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.5.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.5.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.5.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.5.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.5.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.5.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.5.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.5.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.5.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.5.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.5.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.5.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.5.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.5.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 
backbone.blocks.5.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.5.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.5.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.5.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.5.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.5.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.5.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.5.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.6.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.6.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.6.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.6.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.6.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.6.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.6.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.6.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.6.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.6.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.6.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.6.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.6.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.6.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.6.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.6.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.6.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.6.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.6.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.6.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.6.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.6.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.6.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.6.mlp.fc2.bias - 
torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.7.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.7.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.7.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.7.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.7.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.7.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.7.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.7.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.7.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.7.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.7.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.7.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.7.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.7.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.7.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.7.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.7.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.7.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.7.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.7.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.7.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.7.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.7.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.7.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.8.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.8.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.8.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.8.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.8.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.8.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.8.attn.qkv.bias - 
torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.8.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.8.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.8.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.8.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.8.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.8.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.8.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.8.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.8.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.8.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.8.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.8.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.8.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.8.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.8.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.8.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.8.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.9.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.9.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.9.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.9.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.9.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.9.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.9.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.9.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.9.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.9.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.9.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.9.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.9.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.9.attn.norm_k.weight - 
torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.9.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.9.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.9.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.9.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.9.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.9.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.9.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.9.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.9.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.9.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.10.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.10.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.10.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.10.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.10.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.10.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.10.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.10.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.10.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.10.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.10.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.10.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.10.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.10.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.10.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.10.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.10.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.10.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.10.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.10.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.10.mlp.fc1.weight - torch.Size([1536, 
384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.10.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.10.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.10.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.11.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.11.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.11.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.11.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.11.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.11.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.11.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.11.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.11.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.11.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.11.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.11.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.11.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.11.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.11.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.11.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.11.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.11.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.11.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.11.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.11.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.11.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.11.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.11.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.12.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.12.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.12.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.12.attn.rel_pos_w - torch.Size([27, 96]): 
Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.12.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.12.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.12.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.12.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.12.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.12.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.12.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.12.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.12.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.12.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.12.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.12.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.12.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.12.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.12.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.12.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.12.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.12.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.12.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.12.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.13.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.13.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.13.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.13.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.13.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.13.attn.qkv.weight - torch.Size([1152, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.13.attn.qkv.bias - torch.Size([1152]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.13.attn.proj.weight - torch.Size([384, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.13.attn.proj.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.13.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 
backbone.blocks.13.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.13.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.13.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.13.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.13.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.13.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.13.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.13.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.13.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.13.norm2.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.13.mlp.fc1.weight - torch.Size([1536, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.13.mlp.fc1.bias - torch.Size([1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.13.mlp.fc2.weight - torch.Size([384, 1536]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.13.mlp.fc2.bias - torch.Size([384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.norm1.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.14.norm1.bias - torch.Size([384]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.14.attn.rel_pos_h - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.14.attn.rel_pos_w - torch.Size([27, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.14.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.14.attn.qkv.weight - torch.Size([2304, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.attn.qkv.bias - torch.Size([2304]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.attn.proj.weight - torch.Size([768, 768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.attn.proj.bias - torch.Size([768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.14.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.14.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.14.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.14.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.14.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.14.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.14.attn.norm_v.weight - torch.Size([96]): The value is the same before and after 
calling `init_weights` of Recognizer3D backbone.blocks.14.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.14.norm2.weight - torch.Size([768]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.14.norm2.bias - torch.Size([768]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.14.mlp.fc1.weight - torch.Size([3072, 768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.mlp.fc1.bias - torch.Size([3072]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.mlp.fc2.weight - torch.Size([768, 3072]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.mlp.fc2.bias - torch.Size([768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.proj.weight - torch.Size([768, 384]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.14.proj.bias - torch.Size([768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.15.norm1.weight - torch.Size([768]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.15.norm1.bias - torch.Size([768]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.15.attn.rel_pos_h - torch.Size([13, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.15.attn.rel_pos_w - torch.Size([13, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.15.attn.rel_pos_t - torch.Size([15, 96]): Initialized by user-defined `init_weights` in MultiScaleAttention backbone.blocks.15.attn.qkv.weight - torch.Size([2304, 768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.15.attn.qkv.bias - torch.Size([2304]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.15.attn.proj.weight - torch.Size([768, 768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.15.attn.proj.bias - torch.Size([768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.15.attn.pool_q.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.15.attn.norm_q.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.15.attn.norm_q.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.15.attn.pool_k.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.15.attn.norm_k.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.15.attn.norm_k.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.15.attn.pool_v.weight - torch.Size([96, 1, 3, 3, 3]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0 backbone.blocks.15.attn.norm_v.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.15.attn.norm_v.bias - torch.Size([96]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.15.norm2.weight - torch.Size([768]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.blocks.15.norm2.bias - torch.Size([768]): ConstantInit: val=1.0, bias=0.02 backbone.blocks.15.mlp.fc1.weight - torch.Size([3072, 768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.15.mlp.fc1.bias - torch.Size([3072]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 
backbone.blocks.15.mlp.fc2.weight - torch.Size([768, 3072]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.blocks.15.mlp.fc2.bias - torch.Size([768]): TruncNormalInit: a=-2, b=2, mean=0, std=0.02, bias=0.02 backbone.norm3.weight - torch.Size([768]): The value is the same before and after calling `init_weights` of Recognizer3D backbone.norm3.bias - torch.Size([768]): ConstantInit: val=1.0, bias=0.02 cls_head.fc_cls.weight - torch.Size([400, 768]): The value is the same before and after calling `init_weights` of Recognizer3D cls_head.fc_cls.bias - torch.Size([400]): The value is the same before and after calling `init_weights` of Recognizer3D 2023/01/23 20:29:25 - mmengine - INFO - Checkpoints will be saved to /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e. 2023/01/23 20:32:08 - mmengine - INFO - Epoch(train) [1][100/940] lr: 2.1561e-05 eta: 3 days, 13:04:03 time: 0.9445 data_time: 0.0141 memory: 28828 grad_norm: 4.0223 loss: 6.1484 loss_cls: 6.1484 2023/01/23 20:33:42 - mmengine - INFO - Epoch(train) [1][200/940] lr: 2.7178e-05 eta: 2 days, 19:05:19 time: 0.9371 data_time: 0.0141 memory: 28828 grad_norm: 3.0517 loss: 6.0661 loss_cls: 6.0661 2023/01/23 20:35:15 - mmengine - INFO - Epoch(train) [1][300/940] lr: 3.2795e-05 eta: 2 days, 12:57:45 time: 0.9316 data_time: 0.0166 memory: 28828 grad_norm: 2.6916 loss: 6.1346 loss_cls: 6.1346 2023/01/23 20:36:49 - mmengine - INFO - Epoch(train) [1][400/940] lr: 3.8413e-05 eta: 2 days, 9:52:18 time: 0.9341 data_time: 0.0178 memory: 28828 grad_norm: 2.6116 loss: 6.0864 loss_cls: 6.0864 2023/01/23 20:38:22 - mmengine - INFO - Epoch(train) [1][500/940] lr: 4.4030e-05 eta: 2 days, 8:00:54 time: 0.9331 data_time: 0.0188 memory: 28828 grad_norm: 3.5836 loss: 6.0147 loss_cls: 6.0147 2023/01/23 20:39:56 - mmengine - INFO - Epoch(train) [1][600/940] lr: 4.9647e-05 eta: 2 days, 6:47:14 time: 0.9330 data_time: 0.0224 memory: 28828 grad_norm: 3.6823 loss: 5.9852 loss_cls: 5.9852 2023/01/23 20:41:30 - mmengine - INFO - Epoch(train) [1][700/940] lr: 5.5264e-05 eta: 2 days, 5:54:14 time: 0.9432 data_time: 0.0235 memory: 28828 grad_norm: 3.9733 loss: 5.9193 loss_cls: 5.9193 2023/01/23 20:43:04 - mmengine - INFO - Epoch(train) [1][800/940] lr: 6.0882e-05 eta: 2 days, 5:15:44 time: 0.9415 data_time: 0.0201 memory: 28828 grad_norm: 3.8125 loss: 5.8898 loss_cls: 5.8898 2023/01/23 20:44:38 - mmengine - INFO - Epoch(train) [1][900/940] lr: 6.6499e-05 eta: 2 days, 4:45:15 time: 0.9427 data_time: 0.0219 memory: 28828 grad_norm: 3.7139 loss: 5.8663 loss_cls: 5.8663 2023/01/23 20:45:14 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 20:45:14 - mmengine - INFO - Epoch(train) [1][940/940] lr: 6.8746e-05 eta: 2 days, 4:29:52 time: 0.8687 data_time: 0.0172 memory: 28828 grad_norm: 3.7325 loss: 5.8538 loss_cls: 5.8538 2023/01/23 20:45:14 - mmengine - INFO - Saving checkpoint at 1 epochs 2023/01/23 20:46:20 - mmengine - INFO - Epoch(val) [1][78/78] acc/top1: 0.0243 acc/top5: 0.0793 acc/mean1: 0.0242 2023/01/23 20:46:21 - mmengine - INFO - The best checkpoint with 0.0243 acc/top1 at 1 epoch is saved to best_acc/top1_epoch_1.pth. 
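The parameter dump above ends with the classifier head left untouched by `init_weights`, after which the first training epoch starts. For reference, below is a minimal sketch of the two initializers named throughout the dump (TruncNormalInit for linear/conv weights, ConstantInit for normalization layers). It is written against plain PyTorch as an illustration, not the MMEngine implementation, and the module types it matches are assumptions.

import torch.nn as nn

def init_like_log(module: nn.Module) -> None:
    """Apply trunc-normal / constant init roughly as described in the dump above."""
    if isinstance(module, (nn.Linear, nn.Conv3d)):
        # TruncNormalInit: a=-2, b=2, mean=0, std=0.02 (bias=0.02 where a bias exists)
        nn.init.trunc_normal_(module.weight, mean=0.0, std=0.02, a=-2.0, b=2.0)
        if module.bias is not None:
            nn.init.constant_(module.bias, 0.02)
    elif isinstance(module, nn.LayerNorm):
        # ConstantInit: val=1.0 for the weight, 0.02 for the bias, as logged
        nn.init.constant_(module.weight, 1.0)
        nn.init.constant_(module.bias, 0.02)

# Usage (sketch): model.apply(init_like_log)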
2023/01/23 20:47:27 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 20:48:05 - mmengine - INFO - Epoch(train) [2][100/940] lr: 7.4363e-05 eta: 2 days, 4:36:20 time: 0.9478 data_time: 0.0267 memory: 28828 grad_norm: 4.1147 loss: 5.8861 loss_cls: 5.8861 2023/01/23 20:49:39 - mmengine - INFO - Epoch(train) [2][200/940] lr: 7.9980e-05 eta: 2 days, 4:13:35 time: 0.9334 data_time: 0.0296 memory: 28828 grad_norm: 3.3882 loss: 5.9021 loss_cls: 5.9021 2023/01/23 20:51:12 - mmengine - INFO - Epoch(train) [2][300/940] lr: 8.5597e-05 eta: 2 days, 3:53:20 time: 0.9345 data_time: 0.0278 memory: 28828 grad_norm: 3.4185 loss: 5.8177 loss_cls: 5.8177 2023/01/23 20:52:45 - mmengine - INFO - Epoch(train) [2][400/940] lr: 9.1215e-05 eta: 2 days, 3:35:49 time: 0.9306 data_time: 0.0252 memory: 28828 grad_norm: 3.5014 loss: 5.8325 loss_cls: 5.8325 2023/01/23 20:54:18 - mmengine - INFO - Epoch(train) [2][500/940] lr: 9.6832e-05 eta: 2 days, 3:20:17 time: 0.9352 data_time: 0.0262 memory: 28828 grad_norm: 3.5702 loss: 5.7769 loss_cls: 5.7769 2023/01/23 20:55:51 - mmengine - INFO - Epoch(train) [2][600/940] lr: 1.0245e-04 eta: 2 days, 3:06:39 time: 0.9314 data_time: 0.0270 memory: 28828 grad_norm: 3.0516 loss: 5.6508 loss_cls: 5.6508 2023/01/23 20:57:24 - mmengine - INFO - Epoch(train) [2][700/940] lr: 1.0807e-04 eta: 2 days, 2:54:32 time: 0.9381 data_time: 0.0266 memory: 28828 grad_norm: 3.1018 loss: 5.7515 loss_cls: 5.7515 2023/01/23 20:58:58 - mmengine - INFO - Epoch(train) [2][800/940] lr: 1.1368e-04 eta: 2 days, 2:44:00 time: 0.9354 data_time: 0.0268 memory: 28828 grad_norm: 3.0634 loss: 5.7125 loss_cls: 5.7125 2023/01/23 21:00:31 - mmengine - INFO - Epoch(train) [2][900/940] lr: 1.1930e-04 eta: 2 days, 2:34:00 time: 0.9274 data_time: 0.0257 memory: 28828 grad_norm: 2.9443 loss: 5.6091 loss_cls: 5.6091 2023/01/23 21:01:07 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 21:01:07 - mmengine - INFO - Epoch(train) [2][940/940] lr: 1.2155e-04 eta: 2 days, 2:28:21 time: 0.8696 data_time: 0.0157 memory: 28828 grad_norm: 2.9616 loss: 5.6809 loss_cls: 5.6809 2023/01/23 21:01:07 - mmengine - INFO - Saving checkpoint at 2 epochs 2023/01/23 21:01:23 - mmengine - INFO - Epoch(val) [2][78/78] acc/top1: 0.0400 acc/top5: 0.1358 acc/mean1: 0.0398 2023/01/23 21:01:23 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_1.pth is removed 2023/01/23 21:01:24 - mmengine - INFO - The best checkpoint with 0.0400 acc/top1 at 2 epoch is saved to best_acc/top1_epoch_2.pth. 
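The lr column above grows by a fixed 5.617e-06 every 100 iterations, i.e. a per-iteration linear warm-up. A minimal sketch of that schedule follows; the start value, target value and warm-up length are inferred from the logged numbers, not read from the config, so treat them as assumptions.

def linear_warmup_lr(iteration: int,
                     lr_start: float = 1.6e-05,
                     lr_base: float = 1.6e-03,
                     warmup_iters: int = 940 * 30) -> float:
    """Linearly interpolate from lr_start to lr_base over warmup_iters iterations."""
    if iteration >= warmup_iters:
        return lr_base
    return lr_start + (iteration / warmup_iters) * (lr_base - lr_start)

# linear_warmup_lr(100) is about 2.16e-05, close to the first logged lr entry.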
2023/01/23 21:03:06 - mmengine - INFO - Epoch(train) [3][100/940] lr: 1.2716e-04 eta: 2 days, 2:32:03 time: 0.9299 data_time: 0.0278 memory: 28828 grad_norm: 3.0613 loss: 5.6901 loss_cls: 5.6901 2023/01/23 21:03:24 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 21:04:39 - mmengine - INFO - Epoch(train) [3][200/940] lr: 1.3278e-04 eta: 2 days, 2:23:18 time: 0.9360 data_time: 0.0269 memory: 28828 grad_norm: 2.9625 loss: 5.6835 loss_cls: 5.6835 2023/01/23 21:06:12 - mmengine - INFO - Epoch(train) [3][300/940] lr: 1.3840e-04 eta: 2 days, 2:15:17 time: 0.9302 data_time: 0.0286 memory: 28828 grad_norm: 3.0749 loss: 5.5133 loss_cls: 5.5133 2023/01/23 21:07:45 - mmengine - INFO - Epoch(train) [3][400/940] lr: 1.4402e-04 eta: 2 days, 2:08:06 time: 0.9377 data_time: 0.0269 memory: 28828 grad_norm: 3.0399 loss: 5.6335 loss_cls: 5.6335 2023/01/23 21:09:18 - mmengine - INFO - Epoch(train) [3][500/940] lr: 1.4963e-04 eta: 2 days, 2:01:12 time: 0.9271 data_time: 0.0291 memory: 28828 grad_norm: 2.8750 loss: 5.6313 loss_cls: 5.6313 2023/01/23 21:10:51 - mmengine - INFO - Epoch(train) [3][600/940] lr: 1.5525e-04 eta: 2 days, 1:54:50 time: 0.9319 data_time: 0.0288 memory: 28828 grad_norm: 2.9826 loss: 5.5793 loss_cls: 5.5793 2023/01/23 21:12:24 - mmengine - INFO - Epoch(train) [3][700/940] lr: 1.6087e-04 eta: 2 days, 1:48:49 time: 0.9363 data_time: 0.0295 memory: 28828 grad_norm: 3.0030 loss: 5.5776 loss_cls: 5.5776 2023/01/23 21:13:58 - mmengine - INFO - Epoch(train) [3][800/940] lr: 1.6649e-04 eta: 2 days, 1:43:22 time: 0.9322 data_time: 0.0283 memory: 28828 grad_norm: 2.7658 loss: 5.4999 loss_cls: 5.4999 2023/01/23 21:15:31 - mmengine - INFO - Epoch(train) [3][900/940] lr: 1.7210e-04 eta: 2 days, 1:37:50 time: 0.9303 data_time: 0.0249 memory: 28828 grad_norm: 2.6775 loss: 5.5104 loss_cls: 5.5104 2023/01/23 21:16:07 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 21:16:07 - mmengine - INFO - Epoch(train) [3][940/940] lr: 1.7435e-04 eta: 2 days, 1:34:18 time: 0.8655 data_time: 0.0176 memory: 28828 grad_norm: 2.8642 loss: 5.4843 loss_cls: 5.4843 2023/01/23 21:16:07 - mmengine - INFO - Saving checkpoint at 3 epochs 2023/01/23 21:16:22 - mmengine - INFO - Epoch(val) [3][78/78] acc/top1: 0.0611 acc/top5: 0.1874 acc/mean1: 0.0610 2023/01/23 21:16:22 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_2.pth is removed 2023/01/23 21:16:24 - mmengine - INFO - The best checkpoint with 0.0611 acc/top1 at 3 epoch is saved to best_acc/top1_epoch_3.pth. 
2023/01/23 21:18:06 - mmengine - INFO - Epoch(train) [4][100/940] lr: 1.7997e-04 eta: 2 days, 1:38:32 time: 0.9368 data_time: 0.0245 memory: 28828 grad_norm: 2.7579 loss: 5.5203 loss_cls: 5.5203 2023/01/23 21:19:20 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 21:19:39 - mmengine - INFO - Epoch(train) [4][200/940] lr: 1.8558e-04 eta: 2 days, 1:33:22 time: 0.9328 data_time: 0.0273 memory: 28828 grad_norm: 2.7013 loss: 5.6208 loss_cls: 5.6208 2023/01/23 21:21:14 - mmengine - INFO - Epoch(train) [4][300/940] lr: 1.9120e-04 eta: 2 days, 1:30:12 time: 0.9406 data_time: 0.0282 memory: 28828 grad_norm: 2.5254 loss: 5.5278 loss_cls: 5.5278 2023/01/23 21:22:48 - mmengine - INFO - Epoch(train) [4][400/940] lr: 1.9682e-04 eta: 2 days, 1:26:09 time: 0.9327 data_time: 0.0313 memory: 28828 grad_norm: 2.7506 loss: 5.5233 loss_cls: 5.5233 2023/01/23 21:24:21 - mmengine - INFO - Epoch(train) [4][500/940] lr: 2.0244e-04 eta: 2 days, 1:21:38 time: 0.9336 data_time: 0.0299 memory: 28828 grad_norm: 2.6840 loss: 5.5167 loss_cls: 5.5167 2023/01/23 21:25:54 - mmengine - INFO - Epoch(train) [4][600/940] lr: 2.0805e-04 eta: 2 days, 1:17:12 time: 0.9267 data_time: 0.0306 memory: 28828 grad_norm: 2.6273 loss: 5.5464 loss_cls: 5.5464 2023/01/23 21:27:27 - mmengine - INFO - Epoch(train) [4][700/940] lr: 2.1367e-04 eta: 2 days, 1:12:41 time: 0.9299 data_time: 0.0301 memory: 28828 grad_norm: 2.6779 loss: 5.5808 loss_cls: 5.5808 2023/01/23 21:29:00 - mmengine - INFO - Epoch(train) [4][800/940] lr: 2.1929e-04 eta: 2 days, 1:08:39 time: 0.9301 data_time: 0.0311 memory: 28828 grad_norm: 2.7454 loss: 5.5536 loss_cls: 5.5536 2023/01/23 21:30:33 - mmengine - INFO - Epoch(train) [4][900/940] lr: 2.2490e-04 eta: 2 days, 1:04:46 time: 0.9340 data_time: 0.0320 memory: 28828 grad_norm: 2.3370 loss: 5.5040 loss_cls: 5.5040 2023/01/23 21:31:09 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 21:31:09 - mmengine - INFO - Epoch(train) [4][940/940] lr: 2.2715e-04 eta: 2 days, 1:02:10 time: 0.8659 data_time: 0.0188 memory: 28828 grad_norm: 2.5931 loss: 5.3826 loss_cls: 5.3826 2023/01/23 21:31:09 - mmengine - INFO - Saving checkpoint at 4 epochs 2023/01/23 21:31:25 - mmengine - INFO - Epoch(val) [4][78/78] acc/top1: 0.0787 acc/top5: 0.2187 acc/mean1: 0.0784 2023/01/23 21:31:25 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_3.pth is removed 2023/01/23 21:31:26 - mmengine - INFO - The best checkpoint with 0.0787 acc/top1 at 4 epoch is saved to best_acc/top1_epoch_4.pth. 
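Each Epoch(val) line reports acc/top1, acc/top5 and acc/mean1 over the 78 validation batches. The sketch below shows one standard way to compute these three numbers from per-sample class scores and labels; it is an assumed equivalent, not MMAction2's metric code.

import numpy as np

def topk_accuracy(scores: np.ndarray, labels: np.ndarray, k: int = 1) -> float:
    """scores: (N, num_classes), labels: (N,). Fraction of samples whose label is in the top-k."""
    topk = np.argsort(scores, axis=1)[:, -k:]        # indices of the k highest-scored classes
    hits = (topk == labels[:, None]).any(axis=1)
    return float(hits.mean())

def mean_class_accuracy(scores: np.ndarray, labels: np.ndarray) -> float:
    """Average of per-class top-1 recall (acc/mean1 in the log)."""
    pred = scores.argmax(axis=1)
    accs = [(pred[labels == c] == c).mean() for c in np.unique(labels)]
    return float(np.mean(accs))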
2023/01/23 21:33:08 - mmengine - INFO - Epoch(train) [5][100/940] lr: 2.3277e-04 eta: 2 days, 1:05:17 time: 0.9299 data_time: 0.0291 memory: 28828 grad_norm: 2.5213 loss: 5.5057 loss_cls: 5.5057 2023/01/23 21:34:41 - mmengine - INFO - Epoch(train) [5][200/940] lr: 2.3839e-04 eta: 2 days, 1:01:22 time: 0.9308 data_time: 0.0311 memory: 28828 grad_norm: 2.7509 loss: 5.5312 loss_cls: 5.5312 2023/01/23 21:35:18 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 21:36:14 - mmengine - INFO - Epoch(train) [5][300/940] lr: 2.4400e-04 eta: 2 days, 0:57:31 time: 0.9273 data_time: 0.0298 memory: 28828 grad_norm: 2.4860 loss: 5.3030 loss_cls: 5.3030 2023/01/23 21:37:47 - mmengine - INFO - Epoch(train) [5][400/940] lr: 2.4962e-04 eta: 2 days, 0:53:53 time: 0.9302 data_time: 0.0294 memory: 28828 grad_norm: 2.4579 loss: 5.3558 loss_cls: 5.3558 2023/01/23 21:39:20 - mmengine - INFO - Epoch(train) [5][500/940] lr: 2.5524e-04 eta: 2 days, 0:50:31 time: 0.9369 data_time: 0.0345 memory: 28828 grad_norm: 2.3785 loss: 5.3276 loss_cls: 5.3276 2023/01/23 21:40:53 - mmengine - INFO - Epoch(train) [5][600/940] lr: 2.6085e-04 eta: 2 days, 0:46:58 time: 0.9267 data_time: 0.0299 memory: 28828 grad_norm: 2.5989 loss: 5.2950 loss_cls: 5.2950 2023/01/23 21:42:26 - mmengine - INFO - Epoch(train) [5][700/940] lr: 2.6647e-04 eta: 2 days, 0:43:28 time: 0.9311 data_time: 0.0299 memory: 28828 grad_norm: 2.1729 loss: 5.4857 loss_cls: 5.4857 2023/01/23 21:43:59 - mmengine - INFO - Epoch(train) [5][800/940] lr: 2.7209e-04 eta: 2 days, 0:40:02 time: 0.9271 data_time: 0.0307 memory: 28828 grad_norm: 2.2351 loss: 5.5254 loss_cls: 5.5254 2023/01/23 21:45:32 - mmengine - INFO - Epoch(train) [5][900/940] lr: 2.7771e-04 eta: 2 days, 0:36:54 time: 0.9352 data_time: 0.0307 memory: 28828 grad_norm: 2.7385 loss: 5.4444 loss_cls: 5.4444 2023/01/23 21:46:08 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 21:46:08 - mmengine - INFO - Epoch(train) [5][940/940] lr: 2.7995e-04 eta: 2 days, 0:34:52 time: 0.8661 data_time: 0.0192 memory: 28828 grad_norm: 2.4933 loss: 5.4922 loss_cls: 5.4922 2023/01/23 21:46:08 - mmengine - INFO - Saving checkpoint at 5 epochs 2023/01/23 21:46:30 - mmengine - INFO - Epoch(val) [5][78/78] acc/top1: 0.1009 acc/top5: 0.2780 acc/mean1: 0.1008 2023/01/23 21:46:30 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_4.pth is removed 2023/01/23 21:46:31 - mmengine - INFO - The best checkpoint with 0.1009 acc/top1 at 5 epoch is saved to best_acc/top1_epoch_5.pth. 
2023/01/23 21:48:12 - mmengine - INFO - Epoch(train) [6][100/940] lr: 2.8557e-04 eta: 2 days, 0:37:05 time: 0.9297 data_time: 0.0273 memory: 28828 grad_norm: 2.3337 loss: 5.5250 loss_cls: 5.5250 2023/01/23 21:49:45 - mmengine - INFO - Epoch(train) [6][200/940] lr: 2.9119e-04 eta: 2 days, 0:33:54 time: 0.9281 data_time: 0.0288 memory: 28828 grad_norm: 2.1961 loss: 5.4755 loss_cls: 5.4755 2023/01/23 21:51:18 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 21:51:18 - mmengine - INFO - Epoch(train) [6][300/940] lr: 2.9680e-04 eta: 2 days, 0:30:47 time: 0.9307 data_time: 0.0368 memory: 28828 grad_norm: 2.0414 loss: 5.3910 loss_cls: 5.3910 2023/01/23 21:52:52 - mmengine - INFO - Epoch(train) [6][400/940] lr: 3.0242e-04 eta: 2 days, 0:28:00 time: 0.9308 data_time: 0.0322 memory: 28828 grad_norm: 2.1820 loss: 5.2566 loss_cls: 5.2566 2023/01/23 21:54:25 - mmengine - INFO - Epoch(train) [6][500/940] lr: 3.0804e-04 eta: 2 days, 0:25:11 time: 0.9340 data_time: 0.0346 memory: 28828 grad_norm: 2.1287 loss: 5.3693 loss_cls: 5.3693 2023/01/23 21:55:59 - mmengine - INFO - Epoch(train) [6][600/940] lr: 3.1366e-04 eta: 2 days, 0:22:30 time: 0.9327 data_time: 0.0301 memory: 28828 grad_norm: 1.9304 loss: 5.4582 loss_cls: 5.4582 2023/01/23 21:57:32 - mmengine - INFO - Epoch(train) [6][700/940] lr: 3.1927e-04 eta: 2 days, 0:19:45 time: 0.9342 data_time: 0.0287 memory: 28828 grad_norm: 1.8440 loss: 5.3557 loss_cls: 5.3557 2023/01/23 21:59:05 - mmengine - INFO - Epoch(train) [6][800/940] lr: 3.2489e-04 eta: 2 days, 0:17:01 time: 0.9358 data_time: 0.0316 memory: 28828 grad_norm: 2.0094 loss: 5.2955 loss_cls: 5.2955 2023/01/23 22:00:38 - mmengine - INFO - Epoch(train) [6][900/940] lr: 3.3051e-04 eta: 2 days, 0:14:13 time: 0.9335 data_time: 0.0295 memory: 28828 grad_norm: 2.0288 loss: 5.3854 loss_cls: 5.3854 2023/01/23 22:01:14 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 22:01:14 - mmengine - INFO - Epoch(train) [6][940/940] lr: 3.3276e-04 eta: 2 days, 0:12:25 time: 0.8650 data_time: 0.0189 memory: 28828 grad_norm: 2.1629 loss: 5.3829 loss_cls: 5.3829 2023/01/23 22:01:14 - mmengine - INFO - Saving checkpoint at 6 epochs 2023/01/23 22:01:32 - mmengine - INFO - Epoch(val) [6][78/78] acc/top1: 0.1220 acc/top5: 0.3052 acc/mean1: 0.1219 2023/01/23 22:01:32 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_5.pth is removed 2023/01/23 22:01:34 - mmengine - INFO - The best checkpoint with 0.1220 acc/top1 at 6 epoch is saved to best_acc/top1_epoch_6.pth. 
2023/01/23 22:03:16 - mmengine - INFO - Epoch(train) [7][100/940] lr: 3.3837e-04 eta: 2 days, 0:14:34 time: 0.9271 data_time: 0.0300 memory: 28828 grad_norm: 2.3006 loss: 5.3134 loss_cls: 5.3134 2023/01/23 22:04:49 - mmengine - INFO - Epoch(train) [7][200/940] lr: 3.4399e-04 eta: 2 days, 0:11:46 time: 0.9297 data_time: 0.0306 memory: 28828 grad_norm: 1.9092 loss: 5.3983 loss_cls: 5.3983 2023/01/23 22:06:22 - mmengine - INFO - Epoch(train) [7][300/940] lr: 3.4961e-04 eta: 2 days, 0:09:05 time: 0.9305 data_time: 0.0316 memory: 28828 grad_norm: 1.7775 loss: 5.4932 loss_cls: 5.4932 2023/01/23 22:07:18 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 22:07:55 - mmengine - INFO - Epoch(train) [7][400/940] lr: 3.5522e-04 eta: 2 days, 0:06:20 time: 0.9270 data_time: 0.0290 memory: 28828 grad_norm: 1.9507 loss: 5.2303 loss_cls: 5.2303 2023/01/23 22:09:28 - mmengine - INFO - Epoch(train) [7][500/940] lr: 3.6084e-04 eta: 2 days, 0:03:38 time: 0.9256 data_time: 0.0325 memory: 28828 grad_norm: 1.7598 loss: 5.3016 loss_cls: 5.3016 2023/01/23 22:11:01 - mmengine - INFO - Epoch(train) [7][600/940] lr: 3.6646e-04 eta: 2 days, 0:01:01 time: 0.9292 data_time: 0.0288 memory: 28828 grad_norm: 1.8793 loss: 5.0879 loss_cls: 5.0879 2023/01/23 22:12:34 - mmengine - INFO - Epoch(train) [7][700/940] lr: 3.7208e-04 eta: 1 day, 23:58:25 time: 0.9324 data_time: 0.0293 memory: 28828 grad_norm: 1.6398 loss: 5.2318 loss_cls: 5.2318 2023/01/23 22:14:07 - mmengine - INFO - Epoch(train) [7][800/940] lr: 3.7769e-04 eta: 1 day, 23:55:48 time: 0.9315 data_time: 0.0297 memory: 28828 grad_norm: 1.5853 loss: 5.4735 loss_cls: 5.4735 2023/01/23 22:15:40 - mmengine - INFO - Epoch(train) [7][900/940] lr: 3.8331e-04 eta: 1 day, 23:53:17 time: 0.9278 data_time: 0.0301 memory: 28828 grad_norm: 1.7242 loss: 5.3481 loss_cls: 5.3481 2023/01/23 22:16:15 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 22:16:15 - mmengine - INFO - Epoch(train) [7][940/940] lr: 3.8556e-04 eta: 1 day, 23:51:41 time: 0.8677 data_time: 0.0200 memory: 28828 grad_norm: 1.6953 loss: 5.2503 loss_cls: 5.2503 2023/01/23 22:16:15 - mmengine - INFO - Saving checkpoint at 7 epochs 2023/01/23 22:16:31 - mmengine - INFO - Epoch(val) [7][78/78] acc/top1: 0.1422 acc/top5: 0.3476 acc/mean1: 0.1420 2023/01/23 22:16:31 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_6.pth is removed 2023/01/23 22:16:32 - mmengine - INFO - The best checkpoint with 0.1422 acc/top1 at 7 epoch is saved to best_acc/top1_epoch_7.pth. 
2023/01/23 22:18:15 - mmengine - INFO - Epoch(train) [8][100/940] lr: 3.9117e-04 eta: 1 day, 23:53:23 time: 0.9303 data_time: 0.0303 memory: 28828 grad_norm: 1.7561 loss: 5.3039 loss_cls: 5.3039 2023/01/23 22:19:48 - mmengine - INFO - Epoch(train) [8][200/940] lr: 3.9679e-04 eta: 1 day, 23:50:52 time: 0.9297 data_time: 0.0322 memory: 28828 grad_norm: 1.7813 loss: 5.2273 loss_cls: 5.2273 2023/01/23 22:21:21 - mmengine - INFO - Epoch(train) [8][300/940] lr: 4.0241e-04 eta: 1 day, 23:48:20 time: 0.9294 data_time: 0.0282 memory: 28828 grad_norm: 1.6450 loss: 5.2799 loss_cls: 5.2799 2023/01/23 22:22:53 - mmengine - INFO - Epoch(train) [8][400/940] lr: 4.0803e-04 eta: 1 day, 23:45:49 time: 0.9270 data_time: 0.0318 memory: 28828 grad_norm: 1.5670 loss: 5.3931 loss_cls: 5.3931 2023/01/23 22:23:12 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 22:24:26 - mmengine - INFO - Epoch(train) [8][500/940] lr: 4.1364e-04 eta: 1 day, 23:43:21 time: 0.9316 data_time: 0.0285 memory: 28828 grad_norm: 1.6401 loss: 5.2424 loss_cls: 5.2424 2023/01/23 22:25:59 - mmengine - INFO - Epoch(train) [8][600/940] lr: 4.1926e-04 eta: 1 day, 23:40:55 time: 0.9311 data_time: 0.0283 memory: 28828 grad_norm: 1.5313 loss: 5.2679 loss_cls: 5.2679 2023/01/23 22:27:32 - mmengine - INFO - Epoch(train) [8][700/940] lr: 4.2488e-04 eta: 1 day, 23:38:37 time: 0.9355 data_time: 0.0262 memory: 28828 grad_norm: 1.6227 loss: 5.1381 loss_cls: 5.1381 2023/01/23 22:29:06 - mmengine - INFO - Epoch(train) [8][800/940] lr: 4.3049e-04 eta: 1 day, 23:36:19 time: 0.9324 data_time: 0.0317 memory: 28828 grad_norm: 1.5982 loss: 5.2599 loss_cls: 5.2599 2023/01/23 22:30:38 - mmengine - INFO - Epoch(train) [8][900/940] lr: 4.3611e-04 eta: 1 day, 23:33:56 time: 0.9280 data_time: 0.0269 memory: 28828 grad_norm: 1.4309 loss: 5.2478 loss_cls: 5.2478 2023/01/23 22:31:14 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 22:31:14 - mmengine - INFO - Epoch(train) [8][940/940] lr: 4.3836e-04 eta: 1 day, 23:32:30 time: 0.8678 data_time: 0.0174 memory: 28828 grad_norm: 1.8274 loss: 5.2657 loss_cls: 5.2657 2023/01/23 22:31:14 - mmengine - INFO - Saving checkpoint at 8 epochs 2023/01/23 22:31:30 - mmengine - INFO - Epoch(val) [8][78/78] acc/top1: 0.1678 acc/top5: 0.3872 acc/mean1: 0.1675 2023/01/23 22:31:30 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_7.pth is removed 2023/01/23 22:31:31 - mmengine - INFO - The best checkpoint with 0.1678 acc/top1 at 8 epoch is saved to best_acc/top1_epoch_8.pth. 
2023/01/23 22:33:13 - mmengine - INFO - Epoch(train) [9][100/940] lr: 4.4398e-04 eta: 1 day, 23:33:35 time: 0.9359 data_time: 0.0296 memory: 28828 grad_norm: 1.5651 loss: 5.2069 loss_cls: 5.2069 2023/01/23 22:34:46 - mmengine - INFO - Epoch(train) [9][200/940] lr: 4.4959e-04 eta: 1 day, 23:31:14 time: 0.9301 data_time: 0.0270 memory: 28828 grad_norm: 1.3830 loss: 5.3177 loss_cls: 5.3177 2023/01/23 22:36:19 - mmengine - INFO - Epoch(train) [9][300/940] lr: 4.5521e-04 eta: 1 day, 23:28:58 time: 0.9284 data_time: 0.0277 memory: 28828 grad_norm: 1.7562 loss: 5.0155 loss_cls: 5.0155 2023/01/23 22:37:52 - mmengine - INFO - Epoch(train) [9][400/940] lr: 4.6083e-04 eta: 1 day, 23:26:46 time: 0.9318 data_time: 0.0270 memory: 28828 grad_norm: 1.2850 loss: 5.3374 loss_cls: 5.3374 2023/01/23 22:39:07 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 22:39:25 - mmengine - INFO - Epoch(train) [9][500/940] lr: 4.6644e-04 eta: 1 day, 23:24:33 time: 0.9350 data_time: 0.0281 memory: 28828 grad_norm: 1.3996 loss: 5.4645 loss_cls: 5.4645 2023/01/23 22:40:58 - mmengine - INFO - Epoch(train) [9][600/940] lr: 4.7206e-04 eta: 1 day, 23:22:16 time: 0.9278 data_time: 0.0250 memory: 28828 grad_norm: 1.5427 loss: 5.1027 loss_cls: 5.1027 2023/01/23 22:42:31 - mmengine - INFO - Epoch(train) [9][700/940] lr: 4.7768e-04 eta: 1 day, 23:20:03 time: 0.9310 data_time: 0.0279 memory: 28828 grad_norm: 1.5187 loss: 5.1165 loss_cls: 5.1165 2023/01/23 22:44:05 - mmengine - INFO - Epoch(train) [9][800/940] lr: 4.8330e-04 eta: 1 day, 23:17:51 time: 0.9318 data_time: 0.0251 memory: 28828 grad_norm: 1.4881 loss: 5.2652 loss_cls: 5.2652 2023/01/23 22:45:37 - mmengine - INFO - Epoch(train) [9][900/940] lr: 4.8891e-04 eta: 1 day, 23:15:36 time: 0.9275 data_time: 0.0286 memory: 28828 grad_norm: 1.2911 loss: 5.3239 loss_cls: 5.3239 2023/01/23 22:46:13 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 22:46:13 - mmengine - INFO - Epoch(train) [9][940/940] lr: 4.9116e-04 eta: 1 day, 23:14:16 time: 0.8652 data_time: 0.0182 memory: 28828 grad_norm: 1.5003 loss: 5.1409 loss_cls: 5.1409 2023/01/23 22:46:13 - mmengine - INFO - Saving checkpoint at 9 epochs 2023/01/23 22:46:29 - mmengine - INFO - Epoch(val) [9][78/78] acc/top1: 0.1916 acc/top5: 0.4316 acc/mean1: 0.1914 2023/01/23 22:46:29 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_8.pth is removed 2023/01/23 22:46:31 - mmengine - INFO - The best checkpoint with 0.1916 acc/top1 at 9 epoch is saved to best_acc/top1_epoch_9.pth. 
2023/01/23 22:48:12 - mmengine - INFO - Epoch(train) [10][100/940] lr: 4.9678e-04 eta: 1 day, 23:15:01 time: 0.9347 data_time: 0.0306 memory: 28828 grad_norm: 1.3534 loss: 5.0636 loss_cls: 5.0636 2023/01/23 22:49:45 - mmengine - INFO - Epoch(train) [10][200/940] lr: 5.0240e-04 eta: 1 day, 23:12:52 time: 0.9330 data_time: 0.0293 memory: 28828 grad_norm: 1.4208 loss: 5.1764 loss_cls: 5.1764 2023/01/23 22:51:18 - mmengine - INFO - Epoch(train) [10][300/940] lr: 5.0801e-04 eta: 1 day, 23:10:38 time: 0.9290 data_time: 0.0296 memory: 28828 grad_norm: 1.4553 loss: 5.1178 loss_cls: 5.1178 2023/01/23 22:52:51 - mmengine - INFO - Epoch(train) [10][400/940] lr: 5.1363e-04 eta: 1 day, 23:08:27 time: 0.9316 data_time: 0.0309 memory: 28828 grad_norm: 1.3950 loss: 5.1149 loss_cls: 5.1149 2023/01/23 22:54:24 - mmengine - INFO - Epoch(train) [10][500/940] lr: 5.1925e-04 eta: 1 day, 23:06:21 time: 0.9291 data_time: 0.0307 memory: 28828 grad_norm: 1.4616 loss: 4.8969 loss_cls: 4.8969 2023/01/23 22:55:01 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 22:55:57 - mmengine - INFO - Epoch(train) [10][600/940] lr: 5.2486e-04 eta: 1 day, 23:04:14 time: 0.9343 data_time: 0.0296 memory: 28828 grad_norm: 1.3785 loss: 5.1235 loss_cls: 5.1235 2023/01/23 22:57:31 - mmengine - INFO - Epoch(train) [10][700/940] lr: 5.3048e-04 eta: 1 day, 23:02:13 time: 0.9275 data_time: 0.0319 memory: 28828 grad_norm: 1.2130 loss: 5.1397 loss_cls: 5.1397 2023/01/23 22:59:04 - mmengine - INFO - Epoch(train) [10][800/940] lr: 5.3610e-04 eta: 1 day, 23:00:11 time: 0.9344 data_time: 0.0283 memory: 28828 grad_norm: 1.2631 loss: 5.1059 loss_cls: 5.1059 2023/01/23 23:00:38 - mmengine - INFO - Epoch(train) [10][900/940] lr: 5.4172e-04 eta: 1 day, 22:58:18 time: 0.9344 data_time: 0.0259 memory: 28828 grad_norm: 1.3248 loss: 5.1094 loss_cls: 5.1094 2023/01/23 23:01:14 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 23:01:14 - mmengine - INFO - Epoch(train) [10][940/940] lr: 5.4396e-04 eta: 1 day, 22:57:05 time: 0.8669 data_time: 0.0170 memory: 28828 grad_norm: 1.3250 loss: 5.1150 loss_cls: 5.1150 2023/01/23 23:01:14 - mmengine - INFO - Saving checkpoint at 10 epochs 2023/01/23 23:01:31 - mmengine - INFO - Epoch(val) [10][78/78] acc/top1: 0.2157 acc/top5: 0.4616 acc/mean1: 0.2155 2023/01/23 23:01:31 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_9.pth is removed 2023/01/23 23:01:32 - mmengine - INFO - The best checkpoint with 0.2157 acc/top1 at 10 epoch is saved to best_acc/top1_epoch_10.pth. 
2023/01/23 23:03:16 - mmengine - INFO - Epoch(train) [11][100/940] lr: 5.4958e-04 eta: 1 day, 22:58:30 time: 0.9359 data_time: 0.0302 memory: 28828 grad_norm: 1.4174 loss: 4.9803 loss_cls: 4.9803 2023/01/23 23:04:49 - mmengine - INFO - Epoch(train) [11][200/940] lr: 5.5520e-04 eta: 1 day, 22:56:25 time: 0.9316 data_time: 0.0292 memory: 28828 grad_norm: 1.3557 loss: 4.9361 loss_cls: 4.9361 2023/01/23 23:06:22 - mmengine - INFO - Epoch(train) [11][300/940] lr: 5.6081e-04 eta: 1 day, 22:54:20 time: 0.9306 data_time: 0.0287 memory: 28828 grad_norm: 1.4043 loss: 4.9350 loss_cls: 4.9350 2023/01/23 23:07:56 - mmengine - INFO - Epoch(train) [11][400/940] lr: 5.6643e-04 eta: 1 day, 22:52:17 time: 0.9328 data_time: 0.0289 memory: 28828 grad_norm: 1.2645 loss: 5.0838 loss_cls: 5.0838 2023/01/23 23:09:29 - mmengine - INFO - Epoch(train) [11][500/940] lr: 5.7205e-04 eta: 1 day, 22:50:15 time: 0.9351 data_time: 0.0281 memory: 28828 grad_norm: 1.4285 loss: 4.8318 loss_cls: 4.8318 2023/01/23 23:11:02 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 23:11:02 - mmengine - INFO - Epoch(train) [11][600/940] lr: 5.7767e-04 eta: 1 day, 22:48:11 time: 0.9333 data_time: 0.0315 memory: 28828 grad_norm: 1.2246 loss: 4.8473 loss_cls: 4.8473 2023/01/23 23:12:35 - mmengine - INFO - Epoch(train) [11][700/940] lr: 5.8328e-04 eta: 1 day, 22:46:07 time: 0.9316 data_time: 0.0299 memory: 28828 grad_norm: 1.2484 loss: 5.0327 loss_cls: 5.0327 2023/01/23 23:14:08 - mmengine - INFO - Epoch(train) [11][800/940] lr: 5.8890e-04 eta: 1 day, 22:44:03 time: 0.9309 data_time: 0.0274 memory: 28828 grad_norm: 1.3373 loss: 5.0154 loss_cls: 5.0154 2023/01/23 23:15:41 - mmengine - INFO - Epoch(train) [11][900/940] lr: 5.9452e-04 eta: 1 day, 22:42:00 time: 0.9339 data_time: 0.0308 memory: 28828 grad_norm: 1.2492 loss: 5.0587 loss_cls: 5.0587 2023/01/23 23:16:17 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 23:16:17 - mmengine - INFO - Epoch(train) [11][940/940] lr: 5.9676e-04 eta: 1 day, 22:40:52 time: 0.8662 data_time: 0.0170 memory: 28828 grad_norm: 1.3219 loss: 5.0691 loss_cls: 5.0691 2023/01/23 23:16:17 - mmengine - INFO - Saving checkpoint at 11 epochs 2023/01/23 23:16:33 - mmengine - INFO - Epoch(val) [11][78/78] acc/top1: 0.2477 acc/top5: 0.5057 acc/mean1: 0.2475 2023/01/23 23:16:33 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_10.pth is removed 2023/01/23 23:16:34 - mmengine - INFO - The best checkpoint with 0.2477 acc/top1 at 11 epoch is saved to best_acc/top1_epoch_11.pth. 
2023/01/23 23:18:15 - mmengine - INFO - Epoch(train) [12][100/940] lr: 6.0238e-04 eta: 1 day, 22:41:13 time: 0.9285 data_time: 0.0290 memory: 28828 grad_norm: 1.1782 loss: 4.9006 loss_cls: 4.9006 2023/01/23 23:19:48 - mmengine - INFO - Epoch(train) [12][200/940] lr: 6.0800e-04 eta: 1 day, 22:39:11 time: 0.9306 data_time: 0.0262 memory: 28828 grad_norm: 1.1796 loss: 5.0815 loss_cls: 5.0815 2023/01/23 23:21:21 - mmengine - INFO - Epoch(train) [12][300/940] lr: 6.1362e-04 eta: 1 day, 22:37:06 time: 0.9304 data_time: 0.0277 memory: 28828 grad_norm: 1.2016 loss: 4.9729 loss_cls: 4.9729 2023/01/23 23:22:54 - mmengine - INFO - Epoch(train) [12][400/940] lr: 6.1923e-04 eta: 1 day, 22:35:06 time: 0.9318 data_time: 0.0267 memory: 28828 grad_norm: 1.2008 loss: 4.9642 loss_cls: 4.9642 2023/01/23 23:24:27 - mmengine - INFO - Epoch(train) [12][500/940] lr: 6.2485e-04 eta: 1 day, 22:33:03 time: 0.9300 data_time: 0.0274 memory: 28828 grad_norm: 1.3103 loss: 4.8352 loss_cls: 4.8352 2023/01/23 23:26:00 - mmengine - INFO - Epoch(train) [12][600/940] lr: 6.3047e-04 eta: 1 day, 22:31:04 time: 0.9299 data_time: 0.0303 memory: 28828 grad_norm: 1.1690 loss: 4.9800 loss_cls: 4.9800 2023/01/23 23:26:56 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 23:27:33 - mmengine - INFO - Epoch(train) [12][700/940] lr: 6.3608e-04 eta: 1 day, 22:29:06 time: 0.9352 data_time: 0.0287 memory: 28828 grad_norm: 1.3086 loss: 4.8691 loss_cls: 4.8691 2023/01/23 23:29:07 - mmengine - INFO - Epoch(train) [12][800/940] lr: 6.4170e-04 eta: 1 day, 22:27:16 time: 0.9352 data_time: 0.0312 memory: 28828 grad_norm: 1.1414 loss: 4.7525 loss_cls: 4.7525 2023/01/23 23:30:40 - mmengine - INFO - Epoch(train) [12][900/940] lr: 6.4732e-04 eta: 1 day, 22:25:21 time: 0.9296 data_time: 0.0286 memory: 28828 grad_norm: 1.2084 loss: 4.9491 loss_cls: 4.9491 2023/01/23 23:31:16 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 23:31:16 - mmengine - INFO - Epoch(train) [12][940/940] lr: 6.4957e-04 eta: 1 day, 22:24:13 time: 0.8648 data_time: 0.0174 memory: 28828 grad_norm: 1.1595 loss: 4.8190 loss_cls: 4.8190 2023/01/23 23:31:16 - mmengine - INFO - Saving checkpoint at 12 epochs 2023/01/23 23:31:32 - mmengine - INFO - Epoch(val) [12][78/78] acc/top1: 0.2686 acc/top5: 0.5277 acc/mean1: 0.2685 2023/01/23 23:31:32 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_11.pth is removed 2023/01/23 23:31:33 - mmengine - INFO - The best checkpoint with 0.2686 acc/top1 at 12 epoch is saved to best_acc/top1_epoch_12.pth. 
2023/01/23 23:33:16 - mmengine - INFO - Epoch(train) [13][100/940] lr: 6.5518e-04 eta: 1 day, 22:24:42 time: 0.9286 data_time: 0.0318 memory: 28828 grad_norm: 1.1330 loss: 4.9008 loss_cls: 4.9008 2023/01/23 23:34:49 - mmengine - INFO - Epoch(train) [13][200/940] lr: 6.6080e-04 eta: 1 day, 22:22:46 time: 0.9304 data_time: 0.0283 memory: 28828 grad_norm: 1.2201 loss: 4.8305 loss_cls: 4.8305 2023/01/23 23:36:23 - mmengine - INFO - Epoch(train) [13][300/940] lr: 6.6642e-04 eta: 1 day, 22:21:05 time: 0.9300 data_time: 0.0313 memory: 28828 grad_norm: 1.2957 loss: 4.9106 loss_cls: 4.9106 2023/01/23 23:37:56 - mmengine - INFO - Epoch(train) [13][400/940] lr: 6.7204e-04 eta: 1 day, 22:19:09 time: 0.9318 data_time: 0.0290 memory: 28828 grad_norm: 1.1704 loss: 4.9708 loss_cls: 4.9708 2023/01/23 23:39:29 - mmengine - INFO - Epoch(train) [13][500/940] lr: 6.7765e-04 eta: 1 day, 22:17:12 time: 0.9303 data_time: 0.0273 memory: 28828 grad_norm: 1.0688 loss: 5.0039 loss_cls: 5.0039 2023/01/23 23:41:03 - mmengine - INFO - Epoch(train) [13][600/940] lr: 6.8327e-04 eta: 1 day, 22:15:33 time: 0.9945 data_time: 0.0280 memory: 28828 grad_norm: 1.1963 loss: 4.7074 loss_cls: 4.7074 2023/01/23 23:42:36 - mmengine - INFO - Epoch(train) [13][700/940] lr: 6.8889e-04 eta: 1 day, 22:13:36 time: 0.9314 data_time: 0.0333 memory: 28828 grad_norm: 1.1636 loss: 4.8704 loss_cls: 4.8704 2023/01/23 23:42:55 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 23:44:09 - mmengine - INFO - Epoch(train) [13][800/940] lr: 6.9450e-04 eta: 1 day, 22:11:39 time: 0.9325 data_time: 0.0272 memory: 28828 grad_norm: 1.1230 loss: 4.9814 loss_cls: 4.9814 2023/01/23 23:45:46 - mmengine - INFO - Epoch(train) [13][900/940] lr: 7.0012e-04 eta: 1 day, 22:10:38 time: 0.9302 data_time: 0.0301 memory: 28828 grad_norm: 1.0622 loss: 4.7280 loss_cls: 4.7280 2023/01/23 23:46:22 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 23:46:22 - mmengine - INFO - Epoch(train) [13][940/940] lr: 7.0237e-04 eta: 1 day, 22:09:32 time: 0.8660 data_time: 0.0182 memory: 28828 grad_norm: 1.1720 loss: 4.7721 loss_cls: 4.7721 2023/01/23 23:46:22 - mmengine - INFO - Saving checkpoint at 13 epochs 2023/01/23 23:46:38 - mmengine - INFO - Epoch(val) [13][78/78] acc/top1: 0.2908 acc/top5: 0.5599 acc/mean1: 0.2906 2023/01/23 23:46:38 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_12.pth is removed 2023/01/23 23:46:40 - mmengine - INFO - The best checkpoint with 0.2908 acc/top1 at 13 epoch is saved to best_acc/top1_epoch_13.pth. 
2023/01/23 23:48:21 - mmengine - INFO - Epoch(train) [14][100/940] lr: 7.0799e-04 eta: 1 day, 22:09:30 time: 0.9300 data_time: 0.0305 memory: 28828 grad_norm: 1.2016 loss: 4.6063 loss_cls: 4.6063 2023/01/23 23:49:53 - mmengine - INFO - Epoch(train) [14][200/940] lr: 7.1360e-04 eta: 1 day, 22:07:30 time: 0.9306 data_time: 0.0296 memory: 28828 grad_norm: 1.1372 loss: 4.7088 loss_cls: 4.7088 2023/01/23 23:51:26 - mmengine - INFO - Epoch(train) [14][300/940] lr: 7.1922e-04 eta: 1 day, 22:05:33 time: 0.9281 data_time: 0.0284 memory: 28828 grad_norm: 1.0061 loss: 5.0018 loss_cls: 5.0018 2023/01/23 23:53:00 - mmengine - INFO - Epoch(train) [14][400/940] lr: 7.2484e-04 eta: 1 day, 22:03:46 time: 0.9643 data_time: 0.0297 memory: 28828 grad_norm: 1.0822 loss: 4.8583 loss_cls: 4.8583 2023/01/23 23:54:33 - mmengine - INFO - Epoch(train) [14][500/940] lr: 7.3045e-04 eta: 1 day, 22:01:49 time: 0.9288 data_time: 0.0270 memory: 28828 grad_norm: 1.0830 loss: 4.9455 loss_cls: 4.9455 2023/01/23 23:56:06 - mmengine - INFO - Epoch(train) [14][600/940] lr: 7.3607e-04 eta: 1 day, 21:59:55 time: 0.9317 data_time: 0.0299 memory: 28828 grad_norm: 1.1355 loss: 4.7523 loss_cls: 4.7523 2023/01/23 23:57:39 - mmengine - INFO - Epoch(train) [14][700/940] lr: 7.4169e-04 eta: 1 day, 21:57:57 time: 0.9284 data_time: 0.0296 memory: 28828 grad_norm: 1.1253 loss: 4.8232 loss_cls: 4.8232 2023/01/23 23:58:53 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/23 23:59:14 - mmengine - INFO - Epoch(train) [14][800/940] lr: 7.4731e-04 eta: 1 day, 21:56:31 time: 1.0449 data_time: 0.0271 memory: 28828 grad_norm: 1.1384 loss: 4.7164 loss_cls: 4.7164 2023/01/24 00:00:47 - mmengine - INFO - Epoch(train) [14][900/940] lr: 7.5292e-04 eta: 1 day, 21:54:36 time: 0.9286 data_time: 0.0307 memory: 28828 grad_norm: 1.0087 loss: 5.0792 loss_cls: 5.0792 2023/01/24 00:01:23 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 00:01:23 - mmengine - INFO - Epoch(train) [14][940/940] lr: 7.5517e-04 eta: 1 day, 21:53:36 time: 0.8728 data_time: 0.0174 memory: 28828 grad_norm: 1.0725 loss: 4.6402 loss_cls: 4.6402 2023/01/24 00:01:23 - mmengine - INFO - Saving checkpoint at 14 epochs 2023/01/24 00:01:39 - mmengine - INFO - Epoch(val) [14][78/78] acc/top1: 0.3140 acc/top5: 0.5899 acc/mean1: 0.3138 2023/01/24 00:01:39 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_13.pth is removed 2023/01/24 00:01:40 - mmengine - INFO - The best checkpoint with 0.3140 acc/top1 at 14 epoch is saved to best_acc/top1_epoch_14.pth. 
2023/01/24 00:03:23 - mmengine - INFO - Epoch(train) [15][100/940] lr: 7.6079e-04 eta: 1 day, 21:53:50 time: 0.9285 data_time: 0.0297 memory: 28828 grad_norm: 1.0206 loss: 4.9119 loss_cls: 4.9119 2023/01/24 00:04:56 - mmengine - INFO - Epoch(train) [15][200/940] lr: 7.6640e-04 eta: 1 day, 21:51:55 time: 0.9274 data_time: 0.0255 memory: 28828 grad_norm: 1.1393 loss: 4.9088 loss_cls: 4.9088 2023/01/24 00:06:29 - mmengine - INFO - Epoch(train) [15][300/940] lr: 7.7202e-04 eta: 1 day, 21:50:00 time: 0.9330 data_time: 0.0300 memory: 28828 grad_norm: 1.0311 loss: 4.8972 loss_cls: 4.8972 2023/01/24 00:08:02 - mmengine - INFO - Epoch(train) [15][400/940] lr: 7.7764e-04 eta: 1 day, 21:48:03 time: 0.9272 data_time: 0.0265 memory: 28828 grad_norm: 1.0219 loss: 4.8921 loss_cls: 4.8921 2023/01/24 00:09:35 - mmengine - INFO - Epoch(train) [15][500/940] lr: 7.8326e-04 eta: 1 day, 21:46:09 time: 0.9309 data_time: 0.0269 memory: 28828 grad_norm: 1.0766 loss: 4.7190 loss_cls: 4.7190 2023/01/24 00:11:14 - mmengine - INFO - Epoch(train) [15][600/940] lr: 7.8887e-04 eta: 1 day, 21:45:27 time: 0.9257 data_time: 0.0291 memory: 28828 grad_norm: 1.0708 loss: 4.6670 loss_cls: 4.6670 2023/01/24 00:12:46 - mmengine - INFO - Epoch(train) [15][700/940] lr: 7.9449e-04 eta: 1 day, 21:43:32 time: 0.9297 data_time: 0.0345 memory: 28828 grad_norm: 1.0612 loss: 4.7552 loss_cls: 4.7552 2023/01/24 00:14:28 - mmengine - INFO - Epoch(train) [15][800/940] lr: 8.0011e-04 eta: 1 day, 21:43:29 time: 1.3856 data_time: 0.0280 memory: 28828 grad_norm: 1.0452 loss: 4.8451 loss_cls: 4.8451 2023/01/24 00:15:05 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 00:16:01 - mmengine - INFO - Epoch(train) [15][900/940] lr: 8.0573e-04 eta: 1 day, 21:41:30 time: 0.9248 data_time: 0.0304 memory: 28828 grad_norm: 1.0054 loss: 4.8555 loss_cls: 4.8555 2023/01/24 00:16:37 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 00:16:37 - mmengine - INFO - Epoch(train) [15][940/940] lr: 8.0797e-04 eta: 1 day, 21:40:28 time: 0.8667 data_time: 0.0191 memory: 28828 grad_norm: 1.1219 loss: 4.7702 loss_cls: 4.7702 2023/01/24 00:16:37 - mmengine - INFO - Saving checkpoint at 15 epochs 2023/01/24 00:16:53 - mmengine - INFO - Epoch(val) [15][78/78] acc/top1: 0.3295 acc/top5: 0.6027 acc/mean1: 0.3293 2023/01/24 00:16:53 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_14.pth is removed 2023/01/24 00:16:54 - mmengine - INFO - The best checkpoint with 0.3295 acc/top1 at 15 epoch is saved to best_acc/top1_epoch_15.pth. 
2023/01/24 00:18:36 - mmengine - INFO - Epoch(train) [16][100/940] lr: 8.1359e-04 eta: 1 day, 21:40:29 time: 0.9313 data_time: 0.0343 memory: 28828 grad_norm: 1.0688 loss: 4.8506 loss_cls: 4.8506 2023/01/24 00:20:09 - mmengine - INFO - Epoch(train) [16][200/940] lr: 8.1921e-04 eta: 1 day, 21:38:35 time: 0.9301 data_time: 0.0307 memory: 28828 grad_norm: 1.0310 loss: 4.5640 loss_cls: 4.5640 2023/01/24 00:21:44 - mmengine - INFO - Epoch(train) [16][300/940] lr: 8.2482e-04 eta: 1 day, 21:36:59 time: 0.9299 data_time: 0.0292 memory: 28828 grad_norm: 1.0165 loss: 5.0097 loss_cls: 5.0097 2023/01/24 00:23:23 - mmengine - INFO - Epoch(train) [16][400/940] lr: 8.3044e-04 eta: 1 day, 21:36:25 time: 0.9305 data_time: 0.0356 memory: 28828 grad_norm: 0.9920 loss: 4.8228 loss_cls: 4.8228 2023/01/24 00:25:02 - mmengine - INFO - Epoch(train) [16][500/940] lr: 8.3606e-04 eta: 1 day, 21:35:42 time: 0.9285 data_time: 0.0301 memory: 28828 grad_norm: 1.0716 loss: 4.6763 loss_cls: 4.6763 2023/01/24 00:26:36 - mmengine - INFO - Epoch(train) [16][600/940] lr: 8.4168e-04 eta: 1 day, 21:33:49 time: 0.9336 data_time: 0.0336 memory: 28828 grad_norm: 1.0203 loss: 4.7086 loss_cls: 4.7086 2023/01/24 00:28:10 - mmengine - INFO - Epoch(train) [16][700/940] lr: 8.4729e-04 eta: 1 day, 21:32:07 time: 0.9876 data_time: 0.0341 memory: 28828 grad_norm: 0.9113 loss: 4.7736 loss_cls: 4.7736 2023/01/24 00:29:42 - mmengine - INFO - Epoch(train) [16][800/940] lr: 8.5291e-04 eta: 1 day, 21:30:09 time: 0.9312 data_time: 0.0346 memory: 28828 grad_norm: 1.0775 loss: 4.6751 loss_cls: 4.6751 2023/01/24 00:31:20 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 00:31:20 - mmengine - INFO - Epoch(train) [16][900/940] lr: 8.5853e-04 eta: 1 day, 21:29:10 time: 0.9267 data_time: 0.0356 memory: 28828 grad_norm: 1.1020 loss: 4.5338 loss_cls: 4.5338 2023/01/24 00:31:56 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 00:31:56 - mmengine - INFO - Epoch(train) [16][940/940] lr: 8.6077e-04 eta: 1 day, 21:28:09 time: 0.8643 data_time: 0.0211 memory: 28828 grad_norm: 0.9339 loss: 4.7450 loss_cls: 4.7450 2023/01/24 00:31:56 - mmengine - INFO - Saving checkpoint at 16 epochs 2023/01/24 00:32:13 - mmengine - INFO - Epoch(val) [16][78/78] acc/top1: 0.3492 acc/top5: 0.6191 acc/mean1: 0.3491 2023/01/24 00:32:13 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_15.pth is removed 2023/01/24 00:32:15 - mmengine - INFO - The best checkpoint with 0.3492 acc/top1 at 16 epoch is saved to best_acc/top1_epoch_16.pth. 
2023/01/24 00:33:57 - mmengine - INFO - Epoch(train) [17][100/940] lr: 8.6639e-04 eta: 1 day, 21:27:55 time: 0.9254 data_time: 0.0265 memory: 28828 grad_norm: 0.9864 loss: 4.7153 loss_cls: 4.7153 2023/01/24 00:35:30 - mmengine - INFO - Epoch(train) [17][200/940] lr: 8.7201e-04 eta: 1 day, 21:25:59 time: 0.9241 data_time: 0.0321 memory: 28828 grad_norm: 0.9509 loss: 4.7229 loss_cls: 4.7229 2023/01/24 00:37:08 - mmengine - INFO - Epoch(train) [17][300/940] lr: 8.7763e-04 eta: 1 day, 21:25:03 time: 0.9282 data_time: 0.0325 memory: 28828 grad_norm: 0.9955 loss: 4.5762 loss_cls: 4.5762 2023/01/24 00:38:41 - mmengine - INFO - Epoch(train) [17][400/940] lr: 8.8324e-04 eta: 1 day, 21:23:18 time: 0.9279 data_time: 0.0323 memory: 28828 grad_norm: 0.9829 loss: 4.7811 loss_cls: 4.7811 2023/01/24 00:40:14 - mmengine - INFO - Epoch(train) [17][500/940] lr: 8.8886e-04 eta: 1 day, 21:21:22 time: 0.9315 data_time: 0.0323 memory: 28828 grad_norm: 0.9815 loss: 4.8325 loss_cls: 4.8325 2023/01/24 00:41:47 - mmengine - INFO - Epoch(train) [17][600/940] lr: 8.9448e-04 eta: 1 day, 21:19:25 time: 0.9262 data_time: 0.0336 memory: 28828 grad_norm: 0.9435 loss: 4.5899 loss_cls: 4.5899 2023/01/24 00:43:20 - mmengine - INFO - Epoch(train) [17][700/940] lr: 9.0009e-04 eta: 1 day, 21:17:29 time: 0.9298 data_time: 0.0342 memory: 28828 grad_norm: 0.9708 loss: 4.6495 loss_cls: 4.6495 2023/01/24 00:44:59 - mmengine - INFO - Epoch(train) [17][800/940] lr: 9.0571e-04 eta: 1 day, 21:16:43 time: 0.9241 data_time: 0.0321 memory: 28828 grad_norm: 0.9527 loss: 4.6409 loss_cls: 4.6409 2023/01/24 00:46:31 - mmengine - INFO - Epoch(train) [17][900/940] lr: 9.1133e-04 eta: 1 day, 21:14:47 time: 0.9247 data_time: 0.0328 memory: 28828 grad_norm: 0.9687 loss: 4.6188 loss_cls: 4.6188 2023/01/24 00:47:07 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 00:47:07 - mmengine - INFO - Epoch(train) [17][940/940] lr: 9.1358e-04 eta: 1 day, 21:13:47 time: 0.8648 data_time: 0.0192 memory: 28828 grad_norm: 0.9708 loss: 4.4676 loss_cls: 4.4676 2023/01/24 00:47:07 - mmengine - INFO - Saving checkpoint at 17 epochs 2023/01/24 00:47:23 - mmengine - INFO - Epoch(val) [17][78/78] acc/top1: 0.3742 acc/top5: 0.6444 acc/mean1: 0.3740 2023/01/24 00:47:23 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_16.pth is removed 2023/01/24 00:47:25 - mmengine - INFO - The best checkpoint with 0.3742 acc/top1 at 17 epoch is saved to best_acc/top1_epoch_17.pth. 
2023/01/24 00:47:51 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 00:49:06 - mmengine - INFO - Epoch(train) [18][100/940] lr: 9.1919e-04 eta: 1 day, 21:13:24 time: 0.9261 data_time: 0.0372 memory: 28828 grad_norm: 0.9050 loss: 4.8781 loss_cls: 4.8781 2023/01/24 00:50:45 - mmengine - INFO - Epoch(train) [18][200/940] lr: 9.2481e-04 eta: 1 day, 21:12:37 time: 0.9245 data_time: 0.0329 memory: 28828 grad_norm: 0.9279 loss: 4.7533 loss_cls: 4.7533 2023/01/24 00:52:18 - mmengine - INFO - Epoch(train) [18][300/940] lr: 9.3043e-04 eta: 1 day, 21:10:39 time: 0.9250 data_time: 0.0318 memory: 28828 grad_norm: 0.8957 loss: 4.7801 loss_cls: 4.7801 2023/01/24 00:53:51 - mmengine - INFO - Epoch(train) [18][400/940] lr: 9.3604e-04 eta: 1 day, 21:08:46 time: 0.9245 data_time: 0.0340 memory: 28828 grad_norm: 0.9191 loss: 4.6644 loss_cls: 4.6644 2023/01/24 00:55:28 - mmengine - INFO - Epoch(train) [18][500/940] lr: 9.4166e-04 eta: 1 day, 21:07:37 time: 1.1539 data_time: 0.0343 memory: 28828 grad_norm: 0.9676 loss: 4.6444 loss_cls: 4.6444 2023/01/24 00:57:01 - mmengine - INFO - Epoch(train) [18][600/940] lr: 9.4728e-04 eta: 1 day, 21:05:40 time: 0.9277 data_time: 0.0336 memory: 28828 grad_norm: 0.8806 loss: 4.6476 loss_cls: 4.6476 2023/01/24 00:58:33 - mmengine - INFO - Epoch(train) [18][700/940] lr: 9.5290e-04 eta: 1 day, 21:03:44 time: 0.9232 data_time: 0.0349 memory: 28828 grad_norm: 0.9244 loss: 4.8052 loss_cls: 4.8052 2023/01/24 01:00:06 - mmengine - INFO - Epoch(train) [18][800/940] lr: 9.5851e-04 eta: 1 day, 21:01:52 time: 0.9270 data_time: 0.0352 memory: 28828 grad_norm: 0.9081 loss: 4.6744 loss_cls: 4.6744 2023/01/24 01:03:25 - mmengine - INFO - Epoch(train) [18][900/940] lr: 9.6413e-04 eta: 1 day, 21:17:55 time: 0.9282 data_time: 0.0352 memory: 28828 grad_norm: 0.9179 loss: 4.6072 loss_cls: 4.6072 2023/01/24 01:04:01 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 01:04:01 - mmengine - INFO - Epoch(train) [18][940/940] lr: 9.6638e-04 eta: 1 day, 21:16:54 time: 0.8652 data_time: 0.0200 memory: 28828 grad_norm: 0.9831 loss: 4.6046 loss_cls: 4.6046 2023/01/24 01:04:01 - mmengine - INFO - Saving checkpoint at 18 epochs 2023/01/24 01:04:17 - mmengine - INFO - Epoch(val) [18][78/78] acc/top1: 0.3841 acc/top5: 0.6620 acc/mean1: 0.3840 2023/01/24 01:04:17 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_17.pth is removed 2023/01/24 01:04:18 - mmengine - INFO - The best checkpoint with 0.3841 acc/top1 at 18 epoch is saved to best_acc/top1_epoch_18.pth. 
2023/01/24 01:05:42 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 01:06:01 - mmengine - INFO - Epoch(train) [19][100/940] lr: 9.7199e-04 eta: 1 day, 21:16:30 time: 0.9325 data_time: 0.0328 memory: 28828 grad_norm: 0.9859 loss: 4.4246 loss_cls: 4.4246 2023/01/24 01:07:34 - mmengine - INFO - Epoch(train) [19][200/940] lr: 9.7761e-04 eta: 1 day, 21:14:37 time: 0.9285 data_time: 0.0314 memory: 28828 grad_norm: 0.8712 loss: 4.6927 loss_cls: 4.6927 2023/01/24 01:09:08 - mmengine - INFO - Epoch(train) [19][300/940] lr: 9.8323e-04 eta: 1 day, 21:12:49 time: 0.9311 data_time: 0.0331 memory: 28828 grad_norm: 0.9360 loss: 4.6357 loss_cls: 4.6357 2023/01/24 01:10:47 - mmengine - INFO - Epoch(train) [19][400/940] lr: 9.8885e-04 eta: 1 day, 21:11:47 time: 0.9250 data_time: 0.0315 memory: 28828 grad_norm: 0.9602 loss: 4.2975 loss_cls: 4.2975 2023/01/24 01:12:20 - mmengine - INFO - Epoch(train) [19][500/940] lr: 9.9446e-04 eta: 1 day, 21:09:46 time: 0.9238 data_time: 0.0285 memory: 28828 grad_norm: 0.8855 loss: 4.8057 loss_cls: 4.8057 2023/01/24 01:13:52 - mmengine - INFO - Epoch(train) [19][600/940] lr: 1.0001e-03 eta: 1 day, 21:07:44 time: 0.9249 data_time: 0.0321 memory: 28828 grad_norm: 0.8644 loss: 4.7407 loss_cls: 4.7407 2023/01/24 01:15:28 - mmengine - INFO - Epoch(train) [19][700/940] lr: 1.0057e-03 eta: 1 day, 21:06:11 time: 0.9255 data_time: 0.0301 memory: 28828 grad_norm: 0.8720 loss: 4.5476 loss_cls: 4.5476 2023/01/24 01:17:00 - mmengine - INFO - Epoch(train) [19][800/940] lr: 1.0113e-03 eta: 1 day, 21:04:09 time: 0.9248 data_time: 0.0266 memory: 28828 grad_norm: 0.8904 loss: 4.6074 loss_cls: 4.6074 2023/01/24 01:18:40 - mmengine - INFO - Epoch(train) [19][900/940] lr: 1.0169e-03 eta: 1 day, 21:03:17 time: 0.9240 data_time: 0.0276 memory: 28828 grad_norm: 0.8983 loss: 4.5961 loss_cls: 4.5961 2023/01/24 01:19:16 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 01:19:16 - mmengine - INFO - Epoch(train) [19][940/940] lr: 1.0192e-03 eta: 1 day, 21:02:16 time: 0.8635 data_time: 0.0188 memory: 28828 grad_norm: 0.9791 loss: 4.3914 loss_cls: 4.3914 2023/01/24 01:19:16 - mmengine - INFO - Saving checkpoint at 19 epochs 2023/01/24 01:19:39 - mmengine - INFO - Epoch(val) [19][78/78] acc/top1: 0.4019 acc/top5: 0.6759 acc/mean1: 0.4019 2023/01/24 01:19:39 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_18.pth is removed 2023/01/24 01:19:40 - mmengine - INFO - The best checkpoint with 0.4019 acc/top1 at 19 epoch is saved to best_acc/top1_epoch_19.pth. 
2023/01/24 01:21:22 - mmengine - INFO - Epoch(train) [20][100/940] lr: 1.0248e-03 eta: 1 day, 21:01:42 time: 0.9286 data_time: 0.0284 memory: 28828 grad_norm: 0.9166 loss: 4.4480 loss_cls: 4.4480 2023/01/24 01:22:01 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 01:22:57 - mmengine - INFO - Epoch(train) [20][200/940] lr: 1.0304e-03 eta: 1 day, 21:00:02 time: 0.9265 data_time: 0.0313 memory: 28828 grad_norm: 0.9139 loss: 4.4441 loss_cls: 4.4441 2023/01/24 01:24:30 - mmengine - INFO - Epoch(train) [20][300/940] lr: 1.0360e-03 eta: 1 day, 20:58:02 time: 0.9233 data_time: 0.0312 memory: 28828 grad_norm: 0.9068 loss: 4.4176 loss_cls: 4.4176 2023/01/24 01:26:07 - mmengine - INFO - Epoch(train) [20][400/940] lr: 1.0416e-03 eta: 1 day, 20:56:44 time: 1.1494 data_time: 0.0283 memory: 28828 grad_norm: 0.8410 loss: 4.4824 loss_cls: 4.4824 2023/01/24 01:27:40 - mmengine - INFO - Epoch(train) [20][500/940] lr: 1.0473e-03 eta: 1 day, 20:54:50 time: 0.9501 data_time: 0.0286 memory: 28828 grad_norm: 0.8597 loss: 4.6501 loss_cls: 4.6501 2023/01/24 01:29:13 - mmengine - INFO - Epoch(train) [20][600/940] lr: 1.0529e-03 eta: 1 day, 20:52:51 time: 0.9288 data_time: 0.0296 memory: 28828 grad_norm: 0.8830 loss: 4.5369 loss_cls: 4.5369 2023/01/24 01:30:46 - mmengine - INFO - Epoch(train) [20][700/940] lr: 1.0585e-03 eta: 1 day, 20:50:51 time: 0.9252 data_time: 0.0290 memory: 28828 grad_norm: 0.8588 loss: 4.6601 loss_cls: 4.6601 2023/01/24 01:32:18 - mmengine - INFO - Epoch(train) [20][800/940] lr: 1.0641e-03 eta: 1 day, 20:48:52 time: 0.9238 data_time: 0.0256 memory: 28828 grad_norm: 0.8313 loss: 4.3952 loss_cls: 4.3952 2023/01/24 01:33:57 - mmengine - INFO - Epoch(train) [20][900/940] lr: 1.0697e-03 eta: 1 day, 20:47:44 time: 0.9956 data_time: 0.0257 memory: 28828 grad_norm: 0.8790 loss: 4.6026 loss_cls: 4.6026 2023/01/24 01:34:33 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 01:34:33 - mmengine - INFO - Epoch(train) [20][940/940] lr: 1.0720e-03 eta: 1 day, 20:46:52 time: 0.8935 data_time: 0.0185 memory: 28828 grad_norm: 0.9148 loss: 4.6025 loss_cls: 4.6025 2023/01/24 01:34:33 - mmengine - INFO - Saving checkpoint at 20 epochs 2023/01/24 01:34:55 - mmengine - INFO - Epoch(val) [20][78/78] acc/top1: 0.4051 acc/top5: 0.6787 acc/mean1: 0.4048 2023/01/24 01:34:55 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_19.pth is removed 2023/01/24 01:34:57 - mmengine - INFO - The best checkpoint with 0.4051 acc/top1 at 20 epoch is saved to best_acc/top1_epoch_20.pth. 
2023/01/24 01:36:39 - mmengine - INFO - Epoch(train) [21][100/940] lr: 1.0776e-03 eta: 1 day, 20:46:13 time: 0.9305 data_time: 0.0282 memory: 28828 grad_norm: 0.8687 loss: 4.3832 loss_cls: 4.3832 2023/01/24 01:38:27 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 01:38:27 - mmengine - INFO - Epoch(train) [21][200/940] lr: 1.0832e-03 eta: 1 day, 20:46:33 time: 1.7015 data_time: 0.0230 memory: 28828 grad_norm: 0.8554 loss: 4.4798 loss_cls: 4.4798 2023/01/24 01:41:42 - mmengine - INFO - Epoch(train) [21][300/940] lr: 1.0888e-03 eta: 1 day, 20:59:36 time: 0.9315 data_time: 0.0258 memory: 28828 grad_norm: 0.8657 loss: 4.5371 loss_cls: 4.5371 2023/01/24 01:43:21 - mmengine - INFO - Epoch(train) [21][400/940] lr: 1.0945e-03 eta: 1 day, 20:58:30 time: 0.9279 data_time: 0.0250 memory: 28828 grad_norm: 0.8255 loss: 4.4418 loss_cls: 4.4418 2023/01/24 01:44:54 - mmengine - INFO - Epoch(train) [21][500/940] lr: 1.1001e-03 eta: 1 day, 20:56:24 time: 0.9225 data_time: 0.0257 memory: 28828 grad_norm: 0.8527 loss: 4.3906 loss_cls: 4.3906 2023/01/24 01:46:26 - mmengine - INFO - Epoch(train) [21][600/940] lr: 1.1057e-03 eta: 1 day, 20:54:19 time: 0.9238 data_time: 0.0262 memory: 28828 grad_norm: 0.8476 loss: 4.6947 loss_cls: 4.6947 2023/01/24 01:47:59 - mmengine - INFO - Epoch(train) [21][700/940] lr: 1.1113e-03 eta: 1 day, 20:52:15 time: 0.9260 data_time: 0.0248 memory: 28828 grad_norm: 0.8585 loss: 4.4842 loss_cls: 4.4842 2023/01/24 01:49:34 - mmengine - INFO - Epoch(train) [21][800/940] lr: 1.1169e-03 eta: 1 day, 20:50:30 time: 1.0370 data_time: 0.0219 memory: 28828 grad_norm: 0.7934 loss: 4.6157 loss_cls: 4.6157 2023/01/24 01:51:06 - mmengine - INFO - Epoch(train) [21][900/940] lr: 1.1225e-03 eta: 1 day, 20:48:26 time: 0.9284 data_time: 0.0247 memory: 28828 grad_norm: 0.8354 loss: 4.3775 loss_cls: 4.3775 2023/01/24 01:51:42 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 01:51:42 - mmengine - INFO - Epoch(train) [21][940/940] lr: 1.1248e-03 eta: 1 day, 20:47:27 time: 0.8659 data_time: 0.0157 memory: 28828 grad_norm: 0.8937 loss: 4.6485 loss_cls: 4.6485 2023/01/24 01:51:42 - mmengine - INFO - Saving checkpoint at 21 epochs 2023/01/24 01:51:58 - mmengine - INFO - Epoch(val) [21][78/78] acc/top1: 0.4217 acc/top5: 0.6982 acc/mean1: 0.4215 2023/01/24 01:51:58 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_20.pth is removed 2023/01/24 01:51:59 - mmengine - INFO - The best checkpoint with 0.4217 acc/top1 at 21 epoch is saved to best_acc/top1_epoch_21.pth. 
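The eta field drifts upward whenever a slow window goes through: near the start of epoch 21 above, one logged window averages 1.7015 s per iteration and the wall clock between the 200- and 300-iteration marks jumps by more than three minutes, which pushes the estimate from about 1 day 20:46 to 1 day 20:59. A rough back-of-the-envelope model of that field, not mmengine's actual bookkeeping, is the remaining iteration count times the average seconds per iteration observed so far:

from datetime import timedelta

def estimate_eta(avg_iter_time, cur_epoch, cur_iter,
                 iters_per_epoch=940, max_epochs=200):
    """Rough ETA: remaining iterations times average seconds per iteration.

    940 iterations per epoch is read off the log; the 200-epoch total is
    inferred from the work-dir name ("..._200e") and the eta arithmetic.
    mmengine's real implementation may differ in the details.
    """
    done = (cur_epoch - 1) * iters_per_epoch + cur_iter
    remaining = max_epochs * iters_per_epoch - done
    return timedelta(seconds=int(remaining * avg_iter_time))

# With ~0.96 s per iteration on average (the per-window times hover around
# 0.93 s, but occasional slow windows pull the average up), the ~168k iterations
# left after epoch 21 give roughly "1 day, 20:52", which is close to the
# "1 day, 20:47:27" logged at the end of epoch 21.
print(estimate_eta(0.96, cur_epoch=21, cur_iter=940))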
2023/01/24 01:53:41 - mmengine - INFO - Epoch(train) [22][100/940] lr: 1.1304e-03 eta: 1 day, 20:46:40 time: 0.9285 data_time: 0.0256 memory: 28828 grad_norm: 0.8383 loss: 4.3674 loss_cls: 4.3674 2023/01/24 01:55:14 - mmengine - INFO - Epoch(train) [22][200/940] lr: 1.1360e-03 eta: 1 day, 20:44:37 time: 0.9284 data_time: 0.0265 memory: 28828 grad_norm: 0.8648 loss: 4.3886 loss_cls: 4.3886 2023/01/24 01:56:09 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 01:56:46 - mmengine - INFO - Epoch(train) [22][300/940] lr: 1.1416e-03 eta: 1 day, 20:42:34 time: 0.9261 data_time: 0.0239 memory: 28828 grad_norm: 0.8302 loss: 4.3218 loss_cls: 4.3218 2023/01/24 01:58:19 - mmengine - INFO - Epoch(train) [22][400/940] lr: 1.1473e-03 eta: 1 day, 20:40:31 time: 0.9260 data_time: 0.0254 memory: 28828 grad_norm: 0.8990 loss: 4.4090 loss_cls: 4.4090 2023/01/24 01:59:51 - mmengine - INFO - Epoch(train) [22][500/940] lr: 1.1529e-03 eta: 1 day, 20:38:29 time: 0.9269 data_time: 0.0242 memory: 28828 grad_norm: 0.8554 loss: 4.6064 loss_cls: 4.6064 2023/01/24 02:01:29 - mmengine - INFO - Epoch(train) [22][600/940] lr: 1.1585e-03 eta: 1 day, 20:37:03 time: 0.9236 data_time: 0.0254 memory: 28828 grad_norm: 0.8235 loss: 4.3295 loss_cls: 4.3295 2023/01/24 02:03:01 - mmengine - INFO - Epoch(train) [22][700/940] lr: 1.1641e-03 eta: 1 day, 20:35:01 time: 0.9268 data_time: 0.0226 memory: 28828 grad_norm: 0.7843 loss: 4.7217 loss_cls: 4.7217 2023/01/24 02:04:34 - mmengine - INFO - Epoch(train) [22][800/940] lr: 1.1697e-03 eta: 1 day, 20:33:03 time: 0.9247 data_time: 0.0237 memory: 28828 grad_norm: 0.7797 loss: 4.4941 loss_cls: 4.4941 2023/01/24 02:06:07 - mmengine - INFO - Epoch(train) [22][900/940] lr: 1.1753e-03 eta: 1 day, 20:31:03 time: 0.9239 data_time: 0.0250 memory: 28828 grad_norm: 0.8557 loss: 4.4651 loss_cls: 4.4651 2023/01/24 02:06:43 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 02:06:43 - mmengine - INFO - Epoch(train) [22][940/940] lr: 1.1776e-03 eta: 1 day, 20:30:05 time: 0.8631 data_time: 0.0153 memory: 28828 grad_norm: 0.8423 loss: 4.5516 loss_cls: 4.5516 2023/01/24 02:06:43 - mmengine - INFO - Saving checkpoint at 22 epochs 2023/01/24 02:06:59 - mmengine - INFO - Epoch(val) [22][78/78] acc/top1: 0.4318 acc/top5: 0.7060 acc/mean1: 0.4316 2023/01/24 02:06:59 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_21.pth is removed 2023/01/24 02:07:00 - mmengine - INFO - The best checkpoint with 0.4318 acc/top1 at 22 epoch is saved to best_acc/top1_epoch_22.pth. 
2023/01/24 02:08:48 - mmengine - INFO - Epoch(train) [23][100/940] lr: 1.1832e-03 eta: 1 day, 20:30:05 time: 0.9269 data_time: 0.0244 memory: 28828 grad_norm: 0.8330 loss: 4.4414 loss_cls: 4.4414 2023/01/24 02:10:24 - mmengine - INFO - Epoch(train) [23][200/940] lr: 1.1888e-03 eta: 1 day, 20:28:31 time: 0.9247 data_time: 0.0240 memory: 28828 grad_norm: 0.8320 loss: 4.4027 loss_cls: 4.4027 2023/01/24 02:11:56 - mmengine - INFO - Epoch(train) [23][300/940] lr: 1.1944e-03 eta: 1 day, 20:26:29 time: 0.9257 data_time: 0.0241 memory: 28828 grad_norm: 0.8274 loss: 4.5420 loss_cls: 4.5420 2023/01/24 02:12:15 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 02:13:29 - mmengine - INFO - Epoch(train) [23][400/940] lr: 1.2001e-03 eta: 1 day, 20:24:28 time: 0.9251 data_time: 0.0257 memory: 28828 grad_norm: 0.7723 loss: 4.4341 loss_cls: 4.4341 2023/01/24 02:15:02 - mmengine - INFO - Epoch(train) [23][500/940] lr: 1.2057e-03 eta: 1 day, 20:22:28 time: 0.9252 data_time: 0.0286 memory: 28828 grad_norm: 0.8315 loss: 4.1907 loss_cls: 4.1907 2023/01/24 02:16:34 - mmengine - INFO - Epoch(train) [23][600/940] lr: 1.2113e-03 eta: 1 day, 20:20:27 time: 0.9256 data_time: 0.0274 memory: 28828 grad_norm: 0.7910 loss: 4.3142 loss_cls: 4.3142 2023/01/24 02:18:07 - mmengine - INFO - Epoch(train) [23][700/940] lr: 1.2169e-03 eta: 1 day, 20:18:26 time: 0.9253 data_time: 0.0279 memory: 28828 grad_norm: 0.7646 loss: 4.6749 loss_cls: 4.6749 2023/01/24 02:19:39 - mmengine - INFO - Epoch(train) [23][800/940] lr: 1.2225e-03 eta: 1 day, 20:16:25 time: 0.9246 data_time: 0.0224 memory: 28828 grad_norm: 0.7980 loss: 4.5752 loss_cls: 4.5752 2023/01/24 02:21:18 - mmengine - INFO - Epoch(train) [23][900/940] lr: 1.2281e-03 eta: 1 day, 20:15:16 time: 0.9230 data_time: 0.0258 memory: 28828 grad_norm: 0.7521 loss: 4.4441 loss_cls: 4.4441 2023/01/24 02:21:54 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 02:21:54 - mmengine - INFO - Epoch(train) [23][940/940] lr: 1.2304e-03 eta: 1 day, 20:14:19 time: 0.8655 data_time: 0.0166 memory: 28828 grad_norm: 0.8281 loss: 4.5039 loss_cls: 4.5039 2023/01/24 02:21:54 - mmengine - INFO - Saving checkpoint at 23 epochs 2023/01/24 02:22:10 - mmengine - INFO - Epoch(val) [23][78/78] acc/top1: 0.4404 acc/top5: 0.7078 acc/mean1: 0.4403 2023/01/24 02:22:10 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_22.pth is removed 2023/01/24 02:22:12 - mmengine - INFO - The best checkpoint with 0.4404 acc/top1 at 23 epoch is saved to best_acc/top1_epoch_23.pth. 
2023/01/24 02:23:54 - mmengine - INFO - Epoch(train) [24][100/940] lr: 1.2360e-03 eta: 1 day, 20:13:32 time: 0.9250 data_time: 0.0252 memory: 28828 grad_norm: 0.8207 loss: 4.4296 loss_cls: 4.4296 2023/01/24 02:25:32 - mmengine - INFO - Epoch(train) [24][200/940] lr: 1.2416e-03 eta: 1 day, 20:12:14 time: 0.9275 data_time: 0.0248 memory: 28828 grad_norm: 0.8089 loss: 4.4056 loss_cls: 4.4056 2023/01/24 02:27:04 - mmengine - INFO - Epoch(train) [24][300/940] lr: 1.2472e-03 eta: 1 day, 20:10:13 time: 0.9291 data_time: 0.0251 memory: 28828 grad_norm: 0.8231 loss: 4.2740 loss_cls: 4.2740 2023/01/24 02:28:18 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 02:28:37 - mmengine - INFO - Epoch(train) [24][400/940] lr: 1.2529e-03 eta: 1 day, 20:08:15 time: 0.9349 data_time: 0.0239 memory: 28828 grad_norm: 0.7879 loss: 4.5401 loss_cls: 4.5401 2023/01/24 02:30:10 - mmengine - INFO - Epoch(train) [24][500/940] lr: 1.2585e-03 eta: 1 day, 20:06:20 time: 0.9547 data_time: 0.0246 memory: 28828 grad_norm: 0.8441 loss: 4.4000 loss_cls: 4.4000 2023/01/24 02:31:44 - mmengine - INFO - Epoch(train) [24][600/940] lr: 1.2641e-03 eta: 1 day, 20:04:33 time: 0.9253 data_time: 0.0272 memory: 28828 grad_norm: 0.7633 loss: 4.4898 loss_cls: 4.4898 2023/01/24 02:33:26 - mmengine - INFO - Epoch(train) [24][700/940] lr: 1.2697e-03 eta: 1 day, 20:03:38 time: 0.9259 data_time: 0.0228 memory: 28828 grad_norm: 0.7964 loss: 4.4703 loss_cls: 4.4703 2023/01/24 02:35:02 - mmengine - INFO - Epoch(train) [24][800/940] lr: 1.2753e-03 eta: 1 day, 20:02:10 time: 0.9275 data_time: 0.0266 memory: 28828 grad_norm: 0.7703 loss: 4.5087 loss_cls: 4.5087 2023/01/24 02:36:41 - mmengine - INFO - Epoch(train) [24][900/940] lr: 1.2809e-03 eta: 1 day, 20:00:55 time: 1.2261 data_time: 0.0262 memory: 28828 grad_norm: 0.7835 loss: 4.5448 loss_cls: 4.5448 2023/01/24 02:37:17 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 02:37:17 - mmengine - INFO - Epoch(train) [24][940/940] lr: 1.2832e-03 eta: 1 day, 19:59:58 time: 0.8638 data_time: 0.0160 memory: 28828 grad_norm: 0.8131 loss: 4.4879 loss_cls: 4.4879 2023/01/24 02:37:17 - mmengine - INFO - Saving checkpoint at 24 epochs 2023/01/24 02:37:33 - mmengine - INFO - Epoch(val) [24][78/78] acc/top1: 0.4568 acc/top5: 0.7222 acc/mean1: 0.4567 2023/01/24 02:37:33 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_23.pth is removed 2023/01/24 02:37:34 - mmengine - INFO - The best checkpoint with 0.4568 acc/top1 at 24 epoch is saved to best_acc/top1_epoch_24.pth. 
2023/01/24 02:39:17 - mmengine - INFO - Epoch(train) [25][100/940] lr: 1.2888e-03 eta: 1 day, 19:59:16 time: 1.0003 data_time: 0.0238 memory: 28828 grad_norm: 0.7866 loss: 4.3654 loss_cls: 4.3654 2023/01/24 02:40:50 - mmengine - INFO - Epoch(train) [25][200/940] lr: 1.2944e-03 eta: 1 day, 19:57:17 time: 0.9256 data_time: 0.0224 memory: 28828 grad_norm: 0.7900 loss: 4.4428 loss_cls: 4.4428 2023/01/24 02:42:30 - mmengine - INFO - Epoch(train) [25][300/940] lr: 1.3000e-03 eta: 1 day, 19:56:14 time: 0.9754 data_time: 0.0278 memory: 28828 grad_norm: 0.7169 loss: 4.5906 loss_cls: 4.5906 2023/01/24 02:44:04 - mmengine - INFO - Epoch(train) [25][400/940] lr: 1.3057e-03 eta: 1 day, 19:54:23 time: 0.9224 data_time: 0.0263 memory: 28828 grad_norm: 0.7706 loss: 4.2960 loss_cls: 4.2960 2023/01/24 02:44:41 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 02:45:37 - mmengine - INFO - Epoch(train) [25][500/940] lr: 1.3113e-03 eta: 1 day, 19:52:25 time: 0.9249 data_time: 0.0279 memory: 28828 grad_norm: 0.7551 loss: 4.3486 loss_cls: 4.3486 2023/01/24 02:47:17 - mmengine - INFO - Epoch(train) [25][600/940] lr: 1.3169e-03 eta: 1 day, 19:51:21 time: 0.9266 data_time: 0.0293 memory: 28828 grad_norm: 0.7563 loss: 4.3062 loss_cls: 4.3062 2023/01/24 02:48:49 - mmengine - INFO - Epoch(train) [25][700/940] lr: 1.3225e-03 eta: 1 day, 19:49:23 time: 0.9250 data_time: 0.0288 memory: 28828 grad_norm: 0.7336 loss: 4.6156 loss_cls: 4.6156 2023/01/24 02:50:23 - mmengine - INFO - Epoch(train) [25][800/940] lr: 1.3281e-03 eta: 1 day, 19:47:31 time: 0.9635 data_time: 0.0250 memory: 28828 grad_norm: 0.7201 loss: 4.3750 loss_cls: 4.3750 2023/01/24 02:51:56 - mmengine - INFO - Epoch(train) [25][900/940] lr: 1.3337e-03 eta: 1 day, 19:45:33 time: 0.9279 data_time: 0.0281 memory: 28828 grad_norm: 0.7418 loss: 4.4185 loss_cls: 4.4185 2023/01/24 02:52:31 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 02:52:31 - mmengine - INFO - Epoch(train) [25][940/940] lr: 1.3360e-03 eta: 1 day, 19:44:37 time: 0.8641 data_time: 0.0159 memory: 28828 grad_norm: 0.7963 loss: 4.2134 loss_cls: 4.2134 2023/01/24 02:52:31 - mmengine - INFO - Saving checkpoint at 25 epochs 2023/01/24 02:52:54 - mmengine - INFO - Epoch(val) [25][78/78] acc/top1: 0.4393 acc/top5: 0.7117 acc/mean1: 0.4391 2023/01/24 02:54:38 - mmengine - INFO - Epoch(train) [26][100/940] lr: 1.3416e-03 eta: 1 day, 19:43:57 time: 0.9276 data_time: 0.0298 memory: 28828 grad_norm: 0.7198 loss: 4.6535 loss_cls: 4.6535 2023/01/24 02:56:11 - mmengine - INFO - Epoch(train) [26][200/940] lr: 1.3472e-03 eta: 1 day, 19:42:00 time: 0.9262 data_time: 0.0267 memory: 28828 grad_norm: 0.7715 loss: 4.3408 loss_cls: 4.3408 2023/01/24 02:57:43 - mmengine - INFO - Epoch(train) [26][300/940] lr: 1.3528e-03 eta: 1 day, 19:40:03 time: 0.9291 data_time: 0.0245 memory: 28828 grad_norm: 0.7540 loss: 4.4371 loss_cls: 4.4371 2023/01/24 02:59:16 - mmengine - INFO - Epoch(train) [26][400/940] lr: 1.3585e-03 eta: 1 day, 19:38:06 time: 0.9248 data_time: 0.0271 memory: 28828 grad_norm: 0.7446 loss: 4.2601 loss_cls: 4.2601 2023/01/24 03:00:58 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 03:00:58 - mmengine - INFO - Epoch(train) [26][500/940] lr: 1.3641e-03 eta: 1 day, 19:37:11 time: 1.3746 data_time: 0.0255 memory: 28828 grad_norm: 0.7552 loss: 4.4387 loss_cls: 4.4387 2023/01/24 03:02:30 - mmengine - INFO - Epoch(train) [26][600/940] lr: 
1.3697e-03 eta: 1 day, 19:35:14 time: 0.9262 data_time: 0.0298 memory: 28828 grad_norm: 0.7215 loss: 4.4233 loss_cls: 4.4233 2023/01/24 03:04:03 - mmengine - INFO - Epoch(train) [26][700/940] lr: 1.3753e-03 eta: 1 day, 19:33:17 time: 0.9267 data_time: 0.0245 memory: 28828 grad_norm: 0.7916 loss: 4.2592 loss_cls: 4.2592 2023/01/24 03:05:40 - mmengine - INFO - Epoch(train) [26][800/940] lr: 1.3809e-03 eta: 1 day, 19:31:53 time: 0.9252 data_time: 0.0249 memory: 28828 grad_norm: 0.7452 loss: 4.2372 loss_cls: 4.2372 2023/01/24 03:07:13 - mmengine - INFO - Epoch(train) [26][900/940] lr: 1.3865e-03 eta: 1 day, 19:29:55 time: 0.9258 data_time: 0.0250 memory: 28828 grad_norm: 0.7349 loss: 4.3835 loss_cls: 4.3835 2023/01/24 03:07:49 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 03:07:49 - mmengine - INFO - Epoch(train) [26][940/940] lr: 1.3888e-03 eta: 1 day, 19:29:03 time: 0.8807 data_time: 0.0161 memory: 28828 grad_norm: 0.7261 loss: 4.5866 loss_cls: 4.5866 2023/01/24 03:07:49 - mmengine - INFO - Saving checkpoint at 26 epochs 2023/01/24 03:08:08 - mmengine - INFO - Epoch(val) [26][78/78] acc/top1: 0.4541 acc/top5: 0.7271 acc/mean1: 0.4540 2023/01/24 03:09:52 - mmengine - INFO - Epoch(train) [27][100/940] lr: 1.3944e-03 eta: 1 day, 19:28:24 time: 0.9261 data_time: 0.0283 memory: 28828 grad_norm: 0.7393 loss: 4.3661 loss_cls: 4.3661 2023/01/24 03:11:24 - mmengine - INFO - Epoch(train) [27][200/940] lr: 1.4000e-03 eta: 1 day, 19:26:27 time: 0.9238 data_time: 0.0239 memory: 28828 grad_norm: 0.7557 loss: 4.2948 loss_cls: 4.2948 2023/01/24 03:13:08 - mmengine - INFO - Epoch(train) [27][300/940] lr: 1.4056e-03 eta: 1 day, 19:25:43 time: 1.4703 data_time: 0.0233 memory: 28828 grad_norm: 0.7590 loss: 4.2806 loss_cls: 4.2806 2023/01/24 03:14:42 - mmengine - INFO - Epoch(train) [27][400/940] lr: 1.4113e-03 eta: 1 day, 19:23:56 time: 0.9207 data_time: 0.0256 memory: 28828 grad_norm: 0.7396 loss: 4.4807 loss_cls: 4.4807 2023/01/24 03:16:14 - mmengine - INFO - Epoch(train) [27][500/940] lr: 1.4169e-03 eta: 1 day, 19:21:59 time: 0.9264 data_time: 0.0259 memory: 28828 grad_norm: 0.7580 loss: 4.2297 loss_cls: 4.2297 2023/01/24 03:17:10 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 03:17:48 - mmengine - INFO - Epoch(train) [27][600/940] lr: 1.4225e-03 eta: 1 day, 19:20:08 time: 0.9781 data_time: 0.0210 memory: 28828 grad_norm: 0.7107 loss: 4.3693 loss_cls: 4.3693 2023/01/24 03:19:21 - mmengine - INFO - Epoch(train) [27][700/940] lr: 1.4281e-03 eta: 1 day, 19:18:13 time: 0.9299 data_time: 0.0281 memory: 28828 grad_norm: 0.7272 loss: 4.3753 loss_cls: 4.3753 2023/01/24 03:20:57 - mmengine - INFO - Epoch(train) [27][800/940] lr: 1.4337e-03 eta: 1 day, 19:16:43 time: 0.9302 data_time: 0.0257 memory: 28828 grad_norm: 0.7111 loss: 4.4585 loss_cls: 4.4585 2023/01/24 03:22:30 - mmengine - INFO - Epoch(train) [27][900/940] lr: 1.4393e-03 eta: 1 day, 19:14:49 time: 0.9253 data_time: 0.0239 memory: 28828 grad_norm: 0.6889 loss: 4.5806 loss_cls: 4.5806 2023/01/24 03:23:06 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 03:23:06 - mmengine - INFO - Epoch(train) [27][940/940] lr: 1.4416e-03 eta: 1 day, 19:13:55 time: 0.8627 data_time: 0.0175 memory: 28828 grad_norm: 0.7707 loss: 4.4329 loss_cls: 4.4329 2023/01/24 03:23:06 - mmengine - INFO - Saving checkpoint at 27 epochs 2023/01/24 03:23:29 - mmengine - INFO - Epoch(val) [27][78/78] acc/top1: 0.4675 
acc/top5: 0.7370 acc/mean1: 0.4674 2023/01/24 03:23:29 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_24.pth is removed 2023/01/24 03:23:31 - mmengine - INFO - The best checkpoint with 0.4675 acc/top1 at 27 epoch is saved to best_acc/top1_epoch_27.pth. 2023/01/24 03:25:13 - mmengine - INFO - Epoch(train) [28][100/940] lr: 1.4472e-03 eta: 1 day, 19:13:01 time: 0.9298 data_time: 0.0265 memory: 28828 grad_norm: 0.7443 loss: 4.2293 loss_cls: 4.2293 2023/01/24 03:26:48 - mmengine - INFO - Epoch(train) [28][200/940] lr: 1.4528e-03 eta: 1 day, 19:11:18 time: 1.0195 data_time: 0.0238 memory: 28828 grad_norm: 0.7109 loss: 4.4303 loss_cls: 4.4303 2023/01/24 03:28:20 - mmengine - INFO - Epoch(train) [28][300/940] lr: 1.4584e-03 eta: 1 day, 19:09:24 time: 0.9277 data_time: 0.0224 memory: 28828 grad_norm: 0.7142 loss: 4.2921 loss_cls: 4.2921 2023/01/24 03:29:53 - mmengine - INFO - Epoch(train) [28][400/940] lr: 1.4641e-03 eta: 1 day, 19:07:30 time: 0.9303 data_time: 0.0263 memory: 28828 grad_norm: 0.7332 loss: 4.2201 loss_cls: 4.2201 2023/01/24 03:31:26 - mmengine - INFO - Epoch(train) [28][500/940] lr: 1.4697e-03 eta: 1 day, 19:05:36 time: 0.9276 data_time: 0.0205 memory: 28828 grad_norm: 0.7357 loss: 4.3895 loss_cls: 4.3895 2023/01/24 03:32:59 - mmengine - INFO - Epoch(train) [28][600/940] lr: 1.4753e-03 eta: 1 day, 19:03:43 time: 0.9328 data_time: 0.0270 memory: 28828 grad_norm: 0.7177 loss: 4.2164 loss_cls: 4.2164 2023/01/24 03:33:17 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 03:34:32 - mmengine - INFO - Epoch(train) [28][700/940] lr: 1.4809e-03 eta: 1 day, 19:01:49 time: 0.9280 data_time: 0.0201 memory: 28828 grad_norm: 0.6865 loss: 4.5234 loss_cls: 4.5234 2023/01/24 03:36:04 - mmengine - INFO - Epoch(train) [28][800/940] lr: 1.4865e-03 eta: 1 day, 18:59:54 time: 0.9229 data_time: 0.0225 memory: 28828 grad_norm: 0.7363 loss: 4.1638 loss_cls: 4.1638 2023/01/24 03:37:37 - mmengine - INFO - Epoch(train) [28][900/940] lr: 1.4921e-03 eta: 1 day, 18:58:00 time: 0.9271 data_time: 0.0208 memory: 28828 grad_norm: 0.7162 loss: 4.4930 loss_cls: 4.4930 2023/01/24 03:38:13 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 03:38:13 - mmengine - INFO - Epoch(train) [28][940/940] lr: 1.4944e-03 eta: 1 day, 18:57:07 time: 0.8686 data_time: 0.0148 memory: 28828 grad_norm: 0.7300 loss: 4.4214 loss_cls: 4.4214 2023/01/24 03:38:13 - mmengine - INFO - Saving checkpoint at 28 epochs 2023/01/24 03:38:28 - mmengine - INFO - Epoch(val) [28][78/78] acc/top1: 0.4759 acc/top5: 0.7442 acc/mean1: 0.4759 2023/01/24 03:38:28 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_27.pth is removed 2023/01/24 03:38:30 - mmengine - INFO - The best checkpoint with 0.4759 acc/top1 at 28 epoch is saved to best_acc/top1_epoch_28.pth. 
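The checkpoint bookkeeping in these blocks (a checkpoint saved every epoch, the previous best_acc/top1_epoch_*.pth removed whenever top-1 improves, and no "best checkpoint" message at epochs 25 and 26, where top-1 stayed at 0.4393 and 0.4541, below the 0.4568 reached at epoch 24) is the standard behaviour of mmengine's CheckpointHook when best-checkpoint tracking is enabled. The exact hook arguments for this run are not shown in this part of the log; the configuration below is only an illustrative guess that would produce the same messages.

# Illustrative mmengine-style hook config (assumed, not copied from this run):
checkpoint = dict(
    type='CheckpointHook',
    interval=1,            # "Saving checkpoint at N epochs" appears after every epoch
    save_best='acc/top1',  # track validation top-1 accuracy
    rule='greater')        # replace the best checkpoint only when top-1 improves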
2023/01/24 03:40:13 - mmengine - INFO - Epoch(train) [29][100/940] lr: 1.5000e-03 eta: 1 day, 18:56:16 time: 0.9311 data_time: 0.0224 memory: 28828 grad_norm: 0.7083 loss: 4.5764 loss_cls: 4.5764 2023/01/24 03:41:45 - mmengine - INFO - Epoch(train) [29][200/940] lr: 1.5056e-03 eta: 1 day, 18:54:23 time: 0.9274 data_time: 0.0214 memory: 28828 grad_norm: 0.7091 loss: 4.2728 loss_cls: 4.2728 2023/01/24 03:43:19 - mmengine - INFO - Epoch(train) [29][300/940] lr: 1.5112e-03 eta: 1 day, 18:52:35 time: 0.9282 data_time: 0.0223 memory: 28828 grad_norm: 0.6864 loss: 4.6463 loss_cls: 4.6463 2023/01/24 03:44:52 - mmengine - INFO - Epoch(train) [29][400/940] lr: 1.5169e-03 eta: 1 day, 18:50:42 time: 0.9262 data_time: 0.0197 memory: 28828 grad_norm: 0.7620 loss: 4.0747 loss_cls: 4.0747 2023/01/24 03:46:24 - mmengine - INFO - Epoch(train) [29][500/940] lr: 1.5225e-03 eta: 1 day, 18:48:48 time: 0.9296 data_time: 0.0235 memory: 28828 grad_norm: 0.6841 loss: 4.1710 loss_cls: 4.1710 2023/01/24 03:47:58 - mmengine - INFO - Epoch(train) [29][600/940] lr: 1.5281e-03 eta: 1 day, 18:47:01 time: 0.9296 data_time: 0.0196 memory: 28828 grad_norm: 0.7027 loss: 4.1241 loss_cls: 4.1241 2023/01/24 03:49:12 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 03:49:31 - mmengine - INFO - Epoch(train) [29][700/940] lr: 1.5337e-03 eta: 1 day, 18:45:07 time: 0.9253 data_time: 0.0208 memory: 28828 grad_norm: 0.6936 loss: 4.2004 loss_cls: 4.2004 2023/01/24 03:51:04 - mmengine - INFO - Epoch(train) [29][800/940] lr: 1.5393e-03 eta: 1 day, 18:43:15 time: 0.9266 data_time: 0.0223 memory: 28828 grad_norm: 0.6679 loss: 4.3447 loss_cls: 4.3447 2023/01/24 03:52:36 - mmengine - INFO - Epoch(train) [29][900/940] lr: 1.5450e-03 eta: 1 day, 18:41:23 time: 0.9268 data_time: 0.0235 memory: 28828 grad_norm: 0.6969 loss: 4.4125 loss_cls: 4.4125 2023/01/24 03:53:12 - mmengine - INFO - Exp name: mvit-small-p244_16x4x1_kinetics400-rgb_sampleonce_20230123_202914 2023/01/24 03:53:12 - mmengine - INFO - Epoch(train) [29][940/940] lr: 1.5472e-03 eta: 1 day, 18:40:31 time: 0.8680 data_time: 0.0161 memory: 28828 grad_norm: 0.7515 loss: 4.3610 loss_cls: 4.3610 2023/01/24 03:53:12 - mmengine - INFO - Saving checkpoint at 29 epochs 2023/01/24 03:53:28 - mmengine - INFO - Epoch(val) [29][78/78] acc/top1: 0.4830 acc/top5: 0.7528 acc/mean1: 0.4829 2023/01/24 03:53:28 - mmengine - INFO - The previous best checkpoint /mnt/petrelfs/lilin/Repos/mmact_dev/mmaction2/work_dirs/mvit_train/mvit-s_from_scatch_sample_once_200e/best_acc/top1_epoch_28.pth is removed 2023/01/24 03:53:30 - mmengine - INFO - The best checkpoint with 0.4830 acc/top1 at 29 epoch is saved to best_acc/top1_epoch_29.pth. 
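The learning-rate column climbs almost linearly through these epochs and reaches 1.5000e-03 at the start of epoch 29. The values are consistent with a per-iteration linear warmup from roughly 1.6e-05 towards a 1.6e-03 base rate over the first 30 epochs of 940 iterations each; that schedule is inferred by fitting the logged numbers, not quoted from the run's configuration. A small check under those assumed values:

# Warmup parameters inferred from the lr values printed above; the run's actual
# param_scheduler settings are not shown in this excerpt.
ITERS_PER_EPOCH = 940
WARMUP_EPOCHS = 30
START_LR, BASE_LR = 1.6e-5, 1.6e-3

def warmup_lr(epoch: int, it: int) -> float:
    """Linear per-iteration warmup from START_LR to BASE_LR over WARMUP_EPOCHS."""
    step = (epoch - 1) * ITERS_PER_EPOCH + it
    total = WARMUP_EPOCHS * ITERS_PER_EPOCH
    return START_LR + (BASE_LR - START_LR) * min(step, total) / total

print(f"{warmup_lr(19, 600):.4e}")  # 1.0001e-03, matching Epoch(train) [19][600/940]
print(f"{warmup_lr(29, 100):.4e}")  # 1.5000e-03, matching Epoch(train) [29][100/940]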
2023/01/24 03:55:12 - mmengine - INFO - Epoch(train) [30][100/940] lr: 1.5528e-03 eta: 1 day, 18:39:33 time: 0.9268 data_time: 0.0193 memory: 28828 grad_norm: 0.7179 loss: 4.2545 loss_cls: 4.2545 2023/01/24 03:56:44 - mmengine - INFO - Epoch(train) [30][200/940] lr: 1.5584e-03 eta: 1 day, 18:37:38 time: 0.9213 data_time: 0.0220 memory: 28828 grad_norm: 0.6807 loss: 4.4783 loss_cls: 4.4783 2023/01/24 03:58:21 - mmengine - INFO - Epoch(train) [30][300/940] lr: 1.5640e-03 eta: 1 day, 18:36:09 time: 0.9240 data_time: 0.0195 memory: 28828 grad_norm: 0.6690 loss: 4.3788 loss_cls: 4.3788 2023/01/24 03:59:54 - mmengine - INFO - Epoch(train) [30][400/940] lr: 1.5697e-03 eta: 1 day, 18:34:17 time: 0.9248 data_time: 0.0217 memory: 28828 grad_norm: 0.7049 loss: 4.3282 loss_cls: 4.3282 2023/01/24 04:01:27 - mmengine - INFO - Epoch(train) [30][500/940] lr: 1.5753e-03 eta: 1 day, 18:32:25 time: 0.9304 data_time: 0.0193 memory: 28828 grad_norm: 0.6542 loss: 4.4659 loss_cls: 4.4659