Session Overview


Monday, 06/June/2022
12:00pm – 1:00pm Key1: Opening & Keynote 1 (Coral BallRoom 2)
1:30pm – 2:30pm Oral1: Oral Session: Best Paper Session (Coral BallRoom 2)
3:00pm – 6:00pm VBS1: VBS: Expert Session (closed session) (Coral BallRoom 1)

Tuesday, 07/June/2022
12:00pm – 1:00pm Key2: Keynote 2 (Ruby 1)
1:00pm – 2:00pm Oral2: Oral Session: Applications I (Ruby 1)
SSMAPTA: Special Session: MAPTA (Ruby 2)
2:00pm – 3:00pm Oral3: Oral Session: Activities & Events (Ruby 1)
SSMDRE: Special Session: MDRE (Ruby 2)
3:00pm – 6:00pm VBS2: VBS Public Session (Coral BallRoom 1)

Wednesday, 08/June/2022
1:00pm – 2:00pm Oral4: Oral Session: Learning (Ruby 1)
SSMULTIMED: Special Session: MULTIMED (Ruby 2)
2:00pm – 3:00pm Oral5: Oral Session: Applications II (Ruby 1)
SSMACHU: Special Session: MACHU (Ruby 2)
3:00pm – 6:00pm PosterDemo: Poster & Demo Session (Foyer Ruby 1&4)

Thursday, 09/June/2022
12:00pm – 1:00pm Keynote3: Keynote 3 (Ruby 1 + 2)
1:00pm – 2:00pm Oral6: Oral Session: Applications III (Ruby 1)
Oral7: Oral Session: Image Analytics (Ruby 2)
2:00pm – 3:00pm Oral8: Oral Session: Speech & Music (Ruby 1)
Oral9: Oral Session: Multimodal Analytics (Ruby 2)
3:00pm – 3:30pm Closing: Closing Session

12:00pm – 1:00pm

  • Key1: Opening & Keynote 1 (Coral BallRoom 2)

1:30pm – 2:30pm

  • Oral1: Oral: Best Paper Session (Coral BallRoom 2)
Time Title Speaker
1:30pm Real-time detection of tiny objects based on a weighted bi-directional FPN Yaxuan Hu1, Yuehong Dai1, Zhongxiang Wang2

1University of Electronic Science and Technology of China, China;

2Shenzhen East-Win Technology Co., Ltd., China

1:45pm Multi-Modal Fusion Network for Rumor Detection with Texts and Images Boqun Li, Zhong Qian, Peifeng Li, Qiaoming Zhu

School of Computer Science and Technology, Soochow University, Suzhou, China

2:00pm PF-VTON: Toward High-Quality Parser-Free Virtual Try-On Network Yuan Chang, Tao Peng, Ruhan He, Xinrong Hu, Junping Liu, Zili Zhang, Minghua Jiang

Wuhan Textile University, People’s Republic of China

2:15pm MF-GAN: Multi-conditional Fusion Generative Adversarial Network for Text-to-Image Synthesis Yuyan Yang1,2, Xin Ni1,2, Yanbin Hao1,2, Chenyu Liu3, Wenshan Wang3, Yifeng Liu3, Haiyong Xie2,4

1University of Science and Technology of China, Anhui 230026, China;

2Key Laboratory of Cyberculture Content Cognition and Detection, Ministry of Culture and Tourism, Anhui 230026, China;

3National Engineering Laboratory for Risk Perception and Prevention (NEL-RPP), Beijing 100041, China;

4Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing 100069, China


3:00pm – 6:00pm

  • VBS1: VBS: Expert Session (closed session) (Coral BallRoom 1)

12:00pm – 1:00pm

  • Key2: Keynote 2 (Ruby 1)

1:00pm – 2:00pm

  • Oral2: Oral Session: Applications I (Ruby 1)
Time Title Speaker
1:00pm Learning to classify weather conditions from single images without labels Kezhen Xie, Lei Huang, Wenfeng Zhang, Qibing Qin, Zhiqiang Wei

Ocean University of China, People’s Republic of China

1:15pm Learning Image Representation via Attribute-aware Attention Networks for Fashion Classification Yongquan Wan1,3, Cairong Yan2, Bofeng Zhang1, Guobing Zou1

1School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China;

2School of Computer Science and Technology, Donghua University, Shanghai 201620, China;

3Department of Computer Science and Technology, Shanghai Jian Qiao University, Shanghai 201306, China

2:00pm Toward Detail-Oriented Image-Based Virtual Try-On with Arbitrary Poses Yuan Chang, Tao Peng, Ruhan He, Xinrong Hu, Junping Liu, Zili Zhang, Minghua Jiang

Wuhan Textile University, People’s Republic of China

2:15pm Parallel DBSCAN-Martingale estimation of the number of concepts for automatic satellite image clustering Ilias Gialampoukidis1, Stelios Andreadis1, Nick Pantelidis1, Sameed Hayat2, Li Zhong2, Marios Bakratsas1, Dennis Hoppe2, Stefanos Vrochidis1, Ioannis Kompatsiaris1

1Information Technologies Institute, Centre for Research & Technology Hellas, Greece;

2University of Stuttgart – High Performance Computing Center Stuttgart, Germany

  • SSMAPTA: Special Session: MAPTA (Ruby 2)
Time Title Speaker
1:00pm Introduction to MAPTA
1:05pm AI for the Media Industry: Application Potential and Automation Levels Werner Bailer1, Georg Thallinger1, Verena Krawarik2, Katharina Schell2, Victoria Ertelthalner2

1JOANNEUM RESEARCH, Austria;

2Austria Presse Agentur, Austria

1:10pm Rating-aware Self-Organizing Maps Ladislav Peška, Jakub Lokoč

Charles University, Faculty of Mathematics and Physics, Czech Republic

1:15pm Color the Word: Leveraging Web Images for Machine Translation of Untranslatable Words Yana van de Sande, Martha Larson

Radboud University, The Netherlands

1:20pm MAPTA Panel Discussion

2:00pm – 3:00pm

  • Oral3: Oral Session: Activities & Events (Ruby 1)
Time Title Speaker
2:00pm Prostate Segmentation of Ultrasound Images based on Interpretable-guided Mathematical Model Tao Peng1, Caiyin Tang2, Jing Wang1

1Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, TX, USA;

2Department of Medical Imaging, Taizhou People’s Hospital, Taizhou, Jiangsu, China

2:15pm Spatiotemporal Perturbation Based Dynamic Consistency for Semi-Supervised Temporal Action Detection Lin Wang, Yan Song, Rui Yan, Xiangbo Shu

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China

2:30pm MGMP: Multimodal Graph Message Propagation Network for Event Detection Jiankai Li, Yunhong Wang, Weixin Li

IRIP Lab, School of Computer Science and Engineering, Beihang University, Beijing, 100191, China

2:45pm Pose-Enhanced Relation Feature for Action Recognition in Still Images Jiewen Wang, Shuang Liang

Tongji University, People’s Republic of China

  • SSMDRE: Special Session: MDRE (Ruby 2)
Time Title Speaker
2:00pm Introduction to MDRE
2:05pm A Task Category Space for User-Centric Comparative Multimedia Search Evaluations Jakub Lokoč1, Luca Rossetto2, Werner Bailer3, Klaus Schoeffmann4, Stefanos Vrochidis5, Cathal Gurrin6, Silvan Heller7, Lucia Vadicamo8, Kai Uwe Barthel9, Ladislav Peška1, Jiaxin Wu10, Björn Þór Jónsson11

1Charles University, Czech Republic;
2University of Zurich, Zurich, Switzerland;
3JOANNEUM RESEARCH, Graz, Austria;
4Klagenfurt University, Klagenfurt, Austria;
5Centre for Research and Technology Hellas, Thessaloniki, Greece;
6Dublin City University, Dublin, Ireland;
7University of Basel, Basel, Switzerland;
8ISTI CNR, Pisa, Italy;
9HTW Berlin, Berlin, Germany;
10City University of Hong Kong, Hong Kong, China;
11IT University of Copenhagen, Copenhagen, Denmark

2:10pm GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval Konstantin Schall, Kai Uwe Barthel, Nico Hezel, Klaus Jung

HTW Berlin, Germany

2:15pm LLQA – Lifelog Question Answering Dataset Ly-Duyen Tran1, Thanh Cong Ho2, Lan Anh Pham2, Binh Nguyen2, Cathal Gurrin1, Liting Zhou1

1Dublin City University, Ireland; 2Ho Chi Minh University of Science, Vietnam

2:20pm MDRE Panel Discussion

3:00pm – 6:00pm

  • VBS2: VBS Public Session & Welcome Session (Coral BallRoom 1)
Time Title Speaker
3:00pm Introduction to the VBS
3:10pm Brief Team Introductions
3:30pm VBS Public Session & Welcome Session
3:30pm – 6:30pm Multi-Modal Video Retrieval in Virtual Reality with vitrivr-VR Florian Spiess1, Ralph Gasser1, Silvan Heller1, Mahnaz Parian-Scherb1, Luca Rossetto2, Loris Sauter1, Heiko Schuldt1

1University of Basel, Switzerland;
2University of Zurich, Switzerland

Multi-Modal Interactive Video Retrieval with Temporal Queries Silvan Heller1, Rahel Arnold1, Ralph Gasser1, Viktor Gsteiger1, Mahnaz Parian-Scherb1, Luca Rossetto2, Loris Sauter1, Florian Spiess1, Heiko Schuldt1

1University of Basel, Switzerland;
2University of Zurich, Switzerland

Efficient Search and Browsing of Large-Scale Video Collections with Vibro Nico Hezel, Konstantin Schall, Klaus Jung, Kai Uwe Barthel

HTW, Germany

VERGE in VBS 2022 Stelios Andreadis, Anastasia Moumtzidou, Damianos Galanopoulos, Nick Pantelidis, Konstantinos Apostolidis, Despoina Touska, Konstantinos Gkountakos, Maria Pegia, Ilias Gialampoukidis, Stefanos Vrochidis, Vasileios Mezaris, Ioannis Kompatsiaris

Centre for Research & Technology, Hellas (CERTH), Greece

Video Search with Context-aware Ranker and Relevance Feedback Jakub Lokoč, František Mejzlík, Tomáš Souček, Patrik Dokoupil, Ladislav Peška

Charles University, Czech Republic

Reinforcement Learning-Based Interactive Video Search Zhixin Ma1, Jiaxin Wu2, Zhijian Hou2, Chong-Wah Ngo1

1Singapore Management University, Singapore;
2City University of Hong Kong, Hong Kong, China

Exquisitor at the Video Browser Showdown 2022 Omar Shahbaz Khan1,2, Ujjwal Sharma2, Björn Þór Jónsson1, Stevan Rudinac2, Marcel Worring2, Jan Zahálka3

1IT University of Copenhagen, Denmark;
2University of Amsterdam, Netherlands;
3Czech Technical University in Prague, Czech Republic

VISIONE at Video Browser Showdown 2022 Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Fabrizio Falchi, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo

ISTI-CNR, Italy

Videofall – A Hierarchical Search Engine for VBS2022 Thao-Nhu Nguyen1, Bunyarit Puangthamawathanakun1, Graham Healy1, Binh T. Nguyen2,3, Cathal Gurrin1, Annalina Caputo1

1Dublin City University, Ireland;
2AISIA Research Lab;
3Vietnam National University, Ho Chi Minh University of Science

ViRMA: Virtual Reality Multimedia Analytics at Video Browser Showdown 2022 Aaron Duane, Björn Þór Jónsson

ITU Copenhagen, Denmark

UIT at VBS 2022: an Unified and Interactive video retrieval system with Temporal search Khanh Ho1,2, Vu Xuan Dinh1,2, Hong-Quang Nguyen1,2, Khiem Le1,2, Khang Dinh Tran1,2, Tien Do1,2, Tien-Dung Mai1,2, Thanh Duc Ngo1,2, Duy-Dinh Le1,2

1University of Information Technology, Ho Chi Minh City, Vietnam;
2Vietnam National University, Ho Chi Minh City, Vietnam

IVIST: Interactive Video Search Tool in VBS 2022 Sangmin Lee, Sungjune Park, Yong Man Ro

KAIST

diveXplore 6.0: ITEC’s Interactive Video Exploration System at VBS 2022 Andreas Leibetseder, Klaus Schoeffmann

Alpen-Adria-Universität Klagenfurt, Austria

CDC: Color-based Diffusion model with Caption embedding in VBS 2022 Duc-Tuan Luu1,2,3,4, Khanh-An C. Quan1,2,3,4, Thinh-Quyen Nguyen1,4, Van-Son Hua1,4, Minh-Chau Nguyen1,4, Minh-Triet Tran2,3,4, Vinh-Tiep Nguyen1,4

1University of Information Technology, VNU-HCM, Vietnam;
2University of Science, VNU-HCM, Vietnam;
3John von Neumann Institute, VNU-HCM, Vietnam;
4Vietnam National University, Ho Chi Minh City, Vietnam

AVSeeker: An Active Video Retrieval Engine VBS2022 Tu-Khiem Le1, Van-Tu Ninh1, Mai-Khiem Tran2,3,4, Graham Healy1, Cathal Gurrin1, Minh-Triet Tran2,3,4

1Dublin City University, Ireland;
2University of Science, VNU-HCM, Vietnam;
3John von Neumann Institute, VNU-HCM, Vietnam;
4Vietnam National University, Ho Chi Minh City, Vietnam

V-FIRST: A Flexible Interactive Retrieval System for Video at VBS 2022 Minh-Triet Tran1,2,3, Nhat Hoang-Xuan1,3, Hoang-Phuc Trang-Trung1,2,3, Thanh-Cong Le1,2,3, Mai-Khiem Tran1,2,3, Minh-Quan Le1,3, Tu-Khiem Le4, Van-Tu Ninh4, Cathal Gurrin4

1University of Science, VNU-HCM, Vietnam;
2John von Neumann Institute, VNU-HCM, Vietnam;
3Vietnam National University, Ho Chi Minh City, Vietnam;
4Dublin City University, Ireland

1:00pm – 2:00pm

  • Oral4: Oral Session: Learning (Ruby 1)
Time Title Speaker
1:00pm Category-sensitive Incremental Learning For Image-based 3D Shape Reconstruction Yijie Zhong, Zhengxing Sun, Shoutong Luo, Yunhan Sun, Wei Zhang

State Key Laboratory for Novel Software Technology, Nanjing University, China

1:15pm AdaConfigure: Reinforcement Learning-based Adaptive Configuration for Video Analytics Services Zhaoliang He1,4, Yuan Wang2, Chen Tang3, Zhi Wang3, Wenwu Zhu1, Chenyang Guo5, Zhibo Chen5

1Department of Computer Science and Technology, Tsinghua University;
2Tsinghua-Berkeley Shenzhen Institute, Tsinghua University;
3Tsinghua Shenzhen International Graduate School, Tsinghua University;
4Peng Cheng Laboratory;
5Tencent Youtu Lab

1:30pm Conditional Context-aware Feature Alignment for Domain Adaptive Detection Transformer Siyuan Chen

University of Science and Technology of China

1:45pm Mining Minority-class Examples With Uncertainty Estimates Gursimran Singh1, Lingyang Chu2, Lanjun Wang3, Jian Pei4, Qi Tian5, Yong Zhang1

1Huawei Technologies Canada Co., Ltd., Canada;
2McMaster University;
3Tianjin University;
4Simon Fraser University;
5Huawei Technologies China

  • SSMULTIMED: Special Session: MULTIMED (Ruby 2)
Time Title Speaker
1:00pm Human activity recognition with IMU and vital signs feature fusion Vasileios-Rafail Xefteris1, Athina Tsanousa1, Thanasis Mavropoulos1, Georgios Meditskos2, Stefanos Vrochidis1, Ioannis Kompatsiaris1

1Centre for Research and Technology, Greece;
2School of Informatics, Aristotle University of Thessaloniki, Greece

1:15pm On Assisting Diagnoses of Pareidolia by Emulating Patient Behavior Zhaohui Zhu1,2, Marc A. Kastner2, Shin’ichi Satoh2,1

1The University of Tokyo, Tokyo, Japan;
2National Institute of Informatics, Tokyo, Japan

1:30pm Using Explainable AI to Identify Differences between Clinical and Experimental Pain Detection Models Based on Facial Expressions Pooja Prajod, Tobias Huber, Elisabeth André

University of Augsburg, Augsburg, Germany


2:00pm – 3:00pm

  • Oral5: Oral Session: Applications II (Ruby 1)
Time Title Speaker
2:00pm Multi-scale Cross-modal Transformer Network for RGB-D Object Detection Zhibin Xiao, Pengwei Xie, Guijin Wang

Tsinghua University, People’s Republic of China

2:15pm Double Granularity Relation Network with Self-Criticism for Occluded Person Re-Identification Xuena Ren1,3, Dongming Zhang2, Xiuguo Bao2, Lei Shi2

1Institute of Information Engineering, Chinese Academy of Sciences;
2The National Computer Network Emergency Response Technical Team Coordination Center of China;
3School of Cyber Security, University of Chinese Academy of Sciences

2:30pm Joint Re-Detection and Re-Identification for Multi-Object Tracking Jian He1, Xian Zhong1,2, Jingling Yuan1, Ming Tan1, Shilei Zhao1, Luo Zhong1

1School of Computer and Artificial Intelligence, Wuhan University of Technology, China;
2School of Electronics Engineering and Computer Science, Peking University, China

2:45pm A Complementary Fusion Strategy for RGB-D Face Recognition Haoyuan Zheng, Weihang Wang, Fei Wen, Peilin Liu

Brain-inspired Application Technology Center,
School of Electronic Information and Electrical Engineering,
Shanghai Jiao Tong University, Shanghai, China

  • SSMACHU: Special Session: MACHU (Ruby 2)
Time Title Speaker
2:00pm Introduction to MACHU
2:05pm An Investigation into Keystroke Dynamics and Heart Rate Variability as Indicators of Stress Srijith Unni, Sushma Suryanarayana Gowda, Alan F. Smeaton

Dublin City University, Ireland

2:10pm Fall detection using multimodal data Thao Ha1,2, Hoang Nguyen1, Son Huynh1,4, Trung Nguyen3, Binh Nguyen1,2,4

1University of Science, Ho Chi Minh City, Vietnam;
2Vietnam National University in Ho Chi Minh City, Vietnam;
3Hong Bang International University, Ho Chi Minh City, Vietnam;
4AISIA Research Lab, Ho Chi Minh City, Vietnam

2:15pm Multimodal Embedding for Lifelog Retrieval Liting Zhou1, Cathal Gurrin2

1Dublin City University, Ireland;
2Dublin City University, Ireland

2:20pm Prediction of Blood Glucose using Contextual LifeLog Data Tenzin Palbar, Manoj Kesavulu, Renaat Verbruggen, Cathal Gurrin

Dublin City University

2:25pm MACHU Panel Discussion

3:00pm – 6:00pm

  • PosterDemo: Poster & Demo Session (Foyer Ruby 1&4)
Time Title Speaker
3:00pm Poster & Demo Booster Session
3:45pm Poster & Demo Session
3:30pm – 6:30pm POSTER
Long-range Feature Dependencies Capturing for Low-resolution Image Classification Sheng Kang, Yang Wang, Yang Cao, Zheng-Jun Zha

University of Science and Technology of China

An IBC reference block enhancement model based on GAN for Screen Content Video Coding Pengjian Yang1, Jun Wang2, Guangyu Zhong1, Pengyuan Zhang3, Lai Zhang1, Fan Liang1, Jianxin Yang2

1School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou, China;
2Zhuhai Jieli Technology Co., Ltd.;
3Wuhan Research Institute of Posts and Telecommunications, Wuhan, China

AS-Net: Class-aware Assistance and Suppression Network for Few-shot Learning Ruijing Zhao, Kai Zhu, Yang Cao, Zhengjun Zha

University of Science and Technology of China, People’s Republic of China

DIG: A Data-driven Impact-based Grouping Method for Video Rebuffering Optimization Shengbin Meng1,2, Chunyu Qiao1, Junlin Li1, Yue Wang1, Zongming Guo2

1Video Architecture Team, ByteDance Inc.;
2Wangxuan Institute of Computer Technology, Peking University

Fast Detection of Multi-Direction Remote Sensing Ship Object Based on Scale Space Pyramid Ziying Song, Kuihe Yang, Yu Zhang, Yi Liu

Hebei University of Science and Technology

Indie Games Popularity Prediction by Considering Multimodal Features Yu-Heng Huang, Wei-Ta Chu

National Cheng Kung University, Taiwan

An Iterative Correction Phase of Light Field for novel view Reconstruction Changjian Zhu1, Hong Zhang2, Ying Wei3, Nan He4, Qiuming Liu5

1Guangxi Normal University;
2Guilin Normal College;
3Guangxi Normal University;
4Guilin Normal College;
5JiangXi University of Science and Technology

Multi-object Tracking with A Hierarchical Single-branch Network Fan Wang, Lei Luo, En Zhu, SiWei Wang

National University of Defense Technology, School of Computer, Changsha, China

ILMICA – Interactive Learning Model of Image Collage Assessment: A Transfer Learning Approach for Aesthetic Principles Ani Withöft1, Larbi Abdenebaoui1, Susanne Boll2

1OFFIS – Institute for Information Technology, Oldenburg, Germany;
2Carl von Ossietzky University, Oldenburg, Germany

Exploring Implicit and Explicit Relations with the Dual Relation-Aware Network for Image Captioning Zhiwei Zha1, Pengfei Zhou1,2, Cong Bai1

1College of Computer Science and Technology, Zhejiang University of Technology;
2Institute of Computing Technology, Chinese Academy of Sciences

Generative Landmarks Guided Eyeglasses Removal 3D Face Reconstruction Dapeng Zhao1, Yue Qi1,2,3

1State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering at Beihang University, Beijing, China;
2Peng Cheng Laboratory, Shenzhen, China;
3Qingdao Research Institute of Beihang University, Qingdao, China

Patching Your Clothes: Semantic-aware Learning for Cloth-Changed Person Re-Identification Xuemei Jia1, Xian Zhong1,2, Mang Ye3, Wenxuan Liu1, Wenxin Huang4, Shilei Zhao1

1School of Computer and Artificial Intelligence, Wuhan University of Technology, China;
2School of Electronics Engineering and Computer Science, Peking University, China;
3School of Computer Science, Wuhan University, China;
4School of Computer Science and Information Engineering, Hubei University, China

Lightweight Wavelet-Based Network for JPEG Artifacts Removal Yuejin Sun, Yang Wang, Yang Cao, Zheng-Jun Zha

University of Science and Technology of China, People’s Republic of China

Shared Latent Space of Font Shapes and Their Noisy Impressions Jihun Kang1, Daichi Haraguchi1, Seiya Matsuda1, Akisato Kimura2, Seiichi Uchida1

1Kyushu University, Japan;
2NTT Communication Science Laboratories

Reconstructing 3D Contour Models of General Scenes from RGB-D Sequences Weiran Wang, Huijun Di, Lingxiao Song

Beijing Laboratory of Intelligent Information Technology, School of Computer Science and Technology, Beijing Institute of Technology

Creating Controllable Eye Blink for Talking Face Generation Jiaqi Hao, Shiguang Liu, Qing Xu

Tianjin University, People’s Republic of China

SUnet++: Joint Demosaicing and Denoising of Extreme Low-light Raw Image JingZhong Qi1, Na Qi1,2, Qing Zhu1,2

1Beijing University Of Technology, China; 2Beijing Institute of Artificial Intelligence, China

HyText – a Scene-Text Extraction Method for Video Retrieval Alexander Theus, Luca Rossetto, Abraham Bernstein

University of Zurich, Switzerland

Depthwise-separable Residual Capsule for Robust Keyword Spotting Huang Xianghong, Yang Qun, Liu Shaohan

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China

Adaptive Speech Intelligibility Enhancement for Far-and-Near-end Noise Environments Based on Self-Attention StarGAN Dengshi Li1, Lanxin Zhao1, Jing Xiao2, Jiaqi Liu2, Duanzheng Guan1, Qianrui Wang1

1School of Artificial Intelligence, Jianghan University, Wuhan 430056, China;
2National Engineering Research Center for Multimedia Software, School of Computer, Wuhan University, Wuhan 430072, China

Personalized Fashion Recommendation using Pairwise Attention Donnaphat Trakulwaranont1,2, Marc A. Kastner2, Shin’ichi Satoh2,1

1The University of Tokyo, Japan; 2National Institute of Informatics, Japan

Graph Neural Networks Based Multi-Granularity Feature Representation Learning for Fine-Grained Visual Categorization Hongyan Wu1, Haiyun Guo2, Qinghai Miao1, Min Huang1, Jinqiao Wang1,2

1School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China;
2Institute of Automation, Chinese Academy of Sciences, Beijing, China

Image to Image Translation Makes Malpositioned Teeth Orderly Sanbi Luo1,2

1Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China;
2School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China

Skeletonization Based on K-Nearest-Neighbors on Binary Image Yi Ren, Min Zhang, Hongyu Zhou, Ji Liu

Chongqing University, People’s Republic of China

Classroom Attention Estimation Method Based on Mining Facial Landmarks of Students Liyan Chen1,2, Haoran Yang1, Kunhong Liu1,2

1School of Informatics, Xiamen University, China; 2School of Film, Xiamen University, China

A Novel Chinese Sarcasm Detection Model Based on Retrospective Reader Lei Zhang1,2, Xiaoming Zhao1, Xueqiang Song1, Yuwei Fang1, Dong Li4, Haizhou Wang3

1College of Computer Science, Sichuan University, Chengdu, 610065, China;
2Institute for Industrial Internet Research, Sichuan University, Chengdu, 610065, China;
3School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China;
4Department of Computer Technology and Applications, Qinghai University, Xining, Qinghai, China, 810016

Effects and Combination of Tailored Browser-Based and Mobile Cognitive Software Training Mareike Gabele1,3, Andrea Thoms2, Simon Schröer1, Steffi Hußlein3, Christian Hansen1

1Otto von Guericke University Magdeburg, Faculty of Computer Science, Germany;
2HASOMED GmbH, Paul-Ecke-Straße 1, 39114 Magdeburg, Germany;
3Magdeburg-Stendal University of Applied Sciences, Institute Industrial Design, Germany

Progressive GAN-based Transfer Network for Low-Light Image Enhancement Shuang Jin1, Na Qi1,2, Qing Zhu1,2, Haoran Ouyang1

1Faculty of Information Technology, Beijing University of Technology, China;
2Beijing Institute of Artificial Intelligence

Rethinking Shared Features and Re-Ranking for Cross-Modality Person Re-Identification Na Jiang1, Zhaofa Wang1, Peng Xu1, Xinyue Wu1,2, Lei Zhang2

1Capital Normal University, College of Information Engineering;
2Beihang University

Adversarial Attacks on Deepfake Detectors: A Practical Analysis Ngan Hoang Vo1,2, Khoa D Phan1,2, Anh-Duy Tran1,2, Duc-Tien Dang-Nguyen3,4

1FIT, University of Science, HCMC, Vietnam;
2Vietnam National University, Ho Chi Minh City, Vietnam;
3University of Bergen, Norway;
4Kristiania University College, Norway

Multi-Modal Semantic Inconsistency Detection in Social Media News Posts Scott McCrae, Kehan Wang, Avideh Zakhor

University of California, Berkeley, United States of America

EEG Emotion Recognition Based On Dynamically Organized Graph Neural Network Hanyu Li1,2, Xu Zhang1,2, Ying Xia1,2

1Department of Computer Science and Technology, Chongqing University of Posts and Telecommunications, China;
2Chongqing Engineering Research Center of Spatial Big Data Intelligent Technology, Chongqing, China

An Unsupervised Multi-Scale Generative Adversarial Network for Remote Sensing Image Pan-Sharpening Yajie Wang1, Yanyan Xie2, Yanyan Wu1, Kai Liang2, Jilin Qiao2

1Engineering Training Center, Shenyang Aerospace University, Shenyang, China;
2School of Computer Science, Shenyang Aerospace University, Shenyang, China

Leveraging Selective Prediction for Reliable Image Geolocation Apostolos Panagiotopoulos, Giorgos Kordopatis-Zilos, Symeon Papadopoulos

Information Technologies Institute – CERTH, Greece

Multi-scale Fusion Attention Network for Polyp Segmentation Huang Dongjin, Han Kaili, Xi Yongjie, Che Wenqi

Shanghai University, People’s Republic of China

Compressive Sensing-based Image Encryption and Authentication in Edge-clouds Hongying Zheng, Yawen Huang, Lin Li, Di Xiao

College of Computer Science, Chongqing University, Chongqing 400044, China

ECAS-ML: Edge Computing Assisted Adaptation Scheme with ML for HAS Jesús Aguilar Armijo, Ekrem Çetinkaya, Christian Timmerer, Hermann Hellwagner

Christian Doppler Laboratory ATHENA,
Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria

Fast CU Depth Decision Algorithm for AVS3 Shiyi Liu1,2, Zhenyu Wang1,2, Ke Qiu1,2, Jiayu Yang1,2, Ronggang Wang1,2

1Peking University Shenzhen Graduate School, Shenzhen, China;
2Pengcheng Laboratory, Shenzhen, China

MEViT: Motion Enhanced Video Transformer for Video Classification Li Li1, Liansheng Zhuang2

1School of Data Science, University of Science and Technology of China;
2School of Information Science and Technology, University of Science and Technology of China

CDeRSNet: Towards High Performance Object Detection in Vietnamese Document Images Thuan Trong Nguyen, Thuan Q. Nguyen, Long Duong, Nguyen D. Vo, Khang Nguyen

University of Information and Technology, Vietnam National University, Ho Chi Minh, Vietnam

DEMOS
Making Few-shot Object Detection Simpler and Less Frustrating Werner Bailer

JOANNEUM RESEARCH, Austria

PicArrange – Visually Sort, Search, and Explore Private Images on a Mac Computer Klaus Jung, Kai Uwe Barthel, Nico Hezel, Konstantin Schall

HTW Berlin, Germany

XQM: Search-Oriented vs. Classifier-Oriented Relevance Feedback on Mobile Phones Kim I. Schild1, Alexandra M. Bagi1, Magnus Holm Mamsen1, Omar Shahbaz Khan1, Björn Þór Jónsson1, Jan Zahálka2

1IT University of Copenhagen, Copenhagen, Denmark; 2Czech Technical University, Prague, Czech Republic

MoViDNN: A Mobile Platform for Evaluating Video Quality Enhancement with Deep Neural Networks Ekrem Çetinkaya, Minh Nguyen, Christian Timmerer

Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria

DataCAP: A Satellite Datacube and Crowdsourced Street-level Images for the Monitoring of the Common Agricultural Policy Vasileios Sitokonstantinou1,2, Alkiviadis Koukos1, Thanassis Drivas1, Charalampos Kontoes1, Vassilia Karathanassi2

1Institute for Space Applications and Remote Sensing, National Observatory of Athens, BEYOND Centre of EO Research and Satellite Remote Sensing, I. Metaxa and Vas. Pavlou St, Penteli, 15236 Athens, Greece;
2Laboratory of Remote Sensing, National Technical University of Athens, 9 Heroon Polytechniou Str., Zographos, 15790 Athens, Greece

A Virtual Reality Reminiscence Interface for Personal Lifelog Ly-Duyen Tran1, Diarmuid Kennedy1, Liting Zhou1, Binh Nguyen2,3, Cathal Gurrin1

1Dublin City University, Ireland;
2AISIA Research Lab;
3Vietnam National University, Ho Chi Minh University of Science

Dissemination of ML model’s next day wildfire risk prediction through web application Stella Girtsou, Alexis Apostolakis, George Giannopoulos, Charalampos Kontoes

National Observatory of Athens, Greece

12:00pm – 1:00pm

  • Keynote3: Keynote 3 (Ruby 1 + 2)

1:00pm – 2:00pm

  • Oral6: Oral Session: Applications III (Ruby 1)
Title Speaker
JVCSR: Video Compressive Sensing Reconstruction with Joint In-loop Reference Enhancement and Out-loop Super-resolution Jian Yang, Chi Do Kim Pham, Jinjia Zhou

Graduate School of Science and Engineering, Hosei University, Japan

Point Cloud Upsampling via a Coarse-to-fine Network Yingrui Wang, Suyu Wang, Longhua Sun

Faculty of Information Technology, Beijing University of Technology, Beijing, China

A Multiple Positives Enhanced NCE Loss for Image-Text Retrieval Yi Li, Dehao Wu, Yuesheng Zhu

Shenzhen Graduate School, Peking University, Beijing, China

SAM: Self Attention Mechanism for Scene Text Recognition based on Swin Transformer Xiang Shuai, Xiao Wang, Wei Wang, Xin Yuan, Xin Xu

Wuhan University of Science and Technology

  • Oral7: Oral Session: Image Analytics (Ruby 2)
Title Speaker
One-Stage Image Inpainting with Hybrid Attention Lulu Zhao1, Ling Shen2, Richang Hong1

1School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China;
2School of Internet, Anhui University, Hefei 230039, China

Fast Single Image Dehazing Using Morphological Reconstruction and Saturation Compensation Shuang Zheng, Liang Wang

Beijing University of Technology

Arbitrary Style Transfer With Adaptive Channel Network Yuzhuo Wang, Yanlin Geng

Xidian University, People’s Republic of China

Real-time FPGA Design for OMP Targeting 8K Image Reconstruction Jiayao Xu, Chen Fu, Zhiqiang Zhang, Jinjia Zhou

Graduate School of Science and Engineering, Hosei University, Tokyo, Japan


2:00pm – 3:00pm

  • Oral8: Oral Session: Speech & Music (Ruby 1)
Title Speaker
Time-Frequency Attention for Speech Emotion Recognition with Squeeze-and-Excitation Blocks Ke Liu, Chen Wang, Jiayue Chen, Jun Feng

School of Information Science and Technology, Northwest University, Xi’an, 710127, Shaanxi, China

Speech Intelligibility Enhancement by Non-Parallel Speech Style Conversion Using CWT and iMetricGAN Based CycleGAN Jing Xiao1,2, Jiaqi Liu1,2, Dengshi Li3, Lanxin Zhao3, Qianrui Wang3

1National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, China;
2Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan 430072, China;
3School of Artificial Intelligence, Jianghan University, Wuhan 430056, China

Melody Generation from Lyrics Using Three Branch Conditional LSTM-GAN Abhishek Srivastava1, Wei Duan2, Rajiv Ratn Shah1, Jianming Wu3, Suhua Tang4, Wei Li5, Yi Yu2

1Multimodal Digital Media Analysis Lab, Indraprastha Institute of Information Technology, Delhi;
2Digital Content and Media Sciences Research Division, National Institute of Informatics, Japan;
3KDDI Research, Inc, Japan;
4Graduate School of Informatics and Engineering, The University of Electro-Communications;
5School of Computer Science and Technology, Fudan University, China

A-Muze-Net: Music Generation by Composing the Harmony based on the Generated Melody Or Goren1, Eliya Nachmani1,2, Lior Wolf1

1Tel Aviv University, Israel; 2Facebook AI Research

  • Oral9: Oral Session: Multimodal Analytics (Ruby 2)
Title Speaker
Non-Uniform Attention Network for Multi-modal Sentiment Analysis Binqiang Wang1,2,3, Gang Dong1,2,3, Yaqian Zhao1,2,3, Rengang Li1,2,3, Qichun Cao1,2,3, YinYin Chao1,2,3

1Inspur (Beijing) Electronic Information Industry Co., Ltd.;
2Inspur Electronic Information Industry Co., Ltd.;
3Shandong Massive Information Technology Research Institute

Multimodal Unsupervised Image-to-Image Translation Without Independent Style Encoder Yanbei Sun, Yao Lu, Haowei Lu, Qingjie Zhao, Shunzhou Wang

Beijing Institute of Technology, People’s Republic of China

Combining Knowledge and Multi-modal Fusion for Meme Classification Qi Zhong, Qian Wang, Ji Liu

College of Computer Science, Chongqing University

Bi-attention modal separation network for multimodal video fusion Pengfei Du1,2, Yali Gao1,2, Xiaoyong Li1,2

1Beijing University of Posts and Telecommunications, People’s Republic of China;
2Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education


3:00pm – 3:30pm

  • Closing: Closing Session