Abstract
The fusion of visible and infrared images has attracted significant attention in imaging because of its pivotal role in applications such as surveillance, remote sensing, and medical imaging. This paper introduces a novel fusion framework built on the Res2Net architecture, which captures features across diverse receptive fields and scales to extract both global and local features effectively. Our method comprises three components: a Res2Net-based encoder, a fusion layer, and a decoder. The encoder extracts multi-scale features from the input image, and we introduce a new training strategy, tailored to the Res2Net-based encoder, that requires only a single image as input. We further enhance the fusion process with a novel attention-based fusion strategy, enabling the decoder to reconstruct the fused image precisely. Experimental results show that our method achieves superior fusion performance compared with existing techniques, as evidenced by both subjective and objective evaluations.
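The abstract does not spell out the attention-based fusion rule, so the following is only a minimal sketch of one common choice for fusing encoder feature maps: a per-pixel softmax attention over channel-wise L1 activity maps of the visible and infrared features. The function name `attention_fusion` and the specific weighting scheme are assumptions for illustration, not the paper's exact method.

```python
import numpy as np

def attention_fusion(feat_vis: np.ndarray, feat_ir: np.ndarray) -> np.ndarray:
    """Fuse two feature maps of shape (C, H, W) with softmax attention.

    Hypothetical sketch: weights come from a softmax over per-pixel
    L1 activity maps; the paper's actual attention model may differ.
    """
    # Per-pixel activity: L1 norm across channels, shape (1, H, W)
    a_vis = np.abs(feat_vis).sum(axis=0, keepdims=True)
    a_ir = np.abs(feat_ir).sum(axis=0, keepdims=True)

    # Softmax over the two activity maps gives per-pixel fusion weights
    e_vis, e_ir = np.exp(a_vis), np.exp(a_ir)
    w_vis = e_vis / (e_vis + e_ir)
    w_ir = 1.0 - w_vis

    # Convex combination: each fused pixel lies between its two inputs
    return w_vis * feat_vis + w_ir * feat_ir
```

Because the weights form a convex combination at every spatial location, the fused feature map stays within the elementwise range of the two inputs, which keeps the decoder's reconstruction well-conditioned.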
Original language | English |
---|---|
Title of host publication | 2023 International Conference on Machine Vision, Image Processing and Imaging Technology (MVIPIT) |
Editors | Lisa Trinh |
Publisher | IEEE |
Pages | 17-23 |
Number of pages | 7 |
ISBN (Electronic) | 979-8-3503-0654-5 |
ISBN (Print) | 979-8-3503-0655-2 |
DOIs | |
Publication status | Published - 5 Jul 2024 |
Event | 2023 International Conference on Machine Vision, Image Processing and Imaging Technology, Hangzhou, China. Duration: 22 Sept 2023 → 24 Sept 2023 |
Conference
Conference | 2023 International Conference on Machine Vision, Image Processing and Imaging Technology |
---|---|
Abbreviated title | MVIPIT |
Country/Territory | China |
City | Hangzhou |
Period | 22/09/23 → 24/09/23 |
Bibliographical note
© 2023 IEEE
Funding
This work was supported by the National Natural Science Foundation of China (62202205, 62020106012), the Fundamental Research Funds for the Central Universities (JUSRP123030).
Funders | Funder number |
---|---|
National Natural Science Foundation of China | 62020106012, 62202205 |
Fundamental Research Funds for the Central Universities | JUSRP123030 |
Keywords
- Visible image
- infrared image
- image fusion
- training strategy
- multi-scale
- attention model