Showing 1–50 of 112 results for author: Gao, S

  1. arXiv:2407.09645  [pdf, other]

    eess.SY cs.LG cs.RO

    Hamilton-Jacobi Reachability in Reinforcement Learning: A Survey

    Authors: Milan Ganai, Sicun Gao, Sylvia Herbert

    Abstract: Recent literature has proposed approaches that learn control policies with high performance while maintaining safety guarantees. Synthesizing Hamilton-Jacobi (HJ) reachable sets has become an effective tool for verifying safety and supervising the training of reinforcement learning-based control policies for complex, high-dimensional systems. Previously, HJ reachability was limited to verifying lo…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.07372  [pdf, other]

    eess.IV cs.CV

    Trustworthy Contrast-enhanced Brain MRI Synthesis

    Authors: Jiyao Liu, Yuxin Li, Shangqi Gao, Yuncheng Zhou, Xin Gao, Ningsheng Xu, Xiao-Yong Zhang, Xiahai Zhuang

    Abstract: Contrast-enhanced brain MRI (CE-MRI) is a valuable diagnostic technique but may pose health risks and incur high costs. To create safer alternatives, multi-modality medical image translation aims to synthesize CE-MRI images from other available modalities. Although existing methods can generate promising predictions, they still face two challenges, i.e., exhibiting over-confidence and lacking inte…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures

  3. arXiv:2407.00896  [pdf, other]

    eess.SP cs.AI

    Channel Modeling Aided Dataset Generation for AI-Enabled CSI Feedback: Advances, Challenges, and Solutions

    Authors: Yupeng Li, Gang Li, Zirui Wen, Shuangfeng Han, Shijian Gao, Guangyi Liu, Jiangzhou Wang

    Abstract: The AI-enabled autoencoder has demonstrated great potential in channel state information (CSI) feedback in frequency division duplex (FDD) multiple input multiple output (MIMO) systems. However, this method completely changes the existing feedback strategies, making it impractical to deploy in recent years. To address this issue, this paper proposes a channel modeling aided data augmentation metho…

    Submitted 30 June, 2024; originally announced July 2024.

  4. arXiv:2406.14440  [pdf, other]

    eess.SP

    LLM4CP: Adapting Large Language Models for Channel Prediction

    Authors: Boxun Liu, Xuanyu Liu, Shijian Gao, Xiang Cheng, Liuqing Yang

    Abstract: Channel prediction is an effective approach for reducing the feedback or estimation overhead in massive multi-input multi-output (m-MIMO) systems. However, existing channel prediction methods lack precision due to model mismatch errors or network generalization issues. Large language models (LLMs) have demonstrated powerful modeling and generalization abilities, and have been successfully applied…

    Submitted 20 June, 2024; originally announced June 2024.

  5. arXiv:2406.09304  [pdf]

    physics.app-ph eess.SP

    Self-reconfigurable Multifunctional Memristive Nociceptor for Intelligent Robotics

    Authors: Shengbo Wang, Mingchao Fang, Lekai Song, Cong Li, Jian Zhang, Arokia Nathan, Guohua Hu, Shuo Gao

    Abstract: Artificial nociceptors, mimicking human-like stimuli perception, are of significance for intelligent robotics to work in hazardous and dynamic scenarios. One of the most essential characteristics of the human nociceptor is its self-adjustable attribute, which indicates that the threshold of determination of a potentially hazardous stimulus relies on environmental knowledge. This critical attribute…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 14 pages, 4 figures

  6. arXiv:2405.18435  [pdf, other]

    eess.IV cs.CV

    QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

    Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag, et al. (55 additional authors not shown)

    Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de…

    Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

    Comments: initial technical report

  7. arXiv:2405.14347  [pdf, other]

    eess.SP cs.AI

    Doubly-Dynamic ISAC Precoding for Vehicular Networks: A Constrained Deep Reinforcement Learning (CDRL) Approach

    Authors: Zonghui Yang, Shijian Gao, Xiang Cheng

    Abstract: Integrated sensing and communication (ISAC) technology is essential for enabling vehicular networks. However, the communication channel in this scenario exhibits time-varying characteristics, and the potential targets may move rapidly, creating a doubly-dynamic phenomenon. This nature poses a challenge for real-time precoder design. While optimization-based solutions are widely researched, the…

    Submitted 23 May, 2024; originally announced May 2024.

  8. arXiv:2405.09778  [pdf, other]

    eess.SP

    Beam Pattern Modulation Embedded Hybrid Transceiver Optimization for Integrated Sensing and Communication

    Authors: Boxun Liu, Shijian Gao, Zonghui Yang, Xiang Cheng, Liuqing Yang

    Abstract: Integrated sensing and communication (ISAC) emerges as a promising technology for B5G/6G, particularly in the millimeter-wave (mmWave) band. However, the widely utilized hybrid architecture in mmWave systems compromises multiplexing gain due to the constraints of limited radio frequency chains. Moreover, additional sensing functionalities exacerbate the impairment of spectrum efficiency (SE). In t…

    Submitted 15 May, 2024; originally announced May 2024.

  9. arXiv:2405.09663  [pdf]

    eess.SP

    Design and Implementation of mmWave Surface Wave Enabled Fluid Antennas and Experimental Results for Fluid Antenna Multiple Access

    Authors: Yuanjun Shen, Boyi Tang, Shuai Gao, Kin-Fai Tong, Hang Wong, Kai-Kit Wong, Yangyang Zhang

    Abstract: While multiple-input multiple-output (MIMO) technologies continue to advance, concerns arise as to how MIMO can remain scalable if more users are to be accommodated with an increasing number of antennas at the base station (BS) in the upcoming sixth generation (6G). Recently, the concept of fluid antenna system (FAS) has emerged, which promotes position flexibility to enable transmitter channel st…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE Transactions on Antennas and Propagation

  10. arXiv:2405.02942  [pdf, other]

    physics.optics cs.CV cs.RO eess.IV

    Design, analysis, and manufacturing of a glass-plastic hybrid minimalist aspheric panoramic annular lens

    Authors: Shaohua Gao, Qi Jiang, Yiqi Liao, Yi Qiu, Wanglei Ying, Kailun Yang, Kaiwei Wang, Benhao Zhang, Jian Bai

    Abstract: We propose a high-performance glass-plastic hybrid minimalist aspheric panoramic annular lens (ASPAL) to solve several major limitations of the traditional panoramic annular lens (PAL), such as large size, high weight, and complex system. The field of view (FoV) of the ASPAL is 360°x(35°~110°) and the imaging quality is close to the diffraction limit. This large FoV ASPAL is composed of only 4 len…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted to Optics & Laser Technology

  11. arXiv:2404.19201  [pdf, other]

    eess.IV cs.CV cs.RO physics.optics

    Global Search Optics: Automatically Exploring Optimal Solutions to Compact Computational Imaging Systems

    Authors: Yao Gao, Qi Jiang, Shaohua Gao, Lei Sun, Kailun Yang, Kaiwei Wang

    Abstract: The popularity of mobile vision creates a demand for advanced compact computational imaging systems, which call for the development of both a lightweight optical system and an effective image reconstruction model. Recently, joint design pipelines have come to the research forefront, where the two significant components are simultaneously optimized via data-driven learning to realize the optimal system…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: The source code will be made publicly available at https://github.com/wumengshenyou/GSO

  12. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  13. arXiv:2404.13388  [pdf]

    eess.IV cs.CV cs.LG

    Diagnosis of Multiple Fundus Disorders Amidst a Scarcity of Medical Experts Via Self-supervised Machine Learning

    Authors: Yong Liu, Mengtian Kang, Shuo Gao, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Arokia Nathan, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Luigi Occhipinti

    Abstract: Fundus diseases are major causes of visual impairment and blindness worldwide, especially in underdeveloped regions, where the shortage of ophthalmologists hinders timely diagnosis. AI-assisted fundus image analysis has several advantages, such as high accuracy, reduced workload, and improved accessibility, but it requires a large amount of expert-annotated data to build reliable models. To addres…

    Submitted 23 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  14. arXiv:2404.13386  [pdf]

    eess.IV cs.CV cs.LG

    SSVT: Self-Supervised Vision Transformer For Eye Disease Diagnosis Based On Fundus Images

    Authors: Jiaqi Wang, Mengtian Kang, Yong Liu, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Shuo Gao, Luigi G. Occhipinti

    Abstract: Machine learning-based fundus image diagnosis technologies trigger worldwide interest owing to their benefits such as reducing medical resource power and providing objective evaluation results. However, current methods are commonly based on supervised methods, bringing in a heavy workload to biomedical staff and hence suffering in expanding effective databases. To address this issue, in this artic…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ISBI 2024

  15. arXiv:2403.19425  [pdf, ps, other]

    eess.IV cs.CV

    A Robust Ensemble Algorithm for Ischemic Stroke Lesion Segmentation: Generalizability and Clinical Utility Beyond the ISLES Challenge

    Authors: Ezequiel de la Rosa, Mauricio Reyes, Sook-Lei Liew, Alexandre Hutton, Roland Wiest, Johannes Kaesmacher, Uta Hanning, Arsany Hakim, Richard Zubal, Waldo Valenzuela, David Robben, Diana M. Sima, Vincenzo Anania, Arne Brys, James A. Meakin, Anne Mickan, Gabriel Broocks, Christian Heitkamp, Shengbo Gao, Kongming Liang, Ziji Zhang, Md Mahfuzur Rahman Siddiquee, Andriy Myronenko, Pooya Ashtari, Sabine Van Huffel, et al. (33 additional authors not shown)

    Abstract: Diffusion-weighted MRI (DWI) is essential for stroke diagnosis, treatment decisions, and prognosis. However, image and disease variability hinder the development of generalizable AI algorithms with clinical value. We address this gap by presenting a novel ensemble algorithm derived from the 2022 Ischemic Stroke Lesion Segmentation (ISLES) challenge. ISLES'22 provided 400 patient scans with ischemi…

    Submitted 3 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  16. arXiv:2403.15228  [pdf, ps, other]

    math.OC eess.SY

    On moment relaxations for linear state feedback controller synthesis with non-convex quadratic costs and constraints

    Authors: Dennis Gramlich, Sheng Gao, Hao Zhang, Carsten W. Scherer, Christian Ebenbauer

    Abstract: We present a simple and effective way to account for non-convex costs and constraints in state feedback synthesis, and an interpretation for the variables in which state feedback synthesis is typically convex. We achieve this by deriving the controller design using moment matrices of state and input. It turns out that this approach allows the consideration of non-convex constraints by relaxing the…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Preprint to be submitted to IEEE Conference on Decision and Control

  17. arXiv:2403.13314  [pdf, other]

    eess.SP

    Superposed IM-OFDM (S-IM-OFDM): An Enhanced OFDM for Integrated Sensing and Communications

    Authors: Zonghui Yang, Shijian Gao, Xiang Cheng, Liuqing Yang

    Abstract: Integrated sensing and communications (ISAC) is a critical enabler for emerging 6G applications, and at its core lies the dual-functional waveform design. While orthogonal frequency division multiplexing (OFDM) has been a popular basic waveform, its primitive version falls short in sensing due to the inherent unregulated auto-correlation properties. Furthermore, the sensitivity to Doppler shift…

    Submitted 20 March, 2024; originally announced March 2024.

  18. arXiv:2403.10417  [pdf, other]

    eess.SP

    Beam Pattern Modulation Embedded mmWave Hybrid Transceiver Design Towards ISAC

    Authors: Boxun Liu, Shijian Gao, Zonghui Yang, Xiang Cheng

    Abstract: Integrated Sensing and Communication (ISAC) emerges as a promising technology for B5G/6G, particularly in the millimeter-wave (mmWave) band. However, the widespread adoption of hybrid architecture in mmWave systems compromises multiplexing gain due to limited radio-frequency chains, resulting in mediocre performance when embedding sensing functionality. To avoid sacrificing the spectrum efficiency…

    Submitted 15 March, 2024; originally announced March 2024.

  19. arXiv:2403.10012  [pdf, other]

    cs.CV cs.RO eess.IV physics.optics

    Real-World Computational Aberration Correction via Quantized Domain-Mixing Representation

    Authors: Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang

    Abstract: Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal performance in real-world applications. In this paper, in contrast to improving the simulation pipeline, we deliver a novel insight into real-world CAC from the perspective of Unsupervi…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Codes and datasets will be made publicly available at https://github.com/zju-jiangqi/QDMR

  20. arXiv:2403.08505  [pdf, other]

    eess.IV cs.AI cs.CV cs.MM

    Content-aware Masked Image Modeling Transformer for Stereo Image Compression

    Authors: Xinjie Zhang, Shenyuan Gao, Zhening Liu, Jiawei Shao, Xingtong Ge, Dailan He, Tongda Xu, Yan Wang, Jun Zhang

    Abstract: Existing learning-based stereo image codecs adopt sophisticated transformations with simple entropy models derived from single image codecs to encode latent representations. However, those entropy models struggle to effectively capture the spatial-disparity characteristics inherent in stereo images, which leads to suboptimal rate-distortion results. In this paper, we propose a stereo image compressi…

    Submitted 19 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  21. arXiv:2403.05834  [pdf, other]

    cs.MM cs.SD eess.AS

    Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information

    Authors: Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng

    Abstract: Dance generation, as a branch of human motion generation, has attracted increasing attention. Recently, a few works attempt to enhance dance expressiveness, which includes genre matching, beat alignment, and dance dynamics, from certain aspects. However, the enhancement is quite limited as they lack comprehensive consideration of the aforementioned three factors. In this paper, we propose Expressi…

    Submitted 9 March, 2024; originally announced March 2024.

  22. arXiv:2402.16908  [pdf]

    cs.ET cond-mat.mtrl-sci cs.LG eess.IV

    Lightweight, error-tolerant edge detection using memristor-enabled stochastic logics

    Authors: Lekai Song, Pengyu Liu, Jingfang Pei, Yang Liu, Songwei Liu, Shengbo Wang, Leonard W. T. Ng, Tawfique Hasan, Kong-Pang Pun, Shuo Gao, Guohua Hu

    Abstract: The demand for efficient edge vision has spurred the interest in developing stochastic computing approaches for performing image processing tasks. Memristors with inherent stochasticity readily introduce probability into the computations and thus enable stochastic image processing computations. Here, we present a stochastic computing approach for edge detection, a fundamental image processing tech…

    Submitted 20 March, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

  23. arXiv:2402.09463  [pdf]

    eess.IV

    Multi-Center Fetal Brain Tissue Annotation (FeTA) Challenge 2022 Results

    Authors: Kelly Payette, Céline Steger, Roxane Licandro, Priscille de Dumast, Hongwei Bran Li, Matthew Barkovich, Liu Li, Maik Dannecker, Chen Chen, Cheng Ouyang, Niccolò McConnell, Alina Miron, Yongmin Li, Alena Uus, Irina Grigorescu, Paula Ramirez Gilliland, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Haoyu Wang, Ziyan Huang, Jin Ye, Mireia Alenyà, Valentin Comte, Oscar Camara, et al. (42 additional authors not shown)

    Abstract: Segmentation is a critical step in analyzing the developing human fetal brain. There have been vast improvements in automatic segmentation methods in the past several years, and the Fetal Brain Tissue Annotation (FeTA) Challenge 2021 helped to establish an excellent standard of fetal brain segmentation. However, FeTA 2021 was a single center study, and the generalizability of algorithms across dif…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Results from FeTA Challenge 2022, held at MICCAI; Manuscript submitted. Supplementary Info (including submission methods descriptions) available here: https://zenodo.org/records/10628648

  24. arXiv:2402.08916  [pdf, other]

    eess.SP cs.IT

    Lightweight Deep Learning Based Channel Estimation for Extremely Large-Scale Massive MIMO Systems

    Authors: Shen Gao, Peihao Dong, Zhiwen Pan, Xiaohu You

    Abstract: Extremely large-scale massive multiple-input multiple-output (XL-MIMO) systems introduce much higher channel dimensionality and incur an additional near-field propagation effect, aggravating the computational load and the difficulty of acquiring prior knowledge for channel estimation. In this article, an XL-MIMO channel network (XLCNet) is developed to estimate the high-dimensional channel, w…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Vehicular Technology

  25. arXiv:2402.01194  [pdf, other]

    eess.SP

    A Robust Super-resolution Gridless Imaging Framework for UAV-borne SAR Tomography

    Authors: Silin Gao, Wenlong Wang, Muhan Wang, Zhe Zhang, Zai Yang, Xiaolan Qiu, Bingchen Zhang, Yirong Wu

    Abstract: Synthetic aperture radar (SAR) tomography (TomoSAR) retrieves three-dimensional (3-D) information from multiple SAR images, effectively addresses the layover problem, and has become pivotal in urban mapping. The unmanned aerial vehicle (UAV) has gained popularity as a TomoSAR platform, offering distinct advantages such as the ability to achieve 3-D imaging in a single flight, cost-effectiveness, rapid…

    Submitted 2 February, 2024; originally announced February 2024.

  26. arXiv:2312.14776  [pdf, other]

    cs.CV eess.IV

    Compressing Image-to-Image Translation GANs Using Local Density Structures on Their Learned Manifold

    Authors: Alireza Ganjdanesh, Shangqian Gao, Hirad Alipanah, Heng Huang

    Abstract: Generative Adversarial Networks (GANs) have shown remarkable success in modeling complex data distributions for image-to-image translation. Still, their high computational demands prohibit their deployment in practical scenarios like edge devices. Existing GAN compression methods mainly rely on knowledge distillation or convolutional classifiers' pruning techniques. Thus, they neglect the critical…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: The 38th Annual AAAI Conference on Artificial Intelligence, AAAI 2024

  27. arXiv:2311.15683  [pdf]

    eess.AS cs.SD eess.SP

    Ultrasensitive Textile Strain Sensors Redefine Wearable Silent Speech Interfaces with High Machine Learning Efficiency

    Authors: Chenyu Tang, Muzi Xu, Wentian Yi, Zibo Zhang, Edoardo Occhipinti, Chaoqun Dong, Dafydd Ravenscroft, Sung-Min Jung, Sanghyo Lee, Shuo Gao, Jong Min Kim, Luigi G. Occhipinti

    Abstract: Our research presents a wearable Silent Speech Interface (SSI) technology that excels in device comfort, time-energy efficiency, and speech decoding accuracy for real-world use. We developed a biocompatible, durable textile choker with an embedded graphene-based strain sensor, capable of accurately detecting subtle throat movements. This sensor, surpassing other strain sensors in sensitivity by 42…

    Submitted 7 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: 5 figures in the article; 11 figures and 4 tables in supplementary information

    Journal ref: npj Flexible Electronics (2024)

  28. arXiv:2310.02561  [pdf, other]

    eess.SP

    Integrated Sensing and Communications Towards Proactive Beamforming in mmWave V2I via Multi-Modal Feature Fusion (MMFF)

    Authors: Haotian Zhang, Shijian Gao, Xiang Cheng, Liuqing Yang

    Abstract: The future of vehicular communication networks relies on mmWave massive multi-input-multi-output antenna arrays for intensive data transfer and massive vehicle access. However, reliable vehicle-to-infrastructure links require exact alignment between the narrow beams, which traditionally involves excessive signaling overhead. To address this issue, we propose a novel proactive beamforming scheme th…

    Submitted 26 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: 14 pages, 12 figures, 5 tables

  29. arXiv:2309.08835  [pdf]

    eess.SP cs.LG cs.NE cs.RO

    Intelligent machines work in unstructured environments by differential neuromorphic computing

    Authors: Shengbo Wang, Shuo Gao, Chenyu Tang, Edoardo Occhipinti, Cong Li, Shurui Wang, Jiaqi Wang, Hubin Zhao, Guohua Hu, Arokia Nathan, Ravinder Dahiya, Luigi Occhipinti

    Abstract: Efficient operation of intelligent machines in the real world requires methods that allow them to understand and predict the uncertainties presented by the unstructured environments with good accuracy, scalability and generalization, similar to humans. Current methods rely on pretrained networks instead of continuously learning from the dynamic signal properties of working environments and suffer…

    Submitted 17 November, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: 16 pages, 5 figures

    Journal ref: Nat Commun, vol. 15, no. 1, p. 4671, May 2024

  30. arXiv:2309.07765  [pdf, other]

    cs.SD cs.CL eess.AS

    Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks

    Authors: Sizhou Chen, Songyang Gao, Sen Fang

    Abstract: The Transformer architecture has proven to be highly effective for Automatic Speech Recognition (ASR) tasks, becoming a foundational component for a plethora of research in the domain. Historically, many approaches have leaned on fixed-length attention windows, which become problematic for speech samples that vary in duration and complexity, leading to data over-smoothing and neglect of essential lo…

    Submitted 7 April, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  31. Human Body Digital Twin: A Master Plan

    Authors: Chenyu Tang, Wentian Yi, Edoardo Occhipinti, Yanning Dai, Shuo Gao, Luigi G. Occhipinti

    Abstract: A human body digital twin (DT) is a virtual representation of an individual's physiological state, created using real-time data from sensors and medical test devices, with the purpose of simulating, predicting, and optimizing health outcomes through advanced analytics and simulations. The human body DT has the potential to revolutionize healthcare and wellness, but its responsible and effective im…

    Submitted 12 September, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: 3 figures, 2 boxes

  32. arXiv:2307.01445  [pdf, ps, other]

    eess.SP

    Distributed fusion filter over lossy wireless sensor networks with the presence of non-Gaussian noise

    Authors: Jiacheng He, Bei Peng, Zhenyu Feng, Xuemei Mao, Song Gao, Gang Wang

    Abstract: The information transmission between nodes in wireless sensor networks (WSNs) often suffers packet loss due to denial-of-service (DoS) attacks, energy limitations, and environmental factors, and the information that is successfully transmitted can also be contaminated by non-Gaussian noise. The presence of these two factors poses a challenge for distributed state estimation (DSE) over WSNs. In thi…

    Submitted 6 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

  33. arXiv:2307.01146  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    AVSegFormer: Audio-Visual Segmentation with Transformer

    Authors: Shengyi Gao, Zhe Chen, Guo Chen, Wenhai Wang, Tong Lu

    Abstract: The combination of audio and vision has long been a topic of interest in the multi-modal community. Recently, a new audio-visual segmentation (AVS) task has been introduced, aiming to locate and segment the sounding objects in a given video. This task demands audio-driven pixel-level scene understanding for the first time, posing significant challenges. In this paper, we propose AVSegFormer, a nov…

    Submitted 18 December, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 7 pages, 6 figures

  34. arXiv:2306.14143  [pdf, other]

    eess.SP

    Intelligent Multi-Modal Sensing-Communication Integration: Synesthesia of Machines

    Authors: Xiang Cheng, Haotian Zhang, Jianan Zhang, Shijian Gao, Sijiang Li, Ziwei Huang, Lu Bai, Zonghui Yang, Xinhu Zheng, Liuqing Yang

    Abstract: In the era of sixth-generation (6G) wireless communications, integrated sensing and communications (ISAC) is recognized as a promising solution to upgrade the physical system by endowing wireless communications with sensing capability. Existing ISAC is mainly oriented to static scenarios with radio-frequency (RF) sensors being the primary participants, thus lacking a comprehensive environment feat…

    Submitted 20 November, 2023; v1 submitted 25 June, 2023; originally announced June 2023.

    Comments: This paper has been accepted by IEEE Communications Surveys & Tutorials

  35. arXiv:2306.12992  [pdf, other]

    cs.CV eess.IV physics.optics

    Minimalist and High-Quality Panoramic Imaging with PSF-aware Transformers

    Authors: Qi Jiang, Shaohua Gao, Yao Gao, Kailun Yang, Zhonghua Yi, Hao Shi, Lei Sun, Kaiwei Wang

    Abstract: High-quality panoramic images with a Field of View (FoV) of 360° are essential for contemporary panoramic computer vision tasks. However, conventional imaging systems come with sophisticated lens designs and heavy optical components. This disqualifies their usage in many mobile and wearable applications where thin and portable, minimalist imaging systems are desired. In this paper, we propose a Pa…

    Submitted 4 July, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

    Comments: Accepted to IEEE Transactions on Image Processing (TIP). The dataset and code will be available at https://github.com/zju-jiangqi/PCIE-PART

  36. arXiv:2306.11476  [pdf, other]

    eess.SP

    A Model Fusion Distributed Kalman Filter For Non-Gaussian Observation Noise

    Authors: Xuemei Mao, Gang Wang, Bei Peng, Jiacheng He, Kun Zhang, Song Gao

    Abstract: The distributed Kalman filter (DKF) has attracted extensive research as an information fusion method for wireless sensor systems (WSNs). However, the DKF in non-Gaussian environments is still a pressing problem. In this paper, we approximate the non-Gaussian noise as a Gaussian mixture model and estimate the parameters through the expectation-maximization algorithm. A DKF, called model fusion DKF (MFDKF…

    Submitted 20 June, 2023; originally announced June 2023.

  37. arXiv:2306.02894  [pdf, ps, other]

    eess.IV

    Recyclable Semi-supervised Method Based on Multi-model Ensemble for Video Scene Parsing

    Authors: Biao Wu, Shaoli Liu, Diankai Zhang, Chengjian Zheng, Si Gao, Xiaofeng Zhang, Ning Wang

    Abstract: Pixel-level Scene Understanding is one of the fundamental problems in computer vision, which aims at recognizing object classes, masks and semantics of each pixel in the given image. Since the real world is actually video-based rather than static, learning to perform video semantic segmentation is more reasonable and practical for realistic applications. In this paper, we adopt Mask2Former…

    Submitted 5 June, 2023; originally announced June 2023.

  38. arXiv:2305.11438  [pdf, other]

    cs.CL eess.AS

    Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring

    Authors: Kaiqi Fu, Shaojun Gao, Shuju Shi, Xiaohai Tian, Wei Li, Zejun Ma

    Abstract: Speech fluency/disfluency can be evaluated by analyzing a range of phonetic and prosodic features. Deep neural networks are commonly trained to map fluency-related features to human scores. However, the effectiveness of deep learning-based models is constrained by the limited amount of labeled training samples. To address this, we introduce a self-supervised learning (SSL) approach that take…

    Submitted 19 May, 2023; originally announced May 2023.

  39. arXiv:2304.10780  [pdf, other]

    cs.CV eess.IV

    Omni-Line-of-Sight Imaging for Holistic Shape Reconstruction

    Authors: Binbin Huang, Xingyue Peng, Siyuan Shen, Suan Xia, Ruiqian Li, Yanhua Yu, Yuehan Wang, Shenghua Gao, Wenzheng Chen, Shiying Li, Jingyi Yu

    Abstract: We introduce Omni-LOS, a neural computational imaging method for conducting holistic shape reconstruction (HSR) of complex objects utilizing a Single-Photon Avalanche Diode (SPAD)-based time-of-flight sensor. As illustrated in Fig. 1, our method enables new capabilities to reconstruct near-360° surrounding geometry of an object from a single scan spot. In such a scenario, traditional line-o…

    Submitted 21 April, 2023; originally announced April 2023.

  40. arXiv:2304.09850  [pdf, other]

    cs.RO eess.SY

    Patching Neural Barrier Functions Using Hamilton-Jacobi Reachability

    Authors: Sander Tonkens, Alex Toofanian, Zhizhen Qin, Sicun Gao, Sylvia Herbert

    Abstract: Learning-based control algorithms have led to major advances in robotics at the cost of decreased safety guarantees. Recently, neural networks have also been used to characterize safety through the use of barrier functions for complex nonlinear systems. Learned barrier functions approximately encode and enforce a desired safety constraint through a value function, but do not provide any formal gua…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: 8 pages, submitted to IEEE Conference on Decision and Control (CDC), 2023

  41. arXiv:2303.04374  [pdf, other]

    eess.SP

    Efficient Gridless DoA Estimation Method of Non-uniform Linear Arrays with Applications in Automotive Radars

    Authors: Silin Gao, Zhe Zhang, Muhan Wang, Yan Zhang, Jie Zhao, Bingchen Zhang, Yue Wang, Yirong Wu

    Abstract: This paper focuses on the gridless direction-of-arrival (DoA) estimation for data acquired by non-uniform linear arrays (NLAs) in automotive applications. Atomic norm minimization (ANM) is a promising gridless sparse recovery algorithm under the Toeplitz model and solved by convex relaxation, thus it is only applicable to uniform linear arrays (ULAs) with array manifolds having a Vandermonde struc…

    Submitted 7 March, 2023; originally announced March 2023.

  42. arXiv:2301.05867  [pdf, other]

    eess.SP

    State Estimation of Wireless Sensor Networks in the Presence of Data Packet Drops and Non-Gaussian Noise

    Authors: Jiacheng He, Gang Wang, Xuemei Mao, Song Gao, Bei Peng

    Abstract: Distributed Kalman filter approaches based on the maximum correntropy criterion have recently demonstrated superior state estimation performance to that of conventional distributed Kalman filters for wireless sensor networks in the presence of non-Gaussian impulsive noise. However, these algorithms currently fail to take account of data packet drops. The present work addresses this issue by propos…

    Submitted 3 September, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

  43. arXiv:2212.14747  [pdf, other]

    eess.IV cs.CV

    VertMatch: A Semi-supervised Framework for Vertebral Structure Detection in 3D Ultrasound Volume

    Authors: Hongye Zeng, Kang Zhou, Songhan Ge, Yuchong Gao, Jianhao Zhao, Shenghua Gao, Rui Zheng

    Abstract: Three-dimensional (3D) ultrasound imaging technique has been applied for scoliosis assessment, but current assessment method only uses coronal projection image and cannot illustrate the 3D deformity and vertebra rotation. The vertebra detection is essential to reveal 3D spine information, but the detection task is challenging due to complex data and limited annotations. We propose VertMatch, a two…

    Submitted 28 December, 2022; originally announced December 2022.

    Comments: 15 pages, 8 figures

  44. ATASI-Net: An Efficient Sparse Reconstruction Network for Tomographic SAR Imaging with Adaptive Threshold

    Authors: Muhan Wang, Zhe Zhang, Xiaolan Qiu, Silin Gao, Yue Wang

    Abstract: Tomographic SAR technique has attracted remarkable interest for its ability of three-dimensional resolving along the elevation direction via a stack of SAR images collected from different cross-track angles. The emerged compressed sensing (CS)-based algorithms have been introduced into TomoSAR considering its super-resolution ability with limited samples. However, the conventional CS-based methods…

    Submitted 30 November, 2022; originally announced November 2022.

  45. arXiv:2211.11257  [pdf, other]

    cs.CV eess.IV physics.optics

    Computational Imaging for Machine Perception: Transferring Semantic Segmentation beyond Aberrations

    Authors: Qi Jiang, Hao Shi, Shaohua Gao, Jiaming Zhang, Kailun Yang, Lei Sun, Huajian Ni, Kaiwei Wang

    Abstract: Semantic scene understanding with Minimalist Optical Systems (MOS) in mobile and wearable applications remains a challenge due to the corrupted imaging quality induced by optical aberrations. However, previous works only focus on improving the subjective imaging quality through the Computational Imaging (CI) technique, ignoring the feasibility of advancing semantic segmentation. In this paper, we…

    Submitted 14 March, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted to IEEE Transactions on Computational Imaging (TCI). The project page is at https://github.com/zju-jiangqi/CIADA

  46. arXiv:2211.10026  [pdf, other]

    eess.IV cs.CV

    DGD-cGAN: A Dual Generator for Image Dewatering and Restoration

    Authors: Salma Gonzalez-Sabbagh, Antonio Robles-Kelly, Shang Gao

    Abstract: Underwater images are usually covered with a blue-greenish colour cast, making them distorted, blurry or low in contrast. This phenomenon occurs due to the light attenuation given by the scattering and absorption in the water column. In this paper, we present an image enhancement approach for dewatering which employs a conditional generative adversarial network (cGAN) with two generators. Our Dual…

    Submitted 17 November, 2022; originally announced November 2022.

    Comments: 12 pages and 61 images

  47. arXiv:2211.05256  [pdf, other]

    eess.IV cs.CV

    Power Efficient Video Super-Resolution on Mobile NPUs with Deep Learning, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Cheng-Ming Chiang, Hsien-Kai Kuo, Yu-Syuan Xu, Man-Yu Lee, Allen Lu, Chia-Ming Cheng, Chih-Cheng Chen, Jia-Ying Yong, Hong-Han Shuai, Wen-Huang Cheng, Zhuang Jia, Tianyu Xu, Yijian Zhang, Long Bao, Heng Sun, Diankai Zhang, Si Gao, Shaoli Liu, Biao Wu, Xiaofeng Zhang, Chengjian Zheng, Kaidi Lu, Ning Wang, et al. (29 additional authors not shown)

    Abstract: Video super-resolution is one of the most popular tasks on mobile devices, being widely used for an automatic improvement of low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices. In this Mobile AI challenge, we address this prob…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.08826, arXiv:2105.07809, arXiv:2211.04470, arXiv:2211.03885

  48. arXiv:2210.09378  [pdf, other]

    cs.RO cs.AI cs.MA eess.SY

    Learning Control Admissibility Models with Graph Neural Networks for Multi-Agent Navigation

    Authors: Chenning Yu, Hongzhan Yu, Sicun Gao

    Abstract: Deep reinforcement learning in continuous domains focuses on learning control policies that map states to distributions over actions that ideally concentrate on the optimal choices in each step. In multi-agent navigation problems, the optimal actions depend heavily on the agents' density. Their interaction patterns grow exponentially with respect to such density, making it hard for learning-based…

    Submitted 17 October, 2022; originally announced October 2022.

  49. arXiv:2209.07070  [pdf, ps, other]

    eess.SY cs.SI stat.ML

    Fixed-Point Centrality for Networks

    Authors: Shuang Gao

    Abstract: This paper proposes a family of network centralities called fixed-point centralities. This centrality family is defined via the fixed point of permutation equivariant mappings related to the underlying network. Such a centrality notion is immediately extended to define fixed-point centralities for infinite graphs characterized by graphons. Variation bounds of such centralities with respect to the…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: 8 pages, Accepted for presentation at IEEE Conference on Decision and Control

  50. arXiv:2208.03616  [pdf, other]

    cs.LG cs.SI eess.SY math.DS

    Transmission Neural Networks: From Virus Spread Models to Neural Networks

    Authors: Shuang Gao, Peter E. Caines

    Abstract: This work connects models for virus spread on networks with their equivalent neural network representations. Based on this connection, we propose a new neural network architecture, called Transmission Neural Networks (TransNNs) where activation functions are primarily associated with links and are allowed to have different activation levels. Furthermore, this connection leads to the discovery and…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: 15 pages