
Showing 1–50 of 205 results for author: Ma, J

  1. arXiv:2407.04719  [pdf]

    eess.SP

    UAV-Assisted Weather Radar Calibration: A Theoretical Model for Wind Influence on Metal Sphere Reflectivity

    Authors: Jiabiao Zhao, Da Li, Jiayuan Cui, Houjun Sun, Jianjun Ma

    Abstract: The calibration of weather radar for detecting meteorological phenomena has advanced rapidly, aiming to enhance accuracy. Utilizing an unmanned aerial vehicle (UAV) equipped with a suspended metal sphere introduces an efficient calibration method by allowing dynamic adjustment of the UAV's position, effectively acting as a mobile calibration platform. However, external factors such as wind can int…

    Submitted 20 June, 2024; originally announced July 2024.

    Comments: to be published in the 2024 International Conference on Microwave and Millimeter Wave Technology

  2. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  3. arXiv:2407.03374  [pdf]

    cs.AI cs.SE eess.SP eess.SY

    An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges

    Authors: Laifa Tao, Shangyu Li, Haifei Liu, Qixuan Huang, Liang Ma, Guoao Ning, Yiling Chen, Yunlong Wu, Bin Li, Weiwei Zhang, Zhengduo Zhao, Wenchao Zhan, Wenyan Cao, Chao Wang, Hongmei Liu, Jian Ma, Mingliang Suo, Yujie Cheng, Yu Ding, Dengwei Song, Chen Lu

    Abstract: Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Larg…

    Submitted 1 July, 2024; originally announced July 2024.

  4. arXiv:2407.02264  [pdf, other]

    cs.CV cs.SD eess.AS

    SOAF: Scene Occlusion-aware Neural Acoustic Field

    Authors: Huiyu Gao, Jiahao Ma, David Ahmedt-Aristizabal, Chuong Nguyen, Miaomiao Liu

    Abstract: This paper tackles the problem of novel view audio-visual synthesis along an arbitrary trajectory in an indoor scene, given the audio-video recordings from other known trajectories of the scene. Existing methods often overlook the effect of room geometry, particularly wall occlusion to sound propagation, making them less accurate in multi-room environments. In this work, we propose a new approach…

    Submitted 2 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  5. arXiv:2406.11519  [pdf, other]

    cs.CV eess.IV

    HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

    Authors: Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

    Abstract: Foundation models (FMs) are revolutionizing the analysis and understanding of remote sensing (RS) scenes, including aerial RGB, multispectral, and SAR images. However, hyperspectral images (HSIs), which are rich in spectral information, have not seen much application of FMs, with existing methods often restricted to specific tasks and lacking generality. To fill this gap, we introduce HyperSIGMA,…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA

  6. arXiv:2406.10724  [pdf, other]

    eess.IV cs.CV cs.LG

    Beyond the Visible: Jointly Attending to Spectral and Spatial Dimensions with HSI-Diffusion for the FINCH Spacecraft

    Authors: Ian Vyse, Rishit Dagli, Dav Vrat Chadha, John P. Ma, Hector Chen, Isha Ruparelia, Prithvi Seran, Matthew Xie, Eesa Aamer, Aidan Armstrong, Naveen Black, Ben Borstein, Kevin Caldwell, Orrin Dahanaggamaarachchi, Joe Dai, Abeer Fatima, Stephanie Lu, Maxime Michet, Anoushka Paul, Carrie Ann Po, Shivesh Prakash, Noa Prosser, Riddhiman Roy, Mirai Shinjo, Iliya Shofman, et al. (4 additional authors not shown)

    Abstract: Satellite remote sensing missions have gained popularity over the past fifteen years due to their ability to cover large swaths of land at regular intervals, making them ideal for monitoring environmental trends. The FINCH mission, a 3U+ CubeSat equipped with a hyperspectral camera, aims to monitor crop residue cover in agricultural fields. Although hyperspectral imaging captures both spectral and…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: To appear in 38th Annual Small Satellite Conference

  7. arXiv:2406.01138  [pdf, ps, other]

    eess.SP cs.IT

    Precise Analysis of Covariance Identifiability for Activity Detection in Grant-Free Random Access

    Authors: Shengsong Luo, Junjie Ma, Chongbin Xu, Xin Wang

    Abstract: We consider the identifiability issue of maximum likelihood based activity detection in massive MIMO based grant-free random access. A prior work by Chen et al. indicates that the identifiability undergoes a phase transition for commonly-used random signatures. In this paper, we provide an analytical characterization of the boundary of the phase transition curve. Our theoretical results agree well…

    Submitted 3 June, 2024; originally announced June 2024.

  8. arXiv:2405.19338  [pdf, other]

    eess.SP cs.AI cs.CV

    Accurate Patient Alignment without Unnecessary Imaging Dose via Synthesizing Patient-specific 3D CT Images from 2D kV Images

    Authors: Yuzhen Ding, Jason M. Holmes, Hongying Feng, Baoxin Li, Lisa A. McGee, Jean-Claude M. Rwigema, Sujay A. Vora, Daniel J. Ma, Robert L. Foote, Samir H. Patel, Wei Liu

    Abstract: In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D-on-board imaging(OBI) unavailable. But tumor visibility is constrained due to the projection of patient's anatomy onto a 2D plane, potentially leading to substantial setup errors. In treatment room with 3D-OBI such as cone beam CT(CBCT), the field of view(FOV) of CBCT is limited with unnecessarily high imag…

    Submitted 1 April, 2024; originally announced May 2024.

    Comments: 17 pages, 8 figures and tables

  9. arXiv:2405.18435  [pdf, other]

    eess.IV cs.CV

    QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

    Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag, et al. (55 additional authors not shown)

    Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de…

    Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

    Comments: initial technical report

  10. arXiv:2405.18255  [pdf, other]

    cs.CR cs.SI eess.SP

    Channel Reciprocity Based Attack Detection for Securing UWB Ranging by Autoencoder

    Authors: Wenlong Gou, Chuanhang Yu, Juntao Ma, Gang Wu, Vladimir Mordachev

    Abstract: A variety of ranging threats represented by Ghost Peak attack have raised concerns regarding the security performance of Ultra-Wide Band (UWB) systems with the finalization of the IEEE 802.15.4z standard. Based on channel reciprocity, this paper proposes a low complexity attack detection scheme that compares Channel Impulse Response (CIR) features of both ranging sides utilizing an autoencoder wit…

    Submitted 10 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    ACM Class: H.1.1

  11. arXiv:2405.17702  [pdf]

    eess.SY

    A Two-sided Model for EV Market Dynamics and Policy Implications

    Authors: Haoxuan Ma, Brian Yueshuai He, Tomas Kaljevic, Jiaqi Ma

    Abstract: The diffusion of Electric Vehicles (EVs) plays a pivotal role in mitigating greenhouse gas emissions, particularly in the U.S., where ambitious zero-emission and carbon neutrality objectives have been set. In pursuit of these goals, many states have implemented a range of incentive policies aimed at stimulating EV adoption and charging infrastructure development, especially public EV charging stat…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Conference preprint, 8 pages, 3 figures

  12. arXiv:2405.12408  [pdf, other]

    cs.RO eess.SY

    Flexible Active Safety Motion Control for Robotic Obstacle Avoidance: A CBF-Guided MPC Approach

    Authors: Jinhao Liu, Jun Yang, Jianliang Mao, Tianqi Zhu, Qihang Xie, Yimeng Li, Xiangyu Wang, Shihua Li

    Abstract: A flexible active safety motion (FASM) control approach is proposed for the avoidance of dynamic obstacles and the reference tracking in robot manipulators. The distinctive feature of the proposed method lies in its utilization of control barrier functions (CBF) to design flexible CBF-guided safety criteria (CBFSC) with dynamically optimized decay rates, thereby offering flexibility and active saf…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 11 pages, 11 figures

  13. arXiv:2405.00316  [pdf, other]

    cs.RO eess.SY

    Enhance Planning with Physics-informed Safety Controller for End-to-end Autonomous Driving

    Authors: Hang Zhou, Haichao Liu, Hongliang Lu, Dan Xu, Jun Ma, Yiding Ji

    Abstract: Recent years have seen a growing research interest in applications of Deep Neural Networks (DNN) on autonomous vehicle technology. The trend started with perception and prediction a few years ago and it is gradually being applied to motion planning tasks. Despite the performance of networks improve over time, DNN planners inherit the natural drawbacks of Deep Learning. Learning-based planners have…

    Submitted 5 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  14. arXiv:2404.04879  [pdf, other]

    cs.RO eess.SY

    Multi-Type Map Construction via Semantics-Aware Autonomous Exploration in Unknown Indoor Environments

    Authors: Jianfang Mao, Yuheng Xie, Si Chen, Zhixiong Nan, Xiao Wang

    Abstract: This paper proposes a novel semantics-aware autonomous exploration model to handle the long-standing issue: the mainstream RRT (Rapid-exploration Random Tree) based exploration models usually make the mobile robot switch frequently between different regions, leading to the excessively-repeated explorations for the same region. Our proposed semantics-aware model encourages a mobile robot to fully e…

    Submitted 7 April, 2024; originally announced April 2024.

  15. arXiv:2404.02663  [pdf]

    eess.SP cs.IT

    Ground-to-UAV sub-Terahertz channel measurement and modeling

    Authors: Da Li, Peian Li, Jiabiao Zhao, Jianjian Liang, Jiacheng Liu, Guohao Liu, Yuanshuai Lei, Wenbo Liu, Jianqin Deng, Fuyong Liu, Jianjun Ma

    Abstract: Unmanned Aerial Vehicle (UAV) assisted terahertz (THz) wireless communications have been expected to play a vital role in the next generation of wireless networks. UAVs can serve as either repeaters or data collectors within the communication link, thereby potentially augmenting the efficacy of communication systems. Despite their promise, the channel analysis and modeling specific to THz wireless…

    Submitted 28 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Submitted to Optics Express

  16. arXiv:2404.02661  [pdf]

    physics.app-ph eess.SP

    Terahertz channel modeling based on surface sensing characteristics

    Authors: Jiayuan Cui, Da Li, Jiabiao Zhao, Jiacheng Liu, Guohao Liu, Xiangkun He, Yue Su, Fei Song, Peian Li, Jianjun Ma

    Abstract: The dielectric properties of environmental surfaces, including walls, floors and the ground, etc., play a crucial role in shaping the accuracy of terahertz (THz) channel modeling, thereby directly impacting the effectiveness of communication systems. Traditionally, acquiring these properties has relied on methods such as terahertz time-domain spectroscopy (THz-TDS) or vector network analyzers (VNA…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Submitted to Nano Communication Networks

  17. arXiv:2404.01654  [pdf, other]

    cs.CV cs.AI eess.IV eess.SP

    AI WALKUP: A Computer-Vision Approach to Quantifying MDS-UPDRS in Parkinson's Disease

    Authors: Xiang Xiang, Zihan Zhang, Jing Ma, Yao Deng

    Abstract: Parkinson's Disease (PD) is the second most common neurodegenerative disorder. The existing assessment method for PD is usually the Movement Disorder Society - Unified Parkinson's Disease Rating Scale (MDS-UPDRS) to assess the severity of various types of motor symptoms and disease progression. However, manual assessment suffers from high subjectivity, lack of consistency, and high cost and low ef…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Technical report for AI WALKUP, an APP winning 3rd Prize of 2022 HUST GS AI Innovation and Design Competition

  18. arXiv:2403.19943  [pdf, other]

    cs.LG cs.AI eess.SP

    TDANet: A Novel Temporal Denoise Convolutional Neural Network With Attention for Fault Diagnosis

    Authors: Zhongzhi Li, Rong Fan, Jingqi Tu, Jinyi Ma, Jianliang Ai, Yiqun Dong

    Abstract: Fault diagnosis plays a crucial role in maintaining the operational integrity of mechanical systems, preventing significant losses due to unexpected failures. As intelligent manufacturing and data-driven approaches evolve, Deep Learning (DL) has emerged as a pivotal technique in fault diagnosis research, recognized for its ability to autonomously extract complex features. However, the practical ap…

    Submitted 28 March, 2024; originally announced March 2024.

  19. arXiv:2403.17615  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Grad-CAMO: Learning Interpretable Single-Cell Morphological Profiles from 3D Cell Painting Images

    Authors: Vivek Gopalakrishnan, Jingzhe Ma, Zhiyong Xie

    Abstract: Despite their black-box nature, deep learning models are extensively used in image-based drug discovery to extract feature vectors from single cells in microscopy images. To better understand how these networks perform representation learning, we employ visual explainability techniques (e.g., Grad-CAM). Our analyses reveal several mechanisms by which supervised models cheat, exploiting biologicall…

    Submitted 26 March, 2024; originally announced March 2024.

  20. arXiv:2403.13225  [pdf, other]

    eess.IV

    Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation

    Authors: Linshan Wu, Zhun Zhong, Jiayi Ma, Yunchao Wei, Hao Chen, Leyuan Fang, Shutao Li

    Abstract: Weakly-Supervised Semantic Segmentation (WSSS) aims to train segmentation models by weak labels, which is receiving significant attention due to its low annotation cost. Existing approaches focus on generating pseudo labels for supervision while largely ignoring to leverage the inherent semantic correlation among different pseudo labels. We observe that pseudo-labeled pixels that are close to each…

    Submitted 19 March, 2024; originally announced March 2024.

  21. arXiv:2403.06474  [pdf, other]

    eess.SP

    Non-Intrusive Load Monitoring in Smart Grids: A Comprehensive Review

    Authors: Yinyan Liu, Yi Wang, Jin Ma

    Abstract: Non-Intrusive Load Monitoring (NILM) is pivotal in today's energy landscape, offering vital solutions for energy conservation and efficient management. Its growing importance in enhancing energy savings and understanding consumer behavior makes it a pivotal technology for addressing global energy challenges. This paper delivers an in-depth review of NILM, highlighting its critical role in smart ho…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: a comprehensive summary with a dataset list

  22. arXiv:2403.04245  [pdf, other]

    cs.SD cs.CV cs.LG cs.MM eess.AS

    A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

    Authors: Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee

    Abstract: Advanced Audio-Visual Speech Recognition (AVSR) systems have been observed to be sensitive to missing video frames, performing even worse than single-modality models. While applying the dropout technique to the video modality enhances robustness to missing frames, it simultaneously results in a performance loss when dealing with complete data input. In this paper, we investigate this contrasting p…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: the paper is accepted by CVPR2024

  23. arXiv:2402.17487  [pdf, other]

    cs.CV cs.LG eess.IV

    Bit Rate Matching Algorithm Optimization in JPEG-AI Verification Model

    Authors: Panqi Jia, A. Burakhan Koyuncu, Jue Mao, Ze Cui, Yi Ma, Tiansheng Guo, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Jing Wang, Elena Alshina, Andre Kaup

    Abstract: The research on neural network (NN) based image compression has shown superior performance compared to classical compression frameworks. Unlike the hand-engineered transforms in the classical frameworks, NN-based models learn the non-linear transforms providing more compact bit representations, and achieve faster coding speed on parallel devices over their classical counterparts. Those properties…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted at (IEEE) PCS 2024; 6 pages

  24. arXiv:2402.17470  [pdf, other]

    cs.CV cs.LG eess.IV

    Bit Distribution Study and Implementation of Spatial Quality Map in the JPEG-AI Standardization

    Authors: Panqi Jia, Jue Mao, Esin Koyuncu, A. Burakhan Koyuncu, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Elena Alshina, Andre Kaup

    Abstract: Currently, there is a high demand for neural network-based image compression codecs. These codecs employ non-linear transforms to create compact bit representations and facilitate faster coding speeds on devices compared to the hand-crafted transforms used in classical frameworks. The scientific and industrial communities are highly interested in these properties, leading to the standardization ef…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 5 pages, 3 figures, 4 tables

  25. arXiv:2402.09451  [pdf, other]

    eess.SP physics.optics

    Effects of Transceiver Jitter on the Performance of Optical Scattering Communication Systems

    Authors: Zanqiu Shen, Jianshe Ma, Serge B. Provost, Ping Su

    Abstract: In ultraviolet communications, the transceiver jitter effects have been ignored in previous studies, which can result in non-negligible performance degradation especially in vibration states or in mobile scenes. To address this issue, we model the relationship between the received power and transceiver jitter by making use of a moment-based density function approximation method. Based on this rela…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 5 pages, 2 figures, comments are welcome!

    Journal ref: Optics Letters, 45(20), 5680-5683 (2020)

  26. arXiv:2402.05373  [pdf, other]

    eess.IV cs.CV

    Unleashing the Infinity Power of Geometry: A Novel Geometry-Aware Transformer (GOAT) for Whole Slide Histopathology Image Analysis

    Authors: Mingxin Liu, Yunzan Liu, Pengbo Xu, Jiquan Ma

    Abstract: The histopathology analysis is of great significance for the diagnosis and prognosis of cancers, however, it has great challenges due to the enormous heterogeneity of gigapixel whole slide images (WSIs) and the intricate representation of pathological features. However, recent methods have not adequately exploited geometrical representation in WSIs which is significant in disease diagnosis. Theref…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 5 pages, 3 figures. Accepted by 21st IEEE International Symposium on Biomedical Imaging (ISBI 2024)

  27. arXiv:2401.16714  [pdf]

    eess.IV

    A Point Cloud Enhancement Method for 4D mmWave Radar Imagery

    Authors: Qingmian Wan, Hongli Peng, Xing Liao, Kuayue Liu, Junfa Mao

    Abstract: A point cloud enhancement method for 4D mmWave radar imagery is proposed in this paper. Based on the patch antenna and MIMO array theories, the MIMO array with small redundancy and high SNR is designed to provide the probability of high angular resolution and detection rate. The antenna array is deployed using a ladder shape in vertical direction to decrease the redundancy and improve the resoluti…

    Submitted 29 January, 2024; originally announced January 2024.

  28. arXiv:2401.09032  [pdf, other]

    cs.RO cs.MA eess.SY

    Improved Consensus ADMM for Cooperative Motion Planning of Large-Scale Connected Autonomous Vehicles with Limited Communication

    Authors: Haichao Liu, Zhenmin Huang, Zicheng Zhu, Yulin Li, Shaojie Shen, Jun Ma

    Abstract: This paper investigates a cooperative motion planning problem for large-scale connected autonomous vehicles (CAVs) under limited communications, which addresses the challenges of high communication and computing resource requirements. Our proposed methodology incorporates a parallel optimization algorithm with improved consensus ADMM considering a more realistic locally connected topology network,…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 15 pages, 10 figures

  29. arXiv:2401.08851  [pdf]

    cs.LG cs.CL cs.SD eess.AS q-bio.NC

    Using i-vectors for subject-independent cross-session EEG transfer learning

    Authors: Jonathan Lasko, Jeff Ma, Mike Nicoletti, Jonathan Sussman-Fort, Sooyoung Jeong, William Hartmann

    Abstract: Cognitive load classification is the task of automatically determining an individual's utilization of working memory resources during performance of a task based on physiologic measures such as electroencephalography (EEG). In this paper, we follow a cross-disciplinary approach, where tools and methodologies from speech processing are used to tackle this problem. The corpus we use was released pub…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 11 pages

  30. arXiv:2401.04968  [pdf, other]

    cs.RO eess.SY

    A Universal Cooperative Decision-Making Framework for Connected Autonomous Vehicles with Generic Road Topologies

    Authors: Zhenmin Huang, Shaojie Shen, Jun Ma

    Abstract: Cooperative decision-making of Connected Autonomous Vehicles (CAVs) presents a longstanding challenge due to its inherent nonlinearity, non-convexity, and discrete characteristics, compounded by the diverse road topologies encountered in real-world traffic scenarios. The majority of current methodologies are only applicable to a single and specific scenario, predicated on scenario-specific assumpt…

    Submitted 10 January, 2024; originally announced January 2024.

  31. arXiv:2401.04722  [pdf, other]

    eess.IV cs.CV cs.LG

    U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation

    Authors: Jun Ma, Feifei Li, Bo Wang

    Abstract: Convolutional Neural Networks (CNNs) and Transformers have been the most popular architectures for biomedical image segmentation, but both of them have limited ability to handle long-range dependencies because of inherent locality or computational complexity. To address this challenge, we introduce U-Mamba, a general-purpose network for biomedical image segmentation. Inspired by the State Space Se…

    Submitted 9 January, 2024; originally announced January 2024.

  32. arXiv:2401.02673  [pdf, other]

    eess.AS cs.AI cs.SD

    A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

    Authors: Dongdi Zhao, Jianbo Ma, Lu Lu, Jinke Li, Xuan Ji, Lei Zhu, Fuming Fang, Ming Liu, Feijun Jiang

    Abstract: Far-field speech recognition is a challenging task that conventionally uses signal processing beamforming to attack noise and interference problem. But the performance has been found usually limited due to heavy reliance on environmental assumption. In this paper, we propose a unified multichannel far-field speech recognition system that combines the neural beamforming and transformer-based Listen…

    Submitted 5 January, 2024; originally announced January 2024.

  33. arXiv:2312.16247  [pdf, other]

    cs.CV eess.IV

    Toward Accurate and Temporally Consistent Video Restoration from Raw Data

    Authors: Shi Guo, Jianqi Ma, Xi Yang, Zhengqiang Zhang, Lei Zhang

    Abstract: Denoising and demosaicking are two fundamental steps in reconstructing a clean full-color video from raw data, while performing video denoising and demosaicking jointly, namely VJDD, could lead to better video restoration performance than performing them separately. In addition to restoration accuracy, another key challenge to VJDD lies in the temporal consistency of consecutive frames. This issue…

    Submitted 25 December, 2023; originally announced December 2023.

  34. arXiv:2312.15389  [pdf, other]

    eess.IV cs.CV

    TJDR: A High-Quality Diabetic Retinopathy Pixel-Level Annotation Dataset

    Authors: Jingxin Mao, Xiaoyu Ma, Yanlong Bi, Rongqing Zhang

    Abstract: Diabetic retinopathy (DR), as a debilitating ocular complication, necessitates prompt intervention and treatment. Despite the effectiveness of artificial intelligence in aiding DR grading, the progression of research toward enhancing the interpretability of DR grading through precise lesion segmentation faces a severe hindrance due to the scarcity of pixel-level annotated DR datasets. To mitigate…

    Submitted 23 December, 2023; originally announced December 2023.

  35. LMMSE-based SIMO Receiver for Ultraviolet Scattering Communication with Nonlinear Conversion

    Authors: Zanqiu Shen, Jianshe Ma, Ping Su

    Abstract: Linear minimum mean square error (LMMSE) receivers are often applied in practical communication scenarios for single-input-multiple-output (SIMO) systems owing to their low computational complexity and competitive performance. However, their performance is only the best among all the linear receivers, as they minimize the bit mean square error (MSE) alone in linear space. To overcome this limitati…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: 5 pages, 3 figures, comments are welcome!

    Journal ref: IEEE Wireless Communications Letters, 10(10), 2140-2144 (2021)

  36. arXiv:2312.11868  [pdf, other]

    cs.RO eess.SY

    Dynamic Loco-manipulation on HECTOR: Humanoid for Enhanced ConTrol and Open-source Research

    Authors: Junheng Li, Junchao Ma, Omar Kolt, Manas Shah, Quan Nguyen

    Abstract: Despite their remarkable advancement in locomotion and manipulation, humanoid robots remain challenged by a lack of synchronized loco-manipulation control, hindering their full dynamic potential. In this work, we introduce a versatile and effective approach to controlling and generalizing dynamic locomotion and loco-manipulation on humanoid robots via a Force-and-moment-based Model Predictive Cont…

    Submitted 21 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: 14 pages, 13 figures

  37. arXiv:2311.16531  [pdf]

    physics.app-ph eess.SY physics.ao-ph

    Channel Modeling for Terahertz Communications in Rain

    Authors: Peian Li, Wenbo Liu, Jiacheng Liu, Da Li, Guohao Liu, Yuanshuai Lei, Jiabiao Zhao, Xiaopeng Wang, Houjun Sun, Jianjun Ma, John F. Federici

    Abstract: Terahertz (THz) communication channels, integral to outdoor applications, are critically influenced by natural factors like rainfall. Our research focused on the nuanced effects of rain on these channels, employing an advanced rainfall emulation system. By analyzing key parameters such as rain rate, altitude based variations in rainfall, and diverse raindrop sizes, we identified the paramount sign…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: submitted to IEEE Transactions on Antennas and Propagation

  38. arXiv:2311.04769  [pdf]

    eess.IV cs.CV

    An attention-based deep learning network for predicting Platinum resistance in ovarian cancer

    Authors: Haoming Zhuang, Beibei Li, Jingtong Ma, Patrice Monkam, Shouliang Qi, Wei Qian, Dianning He

    Abstract: Background: Ovarian cancer is among the three most frequent gynecologic cancers globally. High-grade serous ovarian cancer (HGSOC) is the most common and aggressive histological type. Guided treatment for HGSOC typically involves platinum-based combination chemotherapy, necessitating an assessment of whether the patient is platinum-resistant. The purpose of this study is to propose a deep learning…

    Submitted 8 November, 2023; originally announced November 2023.

  39. arXiv:2310.17974   

    cs.CV eess.IV

    FaultSeg Swin-UNETR: Transformer-Based Self-Supervised Pretraining Model for Fault Recognition

    Authors: Zeren Zhang, Ran Chen, Jinwen Ma

    Abstract: This paper introduces an approach to enhance seismic fault recognition through self-supervised pretraining. Seismic fault interpretation holds great significance in the fields of geophysics and geology. However, conventional methods for seismic fault recognition encounter various issues, including dependence on data quality and quantity, as well as susceptibility to interpreter subjectivity. Curre…

    Submitted 8 January, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: The logical flow and background of the article need significant revisions

  40. arXiv:2310.12570  [pdf, other]

    eess.IV cs.CV cs.GR cs.LG

    DA-TransUNet: Integrating Spatial and Channel Dual Attention with Transformer U-Net for Medical Image Segmentation

    Authors: Guanqun Sun, Yizhi Pan, Weikun Kong, Zichang Xu, Jianhua Ma, Teeradaj Racharak, Le-Minh Nguyen, Junyi Xin

    Abstract: Accurate medical image segmentation is critical for disease quantification and treatment evaluation. While traditional Unet architectures and their transformer-integrated variants excel in automated segmentation tasks. However, they lack the ability to harness the intrinsic position and channel features of image. Existing models also struggle with parameter efficiency and computational complexity,…

    Submitted 14 November, 2023; v1 submitted 19 October, 2023; originally announced October 2023.

  41. arXiv:2310.05547  [pdf, other]

    cs.RO eess.SY

    Geometry-Aware Safety-Critical Local Reactive Controller for Robot Navigation in Unknown and Cluttered Environments

    Authors: Yulin Li, Xindong Tang, Kai Chen, Chunxin Zheng, Haichao Liu, Jun Ma

    Abstract: This work proposes a safety-critical local reactive controller that enables the robot to navigate in unknown and cluttered environments. In particular, the trajectory tracking task is formulated as a constrained polynomial optimization problem. Then, safety constraints are imposed on the control variables invoking the notion of polynomial positivity certificates in conjunction with their Sum-of-Sq…

    Submitted 9 October, 2023; originally announced October 2023.

  42. arXiv:2309.10758  [pdf, other]

    cs.IT eess.SP

    RIS-Assisted Over-the-Air Adaptive Federated Learning with Noisy Downlink

    Authors: Jiayu Mao, Aylin Yener

    Abstract: Over-the-air federated learning (OTA-FL) exploits the inherent superposition property of wireless channels to integrate the communication and model aggregation. Though a naturally promising framework for wireless federated learning, it requires care to mitigate physical layer impairments. In this work, we consider a heterogeneous edge-intelligent network with different edge device resources and no…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Appeared in 2023 IEEE ICC Workshop on Edge Learning over 5G Mobile Networks and Beyond

  43. arXiv:2309.09883  [pdf, other]

    cs.IT eess.SP

    ROAR-Fed: RIS-Assisted Over-the-Air Adaptive Resource Allocation for Federated Learning

    Authors: Jiayu Mao, Aylin Yener

    Abstract: Over-the-air federated learning (OTA-FL) integrates communication and model aggregation by exploiting the innate superposition property of wireless channels. The approach renders bandwidth efficient learning, but requires care in handling the wireless physical layer impairments. In this paper, federated edge learning is considered for a network that is heterogeneous with respect to client (edge no…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Appeared in 2023 IEEE International Conference on Communications (ICC): Wireless Communications Symposium

  44. arXiv:2309.07925  [pdf, other]

    eess.AS cs.AI cs.MM cs.SD

    Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023

    Authors: Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng

    Abstract: In this paper, we propose a novel framework for recognizing both discrete and dimensional emotions. In our framework, deep features extracted from foundation models are used as robust acoustic and visual representations of raw video. Three different structures based on attention-guided feature gathering (AFG) are designed for deep feature fusion. Then, we introduce a joint decoding structure for e…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: 5 pages, 4 figures

    Journal ref: The 31st ACM International Conference on Multimedia (MM'23), 2023

  45. arXiv:2309.07185  [pdf]

    eess.SP cs.AI cs.HC

    A Health Monitoring System Based on Flexible Triboelectric Sensors for Intelligence Medical Internet of Things and its Applications in Virtual Reality

    Authors: Junqi Mao, Puen Zhou, Xiaoyao Wang, Hongbo Yao, Liuyang Liang, Yiqiao Zhao, Jiawei Zhang, Dayan Ban, Haiwu Zheng

    Abstract: The Internet of Medical Things (IoMT) is a platform that combines Internet of Things (IoT) technology with medical applications, enabling the realization of precision medicine, intelligent healthcare, and telemedicine in the era of digitalization and intelligence. However, the IoMT faces various challenges, including sustainable power supply, human adaptability of sensors and the intelligence of s…

    Submitted 12 September, 2023; originally announced September 2023.

  46. arXiv:2309.05298  [pdf, other]

    cs.RO eess.SY

    Real-Time Parallel Trajectory Optimization with Spatiotemporal Safety Constraints for Autonomous Driving in Congested Traffic

    Authors: Lei Zheng, Rui Yang, Zengqi Peng, Haichao Liu, Michael Yu Wang, Jun Ma

    Abstract: Multi-modal behaviors exhibited by surrounding vehicles (SVs) can typically lead to traffic congestion and reduce the travel efficiency of autonomous vehicles (AVs) in dense traffic. This paper proposes a real-time parallel trajectory optimization method for the AV to achieve high travel efficiency in dynamic and congested environments. A spatiotemporal safety module is developed to facilitate the…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 8 pages, 7 figures, accepted for publication in the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC 2023)

  47. arXiv:2309.02574  [pdf, other]

    eess.IV

    An Improved Upper Bound on the Rate-Distortion Function of Images

    Authors: Zhihao Duan, Jack Ma, Jiangpeng He, Fengqing Zhu

    Abstract: Recent work has shown that Variational Autoencoders (VAEs) can be used to upper-bound the information rate-distortion (R-D) function of images, i.e., the fundamental limit of lossy image compression. In this paper, we report an improved upper bound on the R-D function of images implemented by (1) introducing a new VAE model architecture, (2) applying variable-rate compression techniques, and (3) p…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Conference paper at ICIP 2023. The first two authors share equal contributions

  48. arXiv:2308.13789  [pdf]

    eess.SP

    Sensiverse: A dataset for ISAC study

    Authors: Jiajin Luo, Baojian Zhou, Yang Yu, Ping Zhang, Xiaohui Peng, Jianglei Ma, Peiying Zhu, Jianmin Lu, Wen Tong

    Abstract: In order to address the lack of applicable channel models for ISAC research and evaluation, we release Sensiverse, a dataset that can be used for ISAC research. In this paper, we present the method of generating Sensiverse, including the acquisition and formatting of the 3D scene models, the generation of the channel data and associations with Tx/Rx deployment. The file structure and usage of the…

    Submitted 26 August, 2023; originally announced August 2023.

  49. arXiv:2308.13164  [pdf, other]

    cs.CV eess.IV

    Diff-Retinex: Rethinking Low-light Image Enhancement with A Generative Diffusion Model

    Authors: Xunpeng Yi, Han Xu, Hao Zhang, Linfeng Tang, Jiayi Ma

    Abstract: In this paper, we rethink the low-light image enhancement task and propose a physically explainable and generative diffusion model for low-light image enhancement, termed as Diff-Retinex. We aim to integrate the advantages of the physical model and the generative network. Furthermore, we hope to supplement and even deduce the information missing in the low-light image through the generative networ…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  50. arXiv:2308.11940  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Audio Generation with Multiple Conditional Diffusion Model

    Authors: Zhifang Guo, Jianguo Mao, Rui Tao, Long Yan, Kazushige Ouchi, Hong Liu, Xiangdong Wang

    Abstract: Text-based audio generation models have limitations as they cannot encompass all the information in audio, leading to restricted controllability when relying solely on text. To address this issue, we propose a novel model that enhances the controllability of existing pre-trained text-to-audio models by incorporating additional conditions including content (timestamp) and style (pitch contour and e…

    Submitted 28 December, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: Accepted by AAAI 2024