
Showing 1–50 of 142 results for author: Han, S

  1. arXiv:2407.03753  [pdf]

    eess.SP

    Low-Complexity SVM Signal Recovery in Bandwidth-Limited 100Gb/s PAM4 PON Upstream

    Authors: Liyan Wu, Yanlu Huang, Kai Jin, Shangya Han, Kun Xu, Yanni Ou

    Abstract: We proposed a low-complexity SVM-based signal recovery algorithm and evaluated it in 100G-PON with 25G-class devices. For the first time, it experimentally achieved 24 dB power budget @ FEC threshold 1E-3 over 40 km SMF, improving receiver sensitivity over 2 dB compared to FFE&DFE.

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2407.00896  [pdf, other]

    eess.SP cs.AI

    Channel Modeling Aided Dataset Generation for AI-Enabled CSI Feedback: Advances, Challenges, and Solutions

    Authors: Yupeng Li, Gang Li, Zirui Wen, Shuangfeng Han, Shijian Gao, Guangyi Liu, Jiangzhou Wang

    Abstract: The AI-enabled autoencoder has demonstrated great potential in channel state information (CSI) feedback in frequency division duplex (FDD) multiple input multiple output (MIMO) systems. However, this method completely changes the existing feedback strategies, making it impractical to deploy in recent years. To address this issue, this paper proposes a channel modeling aided data augmentation metho…

    Submitted 30 June, 2024; originally announced July 2024.

  3. arXiv:2406.19856  [pdf]

    eess.SP

    LUT-boosted CDR and Equalization for Burst-mode 50/100 Gbit/s Bandwidth-limited Flexible PON

    Authors: Yanlu Huang, Liyan Wu, Shangya Han, Kai Jin, Kun Xu, Yanni Ou

    Abstract: We proposed and experimentally demonstrated a look-up table boosted fast CDR and equalization scheme for the burst-mode 50/100 Gbps bandwidth-limited flexible PON, requiring no preamble for convergence and achieved the same bit error rate performance as in the case of long preambles.

    Submitted 28 June, 2024; originally announced June 2024.

  4. arXiv:2406.19135  [pdf, other]

    eess.AS cs.AI

    DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability

    Authors: Hyun Joon Park, Jin Sob Kim, Wooseok Shin, Sung Won Han

    Abstract: Expressive Text-to-Speech (TTS) using reference speech has been studied extensively to synthesize natural speech, but there are limitations to obtaining well-represented styles and improving model generalization ability. In this study, we present Diffusion-based EXpressive TTS (DEX-TTS), an acoustic model designed for reference-based speech synthesis with enhanced style representations. Based on a…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Preprint

  5. arXiv:2406.03274  [pdf, other]

    eess.AS cs.AI cs.SD

    Enhancing CTC-based speech recognition with diverse modeling units

    Authors: Shiyi Han, Zhihong Lei, Mingbin Xu, Xingyu Na, Zhen Huang

    Abstract: In recent years, the evolution of end-to-end (E2E) automatic speech recognition (ASR) models has been remarkable, largely due to advances in deep learning architectures like transformer. On top of E2E systems, researchers have achieved substantial accuracy improvement by rescoring E2E model's N-best hypotheses with a phoneme-based model. This raises an interesting question about where the improvem…

    Submitted 11 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  6. arXiv:2405.14770  [pdf, other]

    eess.IV

    Physics-informed Score-based Diffusion Model for Limited-angle Reconstruction of Cardiac Computed Tomography

    Authors: Shuo Han, Yongshun Xu, Dayang Wang, Bahareh Morovati, Li Zhou, Jonathan S. Maltz, Ge Wang, Hengyong Yu

    Abstract: Cardiac computed tomography (CT) has emerged as a major imaging modality for the diagnosis and monitoring of cardiovascular diseases. High temporal resolution is essential to ensure diagnostic accuracy. Limited-angle data acquisition can reduce scan time and improve temporal resolution, but typically leads to severe image degradation and motivates for improved reconstruction techniques. In this pa…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 12 pages

  7. Cepstral Analysis Based Artifact Detection, Recognition and Removal for Prefrontal EEG

    Authors: Siqi Han, Chao Zhang, Jiaxin Lei, Qingquan Han, Yuhui Du, Anhe Wang, Shuo Bai, Milin Zhang

    Abstract: This paper proposes to use cepstrum for artifact detection, recognition and removal in prefrontal EEG. This work focuses on the artifact caused by eye movement. A database containing artifact-free EEG and eye movement contaminated EEG from different subjects is established. A cepstral analysis-based feature extraction with support vector machine (SVM) based classifier is designed to identify the a…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 5 pages, 4 figures, published by TCAS-II

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, 2023

  8. arXiv:2403.18695  [pdf, other]

    eess.SY cs.RO

    An Efficient Risk-aware Branch MPC for Automated Driving that is Robust to Uncertain Vehicle Behaviors

    Authors: Luyao Zhang, George Pantazis, Shaohang Han, Sergio Grammatico

    Abstract: One of the critical challenges in automated driving is ensuring safety of automated vehicles despite the unknown behavior of the other vehicles. Although motion prediction modules are able to generate a probability distribution associated with various behavior modes, their probabilistic estimates are often inaccurate, thus leading to a possibly unsafe trajectory. To overcome this challenge, we pro…

    Submitted 27 March, 2024; originally announced March 2024.

  9. arXiv:2403.05912  [pdf, other]

    eess.IV cs.CV

    Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation

    Authors: Hairong Shi, Songhao Han, Shaofei Huang, Yue Liao, Guanbin Li, Xiangxing Kong, Hua Zhu, Xiaomu Wang, Si Liu

    Abstract: Tumor lesion segmentation on CT or MRI images plays a critical role in cancer diagnosis and treatment planning. Considering the inherent differences in tumor lesion segmentation data across various medical imaging modalities and equipment, integrating medical knowledge into the Segment Anything Model (SAM) presents promising capability due to its versatility and generalization potential. Recent st…

    Submitted 11 July, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  10. arXiv:2402.17127  [pdf, other]

    cs.SD eess.AS

    Experimental Study: Enhancing Voice Spoofing Detection Models with wav2vec 2.0

    Authors: Taein Kang, Soyul Han, Sunmook Choi, Jaejin Seo, Sanghyeok Chung, Seungeun Lee, Seungsang Oh, Il-Youp Kwak

    Abstract: Conventional spoofing detection systems have heavily relied on the use of handcrafted features derived from speech data. However, a notable shift has recently emerged towards the direct utilization of raw speech waveforms, as demonstrated by methods like SincNet filters. This shift underscores the demand for more sophisticated audio sample features. Moreover, the success of deep learning models, p…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 5 pages

    MSC Class: 00A71 ACM Class: I.2.6

  11. arXiv:2402.16581  [pdf, other]

    eess.IV

    Rate Splitting Multiple Access-Enabled Adaptive Panoramic Video Semantic Transmission

    Authors: Haixiao Gao, Mengying Sun, Xiaodong Xu, Shujun Han, Bizhu Wang, Jingxuan Zhang, Ping Zhang

    Abstract: In this paper, we propose an adaptive panoramic video semantic transmission (APVST) framework enabled by rate splitting multiple access (RSMA). The APVST framework consists of a semantic transmitter and receiver, utilizing a deep joint source-channel coding structure to adaptively extract and encode semantic features from panoramic frames. To achieve higher spectral efficiency and conserve bandwid…

    Submitted 23 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  12. arXiv:2402.15151  [pdf, other]

    cs.CV cs.CL eess.AS eess.IV

    Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing

    Authors: Jeong Hun Yeo, Seunghee Han, Minsu Kim, Yong Man Ro

    Abstract: In visual speech processing, context modeling capability is one of the most important requirements due to the ambiguous nature of lip movements. For example, homophenes, words that share identical lip movements but produce different sounds, can be distinguished by considering the context. In this paper, we propose a novel framework, namely Visual Speech Processing incorporated with LLMs (VSP-LLM),…

    Submitted 13 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: An Erratum was added on the last page of this paper

  13. arXiv:2402.13776  [pdf, other]

    eess.IV cs.CV cs.LG

    Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion

    Authors: Lianghu Guo, Tianli Tao, Xinyi Cai, Zihao Zhu, Jiawei Huang, Lixuan Zhu, Zhuoyang Gu, Haifeng Tang, Rui Zhou, Siyan Han, Yan Liang, Qing Yang, Dinggang Shen, Han Zhang

    Abstract: Early infancy is a rapid and dynamic neurodevelopmental period for behavior and neurocognition. Longitudinal magnetic resonance imaging (MRI) is an effective tool to investigate such a crucial stage by capturing the developmental trajectories of the brain structures. However, longitudinal MRI acquisition always meets a serious data-missing problem due to participant dropout and failed scans, makin…

    Submitted 21 February, 2024; originally announced February 2024.

  14. arXiv:2401.17450  [pdf, other]

    quant-ph cs.AR eess.SY

    Qplacer: Frequency-Aware Component Placement for Superconducting Quantum Computers

    Authors: Junyao Zhang, Hanrui Wang, Qi Ding, Jiaqi Gu, Reouven Assouly, William D. Oliver, Song Han, Kenneth R. Brown, Hai "Helen" Li, Yiran Chen

    Abstract: Noisy Intermediate-Scale Quantum (NISQ) computers face a critical limitation in qubit numbers, hindering their progression towards large-scale and fault-tolerant quantum computing. A significant challenge impeding scaling is crosstalk, characterized by unwanted interactions among neighboring components on quantum chips, including qubits, resonators, and substrate. We motivate a general approach to…

    Submitted 8 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

  15. Multi-Channel Multi-Domain based Knowledge Distillation Algorithm for Sleep Staging with Single-Channel EEG

    Authors: Chao Zhang, Yiqiao Liao, Siqi Han, Milin Zhang, Zhihua Wang, Xiang Xie

    Abstract: This paper proposed a Multi-Channel Multi-Domain (MCMD) based knowledge distillation algorithm for sleep staging using single-channel EEG. Both knowledge from different domains and different channels are learnt in the proposed algorithm, simultaneously. A multi-channel pre-training and single-channel fine-tuning scheme is used in the proposed work. The knowledge from different channels in the sour…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: 5 pages, 2 figures, published by IEEE TCAS-II

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, 2022, 69(11): 4608-4612

  16. arXiv:2401.02101  [pdf, ps, other]

    eess.SP

    ICI-Free Channel Estimation and Wireless Gesture Recognition Based on Cellular Signals

    Authors: Rui Peng, Yafei Tian, Shengqian Han

    Abstract: Device-free wireless sensing attracts enormous attentions since it senses the environment without additional devices. While cellular signals are good opportunistic radio sources, the influence of inter-cell interference (ICI) on wireless sensing has not been adequately addressed. In this letter, we first investigate the cause of ICI and its impact on wireless sensing. Then we propose an ICI-free c…

    Submitted 4 January, 2024; originally announced January 2024.

  17. arXiv:2312.15873  [pdf, other]

    cs.NI eess.SY

    Investigating Inter-Satellite Link Spanning Patterns on Networking Performance in Mega-constellations

    Authors: Xiangtong Wang, Xiaodong Han, Menglong Yang, Chuan Xing, Yuqi Wang, Songchen Han, Wei Li

    Abstract: Low Earth orbit (LEO) mega-constellations rely on inter-satellite links (ISLs) to provide global connectivity. We note that in addition to the general constellation parameters, the ISL spanning patterns also greatly influence the final network structure and thus the network performance. In this work, we formulate the ISL spanning patterns, apply different patterns to mega-constellation and g…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: 5 pages

  18. arXiv:2311.18762  [pdf, ps, other]

    cs.IT eess.SP

    Performance Analysis of Integrated Sensing and Communications Under Gain-Phase Imperfections

    Authors: Shuaishuai Han, Mohammad Ahmad Al-Jarrah, Emad Alsusa

    Abstract: This paper evaluates the performance of uplink integrated sensing and communication systems in the presence of gain and phase imperfections. Specifically, we consider multiple unmanned aerial vehicles (UAVs) transmitting data to a multiple-input-multiple-output base-station (BS) that is responsible for estimating the transmitted information in addition to localising the transmitting UAVs. The sign…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 38 pages, 7 figures

  19. arXiv:2311.15313  [pdf, ps, other]

    eess.SP cs.IT

    Low-Complexity Joint Beamforming for RIS-Assisted MU-MISO Systems Based on Model-Driven Deep Learning

    Authors: Weijie Jin, Jing Zhang, Chao-Kai Wen, Shi Jin, Xiao Li, Shuangfeng Han

    Abstract: Reconfigurable intelligent surfaces (RIS) can improve signal propagation environments by adjusting the phase of the incident signal. However, optimizing the phase shifts jointly with the beamforming vector at the access point is challenging due to the non-convex objective function and constraints. In this study, we propose an algorithm based on weighted minimum mean square error optimization and p…

    Submitted 26 November, 2023; originally announced November 2023.

    Comments: 14 pages, 9 figures, 2 tables. This paper has been accepted for publication by the IEEE Transactions on Wireless Communications. Copyright may be transferred without notice, after which this version may no longer be accessible

  20. arXiv:2311.14916  [pdf, other]

    eess.SY cs.RO

    Automated Lane Merging via Game Theory and Branch Model Predictive Control

    Authors: Luyao Zhang, Shaohang Han, Sergio Grammatico

    Abstract: We propose an integrated behavior and motion planning framework for the automated lane-merging problem. The behavior planner combines search-based planning with game theory to model the interaction between vehicles and select multi-vehicle trajectories. Inspired by human drivers, we model the lane-merging problem as a gap selection process. To overcome the challenge of multi-modal driving behavior…

    Submitted 8 March, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

  21. arXiv:2310.16869  [pdf]

    eess.IV physics.optics

    Single-pixel imaging based on deep learning

    Authors: Kai Song, Yaoxing Bian, Ku Wu, Hongrui Liu, Shuangping Han, Jiaming Li, Jiazhao Tian, Chengbin Qin, Jianyong Hu, Liantuan Xiao

    Abstract: Single-pixel imaging can collect images at the wavelengths outside the reach of conventional focal plane array detectors. However, the limited image quality and lengthy computational times for iterative reconstruction still impede the practical application of single-pixel imaging. Recently, deep learning has been introduced into single-pixel imaging, which has attracted a lot of attention due to i…

    Submitted 16 November, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

  22. arXiv:2310.12405  [pdf, other]

    eess.IV cs.CV

    LoMAE: Low-level Vision Masked Autoencoders for Low-dose CT Denoising

    Authors: Dayang Wang, Yongshun Xu, Shuo Han, Zhan Wu, Li Zhou, Bahareh Morovati, Hengyong Yu

    Abstract: Low-dose computed tomography (LDCT) offers reduced X-ray radiation exposure but at the cost of compromised image quality, characterized by increased noise and artifacts. Recently, transformer models emerged as a promising avenue to enhance LDCT image quality. However, the success of such models relies on a large amount of paired noisy and clean images, which are often scarce in clinical settings.…

    Submitted 18 October, 2023; originally announced October 2023.

  23. arXiv:2310.07161  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    Psychoacoustic Challenges Of Speech Enhancement On VoIP Platforms

    Authors: Joseph Konan, Ojas Bhargave, Shikhar Agnihotri, Shuo Han, Yunyang Zeng, Ankit Shah, Bhiksha Raj

    Abstract: Within the ambit of VoIP (Voice over Internet Protocol) telecommunications, the complexities introduced by acoustic transformations merit rigorous analysis. This research, rooted in the exploration of proprietary sender-side denoising effects, meticulously evaluates platforms such as Google Meets and Zoom. The study draws upon the Deep Noise Suppression (DNS) 2020 dataset, ensuring a structured ex…

    Submitted 21 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  24. arXiv:2310.07062  [pdf, other]

    cs.SD cs.LG eess.AS

    Acoustic Model Fusion for End-to-end Speech Recognition

    Authors: Zhihong Lei, Mingbin Xu, Shiyi Han, Leo Liu, Zhen Huang, Tim Ng, Yuanyuan Zhang, Ernest Pusateri, Mirko Hannemann, Yaqiao Deng, Man-Hung Siu

    Abstract: Recent advances in deep learning and automatic speech recognition (ASR) have enabled the end-to-end (E2E) ASR system and boosted the accuracy to a new level. The E2E systems implicitly model all conventional ASR components, such as the acoustic model (AM) and the language model (LM), in a single network trained on audio-text pairs. Despite this simpler system architecture, fusing a separate LM, tr…

    Submitted 10 October, 2023; originally announced October 2023.

  25. arXiv:2308.16551  [pdf]

    eess.IV cs.CV

    Object Detection for Caries or Pit and Fissure Sealing Requirement in Children's First Permanent Molars

    Authors: Chenyao Jiang, Shiyao Zhai, Hengrui Song, Yuqing Ma, Yachen Fan, Yancheng Fang, Dongmei Yu, Canyang Zhang, Sanyang Han, Runming Wang, Yong Liu, Jianbo Li, Peiwu Qin

    Abstract: Dental caries is one of the most common oral diseases that, if left untreated, can lead to a variety of oral problems. It mainly occurs inside the pits and fissures on the occlusal/buccal/palatal surfaces of molars and children are a high-risk group for pit and fissure caries in permanent molars. Pit and fissure sealing is one of the most effective methods that is widely used in prevention of pit…

    Submitted 31 August, 2023; originally announced August 2023.

  26. arXiv:2308.11335  [pdf, other]

    cs.IT eess.SP

    Graph Neural Network-Enhanced Expectation Propagation Algorithm for MIMO Turbo Receivers

    Authors: Xingyu Zhou, Jing Zhang, Chao-Kai Wen, Shi Jin, Shuangfeng Han

    Abstract: Deep neural networks (NNs) are considered a powerful tool for balancing the performance and complexity of multiple-input multiple-output (MIMO) receivers due to their accurate feature extraction, high parallelism, and excellent inference ability. Graph NNs (GNNs) have recently demonstrated outstanding capability in learning enhanced message passing rules and have shown success in overcoming the dr…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: 15 pages, 12 figures, 2 tables. This paper has been accepted for publication by the IEEE Transactions on Signal Processing. Copyright may be transferred without notice, after which this version may no longer be accessible

  27. arXiv:2308.06634  [pdf, other]

    quant-ph eess.SY

    DISQ: Dynamic Iteration Skipping for Variational Quantum Algorithms

    Authors: Junyao Zhang, Hanrui Wang, Gokul Subramanian Ravi, Frederic T. Chong, Song Han, Frank Mueller, Yiran Chen

    Abstract: This paper proposes DISQ to craft a stable landscape for VQA training and tackle the noise drift challenge. DISQ adopts a "drift detector" with a reference circuit to identify and skip iterations that are severely affected by noise drift errors. Specifically, the circuits from the previous training iteration are re-executed as a reference circuit in the current iteration to estimate noise drift im…

    Submitted 12 July, 2024; v1 submitted 12 August, 2023; originally announced August 2023.

  28. arXiv:2307.16228  [pdf, other]

    cs.MA cs.AI cs.LG eess.SY

    Robust Electric Vehicle Balancing of Autonomous Mobility-On-Demand System: A Multi-Agent Reinforcement Learning Approach

    Authors: Sihong He, Shuo Han, Fei Miao

    Abstract: Electric autonomous vehicles (EAVs) are getting attention in future autonomous mobility-on-demand (AMoD) systems due to their economic and societal benefits. However, EAVs' unique charging patterns (long charging time, high charging frequency, unpredictable charging behaviors, etc.) make it challenging to accurately predict the EAVs supply in E-AMoD systems. Furthermore, the mobility demand's pred…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: accepted to International Conference on Intelligent Robots and Systems (IROS2023)

  29. arXiv:2307.16212  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA eess.SY

    Robust Multi-Agent Reinforcement Learning with State Uncertainty

    Authors: Sihong He, Songyang Han, Sanbao Su, Shuo Han, Shaofeng Zou, Fei Miao

    Abstract: In real-world multi-agent reinforcement learning (MARL) applications, agents may not have perfect state information (e.g., due to inaccurate measurement or malicious attacks), which challenges the robustness of agents' policies. Though robustness is getting important in MARL deployment, little prior work has studied state uncertainties in MARL, neither in problem formulation nor algorithm design.…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: 50 pages, Published in TMLR, Transactions on Machine Learning Research (06/2023)

  30. arXiv:2307.05799  [pdf]

    eess.IV cs.CV

    3D Medical Image Segmentation based on multi-scale MPU-Net

    Authors: Zeqiu. Yu, Shuo. Han, Ziheng. Song

    Abstract: The high cure rate of cancer is inextricably linked to physicians' accuracy in diagnosis and treatment, therefore a model that can accomplish high-precision tumor segmentation has become a necessity in many applications of the medical industry. It can effectively lower the rate of misdiagnosis while considerably lessening the burden on clinicians. However, fully automated target organ segmentation…

    Submitted 24 July, 2023; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: 37 pages

  31. arXiv:2306.16193  [pdf]

    cs.NI eess.SP

    Deterministic End-to-End Transmission to Optimize the Network Efficiency and Quality of Service: A Paradigm Shift in 6G

    Authors: Xiaoyun Wang, Shuangfeng Han, Zhiming Liu, Qixing Wang

    Abstract: Toward end-to-end mobile service provision with optimized network efficiency and quality of service, tremendous efforts have been devoted in upgrading mobile applications, transport and internet networks, and wireless communication networks for many years. However, the inherent loose coordination between different layers in the end-to-end communication networks leads to unreliable data transmissio…

    Submitted 2 July, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: 5 pages, 2 figures

  32. arXiv:2306.09164  [pdf]

    cs.NI eess.SP

    Network Architecture Design toward Convergence of Mobile Applications and Networks

    Authors: Shuangfeng Han, Zhiming Liu, Tao Sun, Xiaoyun Wang

    Abstract: With the quick proliferation of extended reality (XR) services, the mobile communications networks are faced with gigantic challenges to meet the diversified and challenging service requirements. A tight coordination or even convergence of applications and mobile networks is highly motivated. In this paper, a multi-domain (e.g. application layer, transport layer, the core network, radio access net…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 7 pages, 5 figures, IEEE communications magazine, under review

  33. arXiv:2306.04628  [pdf, other]

    cs.SD cs.MM eess.AS

    Systematic Analysis of Music Representations from BERT

    Authors: Sangjun Han, Hyeongrae Ihm, Woohyung Lim

    Abstract: There have been numerous attempts to represent raw data as numerical vectors that effectively capture semantic and contextual information. However, in the field of symbolic music, previous works have attempted to validate their music embeddings by observing the performance improvement of various fine-tuning tasks. In this work, we directly analyze embeddings from BERT and BERT with contrastive lea…

    Submitted 6 June, 2023; originally announced June 2023.

  34. arXiv:2305.09793  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Reinforcement Learning for Safe Robot Control using Control Lyapunov Barrier Functions

    Authors: Desong Du, Shaohang Han, Naiming Qi, Haitham Bou Ammar, Jun Wang, Wei Pan

    Abstract: Reinforcement learning (RL) exhibits impressive performance when managing complicated control tasks for robots. However, its wide application to physical robots is limited by the absence of strong safety guarantees. To overcome this challenge, this paper explores the control Lyapunov barrier function (CLBF) to analyze the safety and reachability solely based on data without explicitly employing a…

    Submitted 16 May, 2023; originally announced May 2023.

  35. arXiv:2305.08878  [pdf, other]

    eess.IV cs.CV cs.LG

    Learning to Learn Unlearned Feature for Brain Tumor Segmentation

    Authors: Seungyub Han, Yeongmo Kim, Seokhyeon Ha, Jungwoo Lee, Seunghong Choi

    Abstract: We propose a fine-tuning algorithm for brain tumor segmentation that needs only a few data samples and helps networks not to forget the original tasks. Our approach is based on active learning and meta-learning. One of the difficulties in medical image segmentation is the lack of datasets with proper annotations, because it requires doctors to tag reliable annotation and there are many variants of…

    Submitted 13 May, 2023; originally announced May 2023.

    Comments: Medical Imaging Meets NeurIPS 2018

  36. arXiv:2305.05587  [pdf, other]

    eess.SY

    Predictive Control of Linear Discrete-Time Markovian Jump Systems by Learning Recurrent Patterns

    Authors: SooJean Han, Soon-Jo Chung, John C. Doyle

    Abstract: Incorporating pattern-learning for prediction (PLP) in many discrete-time or discrete-event systems allows for computation-efficient controller design by memorizing patterns to schedule control policies based on their future occurrences. In this paper, we demonstrate the effect of PLP by designing a controller architecture for a class of linear Markovian jump systems (MJS) where the aforementioned…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: Preprint submitted to Automatica as of Jan 2023

  37. arXiv:2304.14467  [pdf, other]

    eess.SP

    Distributed Quantized Detection of Sparse Signals Under Byzantine Attacks

    Authors: Chen Quan, Yunghsiang S. Han, Baocheng Geng, Pramod K. Varshney

    Abstract: This paper investigates distributed detection of sparse stochastic signals with quantized measurements under Byzantine attacks. Under this type of attack, sensors in the networks might send falsified data to degrade system performance. The Bernoulli-Gaussian (BG) distribution in terms of the sparsity degree of the stochastic signal is utilized for modeling the sparsity of signals. Several detector…

    Submitted 27 April, 2023; originally announced April 2023.

  38. arXiv:2304.06246  [pdf, other]

    eess.IV

    Rapid Brain Meninges Surface Reconstruction with Layer Topology Guarantee

    Authors: Peiyu Duan, Yuan Xue, Shuo Han, Lianrui Zuo, Aaron Carass, Caitlyn Bernhard, Savannah Hays, Peter A. Calabresi, Susan M. Resnick, James S. Duncan, Jerry L. Prince

    Abstract: The meninges, located between the skull and brain, are composed of three membrane layers: the pia, the arachnoid, and the dura. Reconstruction of these layers can aid in studying volume differences between patients with neurodegenerative diseases and normal aging subjects. In this work, we use convolutional neural networks (CNNs) to reconstruct surfaces representing meningeal layer boundaries from…

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: ISBI 2023 Oral

  39. arXiv:2303.15703  [pdf, other]

    eess.AS

    AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection

    Authors: Jin Sob Kim, Hyun Joon Park, Wooseok Shin, Sung Won Han

    Abstract: Sound event localization and detection (SELD) combines the identification of sound events with the corresponding directions of arrival (DOA). Recently, event-oriented track output formats have been adopted to solve this problem; however, they still have limited generalization toward real-world problems in an unknown polyphony environment. To address the issue, we proposed an angular-distance-based…

    Submitted 10 May, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: 5 pages, 3 figures, accepted for publication in IEEE ICASSP 2023

  40. arXiv:2303.09463  [pdf, other]

    cs.RO eess.SY

    An Autonomous System for Head-to-Head Race: Design, Implementation and Analysis; Team KAIST at the Indy Autonomous Challenge

    Authors: Chanyoung Jung, Andrea Finazzi, Hyunki Seong, Daegyu Lee, Seungwook Lee, Bosung Kim, Gyuri Gang, Seungil Han, David Hyunchul Shim

    Abstract: While the majority of autonomous driving research has concentrated on everyday driving scenarios, further safety and performance improvements of autonomous vehicles require a focus on extreme driving conditions. In this context, autonomous racing is a new area of research that has been attracting considerable interest recently. Due to the fact that a vehicle is driven by its perception, planning,…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 35 pages, 31 figures, 5 tables, Field Robotics (accepted)

  41. arXiv:2303.09057  [pdf, other]

    eess.AS cs.SD

    TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion

    Authors: Hyun Joon Park, Seok Woo Yang, Jin Sob Kim, Wooseok Shin, Sung Won Han

    Abstract: Voice Conversion (VC) must be achieved while maintaining the content of the source speech and representing the characteristics of the target speaker. The existing methods do not simultaneously satisfy the above two aspects of VC, and their conversion outputs suffer from a trade-off problem between maintaining source contents and target characteristics. In this study, we propose Triple Adaptive Att…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: To appear in ICASSP 2023

  42. arXiv:2303.09048  [pdf, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    Improving Perceptual Quality, Intelligibility, and Acoustics on VoIP Platforms

    Authors: Joseph Konan, Ojas Bhargave, Shikhar Agnihotri, Hojeong Lee, Ankit Shah, Shuo Han, Yunyang Zeng, Amanda Shu, Haohui Liu, Xuankai Chang, Hamza Khalid, Minseon Gwak, Kawon Lee, Minjeong Kim, Bhiksha Raj

    Abstract: In this paper, we present a method for fine-tuning models trained on the Deep Noise Suppression (DNS) 2020 Challenge to improve their performance on Voice over Internet Protocol (VoIP) applications. Our approach involves adapting the DNS 2020 models to the specific acoustic characteristics of VoIP communications, which includes distortion and artifacts caused by compression, transmission, and plat…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: Under review at European Association for Signal Processing. 5 pages

  43. arXiv:2302.08095  [pdf, other]

    cs.SD cs.CL eess.AS

    PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement

    Authors: Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj

    Abstract: Despite rapid advancement in recent years, current speech enhancement models often produce speech that differs in perceptual quality from real clean speech. We propose a learning objective that formalizes differences in perceptual quality, by using domain knowledge of acoustic-phonetics. We identify temporal acoustic parameters -- such as spectral tilt, spectral flux, shimmer, etc. -- that are non…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: Accepted at ICASSP 2023

  44. arXiv:2302.08088  [pdf, other]

    cs.CL cs.SD eess.AS

    TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement

    Authors: Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj

    Abstract: Speech enhancement models have greatly progressed in recent years, but still show limits in perceptual quality of their speech outputs. We propose an objective for perceptual quality based on temporal acoustic parameters. These are fundamental speech features that play an essential role in various applications, including speaker recognition and paralinguistic analysis. We provide a differentiable…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: Accepted at ICASSP 2023

  45. arXiv:2301.10815  [pdf, other]

    eess.SP

    Human-machine Hierarchical Networks for Decision Making under Byzantine Attacks

    Authors: Chen Quan, Baocheng Geng, Yunghsiang S. Han, Pramod K. Varshney

    Abstract: This paper proposes a belief-updating scheme in a human-machine collaborative decision-making network to combat Byzantine attacks. A hierarchical framework is used to realize the network where local decisions from physical sensors act as reference decisions to improve the quality of human sensor decisions. During the decision-making process, the belief that each physical sensor is malicious is upd…

    Submitted 25 January, 2023; originally announced January 2023.

  46. arXiv:2301.01349  [pdf, other]

    eess.SY

    Quantitative Planning with Action Deception in Concurrent Stochastic Games

    Authors: Chongyang Shi, Shuo Han, Jie Fu

    Abstract: We study a class of two-player competitive concurrent stochastic games on graphs with reachability objectives. Specifically, player 1 aims to reach a subset $F_1$ of game states, and player 2 aims to reach a subset $F_2$ of game states where $F_2\cap F_1=\emptyset$. Both players aim to satisfy their reachability objectives before their opponent does. Yet, the information players have about the gam…

    Submitted 22 March, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

  47. arXiv:2212.00661  [pdf, other]

    quant-ph eess.SY

    Hybrid Gate-Pulse Model for Variational Quantum Algorithms

    Authors: Zhiding Liang, Zhixin Song, Jinglei Cheng, Zichang He, Ji Liu, Hanrui Wang, Ruiyang Qin, Yiru Wang, Song Han, Xuehai Qian, Yiyu Shi

    Abstract: Current quantum programs are mostly synthesized and compiled on the gate-level, where quantum circuits are composed of quantum gates. The gate-level workflow, however, introduces significant redundancy when quantum gates are eventually transformed into control signals and applied on quantum devices. For superconducting quantum computers, the control signals are microwave pulses. Therefore, pulse-l…

    Submitted 1 December, 2022; originally announced December 2022.

    Comments: 8 pages, 6 figures

  48. arXiv:2211.13797  [pdf, other]

    math.OC cs.RO eess.SY

    Data-Driven Distributionally Robust Electric Vehicle Balancing for Autonomous Mobility-on-Demand Systems under Demand and Supply Uncertainties

    Authors: Sihong He, Zhili Zhang, Shuo Han, Lynn Pepin, Guang Wang, Desheng Zhang, John Stankovic, Fei Miao

    Abstract: Electric vehicles (EVs) are being rapidly adopted due to their economic and societal benefits. Autonomous mobility-on-demand (AMoD) systems also embrace this trend. However, the long charging time and high recharging frequency of EVs pose challenges to efficiently managing EV AMoD systems. The complicated dynamic charging and mobility process of EV AMoD systems makes the demand and supply uncertai…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: 16 pages

  49. arXiv:2211.11248  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Video Background Music Generation: Dataset, Method and Evaluation

    Authors: Le Zhuo, Zhaokai Wang, Baisen Wang, Yue Liao, Chenxi Bao, Stanley Peng, Songhao Han, Aixi Zhang, Fei Fang, Si Liu

    Abstract: Music is essential when editing videos, but selecting music manually is difficult and time-consuming. Thus, we seek to automatically generate background music tracks given video input. This is a challenging task since it requires music-video datasets, efficient architectures for video-to-music generation, and reasonable metrics, none of which currently exist. To close this gap, we introduce a comp…

    Submitted 4 August, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted by ICCV2023

  50. arXiv:2211.09385  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    ComMU: Dataset for Combinatorial Music Generation

    Authors: Lee Hyun, Taehyun Kim, Hyolim Kang, Minjoo Ki, Hyeonchan Hwang, Kwanho Park, Sharang Han, Seon Joo Kim

    Abstract: Commercial adoption of automatic music composition requires the capability of generating diverse and high-quality music suitable for the desired context (e.g., music for romantic movies, action games, restaurants, etc.). In this paper, we introduce combinatorial music generation, a new task to create varying background music based on given conditions. Combinatorial music generation creates short s…

    Submitted 17 November, 2022; originally announced November 2022.

    Comments: 19 pages, 12 figures