Showing 1–50 of 140 results for author: Yan, Y

  1. arXiv:2407.09935  [pdf, other]

    cs.CV cs.MM eess.IV

    LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation

    Authors: Jiacheng Li, Chang Chen, Fenglong Song, Youliang Yan, Zhiwei Xiong

    Abstract: Image resampling is a basic technique that is widely employed in daily applications, such as camera photo editing. Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors. Still, these methods are not the perfect substitute for interpolation, due to the drawbacks in efficiency and versatility. In this work, we propose a novel method of Lea…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Code: https://github.com/ddlee-cn/LeRF-PyTorch

  2. arXiv:2407.01891  [pdf, other]

    cs.RO eess.SY

    Refined Motion Compensation with Soft Laser Manipulators using Data-Driven Surrogate Models

    Authors: Yongjun Yan, Qingpeng Ding, Mingwu Li, Junyan Yan, Shing Shin Cheng

    Abstract: Non-contact laser ablation, a precise thermal technique, simultaneously cuts and coagulates tissue without the insertion errors associated with rigid needles. Human organ motions, such as those in the liver, exhibit rhythmic components influenced by respiratory and cardiac cycles, making effective laser energy delivery to target lesions while compensating for tumor motion crucial. This research in…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2406.08196  [pdf, other]

    cs.SD eess.AS

    FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter

    Authors: Yuanjun Lv, Hai Li, Ying Yan, Junhui Liu, Danming Xie, Lei Xie

    Abstract: Vocoders reconstruct speech waveforms from acoustic features and play a pivotal role in modern TTS systems. Frequency-domain GAN vocoders like Vocos and APNet2 have recently seen rapid advancements, outperforming time-domain models in inference speed while achieving comparable audio quality. However, these frequency-domain vocoders suffer from large parameter sizes, thus introducing extra memory bu…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by InterSpeech 2024; 5 pages, 5 figures

  4. arXiv:2405.19363  [pdf, other]

    eess.SP cs.AI cs.LG

    Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification

    Authors: Yihe Wang, Nan Huang, Taida Li, Yujun Yan, Xiang Zhang

    Abstract: Medical time series data, such as Electroencephalography (EEG) and Electrocardiography (ECG), play a crucial role in healthcare, such as diagnosing brain and heart diseases. Existing methods for medical time series classification primarily rely on handcrafted biomarker extraction and CNN-based models, with limited exploration of transformers tailored for medical time series. In this paper, we int…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 20 pages (14 pages main paper + 6 pages supplementary materials)

  5. arXiv:2405.18435  [pdf, other]

    eess.IV cs.CV

    QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

    Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag, et al. (55 additional authors not shown)

    Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de…

    Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

    Comments: initial technical report

  6. arXiv:2405.10977  [pdf, other]

    eess.SY physics.app-ph

    Frequency stabilization of self-sustained oscillations in a sideband-driven electromechanical resonator

    Authors: B. Zhang, Yingming Yan, X. Dong, M. I. Dykman, H. B. Chan

    Abstract: We present a method to stabilize the frequency of self-sustained vibrations in micro- and nanomechanical resonators. The method refers to a two-mode system with the vibrations at significantly different frequencies. The signal from one mode is used to control the other mode. In the experiment, self-sustained oscillations of micromechanical modes are excited by pumping at the blue-detuned sideband…

    Submitted 13 May, 2024; originally announced May 2024.

  7. arXiv:2405.09979  [pdf]

    eess.SP

    Harmonic and Interharmonic Detection in Power Systems Based on Fractal-Optimized Variational Mode Decomposition

    Authors: Pei Yuhang, Yu Min, Yu Yan

    Abstract: The proposed method introduces a parameter determination approach based on the minimum Fractal box dimension (FBD) of Variational Mode Decomposition (VMD) components, aiming to address the issue of manual determination of VMD decomposition layers in advance. Initially, VMD is applied to the original power signal, and the layer number for VMD decomposition is determined by selecting the K value ass…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: in Chinese language

  8. arXiv:2405.04111  [pdf, other]

    cs.LG eess.SP

    Adaptive Least Mean pth Power Graph Neural Networks

    Authors: Changran Peng, Yi Yan, Ercan E. Kuruoglu

    Abstract: In the presence of impulsive noise and missing observations, accurate online prediction of time-varying graph signals poses a crucial challenge in numerous application domains. We propose the Adaptive Least Mean $p^{th}$ Power Graph Neural Networks (LMP-GNN), a universal framework combining adaptive filter and graph neural network for online graph signal estimation. LMP-GNN retains the advantage…

    Submitted 7 May, 2024; originally announced May 2024.

  9. arXiv:2405.04107  [pdf, other]

    eess.SP

    Adaptive Graph Normalized Sign Algorithm

    Authors: Changran Peng, Yi Yan, Ercan E. Kuruoglu

    Abstract: Efficient and robust prediction of graph signals is challenging when the signals are under impulsive noise and have missing data. Exploiting graph signal processing (GSP) and leveraging the simplicity of the classical adaptive sign algorithm, we propose an adaptive algorithm on graphs named the Graph Normalized Sign (GNS). GNS approximated a normalization term into the update, therefore achieving…

    Submitted 7 May, 2024; originally announced May 2024.

  10. arXiv:2405.04098  [pdf, other]

    cs.LG eess.SP

    Binarized Simplicial Convolutional Neural Networks

    Authors: Yi Yan, Ercan E. Kuruoglu

    Abstract: Graph Neural Networks have a limitation of solely processing features on graph nodes, neglecting data on high-dimensional structures such as edges and triangles. Simplicial Convolutional Neural Networks (SCNN) represent higher-order structures using simplicial complexes to break this limitation, albeit still lacking time efficiency. In this paper, we propose a novel neural network architecture on s…

    Submitted 7 May, 2024; originally announced May 2024.

  11. arXiv:2405.02191  [pdf]

    cs.CV cs.LG eess.IV

    Non-Destructive Peat Analysis using Hyperspectral Imaging and Machine Learning

    Authors: Yijun Yan, Jinchang Ren, Barry Harrison, Oliver Lewis, Yinhe Li, Ping Ma

    Abstract: Peat, a crucial component in whisky production, imparts distinctive and irreplaceable flavours to the final product. However, the extraction of peat disrupts ancient ecosystems and releases significant amounts of carbon, contributing to climate change. This paper aims to address this issue by conducting a feasibility study on enhancing peat use efficiency in whisky manufacturing through non-destru…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 4 pages, 4 figures

  12. arXiv:2404.19387  [pdf, other]

    eess.SY

    Online Electricity Purchase for Data Center with Dynamic Virtual Battery from Flexibility Aggregation

    Authors: Kekun Gao, Yuejun Yan, Yixuan Liu, Endong Liu, Pengcheng You

    Abstract: As a critical component of modern infrastructure, data centers account for a huge amount of power consumption and greenhouse gas emission. This paper studies the electricity purchase strategy for a data center to lower its energy cost while integrating local renewable generation under uncertainty. To facilitate efficient and scalable decision-making, we propose a two-layer hierarchy where the lowe…

    Submitted 30 April, 2024; originally announced April 2024.

  13. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  14. arXiv:2404.15349  [pdf, other]

    eess.SP cs.LG cs.MM

    A Survey on Multimodal Wearable Sensor-based Human Action Recognition

    Authors: Jianyuan Ni, Hao Tang, Syed Tousiful Haque, Yan Yan, Anne H. H. Ngu

    Abstract: The combination of increased life expectancy and falling birth rates is resulting in an aging population. Wearable Sensor-based Human Activity Recognition (WSHAR) emerges as a promising assistive technology to support the daily lives of older individuals, unlocking vast potential for human-centric applications. However, recent surveys in WSHAR have been limited, focusing either solely on deep lear…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Multimodal Survey for Wearable Sensor-based Human Action Recognition

  15. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  16. arXiv:2404.09466  [pdf, other]

    cs.SD cs.LG eess.AS

    Scoring Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription

    Authors: Yujia Yan, Zhiyao Duan

    Abstract: The neural semi-Markov Conditional Random Field (semi-CRF) framework has demonstrated promise for event-based piano transcription. In this framework, all events (notes or pedals) are represented as closed intervals tied to specific event types. The neural semi-CRF approach requires an interval scoring matrix that assigns a score for every candidate interval. However, designing an efficient and exp…

    Submitted 23 May, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Fixed Typos

  17. arXiv:2403.03055  [pdf, other]

    cs.MA cs.LG cs.RO eess.SY

    Distributed Policy Gradient for Linear Quadratic Networked Control with Limited Communication Range

    Authors: Yuzi Yan, Yuan Shen

    Abstract: This paper proposes a scalable distributed policy gradient method and proves its convergence to a near-optimal solution in multi-agent linear quadratic networked systems. The agents engage within a specified network under local communication constraints, implying that each agent can only exchange information with a limited number of neighboring agents. On the underlying graph of the network, each ag…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 14 pages, 6 figures

  18. arXiv:2403.02601  [pdf, other]

    eess.IV cs.CV

    Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

    Authors: Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Haoze Sun, Xueyi Zou, Zhensong Zhang, Youliang Yan, Lei Zhu

    Abstract: For image super-resolution (SR), bridging the gap between the performance on synthetic datasets and real-world degradation scenarios remains a challenge. This work introduces a novel "Low-Res Leads the Way" (LWay) training framework, merging Supervised Pre-training with Self-supervised Learning to enhance the adaptability of SR models to real-world images. Our approach utilizes a low-resolution (L…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  19. arXiv:2402.07619  [pdf, other]

    cs.SD cs.AI eess.AS

    Developing a Multi-variate Prediction Model For COVID-19 From Crowd-sourced Respiratory Voice Data

    Authors: Yuyang Yan, Wafaa Aljbawi, Sami O. Simons, Visara Urovi

    Abstract: COVID-19 has affected more than 223 countries worldwide, and in the post-COVID era there is a pressing need for non-invasive, low-cost, and highly scalable solutions to detect COVID-19. We develop a deep learning model to identify COVID-19 from voice recording data. The novelty of this work is in the development of deep learning models for COVID-19 identification from only voice recordings. We use…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2209.03727

  20. arXiv:2402.00697  [pdf, other]

    eess.SY

    Combining Belief Function Theory and Stochastic Model Predictive Control for Multi-Modal Uncertainty in Autonomous Driving

    Authors: Tommaso Benciolini, Yuntian Yan, Dirk Wollherr, Marion Leibold

    Abstract: In automated driving, predicting and accommodating the uncertain future motion of other traffic participants is challenging, especially in unstructured environments in which the high-level intention of traffic participants is difficult to predict. Several possible uncertain future behaviors of traffic participants must be considered, resulting in multi-modal uncertainty. We propose a novel combina…

    Submitted 2 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: This work has been accepted to the 2024 American Control Conference

  21. arXiv:2401.15304  [pdf, other]

    cs.LG eess.SP

    Adaptive Least Mean Squares Graph Neural Networks and Online Graph Signal Estimation

    Authors: Yi Yan, Changran Peng, Ercan Engin Kuruoglu

    Abstract: The online prediction of multivariate signals, existing simultaneously in space and time, from noisy partial observations is a fundamental task in numerous applications. We propose an efficient Neural Network architecture for the online estimation of time-varying graph signals named the Adaptive Least Mean Squares Graph Neural Networks (LMS-GNN). LMS-GNN aims to capture the time variation and brid…

    Submitted 27 January, 2024; originally announced January 2024.

  22. arXiv:2401.04965  [pdf]

    eess.SP cs.LG

    ConvConcatNet: a deep convolutional neural network to reconstruct mel spectrogram from the EEG

    Authors: Xiran Xu, Bo Wang, Yujie Yan, Haolin Zhu, Zechen Zhang, Xihong Wu, Jing Chen

    Abstract: To investigate the processing of speech in the brain, simple linear models are commonly used to establish a relationship between brain signals and speech features. However, these linear models are ill-equipped to model a highly dynamic and complex non-linear system like the brain. Although non-linear methods with neural networks have been developed recently, reconstructing unseen stimuli from unse…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 2 pages, 1 figure, 2 tables

  23. arXiv:2401.04964  [pdf]

    eess.SP cs.SD eess.AS

    Self-supervised speech representation and contextual text embedding for match-mismatch classification with EEG recording

    Authors: Bo Wang, Xiran Xu, Zechen Zhang, Haolin Zhu, YuJie Yan, Xihong Wu, Jing Chen

    Abstract: Relating speech to EEG holds considerable importance but is challenging. In this study, a deep convolutional network was employed to extract spatiotemporal features from EEG data. Self-supervised speech representation and contextual text embedding were used as speech features. Contrastive learning was used to relate EEG features to speech features. The experimental results demonstrate the benefits…

    Submitted 31 January, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: 2 pages, 2 figures, accepted by ICASSP 2024

  24. arXiv:2401.02566  [pdf]

    cs.SD cs.LG cs.MM eess.AS

    Siamese Residual Neural Network for Musical Shape Evaluation in Piano Performance Assessment

    Authors: Xiaoquan Li, Stephan Weiss, Yijun Yan, Yinhe Li, Jinchang Ren, John Soraghan, Ming Gong

    Abstract: Understanding and identifying musical shape plays an important role in music education and performance assessment. To simplify the otherwise time- and cost-intensive musical shape evaluation, in this paper we explore how artificial intelligence (AI) driven models can be applied. Considering musical shape evaluation as a classification problem, a light-weight Siamese residual neural network (S-ResN…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: X. Li, S. Weiss, Y. Yan, Y. Li, J. Ren, J. Soraghan, M. Gong, "Siamese residual neural network for musical shape evaluation in piano performance assessment," in Proc. of the 31st European Signal Processing Conference, Helsinki, Finland

  25. arXiv:2401.01673  [pdf, other]

    cs.IT eess.SP

    Coded Beam Training

    Authors: Tianyue Zheng, Jieao Zhu, Qiumo Yu, Yongli Yan, Linglong Dai

    Abstract: In extremely large-scale multiple input multiple output (XL-MIMO) systems for future sixth-generation (6G) communications, codebook-based beam training stands out as a promising technology to acquire channel state information (CSI). Despite their effectiveness, when the pilot overhead is limited, existing beam training methods suffer from significant achievable rate degradation for remote users wi…

    Submitted 6 March, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: In this paper, we introduce channel coding theory into hierarchical beam training and propose a beam training scheme called coded beam training. By leveraging the error-correcting capability of channel codes, the proposed coded beam training method can enable reliable beam training performance for remote users with low SNR, while keeping training overhead low

  26. arXiv:2312.13310  [pdf, other]

    eess.IV cs.CV

    Computational Spectral Imaging with Unified Encoding Model: A Comparative Study and Beyond

    Authors: Xinyuan Liu, Lizhi Wang, Lingen Li, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan

    Abstract: Computational spectral imaging is drawing increasing attention owing to the snapshot advantage, and amplitude, phase, and wavelength encoding systems are three types of representative implementations. Fairly comparing and understanding the performance of these systems is essential, but challenging due to the heterogeneity in encoding design. To overcome this limitation, we propose the unified enco…

    Submitted 20 December, 2023; originally announced December 2023.

  27. arXiv:2312.12833  [pdf, other]

    eess.IV cs.CV

    Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence

    Authors: Hongyuan Wang, Lizhi Wang, Jiang Xu, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan

    Abstract: Spectral super-resolution, which aims to recover a hyperspectral image (HSI) from an easily obtainable RGB image, has drawn increasing interest in the field of computational photography. The crucial aspect of spectral super-resolution lies in exploiting the correlation within HSIs. However, two types of bottlenecks in existing Transformers limit performance improvement and practical applications. First, e…

    Submitted 18 March, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  28. arXiv:2311.09655  [pdf, other]

    cs.SD cs.CV eess.AS

    Multi-View Spectrogram Transformer for Respiratory Sound Classification

    Authors: Wentao He, Yuchen Yan, Jianfeng Ren, Ruibin Bai, Xudong Jiang

    Abstract: Deep neural networks have been applied to audio spectrograms for respiratory sound classification. Existing models often treat the spectrogram as a synthetic image while overlooking its physical characteristics. In this paper, a Multi-View Spectrogram Transformer (MVST) is proposed to embed different views of time-frequency characteristics into the vision transformer. Specifically, the proposed MV…

    Submitted 30 May, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: The paper was published at ICASSP 2024

  29. arXiv:2311.00656  [pdf, other]

    eess.SP cs.LG

    Online Signal Estimation on the Graph Edges via Line Graph Transformation

    Authors: Yi Yan, Ercan Engin Kuruoglu

    Abstract: The processing of signals on graph edges is challenging considering that Graph Signal Processing techniques are defined only on the graph nodes. Leveraging the Line Graph to transform a graph edge signal onto the node of its edge-to-vertex dual, we propose the Line Graph Least Mean Square (LGLMS) algorithm for online time-varying graph edge signal prediction. By setting up an $l_2$-norm optimizati…

    Submitted 28 February, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

  30. arXiv:2310.12768  [pdf, other]

    eess.SP cs.AI cs.IT cs.LG cs.NI

    SemantIC: Semantic Interference Cancellation Towards 6G Wireless Communications

    Authors: Wensheng Lin, Yuna Yan, Lixin Li, Zhu Han, Tad Matsumoto

    Abstract: This letter proposes a novel anti-interference technique, semantic interference cancellation (SemantIC), for enhancing information quality towards the sixth-generation (6G) wireless networks. SemantIC only requires the receiver to concatenate the channel decoder with a semantic auto-encoder. This constructs a turbo loop which iteratively and alternately eliminates noise in the signal domain and th…

    Submitted 14 June, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

  31. arXiv:2310.07987  [pdf, other]

    cs.NI cs.IT cs.LG eess.SP

    Semantic-Forward Relaying: A Novel Framework Towards 6G Cooperative Communications

    Authors: Wensheng Lin, Yuna Yan, Lixin Li, Zhu Han, Tad Matsumoto

    Abstract: This letter proposes a novel relaying framework, semantic-forward (SF), for cooperative communications towards the sixth-generation (6G) wireless networks. The SF relay extracts and transmits the semantic features, which reduces forwarding payload, and also improves the network robustness against intra-link errors. Based on the theoretical basis for cooperative communications with side information…

    Submitted 12 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  32. arXiv:2309.07690  [pdf]

    eess.SP cs.LG

    A DenseNet-based method for decoding auditory spatial attention with EEG

    Authors: Xiran Xu, Bo Wang, Yujie Yan, Xihong Wu, Jing Chen

    Abstract: Auditory spatial attention detection (ASAD) aims to decode the attended spatial location with EEG in a multiple-speaker setting. ASAD methods are inspired by the brain lateralization of cortical neural responses during the processing of auditory spatial attention, and show promising performance for the task of auditory attention decoding (AAD) with neural recordings. In the previous ASAD methods,…

    Submitted 17 January, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: 5 pages, 3 figures, has been accepted by ICASSP 2024

  33. arXiv:2308.12548  [pdf, other]

    eess.SY physics.atom-ph

    Relations Between Generalized JST Algorithm and Kalman Filtering Algorithm for Time Scale Generation

    Authors: Yuyue Yan, Takahiro Kawaguchi, Yuichiro Yano, Yuko Hanado, Takayuki Ishizaki

    Abstract: In this paper, we present a generalized Japan Standard Time algorithm (JST-algo) for higher-order atomic clock ensembles and mathematically clarify the relations of the (generalized) JST-algo and the conventional Kalman filtering algorithm (CKF-algo) in the averaged atomic time and the clock residuals for time scale generation. In particular, we reveal the fact that the averaged atomic time of the…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: 11 pages, 10 figures

    MSC Class: 93-08; 93E10 ACM Class: G.1.7

  34. arXiv:2308.10910  [pdf, other]

    eess.IV cs.AI cs.CV

    Federated Pseudo Modality Generation for Incomplete Multi-Modal MRI Reconstruction

    Authors: Yunlu Yan, Chun-Mei Feng, Yuexiang Li, Rick Siow Mong Goh, Lei Zhu

    Abstract: While multi-modal learning has been widely used for MRI reconstruction, it relies on paired multi-modal data which is difficult to acquire in real clinical scenarios. Especially in the federated setting, the common situation is that several medical institutions only have single-modal data, termed the modality missing issue. Therefore, it is infeasible to deploy a standard federated learning framew…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: 10 pages, 5 figures

  35. arXiv:2308.06547  [pdf, other]

    eess.AS cs.CL cs.SD

    Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition

    Authors: Han Zhu, Dongji Gao, Gaofeng Cheng, Daniel Povey, Pengyuan Zhang, Yonghong Yan

    Abstract: When labeled data is insufficient, semi-supervised learning with the pseudo-labeling technique can significantly improve the performance of automatic speech recognition. However, pseudo-labels are often noisy, containing numerous incorrect tokens. Taking noisy labels as ground-truth in the loss function results in suboptimal performance. Previous works attempted to mitigate this issue by either fi…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2023

  36. Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture

    Authors: Haoran Miao, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

    Abstract: Recently, there has been increasing progress in end-to-end automatic speech recognition (ASR) architecture, which transcribes speech to text without any pre-trained alignments. One popular end-to-end approach is the hybrid Connectionist Temporal Classification (CTC) and attention (CTC/attention) based ASR architecture. However, how to deploy hybrid CTC/attention systems for online speech recogniti…

    Submitted 5 July, 2023; originally announced July 2023.

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Volume 28, 2020, Pages 1452 - 1465

  37. arXiv:2306.02673  [pdf, other]

    eess.IV cs.CV cs.LG

    Cross-Modal Vertical Federated Learning for MRI Reconstruction

    Authors: Yunlu Yan, Hong Wang, Yawen Huang, Nanjun He, Lei Zhu, Yuexiang Li, Yong Xu, Yefeng Zheng

    Abstract: Federated learning enables multiple hospitals to cooperatively learn a shared model without privacy disclosure. Existing methods often take a common assumption that the data from different hospitals have the same modalities. However, such a setting is difficult to fully satisfy in practical applications, since the imaging guidelines may be different between hospitals, which makes the number of ind…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 12 pages, 7 figures

  38. arXiv:2305.08292  [pdf, other]

    cs.SD eess.AS

    ForkNet: Simultaneous Time and Time-Frequency Domain Modeling for Speech Enhancement

    Authors: Feng Dang, Qi Hu, Pengyuan Zhang, Yonghong Yan

    Abstract: Previous research in speech enhancement has mostly focused on modeling time or time-frequency domain information alone, with little consideration given to the potential benefits of simultaneously modeling both domains. Since these domains contain complementary information, combining them may improve the performance of the model. In this letter, we propose a new approach to simultaneously model tim…

    Submitted 14 May, 2023; originally announced May 2023.

  39. arXiv:2305.05894  [pdf, other]

    eess.SY

    Structured Kalman Filter for Time Scale Generation in Atomic Clock Ensembles

    Authors: Yuyue Yan, Takahiro Kawaguchi, Yuichiro Yano, Yuko Hanado, Takayuki Ishizaki

    Abstract: In this article, we present a structured Kalman filter associated with the transformation matrix for observable Kalman canonical decomposition from conventional Kalman filter (CKF) in order to generate a more accurate time scale. The conventional Kalman filter is a special case of the proposed structured Kalman filter which yields the same predicted unobservable or observable states when some cond…

    Submitted 7 December, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: 6 pages, 3 figures

    MSC Class: 93-08; 93E10

  40. arXiv:2305.03387  [pdf, other]

    eess.IV cs.CV

    AsConvSR: Fast and Lightweight Super-Resolution Network with Assembled Convolutions

    Authors: Jiaming Guo, Xueyi Zou, Yuyi Chen, Yi Liu, Jia Hao, Jianzhuang Liu, Youliang Yan

    Abstract: In recent years, videos and images in 720p (HD), 1080p (FHD) and 4K (UHD) resolution have become more popular for display devices such as TVs, mobile phones and VR. However, these high resolution images cannot achieve the expected visual effect due to the limitation of the internet bandwidth, and bring a great challenge for super-resolution networks to achieve real-time performance. Following this…

    Submitted 5 May, 2023; originally announced May 2023.

  41. arXiv:2304.10338  [pdf, other]

    eess.SY

    Distributed Nash Equilibrium Seeking with Stochastic Event-Triggered Mechanism

    Authors: Wei Huo, Kam Fai Elvis Tsang, Yamin Yan, Karl Henrik Johansson, Ling Shi

    Abstract: In this paper, we study the problem of consensus-based distributed Nash equilibrium (NE) seeking where a network of players, abstracted as a directed graph, aim to minimize their own local cost functions non-cooperatively. Considering the limited energy of players and constrained bandwidths, we propose a stochastic event-triggered algorithm by triggering each player with a probability depending on…

    Submitted 20 April, 2023; originally announced April 2023.

  42. arXiv:2304.07278  [pdf, ps, other]

    cs.LG cs.IT eess.SY math.ST stat.ML

    Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning

    Authors: Gen Li, Yuling Yan, Yuxin Chen, Jianqing Fan

    Abstract: This paper studies reward-agnostic exploration in reinforcement learning (RL) -- a scenario where the learner is unaware of the reward functions during the exploration stage -- and designs an algorithm that improves over the state of the art. More precisely, consider a finite-horizon inhomogeneous Markov decision process with $S$ states, $A$ actions, and horizon length $H$, and suppose that there a…

    Submitted 23 May, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: accepted for presentation in COLT 2024

  43. arXiv:2303.13057  [pdf, other]

    eess.IV

    Lightweight High-Performance Blind Image Quality Assessment

    Authors: Zhanxuan Mei, Yun-Cheng Wang, Xingze He, Yong Yan, C. -C. Jay Kuo

    Abstract: Blind image quality assessment (BIQA) is a task that predicts the perceptual quality of an image without its reference. Research on BIQA attracts growing attention due to the increasing amount of user-generated images and emerging mobile applications where reference images are unavailable. The problem is challenging due to the wide range of content and mixed distortion types. Many existing BIQA me…

    Submitted 23 March, 2023; originally announced March 2023.

  44. arXiv:2303.06475  [pdf, other]

    eess.AS cs.CL

    Transcription free filler word detection with Neural semi-CRFs

    Authors: Ge Zhu, Yujia Yan, Juan-Pablo Caceres, Zhiyao Duan

    Abstract: Non-linguistic filler words, such as "uh" or "um", are prevalent in spontaneous speech and serve as indicators for expressing hesitation or uncertainty. Previous works for detecting certain non-linguistic filler words are highly dependent on transcriptions from a well-established commercial automatic speech recognition (ASR) system. However, certain ASR systems are not universally accessible from…

    Submitted 11 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  45. Distributed Data-driven Predictive Control via Dissipative Behavior Synthesis

    Authors: Yitao Yan, Jie Bao, Biao Huang

    Abstract: This paper presents a distributed data-driven predictive control (DDPC) approach using the behavioral framework. It aims to design a network of controllers for an interconnected system with linear time-invariant (LTI) subsystems such that a given global (network-wide) cost function is minimized while desired control performance (e.g., network stability and disturbance rejection) is achieved using…

    Submitted 1 March, 2023; originally announced March 2023.

    Journal ref: IEEE Transactions on Automatic Control, 2023

  46. arXiv:2302.13222  [pdf, other]

    cs.CL cs.SD eess.AS

    Speech Corpora Divergence Based Unsupervised Data Selection for ASR

    Authors: Changfeng Gao, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

    Abstract: Selecting application scenarios matching data is important for the automatic speech recognition (ASR) training, but it is difficult to measure the matching degree of the training corpus. This study proposes an unsupervised target-aware data selection method based on speech corpora divergence (SCD), which can measure the similarity between two speech corpora. We first use the self-supervised Hubert…

    Submitted 25 February, 2023; originally announced February 2023.

  47. arXiv:2301.12024  [pdf, other]

    math.OC eess.SY

    Stability of Finite Receding Horizon Control: A Complementary Approach

    Authors: Wen-Hua Chen, Yunda Yan

    Abstract: This paper presents a complementary approach to establish stability of finite receding horizon control with a terminal cost. First a new augmented stage cost is defined by rotating the terminal cost. Then a one-step optimisation problem is defined based on this augmented stage cost. It is shown that a slightly modified Model Predictive Control (MPC) algorithm is stable if the value function of the…

    Submitted 27 January, 2023; originally announced January 2023.

  48. Dual Control of Exploration and Exploitation for Auto-Optimisation Control with Active Learning

    Authors: Zhongguo Li, Wen-Hua Chen, Jun Yang, Yunda Yan

    Abstract: The quest for optimal operation in environments with unknowns and uncertainties is highly desirable but critically challenging across numerous fields. This paper develops a dual control framework for exploration and exploitation (DCEE) to solve an auto-optimisation problem in such complex settings. In general, there is a fundamental conflict between tracking an unknown optimal operational conditio…

    Submitted 12 March, 2024; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: This paper has been accepted by IEEE Transactions on Automation Science and Engineering (DOI: 10.1109/TASE.2024.3375373)

    Journal ref: IEEE Transactions on Automation Science and Engineering, 2024

  49. arXiv:2301.11831  [pdf, other]

    cs.DC cs.PF eess.SY

    Data Volume-aware Computation Task Scheduling for Smart Grid Data Analytic Applications

    Authors: Binquan Guo, Hongyan Li, Ye Yan, Zhou Zhang, Peng Wang

    Abstract: Emerging smart grid applications analyze large amounts of data collected from millions of meters and systems to facilitate distributed monitoring and real-time control tasks. However, current parallel data processing systems are designed for common applications, unaware of the massive volume of the collected data, causing long data transfer delay during the computation and slow response time of sm…

    Submitted 2 February, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: Accepted to appear in IEEE ICC 2023. The source code is available on GitHub: https://github.com/wilixx/ICCTS

  50. arXiv:2211.13229  [pdf, other]

    eess.IV cs.CL cs.CV cs.LG

    DeltaNet: Conditional Medical Report Generation for COVID-19 Diagnosis

    Authors: Xian Wu, Shuxin Yang, Zhaopeng Qiu, Shen Ge, Yangtian Yan, Xingwang Wu, Yefeng Zheng, S. Kevin Zhou, Li Xiao

    Abstract: Fast screening and diagnosis are critical in COVID-19 patient treatment. In addition to the gold standard RT-PCR, radiological imaging like X-ray and CT also works as an important means in patient screening and follow-up. However, due to the excessive number of patients, writing reports becomes a heavy burden for radiologists. To reduce the workload of radiologists, we propose DeltaNet to generate…

    Submitted 12 November, 2022; originally announced November 2022.