Showing 1–20 of 20 results for author: Lv, Z

  1. arXiv:2407.08401  [pdf, other]

    eess.SY

    Application of Data-Driven Model Predictive Control for Autonomous Vehicle Steering

    Authors: Jiarui Zhang, Aijing Kong, Yu Tang, Zhichao Lv, Lulu Guo, Peng Hang

    Abstract: With the development of autonomous driving technology, there are increasing demands for vehicle control, and MPC has become a widely researched topic in both industry and academia. Existing MPC control methods based on vehicle kinematics or dynamics have challenges such as difficult modeling, numerous parameters, strong nonlinearity, and high computational cost. To address these issues, this paper…

    Submitted 18 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2406.11364  [pdf, other]

    cs.SD eess.AS

    AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

    Authors: Anbai Jiang, Bing Han, Zhiqiang Lv, Yufeng Deng, Wei-Qiang Zhang, Xie Chen, Yanmin Qian, Jia Liu, Pingyi Fan

    Abstract: Large pre-trained models have demonstrated dominant performances in multiple areas, where the consistency between pre-training and fine-tuning is the key to success. However, few works reported satisfactory results of pre-trained models for the machine anomalous sound detection (ASD) task. This may be caused by the inconsistency of the pre-trained model and the inductive bias of machine audio, res…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  3. arXiv:2406.09664  [pdf, other]

    cs.SD eess.AS

    Frequency-mix Knowledge Distillation for Fake Speech Detection

    Authors: Cunhang Fan, Shunbo Dong, Jun Xue, Yujie Chen, Jiangyan Yi, Zhao Lv

    Abstract: In the telephony scenarios, the fake speech detection (FSD) task to combat speech spoofing attacks is challenging. Data augmentation (DA) methods are considered effective means to address the FSD task in telephony scenarios, typically divided into time domain and frequency domain stages. While each has its advantages, both can result in information loss. To tackle this issue, we propose a novel DA…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  4. arXiv:2310.15767  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Unpaired MRI Super Resolution with Contrastive Learning

    Authors: Hao Li, Quanwei Liu, Jianan Liu, Xiling Liu, Yanni Dong, Tao Huang, Zhihan Lv

    Abstract: Magnetic resonance imaging (MRI) is crucial for enhancing diagnostic accuracy in clinical settings. However, the inherent long scan time of MRI restricts its widespread applicability. Deep learning-based image super-resolution (SR) methods exhibit promise in improving MRI resolution without additional cost. Due to the lack of aligned high-resolution (HR) and low-resolution (LR) MRI image pairs, uns…

    Submitted 16 February, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

  5. arXiv:2310.15194  [pdf]

    q-bio.NC cs.HC eess.SP q-bio.QM

    How do the resting EEG preprocessing states affect the outcomes of postprocessing?

    Authors: Shiang Hu, Jie Ruan, Juan Hou, Pedro Antonio Valdes-Sosa, Zhao Lv

    Abstract: Plenty of artifact removal tools and pipelines have been developed to correct the EEG recordings and discover the values below the waveforms. Without visual inspection from the experts, it is susceptible to derive improper preprocessing states, like the insufficient preprocessed EEG (IPE), and the excessive preprocessed EEG (EPE). However, little is known about the impacts of IPE or EPE on the pos…

    Submitted 12 December, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

  6. arXiv:2310.11994  [pdf]

    cs.HC eess.SP q-bio.NC

    Spectral homogeneity cross frequencies can be a quality metric for the large-scale resting EEG preprocessing

    Authors: Shiang Hu, Jie Ruan, Nicolas Langer, Jorge Bosch-Bayard, Zhao Lv, Dezhong Yao, Pedro Antonio Valdes-Sosa

    Abstract: The brain projects require the collection of massive electrophysiological data, aiming to the longitudinal, sectional, or populational neuroscience studies. Quality metrics automatically label the data after centralized preprocessing. However, although the waveforms-based metrics are partially useful, they may be unreliable by neglecting the spectral profiles. Here, we detected the phenomenon of p…

    Submitted 4 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  7. Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection

    Authors: Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv

    Abstract: Most research in synthetic speech detection (SSD) focuses on improving performance on standard noise-free datasets. However, in actual situations, noise interference is usually present, causing significant performance degradation in SSD systems. To improve noise robustness, this paper proposes a dual-branch knowledge distillation synthetic speech detection (DKDSSD) method. Specifically, a parallel…

    Submitted 16 April, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  8. arXiv:2309.07147  [pdf, other]

    eess.SP cs.HC cs.LG cs.MM cs.SD eess.AS

    DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection

    Authors: Cunhang Fan, Hongyu Zhang, Wei Huang, Jun Xue, Jianhua Tao, Jiangyan Yi, Zhao Lv, Xiaopei Wu

    Abstract: Auditory Attention Detection (AAD) aims to detect target speaker from brain signals in a multi-speaker environment. Although EEG-based AAD methods have shown promising results in recent years, current approaches primarily rely on traditional convolutional neural network designed for processing Euclidean data like images. This makes it challenging to handle EEG signals, which possess non-Euclidean…

    Submitted 7 September, 2023; originally announced September 2023.

  9. arXiv:2309.06067  [pdf, ps, other]

    eess.IV cs.CV physics.med-ph

    Implicit Neural Representation for MRI Parallel Imaging Reconstruction

    Authors: Hao Li, Yusheng Zhou, Jianan Liu, Xiling Liu, Tao Huang, Zhihan Lv, Weidong Cai

    Abstract: Magnetic resonance imaging (MRI) usually faces lengthy acquisition times, prompting the exploration of strategies such as parallel imaging (PI) to alleviate this problem by periodically skipping specific K-space lines and subsequently reconstructing high-quality images from the undersampled K-space. Implicit neural representation (INR) has recently emerged as a promising deep learning technique, c…

    Submitted 10 April, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  10. Spatial Reconstructed Local Attention Res2Net with F0 Subband for Fake Speech Detection

    Authors: Cunhang Fan, Jun Xue, Jianhua Tao, Jiangyan Yi, Chenglong Wang, Chengshi Zheng, Zhao Lv

    Abstract: The rhythm of bonafide speech is often difficult to replicate, which causes that the fundamental frequency (F0) of synthetic speech is significantly different from that of real speech. It is expected that the F0 feature contains the discriminative information for the fake speech detection (FSD) task. In this paper, we propose a novel F0 subband for FSD. In addition, to effectively model the F0 sub…

    Submitted 8 July, 2024; v1 submitted 19 August, 2023; originally announced August 2023.

    Comments: Accepted by Neural Networks

  11. arXiv:2308.00537  [pdf, other]

    eess.SY cs.AI cs.LG

    Graph Embedding Dynamic Feature-based Supervised Contrastive Learning of Transient Stability for Changing Power Grid Topologies

    Authors: Zijian Lv, Xin Chen, Zijian Feng

    Abstract: Accurate online transient stability prediction is critical for ensuring power system stability when facing disturbances. However, traditional transient stability analysis relies on time-domain simulations, which cannot be quickly adapted to power grid topology changes. In order to vectorize high-dimensional power grid topological structure information into low-dimensional node-based graph embedding s…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: This work has been submitted to the IEEE Transactions on Power Systems for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  12. arXiv:2306.15389  [pdf, other]

    cs.SD cs.LG eess.AS

    Multi-perspective Information Fusion Res2Net with RandomSpecmix for Fake Speech Detection

    Authors: Shunbo Dong, Jun Xue, Cunhang Fan, Kang Zhu, Yujie Chen, Zhao Lv

    Abstract: In this paper, we propose the multi-perspective information fusion (MPIF) Res2Net with random Specmix for fake speech detection (FSD). The main purpose of this system is to improve the model's ability to learn precise forgery information for FSD task in low-quality scenarios. The task of random Specmix, a data augmentation, is to improve the generalization ability of the model and enhance the mode…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted by DADA2023

  13. arXiv:2303.07138  [pdf, other]

    eess.SY cs.LG eess.SP

    Transferable Deep Learning Power System Short-Term Voltage Stability Assessment with Physics-Informed Topological Feature Engineering

    Authors: Zijian Feng, Xin Chen, Zijian Lv, Peiyuan Sun, Kai Wu

    Abstract: Deep learning (DL) algorithms have been widely applied to short-term voltage stability (STVS) assessment in power systems. However, transferring the knowledge learned in one power grid to other power grids with topology changes is still a challenging task. This paper proposed a transferable DL-based model for STVS assessment by constructing the topology-aware voltage dynamic features from raw PMU…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: This work has been submitted to the IEEE Transactions on Power Systems for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  14. arXiv:2303.01211  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Learning From Yourself: A Self-Distillation Method for Fake Speech Detection

    Authors: Jun Xue, Cunhang Fan, Jiangyan Yi, Chenglong Wang, Zhengqi Wen, Dan Zhang, Zhao Lv

    Abstract: In this paper, we propose a novel self-distillation method for fake speech detection (FSD), which can significantly improve the performance of FSD without increasing the model complexity. For FSD, some fine-grained information is very important, such as spectrogram defects, mute segments, and so on, which are often perceived by shallow networks. However, shallow networks have much noise, which can…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  15. arXiv:2301.01732  [pdf, ps, other]

    eess.IV cs.CV physics.med-ph

    Explicit Abnormality Extraction for Unsupervised Motion Artifact Reduction in Magnetic Resonance Imaging

    Authors: Yusheng Zhou, Hao Li, Jianan Liu, Zhengmin Kong, Tao Huang, Euijoon Ahn, Zhihan Lv, Jinman Kim, David Dagan Feng

    Abstract: Motion artifacts compromise the quality of magnetic resonance imaging (MRI) and pose challenges to achieving diagnostic outcomes and image-guided therapies. In recent years, supervised deep learning approaches have emerged as successful solutions for motion artifact reduction (MAR). One disadvantage of these methods is their dependency on acquiring paired sets of motion artifact-corrupted (MA-corr…

    Submitted 5 July, 2024; v1 submitted 4 January, 2023; originally announced January 2023.

  16. arXiv:2208.01214  [pdf, other]

    cs.SD cs.LG eess.AS

    Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features

    Authors: Jun Xue, Cunhang Fan, Zhao Lv, Jianhua Tao, Jiangyan Yi, Chengshi Zheng, Zhengqi Wen, Minmin Yuan, Shegang Shao

    Abstract: Recently, pioneer research works have proposed a large number of acoustic features (log power spectrogram, linear frequency cepstral coefficients, constant Q cepstral coefficients, etc.) for audio deepfake detection, obtaining good performance, and showing that different subbands have different contributions to audio deepfake detection. However, this lacks an explanation of the specific informatio…

    Submitted 1 August, 2022; originally announced August 2022.

  17. arXiv:2206.02442  [pdf, other]

    eess.SP

    Pervasive wireless channel modeling theory and applications to 6G GBSMs for all frequency bands and all scenarios

    Authors: Cheng-Xiang Wang, Zhen Lv, Xiqi Gao, Xiaohu You, Yang Hao, Harald Haas

    Abstract: In this paper, a pervasive wireless channel modeling theory is first proposed, which uses a unified channel modeling method and a unified equation of channel impulse response (CIR), and can integrate important channel characteristics at different frequency bands and scenarios. Then, we apply the proposed theory to a three dimensional (3D) space-time-frequency (STF) non-stationary geometry-based st…

    Submitted 6 June, 2022; originally announced June 2022.

  18. arXiv:2203.15249  [pdf, other]

    cs.SD eess.AS

    MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification

    Authors: Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-yi Lee, Helen Meng

    Abstract: In this paper, we present Multi-scale Feature Aggregation Conformer (MFA-Conformer), an easy-to-implement, simple but effective backbone for automatic speaker verification based on the Convolution-augmented Transformer (Conformer). The architecture of the MFA-Conformer is inspired by recent state-of-the-art models in speech recognition and speaker verification. Firstly, we introduce a convolution s…

    Submitted 10 November, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: accepted by INTERSPEECH 2022

  19. arXiv:2110.15316  [pdf]

    cs.SD eess.AS

    VRM-Phase I VKW system description of long-short video customizable keyword wakeup challenge

    Authors: Yougen Yuan, Zhiqiang Lv, Shen Huang, Pengfei Hu

    Abstract: Keyword wakeup technology has always been a research hotspot in speech processing, but many related works were done on different datasets. We organized a Chinese long-short video keyword wakeup challenge (Video Keyword Wakeup Challenge, VKW) for testing the ability of each participating team to build a keyword wakeup system under the public dataset. All submitted systems not only need to support t…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: 6 pages, in Chinese language, 3 tables, NCMMC 2021 conference paper

  20. arXiv:1910.04541  [pdf]

    eess.SY

    Stochastic Dispatch of Energy Storage in Microgrids: An Augmented Reinforcement Learning Approach

    Authors: Yuwei Shang, Wenchuan Wu, Jianbo Guo, Zhe Lv, Zhao Ma, Wanxing Sheng, Ran Chen

    Abstract: The dynamic dispatch (DD) of battery energy storage systems (BESSs) in microgrids integrated with volatile energy resources is essentially a multiperiod stochastic optimization problem (MSOP). Because the life span of a BESS is significantly affected by its charging and discharging behaviors, its lifecycle degradation costs should be incorporated into the DD model of BESSs, which makes it non-conv…

    Submitted 4 July, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

    Journal ref: Applied Energy. Volume 261, 1 March 2020