
Showing 1–50 of 333 results for author: Guan, Y

  1. arXiv:2407.12701  [pdf, other]

    cs.CR

    Efficient and Flexible Different-Radix Montgomery Modular Multiplication for Hardware Implementation

    Authors: Yuxuan Zhang, Hua Guo, Chen Chen, Yewei Guan, Xiyong Zhang, Zhenyu Guan

    Abstract: Montgomery modular multiplication is widely used in public key cryptosystems (PKC) and directly affects the efficiency of upper systems. However, the modulus is getting larger due to the increasing demand for security, which results in a heavy computing cost. High-performance implementation of Montgomery modular multiplication is urgently required to ensure highly efficient operations in PKC. Howev…

    Submitted 17 July, 2024; originally announced July 2024.

  2. arXiv:2407.12258  [pdf, other]

    cs.CV

    Facial Affect Recognition based on Multi Architecture Encoder and Feature Fusion for the ABAW7 Challenge

    Authors: Kang Shen, Xuxiong Liu, Boyan Wang, Jun Yao, Xin Liu, Yujie Guan, Yu Wang, Gengchen Li, Xiao Sun

    Abstract: In this paper, we present our approach to addressing the challenges of the 7th ABAW competition. The competition comprises three sub-challenges: Valence Arousal (VA) estimation, Expression (Expr) classification, and Action Unit (AU) detection. To tackle these challenges, we employ state-of-the-art models to extract powerful visual features. Subsequently, a Transformer Encoder is utilized to integr…

    Submitted 16 July, 2024; originally announced July 2024.

  3. arXiv:2407.07061  [pdf, other]

    cs.CL

    Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

    Authors: Weize Chen, Ziming You, Ran Li, Yitong Guan, Chen Qian, Chenyang Zhao, Cheng Yang, Ruobing Xie, Zhiyuan Liu, Maosong Sun

    Abstract: The rapid advancement of large language models (LLMs) has paved the way for the development of highly capable autonomous agents. However, existing multi-agent frameworks often struggle with integrating diverse capable third-party agents due to reliance on agents defined within their own ecosystems. They also face challenges in simulating distributed environments, as most frameworks are limited to…

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: work in progress

  4. arXiv:2407.05249  [pdf, ps, other]

    cs.IT eess.SP

    RIS-assisted Coverage Enhancement in mmWave Integrated Sensing and Communication Networks

    Authors: Xu Gan, Chongwen Huang, Zhaohui Yang, Xiaoming Chen, Faouzi Bader, Zhaoyang Zhang, Chau Yuen, Yong Liang Guan, Merouane Debbah

    Abstract: Integrated sensing and communication (ISAC) has emerged as a promising technology to facilitate high-rate communications and super-resolution sensing, particularly operating in the millimeter wave (mmWave) band. However, the vulnerability of mmWave signals to blockages severely impairs ISAC capabilities and coverage. To tackle this, an efficient and low-cost solution is to deploy distributed recon…

    Submitted 6 July, 2024; originally announced July 2024.

  5. arXiv:2407.04561  [pdf, other]

    cs.NI eess.SP

    Wireless Spectrum in Rural Farmlands: Status, Challenges and Opportunities

    Authors: Mukaram Shahid, Kunal Das, Taimoor Ul Islam, Christ Somiah, Daji Qiao, Arsalan Ahmad, Jimming Song, Zhengyuan Zhu, Sarath Babu, Yong Guan, Tusher Chakraborty, Suraj Jog, Ranveer Chandra, Hongwei Zhang

    Abstract: Due to factors such as low population density and expansive geographical distances, network deployment falls behind in rural regions, leading to a broadband divide. Wireless spectrum serves as the blood and flesh of wireless communications. Shared white spaces such as those in the TVWS and CBRS spectrum bands offer opportunities to expand connectivity, innovate, and provide affordable access to hi…

    Submitted 5 July, 2024; originally announced July 2024.

  6. arXiv:2407.02372  [pdf, ps, other]

    cs.DS math.NA

    Finer-Grained Hardness of Kernel Density Estimation

    Authors: Josh Alman, Yunfeng Guan

    Abstract: In batch Kernel Density Estimation (KDE) for a kernel function $f$, we are given as input $2n$ points $x^{(1)}, \cdots, x^{(n)}, y^{(1)}, \cdots, y^{(n)}$ in dimension $m$, as well as a vector $w \in \mathbb{R}^n$. These inputs implicitly define the $n \times n$ kernel matrix $K$ given by $K[i,j] = f(x^{(i)}, y^{(j)})$. The goal is to compute a vector $v$ which approximates $K w$ with…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 30 pages, to appear in the 39th Computational Complexity Conference (CCC 2024)
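
    As a rough illustration of the batch KDE problem stated in this abstract, the sketch below computes $K w$ exactly by the naive quadratic-time method. The Gaussian kernel and the names `gaussian_kernel` and `batch_kde` are assumptions chosen for the example; the abstract does not specify a kernel, and the paper concerns the hardness of approximating this product, not this direct computation.

```python
import math


def gaussian_kernel(x, y):
    """Example kernel f(x, y) = exp(-||x - y||^2). Assumed for illustration only."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)))


def batch_kde(xs, ys, w, f=gaussian_kernel):
    """Return the exact product K w, where K[i][j] = f(xs[i], ys[j]).

    xs and ys are lists of n points in dimension m; w is a list of n weights.
    This takes O(n^2 * m) time, since every kernel entry is evaluated once.
    """
    return [sum(f(x, y) * wj for y, wj in zip(ys, w)) for x in xs]


# Two 1-dimensional points per side, so K = [[1, e^-1], [e^-1, 1]].
xs = [[0.0], [1.0]]
ys = [[0.0], [1.0]]
w = [1.0, 2.0]
v = batch_kde(xs, ys, w)
```

    An approximate KDE algorithm would return a vector close to `v` without evaluating all $n^2$ kernel entries.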

  7. arXiv:2406.11441  [pdf, other]

    cs.CV

    SWCF-Net: Similarity-weighted Convolution and Local-global Fusion for Efficient Large-scale Point Cloud Semantic Segmentation

    Authors: Zhenchao Lin, Li He, Hongqiang Yang, Xiaoqun Sun, Cuojin Zhang, Weinan Chen, Yisheng Guan, Hong Zhang

    Abstract: A large-scale point cloud consists of a multitude of individual objects and therefore carries rich structural and underlying semantic contextual information, which makes segmenting it efficiently a challenging problem. Most existing research focuses mainly on capturing intricate local features without giving due consideration to global ones, thus failing to leverage semantic context.…

    Submitted 17 June, 2024; originally announced June 2024.

  8. arXiv:2406.10289  [pdf, other]

    cs.CL cs.AI cs.IR

    VeraCT Scan: Retrieval-Augmented Fake News Detection with Justifiable Reasoning

    Authors: Cheng Niu, Yang Guan, Yuanhao Wu, Juno Zhu, Juntong Song, Randy Zhong, Kaihua Zhu, Siliang Xu, Shizhe Diao, Tong Zhang

    Abstract: The proliferation of fake news poses a significant threat not only by disseminating misleading information but also by undermining the very foundations of democracy. The recent advance of generative artificial intelligence has further exacerbated the challenge of distinguishing genuine news from fabricated stories. In response to this challenge, we introduce VeraCT Scan, a novel retrieval-augmente…

    Submitted 24 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  9. arXiv:2406.08204  [pdf, other]

    cs.CV

    Diffusion-Promoted HDR Video Reconstruction

    Authors: Yuanshen Guan, Ruikang Xu, Mingde Yao, Ruisheng Gao, Lizhi Wang, Zhiwei Xiong

    Abstract: High dynamic range (HDR) video reconstruction aims to generate HDR videos from low dynamic range (LDR) frames captured with alternating exposures. Most existing works solely rely on the regression-based paradigm, leading to adverse effects such as ghosting artifacts and missing details in saturated regions. In this paper, we propose a diffusion-promoted method for HDR video reconstruction, termed…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: arXiv preprint

  10. arXiv:2406.06582  [pdf, ps, other]

    cs.CL cs.LG eess.AS

    Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing

    Authors: Viet Anh Trinh, Rosy Southwell, Yiwen Guan, Xinlu He, Zhiyong Wang, Jacob Whitehill

    Abstract: Recent work on discrete speech tokenization has paved the way for models that can seamlessly perform multiple tasks across modalities, e.g., speech recognition, text to speech, speech to speech translation. Moreover, large language models (LLMs) pretrained from vast text corpora contain rich linguistic information that can improve accuracy in a variety of tasks. In this paper, we present a decoder…

    Submitted 25 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  11. arXiv:2406.04801  [pdf, other]

    cs.CV

    MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks

    Authors: Xingkui Zhu, Yiran Guan, Dingkang Liang, Yuchao Chen, Yuliang Liu, Xiang Bai

    Abstract: The sparsely activated mixture of experts (MoE) model presents a promising alternative to traditional densely activated (dense) models, enhancing both quality and computational efficiency. However, training MoE models from scratch demands extensive data and computational resources. Moreover, public repositories like timm mainly provide pre-trained dense checkpoints, lacking similar resources for M…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 9 pages, 6 figures

    ACM Class: I.2

  12. arXiv:2406.04632  [pdf, other]

    cs.MM

    StreamOptix: A Cross-layer Adaptive Video Delivery Scheme

    Authors: Mufan Liu, Le Yang, Yifan Wang, Yiling Xu, Ye-Kui Wang, Yunfeng Guan

    Abstract: This paper presents a cross-layer video delivery scheme, StreamOptix, and proposes a joint optimization algorithm for video delivery that leverages the characteristics of the physical (PHY), medium access control (MAC), and application (APP) layers. Most existing methods for optimizing video transmission over different layers were developed individually. Realizing a cross-layer design has always b…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: under review in Transactions on Multimedia (TMM)

  13. arXiv:2406.04594  [pdf, other]

    cs.DC cs.AI cs.LG

    Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

    Authors: Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao, Gang Lu, Yu Guan, Ennan Zhai, Wencong Xiao, Hanyu Zhao, Man Yuan, Siran Yang, Xiang Li, Jiamang Wang, Rui Men, Jianwei Zhang, Huang Zhong, Dennis Cai, Yuan Xie, Binzhang Fu

    Abstract: The emergence of Large Language Models (LLMs) has necessitated the adoption of parallel training techniques, involving the deployment of thousands of GPUs to train a single model. Unfortunately, we have found that the efficiency of current parallel training is often suboptimal, largely due to the following two main issues. Firstly, hardware failures are inevitable, leading to interruptions in the…

    Submitted 6 June, 2024; originally announced June 2024.

  14. arXiv:2406.02511  [pdf, other]

    cs.CV cs.AI

    V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation

    Authors: Cong Wang, Kuan Tian, Jun Zhang, Yonghang Guan, Feng Luo, Fei Shen, Zhiwei Jiang, Qing Gu, Xiao Han, Wei Yang

    Abstract: In the field of portrait video generation, the use of single images to generate portrait videos has become increasingly prevalent. A common approach involves leveraging generative models to enhance adapters for controlled generation. However, control signals (e.g., text, audio, reference image, pose, depth map, etc.) can vary in strength. Among these, weaker conditions often struggle to be effecti…

    Submitted 4 June, 2024; originally announced June 2024.

  15. arXiv:2406.01065  [pdf, other]

    cs.LG cs.AI

    Causal prompting model-based offline reinforcement learning

    Authors: Xuehui Yu, Yi Guan, Rujia Shen, Xin Li, Chen Tang, Jingchi Jiang

    Abstract: Model-based offline Reinforcement Learning (RL) allows agents to fully utilise pre-collected datasets without requiring additional or unethical explorations. However, applying model-based offline RL to online systems presents challenges, primarily due to the highly suboptimal (noise-filled) and diverse nature of datasets generated by online systems. To tackle these issues, we introduce the Causal…

    Submitted 3 June, 2024; originally announced June 2024.

  16. arXiv:2405.19349  [pdf, other]

    eess.SP cs.CV cs.HC cs.LG

    Beyond Isolated Frames: Enhancing Sensor-Based Human Activity Recognition through Intra- and Inter-Frame Attention

    Authors: Shuai Shao, Yu Guan, Victor Sanchez

    Abstract: Human Activity Recognition (HAR) has become increasingly popular with ubiquitous computing, driven by the popularity of wearable sensors in fields like healthcare and sports. While Convolutional Neural Networks (ConvNets) have significantly contributed to HAR, they often adopt a frame-by-frame analysis, concentrating on individual frames and potentially overlooking the broader temporal dynamics in…

    Submitted 21 May, 2024; originally announced May 2024.

  17. arXiv:2405.17458  [pdf, other]

    cs.LG cs.AI

    Blood Glucose Control Via Pre-trained Counterfactual Invertible Neural Networks

    Authors: Jingchi Jiang, Rujia Shen, Boran Wang, Yi Guan

    Abstract: Type 1 diabetes mellitus (T1D) is characterized by insulin deficiency and blood glucose (BG) control issues. The state-of-the-art solution for continuous BG control is reinforcement learning (RL), where an agent can dynamically adjust exogenous insulin doses in time to maintain BG levels within the target range. However, due to the lack of action guidance, the agent often needs to learn from rando…

    Submitted 18 July, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  18. arXiv:2405.17082  [pdf, other]

    cs.CV

    Ensembling Diffusion Models via Adaptive Feature Aggregation

    Authors: Cong Wang, Kuan Tian, Yonghang Guan, Jun Zhang, Zhiwei Jiang, Fei Shen, Xiao Han, Qing Gu, Wei Yang

    Abstract: The success of the text-guided diffusion model has inspired the development and release of numerous powerful diffusion models within the open-source community. These models are typically fine-tuned on various expert datasets, showcasing diverse denoising capabilities. Leveraging multiple high-quality models to produce stronger generation ability is valuable, but has not been extensively studied. E…

    Submitted 27 May, 2024; originally announced May 2024.

  19. arXiv:2405.06890  [pdf, other]

    cs.CL cs.AI

    TacoERE: Cluster-aware Compression for Event Relation Extraction

    Authors: Yong Guan, Xiaozhi Wang, Lei Hou, Juanzi Li, Jeff Pan, Jiaoyan Chen, Freddy Lecue

    Abstract: Event relation extraction (ERE) is a critical and fundamental challenge for natural language processing. Existing work mainly focuses on directly modeling the entire document, which cannot effectively handle long-range dependencies and information redundancy. To address these issues, we propose a cluster-aware compression method for improving event relation extraction (TacoERE), which explores a c…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted to LREC-COLING 2024

  20. arXiv:2405.06886  [pdf, other]

    cs.IR cs.AI cs.CL

    Event GDR: Event-Centric Generative Document Retrieval

    Authors: Yong Guan, Dingxiao Liu, Jinchen Ma, Hao Peng, Xiaozhi Wang, Lei Hou, Ru Li

    Abstract: Generative document retrieval, an emerging paradigm in information retrieval, learns to build connections between documents and identifiers within a single model, garnering significant attention. However, there are still two challenges: (1) neglecting inner-content correlation during document representation; (2) lacking explicit semantic structure during identifier construction. Nonetheless, event…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted to WWW 2024

  21. arXiv:2405.03196  [pdf, ps, other]

    cs.IT

    Design and Analysis of Massive Uncoupled Unsourced Random Access with Bayesian Joint Decoding

    Authors: Feiyan Tian, Xiaoming Chen, Yong Liang Guan, Chau Yuen

    Abstract: In this paper, we investigate unsourced random access for massive machine-type communications (mMTC) in the sixth-generation (6G) wireless networks. Firstly, we establish a high-efficiency uncoupled framework for massive unsourced random access without extra parity check bits. Then, we design a low-complexity Bayesian joint decoding algorithm, including codeword detection and stitching. In particu…

    Submitted 6 May, 2024; originally announced May 2024.

  22. arXiv:2405.02145  [pdf, other]

    cs.RO

    Characterized Diffusion and Spatial-Temporal Interaction Network for Trajectory Prediction in Autonomous Driving

    Authors: Haicheng Liao, Xuelin Li, Yongkang Li, Hanlin Kong, Chengyue Wang, Bonan Wang, Yanchen Guan, KaHou Tam, Zhenning Li, Chengzhong Xu

    Abstract: Trajectory prediction is a cornerstone in autonomous driving (AD), playing a critical role in enabling vehicles to navigate safely and efficiently in dynamic environments. To address this task, this paper presents a novel trajectory prediction model tailored for accuracy in the face of heterogeneous and uncertain traffic scenarios. At the heart of this model lies the Characterized Diffusion Module…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024

  23. arXiv:2404.17520  [pdf, other]

    cs.RO

    A Cognitive-Driven Trajectory Prediction Model for Autonomous Driving in Mixed Autonomy Environment

    Authors: Haicheng Liao, Zhenning Li, Chengyue Wang, Bonan Wang, Hanlin Kong, Yanchen Guan, Guofa Li, Zhiyong Cui, Chengzhong Xu

    Abstract: As autonomous driving technology progresses, the need for precise trajectory prediction models becomes paramount. This paper introduces an innovative model that infuses cognitive insights into trajectory prediction, focusing on perceived safety and dynamic decision-making. Distinct from traditional approaches, our model excels in analyzing interactions and behavior patterns in mixed autonomy traff…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  24. arXiv:2404.15282  [pdf, other]

    cs.DL cs.AI

    Patent Value Characterization -- An Empirical Analysis of Elevator Industry Patents

    Authors: Yuhang Guan, Runzheng Wang, Lei Fu, Huanle Zhang

    Abstract: The global patent application count has steadily increased, achieving eight consecutive years of growth. The global patent industry has shown a general trend of expansion. This is attributed to the increasing innovation activities, particularly in the fields of technology, healthcare, and biotechnology. Some emerging market countries, such as China and India, have experienced significant growth in…

    Submitted 20 February, 2024; originally announced April 2024.

  25. arXiv:2403.08343  [pdf, ps, other]

    cs.IT eess.SP

    Coverage and Rate Analysis for Integrated Sensing and Communication Networks

    Authors: Xu Gan, Chongwen Huang, Zhaohui Yang, Xiaoming Chen, Jiguang He, Zhaoyang Zhang, Chau Yuen, Yong Liang Guan, Mérouane Debbah

    Abstract: Integrated sensing and communication (ISAC) is increasingly recognized as a pivotal technology for next-generation cellular networks, offering mutual benefits in both sensing and communication capabilities. This advancement necessitates a re-examination of the fundamental limits within networks where these two functions coexist via shared spectrum and infrastructures. However, traditional stochast…

    Submitted 22 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  26. arXiv:2403.02622  [pdf, other]

    cs.LG cs.AI cs.RO

    World Models for Autonomous Driving: An Initial Survey

    Authors: Yanchen Guan, Haicheng Liao, Zhenning Li, Jia Hu, Runze Yuan, Yunjian Li, Guohui Zhang, Chengzhong Xu

    Abstract: In the rapidly evolving landscape of autonomous driving, the capability to accurately predict future events and assess their implications is paramount for both safety and efficiency, critically aiding the decision-making process. World models have emerged as a transformative approach, enabling autonomous driving systems to synthesize and interpret vast amounts of sensor data, thereby predicting po…

    Submitted 7 May, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  27. arXiv:2402.16430  [pdf, other]

    cs.CR cs.HC

    Improving behavior based authentication against adversarial attack using XAI

    Authors: Dong Qin, George Amariucai, Daji Qiao, Yong Guan

    Abstract: In recent years, machine learning models, especially deep neural networks, have been widely used for classification tasks in the security domain. However, these models have been shown to be vulnerable to adversarial manipulation: small changes learned by an adversarial attack model, when applied to the input, can cause significant changes in the output. Most research on adversarial attacks and cor…

    Submitted 10 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  28. arXiv:2402.10876  [pdf, other]

    cs.DC

    Accelerating Sparse DNNs Based on Tiled GEMM

    Authors: Cong Guo, Fengchen Xue, Jingwen Leng, Yuxian Qiu, Yue Guan, Weihao Cui, Quan Chen, Minyi Guo

    Abstract: Network pruning can reduce the computation cost of deep neural network (DNN) models. However, sparse models often produce randomly-distributed weights to maintain accuracy, leading to irregular computations. Consequently, unstructured sparse models cannot achieve meaningful speedup on commodity hardware built for dense matrix computations. Accelerators are usually modified or designed with structu…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Computers. arXiv admin note: substantial text overlap with arXiv:2008.13006

  29. arXiv:2402.08224  [pdf, ps, other]

    cs.IT eess.SP

    Two-Dimensional Direction-of-Arrival Estimation Using Stacked Intelligent Metasurfaces

    Authors: Jiancheng An, Chau Yuen, Yong Liang Guan, Marco Di Renzo, Mérouane Debbah, H. Vincent Poor, Lajos Hanzo

    Abstract: Stacked intelligent metasurfaces (SIM) are capable of emulating reconfigurable physical neural networks by relying on electromagnetic (EM) waves as carriers. They can also perform various complex computational and signal processing tasks. A SIM is fabricated by densely integrating multiple metasurface layers, each consisting of a large number of small meta-atoms that can control the EM waves passi…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 37 pages, 12 figures, and 2 tables. arXiv admin note: text overlap with arXiv:2310.09861

  30. arXiv:2402.00455  [pdf, ps, other]

    cs.IT eess.SP

    Tighter Lower Bounds on Aperiodic Ambiguity Function and Their Asymptotic Achievability

    Authors: Lingsheng Meng, Yong Liang Guan, Yao Ge, Zilong Liu, Pingzhi Fan

    Abstract: This paper presents tighter lower bounds on the maximum aperiodic ambiguity function (AF) magnitude of unimodular sequences under certain delay-Doppler low ambiguity zones (LAZ). These bounds are derived by exploiting the upper and lower bounds on the Frobenius norm of the weighted auto- and cross-AF matrices, with the introduction of two weight vectors associated with the delay and Doppler shifts…

    Submitted 18 July, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 25 pages, 2 figures

  31. arXiv:2401.17509  [pdf, other]

    cs.CV

    Anything in Any Scene: Photorealistic Video Object Insertion

    Authors: Chen Bai, Zeman Shao, Guoxiang Zhang, Di Liang, Jie Yang, Zhuorui Zhang, Yujian Guo, Chengzhang Zhong, Yiqiao Qiu, Zhendong Wang, Yichen Guan, Xiaoyin Zheng, Tao Wang, Cheng Lu

    Abstract: Realistic video simulation has shown significant potential across diverse applications, from virtual reality to film production. This is particularly true for scenarios where capturing videos in real-world settings is either impractical or expensive. Existing approaches in video simulation often fail to accurately model the lighting environment, represent the object geometry, or achieve high level…

    Submitted 30 January, 2024; originally announced January 2024.

  32. arXiv:2401.15820  [pdf, other]

    cs.CV cs.AI

    Knowledge-Aware Neuron Interpretation for Scene Classification

    Authors: Yong Guan, Freddy Lecue, Jiaoyan Chen, Ru Li, Jeff Z. Pan

    Abstract: Although neural models have achieved remarkable performance, they still face doubts because of their lack of transparency. To this end, model prediction explanation is attracting more and more attention. However, current methods rarely incorporate external knowledge and still suffer from three limitations: (1) Neglecting concept completeness. Merely selecting concepts may not be sufficient for prediction.…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: Accepted to AAAI 2024

  33. arXiv:2401.12377  [pdf, other]

    cs.AR

    ACS: Concurrent Kernel Execution on Irregular, Input-Dependent Computational Graphs

    Authors: Sankeerth Durvasula, Adrian Zhao, Raymond Kiguru, Yushi Guan, Zhonghan Chen, Nandita Vijaykumar

    Abstract: GPUs are widely used to accelerate many important classes of workloads today. However, we observe that several important emerging classes of workloads, including simulation engines for deep reinforcement learning and dynamic neural networks, are unable to fully utilize the massive parallelism that GPUs offer. These applications tend to have kernels that are small in size, i.e., have few thread blo…

    Submitted 22 January, 2024; originally announced January 2024.

  34. arXiv:2401.09384  [pdf, other]

    cs.GR cs.CV cs.LG

    Diverse Part Synthesis for 3D Shape Creation

    Authors: Yanran Guan, Oliver van Kaick

    Abstract: Methods that use neural networks for synthesizing 3D shapes in the form of a part-based representation have been introduced over the last few years. These methods represent shapes as a graph or hierarchy of parts and enable a variety of applications such as shape sampling and reconstruction. However, current methods do not allow easily regenerating individual shape parts according to user preferen…

    Submitted 16 July, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  35. arXiv:2401.05850  [pdf, other]

    cs.SD eess.AS

    Contrastive Loss Based Frame-wise Feature disentanglement for Polyphonic Sound Event Detection

    Authors: Yadong Guan, Jiqing Han, Hongwei Song, Wenjie Song, Guibin Zheng, Tieran Zheng, Yongjun He

    Abstract: Overlapping sound events are ubiquitous in real-world environments, but existing end-to-end sound event detection (SED) methods still struggle to detect them effectively. A critical reason is that these methods represent overlapping events using shared and entangled frame-wise features, which degrades the feature discrimination. To solve the problem, we propose a disentangled feature learning fram…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  36. GreenFlow: A Computation Allocation Framework for Building Environmentally Sound Recommendation System

    Authors: Xingyu Lu, Zhining Liu, Yanchu Guan, Hongxuan Zhang, Chenyi Zhuang, Wenqi Ma, Yize Tan, Jinjie Gu, Guannan Zhang

    Abstract: Given the enormous number of users and items, industrial cascade recommendation systems (RS) are continuously expanded in size and complexity to deliver relevant items, such as news, services, and commodities, to the appropriate users. In a real-world scenario with hundreds of thousands of requests per second, significant computation is required to infer personalized results for each request, resulti…

    Submitted 15 December, 2023; originally announced December 2023.

    Journal ref: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence AI for Good. Pages 6103-6111

  37. arXiv:2312.09452  [pdf, other]

    eess.SP cs.IT

    Efficient Multi-Pair IoT Communication with Holographically Enhanced Meta-Surfaces Leveraging OAM Beams: Bridging Theory and Prototype

    Authors: Yufei Zhao, Yong Liang Guan, Afkar Mohamed Ismail, Gaohua Ju, Deyu Lin, Yilong Lu, Chau Yuen

    Abstract: Meta-surfaces, also known as Reconfigurable Intelligent Surfaces (RIS), have emerged as a cost-effective, low power consumption, and flexible solution for enabling multiple applications in Internet of Things (IoT). However, in the context of meta-surface-assisted multi-pair IoT communications, significant interference issues often arise among multiple channels. This issue is particularly pronounc…

    Submitted 18 November, 2023; originally announced December 2023.

    Comments: Meta-surface, RIS, Internet-of-Things (IoT), Line-of-Sight (LoS), Orbital Angular Momentum (OAM), holographic communications, multi-user

  38. arXiv:2312.09439  [pdf, other]

    eess.SP cs.RO eess.SY

    Smart Roads: Roadside Perception, Vehicle-Road Cooperation and Business Model

    Authors: Rui Chen, Lu Gao, Yutian Liu, Yong Liang Guan, Yan Zhang

    Abstract: Smart roads have become an essential component of intelligent transportation systems (ITS). The roadside perception technology, a critical aspect of smart roads, utilizes various sensors, roadside units (RSUs), and edge computing devices to gather real-time traffic data for vehicle-road cooperation. However, the full potential of smart roads in improving the safety and efficiency of autonomous veh…

    Submitted 19 October, 2023; originally announced December 2023.

  39. arXiv:2312.08214  [pdf, other]

    cs.IT eess.SP

    A Precoding for ORIS-Assisted MIMO Multi-User VLC System

    Authors: Mahmoud Atashbar, Hamed Alizadeh Ghazijahani, Yong Liang Guan, Zhaojie Yang

    Abstract: In this paper, we study a multi-user visible light communication (VLC) system assisted with optical reflecting intelligent surface (ORIS). Joint precoding and alignment matrices are designed to maximize the average signal-to-interference plus noise ratio (SINR) criteria. Considering the constraints of the constant mean transmission power of LEDs and the power associated with all users, an optimiza…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: 5 pages, 3 figures

  40. arXiv:2312.06677  [pdf, other]

    cs.LG cs.AI cs.CL

    Intelligent Virtual Assistants with LLM-based Process Automation

    Authors: Yanchu Guan, Dong Wang, Zhixuan Chu, Shiyu Wang, Feiyue Ni, Ruihua Song, Longfei Li, Jinjie Gu, Chenyi Zhuang

    Abstract: While intelligent virtual assistants like Siri, Alexa, and Google Assistant have become ubiquitous in modern life, they still face limitations in their ability to follow multi-step instructions and accomplish complex goals articulated in natural language. However, recent breakthroughs in large language models (LLMs) show promise for overcoming existing barriers by enhancing natural language proces…

    Submitted 4 December, 2023; originally announced December 2023.

  41. arXiv:2312.04854  [pdf, other]

    cs.CL cs.AI

    Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System

    Authors: Haotian Wang, Xiyuan Du, Weijiang Yu, Qianglong Chen, Kun Zhu, Zheng Chu, Lian Yan, Yi Guan

    Abstract: The multi-agent debate (MAD) system, which imitates the process of human discussion in pursuit of truth, aims to align the correct cognition of different agents for the optimal solution. It is challenging to make various agents reach correct and highly consistent cognition due to their limited and different knowledge backgrounds (i.e., cognitive islands), which hinders the search for the optimal solution. T…

    Submitted 11 July, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: 18 pages, 10 figures, work in progress

  42. arXiv:2312.04598  [pdf, other]

    cs.RO

    Formalization of Robot Collision Detection Method based on Conformal Geometric Algebra

    Authors: Yingjie Wu, Guohui Wang, Shanyan Chen, Zhiping Shi, Yong Guan, Ximeng Li

    Abstract: Cooperative robots can significantly assist people in their productive activities, improving the quality of their work. Collision detection is vital to ensure the safe and stable operation of cooperative robots in productive activities. As an advanced geometric language, conformal geometric algebra can simplify the construction of the robot collision model and the calculation of collision distanc…

    Submitted 6 December, 2023; originally announced December 2023.

  43. arXiv:2312.00907  [pdf, other]

    cs.LG cs.CE physics.ao-ph physics.comp-ph physics.flu-dyn

    Extreme Event Prediction with Multi-agent Reinforcement Learning-based Parametrization of Atmospheric and Oceanic Turbulence

    Authors: Rambod Mojgani, Daniel Waelchli, Yifei Guan, Petros Koumoutsakos, Pedram Hassanzadeh

    Abstract: Global climate models (GCMs) are the main tools for understanding and predicting climate change. However, due to limited numerical resolutions, these models suffer from major structural uncertainties; e.g., they cannot resolve critical processes such as small-scale eddies in atmospheric and oceanic turbulence. Thus, such small-scale processes have to be represented as a function of the resolved sc…

    Submitted 1 December, 2023; originally announced December 2023.

  44. Multi-mode OAM Convergent Transmission with Co-divergent Angle Tailored by Airy Wavefront

    Authors: Yufei Zhao, Ziyang Wang, Yilong Lu, Yong Liang Guan

    Abstract: Wireless backhaul offers a more cost-effective, time-efficient, and reconfigurable solution than wired backhaul to connect the edge-computing cells to the core network. As the amount of transmitted data increases, the low-rank characteristic of Line-of-Sight (LoS) channel severely limits the growth of channel capacity in the point-to-point backhaul transmission scenario. Orbital Angular Momentum (…

    Submitted 18 November, 2023; originally announced December 2023.

    Comments: Airy beam, line-of-sight channel, orbital angular momentum, OAM multi-mode, wireless communication

    Journal ref: IEEE Transactions on Antennas and Propagation (Volume: 71, Issue: 6, June 2023)

  45. arXiv:2311.09105  [pdf, other]

    cs.CL

    MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation

    Authors: Xiaozhi Wang, Hao Peng, Yong Guan, Kaisheng Zeng, Jianhui Chen, Lei Hou, Xu Han, Yankai Lin, Zhiyuan Liu, Ruobing Xie, Jie Zhou, Juanzi Li

    Abstract: Understanding events in texts is a core objective of natural language understanding, which requires detecting event occurrences, extracting event arguments, and analyzing inter-event relationships. However, due to the annotation challenges brought by task complexity, a large-scale dataset covering the full process of event understanding has long been absent. In this paper, we introduce MAVEN-Arg,…

    Submitted 18 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted at ACL 2024. Camera-ready version

  46. Image Patch-Matching with Graph-Based Learning in Street Scenes

    Authors: Rui She, Qiyu Kang, Sijie Wang, Wee Peng Tay, Yong Liang Guan, Diego Navarro Navarro, Andreas Hartmannsgruber

    Abstract: Matching landmark patches from a real-time image captured by an on-vehicle camera with landmark patches in an image database plays an important role in various computer perception tasks for autonomous driving. Current methods focus on local matching for regions of interest and do not take into account spatial neighborhood relationships among the image patches, which typically correspond to objects…

    Submitted 8 November, 2023; originally announced November 2023.

  47. arXiv:2310.18131  [pdf, other]

    cs.CV

    End-to-end Video Gaze Estimation via Capturing Head-face-eye Spatial-temporal Interaction Context

    Authors: Yiran Guan, Zhuoguang Chen, Wenzheng Zeng, Zhiguo Cao, Yang Xiao

    Abstract: In this letter, we propose a new method, Multi-Clue Gaze (MCGaze), to facilitate video gaze estimation via capturing spatial-temporal interaction context among head, face, and eye in an end-to-end learning way, which has not been well concerned yet. The main advantage of MCGaze is that the tasks of clue localization of head, face, and eye can be solved jointly for gaze estimation in a one-step way…

    Submitted 29 December, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: This paper is accepted by IEEE Signal Processing Letters. The code has been released at https://github.com/zgchen33/MCGaze

  48. arXiv:2310.17945  [pdf, other]

    cs.LG cs.AI

    A Comprehensive and Reliable Feature Attribution Method: Double-sided Remove and Reconstruct (DoRaR)

    Authors: Dong Qin, George Amariucai, Daji Qiao, Yong Guan, Shen Fu

    Abstract: The limited transparency of the inner decision-making mechanism in deep neural networks (DNN) and other machine learning (ML) models has hindered their application in several domains. In order to tackle this issue, feature attribution methods have been developed to identify the crucial features that heavily influence decisions made by these black box models. However, many feature attribution metho…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: 16 pages, 22 figures

  49. arXiv:2310.12848  [pdf, other]

    cs.CV

    Neural Degradation Representation Learning for All-In-One Image Restoration

    Authors: Mingde Yao, Ruikang Xu, Yuanshen Guan, Jie Huang, Zhiwei Xiong

    Abstract: Existing methods have demonstrated effective performance on a single degradation type. In practical applications, however, the degradation is often unknown, and the mismatch between the model and the degradation will result in a severe performance drop. In this paper, we propose an all-in-one image restoration network that tackles multiple degradations. Due to the heterogeneous nature of different…

    Submitted 19 October, 2023; originally announced October 2023.

  50. arXiv:2310.10457  [pdf, other]

    cs.IT eess.SP

    Flag Sequence Set Design for Low-Complexity Delay-Doppler Estimation

    Authors: Lingsheng Meng, Yong Liang Guan, Yao Ge, Zilong Liu

    Abstract: This paper studies Flag sequences for low-complexity delay-Doppler estimation by exploiting their distinctive peak-curtain ambiguity functions (AFs). Unlike the existing Flag sequence designs that are limited to prime lengths and periodic auto-AFs, we aim to design Flag sequence sets of arbitrary lengths with low (nontrivial) periodic/aperiodic auto- and cross-AFs. Since every Flag sequence consis…

    Submitted 2 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 14 pages, 7 figures, 1 table